In the paper *Do Political Protests Matter? Evidence from the Tea Party Movement*, the authors use rainfall on the day of the Tea Party protests as a source of plausibly exogenous variation in rally attendance, i.e. as an instrumental variable.
I have a conceptual question about this: in a scenario like this, how exactly should we think about the sampling distribution? As I see it, there are two potential ways to think about the conceptual sampling distribution:
- Rainfall fell as it did across the U.S. on the day of the protests. Take that realized pattern as fixed. Then, given how rainfall actually fell on that day, we can think of resampling and forming a sampling distribution conditional on it.
This would also have implications: for example, checking whether rainfall on that day is uncorrelated with observables would provide strong evidence for the identifying assumptions.
- Rainfall is part of the underlying DGP we are modeling with an equation like $y = \beta \cdot \text{Rainfall} + \epsilon$. So we do not treat how rainfall happened to fall on that day as fixed. Instead, the idea is that if we hypothetically went back in time and let the day play out again and again, rainfall would fall differently each time, and this would generate sampling variability. In this case, what matters is that the geoclimatic determinants of rainfall form a process that ‘assigns’ rain, and in each iteration/resampling (i.e. going back and starting the day over again) the assignment of rainfall to counties would be different. If this is the case, then looking at rainfall across time would be important to show that the process generating rain doesn't systematically correlate with determinants of $y$.
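To make the two views concrete, here is a minimal simulation sketch. Everything in it is a hypothetical assumption for illustration and has nothing to do with the paper's data: the number of counties, the long-run rain probabilities `p_rain`, the unobserved determinant `u`, and the value of `beta`. It is also only a reduced-form regression of an outcome on a rain dummy, not the paper's IV specification.

```python
import numpy as np

rng = np.random.default_rng(0)
n_counties = 500
beta = -1.0  # hypothetical effect of rain on the outcome

# Fixed county characteristics (hypothetical): an unobserved determinant of y
# and a long-run rain probability drawn independently of it.
u = rng.normal(size=n_counties)
p_rain = rng.uniform(0.1, 0.5, size=n_counties)

def slope(rain, y):
    """OLS slope on a rain dummy: mean outcome in rainy minus dry counties."""
    return y[rain == 1].mean() - y[rain == 0].mean()

def simulate(fixed_rain=None, n_reps=2000):
    """Return slope estimates over repeated hypothetical replays of the day."""
    estimates = np.empty(n_reps)
    for r in range(n_reps):
        # View 1 keeps the realized rain pattern; view 2 redraws it each replay.
        rain = fixed_rain if fixed_rain is not None else rng.binomial(1, p_rain)
        eps = rng.normal(size=n_counties)
        y = beta * rain + u + eps
        estimates[r] = slope(rain, y)
    return estimates

realized_rain = rng.binomial(1, p_rain)     # the one day we actually observe
view1 = simulate(fixed_rain=realized_rain)  # rain fixed, only eps varies
view2 = simulate()                          # rain redrawn on every replay

# Conditional on the realized rain, the estimator is centered at beta plus
# the realized imbalance in u between rainy and dry counties.
imbalance = u[realized_rain == 1].mean() - u[realized_rain == 0].mean()
print("view 1 mean, sd:", view1.mean(), view1.std())
print("beta + realized imbalance:", beta + imbalance)
print("view 2 mean, sd:", view2.mean(), view2.std())
```

In this sketch, the first view's estimates are centered on `beta` plus whatever imbalance in `u` the realized rain pattern happened to produce, which is why balance checks on that particular day are the natural diagnostic there. The second view's spread additionally includes the reassignment of rain across counties, and its center depends on whether `p_rain` (the process) is related to `u`, which is why rainfall history matters there.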
I hope those two ideas make sense; I am mainly hoping someone can correct my logic or point me in the right direction for thinking about these types of problems. Is one of the two the ‘correct’ way of thinking about the sampling distribution?