## #StackBounty: #regression #sampling #econometrics #instrumental-variables #sampling-distribution Question about the conceptual sampling…

### Bounty: 100

In the paper "Do Political Protests Matter? Evidence from the Tea Party Movement", the authors use rainfall on the day of the Tea Party protests as a source of plausibly exogenous variation in rally attendance, i.e. as an instrumental variable.

I have a conceptual question about this: in a scenario like this, what exactly should we take the sampling distribution to be? The way I see it, there are two potential ways to think about the conceptual sampling distribution:

1. Rainfall fell as it did across the U.S. on the day of the protests. Take that distribution as fixed. Given how rainfall actually fell on that day, we can think of resampling and forming a distribution.
This would also have implications: for example, checking whether rainfall on that day is uncorrelated with observables would provide strong evidence for the identifying assumptions.

2. Rainfall is a part of the underlying DGP we are modeling with an equation like $$y = \beta \cdot Rainfall + \epsilon$$. So rainfall is not fixed at how it happened to fall on that day; instead, the idea is that if we hypothetically went back in time and let the day play out again and again, rain would fall differently each time, and this would generate sampling variability. In this case, what matters is that the geoclimatic determinants of rainfall form a process that ‘assigns’ rain, and in each iteration/resampling (i.e. going back and starting the day over again) the county assignment of rainfall would be different. If this is the case, then looking at rainfall across time would be important to show that the process generating rain doesn't systematically correlate with determinants of y.

I hope those two ideas make sense; I am mainly asking so that someone can correct my logic or point me in the right direction for thinking about these types of things. Is one of the two the ‘correct’ way of thinking about the sampling distribution?
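One way to make the two readings concrete is a small simulation, sketched here in Python. Everything in it is an assumption made for illustration and none of it comes from the paper: the binary rain indicator, the 30% rain probability, the effect sizes and the number of counties are all invented. Under the first reading the realized rainfall is held fixed and only the other shocks are redrawn; under the second the day is replayed, so rainfall is redrawn too.

```python
import random
import statistics

TRUE_EFFECT = 2.0   # assumed effect of attendance on the outcome (hypothetical)
N_COUNTIES = 400    # hypothetical number of counties
N_REPS = 300        # number of simulated "days"


def wald_iv(z, x, y):
    """IV estimate with a single instrument: cov(z, y) / cov(z, x)."""
    z_bar, x_bar, y_bar = statistics.fmean(z), statistics.fmean(x), statistics.fmean(y)
    cov_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    cov_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    return cov_zy / cov_zx


def draw_rain(rng):
    """Assign rain (1) or no rain (0) to each county, independently."""
    return [1.0 if rng.random() < 0.3 else 0.0 for _ in range(N_COUNTIES)]


def draw_day(rain, rng):
    """Draw attendance x and outcome y given a rainfall assignment."""
    x, y = [], []
    for r in rain:
        u = rng.gauss(0, 1)                        # unobserved confounder
        xi = 5.0 - 2.0 * r + u + rng.gauss(0, 1)   # rain lowers attendance
        yi = TRUE_EFFECT * xi + 2.0 * u + rng.gauss(0, 1)
        x.append(xi)
        y.append(yi)
    return x, y


rng = random.Random(0)
observed_rain = draw_rain(rng)   # stands in for "how rain actually fell"

est_fixed, est_redrawn = [], []
for _ in range(N_REPS):
    # Reading 1: rainfall held at its realized values, other shocks redrawn.
    x, y = draw_day(observed_rain, rng)
    est_fixed.append(wald_iv(observed_rain, x, y))
    # Reading 2: the day is replayed, so rainfall itself is redrawn.
    rain = draw_rain(rng)
    x, y = draw_day(rain, rng)
    est_redrawn.append(wald_iv(rain, x, y))

print("rain fixed:   mean %.3f, sd %.3f"
      % (statistics.fmean(est_fixed), statistics.stdev(est_fixed)))
print("rain redrawn: mean %.3f, sd %.3f"
      % (statistics.fmean(est_redrawn), statistics.stdev(est_redrawn)))
```

Because rain is assigned independently of the confounder in this sketch, both sets of estimates centre on the assumed effect. The two readings differ in what is held fixed across replications, which is where the choice matters for the checks described in the two points above.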

Get this bounty!!!

## #StackBounty: #interaction #econometrics #instrumental-variables #endogeneity Interaction with endogenous variables in the first stage

### Bounty: 50

I am working with industry-level data and trying to address omitted variable bias by using an instrument. The problem with my instrument is that it varies relatively little: for most of my industries its value is the same. In other words, it only varies across groups encompassing several industries.

One of my teachers told me that I can fix that problem by putting additional explanatory variables in the first stage and interacting them with the instrument, and that these variables do not have to be exogenous (with regard to my outcome variable).

The following is what I have thought about this; if someone has a better idea of how to solve the problem I describe above, I would very much appreciate that too!

So basically, first I was thinking of this:

Equation of interest: $$y_i = \alpha_0 + \alpha_1 x_i + \alpha_2 \mathbf{X} + \epsilon_i$$

Where $$y_i$$ is the outcome, $$\alpha_1$$ the coefficient of interest, $$x_i$$ is the endogenous variable, $$\mathbf{X}$$ are covariates.

First stage: $$x_i = \beta_0 + \beta_1 z_i + \beta_2 w_i + \beta_3 z_i w_i + \beta_4 \mathbf{X} + \eta_i$$
This gives $$\hat x_i$$. Here $$z_i$$ is an instrument that only has an effect on $$y_i$$ through $$x_i$$, and $$w_i$$ is a covariate correlated with $$\epsilon_i$$.

Second stage: $$y_i = \gamma_1 + \alpha_1 \hat x_i + \gamma_2 w_i + \gamma_3 \mathbf{X} + e_i$$

Then I thought I must have understood something wrong, because this seems kind of weird. But I have now read a paper, Nizalova and Murtazashvili (2014) (https://www.degruyter.com/document/doi/10.1515/jem-2013-0012/html), that explains why the estimated interaction effect of one exogenous and one endogenous variable is still consistent.

Another paper, Bun and Harrison (2019) (https://www.tandfonline.com/doi/full/10.1080/07474938.2018.1427486), argues similarly. Specifically, they write:

… we have that, even if we have an endogenous regressor x, the OLS estimator of the coefficient $$\beta_{xw}$$ is consistent and standard heteroskedasticity-robust OLS inference applies.

and

… we show that endogeneity bias can be reduced to zero for the OLS estimator as far as the interaction term is concerned.

Does this mean that what I outlined above works? I.e. the interaction term is not endogenous (with respect to $$y_i$$) even though $$w_i$$ is?
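To pin down what "works" would have to mean, the two-stage procedure outlined above can be written as moment conditions. This is only a sketch of what standard 2SLS requires when $$w_i$$ enters the second stage as a control and $$z_i$$ and $$z_i w_i$$ are the excluded instruments; it is not a result taken from either paper, and the symbol $$u_i$$ is introduced here for the error of the outcome equation once $$w_i$$ is included.

$$y_i = \gamma_1 + \alpha_1 x_i + \gamma_2 w_i + \gamma_3 \mathbf{X} + u_i$$

$$E[z_i u_i] = 0, \quad E[z_i w_i u_i] = 0, \quad E[w_i u_i] = 0, \quad E[\mathbf{X} u_i] = 0$$

The first two conditions concern the excluded instruments. The third arises because 2SLS uses every included control as its own instrument, so it is the condition that sits in tension with $$w_i$$ being correlated with the outcome error.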

Get this bounty!!!

## #StackBounty: #regression #econometrics #intuition #instrumental-variables #endogeneity Question about Instrumental variables, endogene…

### Bounty: 50

I have seen this notation used to describe the instrumental variable framework, and I wish to make sure I understand it. $$y$$ is the dependent variable, $$x$$ is the treatment, and $$z$$ is the instrument:

$$y = f(x,\epsilon)$$

$$x = g(z,\eta)$$

and the endogeneity structure is defined as: $$cov(\epsilon,\eta)\neq 0$$, $$cov(z,\epsilon)=0$$, $$cov(z,\eta)=0$$

I want to make sure I understand what this is saying.

1. First, is any variable $$z$$ that fits this structure an instrument?

2. If I am, say, approximating these functions with linear equations, so that $$x = \pi z + \eta$$, is this saying we can partition the entire variation of $$x$$ into the variation explained by $$z$$ and all the remaining variation $$\eta$$, and that the endogeneity can be expressed as $$cov(\epsilon,\eta)\neq 0$$? I am confused because usually this is simply expressed as $$cov(x,\epsilon)\neq 0$$, and I am not familiar with writing this all in terms of errors. Is this the same, since I can just plug in the model of $$x$$ to get $$cov(\pi z + \eta,\epsilon) = cov(\eta,\epsilon)$$ given the exogeneity of $$z$$?

3. Is this equivalent to saying there exists some set of variables $$r$$ with $$r \in \epsilon$$ and $$r \in \eta$$, i.e. omitted variables that determine both $$x$$ and $$y$$?
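The substitution in point 2 and the omitted-variable reading in point 3 can be checked numerically. The sketch below is an illustration under assumed values, not part of the notation being asked about: it takes the linear case, puts one common shock `r` into both errors, and uses made-up coefficients `BETA` and `PI_COEF`.

```python
import random

BETA = 1.5       # assumed effect of x on y (hypothetical)
PI_COEF = 1.0    # assumed first-stage coefficient on z (hypothetical)
N = 20000


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


rng = random.Random(1)
z, eta, eps, x, y = [], [], [], [], []
for _ in range(N):
    r = rng.gauss(0, 1)             # omitted variable entering both errors
    zi = rng.gauss(0, 1)            # instrument, drawn independently of r
    eta_i = r + rng.gauss(0, 1)     # first-stage error
    eps_i = r + rng.gauss(0, 1)     # outcome error
    xi = PI_COEF * zi + eta_i
    yi = BETA * xi + eps_i
    z.append(zi)
    eta.append(eta_i)
    eps.append(eps_i)
    x.append(xi)
    y.append(yi)

cov_x_eps = cov(x, eps)
cov_eta_eps = cov(eta, eps)
cov_z_eps = cov(z, eps)
ols = cov(x, y) / cov(x, x)   # slope from regressing y on x
iv = cov(z, y) / cov(z, x)    # single-instrument IV estimate

print("cov(x, eps)   =", round(cov_x_eps, 3))
print("cov(eta, eps) =", round(cov_eta_eps, 3))
print("cov(z, eps)   =", round(cov_z_eps, 3))
print("OLS slope     =", round(ols, 3), " IV estimate =", round(iv, 3))
```

By bilinearity, `cov(x, eps)` equals `PI_COEF * cov(z, eps) + cov(eta, eps)`, so with an exogenous `z` the two ways of writing the endogeneity coincide up to sampling noise. In this setup the shared shock `r` is the only source of that covariance.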

Get this bounty!!!

## #StackBounty: #generalized-linear-model #econometrics #instrumental-variables #marginal-effect #ordered-probit Difference in intuition:…

### Bounty: 50

This is a follow-up to this question.

I wanted to use Stata’s cmp command to estimate a system of two equations: an ordered probit and a linear equation.

1 – linear model: $$y = \alpha + \beta z + \epsilon_1$$.
2 – ordered probit: $$z^* = \gamma x + \epsilon_2$$, with $$z = j$$ if $$\alpha_{j-1} \leq z^* \leq \alpha_j$$, $$j \in \{-4, -3, \dots, 3, 4\}$$

As the linked question asked, I wanted to derive marginal effects of $$x$$ on $$y$$. Since the margins command in Stata wouldn’t account for this indirect link between the two variables ($$x \to z \to y$$), I asked the package’s author for a possible alternative. He suggested using $$z$$’s linear predictor on the first equation (in case you know cmp, it would be as in `cmp(y = x#) (x = z), vce(robust) ind($cmp_cont $cmp_oprobit) nolr`).

This seems like using $$z$$ as an instrument for $$x$$. I then have a few questions:

1- Is that so?

2- What would be the difference in intuition between the approaches? Is there a way to think about which one fits best?
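For reference, the indirect link $$x \to z \to y$$ in the two equations above can be sketched numerically. The following is a hedged illustration in Python, not cmp output and not the package author's method: it treats $$z$$ as numeric in the linear equation, assumes $$x$$ is unrelated to $$\epsilon_1$$, and uses invented cutpoints and parameter values.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()                 # ordered-probit link
CATEGORIES = list(range(-4, 5))           # z takes the values -4, ..., 4
CUTS = [c + 0.5 for c in range(-4, 4)]    # hypothetical cutpoints -3.5, ..., 3.5


def category_probs(x, gamma):
    """P(z = j | x) for each category, with latent index gamma * x."""
    index = gamma * x
    cdf = [0.0] + [STD_NORMAL.cdf(c - index) for c in CUTS] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(CATEGORIES))]


def expected_y(x, alpha, beta, gamma):
    """E[y | x] = alpha + beta * E[z | x] under the two equations."""
    probs = category_probs(x, gamma)
    expected_z = sum(j * p for j, p in zip(CATEGORIES, probs))
    return alpha + beta * expected_z


def indirect_effect(x, alpha, beta, gamma, h=1e-5):
    """Central-difference derivative of E[y | x] with respect to x."""
    upper = expected_y(x + h, alpha, beta, gamma)
    lower = expected_y(x - h, alpha, beta, gamma)
    return (upper - lower) / (2 * h)


# Hypothetical parameter values, chosen only to have something to evaluate.
effect = indirect_effect(x=0.3, alpha=1.0, beta=0.8, gamma=1.2)
print("marginal effect of x on E[y] at x = 0.3:", round(effect, 4))
```

The effect computed this way is $$\beta$$ times the change in $$E[z \mid x]$$, so it varies with $$x$$ even though the outcome equation is linear.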

Get this bounty!!!

## #StackBounty: #econometrics #intuition #instrumental-variables Intuitive understanding of instrumental variables for natural experiments

### Bounty: 50

I am wondering if my understanding of instrumental variables to exploit natural experiments is correct, or if I am misunderstanding something.

Is the logic as follows: by using an instrument, you are now comparing the outcomes of those who received higher levels of treatment because they had higher exposure to the instrument to those who received lower levels of treatment because they had lower exposure to the instrument, but these latter units would have received higher treatment had they been more exposed to the instrument?

So should I think of it, intuitively, as if it were to some degree a randomized experiment on a subset of units?
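The intuition above can be illustrated with a small simulation. This is a sketch under textbook assumptions that the question does not state: a binary instrument assigned at random, a binary treatment, and three unit types (compliers, always-takers, never-takers) with no defiers. The type shares, baselines and effects are invented.

```python
import random

N = 40000
COMPLIER_EFFECT = 2.0   # hypothetical treatment effect for compliers

rng = random.Random(2)
z, d, y = [], [], []
for _ in range(N):
    zi = 1 if rng.random() < 0.5 else 0          # randomly assigned instrument
    draw = rng.random()
    if draw < 0.5:
        kind, base, effect = "complier", 0.5, COMPLIER_EFFECT
    elif draw < 0.7:
        kind, base, effect = "always", 1.0, 3.0  # treated regardless of z
    else:
        kind, base, effect = "never", 0.0, 0.0   # untreated regardless of z
    di = 1 if kind == "always" or (kind == "complier" and zi == 1) else 0
    yi = base + effect * di + rng.gauss(0, 0.5)
    z.append(zi)
    d.append(di)
    y.append(yi)


def mean(values):
    return sum(values) / len(values)


# Differences between the two instrument groups.
reduced_form = (mean([yi for yi, zi in zip(y, z) if zi == 1])
                - mean([yi for yi, zi in zip(y, z) if zi == 0]))
first_stage = (mean([di for di, zi in zip(d, z) if zi == 1])
               - mean([di for di, zi in zip(d, z) if zi == 0]))
wald = reduced_form / first_stage

# Naive comparison of treated and untreated units, ignoring the instrument.
naive = (mean([yi for yi, di in zip(y, d) if di == 1])
         - mean([yi for yi, di in zip(y, d) if di == 0]))

print("first stage (share of compliers):", round(first_stage, 3))
print("Wald / IV estimate:", round(wald, 3))
print("naive treated-vs-untreated difference:", round(naive, 3))
```

In this setup the instrument only changes treatment for compliers, so the Wald ratio recovers their effect, while the naive comparison mixes in always-takers and never-takers, whose baselines differ.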

Get this bounty!!!

## #StackBounty: #econometrics #instrumental-variables #treatment-effect #derivative #marginal-effect How to calculate and interpret a mar…

### Bounty: 100

I am working on the intuition behind local instrumental variables (LIV), also known as the marginal treatment effect (MTE), developed by Heckman & Vytlacil. I have worked some time on this and would benefit from solving a simple example. I hope I may get input on where my example goes awry.

As a starting point, the standard local average treatment effect (LATE) is the treatment effect among individuals induced to take treatment by the instrument ("compliers"), while MTE is the limit form of LATE.

A helpful distinction between LATE and MTE is found between the questions:

• LATE: What is the difference in the treatment effect between those who are more likely to receive treatment compared to others?
• MTE: What is the difference in the treatment effect between those who are marginally more likely to receive treatment compared to others?

In revised form, the author states:

LATE and MTE are similar, except that LATE examines the
difference in outcomes for individuals with different average
treatment probability whereas MTE examines the derivative.
More specifically, MTE aims to answer what is the average
effect for people who are just indifferent between receiving treatment
or not at a given value of the instrument.

The use of "marginally" and "indifferent" is key and what it specifically implies in this context eludes me. I can’t find an explanation for what these terms imply here.

Generally, I am used to thinking about the marginal effect as the change in outcome with a one-unit change in the covariate of interest (discrete variable) or the instantaneous change (continuous variable), and about indifference in terms of indifference curves (consumer theory).

Aakvik et al. (2005) state:

MTE gives the average effect for persons who are indifferent between
participating or not for a given value of the instrument … [MTE] is
the average effect of participating in the program for people who are
on the margin of indifference between participation in the program
$$D=1$$ or not $$D=0$$ if the instrument is externally set … In brief,
MTE identifies the effect of an intervention on those induced to
change treatment states by the intervention

While Cornelissen et al. (2016) write:

… MTE is identified by the derivative of the outcome with respect to
the change in the propensity score

From what I gather the MTE is, then, the change in outcome with the change in the probability of receiving treatment, although I am not sure if this is correct. If it is correct I am not sure how to argue for policy or clinical relevance.

Example

To understand the mechanics and interpretation of MTE, I have set up a simple example that starts with the MTE estimator:

$$MTE(X=x, U_{D}=p) = \frac{\partial E(Y \mid X=x, P(Z)=p)}{\partial p}$$

Where $$X$$ is covariates of interest, $$U_{D}$$ is the "unobserved distaste for treatment" (another term frequently used but not explained at length), $$Y$$ is the outcome, and $$P(Z)$$ is the probability of treatment (propensity score). I apply this to the effect of college on earnings.

We want to estimate the MTE of college ($$D=(0,1)$$) on earnings ($$Y>0$$), using the continuous variable distance to college ($$Z$$) as the instrument. We start by obtaining the propensity score $$P(Z)$$, which I read as equal to the predicted value of treatment from the standard first stage in 2SLS:

$$D = \alpha + \beta Z + \epsilon$$

$$= \hat{D} = P(Z)$$

Now, to understand how to specifically estimate MTE, it would be helpful to think of the MTE for a specific set of observations defined by specific values of $$X$$ and $$P(Z)$$. Suppose there is only one covariate ($$X$$) necessary to condition on and that for the specific subset at hand we have $$X=5$$ and $$P(Z)=.6$$. Consequently, we have

$$MTE(5, .6) = \frac{\partial E(Y \mid X=5, P(Z)=.6)}{\partial .6}$$

Suppose further that $$Y$$ for the subset of observations defined by $$(X=5,P(Z)=.6)$$ is 15000,

$$MTE(5, .6) = \frac{\partial 15000}{\partial .6}$$

Question

My understanding of this partial derivative is that the current setup is invalid, and substituting $$\partial .6$$ with $$\partial p$$ would simply result in 0, as it would be the derivative of a constant. I therefore wonder whether anyone has input on where I went wrong, and how I might arrive at the MTE for this simple example.

As for the interpretation, I would interpret the MTE as the change in earnings with a marginal increase in the probability of taking college education among the subset defined by $$(X=5,P(Z)=.6)$$.
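One way to see what the derivative in the MTE formula operates on is to treat $$E(Y \mid X=x, P(Z)=p)$$ as a function of $$p$$ rather than as a single number. The sketch below does this in Python with an invented quadratic for the conditional mean; the functional form and its coefficients are assumptions, chosen only so that the value at $$(X=5, P(Z)=.6)$$ matches the 15000 used in the example.

```python
def expected_earnings(p, x=5):
    """Hypothetical E[Y | X = x, P(Z) = p]; the quadratic shape is an assumption."""
    return 9280 + 400 * x + 8000 * p - 3000 * p ** 2


def mte(p, x=5, h=1e-4):
    """Central-difference derivative of the conditional mean with respect to p."""
    return (expected_earnings(p + h, x) - expected_earnings(p - h, x)) / (2 * h)


print("E[Y | X=5, P=0.6] =", round(expected_earnings(0.6), 2))
for p in (0.2, 0.4, 0.6, 0.8):
    print("MTE at p = %.1f: %.1f" % (p, mte(p)))
```

With this assumed curve the derivative at $$p = .6$$ is $$8000 - 6000 \times .6 = 4400$$. The single value 15000 is one point on the curve, and the slope comes from how the conditional mean changes across nearby values of the propensity score.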

Get this bounty!!!

## #StackBounty: #r #categorical-data #interaction #instrumental-variables #2sls A 2SLS when the instrumented variable has two interaction…

### Bounty: 50

I am using `ivreg` and `ivmodel` in `R` to apply a 2SLS.

I would like to instrument one variable, namely $$x_1$$, present in two interaction terms. In this example $$x_1$$ is a factor variable. The regression is specified in this manner because the ratio between $$a$$ and $$b$$ is of importance.

$$y = ax_1 x_2 + bx_1x_3 + cx_4 + e$$

For this instrumented variable I have two instruments, $$z_1$$ and $$z_2$$. For both, the following causal diagram is applicable (Z only has an indirect effect on Y through X). What is the correct way to instrument $$x_1$$ in this setting?

# In the data

Translated to some (fake) sample data the problem looks like:

$$happiness = a(factor:income) + b(factor:sales) + c(educ) + e$$

which corresponds to

$$y = a x_1 x_2 + b x_1 x_3 + c x_4 + e$$

Where the instrument $$z_1$$ is `urban` and $$z_2$$ is `size`. This is where I start to get confused about how to write the regression.

# For the first stage:

What is my dependent variable here?

# For the second stage, should I do:

$$happiness = a(urban:income) + b(urban:sales) + c(educ) + e$$

$$happiness = a(size:income) + b(size:sales) + c(educ) + e$$

Or should I just do:

$$happiness = urban(a:income + b:sales) + c(educ) + e$$

$$happiness = size(a:income + b:sales) + c(educ) + e$$

In either case, how should I specify this in `R`?

```r
library(data.table)
library(ivmodel)
library(AER)

# Building blocks for the fake panel
panelID <- c(1:50)
year    <- c(2001:2010)
country <- c("NLD", "BEL", "GER")
urban   <- c("A", "B", "C")
indust  <- c("D", "E", "F")
sizes   <- c(1, 2, 3, 4, 5)
n       <- 2                       # observations per panel unit
n_rows  <- length(panelID) * n     # 100 rows in total

set.seed(123)
# Vectors shorter than n_rows are repeated explicitly with rep_len()
DT <- data.table(
  panelID      = rep(sample(panelID), each = n),
  country      = rep(sample(country, length(panelID), replace = TRUE), each = n),
  year         = c(replicate(length(panelID), sample(year, n))),
  some_NA      = rep_len(sample(0:5, 6), n_rows),
  Factor       = rep_len(sample(0:5, 6), n_rows),
  industry     = rep(sample(indust, length(panelID), replace = TRUE), each = n),
  urbanisation = rep(sample(urban, length(panelID), replace = TRUE), each = n),
  size         = rep(sample(sizes, length(panelID), replace = TRUE), each = n),
  income       = round(runif(100) / 10, 2),
  Y_Outcome    = rep_len(round(rnorm(10, 100, 10), 2), n_rows),
  sales        = rep_len(round(rnorm(10, 10, 10), 2), n_rows),
  happiness    = rep_len(sample(10, 10), n_rows),
  Sex          = rep_len(round(rnorm(10, 0.75, 0.3), 2), n_rows),
  Age          = sample(100, 100),
  educ         = rep_len(round(rnorm(10, 0.75, 0.3), 2), n_rows)
)
DT[, uniqueID := .I]   # creates a unique row ID
DT <- as.data.frame(DT)
```

To make it slightly easier for someone who is not familiar with the packages to help, I have added what the calls for the two packages I use look like.

The structure of the second stage of `ivreg` is as follows:

```r
# Column names follow the sample data above (happiness, Factor, urbanisation)
second_stage <- ivreg(happiness ~ Factor:income + Factor:sales + educ |
                        urbanisation:income + urbanisation:sales + educ,
                      data = DT)
```

The structure for `ivmodel` is:

```r
# Column names follow the sample data above (happiness, Factor, urbanisation)
second_stage <- ivmodel(Y = DT$happiness, D = DT$Factor,
                        Z = DT[, c("urbanisation", "size")],
                        X = DT$educ, na.action = na.omit)
```

Any help with figuring out how to do this properly would be greatly appreciated!

Get this bounty!!!

## #StackBounty: #large-data #instrumental-variables #hausman Interpretation of the Hausman test (overidentification in relation to IV'…

### Bounty: 50

I am using survey data with a huge number of observations, such as the World Values Survey. Large sample sizes are obviously very nice, but I have encountered some downsides as well.

To give an example, in almost every econometric model I specify, about 90% of the variables are highly significant. So I will have to decide whether, in addition to an estimate being statistically significant, it is also economically significant, which is not always an easy thing to do.

The biggest issue, however, is that when resorting to instrumental variables, the Hausman test for overidentification is always very, very, very significant. On this point, see THIS POST.

How do I deal with this consequence of large sample sizes?

The only thing I can think of is to reduce the sample size. This, however, seems a very arbitrary way to get the test statistic down.
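The mechanism behind "everything is significant" can be shown with a generic z-test. This sketch is not the Hausman or any overidentification statistic; it only assumes a fixed, tiny deviation from the null (0.01 standard deviations, an invented number) and shows how the test statistic scales with the sample size.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(deviation, sd, n):
    """z-statistic for a mean that is off the null by a fixed deviation."""
    return deviation / (sd / sqrt(n))


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


DEVIATION = 0.01   # hypothetical, economically negligible departure from the null
SD = 1.0

for n in (100, 10_000, 1_000_000):
    z = z_statistic(DEVIATION, SD, n)
    print("n = %9d   z = %6.2f   p = %.4g" % (n, z, two_sided_p(z)))
```

The statistic grows with the square root of the sample size, so the same negligible deviation goes from undetectable at 100 observations to overwhelmingly rejected at a million. Dropping observations lowers the statistic without changing the size of the underlying deviation.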

Get this bounty!!!
