*Bounty: 50*

I am using `ivreg` and `ivmodel` in `R` to apply 2SLS.

I would like to instrument one variable, $x_1$, which appears in two interaction terms. In this example, $x_1$ is a factor variable. The regression is specified this way because the ratio between $a$ and $b$ is of importance.

$$y = ax_1 x_2 + bx_1x_3 + cx_4 + e$$

For this instrumented variable I have two instruments, $z_1$ and $z_2$. Both satisfy the usual instrumental-variable assumption: $Z$ affects $Y$ only indirectly, through $X$.

What is the correct way to instrument $x_1$ in this setting?

# In the data

Translated to some (fake) sample data the problem looks like:

$$happiness = a(factor:income) + b(factor:sales) + c(educ) + e$$

which corresponds to

$$y = ax_1 x_2 + bx_1x_3 + cx_4 + e$$

Here the instrument $z_1$ is `urban` (the `urbanisation` column in the data) and $z_2$ is `size`. This is where I get confused about how to write the regression.

# For the first stage:

What is my dependent variable here?
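One common textbook treatment, stated here as an assumption rather than a settled answer for this model, is to regard each interaction containing $x_1$ as its own endogenous regressor and to use the instruments interacted with the same exogenous variables. With $x_1$ treated as numeric, and with $\pi_{jk}$ and $v_j$ as illustrative symbols that do not appear elsewhere in this question, the first stage would then consist of two regressions:

$$x_1x_2 = \pi_{11}z_1x_2 + \pi_{12}z_2x_2 + \pi_{13}z_1x_3 + \pi_{14}z_2x_3 + \pi_{15}x_4 + v_1$$

$$x_1x_3 = \pi_{21}z_1x_2 + \pi_{22}z_2x_2 + \pi_{23}z_1x_3 + \pi_{24}z_2x_3 + \pi_{25}x_4 + v_2$$

Under that reading, the dependent variables of the first stage would be the two interaction terms rather than $x_1$ itself.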

# For the second stage, should I do:

$$happiness = a(urban:income) + b(urban:sales) + c(educ) + e$$

$$happiness = a(size:income) + b(size:sales) + c(educ) + e$$

Or should I just do:

$$happiness = urban*(a:income+b:sales) + c(educ) + e$$
$$happiness = size*(a:income+b:sales) + c(educ) + e$$

Either way, how should I specify this in `R`?

```
library(data.table)
library(ivmodel)
library(AER)

# Building blocks for the fake panel data
panelID <- 1:50
year <- 2001:2010
country <- c("NLD", "BEL", "GER")
urban <- c("A", "B", "C")
indust <- c("D", "E", "F")
sizes <- c(1, 2, 3, 4, 5)
n <- 2  # observations per panel member

set.seed(123)
DT <- data.table(
  panelID = rep(sample(panelID), each = n),
  country = rep(sample(country, length(panelID), replace = TRUE), each = n),
  year = c(replicate(length(panelID), sample(year, n))),
  some_NA = sample(0:5, 6),
  Factor = sample(0:5, 6),
  industry = rep(sample(indust, length(panelID), replace = TRUE), each = n),
  urbanisation = rep(sample(urban, length(panelID), replace = TRUE), each = n),
  size = rep(sample(sizes, length(panelID), replace = TRUE), each = n),
  income = round(runif(100) / 10, 2),
  Y_Outcome = round(rnorm(10, 100, 10), 2),
  sales = round(rnorm(10, 10, 10), 2),
  happiness = sample(10, 10),
  Sex = round(rnorm(10, 0.75, 0.3), 2),
  Age = sample(100, 100),
  educ = round(rnorm(10, 0.75, 0.3), 2)
)
DT[, uniqueID := .I]  # creates a unique ID
DT <- as.data.frame(DT)
```

To make it easier for anyone unfamiliar with these packages to help, I have added what the calls to the two packages look like.

The structure of the second stage of `ivreg` is as follows:

```
second_stage <- ivreg(happiness ~ Factor:income + Factor:sales + educ | urbanisation:income + urbanisation:sales + educ, data = DT)
```

The structure for `ivmodel` is:

```
second_stage <- ivmodel(Y = DT$happiness, D = DT$Factor, Z = DT[, c("urbanisation", "size")], X = DT$educ, na.action = na.omit)
```
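To make the question concrete, here is a minimal sketch of how the two stages might be written out by hand in base R. It rests on assumptions that are not confirmed for my model: $x_1$ is treated as numeric, each interaction containing $x_1$ is treated as its own endogenous regressor, and the instruments are interacted with the same exogenous variables. The data frame `sim`, the variable `u`, and all coefficient values are hypothetical and unrelated to `DT`.

```r
# Hypothetical simulated data (not DT): x1 is endogenous because it shares
# the unobserved confounder u with y; z1 and z2 are the instruments.
set.seed(1)
n  <- 5000
z1 <- rnorm(n)
z2 <- rnorm(n)
x2 <- rnorm(n, mean = 1)
x3 <- rnorm(n, mean = 1)
x4 <- rnorm(n)
u  <- rnorm(n)
x1 <- 0.8 * z1 + 0.5 * z2 + u + rnorm(n)
y  <- 2 * x1 * x2 + 1 * x1 * x3 + 0.5 * x4 + 2 * u + rnorm(n)
sim <- data.frame(y, x1, x2, x3, x4, z1, z2)

# First stage: one regression per endogenous interaction, each on the full
# instrument set (z interacted with x2 and x3) plus the exogenous x4.
fs_12 <- lm(I(x1 * x2) ~ z1:x2 + z2:x2 + z1:x3 + z2:x3 + x4, data = sim)
fs_13 <- lm(I(x1 * x3) ~ z1:x2 + z2:x2 + z1:x3 + z2:x3 + x4, data = sim)
sim$x1x2_hat <- fitted(fs_12)
sim$x1x3_hat <- fitted(fs_13)

# Second stage: replace the interactions with their first-stage fitted values.
# Only the point estimates are meaningful; the standard errors reported by
# lm() here are not valid 2SLS standard errors.
ss <- lm(y ~ x1x2_hat + x1x3_hat + x4, data = sim)
print(coef(ss))

# Under the same assumptions, the corresponding one-step call would be:
# AER::ivreg(y ~ x1:x2 + x1:x3 + x4 | z1:x2 + z2:x2 + z1:x3 + z2:x3 + x4,
#            data = sim)
```

In this simulation the second-stage coefficients on `x1x2_hat` and `x1x3_hat` should land near the values used to generate `y`, which is the behaviour I would hope to reproduce on the real data if this specification is the right one.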

Any help with figuring out how to do this properly would be greatly appreciated!