# #StackBounty: #r #categorical-data #interaction #instrumental-variables #2sls A 2SLS when the instrumented variable has two interaction…

### Bounty: 50

I am using `ivreg` and `ivmodel` in `R` to apply a 2SLS.

I would like to instrument one variable, namely $$x_1$$, present in two interaction terms. In this example $$x_1$$ is a factor variable. The regression is specified in this manner because the ratio between $$a$$ and $$b$$ is of importance.

$$y = ax_1 x_2 + bx_1x_3 + cx_4 + e$$

I have two instruments for this variable, $$z_1$$ and $$z_2$$. The same causal diagram applies to both: Z affects Y only indirectly, through X. What is the correct way to instrument $$x_1$$ in this setting?

# In the data

Translated to some (fake) sample data the problem looks like:

$$happiness = a(factor:income) + b(factor:sales) + c(educ) + e$$
$$=$$
$$(y = ax_1 x_2 + bx_1x_3 + cx_4 + e)$$

Here the instrument $$z_1$$ is `urban` and $$z_2$$ is `size`. This is where I get confused about how to write the regression.

# For the first stage:

What is my dependent variable here?
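To make the question concrete, here is a sketch of one textbook convention. It is an assumption on my part, not something established in this post: each interaction that contains $$x_1$$ is treated as its own endogenous regressor, and the instruments are interacted with the same exogenous variables. This only makes sense if $$x_2$$ and $$x_3$$ are themselves exogenous, and it treats $$x_1$$ as a single numeric or dummy variable. The coefficients $$\pi$$ and the errors $$v_1$$, $$v_2$$ are notation introduced here for illustration. Under that convention there would be two first-stage equations, one per interaction, each using every instrument and the exogenous control:

$$x_1 x_2 = \pi_{10} + \pi_{11} z_1 x_2 + \pi_{12} z_2 x_2 + \pi_{13} z_1 x_3 + \pi_{14} z_2 x_3 + \pi_{15} x_4 + v_1$$
$$x_1 x_3 = \pi_{20} + \pi_{21} z_1 x_2 + \pi_{22} z_2 x_2 + \pi_{23} z_1 x_3 + \pi_{24} z_2 x_3 + \pi_{25} x_4 + v_2$$

The second stage would then replace $$x_1 x_2$$ and $$x_1 x_3$$ in the original equation with their fitted values. Whether this is the right reading for a factor $$x_1$$ is exactly what I am unsure about.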

# For the second stage, should I do:

$$happiness = a(urban:income) + b(urban:sales) + c(educ) + e$$
$$happiness = a(size:income) + b(size:sales) + c(educ) + e$$

Or should I just do:

$$happiness = urban(a:income+b:sales) + c(educ) + e$$
$$happiness = size(a:income+b:sales) + c(educ) + e$$

Either way, how should I specify this in `R`?

``````
library(data.table)
library(ivmodel)
library(AER)
panelID = c(1:50)
year= c(2001:2010)
country = c("NLD", "BEL", "GER")
urban = c("A", "B", "C")
indust = c("D", "E", "F")
sizes = c(1,2,3,4,5)
n <- 2
set.seed(123)
DT <- data.table(panelID = rep(sample(panelID), each = n),
country = rep(sample(country, length(panelID), replace = T), each = n),
year = c(replicate(length(panelID), sample(year, n))),
some_NA = sample(0:5, 6),
Factor = sample(0:5, 6),
industry = rep(sample(indust, length(panelID), replace = T), each = n),
urbanisation = rep(sample(urban, length(panelID), replace = T), each = n),
size = rep(sample(sizes, length(panelID), replace = T), each = n),
income = round(runif(100)/10,2),
Y_Outcome= round(rnorm(10,100,10),2),
sales= round(rnorm(10,10,10),2),
happiness = sample(10,10),
Sex = round(rnorm(10,0.75,0.3),2),
Age = sample(100,100),
educ = round(rnorm(10,0.75,0.3),2))
DT [, uniqueID := .I]                                                         # Creates a unique ID
DT <- as.data.frame(DT)
``````

To make it easier for anyone unfamiliar with these packages to help, here is what the call looks like in each of them.

The structure of the second stage of `ivreg` is as follows:

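``````
# Column names as they appear in DT: happiness (y), Factor (x1), urbanisation (z1)
second_stage <- ivreg(happiness ~ Factor:income + Factor:sales + educ | urbanisation:income + urbanisation:sales + educ, data = DT)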
``````

The structure for `ivmodel` is:

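``````
# Column names as they appear in DT: happiness (y), Factor (x1), urbanisation and size (z1, z2)
second_stage <- ivmodel(Y = DT$happiness, D = DT$Factor, Z = DT[, c("urbanisation", "size")], X = DT$educ, na.action = na.omit)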
``````
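For comparison, here is a minimal base-R sketch of a manual two-stage fit on simulated data, so that it is clear what each stage regresses on what. Everything in it is an assumption made for illustration: the data-generating process, the true coefficients (a = 1, b = 2), a numeric rather than factor `x1`, and the convention of treating each interaction as a separate endogenous regressor with the instruments interacted in the same way. The names `sim`, `w1`, `w2`, `z1x2` and so on are hypothetical and do not exist in `DT`.

```r
set.seed(42)
n  <- 2000
z1 <- rnorm(n)                      # instrument 1
z2 <- rnorm(n)                      # instrument 2
x2 <- rnorm(n, mean = 1)            # exogenous, interacted with x1
x3 <- rnorm(n, mean = 1)            # exogenous, interacted with x1
x4 <- rnorm(n)                      # exogenous control
u  <- rnorm(n)                      # unobserved confounder

# x1 is endogenous: it depends on u, which also enters y
x1 <- 0.8 * z1 + 0.5 * z2 + u + rnorm(n)
y  <- 1 * x1 * x2 + 2 * x1 * x3 + 0.5 * x4 + 2 * u + rnorm(n)

sim <- data.frame(y, x4,
                  w1   = x1 * x2, w2   = x1 * x3,   # endogenous interactions
                  z1x2 = z1 * x2, z2x2 = z2 * x2,   # interacted instruments
                  z1x3 = z1 * x3, z2x3 = z2 * x3)

# First stage: one regression per endogenous interaction,
# each on all interacted instruments plus the exogenous control
fs1 <- lm(w1 ~ z1x2 + z2x2 + z1x3 + z2x3 + x4, data = sim)
fs2 <- lm(w2 ~ z1x2 + z2x2 + z1x3 + z2x3 + x4, data = sim)
sim$w1_hat <- fitted(fs1)
sim$w2_hat <- fitted(fs2)

# Second stage: the outcome on the fitted interactions and the control.
# The point estimates are the 2SLS estimates; the standard errors that
# lm() reports here are NOT valid 2SLS standard errors.
ss <- lm(y ~ w1_hat + w2_hat + x4, data = sim)
coef_2sls <- coef(ss)
print(coef_2sls)
```

Under the same assumptions, `ivreg(y ~ w1 + w2 + x4 | z1x2 + z2x2 + z1x3 + z2x3 + x4, data = sim)` from `AER` should return the same point estimates with proper standard errors. I do not know whether this carries over directly to a factor `x1`.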

Any help with figuring out how to do this properly would be greatly appreciated!
