#StackBounty: #r #mixed-model #fixed-effects-model #categorical-encoding The difference in interpretation between a country and a year …

Bounty: 50

I am trying to expand my knowledge about the different interpretations of combinations of fixed effects.

I am using a pooled cross-section dataset with observations at the firm level. The dataset spans multiple countries and 2 years (2005 and 2010).

EDIT: Please see sample data below

My question is very simple. In this scenario, what is the difference in interpretation between including country and year fixed effects, country-year fixed effects, or both?

Is there a case to be made for each option when taking into account my dataset?

I read the following on another site:

When you interact state and year dummies (i.e. when you include state, year, and state*year in the regression, which by the way is the same as creating state-year dummies and including them in the regression), you are assuming that the unobserved state-level heterogeneity varies over time. Also, you are assuming the time effect to vary by state. If you include state and year separately and no interaction, you are assuming that the unobserved state-level heterogeneity is constant over time.
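
The equivalence claimed in the quote can be checked on toy data. The sketch below is only an illustration under assumed data (three hypothetical countries, two years, made-up variables `x` and `y`), not the regression from my dataset. It fits the model once with `country * year` and once with a single factor `cy` that has one level per country-year cell, then compares the two fits.

```r
# Toy data: 3 hypothetical countries x 2 years, 4 firms per cell.
set.seed(1)
d <- expand.grid(firm = 1:4,
                 year = factor(c(2005, 2010)),
                 country = factor(c("A", "B", "C")))
d$x <- rnorm(nrow(d))
d$y <- 1 + 0.5 * d$x + rnorm(nrow(d))

# Specification 1: country, year and their interaction.
fit_star <- lm(y ~ x + country * year, data = d)

# Specification 2: one dummy per country-year cell.
d$cy <- interaction(d$country, d$year)
fit_cell <- lm(y ~ x + cy, data = d)

# Both specifications span the same set of cell dummies,
# so fitted values and the slope on x should agree.
same_fit <- isTRUE(all.equal(fitted(fit_star), fitted(fit_cell)))
same_slope <- isTRUE(all.equal(unname(coef(fit_star)["x"]),
                               unname(coef(fit_cell)["x"])))
print(c(same_fit = same_fit, same_slope = same_slope))
```

The individual dummy coefficients differ between the two fits because the reference categories differ, but the fit and the coefficient on `x` are the same.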

Reading this, I get the feeling that it is always better to include the interactions. But in statistics, nothing seems to come for free. So what is the downside here?

Please see my thought process below (and please correct me if I am wrong):

  1. If I add a country dummy, I account for static differences per country. In other words, I control for time-constant omitted variable bias.
  2. If I add a year dummy, I account for trends (I would say world trends in this case).
  3. If I add a country-year dummy, I am controlling for trends that are country specific.
  4. If I add a country dummy, a year dummy and a country-year dummy, I am doing all of this at once?
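
One way to compare the options in the list above is to count how many parameters each set of dummies uses. A minimal sketch, assuming a balanced grid of 10 countries and 2 years as in the sample data further down (`g`, `design_rank` and the `rank_*` names are hypothetical helpers, not part of any package):

```r
# One row per country-year cell: 10 countries x 2 years = 20 cells.
g <- expand.grid(country = factor(LETTERS[1:10]),
                 year = factor(c(2005, 2010)))

# Rank of the design matrix = number of parameters that can be estimated.
design_rank <- function(f) qr(model.matrix(f, data = g))$rank

rank_additive <- design_rank(~ country + year)   # options 1 and 2 together
rank_cells    <- design_rank(~ country:year)     # option 3
rank_both     <- design_rank(~ country * year)   # option 4

print(c(additive = rank_additive, cells = rank_cells, both = rank_both))
```

Under these assumptions the additive version uses 11 parameters (intercept, 9 country dummies, 1 year dummy), while options 3 and 4 both use 20, one per country-year cell. So in this setup, adding country and year dummies on top of a full set of country-year dummies does not add anything.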

For 1 and 2 I am pretty much okay.

By point 3 I begin to wonder: do I need to? Should I always include this? At what cost do I include this country-year dummy?
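
One possible cost can be sketched directly: a regressor that only varies at the country-year level (as `taxrate` does in the sample data below) is a linear combination of the country-year dummies. The toy data here are assumed for illustration only (four hypothetical countries, made-up `sales` and `taxrate` values), and plain `lm` is used instead of `ivreg` to keep the example small.

```r
set.seed(2)
d <- expand.grid(firm = 1:5,
                 year = factor(c(2005, 2010)),
                 country = factor(LETTERS[1:4]))

# One factor level per country-year cell.
d$cy <- interaction(d$country, d$year, drop = TRUE)

# Hypothetical tax rate: a single value per country-year cell.
rate_by_cell <- runif(nlevels(d$cy), 10, 40)
d$taxrate <- rate_by_cell[as.integer(d$cy)]
d$sales <- 10 + 0.3 * d$taxrate + rnorm(nrow(d))

# Country and year dummies only: taxrate still varies within a country
# over time in a way the dummies do not capture, so it can be estimated.
fit_additive <- lm(sales ~ country + year + taxrate, data = d)

# Country-year dummies: taxrate is constant within each cell, so it is
# collinear with the dummies. taxrate is placed last so that lm() marks
# it (rather than one of the dummies) as the aliased coefficient.
fit_cells <- lm(sales ~ cy + taxrate, data = d)

print(coef(fit_additive)["taxrate"])
print(coef(fit_cells)["taxrate"])
```

In the second fit the coefficient on `taxrate` comes back as `NA`. Which coefficient R flags as aliased depends on the column order, so with `taxrate` listed first one of the dummies would be dropped instead, but the underlying problem is the same.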

If I have a country-year dummy without a country and year dummy, does that make sense?

Or should I therefore put them all in? That brings me to point 4.

Data

panelID= c(1:50)
year= c(2005, 2010)
country = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
urban = c("A", "B", "C")
indust = c("D", "E", "F")
sizes = c(1,2,3,4,5)
n <- 2
library(data.table)
library(dplyr)
library(AER)  # provides ivreg(), used in the models below
set.seed(123)
DT <- data.table(   country = rep(sample(country, length(panelID), replace = TRUE), each = n),
                    year = c(replicate(length(panelID), sample(year, n))),
                    sales= round(rnorm(10,10,10),2),
                    industry = rep(sample(indust, length(panelID), replace = TRUE), each = n),
                    urbanisation = rep(sample(urban, length(panelID), replace = TRUE), each = n),
                    size = rep(sample(sizes, length(panelID), replace = TRUE), each = n))
DT <- DT %>%
  group_by(country) %>%
  mutate(base_rate = as.integer(runif(1, 12.5, 37.5))) %>%
  group_by(country, year) %>%
  mutate(taxrate = base_rate + as.integer(runif(1, -2.5, +2.5)))
DT <- DT %>%
  group_by(country, year) %>%
  mutate(vote = sample(c(0, 1), 1),
         votewon = ifelse(vote == 1, sample(c(0, 1), 1), 0))

# No interaction

summary(ivreg(sales ~ taxrate + as.factor(size) + as.factor(urbanisation) + country + as.factor(year) | 
as.factor(votewon) + as.factor(size) + as.factor(urbanisation) + country + as.factor(year), data=DT))

summary(ivreg(sales ~ taxrate + as.factor(size) + as.factor(urbanisation) + as.factor(vote) + country + as.factor(year) 
| as.factor(votewon) + as.factor(size) + as.factor(urbanisation) + as.factor(vote) + country + as.factor(year), data=DT))

# Interaction

summary(ivreg(sales ~ taxrate + as.factor(size) + as.factor(urbanisation) + country:as.factor(year) | 
as.factor(votewon) + as.factor(size) + as.factor(urbanisation) + country:as.factor(year), data=DT))

summary(ivreg(sales ~ taxrate + as.factor(size) + as.factor(urbanisation) + as.factor(vote) + country:as.factor(year) 
| as.factor(votewon) + as.factor(size) + as.factor(urbanisation) + as.factor(vote) + country:as.factor(year), data=DT))

# Both

summary(ivreg(sales ~ taxrate + as.factor(size) + as.factor(urbanisation) + country*as.factor(year) | 
as.factor(votewon) + as.factor(size) + as.factor(urbanisation) + country*as.factor(year), data=DT))

summary(ivreg(sales ~ taxrate + as.factor(size) + as.factor(urbanisation) + as.factor(vote) + country*as.factor(year) 
| as.factor(votewon) + as.factor(size) + as.factor(urbanisation) + as.factor(vote) + country*as.factor(year), data=DT))

