#StackBounty: #causality #observational-study Under heterogeneous treatment effects, will the usual unconfoundedness assumption need mo…

Bounty: 50

Suppose that a set of covariates, $X_i$, follows a distribution that depends on another variable, $A_i$, for individuals $i \in \{1, \ldots, N\}$. For example, $X_i$ can be income, and $A_i$ can be age, coded as young, middle-aged, or elderly. Then, suppose that we have:

$$
X_i \mid A_i \overset{iid}{\sim} F
$$

Then, we say that $X_i$ is only i.i.d. within subsets defined by the age variable. Now let $Y_i(1), Y_i(0)$ denote the potential outcomes and $T_i$ the treatment indicator. Suppose that the joint distribution of the potential outcomes, treatment, and covariates is i.i.d. only within subsets defined by $A_i$, such that

$$
(Y_i(1), Y_i(0), T_i, X_i) \mid A_i \overset{iid}{\sim} F
$$

This scenario appears to hint at a heterogeneous treatment effect, since the distributions of the variables may differ depending on which age group is conditioned on. For example,

$$
\tau_i = E[Y_i(1) - Y_i(0)]
$$

where $\tau_i$ may differ from a single global $\tau$.

In such an example, I am wondering how the unconfoundedness assumption needs to be modified in an observational study. For example, if we assume that the following holds,

$$
(Y_i(1), Y_i(0)) \perp T_i \mid X_i
$$

will it be enough to identify $\tau_i$? In other words, if we have the joint distribution specification above, how will identification and estimation of the average treatment effect be impacted?
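One way to probe this setup numerically is a small simulation. The sketch below is only an illustration under assumptions that are not part of the question: three age groups, a binary covariate whose distribution depends on the group, a treatment probability that depends on the covariate alone, and group-specific effects. All names and parameter values (`tau_by_group`, `p_x_by_group`, the 0.3/0.7 propensities) are hypothetical. In this particular data-generating process, stratifying on $X_i$ within each age group recovers the group-specific effect, while the raw within-group difference in means does not.

```python
import numpy as np

def simulate(n=60_000, seed=0):
    """Draw (A, X, T, Y) with group-specific covariate distributions and effects."""
    rng = np.random.default_rng(seed)
    tau_by_group = np.array([1.0, 2.0, 3.0])  # hypothetical effects: young, middle-aged, elderly
    p_x_by_group = np.array([0.2, 0.5, 0.8])  # hypothetical P(X = 1 | A = a)
    a = rng.integers(0, 3, size=n)            # age group label A_i
    x = rng.binomial(1, p_x_by_group[a])      # covariate X_i, distribution depends on A_i
    t = rng.binomial(1, np.where(x == 1, 0.7, 0.3))  # treatment depends on X_i only
    y0 = x + rng.normal(size=n)               # potential outcome Y_i(0)
    y1 = y0 + tau_by_group[a]                 # potential outcome Y_i(1)
    y = np.where(t == 1, y1, y0)              # observed outcome
    return a, x, t, y, tau_by_group

def stratified_effect(x, t, y):
    """Average the within-X treated-minus-control contrasts over the distribution of X."""
    total = 0.0
    for value in np.unique(x):
        cell = x == value
        contrast = y[cell & (t == 1)].mean() - y[cell & (t == 0)].mean()
        total += cell.mean() * contrast
    return total

a, x, t, y, tau_by_group = simulate()
estimates, naive = {}, {}
for g in range(3):
    in_group = a == g
    # Adjusts for X within the age group, i.e. conditions on both X and A.
    estimates[g] = stratified_effect(x[in_group], t[in_group], y[in_group])
    # Unadjusted within-group difference in means, confounded by X.
    naive[g] = y[in_group & (t == 1)].mean() - y[in_group & (t == 0)].mean()

for g in range(3):
    print(g, tau_by_group[g], round(estimates[g], 3), round(naive[g], 3))
```

Note that the group-specific estimate here conditions on both $X_i$ and $A_i$; whether conditioning on $X_i$ alone suffices in general is exactly what the question asks, and the simulation does not settle it.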



#StackBounty: #terminology #observational-study #communication Term for observational study in which different cohorts are compared at …

Bounty: 50

I have a dataset tracking certain outcomes in school children, with all grades being sampled every year for a number of years. There are a number of study designs possible with such a dataset. As I understand it:

Cross-sectional designs would compare, say, different grades at a fixed point in time. For example, they could compare 2018 grade 7 students, 2018 grade 8 students and 2018 grade 9 students.

Longitudinal designs would compare a single cohort of students across time. For example, they could compare 2016 grade 6 students, 2017 grade 7 students and 2018 grade 8 students.

Is there an analogous term for a design which compares students in a given grade across calendar years?

For example, comparing 2016 grade 7 students, 2017 grade 7 students and 2018 grade 7 students. It is somewhat longitudinal because there is a temporal comparison happening, but it isn’t the same as the longitudinal design described above. Is there common terminology one could use to distinguish between the two?

(In this question ‘grade’ is used to refer to the year level of the student in school, not to a mark on academic assessment.)
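To make the three comparisons concrete, here is a minimal sketch that treats each sampled group as a hypothetical `(year, grade)` pair and selects the groups each design would compare. The function names are illustrative only and are not established terminology.

```python
# Every (calendar year, grade) combination sampled in a hypothetical dataset.
records = [(year, grade) for year in (2016, 2017, 2018) for grade in (6, 7, 8, 9)]

def cross_sectional(records, year):
    """Different grades at one fixed calendar year."""
    return [r for r in records if r[0] == year]

def cohort(records, entry_year, entry_grade):
    """One cohort followed over time: year minus grade stays constant."""
    return [r for r in records if r[0] - r[1] == entry_year - entry_grade]

def same_grade(records, grade):
    """One fixed grade compared across calendar years (the design being asked about)."""
    return [r for r in records if r[1] == grade]

print(cross_sectional(records, 2018))
print(cohort(records, 2016, 6))
print(same_grade(records, 7))
```

The cohort slice follows the diagonal of the year-by-grade table, whereas the design in question fixes a column: each year contributes a different set of students.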



#StackBounty: #mathematical-statistics #causality #observational-study In the superpopulation framework of causal inference, what is th…

Bounty: 100

In the textbook “Causal Inference for Statistics, Social, and Biomedical Sciences” by Imbens and Rubin, the following argument is made on pg. 39:

“In part of this text we view our sample of size $N$ as a random sample
from an infinite super-population. In that case we employ slightly
different formulations of the restrictions on the assignment
mechanism. Sampling from the super-population generates a joint
sampling distribution on the quadruple of unit-level variables
$(Y_i(0), Y_i(1), W_i, X_i)$, $i = 1, \ldots, N$. More explicitly, we assume the
$(Y_i(0), Y_i(1), W_i, X_i)$ are independently and identically distributed
draws from a distribution indexed by a global parameter.”

I am wondering why the i.i.d. part of this statement needs to be assumed. In the context of observational studies, what happens if 1) we have dependence, and 2) the tuples are not identically distributed?

Is this a general assumption that can be relaxed? Where do we really need it? Specifically:

1) What happens if we have dependence?

2) What happens if they are NOT identically distributed?
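One place where the independence part is commonly relied on is variance estimation. The sketch below is an illustration under assumptions that do not come from the textbook: outcomes share hypothetical cluster-level shocks, so units in the same cluster are dependent. It compares the standard error computed as if the draws were i.i.d. with the actual spread of the sample mean over repeated samples. It speaks only to point 1, not to point 2.

```python
import numpy as np

def mean_and_naive_se(rng, n_clusters=20, cluster_size=50):
    """Sample mean and i.i.d.-style standard error for clustered outcomes."""
    shocks = rng.normal(size=n_clusters)  # one shared shock per cluster
    noise = rng.normal(size=n_clusters * cluster_size)
    y = np.repeat(shocks, cluster_size) + noise  # dependent within clusters
    naive_se = y.std(ddof=1) / np.sqrt(y.size)   # formula that assumes independence
    return y.mean(), naive_se

rng = np.random.default_rng(0)
draws = np.array([mean_and_naive_se(rng) for _ in range(2000)])

empirical_sd = draws[:, 0].std(ddof=1)  # actual variability of the sample mean
average_naive_se = draws[:, 1].mean()   # what the i.i.d. formula reports on average

print(round(empirical_sd, 3), round(average_naive_se, 3))
```

In this setup the i.i.d. formula reports a much smaller standard error than the variability actually observed across samples, which suggests that inference, rather than the definition of the estimand, is where dependence bites first.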

