*Bounty: 50*

I am trying to manually implement a nonparametric regression estimator using local-linear approximation with a mixture of discrete and continuous regressors.

Consider a simple model:

$$y=f(xc,xd)$$

where $xc$ is continuous and $xd$ is discrete.

Say that I want to estimate this model nonparametrically. Which of the following two regressions is the correct one (assuming local-linear estimation)?

1:

$$y=a_0+a_1(xc-c)+e$$

2:

$$y=a_0+a_1(xc-c)+a_2\,xd+e$$

Assume that both models are estimated using the correct kernel weights, that $xd$ is a dummy variable, and that $c$ is the value of $xc$ at which the regression is evaluated.
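To be explicit about what I mean, the two candidates can be written as weighted least-squares problems at an evaluation point $(c,d)$. This is only a restatement in my own notation: $w_i(c,d)$ stands for whatever kernel weight is appropriate for observation $i$ (for example, a product of a continuous kernel in $xc$ and a discrete kernel in $xd$), and $d$ is the value of $xd$ at which I evaluate.

$$\text{(1)}\quad \min_{a_0,a_1}\ \sum_{i=1}^{n} w_i(c,d)\,\bigl[y_i-a_0-a_1(xc_i-c)\bigr]^2$$

$$\text{(2)}\quad \min_{a_0,a_1,a_2}\ \sum_{i=1}^{n} w_i(c,d)\,\bigl[y_i-a_0-a_1(xc_i-c)-a_2\,xd_i\bigr]^2$$

As I understand it, under (1) the estimate of $f(c,d)$ would be $\hat a_0$, while under (2) as written the fitted value at $(c,d)$ would be $\hat a_0+\hat a_2\,d$.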

I thought the correct model was (1), but npregress in Stata uses (2). Which one is correct?

Thank you

EDIT:

Perhaps a different way to ask the same question.

Say that you have three variables, y, xc (continuous) and xd (discrete), and that you want to estimate a nonparametric regression, using local-linear kernel estimation, for:

$$y=f(xc,xd)$$

Empirically, how would you estimate this model using WLS? Which one is the correct specification: equation 1 or equation 2 (assuming the weights are appropriately obtained)?
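To make the two candidates concrete, here is a minimal Python sketch of what I mean by each WLS specification. The kernel choices (a Gaussian kernel for xc and a Li-Racine-style kernel for xd that gives weight 1 to matching observations and `lam` otherwise), the bandwidths `h` and `lam`, and all function names are my own assumptions for illustration. This is not meant to describe what npregress does internally.

```python
import numpy as np

def kernel_weights(xc, xd, c, d, h, lam):
    """Product kernel weights at the evaluation point (c, d).

    Assumed kernels (illustrative only):
    - continuous part: Gaussian kernel with bandwidth h
    - discrete part: weight 1 if xd == d, else lam (0 <= lam <= 1)
    """
    u = (xc - c) / h
    k_cont = np.exp(-0.5 * u**2)
    k_disc = np.where(xd == d, 1.0, lam)
    return k_cont * k_disc

def wls(X, y, w):
    """Weighted least squares via least squares on sqrt(w)-scaled data.

    lstsq returns the minimum-norm solution if X is rank deficient,
    which can happen in specification 2 when lam == 0.
    """
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def local_linear_fits(y, xc, xd, c, d, h=0.3, lam=0.2):
    """Return the fitted value at (c, d) under specifications 1 and 2."""
    w = kernel_weights(xc, xd, c, d, h, lam)
    ones = np.ones_like(xc, dtype=float)

    # Specification 1: y = a0 + a1*(xc - c) + e
    X1 = np.column_stack([ones, xc - c])
    b1 = wls(X1, y, w)
    fit1 = b1[0]  # intercept is the fitted value at xc = c

    # Specification 2: y = a0 + a1*(xc - c) + a2*xd + e
    X2 = np.column_stack([ones, xc - c, xd])
    b2 = wls(X2, y, w)
    fit2 = b2[0] + b2[2] * d  # fitted value at xc = c, xd = d

    return fit1, fit2
```

One thing I noticed while writing this: with `lam = 0` only observations with xd equal to d receive positive weight, so the xd column is collinear with the constant and both specifications return the same fitted value. The two can only differ when `lam > 0`, which is the case I am asking about.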