*Bounty: 50*

Define the lasso estimate $$\hat\beta^\lambda = \operatorname*{argmin}_{\beta \in \mathbb{R}^p} \frac{1}{2n} \|y - X\beta\|_2^2 + \lambda \|\beta\|_1,$$ where the $i^{\text{th}}$ row $x_i \in \mathbb{R}^p$ of the design matrix $X \in \mathbb{R}^{n \times p}$ is a vector of covariates for explaining the stochastic response $y_i$ (for $i = 1, \dots, n$).

We know that for $\lambda \geq \frac{1}{n} \|X^T y\|_\infty$, the lasso estimate is $\hat\beta^\lambda = 0$. (See, for instance, *Lasso and Ridge tuning parameter scope*.) In other notation, this is expressing that $\lambda_{\max} = \frac{1}{n} \|X^T y\|_\infty$.
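This threshold can be checked numerically. The sketch below is illustrative and not from the original post: it uses a plain cyclic coordinate-descent solver (`lasso_cd`, written here for this example) on simulated data, and confirms that the fit is identically zero at $\lambda = \lambda_{\max}$ but not just below it:

```python
import numpy as np

def soft_threshold(z, t):
    # S(z, t) = sign(z) * max(|z| - t, 0)
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=400):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam ||b||_1
    (no intercept); a minimal sketch, not a production solver."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # partial residual excluding coordinate j
            r_j = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r_j / n
            beta[j] = soft_threshold(z, lam) / col_sq[j]
    return beta

rng = np.random.default_rng(0)
n, p = 40, 6
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)

lam_max = np.max(np.abs(X.T @ y)) / n
b_at = lasso_cd(X, y, lam_max)           # all coefficients zero
b_below = lasso_cd(X, y, 0.9 * lam_max)  # at least one nonzero coefficient
```

Starting from $\beta = 0$, every soft-threshold update at $\lambda = \lambda_{\max}$ returns zero, so the solver never moves, consistent with the claim above.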

On the other hand, under a continuous distribution on $y$, we know that for $\lambda$ very small the lasso estimate $\hat\beta^\lambda$ has no zero entries almost surely. In other words, when there is little regularization, the shrinkage doesn't zero out any component. What is the value of $\lambda$ at which a component of $\hat\beta^\lambda$ first becomes zero? That is, what is $$\lambda^{(1)}_{\min} = \min_{\exists j \ \text{s.t.} \ \hat\beta^\lambda_j = 0 \ \text{and} \ \hat\beta^\mu_j \ne 0 \ \forall \mu < \lambda} \lambda$$ equal to, as a function of $X$ and $y$? It may be easier to compute $$\lambda^{(2)}_{\min} = \sup_{\hat\beta^\lambda_j \ne 0 \ \forall j} \lambda,$$ recognizing that this change point doesn't have a unique interpretation, since nonzero components don't have to "stay" nonzero as $\lambda$ increases.
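Even without a closed form, $\lambda^{(2)}_{\min}$ can be estimated by scanning a $\lambda$ grid downward from $\lambda_{\max}$ and recording the largest grid value at which every component is nonzero. The sketch below is a crude numerical illustration on simulated data (the solver `lasso_cd`, the grid, and the tolerance are all choices made for this example, not part of the question):

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=400):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r_j / n
            beta[j] = soft_threshold(z, lam) / col_sq[j]
    return beta

rng = np.random.default_rng(0)
n, p = 40, 6
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)

lam_max = np.max(np.abs(X.T @ y)) / n
# log-spaced grid descending from lam_max
lams = lam_max * np.geomspace(1.0, 1e-3, 60)
all_nonzero = [np.all(np.abs(lasso_cd(X, y, lam)) > 1e-8) for lam in lams]
# first True (grid is descending) = largest lambda with no zero component
lam2_min_hat = lams[np.argmax(all_nonzero)]
```

This only brackets $\lambda^{(2)}_{\min}$ to grid resolution; an exact value would come from the knots of the piecewise-linear lasso path (e.g. via LARS), where components enter or leave the active set.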

It seems that neither of these is available in closed form, since otherwise lasso computational packages would presumably take advantage of it when determining the tuning parameter grid. Be that as it may, what can be said about them?
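For context, glmnet-style packages do exploit the closed form for $\lambda_{\max}$: the default grid is a log-spaced sequence from $\lambda_{\max}$ down to $\varepsilon \, \lambda_{\max}$, with no attempt to locate the lower change point. A minimal sketch of that construction (the value of `lam_max` here is a placeholder; `eps` and `K` mirror glmnet's `lambda.min.ratio` and `nlambda` defaults):

```python
import numpy as np

lam_max = 2.0   # placeholder; in practice ||X^T y||_inf / n
eps, K = 1e-3, 100

# log-spaced, descending from lam_max to eps * lam_max
grid = lam_max * np.geomspace(1.0, eps, K)
```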
