## #StackBounty: #distributions #stochastic-processes #nonlinear-regression What distribution may electric vehicle battery capacity data f…

### Bounty: 50

I’m trying to work out the shape of the curve that describes electric vehicle battery degradation data (capacity as a function of cumulative travelled distance). The red line on the plot doesn’t seem to be a perfect fit. Is it a stretched exponential of some sort?

Since I do not fully understand the nature of the process, I cannot figure out what distributions would be appropriate for a continuous variable such as remaining capacity.

Any tips will be much appreciated.
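One quick way to check the stretched-exponential hypothesis is to linearise it: if $$Q(d) = Q_0 \exp(-(d/\tau)^\beta)$$, then $$\log(-\log(Q/Q_0)) = \beta \log d - \beta \log \tau$$, so plotting the left-hand side against $$\log d$$ should give a straight line. A minimal sketch, using synthetic capacity data with made-up parameters (the real data and parameter values are not given in the post):

```python
import numpy as np

def fit_stretched_exponential(d, q, q0):
    """Fit Q(d) = Q0 * exp(-(d/tau)**beta) by linearising:
    log(-log(Q/Q0)) = beta*log(d) - beta*log(tau)."""
    y = np.log(-np.log(q / q0))
    slope, intercept = np.polyfit(np.log(d), y, 1)
    beta = slope
    tau = np.exp(-intercept / beta)
    return beta, tau

# Hypothetical synthetic data, for illustration only
d = np.linspace(5_000, 200_000, 50)           # cumulative distance, km
beta_true, tau_true, q0 = 0.7, 400_000, 60.0  # assumed nominal capacity in kWh
q = q0 * np.exp(-(d / tau_true) ** beta_true)

beta_hat, tau_hat = fit_stretched_exponential(d, q, q0)
print(beta_hat, tau_hat)
```

If the transformed data are close to linear, a stretched exponential is plausible; systematic curvature on this plot argues for a different functional form.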


## #StackBounty: #machine-learning #correlation #nonlinear-regression #r-squared Generalization of Adjusted R-Squared to Nonlinear Models

### Bounty: 50

We are in the prototypical machine learning setting. We have a set of random variables $$X=(X_1,\ldots,X_p)$$ representing predictors, and a random variable $$Y$$ representing the dependent variable. We assume that $$Y=f(X)+\epsilon$$, where $$\epsilon$$ is a random variable with mean $$0$$, and $$f$$ is some function.

We define the amount of variance explained as:

$$1-\frac{\mathrm{Var}(\epsilon)}{\mathrm{Var}(Y)}.$$

I am wondering how best to estimate the amount of variance explained in general, but most importantly for the case of a single predictor ($$p=1$$).

For the special case of $$f$$ being linear and both $$X$$ and $$\epsilon$$ being Gaussian, this problem has received a lot of attention in statistics and led to the development of adjusted $$R^2$$.

Dropping those assumptions, estimating the predictive ability of a learning algorithm, as done in machine learning, seems closely related but is also a slightly different question. In particular, the prediction error in machine learning decomposes as $$\text{expected prediction error} = \text{bias}^2 + \text{variance} + \text{irreducible error}.$$ Under $$L^2$$ loss, $$\mathrm{Var}(\epsilon)=\text{irreducible error}$$. Thus, the prediction error is larger than $$\mathrm{Var}(\epsilon)$$. From a machine learning perspective, $$\mathrm{Var}(\epsilon)$$ essentially quantifies how well the optimal prediction function would perform.
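The target quantity can be made concrete with a small simulation (the nonlinear $$f$$ below is an arbitrary choice for illustration; with $$\mathrm{Var}(\epsilon)=0.25$$, the true variance explained works out to roughly $$0.90$$):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
x = rng.uniform(-3, 3, n)
f = np.sin(x) + 0.5 * x          # some nonlinear f, arbitrary choice
eps = rng.normal(0.0, 0.5, n)    # Var(eps) = 0.25
y = f + eps

# Variance explained, 1 - Var(eps)/Var(Y), computed from the simulated
# sample where eps is known (which it never is in practice):
explained = 1 - eps.var() / y.var()
print(round(explained, 3))
```

In practice $$\epsilon$$ is unobserved, which is exactly why the two estimation approaches below are needed.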

Two naive ideas, both of which seem to work remarkably well for $$p=1$$, are as follows.

Statistical approach: do polynomial regression with, say, a polynomial of degree 10, and then calculate adjusted R-squared as usual. This has two problems. First, one has to decide on the degree of the polynomial. Second, it assumes that $$f$$ lies in the chosen set of polynomial functions.
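The statistical approach can be sketched in a few lines; the degree-10 choice, the test function $$\sin(2x)$$, and the noise level are all illustrative assumptions:

```python
import numpy as np

def adjusted_r2(x, y, degree=10):
    """Fit a polynomial of the given degree and return adjusted R-squared,
    treating the fitted slope coefficients as the p predictors."""
    n, p = len(y), degree
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 500)
y = np.sin(2 * x) + rng.normal(0, 0.3, 500)  # Var(eps) = 0.09
adj = adjusted_r2(x, y)
print(adj)
```

Here a degree-10 polynomial approximates $$\sin(2x)$$ well on this interval, so the adjusted R-squared lands close to the true variance explained; for an $$f$$ far outside the polynomial family, the second problem noted above would bite.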

Machine Learning approach: use a flexible learner. I used a support vector machine with a radial basis function kernel. Then act as if $$\text{expected prediction error} = \text{irreducible error}$$: just use the estimate of the prediction error, as obtained from, for example, cross-validation, as the estimate for $$\mathrm{Var}(\epsilon)$$. For a flexible learner, this should be a consistent estimator for $$\mathrm{Var}(\epsilon)$$, since with sample size $$N \rightarrow \infty$$, both $$\text{bias}^2$$ and variance should converge to $$0$$. Getting $$\text{bias}^2$$ to converge to $$0$$ was the reasoning behind choosing a flexible learner. As an estimator for $$\mathrm{Var}(Y)$$, just use the usual unbiased variance estimator. This approach could possibly be improved by estimating $$\text{bias}^2$$ and variance, which seems to be possible, and subtracting those values from the estimated prediction error.
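A dependency-free sketch of this idea, substituting kernel ridge regression with an RBF kernel for the SVM (a deliberate stand-in; the kernel width, ridge penalty, hold-out split, and test function are all illustrative assumptions, and a single hold-out set replaces full cross-validation):

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Gram matrix of exp(-gamma * (a_i - b_j)^2) for 1-D sample vectors
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

def krr_fit_predict(x_tr, y_tr, x_te, gamma=1.0, lam=1e-2):
    """Kernel ridge regression with an RBF kernel, used here as a
    flexible-learner stand-in for the SVM described above."""
    K = rbf_kernel(x_tr, x_tr, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(x_tr)), y_tr)
    return rbf_kernel(x_te, x_tr, gamma) @ alpha

rng = np.random.default_rng(7)
x = rng.uniform(-2, 2, 2000)
y = np.sin(2 * x) + rng.normal(0, 0.3, 2000)  # Var(eps) = 0.09

# Held-out MSE serves as the estimate of Var(eps)
x_tr, x_te = x[:1500], x[1500:]
y_tr, y_te = y[:1500], y[1500:]
mse = np.mean((y_te - krr_fit_predict(x_tr, y_tr, x_te)) ** 2)
explained = 1 - mse / y.var(ddof=1)
print(explained)
```

Because the held-out MSE still contains the $$\text{bias}^2$$ and variance terms, this estimate of variance explained is biased downward, which is exactly the motivation for the proposed correction of subtracting estimates of those two terms.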
