# #StackBounty: #machine-learning #time-series #neural-networks #data-leakage data leakage when scaling time series

### Bounty: 150

Suppose I want to forecast future values of $$y$$ using past values of features $$x$$.
In this example I am using:

• the training set goes from $$t_0$$ to $$t_{15}$$
• values from $$x_{t_0}$$ to $$x_{t_{10}}$$ to forecast $$y_{t_{11}}$$
• values from $$x_{t_1}$$ to $$x_{t_{11}}$$ to forecast $$y_{t_{12}}$$
• and so on until I use $$x_{t_5}$$ to $$x_{t_{15}}$$ to forecast $$y_{t_{16}}$$
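The windowing described in the list above can be sketched as follows. This is only an illustration, assuming a fixed window of 11 consecutive points (as in the first two bullets) and a one-step-ahead target; `make_windows` is a hypothetical helper name, not something from a library.

```python
def make_windows(first_t, last_t, window_len):
    """Return (input_indices, target_index) pairs.

    Each window covers `window_len` consecutive time steps and is used
    to forecast the step immediately after its last input.
    """
    pairs = []
    # The last valid start is the one whose window ends exactly at last_t.
    for start in range(first_t, last_t - window_len + 2):
        inputs = list(range(start, start + window_len))
        target = start + window_len
        pairs.append((inputs, target))
    return pairs


# Training set spans t_0 .. t_15, windows hold 11 points (assumption).
pairs = make_windows(first_t=0, last_t=15, window_len=11)
for inputs, target in pairs:
    print(f"x[t_{inputs[0]}..t_{inputs[-1]}] -> y[t_{target}]")
```

Under these assumptions the training set yields six windows, the last of which ends at $$t_{15}$$ and targets $$y_{t_{16}}$$.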

I scale my feature $$x$$ using only data in the training set (up to $$t_{15}$$).

Nevertheless, when I try to predict $$y_{t_{17}}$$, you can see from the picture below that I use some data points that have also been used for scaling.
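To make the overlap concrete, here is a minimal sketch of the setup. It assumes standard (mean/std) scaling, which the question does not specify, an 11-point input window, and made-up feature values; none of these come from my actual data.

```python
from statistics import mean, pstdev

# Made-up feature values for t_0 .. t_16 (illustrative only).
x = [float(v) for v in
     [3, 5, 4, 6, 8, 7, 9, 11, 10, 12, 14, 13, 15, 17, 16, 18, 20]]

TRAIN_END = 15  # last time index that belongs to the training set

# Fit the scaling statistics on the training set only (t_0 .. t_15).
train = x[: TRAIN_END + 1]
mu = mean(train)
sigma = pstdev(train)


def scale(values):
    """Apply the training-set mean and standard deviation to any values."""
    return [(v - mu) / sigma for v in values]


# Input window used to forecast y[t_17]: x[t_6] .. x[t_16] (assumed length 11).
window_idx = list(range(6, 17))
scaled_window = scale([x[i] for i in window_idx])

# Time indices that were used to fit the scaler AND appear in this window.
overlap = [i for i in window_idx if i <= TRAIN_END]
print(overlap)  # [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
```

Only $$x_{t_{16}}$$ in this window lies outside the data the scaler was fitted on; the other ten inputs contributed to `mu` and `sigma`.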

Is this leakage?
