#StackBounty: #python #machine-learning #keras #deep-learning #forecasting How to handle Shift in Forecasted value

Bounty: 50

I implemented a forecasting model using an LSTM in Keras. The data is sampled at 15-minute intervals and I am forecasting 12 future steps.

The model performs well on the problem, but there is a small issue with the forecast it produces: it shows a slight shift effect. See the figure below for a clearer picture.

[Figure: forecast vs. actual values, illustrating the shift in the forecasted series]

How can I handle this problem? How must the data be transformed to deal with this kind of issue?

The model I used is given below

from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.initializers import RandomUniform
from keras.optimizers import Adam

# weight initializers for the LSTM and the output layer
init_lstm = RandomUniform(minval=-.05, maxval=.05)
init_dense_1 = RandomUniform(minval=-.03, maxval=.06)

model = Sequential()
model.add(LSTM(15, input_shape=(X.shape[1], X.shape[2]), kernel_initializer=init_lstm, recurrent_dropout=0.33))
model.add(Dense(1, kernel_initializer=init_dense_1, activation='linear'))
model.compile(loss='mae', optimizer=Adam(lr=1e-4))

history = model.fit(X, y, epochs=1000, batch_size=16, validation_data=(X_valid, y_valid), verbose=1, shuffle=False)

I made the forecasts like this

my_forecasts = model.predict(X_valid, batch_size=16)

The time series data is transformed into a supervised learning problem to feed the LSTM, using this function

from pandas import DataFrame, concat

# convert time series into supervised learning problem
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j+1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)]
    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg

super_data = series_to_supervised(data, 12, 1)

My time series is multivariate; var2 is the one I need to forecast. I dropped the future var1 like this

del super_data['var1(t)']

I separated the train and validation sets like this

features = super_data[feat_names]
values = super_data[val_name]

n_test = 3444

train_feats, test_feats = features[0:-n_test], features[-n_test:]
train_vals, test_vals = values[0:-n_test], values[-n_test:]

X, y = train_feats.values, train_vals.values
X = X.reshape(X.shape[0], 1, X.shape[1])

X_valid, y_valid = test_feats.values, test_vals.values
X_valid = X_valid.reshape(X_valid.shape[0], 1, X_valid.shape[1])

I haven’t made the data stationary for this forecast. I also tried taking differences to make the series as stationary as I can, but the issue remains the same.
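Roughly, the differencing I tried looked like the sketch below (assuming data is the same pandas DataFrame passed to series_to_supervised above; exact details may differ):

# First-order differencing; the model is then trained on the differenced
# scale and predictions are converted back by adding each predicted
# difference onto the previous observed level.
data_diff = data.diff().dropna()

def invert_difference(last_level, diff_value):
    # recover the original scale from a predicted difference
    return last_level + diff_value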

I have also tried different scaling ranges for the min-max scaler, hoping it might help the model, but the forecasts only got worse.
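The scaling was along these lines (a sketch using scikit-learn's MinMaxScaler; the feature_range shown is just one of the ranges I tried):

from sklearn.preprocessing import MinMaxScaler

# Fit the scaler on the training features only, then apply the same
# transform to the validation features.
scaler = MinMaxScaler(feature_range=(-1, 1))
train_scaled = scaler.fit_transform(train_feats.values)
test_scaled = scaler.transform(test_feats.values)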

Other things I have tried:

* Tried other optimizers
* Tried mse loss and custom log-mae loss functions (a sketch of the log-mae loss is shown after this list)
* Tried varying batch_size
* Tried adding more past timesteps
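
For reference, the custom log-mae loss I experimented with was along these lines (a sketch; the exact form is illustrative):

from keras import backend as K

def log_mae(y_true, y_pred):
    # mean absolute error passed through a log, which compresses large errors
    return K.log(1.0 + K.mean(K.abs(y_pred - y_true), axis=-1))

# used in place of 'mae', e.g.:
# model.compile(loss=log_mae, optimizer=Adam(lr=1e-4))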

I understand that the model is essentially replicating the last known value it has seen, thereby minimizing the loss.
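To illustrate, one quick check is to compare the model's error against a naive persistence forecast that simply repeats the previous value; if the two errors are close, the model is effectively copying the last known value (a sketch, using my_forecasts and y_valid from above, with a one-step shift of y_valid standing in for the last observed var2):

import numpy as np

# Error of the LSTM forecast vs. a naive persistence forecast.
persistence = np.roll(y_valid, 1)[1:]
lstm_mae = np.mean(np.abs(my_forecasts.flatten()[1:] - y_valid[1:]))
naive_mae = np.mean(np.abs(persistence - y_valid[1:]))
print('LSTM MAE: %.4f  persistence MAE: %.4f' % (lstm_mae, naive_mae))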

The validation and training losses remain low enough throughout the training process. This makes me wonder whether I need to come up with a new loss function for this purpose.

Is that necessary? If so, what loss function should I go for?

