I am currently working on a project that aims to predict the monthly volatility of the S&P 500 index using multilayer perceptrons (MLPs). Specifically, I am trying to reproduce some of the results shown here: https://beta.vu.nl/nl/Images/werkstuk-ladokhin_tcm235-91388.pdf
I am also trying to use the same network architectures as the author, including the number of input nodes and so on.
The document states that the data was divided as follows:
- Training set: December 1978 to October 2000 (263 observations)
- Testing set: November 2000 to November 2008 (97 observations)
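As a quick sanity check, the observation counts above are consistent with one value per calendar month, counting both endpoints. The helper name `months_inclusive` below is hypothetical and is not taken from the document:

```python
def months_inclusive(start, end):
    """Count calendar months from start to end, both given as (year, month), inclusive."""
    (start_year, start_month), (end_year, end_month) = start, end
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


n_train = months_inclusive((1978, 12), (2000, 10))
n_test = months_inclusive((2000, 11), (2008, 11))
print(n_train, n_test)  # 263 97
```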
However, I am unsure what the proper training and testing procedure should look like, which is why I am here.
This is how I would do it:
- Compute the monthly volatility for each month from December 1978 to November 2008.
- Having computed those values, I would separate them into the monthly volatilities that belong to the training set and those that belong to the testing set.
- Next, I would create the target arrays, which hold the correct numerical answer for each month. However, I am not sure how to do this. Since my network has one output node, I would guess that the target array looks like
target label = [observed_volatility] for each month of the training set. Is that correct?
- Having trained the network, I would compare the outputs it generates against the observed volatility for the corresponding months in the testing set.
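The first three steps above could be sketched roughly as follows. This is only an illustration under assumptions: monthly volatility is taken to be the standard deviation of daily log returns within each calendar month (the paper may define it differently), the price series is a synthetic random walk rather than real S&P 500 data, and the names `monthly_volatility`, `train_targets` and `test_targets` are placeholders. The network inputs and the training itself are left out, since they depend on the architecture being reproduced.

```python
import math
import random
import statistics
from collections import defaultdict
from datetime import date, timedelta


def monthly_volatility(dates, prices):
    """Return {(year, month): std of daily log returns}, in chronological order."""
    returns_by_month = defaultdict(list)
    for i in range(1, len(prices)):
        log_return = math.log(prices[i] / prices[i - 1])
        returns_by_month[(dates[i].year, dates[i].month)].append(log_return)
    return {
        ym: statistics.stdev(rets)
        for ym, rets in sorted(returns_by_month.items())
        if len(rets) > 1  # stdev needs at least two returns
    }


# Synthetic stand-in for daily closing prices (NOT real S&P 500 data).
random.seed(0)
dates, prices = [], []
day, price = date(1978, 11, 30), 100.0
while day <= date(2008, 11, 30):
    if day.weekday() < 5:  # weekdays only; market holidays are ignored here
        dates.append(day)
        prices.append(price)
        price *= math.exp(random.gauss(0.0, 0.01))
    day += timedelta(days=1)

# Step 1: one volatility value per month, December 1978 to November 2008.
vol = monthly_volatility(dates, prices)

# Step 2: split chronologically at the boundary given in the document.
train = [(ym, v) for ym, v in vol.items() if ym <= (2000, 10)]
test = [(ym, v) for ym, v in vol.items() if ym >= (2000, 11)]

# Step 3: one-element target per month, matching a single output node.
train_targets = [[v] for _, v in train]
test_targets = [[v] for _, v in test]

print(len(train_targets), len(test_targets))  # 263 97
```

With real data, `dates` and `prices` would be replaced by the actual daily closing series, and the last step would compare the trained network's outputs against `test_targets`.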
Am I on the right track? My biggest issue concerns how the target arrays should be constructed and how proper training and testing should take place in this case.
Quick note: I don't know whether the term "target array" is universal. By it, I mean the array that contains the correct answer for a given sample, which in my case is the volatility of a given month.
Thanks in advance, Lucas