I’m having some difficulty with chaining together two models in an unusual way.
I am trying to replicate the following flowchart:
For clarity: at each timestep of Model A, I am attempting to generate an entire time series from IR[i] (Intermediate Representation), used as a repeated input to Model B. The purpose of this scheme is to allow the generation of a ragged 2-D time series from a 1-D input, while both allowing the second model to be omitted when the output for that timestep is not needed, and not requiring Model A to constantly "switch modes" between accepting input and generating output.
I assume a custom training loop will be required, and I already have a custom training loop for handling statefulness in the first model (the previous version only had a single output at each timestep). As depicted, the second model should have reasonably short outputs (able to be constrained to fewer than 10 timesteps).
But at the end of the day, while I can wrap my head around what I want to do, I’m not nearly adroit enough with Keras and/or TensorFlow to actually implement it. (In fact, this is my first non-toy project with the library.)
I have searched the literature for similar schemes to imitate, and for example code to fiddle with, without success. I don’t even know whether this idea is possible within TF/Keras.
I already have the two models working in isolation. (That is, I’ve worked out the dimensionality and done some training with dummy data to get garbage outputs for the second model; the first model is based on a previous iteration of this problem and has been fully trained.) If I have Model A and Model B as Python variables (let’s call them model_a and model_b), how would I chain them together to do this?
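One possible way to wire the two models together, sketched under assumptions: the stand-in `model_a` and `model_b` below, the layer choices inside them, and all sizes are hypothetical placeholders, since the real models are not shown here. The idea is to repeat each IR[i] along a new axis with `TimeDistributed(RepeatVector(...))` and then apply `model_b` to every timestep of Model A with `TimeDistributed(model_b)`.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical sizes, chosen only for illustration.
MODEL_A_TIMESTEPS, INPUT_SIZE, IR_SIZE = 5, 3, 8
MODEL_B_TIMESTEPS, OUTPUT_SIZE = 4, 2

# Stand-ins for the two existing models (assumed architectures).
model_a = keras.Sequential([
    keras.Input(shape=(MODEL_A_TIMESTEPS, INPUT_SIZE)),
    layers.LSTM(IR_SIZE, return_sequences=True),
], name="model_a")

model_b = keras.Sequential([
    keras.Input(shape=(MODEL_B_TIMESTEPS, IR_SIZE)),
    layers.LSTM(16, return_sequences=True),
    layers.Dense(OUTPUT_SIZE),
], name="model_b")

inputs = keras.Input(shape=(MODEL_A_TIMESTEPS, INPUT_SIZE))
# (batch, model_a_timesteps, ir_size)
ir = model_a(inputs)
# Duplicate each IR[i]: (batch, model_a_timesteps, model_b_timesteps, ir_size)
ir_repeated = layers.TimeDistributed(
    layers.RepeatVector(MODEL_B_TIMESTEPS))(ir)
# Run model_b once per Model A timestep:
# (batch, model_a_timesteps, model_b_timesteps, output_size)
outputs = layers.TimeDistributed(model_b)(ir_repeated)
chained = keras.Model(inputs, outputs, name="chained")

x = np.random.rand(2, MODEL_A_TIMESTEPS, INPUT_SIZE).astype("float32")
y = chained.predict(x, verbose=0)
print(y.shape)
```

This sketch runs Model B at every timestep of Model A and treats each timestep independently, so it does not by itself skip unneeded timesteps or carry state between calls; those parts would presumably still need masking or a custom training loop.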
Edit to add:
If this is all unclear, perhaps the dimensions of each input and output will help:
Model A input:
(batch_size, model_a_timesteps, input_size)
Model A output (IR):
(batch_size, model_a_timesteps, ir_size)
IR[i] (after duplication):
(batch_size, model_b_timesteps, ir_size)
Model B output for a single IR[i]:
(batch_size, model_b_timesteps, output_size)
Full output:
(batch_size, model_a_timesteps, model_b_timesteps, output_size)
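To make the shape bookkeeping concrete, here is a small NumPy sketch of one way these dimensions could fit together: fold Model A's time axis into the batch axis, duplicate each IR[i], apply Model B, and unfold again. The function `fake_model_b`, the weight matrix `w`, and all sizes are made-up stand-ins for illustration, not the real model.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
batch_size, model_a_timesteps, model_b_timesteps = 2, 5, 4
ir_size, output_size = 8, 3

rng = np.random.default_rng(0)
# Pretend this is Model A's output: (batch, model_a_timesteps, ir_size)
ir = rng.normal(size=(batch_size, model_a_timesteps, ir_size))

# Stand-in for Model B: maps (n, model_b_timesteps, ir_size)
# to (n, model_b_timesteps, output_size) with a fixed linear map.
w = rng.normal(size=(ir_size, output_size))

def fake_model_b(x):
    return x @ w

# Fold Model A's time axis into the batch axis: (batch * Ta, ir_size)
ir_flat = ir.reshape(batch_size * model_a_timesteps, ir_size)
# Duplicate each IR[i] along a new time axis: (batch * Ta, Tb, ir_size)
ir_repeated = np.repeat(ir_flat[:, np.newaxis, :], model_b_timesteps, axis=1)
# Apply Model B to all folded rows at once: (batch * Ta, Tb, output_size)
out_flat = fake_model_b(ir_repeated)
# Unfold back to (batch, Ta, Tb, output_size)
out = out_flat.reshape(
    batch_size, model_a_timesteps, model_b_timesteps, output_size)

print(ir_repeated.shape, out.shape)
```

Because each row of the folded batch is independent, rows for timesteps whose output is not needed could in principle be dropped before calling Model B, which is one way the ragged output might be handled.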