I am trying to predict the progression of a disease using clinical time series data and covariates (such as age, sex, and race). I am aware of mainstream machine learning and deep learning models for such prediction tasks, but since clinical data are longitudinal in nature, I want to leverage this and use RNNs such as LSTMs (if possible) for prediction.
I have a longitudinal dataset that describes disease progression for hundreds of patients, each with multiple visits (~10-20) at different points in time, with a conclusion about the disease recorded at each time step. My point of confusion is how to prepare this dataset for an LSTM model, since most of the literature I’ve read on this topic shows data preparation for only one patient. I want to understand how my model will be affected if I:
- Ignore the multi-patient structure and arrange all the data by time only (date and time of visit).
- Arrange the data by patient ID first and then by the date and time of visit for each patient (a nested arrangement, if that makes sense).
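
To make the second arrangement concrete, this is roughly what I picture: a minimal sketch, not a definitive pipeline. It assumes the data are in long format with one record per visit. The field layout (`patient_id`, `visit_time`, `features`, `label`), the toy values, and the helper name `build_sequences` are all hypothetical and only there for illustration. Each patient becomes one sequence sorted by visit time, and shorter sequences are padded so every patient has the same length.

```python
from collections import defaultdict

# Hypothetical long-format records: (patient_id, visit_time, features, label).
# All names and values are made up purely for illustration.
records = [
    ("p2", 3.0, [0.7, 1.2], 1),
    ("p1", 0.0, [0.1, 0.9], 0),
    ("p1", 6.5, [0.4, 1.1], 1),
    ("p2", 0.0, [0.2, 1.0], 0),
    ("p1", 2.0, [0.3, 1.0], 0),
]


def build_sequences(records, pad_value=0.0):
    """Group visits by patient, sort each group by time, pad to equal length."""
    by_patient = defaultdict(list)
    for patient_id, visit_time, features, label in records:
        by_patient[patient_id].append((visit_time, features, label))

    patient_ids = sorted(by_patient)
    max_len = max(len(visits) for visits in by_patient.values())
    n_features = len(records[0][2])

    X, y, mask = [], [], []
    for pid in patient_ids:
        # Chronological order within a single patient.
        visits = sorted(by_patient[pid], key=lambda v: v[0])
        n_pad = max_len - len(visits)
        padding = [[pad_value] * n_features for _ in range(n_pad)]
        X.append([list(f) for _, f, _ in visits] + padding)
        y.append([lab for _, _, lab in visits] + [0] * n_pad)
        # 1 marks a real visit, 0 marks a padded time step.
        mask.append([1] * len(visits) + [0] * n_pad)
    return patient_ids, X, y, mask


patient_ids, X, y, mask = build_sequences(records)
```

Here `X` is a nested list with layout (patients, max visits, features), so each patient is one sample and no sequence mixes visits from different patients. The `mask` is my assumption about how padded steps would be excluded from the loss. The first arrangement would instead interleave visits from different patients into a single sequence.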