I am trying to examine how an athlete’s performance influences specific features of the language they use on Twitter (e.g., use of the personal pronoun ‘we’).
I have all the tweets of over 100 athletes, each with a timestamp, and a frequency count of ‘we’ for every tweet. For every athlete, I also have a performance record: the contests they participated in, the date of each contest, and the result (win/loss/draw). I would like to statistically analyze the effect of performance on the use of ‘we’ and test hypotheses such as the following:
Hypothesis. A win (loss) decreases (increases) the likelihood of using ‘we’ in tweets.
I would like to understand how to analyze this kind of time series data. How should I structure the data for such an analysis in R or Python? Which regression models are most appropriate for such an analysis in R?
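To make the setup concrete, here is a minimal sketch in Python (pandas) of the two tables I have and one way I could imagine combining them. The column names and toy values are made up for illustration, and attaching each tweet to the athlete's most recent earlier contest with `merge_asof` is only an assumption about what a sensible structure might be, not something I know to be correct.

```python
import pandas as pd

# Toy data; all column names and values are hypothetical placeholders.
tweets = pd.DataFrame({
    "athlete": ["A", "A", "A", "B", "B"],
    "tweet_time": pd.to_datetime([
        "2020-01-02", "2020-01-10", "2020-01-20", "2020-01-05", "2020-01-15",
    ]),
    "we_count": [0, 2, 1, 0, 3],  # frequency of 'we' in each tweet
})

contests = pd.DataFrame({
    "athlete": ["A", "A", "B"],
    "contest_date": pd.to_datetime(["2020-01-08", "2020-01-18", "2020-01-12"]),
    "result": ["win", "loss", "draw"],
})

# merge_asof requires both frames to be sorted by their time keys.
tweets = tweets.sort_values("tweet_time")
contests = contests.sort_values("contest_date")

# Attach to each tweet the same athlete's most recent contest on or before it.
panel = pd.merge_asof(
    tweets, contests,
    left_on="tweet_time", right_on="contest_date",
    by="athlete", direction="backward",
)

# Binary outcome (any 'we' in the tweet) and time elapsed since the contest.
panel["uses_we"] = (panel["we_count"] > 0).astype(int)
panel["days_since_contest"] = (panel["tweet_time"] - panel["contest_date"]).dt.days

panel = panel.sort_values(["athlete", "tweet_time"]).reset_index(drop=True)
print(panel)
```

This gives one row per tweet, with the preceding result as a predictor and the athlete as a grouping variable; tweets posted before an athlete's first contest have a missing result. I am unsure whether this long, tweet-level layout is the right input for the kind of regression I need.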