To preface, I’m a stat noob programmer, so sorry if this is a basic question.
I have multiple (300+) exams that I had a variety of users take. The users could decide which exams to take. Each exam had a different number of exam-takers (from around 30 to around 10000), a different variance, and a different mean. The data is anonymized (I simply have a list of scores for each exam). I would like to order the exams from easiest to hardest.
I’m assuming that all the exam-takers are of relatively similar level (although if this is an assumption we can relax, that would be ideal) and that the scores are normally distributed. Through a little bit of googling, I found that I could calculate a sample mean and sample standard deviation from the data and use these 2 values (along with n) to find the t statistic. However, I am unclear as to how to proceed.
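A minimal sketch of that per-exam computation, assuming the scores are held in a dict mapping an exam name to its list of scores. The exam names, the numbers, and the `summarize` helper are made up for illustration. It ranks by sample mean alone (higher mean taken as easier), which is exactly the part that ignores how uncertain the small-n exams are:

```python
import math
import statistics

def summarize(scores):
    """Return n, sample mean, sample sd and standard error for one exam."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)   # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)          # standard error of the mean
    return {"n": n, "mean": mean, "sd": sd, "se": se}

# Hypothetical data: exam name -> anonymized list of scores (percent).
exams = {
    "exam_a": [55, 60, 65, 70],
    "exam_b": [80, 85, 90],
    "exam_c": [40, 50, 60, 70, 80],
}

summaries = {name: summarize(scores) for name, scores in exams.items()}

# Naive ranking: highest mean score first, i.e. easiest to hardest.
ranking = sorted(summaries, key=lambda name: summaries[name]["mean"], reverse=True)
print(ranking)
```

Sorting like this is a single O(k log k) pass over k exams, so no pairwise tests are needed to get an ordering; the standard error is kept only to show how much less certain the small exams are.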
I imagine there will be some hypothesis test to see which of two exam means is larger. However, running one for every pair of exams seems computationally inefficient. What is the best way to rank these exams from easiest to hardest (i.e. without comparing them pair by pair)?
I was thinking I could use some sort of Bayesian system as described in these 2 similar questions (1, 2), although I’m not sure how I would construct such a system. Would this work? This solution, while easy, seems like it wouldn’t work given the range of sample sizes (30ish on the low end vs 10000ish on the high end, so how many extra "balancing scores" does one add?). I was also thinking of doing this, but scaling it up by adding 10 exam scores of 50% instead of the 3 suggested by the linked question (where would I draw the line on the number of scores to add, and why 50% instead of 0%?). This question also proposes a more rigorous approach of cross validation, but I don’t fully understand how this would be implemented.
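To make the "balancing scores" idea concrete, here is a rough sketch of how I understand it: add k extra scores of a fixed value to each exam before averaging. The function name `shrunk_mean`, the choice of 50 as the added score, and the toy data are assumptions for illustration, not something taken from the linked questions:

```python
def shrunk_mean(scores, k, prior):
    """Mean after adding k pseudo-scores, each equal to `prior`."""
    return (sum(scores) + k * prior) / (len(scores) + k)

# Two hypothetical exams with the same raw mean (90) but very different n.
small = [90.0] * 30
large = [90.0] * 10000

for k in (3, 10):
    # The small exam is pulled noticeably toward 50; the large one barely moves.
    print(k, shrunk_mean(small, k, 50.0), shrunk_mean(large, k, 50.0))
```

With k = 10 the 30-taker exam drops from 90 to 80, while the 10000-taker exam stays just below 90, which is the effect I would want, but it also shows how much the ranking would depend on the choice of k and of the added score.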