I am looking for good materials on analyzing very large data sets (say, 5 million observations) within a reasonable computation time (say, under 10 hours).
My main interest is in traditional analysis methods, such as linear, Poisson and logistic regression, both mixed and non-mixed, together with simple ways of illustrating and presenting the data that give an easily understood interpretation and allow for confounder adjustment.
I am open to alternative methods that yield results reasonably similar to those of the methods mentioned above while reducing the computation time. For example, in some cases a robust sandwich estimator can account for an incorrectly specified correlation structure while taking less computation time than cluster-based methods.
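To make the sandwich idea concrete, here is a minimal sketch of what I have in mind. It is written in plain Python only so that it is self-contained. It fits an ordinary least-squares line that ignores the clustering, and then computes a cluster-robust (sandwich) standard error for the slope. The function names, the simulated data and the lack of any small-sample correction are my own assumptions for illustration, not a reference implementation.

```python
import random
from collections import defaultdict


def ols_cluster_robust(x, y, cluster):
    """Fit y = b0 + b1*x by OLS; return (b0, b1, naive SE, cluster-robust SE) for b1.

    The robust SE is the plain sandwich form without small-sample correction.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))

    # "Bread": inverse of X'X for the design matrix [1, x].
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]

    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Naive (model-based) SE, assuming independent errors.
    sigma2 = sum(r * r for r in resid) / (n - 2)
    se_naive = (sigma2 * inv[1][1]) ** 0.5

    # "Meat": sum the score contributions X_g' u_g within each cluster.
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, ri, g in zip(x, resid, cluster):
        scores[g][0] += ri
        scores[g][1] += xi * ri

    # Slope variance = sum over clusters of (a . s_g)^2,
    # where a is the slope row of (X'X)^-1.
    var_b1 = sum((inv[1][0] * s0 + inv[1][1] * s1) ** 2 for s0, s1 in scores.values())
    return b0, b1, se_naive, var_b1 ** 0.5


def simulate_clustered(n_clusters, per_cluster, seed=0):
    """Hypothetical data: true slope 2, cluster-level x, shared cluster effect."""
    rng = random.Random(seed)
    x, y, cluster = [], [], []
    for g in range(n_clusters):
        xg = rng.random()        # covariate is constant within a cluster
        ug = rng.gauss(0, 1)     # random intercept shared by the cluster
        for _ in range(per_cluster):
            x.append(xg)
            y.append(1.0 + 2.0 * xg + ug + rng.gauss(0, 1))
            cluster.append(g)
    return x, y, cluster


if __name__ == "__main__":
    x, y, cluster = simulate_clustered(n_clusters=200, per_cluster=50, seed=1)
    b0, b1, se_naive, se_robust = ols_cluster_robust(x, y, cluster)
    print(f"slope={b1:.3f}  naive SE={se_naive:.3f}  cluster-robust SE={se_robust:.3f}")
```

The whole computation is a single pass over the data plus one sum per cluster, which is why I expect it to scale to millions of rows. As far as I know, the corresponding tools are `vcovCL` in R's sandwich package and the `vce(cluster ...)` option in Stata.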
I am familiar with R, SAS and Stata, and would prefer methods available in these programs.
If you know of relevant (English/Danish) courses held by universities, I would love to have links for course descriptions for this as well.
I hope to hear from you.