#StackBounty: #r #regression #machine-learning #sas #large-data Material/courses for analyzing very large data sets

Bounty: 50

I am looking for good materials to learn more about analyzing very large data sets, say 5 million observations, with a reasonable computational time (say less than 10h).

My main interest is traditional analysis methods, such as linear, Poisson and logistic regression, both mixed and non-mixed, together with simple illustration and presentation of the data that provide easily understandable interpretations and options for confounder correction.
I am open to alternative methods that yield reasonably similar results to the methods mentioned above while reducing the computational time. For example, in some cases a robust sandwich estimator can account for an incorrectly specified correlation structure while reducing computational time compared with cluster or mixed-model methods.
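As a minimal sketch of the sandwich-estimator idea mentioned above (not part of the original question): the R snippet below fits an ordinary Poisson GLM, which scales to millions of rows, and then uses a cluster-robust sandwich covariance for inference instead of fitting a full mixed model. The simulated variables and the choice of the `sandwich`/`lmtest` packages are illustrative assumptions.

```r
library(sandwich)  # vcovCL: cluster-robust sandwich covariance
library(lmtest)    # coeftest: coefficient tests with a custom vcov

set.seed(1)
n_clusters      <- 1000
obs_per_cluster <- 50
dat <- data.frame(
  id = rep(seq_len(n_clusters), each = obs_per_cluster),
  x  = rnorm(n_clusters * obs_per_cluster)
)
# Within-cluster correlation induced by a shared cluster effect
cluster_effect <- rnorm(n_clusters, sd = 0.3)[dat$id]
dat$y <- rpois(nrow(dat), lambda = exp(0.5 + 0.2 * dat$x + cluster_effect))

# Ordinary Poisson GLM: fast even for large n, but it ignores clustering
fit <- glm(y ~ x, family = poisson(), data = dat)

# Cluster-robust (sandwich) standard errors correct the inference for the
# ignored within-cluster correlation, at a fraction of the cost of glmer()
coeftest(fit, vcov = vcovCL(fit, cluster = dat$id))
```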

I am familiar with the programs R, SAS and STATA, and would prefer methods available in these programs.

If you know of relevant (English/Danish) courses held by universities, I would love to have links to the course descriptions as well.

I hope to hear from you.
