My training set contains about 50k entries, which I use for the initial fit. Every week, about 5k entries are added, but roughly the same number “disappears” (it is user data that has to be deleted after some time).
Therefore I use online learning, because I do not have access to the full dataset at a later time. Currently I’m using an `SGDClassifier`, which works, but my big problem is that new categories keep appearing, and I can’t use my model any more because they were not in the initial `fit`.
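A minimal sketch of the situation described above, assuming scikit-learn's `SGDClassifier.partial_fit` is used for the weekly updates. The data is synthetic and the feature dimension, batch sizes, and label values (0, 1, 2) are made up for illustration; the real pipeline may differ.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the initial training set: two known labels (0 and 1).
X_init = rng.normal(size=(200, 5))
y_init = rng.integers(0, 2, size=200)

clf = SGDClassifier(random_state=0)
# partial_fit requires the complete label set on its first call.
clf.partial_fit(X_init, y_init, classes=[0, 1])

# A later weekly batch that contains a label (2) unknown at the first call.
X_new = rng.normal(size=(50, 5))
y_new = rng.integers(0, 3, size=50)
y_new[0] = 2  # make sure the new label is actually present

try:
    # Trying to extend the label set on a later call is rejected.
    clf.partial_fit(X_new, y_new, classes=[0, 1, 2])
    extended = True
except ValueError as err:
    extended = False
    print("partial_fit rejected the new label set:", err)

# The model still only knows the labels from the first call.
print(clf.classes_)
```

In this sketch the label set is fixed by the first `partial_fit` call, which is why a category that shows up later cannot simply be added to the existing model.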
Is there a way to handle this with `SGDClassifier` or some other model? Deep learning?
It doesn’t matter if I have to start from scratch NOW (i.e. use something other than `SGDClassifier`), but I need something that enables online learning with new labels.