I have a time-series dataset of incoming face data. Each data point is a facial feature vector of length 256 that represents the facial features of a person (it is generated by a modified ResNet). Feature vectors that are close together are deemed to belong to the same person.
I am (successfully) clustering the incoming face features with DBSCAN. I've recently switched to HDBSCAN, also with good results.
My problem is this: DBSCAN and HDBSCAN require all the data to be available at once. I often have more than 200,000 feature vectors, which can be a very large download.
I would much prefer to take each incoming feature vector as it arrives and assign it to a person, without having to collect all the data first.
Is there an alternative that works this way (preferably with a Python implementation)?
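
For illustration, the kind of per-feature assignment described above could look roughly like the sketch below. It is a greedy nearest-centroid scheme, not DBSCAN or HDBSCAN: it has no notion of noise points and its result depends on arrival order. The `OnlineAssigner` class, the `threshold` value, and the synthetic data are all assumptions made up for the example.

```python
import math
import random


class OnlineAssigner:
    """Greedy nearest-centroid assignment (hypothetical helper, not DBSCAN)."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed distance cutoff; would need tuning
        self.centroids = []         # one running-mean vector per person
        self.counts = []            # number of features merged into each centroid

    def assign(self, feature):
        """Return a person id, creating a new person if no centroid is close."""
        best_id, best_dist = None, float("inf")
        for pid, centroid in enumerate(self.centroids):
            dist = math.dist(feature, centroid)  # Euclidean distance
            if dist < best_dist:
                best_id, best_dist = pid, dist

        if best_id is None or best_dist > self.threshold:
            # Nothing close enough: start a new person from this feature.
            self.centroids.append(list(feature))
            self.counts.append(1)
            return len(self.centroids) - 1

        # Fold the feature into the matched centroid's running mean.
        self.counts[best_id] += 1
        n = self.counts[best_id]
        centroid = self.centroids[best_id]
        for i, x in enumerate(feature):
            centroid[i] += (x - centroid[i]) / n
        return best_id


# Synthetic stand-in for the real 256-dimensional face features.
random.seed(0)
dim = 256
person_a = [random.gauss(0, 1) for _ in range(dim)]
person_b = [random.gauss(0, 1) for _ in range(dim)]


def noisy(base, scale=0.05):
    """Simulate a new observation of the same person."""
    return [x + random.gauss(0, scale) for x in base]


assigner = OnlineAssigner(threshold=5.0)
stream = [noisy(person_a), noisy(person_b), noisy(person_a), noisy(person_b)]
labels = [assigner.assign(f) for f in stream]
print(labels)  # [0, 1, 0, 1]
```

Only the centroids and counts are kept in memory, so nothing has to be downloaded in bulk. The linear scan over centroids would likely need replacing with a nearest-neighbour index once the number of people grows large.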