Right now, I am working on building an efficient classification pipeline for my company. We are a social media monitoring company: we collect data from social media sites to measure engagement, comments, shares, and so on for subscribed clients. We then extract features from that data to perform sentiment analysis.
However, this is tough because my company has not hired a proper statistician or data analyst/scientist. The team can only do sentiment analysis or semantic analysis; they cannot go further and classify what the overall positive and negative sentiment is about, that is, what the compliments and complaints concern. So they simply classify the data manually.
So I want to ask: what would be a good model or data pipeline for this case? Or what kind of big data solution could work here?
P.S. The tricky part is that they are still worried about data accuracy, so they manually classify each message, e.g. "fast delivery" -> delivery. This seems inefficient. What I am looking for is either a suggestion for an accurate model for this further classification, or a way to make the current process more efficient.
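To make the goal concrete, here is a rough sketch in Python of the kind of triage I have in mind: clear-cut messages are labelled automatically and everything ambiguous still goes to a person. This is only a keyword-matching baseline, not a trained model. The topic names, the keyword lists, and the functions `classify` and `triage` are all made up for illustration; a real keyword list would presumably come from the data that has already been labelled by hand.

```python
from collections import Counter

# Hypothetical topics and keywords, invented for this example.
TOPIC_KEYWORDS = {
    "delivery": {"delivery", "shipping", "courier", "arrived"},
    "price": {"price", "expensive", "cheap", "cost"},
    "service": {"service", "staff", "support", "rude"},
}


def classify(message, min_hits=1):
    """Return (topic, hits), or (None, 0) if no single topic clearly wins."""
    # Lowercase, split on whitespace, and strip simple punctuation.
    words = [w.strip(".,!?") for w in message.lower().split()]
    hits = Counter()
    for topic, keywords in TOPIC_KEYWORDS.items():
        hits[topic] = sum(1 for w in words if w in keywords)
    (best_topic, best_hits), (_, second_hits) = hits.most_common(2)
    # No match, or a tie between the top two topics: leave it to a human.
    if best_hits < min_hits or best_hits == second_hits:
        return None, 0
    return best_topic, best_hits


def triage(messages):
    """Split messages into auto-labelled pairs and a manual-review queue."""
    auto, manual = [], []
    for message in messages:
        topic, _ = classify(message)
        if topic is None:
            manual.append(message)
        else:
            auto.append((message, topic))
    return auto, manual


auto, manual = triage([" fast delivery ", "great product", "expensive delivery"])
print(auto)    # messages with a single clear topic
print(manual)  # messages that still need a person
```

The idea is that people would only review the `manual` queue plus a random sample of the `auto` results, so accuracy can still be checked while the daily manual workload shrinks.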
P.S. 2: One person can analyse 2000 data points per day, so I understand my company's fear of being overwhelmed by big data.