#StackBounty: #correlation #clustering #modeling #normalization #standardization Creating a popularity index from multivariate data

Bounty: 50

I am given data from an e-commerce website with features such as product_name, product_category, product_link, product_id, free_delivery (1 or 0), price, discount, avg_rating, number of reviews, search_rank, and date, where search_rank is the position of the product when a category webpage is opened.

I want to create a popularity_index based on the above-mentioned features.

My approach so far is to normalize the columns search_rank, ratings and avg_rating within each category, assign them weights $a$, $b$, $c$, and set popularity_index to $ax+by+cz$, where $x$, $y$, $z$ are the normalized values of the three columns.
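For concreteness, here is a minimal sketch of that weighted-sum approach in plain Python. Several things in it are assumptions rather than part of the original setup: min-max scaling as the normalization, reading "ratings" as the number of reviews (the hypothetical key `num_reviews`), inverting search_rank so that a position nearer the top scores higher, and the placeholder weights `a`, `b`, `c`.

```python
from collections import defaultdict


def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def popularity_index(rows, a=0.5, b=0.3, c=0.2):
    """Return {product_id: a*x + b*y + c*z}, normalizing within each category.

    The weights a, b, c are illustrative placeholders, not recommended values.
    """
    by_category = defaultdict(list)
    for row in rows:
        by_category[row["product_category"]].append(row)

    scores = {}
    for items in by_category.values():
        # Assumption: rank 1 is the best position, so invert the scaled rank.
        x = [1.0 - v for v in min_max([r["search_rank"] for r in items])]
        # Assumption: "ratings" means the number of reviews.
        y = min_max([r["num_reviews"] for r in items])
        z = min_max([r["avg_rating"] for r in items])
        for r, xi, yi, zi in zip(items, x, y, z):
            scores[r["product_id"]] = a * xi + b * yi + c * zi
    return scores


# Made-up rows for illustration only.
rows = [
    {"product_id": "p1", "product_category": "shoes",
     "search_rank": 1, "num_reviews": 200, "avg_rating": 4.5},
    {"product_id": "p2", "product_category": "shoes",
     "search_rank": 3, "num_reviews": 50, "avg_rating": 4.0},
    {"product_id": "p3", "product_category": "shoes",
     "search_rank": 2, "num_reviews": 125, "avg_rating": 3.5},
]
scores = popularity_index(rows)
```

With min-max scaling and weights that sum to 1, every score falls in [0, 1] and is only comparable between products in the same category, since each category is scaled separately.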

Is there a better way to do this? Are there common statistical techniques that I am missing and should incorporate?

Update from comments:

It is a single metric or index that we can look at to compare two products based on those 3 variables. For example, a product with popularity_index 44.5 is far more popular than a product with popularity_index 1.5. Something along the lines of a socio-economic index or a happiness index of countries, built from various variables.

