#StackBounty: #machine-learning #python #deep-learning #clustering #data-mining How to cluster skills in job domain?

Bounty: 100

I have a clustering problem: I need to cluster skills extracted from the job domain.

Let’s say a candidate mentions their familiarity with Amazon S3 buckets in a resume. Different people can write it in different ways. For example:

  1. amazon s3
  2. s3
  3. aws s3

A human can easily see that these three are exactly equivalent. I can’t use k-means-style clustering because it fails in many cases.

For example,

  1. spring
  2. spring framework
  3. Spring MVC
  4. Spring Boot

These may fall into the same cluster, which is wrong: a candidate who knows Spring Framework might not know Spring Boot, and so on.

Word similarity based on embeddings or a bag-of-words model fails here.
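To make the failure concrete, here is a minimal sketch using Jaccard overlap on word tokens as a stand-in for a bag-of-words similarity. The `jaccard` helper is my own illustrative function, not something from a library. The pair that should merge and the pair that should stay apart get the same score, so no threshold can separate them.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two skill strings (a simple bag-of-words similarity)."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Should be merged: same skill written two ways.
same_skill = jaccard("s3", "amazon s3")

# Should NOT be merged: related but different skills.
different_skill = jaccard("spring", "Spring Boot")

print(same_skill, different_skill)
```

Both pairs share one token out of two distinct tokens, so a threshold that merges "s3" with "amazon s3" also merges "spring" with "Spring Boot".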

What options do I have? Currently I have manually collected many word variations in a dict: each key is a root word and each value is an array of variations of that root word.
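For reference, the dict approach looks roughly like the sketch below. The names `SKILL_VARIATIONS`, `build_lookup`, and `normalize_skill` are illustrative, and the entries other than the S3 ones are made-up examples, not my real data.

```python
import re

# Root skill -> known variations (illustrative entries only).
SKILL_VARIATIONS = {
    "amazon s3": ["s3", "aws s3"],
    "spring boot": ["springboot"],  # hypothetical variation
    "spring mvc": ["spring-mvc"],   # hypothetical variation
}


def _clean(text: str) -> str:
    """Lowercase and collapse runs of whitespace so lookups are consistent."""
    return re.sub(r"\s+", " ", text.strip().lower())


def build_lookup(variations: dict) -> dict:
    """Invert root -> variations into variation -> root for O(1) lookups."""
    lookup = {}
    for root, variants in variations.items():
        for name in [root, *variants]:
            lookup[_clean(name)] = root
    return lookup


def normalize_skill(raw: str, lookup: dict):
    """Return the root skill for a raw mention, or None if it is unknown."""
    return lookup.get(_clean(raw))


LOOKUP = build_lookup(SKILL_VARIATIONS)
print(normalize_skill("AWS  S3", LOOKUP))      # known variation of "amazon s3"
print(normalize_skill("Spring Boot", LOOKUP))  # root word maps to itself
print(normalize_skill("spring", LOOKUP))       # not in the dict, so None
```

This keeps "spring" and "Spring Boot" apart because only exact (cleaned) matches are mapped, but every new variation has to be added by hand, which is the part I would like to avoid.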

Any help is really appreciated.

