#StackBounty: #pandas #string #list #distance #pdist pairwise distances between list of strings rows within a dataframe column

Bounty: 50

i have a dataframe that has a column of lists of string ids. (see below).
I want to create a distance matrix between all pairwise "distances" between all the rows
(e.g. if 10 rows, then it’s a 10x 10 matrix).
the rows are lists of ids, so I’m not sure how things like pdist can be used.

the values are string ids. just like string names


ids
0   [58545-19, 462423-43, 277581-25]
1   [0]
2   [454950-82, 433701-46, 228790-63, 266250-52, 458759-98, 152986-78, 222217-39, 433515-16, 265589-83, 439403-23, 277892-38, 223497-19, 224072-83, 461887-57, 436147-12, 227479-78, 228893-32, 279415-18, 439426-27, 437742-46, 438156-73, 438458-68, 277898-05, 438675-76, 454658-95, 431222-77, 462579-94, 434939-86, 222211-09, 178215-13, 459566-11, 463200-04, 439278-94, 459505-18, 399139-66, 455735-62, 327382-03, 439040-62, 233779-51, 431387-38, 438589-72, 437892-49, 458178-76]
3   [431380-63]
4   [442539-01, 434388-16, 454950-82, 463197-61, 228893-32, 464322-07, 462579-94, 438781-51, 437273-11, 265395-79, 463560-76, 462525-31, 439426-27, 438458-68, 464300-38, 442676-80]
5   [234729-10, 435926-98, 416670-04, 179514-28]
6   [0]
7   [0]
8   [267726-25, 235217-71, 227314-72, 185293-18, 434447-56, 170271-19, 454661-20]
9   [0]


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.