# #StackBounty: #algorithms #parallel-computing #haskell GroupBy key a sequence of ordered key values

### Bounty: 50

I have a sorted (by key) sequence of key value pairs:

``````< (1,"A"),(1,"C"),(1,"G"),(1,"B"),(2,"D"),(2,"F"),(3,"E") >

``````

And I have to produce a sequence of tuples like this:

``````< (1,< "A","C","G","B" >),(2, <"D","F">),(3, <"E">) >
``````

Where the second element of the tuple is a sequence with the values corresponding to the key.

I am working on haskell, under the hood I have a Seq that is a Vector, but a pseudocode solution is OK.

I have this algorithm that will run in parallel in $$O(lg n)$$, but cost of work is too high (more than $$O(n)$$):

``````comb :: Seq s => ((a, s b) -> (a, s b) -> Bool) -> s (a, s b) -> s (a, s b) -> s (a, s b)
comb f sl sr | lengthS sr == 0 = sl
| lengthS sl == 0 = sr
| f (nthS sl ((lengthS sl) - 1)) (nthS sr 0) =  let
(cl, vl) = (nthS sl ((lengthS sl) - 1))
(cr, vr) = (nthS sr 0)
union = singletonS (cr, appendS vl vr)
in appendS (appendS (takeS ((lengthS sl) - 1) sl) union) (dropS 1 sr)
| otherwise = appendS sl sr

groupDyC :: Seq s => ((a, s b) -> (a, s b) -> Bool) -> s (a, b) -> s (a, s b)
groupDyC f seq = reduceS (comb (f)) emptyS (mapS (x -> singletonS x) (mapS ((c, v) -> (c, singletonS v)) seq))
``````

Where `lengthA`, `nthS` are $$O(1)$$ and `appendS` is linear to the size of both sequences.

My problem is that the solution has to have work in $$O(n)$$ and parallel cost (S) $$O(lg n)$$. How could I create an algorithm that runs on that cost with these operations: Thank you very much.

Get this bounty!!!

This site uses Akismet to reduce spam. Learn how your comment data is processed.