#StackBounty: #c++ #performance #sorting #quick-sort Fast quicksort implementation

Bounty: 50

Just for learning purposes i wrote an implementation of quicksort algorithm.

I made some modifications to the algorithm to make it faster, that are:

  • two pivots setted to the boundaries of the middle partition(<, =, >), to avoid having two other variables counting the equal elements

  • the partitioning method checks first the presence of less(in left partition) or greater(in right partition) than pivot elements, if both are present, rather than moving the pivot twice, it just swaps them; if not behaves normally.

I benchmarked it a bit and compared the performance to std::sort; my algorithm, with random elements is faster than the STD one when there are more than 10000000, otherwise, with less elements std::sort is faster by some milliseconds(see below for actual benchmark results)

I would prefer some performance-related tips rather than design ones.

#include <algorithm>

template<class iterator>
void quickSort(iterator begin, iterator end)
{
    if (end - begin > 1)
    {
        auto lpivot = begin + (end - begin) / 2;
        auto rpivot = lpivot;

        auto pValue = *lpivot;

        auto left_it = lpivot - 1;
        auto right_it = rpivot + 1;

        auto lValue = *left_it;
        auto rValue = *right_it;

        bool isGreater = false;
        bool isLess = false;

        while (left_it != begin-1 || right_it != end)
        {
            if (lValue >= pValue)
            {
                if (lValue == pValue)
                {
                    lpivot--;
                    std::iter_swap(lpivot, left_it);
                }
                else
                    isGreater = true;
            }

            if (rValue <= pValue)
            {
                if (rValue == pValue)
                {
                    rpivot++;
                    std::iter_swap(rpivot, right_it);
                }
                else
                    isLess = true;
            }
            if (isGreater && isLess)
            {
                std::iter_swap(left_it, right_it);
            }
            else if (isGreater)
            {
                if (left_it != lpivot - 1)
                    std::iter_swap(left_it, lpivot - 1);
                std::iter_swap(rpivot - 1, lpivot - 1);
                std::iter_swap(rpivot, rpivot - 1);
                lpivot--;
                rpivot--;
            }
            else if (isLess)
            {
                if (right_it != rpivot + 1)
                    std::iter_swap(right_it, rpivot + 1);
                std::iter_swap(lpivot + 1, rpivot + 1);
                std::iter_swap(lpivot, lpivot + 1);
                lpivot++;
                rpivot++;
            }

            if (left_it != begin - 1)
                left_it--;
            if (right_it != end)
                right_it++;

            lValue = *left_it;
            rValue = *right_it;

            isGreater = false;
            isLess = false;
        }

        quickSort(begin, lpivot);
        quickSort(rpivot + 1, end);
    }
}

My algorithm benchmark

1000000  random integers --------> 80 ms
2000000  random integers --------> 165 ms
3000000  random integers --------> 247 ms
10000000 random integers --------> 780 ms

1000000  binary random integers -> 4 ms
2000000  binary random integers -> 9 ms
3000000  binary random integers -> 14 ms
10000000 binary random integers -> 49 ms

1000000  sorted integers --------> 19 ms
2000000  sorted integers --------> 43 ms
3000000  sorted integers --------> 65 ms
10000000 sorted integers --------> 232 ms

1000000  reversed integers ------> 17 ms
2000000  reversed integers ------> 37 ms
3000000  reversed integers ------> 60 ms
10000000 reversed integers ------> 216 ms

std::sort benchmark

1000000  random integers --------> 71 ms
2000000  random integers --------> 160 ms
3000000  random integers --------> 237 ms
10000000 random integers --------> 800 ms

1000000  binary random integers -> 4 ms
2000000  binary random integers -> 9 ms
3000000  binary random integers -> 13 ms
10000000 binary random integers -> 45 ms

1000000  sorted integers --------> 9 ms
2000000  sorted integers --------> 21 ms
3000000  sorted integers --------> 33 ms
10000000 sorted integers --------> 137 ms

1000000  reversed integers ------> 12 ms
2000000  reversed integers ------> 25 ms
3000000  reversed integers ------> 40 ms
10000000 reversed integers ------> 150 ms

bechmark code

int main()
{
    std::vector<int> values;

    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dist(0, 1000000);
    //std::uniform_int_distribution<> dist(0, 1); //just 0s and 1s array

    std::generate_n(std::back_inserter(values), 10000000, std::bind(dist, gen)); //random

    for (int i = 0; i < 10000000; i++)
    {
        //values.push_back(i);              //sorted array
        //values.push_back(10000000 - i);   //reversed array
    }

    typedef std::chrono::high_resolution_clock Time;
    typedef std::chrono::milliseconds ms;
    typedef std::chrono::duration<float> fsec;
    auto t0 = Time::now();

    //quickSort(values.begin(), values.end());
    //std::sort(values.begin(), values.end());

    auto t1 = Time::now();
    fsec fs = t1 - t0;
    ms d = std::chrono::duration_cast<ms>(fs);
    std::cout << fs.count() << "sn";
    std::cout << d.count() << "msn";

    return 0;
}

(I know those arrows suck, they are there just for clarity)


Get this bounty!!!