#StackBounty: #performance #comparative-review #swift #combinatorics On Knuth's "Algorithm L" to generate permutations in…

Bounty: 50

I needed a method to generate all permutations of given elements, so I decided to
implement “Algorithm L (Lexicographic permutation generation)” from
Donald E. Knuth, “GENERATING ALL PERMUTATIONS”
in Swift. The labels L2, L3, L4 refer to the steps in Knuth’s algorithm:

extension Array where Element: Comparable {

    /// Replaces the array by the next permutation of its elements in lexicographic
    /// order.
    ///
    /// It uses the "Algorithm L (Lexicographic permutation generation)" from
    ///    Donald E. Knuth, "GENERATING ALL PERMUTATIONS"
    ///    http://www-cs-faculty.stanford.edu/~uno/fasc2b.ps.gz
    ///
    /// - Returns: `true` if there was a next permutation, and `false` otherwise
    ///   (i.e. if the array elements were in descending order).

    mutating func permute() -> Bool {

        // Nothing to do for empty or single-element arrays:
        if count <= 1 {
            return false
        }

        // L2: Find last j such that self[j] <= self[j+1]. Terminate if no such j
        // exists.
        var j = count - 2
        while j >= 0 && self[j] > self[j+1] {
            j -= 1
        }
        if j == -1 {
            return false
        }

        // L3: Find last l such that self[j] <= self[l], then exchange elements j and l:
        var l = count - 1
        while self[j] > self[l] {
            l -= 1
        }
        swap(&self[j], &self[l])

        // L4: Reverse elements j+1 ... count-1:
        var lo = j + 1
        var hi = count - 1
        while lo < hi {
            swap(&self[lo], &self[hi])
            lo += 1
            hi -= 1
        }
        return true
    }
}

This worked correctly in all my tests; here is a simple example:

do {
    var a = ["a", "b", "c"]
    repeat {
        print(a)
    } while a.permute()
}

Output:

["a", "b", "c"]
["a", "c", "b"]
["b", "a", "c"]
["b", "c", "a"]
["c", "a", "b"]
["c", "b", "a"]

Then I tried to utilize existing methods from the Swift standard library to reduce
the code, which led to this implementation (called permute2 only to distinguish it
from the first one):

extension Array where Element: Comparable {

    mutating func permute2() -> Bool {

        // Nothing to do for empty or single-element arrays:
        if count <= 1 {
            return false
        }

        // L2: Find last j such that self[j] <= self[j+1]. Terminate if no such j
        // exists.
        guard let j = indices.reversed().dropFirst()
            .first(where: { self[$0] <= self[$0 + 1] })
        else { return false }

        // L3: Find last l such that self[j] <= self[l], then exchange elements j and l:
        let l = indices.reversed()
            .first(where: { self[j] <= self[$0] })!
        swap(&self[j], &self[l])

        // L4: Reverse elements j+1 ... count-1:
        replaceSubrange(j+1..<count, with: self[j+1..<count].reversed())
        return true
    }
}

All loops are replaced by single statements which are almost self-explanatory. So this
looks nice, but what about the performance?

For benchmarking, the $10! = 3628800$ permutations of 10 elements are computed:

let N = 10
var count = 0
let start = Date()
var a = Array(1...N)
repeat {
    count += 1
} while a.permute() // or a.permute2()
let end = Date()
print(count, end.timeIntervalSince(start))

The results (on a 1.2 GHz Intel Core m5 MacBook, compiled in Release mode) are

  • using the “iterative” permute(): 0.03 seconds
  • using the “functional” permute2(): 3 seconds

That is, the functional method is slower by a factor of 100.
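
A plausible explanation for part of the gap: replaceSubrange is called with a slice
of the array itself, and that slice holds a second reference to the buffer, so every
call presumably pays for a full copy-on-write copy; the reversed() index wrappers and
the closures add further overhead. As a middle ground, here is a sketch (not from the
original post) of the same three steps written against a newer standard library; it
assumes Swift 4.2 or later, where last(where:), swapAt(_:_:), and in-place reverse()
on a mutable slice are available:

    extension Array where Element: Comparable {

        /// Sketch: Algorithm L using modern stdlib calls (assumes Swift 4.2+).
        mutating func permute3() -> Bool {
            // L2: Find last j such that self[j] <= self[j+1]. `dropLast()` is
            // empty for counts 0 and 1, so those cases fall out naturally.
            guard let j = indices.dropLast().last(where: { self[$0] <= self[$0 + 1] })
            else { return false }

            // L3: Find last l such that self[j] <= self[l], then exchange:
            let l = indices.last(where: { self[j] <= self[$0] })!
            swapAt(j, l)

            // L4: Reverse elements j+1 ... count-1 in place; no temporary
            // array is built.
            self[(j + 1)...].reverse()
            return true
        }
    }

Note that in Swift 4 and later, swap(&self[j], &self[l]) with two subscripts of the
same array violates exclusive memory access; swapAt(_:_:) is the intended replacement.
Whether this sketch closes the gap would have to be measured.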

An alternative would be a non-mutating method which returns the next permutation
or nil:

extension Array where Element: Comparable {

    func nextPermutation() -> Array? {
        if count <= 1 { return nil }

        var j = count - 2
        while j >= 0 && self[j] > self[j+1] { j -= 1 }
        if j == -1 { return nil }

        var a = self // Now we need a mutable copy.
        var l = count - 1
        while self[j] > self[l] { l -= 1 }
        swap(&a[j], &a[l])

        var lo = j + 1
        var hi = count - 1
        while lo < hi {
            swap(&a[lo], &a[hi])
            lo += 1
            hi -= 1
        }
        return a
    }
}

The benchmark for the non-mutating version is

let N = 10
var count = 0
let start = Date()
var a = Array(1...N)
repeat {
    count += 1
    guard let b = a.nextPermutation() else { break }
    a = b
} while true
let end = Date()
print(count, end.timeIntervalSince(start))

and takes approx. 0.9 seconds on my MacBook, so this is considerably slower than the
mutating method permute().

All feedback is welcome, in particular (but not restricted to):

  • Naming of the methods and variables.
  • Swifty-ness of the implementation.
  • Iterative vs functional approach: Is it possible to use the (existing) functional
    methods without losing performance?
  • Mutating vs. non-mutating method: Which API is clearer? Can we make the non-mutating
    method as fast as the mutating one? (One idea is sketched after this list.)
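
On the last point, a minimal sketch (not part of the original post) is to let the
non-mutating variant delegate to the mutating one, so that the scanning logic exists
only once; the name nextPermutation2 is used here only to avoid clashing with the
version above:

    extension Array where Element: Comparable {

        /// Sketch: delegate to the mutating permute(). The copy shares storage
        /// with self until permute() first writes to it; at that point
        /// copy-on-write duplicates the buffer once.
        func nextPermutation2() -> [Element]? {
            var copy = self
            return copy.permute() ? copy : nil
        }
    }

Like the original nextPermutation(), this still pays for one array copy per call, so
it probably cannot fully match permute(); it would have to be benchmarked.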

Remark: I also implemented a Sequence-based enumeration, which is posted as a
separate question, “Sequence-based enumeration of permutations in lexicographic order”.


#StackBounty: #python #performance #benchmarking #pypy Accurately testing Pypy vs CPython performance

Bounty: 50

The Problem Description:

I have this custom “checksum” function:

NORMALIZER = 0x10000


def get_checksum(part1, part2, salt="trailing"):
    """Returns a checksum of two strings."""

    combined_string = part1 + part2 + " " + salt if part2 != "***" else part1
    ords = [ord(x) for x in combined_string]

    checksum = ords[0]  # initial value

    # TODO: document the logic behind the checksum calculations
    iterator = zip(ords[1:], ords)
    checksum += sum(x + 2 * y if counter % 2 else x * y
                    for counter, (x, y) in enumerate(iterator))
    checksum %= NORMALIZER

    return checksum

I want to test this function on both Python 3.6 and PyPy performance-wise. I’d like to see whether it performs better on PyPy, but I’m not completely sure what the most reliable and clean way to measure that is.

What I’ve tried and the Question:

Currently, I’m using timeit for both:

$ python3.6 -mtimeit -s "from test import get_checksum" "get_checksum('test1' * 100000, 'test2' * 100000)"
10 loops, best of 3: 329 msec per loop

$ pypy -mtimeit -s "from test import get_checksum" "get_checksum('test1' * 100000, 'test2' * 100000)"
10 loops, best of 3: 104 msec per loop

My concern is that timeit may not be the right tool for the job on PyPy, because of the potential JIT warm-up overhead.

Plus, PyPy itself prints the following warning before the test results:

WARNING: timeit is a very unreliable tool. use perf or something else for real measurements
pypy -m pip install perf
pypy -m perf timeit -s 'from test import get_checksum' "get_checksum('test1' * 1000000, 'test2' * 1000000)"

What would be the best and most accurate approach to test the same exact function performance across these and potentially other Python implementations?
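
As a sketch of the direction PyPy's warning suggests (assuming the recommended perf module, which was later renamed pyperf): its runner spawns worker processes, performs JIT warm-up runs, and reports statistics, so the same script gives comparable numbers on CPython and PyPy.

    # bench_checksum.py -- sketch; assumes pyperf is installed
    # (the module was named "perf" when PyPy printed this warning).
    import pyperf

    runner = pyperf.Runner()  # handles process spawning and warm-up runs
    runner.timeit(
        "get_checksum",
        stmt="get_checksum('test1' * 100000, 'test2' * 100000)",
        setup="from test import get_checksum",
    )

It would then be run as "python3.6 bench_checksum.py" and "pypy bench_checksum.py".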


#StackBounty: #performance #computational-geometry #matlab AABB Intersections with Space Partitions, A Sample Code Performance and Reli…

Bounty: 50

I originally wrote the following Matlab code to find intersections between a set of Axis-Aligned Bounding Boxes (AABB) and space partitions (here 8 partitions). I believe it is readable by itself; moreover, I have added some comments for even more clarity.

function [A,B] = AABBPart(bbx,it)                                     % bbx: aabb, it: iteration
global F
IT = it+1;
n = size(bbx,1);
F = cell(n,it);
A = Part([min(bbx(:,1:3)),max(bbx(:,4:6))],it,0);                     % recursive partitioning
B = F;                                                                % matlab does not allow
    function s = Part(bx,it,J)                                        %   output to be global
        s = {};
        if it < 1; return; end
        s = cell(8,1);
        p = bx(1:3);
        q = bx(4:6);
        h = 0.5*(p+q);
        prt = [p,h;...                                                % 8 sub-parts (octa)
            h(1),p(2:3),q(1),h(2:3);...
            p(1),h(2),p(3),h(1),q(2),h(3);...
            h(1:2),p(3),q(1:2),h(3);...
            p(1:2),h(1),h(1:2),q(3);...
            h(1),p(2),h(3),q(1),h(2),q(3);...
            p(1),h(2:3),h(1),q(2:3);...
            h,q];
        for j=1:8                                                     % check for each sub-part
            k = 0;
            t = zeros(0,1);
            for i=1:n
                if all(bbx(i,1:3) <= prt(j,4:6)) && ...               % intersection test for
                        all(prt(j,1:3) <= bbx(i,4:6))                 %   every aabb and sub-parts
                    k = k+1;
                    t(k) = i;
                end
            end
            if ~isempty(t)
                s{j,1} = [t; Part(prt(j,:),it-1,j)];                  % recursive call
                for i=1:numel(t)                                      % collecting the results
                    if isempty(F{t(i),IT-it})
                        F{t(i),IT-it} = [-J,j];
                    else
                        F{t(i),IT-it} = [F{t(i),IT-it}; [-J,j]];
                    end
                end
            end
        end
    end
end

Concerns:

  • In my tests, it seems that a few intersections are probably missed, say, 10 or so out of 1000 or more in a typical setup. So I would be glad if you could help find any problematic parts in the code.

  • I am also concerned about using global F. I would prefer to get rid of it.

  • Any other, better solution in terms of speed will be welcome.

Note that the code is complete, and you can easily try it with the following setup:

n = 10000;                        % in the original application, n would be millions
bbx = rand(n,6);
it = 3;
[A,B] = AABBPart(bbx,it);


#StackBounty: #python #performance #python-3.x #sorting #heap Sorting almost sorted array with a minimum heap

Bounty: 50

I’m currently looking into the quickest way to sort an almost sorted array:

Given an array of $n$ elements, where each element is at most $k$ away
from its target position, devise an algorithm that sorts in $O(n log k)$
time.

I’ve implemented a sorting function which “slides” through the input list, pushing the elements of a sliding window onto a min-heap (using the built-in heapq module) and popping the minimum element into the result list:

from typing import List
from heapq import heappush, heappop


def sort_almost_sorted(a: List[int], k: int) -> List[int]:
    if not a:
        return []

    length = len(a)
    if k >= length:
        return sorted(a)  # apply built-in "timsort", designed to be quick on almost sorted lists

    result = []
    min_heap = []  # maintain a min heap for a sliding window
    for slice_start in range(0, length, k + 1):  # sliding window
        # push the next slice to min heap
        for index in range(slice_start, min(slice_start + k + 1, length)):
            heappush(min_heap, a[index])

        result.append(heappop(min_heap))

    # put the rest of the heap into the result
    for _ in range(len(min_heap)):
        result.append(heappop(min_heap))

    return result

It works on my sample inputs.
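
Beyond hand-picked samples, a quick randomized cross-check against the built-in sorted() can raise confidence. Here is a sketch (not part of the original code; it assumes sort_almost_sorted from above is in scope) that builds k-sorted inputs by shuffling disjoint windows of size k + 1, which displaces each element by at most k positions:

    import random
    from typing import List


    def make_k_sorted(n: int, k: int) -> List[int]:
        a = sorted(random.randrange(100) for _ in range(n))
        for i in range(0, n, k + 1):
            window = a[i:i + k + 1]
            random.shuffle(window)   # each element stays inside its window,
            a[i:i + k + 1] = window  # so it moves at most k positions
        return a


    for _ in range(100):
        n, k = random.randint(0, 50), random.randint(1, 10)
        a = make_k_sorted(n, k)
        assert sort_almost_sorted(a, k) == sorted(a)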

Do you think I’m using the min-heap appropriately, and is this implementation $O(n \log k)$? What would you improve, code-quality- or code-organization-wise?


#StackBounty: #windows-server-2008 #sql-server #performance #vmware-esxi Diagnosing slow data processing on a VM

Bounty: 50

We’re trying to diagnose a VM apparently running slowly, from inside the VM.

The situation:

  • We have an IIS-hosted application on Windows Server 2008R2 running on a 6-core, 12GB virtual machine.
  • We have a database running on an SQL Server 2008R2 SP3 cluster (16+ cores, >16GB RAM)
  • The application is processing a queue of messages against this database. Processing consists mostly of query/fetch and update, maybe a dozen or so roundtrips per message. A limited number of threads (3) are assigned to this workload, which is synchronous, so the threads do block on database responses.
  • The database is apparently lightly loaded: only a few percent of maximum CPU load.
  • To our knowledge, both the database and the VM host are in the same datacentre.

  • The database reports considerable time spent waiting on async network IO, i.e. waiting for the application to consume data.
  • The application VM is also apparently lightly loaded: ~20% CPU.
  • The infrastructure is not owned by us; our only access is via RDP to the virtual machine and SQL Management Studio to the database cluster. We do not have sufficient privileges to run a profiler, though we do record performance counters for both the database and the VM.

A few weeks ago, the message processing rate abruptly dropped by 70-80%. As far as we are aware, nothing had changed: the application had not been modified or reconfigured, and the performance counters don’t indicate any change in load characteristics. The owners of the infrastructure have stated that nothing has changed at their end.

As part of a restart process, the application was made to reload its message queue. This involves one simple SELECT of a few hundred thousand rows which are then read into in-memory structures. The database serviced the SELECT in a few seconds but then waited for ~10 minutes on the application reading the resultset. This is a single-threaded operation involving very simple deserialisation, and should not take more than a couple of minutes on this hardware.

My current theory is that network latency has increased in some way, but ping only reports ‘<1ms’ and we don’t have a baseline in any case. hrPing reports times between 0.5 and 2ms from the application server to the database.

Another possibility is that the VM’s real CPU capacity has in some way decreased, but I’d expect that to manifest as increased ‘apparent’ load.

Are there any other avenues of investigation available to us?


#StackBounty: #debian #performance #java Bad performance with Java

Bounty: 50

I have a computer that was running Windows with a lot of big programs (like Adobe Fireworks and many others), and the computer's performance was really good. I decided to format the computer (I didn’t need all those programs, because this computer previously had a different owner) and I installed Debian 8 Jessie (stable).

But since the first fresh installation, every program that needs Java to run (like NetBeans, Google Chrome, or the Atom advanced text editor) starts consuming CPU progressively (checked via the top command) until I have to reboot manually with the power button (the computer becomes unusable; I can’t open the menu and click on Power Off).

I tried using different versions of Java (7 and 8), but nothing worked. Java 7 is installed through the official repositories and version 8 from Oracle’s official download site.

Any word of advice?

EDIT 1: Java version information

lrwxrwxrwx   1 root root    24 May  6  2014 default-java -> java-1.7.0-openjdk-amd64
lrwxrwxrwx   1 root root    20 Nov  7 01:58 java-1.7.0-openjdk-amd64 -> java-7-openjdk-amd64
-rw-r--r--   1 root root  2439 Feb  7 21:22 .java-1.7.0-openjdk-amd64.jinfo
drwxr-xr-x   5 root root  4096 Jan 11 18:54 java-6-openjdk-amd64
drwxr-xr-x   8 root root  4096 Feb 21 11:10 java-7-openjdk-amd64
drwxr-xr-x   9 root root  4096 Feb  3 08:56 jdk-8-oracle-x64
-rw-r--r--   1 root root  2531 Feb  2 10:16 .jdk-8-oracle-x64.jinfo
drwxr-xr-x   2 root root  4096 Feb 21 11:12 openjdk-7

EDIT 2: Java performance

Also, I’ve noticed via the top command that the Java process consumes a lot of CPU (more than 200%).

