#StackBounty: #python #performance #python-3.x #sorting #heap Sorting almost sorted array with a minimum heap

Bounty: 50

I’m currently looking into the quickest way to sort an almost sorted array:

Given an array of $n$ elements, where each element is at most $k$ away
from its target position, devise an algorithm that sorts in $O(n \log k)$
time.
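
For reference, the textbook formulation behind this bound keeps the heap at a fixed size of $k + 1$. Here is a minimal sketch of that variant (sort_k_sorted is a made-up name) to make the target complexity concrete:

from heapq import heapify, heappop, heappush
from typing import List


def sort_k_sorted(a: List[int], k: int) -> List[int]:
    # The smallest remaining element is always among the next k + 1 items,
    # so a heap of fixed size k + 1 suffices: each push/pop costs O(log k).
    heap = a[:k + 1]
    heapify(heap)
    result = []
    for x in a[k + 1:]:
        result.append(heappop(heap))
        heappush(heap, x)
    while heap:
        result.append(heappop(heap))
    return result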

I’ve implemented a sorting function which “slides” through the input list: it pushes the elements of each sliding window onto a min heap (using the built-in heapq module) and pops the minimum element, which is collected in the result list:

from typing import List
from heapq import heappush, heappop


def sort_almost_sorted(a: List[int], k: int) -> List[int]:
    if not a:
        return []

    length = len(a)
    if k >= length:
        return sorted(a)  # apply built-in "timsort", designed to be quick on almost sorted lists

    result = []
    min_heap = []  # maintain a min heap for a sliding window
    for slice_start in range(0, length, k + 1):  # sliding window
        # push the next slice to min heap
        for index in range(slice_start, min(slice_start + k + 1, length)):
            heappush(min_heap, a[index])

        result.append(heappop(min_heap))

    # put the rest of the heap into the result
    for _ in range(len(min_heap)):
        result.append(heappop(min_heap))

    return result

It works on my sample inputs.

Do you think I’m using the min heap appropriately, and is this implementation $O(n \log k)$? What would you improve, code-quality or code-organization wise?


Get this bounty!!!

#StackBounty: #windows-server-2008 #sql-server #performance #vmware-esxi Diagnosing slow data processing on a VM

Bounty: 50

We’re trying to diagnose a VM apparently running slowly, from inside the VM.

The situation:

  • We have an IIS-hosted application on Windows Server 2008R2 running on a 6-core, 12GB virtual machine.
  • We have a database running on an SQL Server 2008R2 SP3 cluster (16+ cores, >16GB RAM)
  • The application is processing a queue of messages against this database. Processing consists mostly of query/fetch and update, maybe a dozen or so roundtrips per message. A limited number of threads (3) are assigned to this workload, which is synchronous, so the threads do block on database responses.
  • The database is apparently lightly loaded: only a few percent of maximum CPU load.
  • To our knowledge, both the database and the VM host are in the same datacentre.

The database reports considerable time spent waiting on async network IO, i.e. waiting for the application to consume data.
The application VM is also apparently lightly loaded: ~20% CPU.
The infrastructure is not owned by us and our only access is via RDP to the virtual machine, and SQL Management Studio to the database cluster. We do not have sufficient privileges to run a profiler, though we do record performance counters for both database and VM.
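
For reference, the wait the database reports can be checked from SQL Management Studio with a wait-stats query along these lines (a sketch; ASYNC_NETWORK_IO is the wait type SQL Server records while a client consumes a result set, and diffing two snapshots taken some interval apart gives a rate rather than a cumulative total):

select wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / nullif(waiting_tasks_count, 0) as avg_wait_ms
from sys.dm_os_wait_stats
where wait_type = 'ASYNC_NETWORK_IO';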

A few weeks ago, the message processing rate abruptly dropped by 70-80%. As far as we are aware, nothing had changed: the application had not been modified or reconfigured, and the performance counters don’t indicate any change in load characteristics. The owners of the infrastructure have stated that nothing has changed at their end.

As part of a restart process, the application was made to reload its message queue. This involves one simple SELECT of a few hundred thousand rows, which are then read into in-memory structures. The database serviced the SELECT in a few seconds but then waited ~10 minutes on the application reading the result set. This is a single-threaded operation involving very simple deserialisation, and should not take more than a couple of minutes on this hardware.

My current theory is that network latency has increased in some way, but ping only reports ‘<1ms’ and we don’t have a baseline in any case. hrPing reports times between 0.5 and 2ms from the application server to the database.

Another possibility is that the VM’s real CPU capacity has in some way decreased, but I’d expect that to manifest as increased ‘apparent’ load.

Are there any other avenues of investigation available to us?


Get this bounty!!!

#StackBounty: #debian #performance #java Bad performance with Java

Bounty: 50

I have a computer that was running Windows with a lot of big programs (like Adobe Fireworks and many other things) and its performance was really good. I decided to format the computer (I didn’t need all those programs because it previously had a different owner) and I installed Debian 8 Jessie (stable).

But since the first fresh installation, every program that needs Java to run (like NetBeans, Google Chrome, or the Atom text editor) starts consuming CPU progressively (checked via the top command) until I have to reboot manually with the power button (the computer becomes unusable; I can’t open the menu and click on Power Off).

I tried using different versions of Java (7 and 8) but nothing worked. Java 7 is installed through the official repositories and version 8 through Oracle’s official download site.

Any word of advice?

EDIT 1: Java version information

lrwxrwxrwx   1 root root    24 May  6  2014 default-java -> java-1.7.0-openjdk-amd64
lrwxrwxrwx   1 root root    20 Nov  7 01:58 java-1.7.0-openjdk-amd64 -> java-7-openjdk-amd64
-rw-r--r--   1 root root  2439 Feb  7 21:22 .java-1.7.0-openjdk-amd64.jinfo
drwxr-xr-x   5 root root  4096 Jan 11 18:54 java-6-openjdk-amd64
drwxr-xr-x   8 root root  4096 Feb 21 11:10 java-7-openjdk-amd64
drwxr-xr-x   9 root root  4096 Feb  3 08:56 jdk-8-oracle-x64
-rw-r--r--   1 root root  2531 Feb  2 10:16 .jdk-8-oracle-x64.jinfo
drwxr-xr-x   2 root root  4096 Feb 21 11:12 openjdk-7

Edit 2: Java performance
I’ve also noticed via the top command that the Java process consumes a lot of CPU (more than 200%).


Get this bounty!!!

#StackBounty: #sql-server #performance #query-optimization #sql-server-2014 #database-performance Why doesn't SQL Server use the in…

Bounty: 50

Given the following in a SQL Server 2014 DB:

create table t 
(
    c1 int primary key,
    c2 datetime2(7),
    c3 nvarchar(20),
    c4 as cast(dbo.toTimeZone(c2, c3, 'UTC') as date) persisted
);

create index i on t (c4);

declare @i int = 0;

while @i < 10000 
begin
    insert into t (c1, c2, c3) values
        (@i, dateadd(day, @i, '1970-01-02 03:04:05:6'), 'Asia/Manila');
    set @i = @i + 1;
end;

toTimeZone is a CLR UDF that converts a datetime2 from one time zone to another.

When I run the following query:

select c1 
from t 
where c4 >= '1970-01-02'
    and c4 <= '1970-03-04';

The execution plan chosen by SQL Server indicates that the index i isn’t used.

Instead there is a scan on the implicit index on the PK followed by a couple of scalar computations before finally a filter using the predicates of the query. The execution plan that I was expecting is a scan on i.
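
One way to narrow this down (a diagnostic sketch against the table and index defined above) is to force the index with a table hint and compare the two plans; if the forced plan is accepted and cheaper, the original choice was a costing or expression-matching decision rather than an inability to use i:

select c1
from t with (index (i))
where c4 >= '1970-01-02'
    and c4 <= '1970-03-04';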

Use the SSDT project in this ZIP file to try and replicate the problem. It includes a mock definition of the CLR UDF. Included also is the execution plan I get.


Get this bounty!!!

#StackBounty: #windows #performance #remote-desktop #iis #sql-server Slow user session in Windows 2016 Server with SQL Server Standard

Bounty: 100

I’ve recently installed a Windows 2016 Server with SQL Server 2016 Standard. The setup is being used as an IIS webserver with Reporting Services.

Anyhow, when connecting to the server via RDP, the performance is terrible. For instance, launching SQL Server Management Studio takes a couple of minutes instead of seconds, even though the overall CPU usage is about 10% or so. Also, just popping up the Start menu can take 10 seconds or so. All this while the website is performing normally.

I’m guessing that since the server is optimized for “Background services” this might have something to do with it, or that there’s some new super-smart throttling going on with interactive/user sessions.

The previous server was Windows Server 2013, configured about the same, but without this issue. — Help me Obi-Wan Kenobi, you’re our only hope!


Get this bounty!!!

#StackBounty: #c++ #performance #sorting #quick-sort Fast quicksort implementation

Bounty: 50

Just for learning purposes I wrote an implementation of the quicksort algorithm.

I made some modifications to the algorithm to make it faster, which are:

  • two pivots, set to the boundaries of the middle partition (<, =, >), to avoid needing two other variables to count the equal elements

  • the partitioning method first checks for the presence of elements less than the pivot (in the left partition) or greater than the pivot (in the right partition); if both are present, rather than moving the pivot twice, it just swaps them; if not, it behaves normally.

I benchmarked it a bit and compared the performance to std::sort; with random elements my algorithm is faster than the std one when there are more than 10,000,000 elements; with fewer elements, std::sort is faster by some milliseconds (see below for actual benchmark results).

I would prefer some performance-related tips rather than design ones.

#include <algorithm>

template<class iterator>
void quickSort(iterator begin, iterator end)
{
    if (end - begin > 1)
    {
        auto lpivot = begin + (end - begin) / 2;
        auto rpivot = lpivot;

        auto pValue = *lpivot;

        auto left_it = lpivot - 1;
        auto right_it = rpivot + 1;

        auto lValue = *left_it;
        auto rValue = *right_it;

        bool isGreater = false;
        bool isLess = false;

        while (left_it != begin-1 || right_it != end)
        {
            if (lValue >= pValue)
            {
                if (lValue == pValue)
                {
                    lpivot--;
                    std::iter_swap(lpivot, left_it);
                }
                else
                    isGreater = true;
            }

            if (rValue <= pValue)
            {
                if (rValue == pValue)
                {
                    rpivot++;
                    std::iter_swap(rpivot, right_it);
                }
                else
                    isLess = true;
            }
            if (isGreater && isLess)
            {
                std::iter_swap(left_it, right_it);
            }
            else if (isGreater)
            {
                if (left_it != lpivot - 1)
                    std::iter_swap(left_it, lpivot - 1);
                std::iter_swap(rpivot - 1, lpivot - 1);
                std::iter_swap(rpivot, rpivot - 1);
                lpivot--;
                rpivot--;
            }
            else if (isLess)
            {
                if (right_it != rpivot + 1)
                    std::iter_swap(right_it, rpivot + 1);
                std::iter_swap(lpivot + 1, rpivot + 1);
                std::iter_swap(lpivot, lpivot + 1);
                lpivot++;
                rpivot++;
            }

            if (left_it != begin - 1)
                left_it--;
            if (right_it != end)
                right_it++;

            lValue = *left_it;
            rValue = *right_it;

            isGreater = false;
            isLess = false;
        }

        quickSort(begin, lpivot);
        quickSort(rpivot + 1, end);
    }
}

My algorithm benchmark

1000000  random integers --------> 80 ms
2000000  random integers --------> 165 ms
3000000  random integers --------> 247 ms
10000000 random integers --------> 780 ms

1000000  binary random integers -> 4 ms
2000000  binary random integers -> 9 ms
3000000  binary random integers -> 14 ms
10000000 binary random integers -> 49 ms

1000000  sorted integers --------> 19 ms
2000000  sorted integers --------> 43 ms
3000000  sorted integers --------> 65 ms
10000000 sorted integers --------> 232 ms

1000000  reversed integers ------> 17 ms
2000000  reversed integers ------> 37 ms
3000000  reversed integers ------> 60 ms
10000000 reversed integers ------> 216 ms

std::sort benchmark

1000000  random integers --------> 71 ms
2000000  random integers --------> 160 ms
3000000  random integers --------> 237 ms
10000000 random integers --------> 800 ms

1000000  binary random integers -> 4 ms
2000000  binary random integers -> 9 ms
3000000  binary random integers -> 13 ms
10000000 binary random integers -> 45 ms

1000000  sorted integers --------> 9 ms
2000000  sorted integers --------> 21 ms
3000000  sorted integers --------> 33 ms
10000000 sorted integers --------> 137 ms

1000000  reversed integers ------> 12 ms
2000000  reversed integers ------> 25 ms
3000000  reversed integers ------> 40 ms
10000000 reversed integers ------> 150 ms

benchmark code

#include <algorithm>
#include <chrono>
#include <functional>
#include <iostream>
#include <iterator>
#include <random>
#include <vector>

int main()
{
    std::vector<int> values;

    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dist(0, 1000000);
    //std::uniform_int_distribution<> dist(0, 1); //just 0s and 1s array

    std::generate_n(std::back_inserter(values), 10000000, std::bind(dist, gen)); //random

    for (int i = 0; i < 10000000; i++)
    {
        //values.push_back(i);              //sorted array
        //values.push_back(10000000 - i);   //reversed array
    }

    typedef std::chrono::high_resolution_clock Time;
    typedef std::chrono::milliseconds ms;
    typedef std::chrono::duration<float> fsec;
    auto t0 = Time::now();

    //quickSort(values.begin(), values.end());
    //std::sort(values.begin(), values.end());

    auto t1 = Time::now();
    fsec fs = t1 - t0;
    ms d = std::chrono::duration_cast<ms>(fs);
    std::cout << fs.count() << "s\n";
    std::cout << d.count() << "ms\n";

    return 0;
}

(I know those arrows suck; they are there just for clarity.)


Get this bounty!!!

#StackBounty: #tls #key-exchange #diffie-hellman #ecc #performance What are the performance differences (for client and server) between…

Bounty: 50

My question is about the client and server performance for (EC)DHE.
I am having difficulties in understanding this picture (which is based on numbers from Ivan Ristić’s book “Bulletproof SSL and TLS”).

Let’s start with the third row “RSA 2048, DHE 2048”. I can understand that client and server computation times differ because creating a signature is different from verifying it.

I understand that ECDHE is faster than DHE and therefore the numbers in the second row are smaller than the ones in the third row.

My questions are:

  1. The effort for DHE and ECDHE is identical for client and server, isn’t it?
  2. Why is the client’s computation time for “RSA 2048, ECDHE 256” smaller than the server’s while for “RSA 2048, DHE 2048” it is the other way around?


Get this bounty!!!

#StackBounty: #javascript #performance #object-oriented #node.js #database Making job listings project more modular and flexible

Bounty: 50

I am creating a pet project where candidates can apply and recruiters can post their listings:

db.js

var pg = require('pg');

var config = {
  host: 'localhost',
  user: 'v',
  password: 'a',
  database: 'j',
};

var pool = new pg.Pool(config);

pool.connect(function(err, client, done) {
  if(err) {
    return console.error('error fetching client from pool', err);
  }
  client.query('SELECT $1::int AS number', ['1'], function(err, result) {
    done(err);

    if(err) {
      return console.error('error running query', err);
    }
    console.log(result.rows[0].number);
    //output: 1
  });
});

pool.on('error', function (err, client) {
  // if an error is encountered by a client while it sits idle in the pool
  // the pool itself will emit an error event with both the error and
  // the client which emitted the original error
  // this is a rare occurrence but can happen if there is a network partition
  // between your application and the database, the database restarts, etc.
  // and so you might want to handle it and at least log it out
  console.error('idle client error', err.message, err.stack)
});

module.exports = pool;

index.js

'use strict';

var express = require('express');
var Promise = require('promise');
var app = express();
var router = express.Router();
var PORT = 3000;
var pool = require('./db.js');

app.use(function (req, res, next) {
  pool.connect(function(error, client, done) {
    // Handle connection errors
    if(error) {
      done();
      console.log(error.message);
      return res.status(500).json({success: false, data: error});
    }
    req.client = client;
    next();
  });
});

router.get('/topActiveUsers', (req, res) => {
  topActiveUsers(req, res);
});

router.get('/users', (req, res) => {
  userInfo(req, res);
});

app.use(router);

app.get('*', function (req, res) {
  res.status(400).send('Invalid route');
});

app.listen(PORT, function () {
  console.log('App listening on port ' + PORT);
});

var topActiveUsers = function topActiveUsers(req, res) {
};

var userInfo = function userInfo(req, res) {
  User.getById(req)
      .then(function (user) {
        return user;
      })
      .then(function getCompanies(user) {
        return user.companies(req);
      })
      .then(function getListings(user) {
        return user.listings(req);
      })
      .then(function getApplications(user) {
        return user.applications(req);
      })
      .then(function success(user) {
        res.json({
          id: user.id,
          name: user.name,
          createdAt: user.createdAt,
          companies: user._companies,
          listings: user._listings,
          applications: user._applications
        });
      })
      .catch(function error(error) {
        console.log('In catch');
        console.log('error', error.message);
        res.end();
        throw error;
      });
};

/**
 * User m2m Company
 * User o2m Listing
 * User m2m applications
 */
function User(opt_data) {
  var data = opt_data || {};

  this.id = data['id'] || null;
  this.name = data['name'] || '';
  this.createdAt = data['created_at'] || new Date();
  this._companies = [];
  this._listings = [];
  this._applications = [];
}
User._RESOURCE_LIMIT = 5;
User._TABLE_NAME = 'users';

User.getById = function getById(req, callback) {
  var client = req.client;
  client.connection.on('message', function(msg) {
    console.log(msg)
    //console.log(msg.name)
  });
  var queryString = 'select * from users where id = $1::int';
  return new Promise(function _promise(resolve, reject) {
    client.query(
        queryString, [req.query.id],
        function result(error, result) {
      if (error) {
        console.log(error.message);
        return reject(error);
      }

      //console.log(result.rows);

      /*
      client.end(function (error) {
        if (error) throw error;
      });*/
      resolve(new User(result.rows[0]));
    });
  });
};

var UserProto = User.prototype;

UserProto.companies = function companies(req) {
  var client = req.client;
  var queryString = 'select c.id, c.name, t.contact_user '+
    'from companies c, teams t '+
    'where t.user_id = $1::int and t.company_id = c.id '+
    'limit $2::int';
  var self = this;
  return new Promise(function _promise(resolve, reject) {
    client.query(
        queryString,
        [req.query.id, User._RESOURCE_LIMIT],
        function result(error, result) {
      if (error) return reject(error);

      //console.log(result.rows);

      /*
      client.end(function (error) {
        if (error) throw error;
      });*/
      self._companies = result.rows.map(function (data) {
        return new Company(data);
      });
      resolve(self);
    });
  });
};

UserProto.listings = function listings(req, callback) {
  var client = req.client;
  var queryString = 'select * from listings '+
    'where created_by = $1::int '+
    'limit $2::int';
  var self = this;
  return new Promise(function _promise(resolve, reject) {
    client.query(
        queryString,
        [req.query.id, User._RESOURCE_LIMIT],
        function result(error, result) {
      if (error) {
        return reject(error);
      }

      //console.log(result.rows);

      /*
      client.end(function (error) {
        if (error) throw error;
      });*/
      self._listings = result.rows.map(function (data) {
        return new Listing(data);
      });
      //console.log('Got listings', self);
      resolve(self);
    });
  });
};

UserProto.applications = function applications(req, callback) {
  var client = req.client;
  var queryString = 'select a.id as app_id, a.created_at, a.cover_letter, '+
    'l.id as list_id, l.name, l.description '+
    'from applications a, listings l '+
    'where a.user_id = $1::int and a.listing_id = l.id '+
    'limit $2::int';
  var self = this;
  return new Promise(function _promise(resolve, reject) {
    client.query(
        queryString,
        [req.query.id, User._RESOURCE_LIMIT],
        function result(error, result) {
      if (error) return reject(error);

      console.log(result.rows);

      /*
      client.end(function (error) {
        if (error) throw error;
      });*/
      self._applications = result.rows.map(function (data) {
        return new Application(data);
      });
      resolve(self);
    });
  });
};

function Company(opt_data) {
  var data = opt_data || {};
  this.id = data['id'] || null;
  this.createdAt = data['created_at'] || new Date();
  this.name = data['name'] || '';
  this.isContact = false;
}

function Listing(opt_data) {
  var data = opt_data || {};
  this.id = data['id'] || null;
  this.createdAt = data['created_at'] || new Date();
  this.name = data['name'] || '';
  this.description = data['description'] || '';
}

function Application(opt_data) {
  var data = opt_data || {};
  this.id = data['id'] || null;
  this.createdAt = data['created_at'] || new Date();
  this.listing = data['listing'] || null;
  this.coverLetter = data['cover_letter'] || '';
}

data.sql

insert into users (id, created_at, name) values
    (1, '2015-01-13 15:30', 'Mark'),
    (2, '2015-01-13 15:30', 'John'),
    (3, '2016-01-01 10:30', 'Melinda'),
    (4, '2016-01-17 23:30', 'Carl'),
    (5, '2016-02-02 16:30', 'Tim'),
    (6, '2016-02-02 16:30', 'Jessica')
;

insert into companies (id, created_at, name) values
    (1, '2015-01-13 15:00', 'Facewall'),
    (2, '2015-01-17 15:00', 'Carl & Co')
;

insert into teams (company_id, user_id, contact_user) values
    (1, 1, TRUE),
    (2, 3, FALSE),
    (2, 4, TRUE)
;

insert into listings (id, created_at, created_by, name, description) values
    (1, '2015-01-15 11:00', 1, 'Join us conquering the world!', 'This is your best chance to be on the right side of the equation...')
;

insert into applications (created_at, user_id, listing_id, cover_letter) values
    ('2015-01-16 12:00', 2, 1, 'Hello, ...')
;

Notes:

  1. I need to apply dependency injection from the start. I don’t have enough experience in NodeJS, but I have tried one approach here. I would like to know if there is a better approach.
  2. Is the code flexible and maintainable? I see some repetitive logic. How can I refactor it?
  3. Can my code handle thousands of requests? If not, what changes do I need to make?
  4. What if I give this code as part of an interview process? On what points could I get rejected?

Update

Here is my updated code:

index.js

'use strict';

var express = require('express');
var Promise = require('promise');
var router = express.Router();
var app = express();

var pool = require('./db.js')();
var User = require('./models');

var PORT = 3000;

app.use(function (req, res, next) {
  pool.connect(function(error, client, done) {
    // Handle connection errors
    if (error) {
      done(error);
      console.log(error.message);
      return res.status(500)
          .json({success: false, data: error});
    }
    req.client = client;
    req.done = done;
    next();
  });
});

router.get('/topActiveUsers', (req, res) => {
  topActiveUsers(req, res);
});

router.get('/users', (req, res) => {
  userInfo(req, res);
});

app.use(router);

app.get('*', function (req, res) {
  res.status(400).send('Invalid route');
});

app.listen(PORT, function () {
  console.log('App listening on port ' + PORT);
});

var topActiveUsers = function topActiveUsers(req, res) {
  var ENTRIES_PER_PAGE = 3;
  var startIndex = 0;
  var total = 0;
  req.query.page = +req.query.page || 0;

  var pageNum = req.query.page > 0 ? req.query.page : 0;
  if (pageNum > 0) {
    startIndex = ENTRIES_PER_PAGE * (pageNum - 1);
  }
  total = ENTRIES_PER_PAGE * (pageNum + 1);

  User.topActiveUsers(req)
      .then(function fullfilled(users) {
        if (users.length < startIndex) {
          throw new Error('Invalid pagination offset');
        }
        if (users.length > total) {
          users = users.slice(startIndex, startIndex + ENTRIES_PER_PAGE);
        } else {
          users = users.splice(startIndex);
        }
        return Promise.all(users.map(function (user) {
          return user.applicationListings(req);
        }));
      })
      .then(function fullfilled(users) {
        var result = users.map(function (user) {
          return {
            id: user.id,
            name: user.name,
            count: user._appliedListings.length,
            createdAt: user.createdAt,
            listings: user._appliedListings
          };
        });
        res.json(result);
      })
      .catch(function rejected(error) {
        console.log(error.message);
        throw error;
      })
      .finally(function () {
        res.end();
      });
};

var userInfo = function userInfo(req, res) {
  User.getById(req)
      // run companies/listings/applications in "parallel"
      .then(function fullfilled(user) {
        return Promise.all([
          user.id,
          user.name,
          user.createdAt,
          user.companies(req),
          user.listings(req),
          user.applications(req)
        ]);
      })
      .then(function fullfilled([
            id, name, createdAt, companies, listings, applications]) {
        res.json({
          id: id,
          name: name,
          createdAt: createdAt,
          companies: companies,
          listings: listings,
          applications: applications
        });
      })
      .catch(function rejected(error) {
        console.log('error', error.message);
        throw error;
      })
      .finally(function () {
        res.end();
      });
};

models/index.js

var Promise = require('promise');

module.exports = User;

/**
 * User m2m Company
 * User o2m Listing
 * User m2m applications
 */
function User(opt_data) {
  var data = opt_data || {};

  this.id = data['id'] || null;
  this.name = data['name'] || '';
  this.createdAt = data['created_at'] || new Date();
  this._companies = [];
  this._listings = [];
  this._applications = [];
  this._appliedListings = [];
}
User._RESOURCE_LIMIT = 5;
var UserProto = User.prototype;

User.topActiveUsers = function topActiveUsers(req) {
  var queryString = "select * from users u inner join "+
      "(select user_id, count(id) cnt from applications "+
      "where id in (select id from applications where "+
      "created_at > current_date - interval '1 week') "+
      "group by user_id) a on u.id = a.user_id order by a.cnt desc";
  return queryPromise(req, queryString)
      .then(function fullfilled(result) {
        return result.rows.map(function(row) {
          return new User(row);
        });
      });
};

User.getById = function getById(req) {
  var queryString = 'select * from users where id = $1::int';
  return queryPromise(req, queryString, [req.query.id])
      .then(function fullfilled(result) {
        return new User(result.rows[0]);
      });
};

UserProto.companies = function companies(req) {
  var queryString = 'select c.id, c.name, t.contact_user '+
      'from companies c, teams t '+
      'where t.user_id = $1::int and t.company_id = c.id '+
      'limit $2::int';
  return queryPromise(req, queryString, [this.id, User._RESOURCE_LIMIT])
    .then(function fullfilled(result) {
      return result.rows.map(function (data) {
        return new Company(data);
      });
    });
};

UserProto.listings = function listings(req) {
  var queryString = 'select * from listings '+
      'where created_by = $1::int '+
      'limit $2::int';
  return queryPromise(req, queryString, [this.id, User._RESOURCE_LIMIT])
    .then(function fullfilled(result) {
      return result.rows.map(function (data) {
        return new Listing(data);
      });
    });
};

UserProto.applicationListings = function applications(req) {
  var queryString = "select * from listings l inner join "+
      "(select listing_id, user_id, created_at from applications) a "+
      "on a.listing_id = l.id "+
      "where a.user_id = $1::int order by a.created_at desc limit 3";
  var self = this;
  return queryPromise(req, queryString, [this.id])
      .then(function fullfilled(result) {
        self._appliedListings = result.rows.map(function (data) {
          return new Listing(data);
        });
        return self;
      });
};

UserProto.applications = function applications(req) {
  var queryString = 'select a.id as app_id, a.created_at, a.cover_letter, '+
    'l.id as list_id, l.name, l.description '+
    'from applications a, listings l '+
    'where a.user_id = $1::int and a.listing_id = l.id '+
    'limit $2::int';
  return queryPromise(req, queryString, [this.id, User._RESOURCE_LIMIT])
      .then(function fullfilled(result) {
        return result.rows.map(function (data) {
          return new Application(data);
        });
      });
};

function Company(opt_data) {
  var data = opt_data || {};
  this.id = data['id'] || null;
  this.createdAt = data['created_at'] || new Date();
  this.name = data['name'] || '';
  this.isContact = false;
}

function Listing(opt_data) {
  var data = opt_data || {};
  this.id = data['id'] || null;
  this.createdAt = data['created_at'] || new Date();
  this.name = data['name'] || '';
  this.description = data['description'] || '';
}

function Application(opt_data) {
  var data = opt_data || {};
  this.id = data['id'] || null;
  this.createdAt = data['created_at'] || new Date();
  this.listing = data['listing'] || null;
  this.coverLetter = data['cover_letter'] || '';
}

function queryPromise(req, queryString, queryParams, debug) {
  if (debug) {
    console.log(queryString, queryParams);
    req.client.connection.on('message', function(msg) {
      console.log(msg)
    });
  }
  return new Promise(function _promise(resolve, reject) {
    req.client.query(
        queryString,
        queryParams || [],
        function result(error, result) {
      req.done(error);

      if (error) {
        console.log('error ' + error.message);
        return reject(error);
      }
      resolve(result);
    });
  });
};

db.js

var pg = require('pg');

module.exports = function() {
  var config = {
    port: 5432,
    max: 10,
    idleTimeoutMillis: 30000
  };
  switch (process.env.NODE_ENV) {
    case 'development':
      config.host = 'localhost';
      config.user = 'xxxx';
      config.password = 'xxxx';
      config.database = 'xxxx';
      break;
    case 'production':
      config.user = 'xxxx';
      config.database = 'xxxx';
      config.password = 'xxxx';
      config.host = 'xxxx'
      break;
    default:
      throw new Error('Invalid environment');
  }

  var pool = new pg.Pool(config);

  pool.connect(function(err, client, done) {
    if(err) {
      return console.error('error fetching client from pool', err);
    }
    client.query('SELECT $1::int AS number', ['1'], function(err, result) {
      done(err);

      if(err) {
        return console.error('error running query', err);
      }
      console.log(result.rows[0].number);
      //output: 1
    });
  });

  pool.on('error', function (err, client) {
    // if an error is encountered by a client while it sits idle in the pool
    // the pool itself will emit an error event with both the error and
    // the client which emitted the original error
    // this is a rare occurrence but can happen if there is a network partition
    // between your application and the database, the database restarts, etc.
    // and so you might want to handle it and at least log it out
    console.error('idle client error', err.message, err.stack)
  });

  return pool;
};

tables.sql

create table users (
    id serial primary key,
    created_at timestamp default current_timestamp,
    name character varying(64)
);

create table companies (
    id serial primary key,
    created_at timestamp default current_timestamp,
    name character varying(64)
);

create table teams (
    id serial primary key,
    company_id integer references companies (id),
    user_id integer references users (id),
    contact_user boolean default false
);

create table listings (
    id serial primary key,
    created_at timestamp default current_timestamp,
    created_by integer references users (id),
    name character varying(64),
    description text
);

create table applications (
    id serial primary key,
    created_at timestamp default current_timestamp,
    user_id integer references users (id),
    listing_id integer references listings (id),
    cover_letter text
);

So, finally I got rejected. I would really love to know what I did wrong.


Get this bounty!!!

#StackBounty: #java #performance #reinventing-the-wheel #library #numerical-methods Arbitrary Precision nth Principal Root in Java – Ma…

Bounty: 50

This post is the first in the MathCore series.

The next post is here: Arbitrary precision π (Circular Constant) in Java – MathCore #2


Disclaimer

My project is too big to be reviewed in a single question. So, I’ll post one class at a time.

The relevant methods needed from other classes are added as snippets.


Roots.java

The purpose of this class is to calculate the principal n-th root of a non-negative BigDecimal with the specified precision.

The precision used internally is actually a bit higher, so that the answer returned is fully accurate up to the specified precision and rounding can be done with confidence.
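
The core of nthRoot below is the standard Newton iteration for a root of $x^n - a$, which appears in the code as delta = (a / x^(n-1) - x) / n:

$$x_{k+1} = x_k + \frac{1}{n}\left(\frac{a}{x_k^{\,n-1}} - x_k\right) = \frac{1}{n}\left((n - 1)\,x_k + \frac{a}{x_k^{\,n-1}}\right)$$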

package mathcore.ops;

import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

import static java.math.BigDecimal.ONE;
import static java.math.BigDecimal.ZERO;
import static java.math.BigDecimal.valueOf;

/**
 * A portion of BigMath refactored out to reduce overall complexity.
 * <p>
 * This class handles the calculation of n-th principal roots.
 *
 * @author Subhomoy Haldar
 * @version 1.0
 */
class Roots {
    // Make this class un-instantiable
    private Roots() {
    }

    /**
     * Uses the n-th root algorithm to find principal root of a verified value.
     *
     * @param a  The value whose n-th root is sought.
     * @param n  The root to find.
     * @param c0 The initial (unexpanded) MathContext.
     * @return The required principal root.
     */
    private static BigDecimal nthRoot(final BigDecimal a,
                                      final int n,
                                      final MathContext c0) {
        // Obtain constants that will be used in every iteration
        final BigDecimal N = valueOf(n);
        final int n_1 = n - 1;

        // Increase precision by "n";
        final int newPrecision = c0.getPrecision() + n;

        final MathContext c = BigMath.expandContext(c0, newPrecision);

        // The iteration limit (quadratic convergence)
        final int limit = n * n * (31 - Integer.numberOfLeadingZeros(newPrecision)) >>> 1;

        // Make an initial guess:
        BigDecimal x = guessRoot(a, n);
        BigDecimal x0;

        // Iterate
        for (int i = 0; i < limit; i++) {
            x0 = x;
            BigDecimal delta = a.divide(x0.pow(n_1), c)
                    .subtract(x0, c)
                    .divide(N, c);
            x = x0.add(delta, c);
        }

        return x.round(c);
    }

    /**
     * Constructs an initial guess for the n-th principal root of
     * a given positive value.
     *
     * @param a The value whose n-th root is sought.
     * @param n The root to find.
     * @return An initial guess.
     */
    private static BigDecimal guessRoot(BigDecimal a, int n) {
        // 1. Obtain first (1/n)th of total bits of magnitude
        BigInteger magnitude = a.unscaledValue();
        final int length = magnitude.bitLength() * (n - 1) / n;
        magnitude = magnitude.shiftRight(length);

        // 2. Obtain the correct scale
        final int newScale = a.scale() / n;

        // 3. Construct the initial guess
        return new BigDecimal(magnitude, newScale);
    }

    /**
     * Returns the principal n-th root of the given positive value.
     *
     * @param decimal The value whose n-th root is sought.
     * @param n       The value of n needed.
     * @param context The MathContext to specify the precision and RoundingMode.
     * @return The principal n-th root of {@code decimal}.
     * @throws ArithmeticException If n is less than 2 or {@code decimal} is negative.
     */
    static BigDecimal principalRoot(final BigDecimal decimal,
                                    final int n,
                                    final MathContext context)
            throws ArithmeticException {
        if (n < 2)
            throw new ArithmeticException("'n' must at least be 2.");
        // Quick exits
        if (decimal.signum() == 0)
            return ZERO;
        if (decimal.compareTo(ONE) == 0)
            return ONE;
        if (decimal.signum() < 0)
            throw new ArithmeticException("Only positive values are supported.");
        return nthRoot(decimal, n, context);
    }
}

The method actually used is the last one, static BigDecimal principalRoot(...), which is exposed through BigMath.java.

Relevant methods of BigMath.java

... truncated ...

/**
 * Returns the square root of the given positive {@link BigDecimal} value.
 * The result has two extra bits of precision to ensure better accuracy.
 *
 * @param decimal The value whose square root is sought.
 * @param context The MathContext to specify the precision and RoundingMode.
 * @return The square root of {@code decimal}.
 * @throws ArithmeticException If {@code decimal} is negative.
 */
public static BigDecimal sqrt(final BigDecimal decimal,
                              final MathContext context)
        throws ArithmeticException {
    return Roots.principalRoot(decimal, 2, context);
}

/**
 * Returns the principal n-th root of the given positive value.
 *
 * @param decimal The value whose n-th root is sought.
 * @param n       The value of n needed.
 * @param context The MathContext to specify the precision and RoundingMode.
 * @return The principal n-th root of {@code decimal}.
 * @throws ArithmeticException If n is less than 2 or {@code decimal} is negative.
 */
public static BigDecimal principalRoot(final BigDecimal decimal,
                                       final int n,
                                       final MathContext context) {
    return Roots.principalRoot(decimal, n, context);
}

/**
 * A utility method that helps obtain a new {@link MathContext} from an existing
 * one. The new Context has the new precision specified but retains the {@link java.math.RoundingMode}.
 * <p>
 * Usually, it is used to increase the precision and hence "expand" the Context.
 *
 * @param c0           The initial {@link MathContext}.
 * @param newPrecision The required precision.
 * @return The new expanded Context.
 */
public static MathContext expandContext(MathContext c0, int newPrecision) {
    return new MathContext(
            newPrecision,
            c0.getRoundingMode()    // Retain rounding mode
    );
}
... truncated ...

Here’s sample usage with output

public static void main(String[] args) {
    int digits = 100;
    BigDecimal two = BigDecimal.valueOf(2);
    MathContext context = new MathContext(digits, RoundingMode.HALF_EVEN);

    BigDecimal root2 = BigMath.sqrt(two, context);
    System.out.println(root2.round(context));

    BigDecimal square = root2.pow(2, context);
    System.out.println(square);
}

Output

1.414213562373095048801688724209698078569671875376948073176679737990732478462107038850387534327641573
2.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

I welcome comments on all aspects of the code. I’d especially appreciate points on any cases which I’ve overlooked or ways to break my code.

My only request is: please do not reiterate the fact that I’m reinventing the wheel. I’m doing it so that I can learn.


Get this bounty!!!

#StackBounty: Efficient extraction of patch features over an image

Bounty: 50

Input:

  • data – a matrix of size n*m*3*3 (complex values)
  • indices – a list of coordinates (x,y), where x < n and y < m
  • fp – a feature parameter, which is a tuple ((fp11, fp12), (fp21, fp22), id)
  • reference – a list of 3*3 matrices
  • srwd – a function which computes a similarity value between two complex-valued 3*3 matrices

Output:

  • feature_values – a list of features, one feature for each index in indices

Functionality:

Given an image (data) where each pixel is a 3*3 matrix, and a list of target pixels (indices): for each target pixel, I want to extract features of the patch surrounding it.

A patch feature is either:
a) the srwd of a pixel in the patch with a reference matrix, or
b) the srwd of two pixels in the patch

Thus a feature can be described by the relative coordinates fp11, fp12 (x and y offset of pixel of interest 1) and fp21, fp22 (x and y offset of pixel of interest 2).
If fp11 == fp21 and fp12 == fp22, then I want to compute a); otherwise I want to compute b).
The reference matrix of interest is selected by the feature parameter called id.

Note that the indices of interest are already filtered so that x + fp__ < n and y + fp__ < m hold for all possible offsets fp__.

Code

Computing the symmetric revised Wishart distance, with regularization in case a matrix A or B is not invertible:
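
In the notation of the code below (with matrix dimension $n = 3$, i.e. len(A)), the quantity computed is

$$d_{\mathrm{SRW}}(A, B) = \left|\tfrac{1}{2}\,\operatorname{tr}\!\left(A B^{-1} + B A^{-1}\right) - n\right|$$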

import numpy as np
from numpy.linalg import inv


def srwd(A, B):
    """This function computes the symmetric revised Wishart distance as in the paper
    SPECTRAL CLUSTERING OF POLARIMETRIC SAR DATA WITH WISHART-DERIVED DISTANCE MEASURES"""
    try:
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
    except np.linalg.LinAlgError:
        # Regularize the diagonals so that both matrices become invertible
        A, B = A.reshape(3, 3) + np.eye(3) * 0.000001, B.reshape(3, 3) + np.eye(3) * 0.000001
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
    return abs(dist)

Getting the features with the input as given above:

def feature(data, indices, fp, reference):
    # fp is a tuple of 2 coordinates in a patch ((x1,x2),(y1,y2),ref),
    # where ref is an index of a random reference matrix in reference only relevant in case x1=y1 and x2=y2
    res = []
    if fp[0] != fp[1]:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], data[x + fp[1][0]][y + fp[1][1]]))
    else:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], reference[fp[2]]))
    return res

Finally there is another loop such as:

for fp in feature_params:
    feature_values = feature(data, indices, fp, reference)
    #here work on feature_values

The current implementation is rather inefficient and is a bottleneck for the whole process. How could I improve it?

Is there a way to compute a feature matrix efficiently and operate on it afterwards?
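
For what it’s worth, here is a minimal sketch of the batched formulation I have in mind (srwd_batch and feature_batch are made-up names; it assumes indices is an (N, 2) integer array as in the toy example below, relies on np.linalg.inv operating on stacked 3*3 matrices, and regularizes unconditionally instead of only on failure, which changes the numerics very slightly):

import numpy as np

def srwd_batch(A, B, eps=1e-6):
    # A, B: stacks of shape (N, 3, 3); regularize, then invert all pairs at once
    A = A + eps * np.eye(3)
    B = B + eps * np.eye(3)
    invA = np.linalg.inv(A)
    invB = np.linalg.inv(B)
    # trace(A @ inv(B)) per pair, without materializing the matrix products
    traces = np.einsum('nij,nji->n', A, invB) + np.einsum('nij,nji->n', B, invA)
    return np.abs(0.5 * traces - A.shape[-1])

def feature_batch(data, indices, fp, reference):
    xs, ys = indices[:, 0], indices[:, 1]
    A = data[xs + fp[0][0], ys + fp[0][1]]        # first pixel of each patch pair
    if fp[0] != fp[1]:
        B = data[xs + fp[1][0], ys + fp[1][1]]    # second pixel
    else:
        B = np.broadcast_to(reference[fp[2]], A.shape)
    return srwd_batch(A, B)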

An executable toy example including the whole code is given below (allowing copy-paste):

import numpy as np
from numpy.linalg import inv

#toy example
data = np.random.rand(1000, 1000, 3, 3) #an image of 1000*1000 pixels, each pixel a 3*3 matrix
indices = np.random.randint(3,96, size = (10000,2)) # a list of 10000 target pixels (let's assume they are unique)
reference = [np.random.rand(3,3)] # a single reference matrix in a list (in actual application there are multiple reference matrices)
feature_params = [((0,0),(-1,-1), 0), ((0,0), (0,0), 0), ((0,1), (0,0), 0), ((1,0), (0,0), 0), ((1,1), (0,0), 0)] 



def srwd(A, B):
    """This function computes the symmetric revised Wishart distance as in the paper
    SPECTRAL CLUSTERING OF POLARIMETRIC SAR DATA WITH WISHART-DERIVED DISTANCE MEASURES"""
    try:
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
    except np.linalg.LinAlgError:
        A, B = A.reshape(3, 3) + np.eye(3) * 0.000001, B.reshape(3, 3) + np.eye(3) * 0.000001
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
    return abs(dist)


def feature(data, indices, fp, reference):
    # fp is a tuple of 2 coordinates in a patch ((x1,x2),(y1,y2),ref),
    # where ref is an index of a random reference matrix in reference only relevant in case x1=y1 and x2=y2
    res = []
    if fp[0] != fp[1]:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], data[x + fp[1][0]][y + fp[1][1]]))
    else:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], reference[fp[2]]))
    return res

for fp in feature_params:
    feature_values = feature(data, indices, fp, reference)        
    #here work on feature_values

A final note about the dimensions of the actual problem:

  • Image of size 6000*1700,
  • around 500 features in feature_params
  • indices is a list of around 8,000,000 target indices


Get this bounty!!!