#Tree #Traversals

Below is the Python code for traversing trees in various recursive modes like In order, Preorder, Post Order and their reverse orders…

The code is provided in python, but can be easily translated to Java/JS/PHP etc

#HackerRank: Computing the Correlation

Problem

You are given the scores of N students in three different subjects – MathematicsPhysics and Chemistry; all of which have been graded on a scale of 0 to 100. Your task is to compute the Pearson product-moment correlation coefficient between the scores of different pairs of subjects (Mathematics and Physics, Physics and Chemistry, Mathematics and Chemistry) based on this data. This data is based on the records of the CBSE K-12 Examination – a national school leaving examination in India, for the year 2013.

Pearson product-moment correlation coefficient

This is a measure of linear correlation described well on this Wikipedia page. The formula, in brief, is given by:

where x and y denote the two vectors between which the correlation is to be measured.

Input Format

The first row contains an integer N.
This is followed by N rows containing three tab-space (‘\t’) separated integers, M P C corresponding to a candidate’s scores in Mathematics, Physics and Chemistry respectively.
Each row corresponds to the scores attained by a unique candidate in these three subjects.

Input Constraints

1 <= N <= 5 x 105
0 <= M, P, C <= 100

Output Format

The output should contain three lines, with correlation coefficients computed
and rounded off correct to exactly 2 decimal places.
The first line should contain the correlation coefficient between Mathematics and Physics scores.
The second line should contain the correlation coefficient between Physics and Chemistry scores.
The third line should contain the correlation coefficient between Chemistry and Mathematics scores.

So, your output should look like this (these values are only for explanatory purposes):

0.12
0.13
0.95

Test Cases

There is one sample test case with scores obtained in Mathematics, Physics and Chemistry by 20 students. The hidden test case contains the scores obtained by all the candidates who appeared for the examination and took all three tests (Mathematics, Physics and Chemistry).
Think: How can you efficiently compute the correlation coefficients within the given time constraints, while handling the scores of nearly 400k students?

Sample Input

20
73  72  76
48  67  76
95  92  95
95  95  96
33  59  79
47  58  74
98  95  97
91  94  97
95  84  90
93  83  90
70  70  78
85  79  91
33  67  76
47  73  90
95  87  95
84  86  95
43  63  75
95  92  100
54  80  87
72  76  90

Sample Output

0.89  
0.92  
0.81

There is no special library support available for this challenge.

Solution(Source)

 

#HackerEarth: #BattleOfBots 9: Taunt

Problem

Taunt is a two player board game which is played on a 10X4 grid of cells and is played on opposite sides of the game-board. Each player has an allocated color, Orange ( First Player ) or Green ( Second Player ) being conventional. Each player has nine piece in total. The players move their pieces towards to his / her opponent’s area by moving their pieces strategically. Each piece has a different moving feature and is one of the 3 types of pieces.

Piece 1: It can move to horizontally or vertically adjacent cell, if the cell doesn’t contain a piece of same color.

enter image description here

Piece 2: It can move to horizontally adjacent cell or can move two steps forward, if the cell doesn’t contain a piece of same color (except the piece itself).

enter image description here

This type of piece can move to its own position if its in the second last row of the grid and going downward or if its in the second row of the grid and going upward.

enter image description here

Piece 3: It can move two step diagonally in the forward direction, if the cell doesn’t contain a piece of same color (except the piece itself).

enter image description here enter image description here

This type of piece can move to its own position if its in the second last row of the grid and going downward or if its in the second row of the grid and going upward.

enter image description here

Players take turns involving moves of pieces as mentioned above and can captures opponent’s piece by jumping on or over opponent’s pieces.

Note: Forward direction for first player is downward and for second player is upward.

If a piece (except piece 1) is moving downward and touches the last row, its direction will change i.e. now it will move upward. Similarly, once if a piece (except piece 1) is moving upward and touches the first row, its direction will change i.e. now it will move downward.

Rules:

  • Player can only move according to the moves mentioned above.
  • A player may not move an opponent’s piece.
  • A player can captures opponent’s piece by jumping on or over opponent pieces.

The game will end after 100 moves ( 50 moves for each player ) or when any of the players don’t have any move left. At the end of the game the player with majority of pieces will win.

We will play it on an 10X4 grid. The top left of the grid is [0,0] and the bottom right is [9,3].

Input:
The input will be a 10X4 matrix consisting only of 0,1or2. Next line will contain an integer denoting the total number of moves till the current state of the board. Next line contains an integer – 1 or 2 which is your player id.

In the given matrix, top-left is [0,0] and bottom-right is [9,3]. The y-coordinate increases from left to right, and x-coordinate increases from top to bottom.

A cell is represented by 3 integers.

First integer denotes the player id (1 or 2).
Second integer denotes the type of piece (1, 2 or 3).
Third integer denotes the direction of the piece (0 (upward) or 1 (downward)). When the piece is of first type, direction doesn’t matter as the piece is free to move to horizontally or vertically adjacent cell, if the cell doesn’t contain a piece of same color.

Empty cell is represented by 000.

Output:
In the first line print the coordinates of the cell separated by space, the piece you want to move.
In second line print the coordinates of the cell in which the piece will jump.
You must take care that you don’t print invalid coordinates. For example, [1,1] might be a valid coordinate in the game play if the piece is able to jump to [1,1], but [9,10] will never be. Also if you play an invalid move or your code exceeds the time/memory limit while determining the move, you lose the game.

Starting state
The starting state of the game is the state of the board before the game starts.

131 131 131 121
121 121 111 111
111 000 000 000
000 000 000 000
000 000 000 000
000 000 000 000
000 000 000 000
000 000 000 210
210 210 220 220
220 230 230 230

First Input
This is the input give to the first player at the start of the game.

131 131 131 121
121 121 111 111
111 000 000 000
000 000 000 000
000 000 000 000
000 000 000 000
000 000 000 000
000 000 000 210
210 210 220 220
220 230 230 230
0
1

 

SAMPLE INPUT
000 000 000 000
000 000 000 111
000 000 111 130
000 000 000 000
000 000 000 000
000 220 000 000
131 000 000 000
121 000 210 000
000 210 131 000
000 210 000 000
58
1
SAMPLE OUTPUT
8 2
8 0

Explanation

This is player 1’s turn, and the player will move the piece at [8,2] and will take two steps diagonally in downward direction and will be at [8,0]
After his/her move the state of game becomes:

000 000 000 000
000 000 000 111
000 000 111 130
000 000 000 000
000 000 000 000
000 220 000 000
131 000 000 000
121 000 210 000
130 210 000 000
000 000 000 000
59
2

Note: Direction of the piece is also changed from 1 to 0 as the piece was moving downward and touches the last row. This state will be fed as input to program of player 2.

Here is the code of the default bot.

Time Limit:1.0 sec(s) for each input file.
Memory Limit:256 MB
Source Limit:1024 KB

Sample Game

Installing Apache UserGrid on linux

About the Project

Apache Usergrid is an open-source Backend-as-a-Service (BaaS or mBaaS) composed of an integrated distributed NoSQL database, application layer and client tier with SDKs for developers looking to rapidly build web and/or mobile applications. It provides elementary services and retrieval features like:

  • User Registration & Management
  • Data Storage
  • File Storage
  • Queues
  • Full Text Search
  • Geolocation Search
  • Joins

It is a multi-tenant system designed for deployment to public cloud environments (such as Amazon Web Services, Rackspace, etc.) or to run on traditional server infrastructures so that anyone can run their own private BaaS deployment.

For architects and back-end teams, it aims to provide a distributed, easily extendable, operationally predictable and highly scalable solution. For front-end developers, it aims to simplify the development process by enabling them to rapidly build and operate mobile and web applications without requiring backend expertise.

Usergrid 2.1.0 Deployment Guide

Though the Usergrid Deployment guide seems to be simple enough, I faced certain hiccups and it took me about 4 days to figure out what I was doing wrong.

This document explains how to deploy the Usergrid v2.1.0 Backend-as-a-Service (BaaS), which comprises the Usergrid Stack, a Java web application, and the Usergrid Portal, which is an HTML5/JavaScript application.

Prerequsites

Below are the software requirements for Usergrid 2.1.0 Stack and Portal. You can install them all on one computer for development purposes, and for deployment you can deploy them separately using clustering.

Linux or a UNIX-like system (Usergrid may run on Windows, but we haven’t tried it)

Download the Apache Usergrid 2.1.0 binary release from the official Usergrid releases page:

After untarring the files that you need for deploying Usergrid Stack and Portal are ROOT.war and usergrid-portal.tar.

Stack STEP #1: Setup Cassandra

As mentioned in prerequisites, follow the installation guide given in link

Usergrid uses Cassandra’s Thrift protocol
Before starting cassandra, on Cassandra 2.x releases you MUST enable Thrift by setting start_rpc in your cassandra.yaml file:

    #Whether to start the thrift rpc server.
    start_rpc: true

Note:DataStax no longer supports the DataStax Community version of Apache Cassandra or the DataStax Distribution of Apache Cassandra. It is best to follow the Apache Documentation

Once you are up and running make a note of these things:

  • The name of the Cassandra cluster
  • Hostname or IP address of each Cassandra node
    • in case of same machine as Usergrid, then localhost. Usergrid would then be running on single machine embedded mode.
  • Port number used for Cassandra RPC (the default is 9160)
  • Replication factor of Cassandra cluster

Stack STEP #2: Setup ElasticSearch

Usergrid also needs access to at least one ElasticSearch node. As with Cassandra, you can setup single ElasticSearch node on your computer, and you should run a cluster in production.

Steps:

  • Download and unzip Elasticsearch
  • Run bin/elasticsearch (or bin\elasticsearch -d on Linux as Background Process) (or bin\elasticsearch.bat on Windows)
  • Run curl http://localhost:9200/

Once you are up and running make a note of these things:

  • The name of the ElasticSearch cluster
  • Hostname or IP address of each ElasticSearch node
    • in case of same machine as Usergrid, then localhost. Usergrid would then be running on single machine embedded mode.
  • Port number used for ElasticSearch protocol (the default is 9200)

Stack STEP #3: Setup Tomcat

The Usergrid Stack is contained in a file named ROOT.war, a standard Java EE WAR ready for deployment to Tomcat. On each machine that will run the Usergrid Stack you must install the Java SE 8 JDK and Tomcat 7+.

Stack STEP #4: Configure Usergrid Stack

You must create a Usergrid properties file called usergrid-deployment.properties. The properties in this file tell Usergrid how to communicate with Cassandra and ElasticSearch, and how to form URLs using the hostname you wish to use for Usegrid. There are many properties that you can set to configure Usergrid.

Once you have created your Usergrid properties file, place it in the Tomcat lib directory. On a Linux system, that directory is probably located at /path/to/tomcat7/lib/

The Default Usergrid Properties File

You should review the defaults in the above file. To get you started, let’s look at a minimal example properties file that you can edit and use as your own.

Please note that if you are installing Usergrid on the same machine as Cassandra Server, then set the following property to true

   #Tell Usergrid that Cassandra is not embedded.
   cassandra.embedded=true

Stack STEP #5: Deploy ROOT.war to Tomcat

The next step is to deploy the Usergrid Stack software to Tomcat. There are a variety of ways of doing this and the simplest is probably to place the Usergrid Stack ROOT.war file into the Tomcat webapps directory, then restart Tomcat.

Look for messages like this, which indicate that the ROOT.war file was deployed:

INFO: Starting service Catalina
Jan 29, 2016 1:00:32 PM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.59
Jan 29, 2016 1:00:32 PM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deploying web application archive /usr/share/tomcat7/webapps/ROOT.war

Does it work?

you can use curl:

curl http://localhost:8080/status

If you get a JSON file of status data, then you’re ready to move to the next step. You should see a response that begins like this:

{
“timestamp” : 1454090178953,
“duration” : 10,
“status” : {
“started” : 1453957327516,
“uptime” : 132851437,
“version” : “201601240200-595955dff9ee4a706de9d97b86c5f0636fe24b43”,
“cassandraAvailable” : true,
“cassandraStatus” : “GREEN”,
“managementAppIndexStatus” : “GREEN”,
“queueDepth” : 0,
“org.apache.usergrid.count.AbstractBatcher” : {
“add_invocation” : {
“type” : “timer”,
“unit” : “microseconds”,
… etc. …

Initialize the Usergrid Database

Next, you must initialize the Usergrid database, index and query systems.

To do this you must issue a series of HTTP operations using the superuser credentials. You can only do this if Usergrid is configured to allow superused login via this property usergrid.sysadmin.login.allowed=true and if you used the above example properties file, it is allowed.

The three operation you must perform are expressed by the curl commands below and, of course, you will have ot change the password test to match the superuser password that you set in your Usergrid properties file.

curl -X PUT http://localhost:8080/system/database/setup -u superuser:test
curl -X PUT http://localhost:8080/system/database/bootstrap -u superuser:test
curl -X GET http://localhost:8080/system/superuser/setup -u superuser:test

When you issue each of those curl commands, you should see a success message like this:

{
“action” : “cassandra setup”,
“status” : “ok”,
“timestamp” : 1454100922067,
“duration” : 374
}

Now that you’ve gotten Usergrid up and running, you’re ready to deploy the Usergrid Portal.

Deploying the Usergrid Portal

The Usergrid Portal is an HTML5/JavaScript application, a bunch of static files that can be deployed to any web server, e.g. Apache HTTPD or Tomcat.

To deploy the Portal to a web server, you will un-tar the usergrid-portal.tar file into directory that serves as the root directory of your web pages.

Once you have done that there is one more step. You need to configure the portal so that it can find the Usergrid stack. You do that by editing the portal/config.js and changing this line:

Usergrid.overrideUrl = ’http://localhost:8080/‘;

To set the hostname that you will be using for your Usergrid installation.

I have deployed a sample instance and tested the same. You can find the system ready configurations in TechUtils repository

Launch HTML code in browser from Python

Lets say you have generated some html content for a web page dynamically and have it in memory variable in python.

In order to view and test this content, you would need a Python program that prints out the HTML code, which then would have to be copied and pasted to a HTML file, then from there, you can test it in a browser.

In Python, there is a way to launch such HTML code in a web browser so that you don’t have to go through the copy and paste method every time

Using webbrowser.open:

Source

Java Code to Zip all folders in a particular folder.

A small utility code to create multiple zip files for all folders in the a particular folder.

for example

- c:/path/to/folder
    -> folder 1
    -> folder 2
    -> folder 3
    -> folder 4

Output:

- c:/path/to/folder
    -> folder 1
    -> folder 2
    -> folder 3
    -> folder 4
    -> folder 1.zip
    -> folder 2.zip
    -> folder 3.zip
    -> folder 4.zip

original source: https://goo.gl/sp0bqr

LDAP Connector

Below is a sample code to perform LDAP Queries. Just modify the configuration information and then provide any valid query to get the search results.

You can also modify the code to get custom business logic as required.

 

Sort a list of tuples by Nth item in Python

Suppose you have a list of tuples that looks something like this:

[('abc', 121),('abc', 231),('abc', 148), ('abc',221)]

And you want to sort this list in ascending order by the integer value inside the tuples.

We can achieve this using the key keyword with sorted().

sorted([('abc', 121),('abc', 231),('abc', 148), ('abc',221)], key=lambda x: x[1])

key should be a function that identifies how to retrieve the comparable element from your data structure. For example, the second element of the tuple, so we access [1].

Source: StackOverflow.com

Best way to select random rows PostgreSQL

Given, you have a very large table with 500 Million rows, and you have to select some random 1000 rows out of the table and you want it to be fast.

Given the specifications:

  • You assumed to have a numeric ID column (integer numbers) with only few (or moderately few) gaps.
  • Ideally no or few write operations.
  • Your ID column should have been indexed! A primary key serves nicely.

The query below does not need a sequential scan of the big table, only an index scan.

First, get estimates for the main query:

SELECT count(*) AS ct              -- optional
     , min(id)  AS min_id
            , max(id)  AS max_id
            , max(id) - min(id) AS id_span
FROM   big;

The only possibly expensive part is the count(*) (for huge tables). You will get an estimate, available at almost no cost (detailed explanation here):

SELECT reltuples AS ct FROM pg_class WHERE oid = 'schema_name.big'::regclass;

As long as ct isn’t much smaller than id_span, the query will outperform most other approaches.

WITH params AS (
    SELECT 1       AS min_id           -- minimum id <= current min id
         , 5100000 AS id_span          -- rounded up. (max_id - min_id + buffer)
    )
SELECT *
FROM  (
    SELECT p.min_id + trunc(random() * p.id_span)::integer AS id
    FROM   params p
          ,generate_series(1, 1100) g  -- 1000 + buffer
    GROUP  BY 1                        -- trim duplicates
    ) r
JOIN   big USING (id)
LIMIT  1000;                           -- trim surplus
  • Generate random numbers in the id space. You have “few gaps”, so add 10 % (enough to easily cover the blanks) to the number of rows to retrieve.
  • Each id can be picked multiple times by chance (though very unlikely with a big id space), so group the generated numbers (or use DISTINCT).
  • Join the ids to the big table. This should be very fast with the index in place.
  • Finally trim surplus ids that have not been eaten by dupes and gaps. Every row has a completely equal chance to be picked.

Short version

You can simplify this query. The CTE in the query above is just for educational purposes:

SELECT *
FROM  (
    SELECT DISTINCT 1 + trunc(random() * 5100000)::integer AS id
    FROM   generate_series(1, 1100) g
    ) r
JOIN   big USING (id)
LIMIT  1000;

Refine with rCTE

Especially if you are not so sure about gaps and estimates.

WITH RECURSIVE random_pick AS (
   SELECT *
   FROM  (
      SELECT 1 + trunc(random() * 5100000)::int AS id
      FROM   generate_series(1, 1030)  -- 1000 + few percent - adapt to your needs
      LIMIT  1030                      -- hint for query planner
      ) r
   JOIN   big b USING (id)             -- eliminate miss

   UNION                               -- eliminate dupe
   SELECT b.*
   FROM  (
      SELECT 1 + trunc(random() * 5100000)::int AS id
      FROM   random_pick r             -- plus 3 percent - adapt to your needs
      LIMIT  999                       -- less than 1000, hint for query planner
      ) r
   JOIN   big b USING (id)             -- eliminate miss
   )
SELECT *
FROM   random_pick
LIMIT  1000;  -- actual limit

We can work with a smaller surplus in the base query. If there are too many gaps so we don’t find enough rows in the first iteration, the rCTE continues to iterate with the recursive term. We still need relatively few gaps in the ID space or the recursion may run dry before the limit is reached – or we have to start with a large enough buffer which defies the purpose of optimizing performance.

Duplicates are eliminated by the UNION in the rCTE.

The outer LIMIT makes the CTE stop as soon as we have enough rows.

This query is carefully drafted to use the available index, generate actually random rows and not stop until we fulfill the limit (unless the recursion runs dry). There are a number of pitfalls here if you are going to rewrite it.

Wrap into function

For repeated use with varying parameters:

CREATE OR REPLACE FUNCTION f_random_sample(_limit int = 1000, _gaps real = 1.03)
  RETURNS SETOF big AS
$func$
DECLARE
   _surplus  int := _limit * _gaps;
   _estimate int := (           -- get current estimate from system
      SELECT c.reltuples * _gaps
      FROM   pg_class c
      WHERE  c.oid = 'big'::regclass);
BEGIN

   RETURN QUERY
   WITH RECURSIVE random_pick AS (
      SELECT *
      FROM  (
         SELECT 1 + trunc(random() * _estimate)::int
         FROM   generate_series(1, _surplus) g
         LIMIT  _surplus           -- hint for query planner
         ) r (id)
      JOIN   big USING (id)        -- eliminate misses

      UNION                        -- eliminate dupes
      SELECT *
      FROM  (
         SELECT 1 + trunc(random() * _estimate)::int
         FROM   random_pick        -- just to make it recursive
         LIMIT  _limit             -- hint for query planner
         ) r (id)
      JOIN   big USING (id)        -- eliminate misses
   )
   SELECT *
   FROM   random_pick
   LIMIT  _limit;
END
$func$  LANGUAGE plpgsql VOLATILE ROWS 1000;

Call:

SELECT * FROM f_random_sample();
SELECT * FROM f_random_sample(500, 1.05);

You could even make this generic to work for any table: Take the name of the PK column and the table as polymorphic type and use EXECUTE … But that’s beyond the scope of this post. See:

Possible alternative

IF your requirements allow identical sets for repeated calls (and we are talking about repeated calls) I would consider a materialized view. Execute above query once and write the result to a table. Users get a quasi random selection at lightening speed. Refresh your random pick at intervals or events of your choosing.

Postgres 9.5 introduces TABLESAMPLE SYSTEM (n)

It’s very fast, but the result is not exactly random. The manual:

The SYSTEM method is significantly faster than the BERNOULLI method when small sampling percentages are specified, but it may return a less-random sample of the table as a result of clustering effects.

And the number of rows returned can vary wildly. For our example, to get roughly 1000 rows, try:

SELECT * FROM big TABLESAMPLE SYSTEM ((1000 * 100) / 5100000.0);

Where n is a percentage. The manual:

The BERNOULLI and SYSTEM sampling methods each accept a single argument which is the fraction of the table to sample, expressed as a percentage between 0 and 100. This argument can be any real-valued expression.

Bold emphasis mine.

Related:

Source

K random combinations of N elements in List in Java

Given a List of N Strings, generate and print all possible combinations of R elements in array and return X random combinations from the result. Following is the code for implementing it: