#StackBounty: #data-structures #randomized-algorithms #correctness-proof Proof of Randomized Self-Adjusting Binary Search Tree

Bounty: 50

I developed a randomized self-adjusting binary search tree years ago, which I called a shuffle tree, but was unable to ever have it published because my proofs were rejected (with little explanation). I’ve since given up the hope of publishing (I’m not an academic so it doesn’t matter so much), but perhaps I can have some closure: I’m going to present the tree here, and perhaps someone can help me understand where my proofs fall short? Through testing, I’m quite certain that my understanding of the data structure is correct, but the proofs were always lacking.

First, understand how a top-down splay tree can be implemented around a traverse() function. Shuffle trees can be implemented similarly, where all operations defer to a traverse() function for the balancing operation.

I’m going to begin with a C traverse function for shuffle trees, then I’ll explain:

// returns node with key k,
// or returns the leaf containing
// the closest key to k.
node *  Traverse( key k, node *root, int treesize ) {
    signed int iCounter = rand() % treesize;
    node *pRet = 0;
    node *p = root;
    while ( p ) {
        pRet = p;
        if ( k < value(p) ) {
            p = left(p);
            if (( ! iCounter )&& p ) {
                RotateRight( pRet );
                pRet = parent(p);
            } // end if
        } else if ( value(p) < k ) {
            p = right(p);
            if (( ! iCounter )&& p ) {
                RotateLeft( pRet );
                pRet = parent(p);
            } // end if
        } else
            break ; // break while
        --iCounter;
        iCounter >>= 1;
    } // end while
    return ( pRet );
} 

The rotations used are simple single rotations.

Like a scapegoat tree, shuffle trees sample depth to find imbalance, but unlike scapegoat trees, they execute at most one rotation per access to attempt to restore balance.

At the beginning of traversal, we set an integer count-down value, to a random number in the range [0,N-1], where N is the size of the tree. As we iterate from a parent node to its child, we decrease the counter with I := (I-1)/2. When the counter equals zero, then the current node becomes a candidate rotation pivot. If we need to iterate past the candidate pivot, then we will commit to the rotation. We rotate the pivot away from the direction of traversal.

As search depth increases, the likelihood of a rotation increases. No rotations will occur beyond depth lgN. The counter requires lgN random bits per operation. Shuffle trees also record their
size, so that the counter can be set. No balancing
information needs to be recorded in tree nodes.
Searches may not navigate to a leaf; if the working
set is clustered near the root, then deep searches
will not be required. As a result, rotations can occur less frequently in a well-configured tree.
If a node in the tree is not weight-balanced, then
an access is more likely to traverse into its larger
sub-tree. A rotation at the node probably moves some of the descendants from the larger sub-tree to
the smaller one. Since these operations are probabilistic, rotations will occur which can deteriorate
balance; but as the imbalance increases, the likelihood of a rotation that improves balance
increases.
In effect, the balancing technique is a kind of random sampling. Nodes are selected randomly
from traversal paths and their balance is manipulated. Frequently used data attract more attention
and, therefore, benefit more from balancing activity than infrequently used data. The tree
eventually approximates a weight-blanced configura
tion for the data set, where the probability of
access is the weight for each node.

Here’s where it gets dicey: proving that a traversal occurs in lgN.

I argue the probability that an adversary can select a node
to force a rotation that impairs balance is

(1 - Pw) * Product over A of ( 1 - px )

Where Pw is a number estimating the overall weight balance of the tree (Pw >= 0.5) and px is the weight of node x, the probability it is accessed. Set A is the set containing the pivot and all its ancestors.

If rotations occur which impair balance, and Pw increases, then the overall probability of a favorable rotation increases. As Pw increases, the probability of a favorable rotation dwarfs the probability of a poor rotation. The tree does not linearize, and lgN is maintained.

Eh. What do you think?


Get this bounty!!!

#StackBounty: #data-structures #dynamic-programming Encoding a tree

Bounty: 50

Let $A$ be an $m times n$ matrix. Let $F(A,S(t),t)$ be a program function, where $t$ is an integer counter beginning at $0$ and stopping at some $k leq m,$ and such that $S(0)$ is empty. The function first iterates $t=1$ and then operates on $A$ in such a way that indices for a subset of columns of $A$ are stored in $S(1).$ For each index $k in S(1),$ the program recursively calls itself and collects another subset of columns of $A$ that depend on which $k in S(1)$ is under examination. How should $S$ be constructed to record the indices of each branch $k$ of $S(1)$ so that the indices in further recursive calls remain connected and in order with their respective parent indices?

I have not programmed in years, so I apologize if the question is poorly worded, and would welcome any editing or correction if clarification is needed.

In short, I need to set up some kind of data structure $S$ for indices of $A$ that can be described as a set of trees; the pseudocode I have written intends to traverse branches of these trees, operating on the columns corresponding to the stored indices.

My problem is that I do not know what data structure to use to store the indices properly. I thought of creating an array for each branch that follows each possible line of recursive calls of my program, but it becomes hard to think of even how to label these arrays, as I do not know beforehand how many I need nor how many recursive calls my program will make prior to termination. My only other thought was to use a language like MATLAB that could create an adjacency matrix for each tree belonging to each index $k$ collected in the first function call, but I am not quite sure how I could program this.

Any help would be appreciated!

EDIT: Here is the pseudocode for the function:

The function receives a matrix $A$ and a vector $b.$ The matrix is assumed to have more columns than rows, so that solutions will be underdetermined (and therefore infinite over real and complex fields, and likely large for finite fields), assuming the system has a solution to begin with.

My job is to find all minimal support solutions, or solutions to the system that have the most 0’s as entries.

Of course (as part of my dissertation shows, oddly for the first time in published literature over every finite field, though it has been shown for binary and for real/complex already) the problem is NP-Hard, making the program my advisor and I devised hopelessly exponential – but it is what it is; my advisor wanted me to sort of start something that might continue in the spirit of the SAT solvers out there, I guess, since algorithms that seek minimal support solutions over finite fields are (at the time of this post) comparatively few.

===================

FUNCTION $MinSup([A|b],r)$

Matrix $A$, vector $b$ (as an augmented system), “depth” $r$

initialize $A=A_0,$ $b=b_0,$ $r=1$

ASSUPTION: Ax=b has at least one solution, $m=numrows(A) < n=numcolumns(A)$, and A is row-independent.

RETURNS: Support for the set of minimal support vectors $X$ to the system $Ax=b.$

PSEUDOCODE (sorry, SE is forcing weird formatting I cannot fix)

  1. Form augmented system $[A|b].$ (EDIT: No longer needed as the program receives an augmented system from the start)

  2. Use row operations on $[A|b]$ until column b has a 1 on top and 0’s on the bottom (doing row exchanges to get a nonzero top if necessary).

  3. Look for any rows that have all zeros from rows $r$ through m (and a nonzero top). Record these indices in $S.$

    1. If S is not empty, then for each $k in S,$ if $r=1$ RETURN the set $x_k=(a_k)^{-1}e_k$ and $x_j=0$ otherwise (solutions are weight one). If $r neq 1,$ for each $k in S$ BACKSOLVE column $k$ and RETURN each smallest resulting minimal support solution (the matrices $A$ sent to the function when $r>1$ will have $r-1$ columns of the identity matrix already).

Else

  1. For all columns that have a nonzero top row entry, record the indices of each column in $S.$ Iterate $r=r+1.$ If $r=m$ then RETURN all $n$ choose $m$ solutions as your minimal support set.

  2. ELSE For each index $k in S,$ BACKSOLVE column $k$ (we need to use row $r-1$ at this point to pivot for each of these, and of course store the augmented system somewhere in case we need to return to this level). There should be a copy of (a multiple of) column $k$ as the “new $b.$” Send this updated augmented system and index $A’, b’,r$ to another call of $MinSup([A|b],r).$

========

There will probably be a small parent program that can cull through multiple responses for the MinSup calls. Essentially, I keep calling until I “hit the jackpot,” or the deepest iteration $r$ for which my row-exchanged update to $b$ on that particular call identifies a perfect multiple for rows $r$ and below (has all zeros from row $r+1$ to $m$).

As anyone can see, this is obviously a combinatorial nightmare, but it is what it is and I have to program it – and given that my advisor is now one year past retirement, and knows nothing but Mathematica, I’ll have to program it on that somehow. It is ugly, but it is what it is.


Get this bounty!!!

HackerRank: Repeated String

Problem

Lilah has a string, s, of lowercase English letters that she repeated infinitely many times.

Given an integer, n, find and print the number of letter a‘s in the first letters of Lilah’s infinite string.

Input Format

The first line contains a single string, s.
The second line contains an integer, n.

Constraints

  • 1<=|s|<=100
  • 1<=|n|<=10^12
  • For 25% of the test cases, n <= 10^6

Output Format

Print a single integer denoting the number of letter a’s in the first letters of the infinite string created by repeating infinitely many times.

Sample Input 0

aba
10

Sample Output 0

7

Explanation 0

The first n = 10 letters of the infinite string are abaabaabaa. Because there are 7 a‘s, we print on a new line.

Sample Input 1

a
1000000000000

Sample Output 1

1000000000000

Explanation 1

Because all of the first n=1000000000000 letters of the infinite string are a, we print 1000000000000 on a new line.

Solution

HackerRank: Circular Array Rotation

Problem

John Watson performs an operation called a right circular rotation on an array of integers, [a(0),a(1).a(2)...a(n-2),a(n-1)]. After performing one right circular rotation operation, the array is transformed from

[a(0),a(1).a(2)...a(n-2),a(n-1)]

to

[a(n-1),a(0),a(1).a(2)...a(n-2)].

Watson performs this operation k times. To test Sherlock’s ability to identify the current element at a particular position in the rotated array, Watson asks q queries, where each query consists of a single integer, m, for which you must print the element at index in the rotated array (i.e., the value of a(m)).

Input Format

The first line contains space-separated integers, n, k, and q, respectively.
The second line contains space-separated integers, where each integer i describes array element a(i)(where 0 <= i <= n).
Each of the q subsequent lines contains a single integer denoting m.

Constraints

  • 0 <= i <= 10^5
  • 0 <= a(i) <= 10^5
  • 0 <= k <= 10^5
  • 0 <= q <= 500
  • 0 <= m <= N-1

Output Format

For each query, print the value of the element at index m of the rotated array on a new line.

Sample Input
3 2 3
1 2 3
0
1
2
Sample Output
2
3
1

Explanation

After the first rotation, the array becomes [3,1,2].
After the second (and final) rotation, the array becomes [2,3,1].

Let’s refer to the array’s final state as array b. For each query, we just have to print the value of b(m) on a new line:

  • m=0 , so we print 2 on a new line.
  • m=1 , so we print 3 on a new line.
  • m=2 , so we print 1 on a new line.

Soluton

HackerEarth: Battle Of Bots 6: Draughts

Problem:

Sample Game

Draughts is a two player board game which is played on a 8X8 grid of cells and is played on opposite sides of the game-board. Each player has an allocated color, Red ( First Player ) or White ( Second Player ) being conventional. Players take turns involving diagonal moves of uniform game pieces in the forward direction only and mandatory captures by jumping over opponent pieces.

Rules:

  • Player can only move diagonally to the adjacent cell and in forward direction, if the diagonally adjacent cell is vacant.
  • A player may not move an opponent’s piece.
  • If the diagonally adjacent cell contains an opponent’s piece, and the cell immediately beyond it is vacant, the opponent’s piece may be captured (and removed from the game) by jumping over it in the forward direction only.
  • If a player made a jump, then its mandatory to keep on jumping as long as the jump is possible.
  • Player cannot move to the diagonally adjacent cell once the player made a jump.

The game will end when any of the players don’t have any move left. At the end of the game the player with majority of pieces will win.

We will play it on an 8X8 grid. The top left of the grid is [0,0] and the bottom right is [7,7].

Input:
The input will be a 8X8 matrix consisting only of 0o2. Then another line will follow which will contain a number –  1 or 2 which is your player id. In the given matrix, top-left is [0,0] and bottom-right is [7,7]. The x-coordinate increases from left to right, and y-coordinate increases from top to bottom.

The cell marked 0 means it doesn’t contain any stones. The cell marked 1 means it contains first player’s stone which is Red in color. The cell marked 2 means it contains second player’s stone which is white in color.

Output:
In the first line print the coordinates of the cell separated by space, the piece he / she wants to move.
In second line print an integer N, number of steps or jumps the piece will make in one move.
In the next N lines print the coordinates of the cells in which the piece will make jump.
You must take care that you don’t print invalid coordinates. For example, [1,1] might be a valid coordinate in the game play if [1,1] in diagonal to the piece in which is going to jump, but [9,10] will never be. Also if you play an invalid move or your code exceeds the time/memory limit while determining the move, you lose the game.

Starting state
The starting state of the game is the state of the board before the game starts.

0 1 0 1 0 1 0 1
1 0 1 0 1 0 1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 2 0 2 0 2 0 2
2 0 2 0 2 0 2 0

First Input
This is the input give to the first player at the start of the game.

0 1 0 1 0 1 0 1
1 0 1 0 1 0 1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 2 0 2 0 2 0 2
2 0 2 0 2 0 2 0
1
SAMPLE INPUT
0 1 0 1 0 1 0 1
1 0 1 0 1 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0
0 0 2 0 0 0 0 0
0 0 0 2 0 0 0 2
2 0 2 0 2 0 2 0
1
SAMPLE OUTPUT
2 5
2
4 3
6 1

Explanation

This is player 1’s turn, and the player will move the piece at [2,5] and will make two jumps. First jump will be at [4,3and second jump will be at [6,1]

After his/her move the state of game becomes:

0 1 0 1 0 1 0 1
1 0 1 0 1 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 1 0 2 0 2 0 2
2 0 2 0 2 0 2 0

This state will be fed as input to program of player 2.

Other valid move for the first player is

2 5
1
3 6

But the following are invalid moves.
Case 1:

2 5
1
4 3

Because after making a jump its possible to jump again and its mandatory to jump as long as its possible to jump.

Case 2:

2 5
2
4 3
5 4

Because after making a jump its invalid to move to diagonally adjacent cell.

Here is the code of the Random Bot.

Time Limit:1.0 sec(s) for each input file.
Memory Limit:256 MB
Source Limit:1024 KB

Solution

This is the solution submitted by me

HackerEarth: Battle Of Bots #5: Reversi

Problem:

Reversi is a two player board game which is played on a 10 x 10 grid of cells. Each player has an allocated color, Black ( First Player ) or White ( Second Player ) being conventional. Players take turns placing a stone of their color on a single cell. A player must place a stone on the board, in such a position that there exists at least one straight (horizontal, vertical, or diagonal) occupied line between the new stone and another stone of same color, with one or more contiguous other color stone between them.

During a game, any stone of the opponent’s color that are in a straight line and bounded by the stone just placed and another stone of the current player’s color are turned over to the current player’s color. The game will end when the board is completely filled or both the players don’t have any move left. At the end of the game the player with majority of stone will win.

We will play it on an 10 x 10 grid. The top left of the grid is [0,0] and the bottom right is [9,9]. The rule is that a cell[i,j] is connected to any of top, left, right, or bottom cell.

Input:
The input will be a 10 x 10 matrix consisting only of 0,1,2 or 3. Then another line will follow which will contain a number – 1 or 2 which is your player id.

In the given matrix, top-left is [0,0] and bottom-right is [9,9].

In cell[row,column], row increases from top to bottom and column increases from left to right.

The cell marked 0 means it doesn’t contain any stones. The cell marked 1 means it contains first player’s stone which is Black in color. The cell marked 2 means it contains second player’s stone which is white in color. The cell marked 3 means it is a valid place for player whose turn it is.

Output:
Print the coordinates of the cell separated by space, where you want to play your move. You must take care that you don’t print invalid coordinates. For example, [1] might be a valid coordinate in the game play if cell[i,j]=3, but [10] will never be. Also if you play an invalid move or your code exceeds the time/memory limit while determining the move, you lose the game.

Starting state
The starting state of the game is the state of the board before the game starts.

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 2 1 0 0 0 0
0 0 0 0 1 2 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

First Input
This is the input give to the first player at the start of the game.

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 3 0 0 0 0 0
0 0 0 3 2 1 0 0 0 0
0 0 0 0 1 2 3 0 0 0
0 0 0 0 0 3 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
1

Scoring
The scores are calculated by running tournament of all submissions. Your latest submission will be taken into tournament. Scores are assigned according to the Glicko-2 rating system. For more information and questions, see Bot problem judge.

SAMPLE INPUT

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 3 0 0 0 0 0
0 0 0 3 2 1 3 0 0 0
0 0 0 0 2 2 0 0 0 0
0 0 0 3 1 1 2 3 0 0
0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
1
SAMPLE OUTPUT
4 3

Explanation

This is player 1’s turn, and the player puts his/her stone in cell[4,3].
After his/her move the state of game becomes:

0 0 0 0 0 0 0 0 0 0  
0 0 0 0 0 0 0 0 0 0  
0 0 0 0 0 0 0 0 0 0  
0 0 0 3 0 3 0 0 0 0  
0 0 0 1 1 1 0 0 0 0  
0 0 0 3 1 2 0 0 0 0  
0 0 0 3 1 1 2 0 0 0  
0 0 0 1 0 3 0 0 0 0  
0 0 3 0 0 0 0 0 0 0  
0 0 0 0 0 0 0 0 0 0  

This state will be fed as input to program of player 2.


Time Limit:1.0 sec(s) for each input file.
Memory Limit: 256 MB
Source Limit: 1024 KB
Marking Scheme:Marks are awarded if any testcase passes.
Allowed Languages:C, CPP, CLOJURE, CSHARP, D, ERLANG, FSHARP, GO, GROOVY, HASKELL, JAVA, JAVA8, JAVASCRIPT, JAVASCRIPT_NODE, LISP, LISP_SBCL, LUA, OBJECTIVEC, OCAML, OCTAVE, PASCAL, PERL, PHP, PYTHON, PYTHON3, R, RACKET, RUBY, RUST, SCALA, SWIFT, VB

My Solution:

CodeEval: Penultimate Word

Challenge Description:

Write a program which finds the next-to-last word in a string.

Input Sample:

Your program should accept as its first argument a path to a filename. Input example is the following

some line with text
another line

Each line has more than one word.

Output Sample:

Print the next-to-last word in the following way.

with
another

Solution:

 

CodeEval: Shortest Repetition

Challenge Description:

Write a program to determine the shortest repetition in a string.
A string is said to have period p if it can be formed by concatenating one or more repetitions of another string of length p. For example, the string “xyzxyzxyzxyz” has period 3, since it is formed by 4 repetitions of the string “xyz”. It also has periods 6 (two repetitions of “xyzxyz”) and 12 (one repetition of “xyzxyzxyzxyz”).

Input Sample:

Your program should accept as its first argument a path to a filename. Each line will contain a string of up to 80 non-blank characters. E.g.

abcabcabcabc
bcbcbcbcbcbcbcbcbcbcbcbcbcbc
dddddddddddddddddddd
adcdefg

Output Sample:

Print out the smallest period of the input string. E.g.

3
2
1
7

Solution:

 

CodeEval: PASS TRIANGLE

CHALLENGE DESCRIPTION:

By starting at the top of the triangle and moving to adjacent numbers on the row below, the maximum total from top to bottom is 27.

   5
  9 6
 4 6 8
0 7 1 5

5 + 9 + 6 + 7 = 27

INPUT SAMPLE:

Your program should accept as its first argument a path to a filename. Input example is the following:

5
9 6
4 6 8
0 7 1 5

You make also check full input file which will be used for your code evaluation.

OUTPUT SAMPLE:

The correct output is the maximum sum for the triangle. So for the given example the correct answer would be 27

Solution

HackerRank: Gemstones

Problem

John has discovered various rocks. Each rock is composed of various elements, and each element is represented by a lower-case Latin letter from ‘a’ to ‘z’. An element can be present multiple times in a rock. An element is called a gem-element if it occurs at least once in each of the rocks.

Given the list of N rocks with their compositions, display the number of gem-elements that exist in those rocks.

Input Format

The first line consists of an integer, N, the number of rocks.
Each of the next N lines contains a rock’s composition. Each composition consists of lower-case letters of English alphabet.

Constraints
1N100
Each composition consists of only lower-case Latin letters (‘a’-‘z’).
1≤ length of each composition 100

Output Format

Print the number of gem-elements that are common in these rocks. If there are none, print 0.

Sample Input

3
abcdde
baccd
eeabg

Sample Output

2

Explanation

Only “a” and “b” are the two kinds of gem-elements, since these are the only characters that occur in every rock’s composition.

Solution