#StackBounty: #bayesian #algorithms #prior How to extend Solomonoff's universal prior to stochastic models?

Bounty: 150

Solomonoff’s universal prior for models is based on the algorithmic complexity of a computer program $p$ which executes that model. Where $l$ is the length of the computer program, the prior is proportional to $2^{-l}$.

This works fine for deterministic models. We have an observation and we want to understand what model best explains the observation. If the program $p$ returns an output equal to the observation then this program is valid, otherwise it is invalid. We can apply Bayes rule to get a posterior probability by considering all programs which returned the desired output.

I’m confused about how this is extended to the case where there is randomness in the observations we collect.

Take a simple example: Suppose we are considering a linear regression model $y=ax+b$. The computer program which executes this model has a length $l$ which is a function of $a$ and $b$ (bigger parameters require more bits to model). The prior is $\pi(a,b)$.

But this program doesn’t include a random element. The regression model is $y=ax+b+\epsilon$, where $\epsilon$ is random noise with distribution $N(0,\sigma^2)$.

How should we consider a prior for the noise? A computer program can’t produce random noise, so this can’t be a part of the program length. Should the prior for $\sigma^2$ be considered separately from the prior for the model? Or should the number of bits required to encode the numeric value of $\sigma^2$ be included in the length of the program?

In addition to the variance of the noise, the underlying normal distribution would seem to have some degree of complexity associated with it. Should the bits required to describe a normal distribution be included in the model prior?



#StackBounty: How to apply ant colony optimization to the TSP but repeating nodes and edges

Bounty: 50

I’m learning the Ant Colony Optimization algorithm and would like to apply it to a variation of the TSP (find a path that starts from a node, visits all nodes, and finishes at the initial node) in which a node or edge may be crossed more than once. In fact, my problem would have no solution as a standard TSP, because some nodes have only one edge.

Basically, I see two problems:

  1. In most of the examples I’ve seen, nodes are represented by (X,Y) positions and the graph is assumed to be complete, so every node can be reached from every other node, which means an ant always has an edge leading to a node that hasn’t been visited yet. In this problem, however, an ant may arrive at a node where all the adjacent nodes have already been visited. How would it decide which way to go? The worst case would be that only one node is left to visit and it is the one farthest from the current node.

  2. What repeating edges implies for pheromones. If an edge is crossed several times in a tour, it may or may not end up with much more pheromone, depending on the implementation, and I’m not sure whether that could have a negative impact.

This is a very simple example, where red edges have a cost of 1 and blue edges have a very large cost, for example 20. As you can see, the best path requires repeating edges.

[Figure: example graph with red and blue edges]

Regards.

EDIT: I could use Floyd to get the shortest paths between every pair of nodes, but the real graph has around 1000 nodes and 2500 edges (a sparse graph), and Floyd is O(N^3), so I’m not sure it is a reasonable option (I’ve looked at Johnson too). In that case I would also have to take into account that the path from one node to another may cross other visited or unvisited nodes, which changes the logic of the ant colony optimization. It could also cause problems with data structures: if I store all the possible paths, one option would be a 1000×1000 matrix (which could be optimized to 1000×500) with a list of more than 1000 nodes in each cell, but that way the memory gets too big (1000x1000x1000).



K random combinations of N elements in List in Java

Given a List of N Strings, generate and print all possible combinations of R elements from the list, and return K random combinations from the result. Following is the code for implementing it:
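A minimal sketch of one way to approach this, assuming the combinations are built by simple backtracking and the random subset is drawn by shuffling a copy of the full result. The class and method names (`RandomCombinations`, `combinations`, `pickRandom`) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper class; names are illustrative only.
final class RandomCombinations {

    /** Returns every combination of r elements from items, in index order. */
    static List<List<String>> combinations(List<String> items, int r) {
        List<List<String>> out = new ArrayList<>();
        if (r < 0 || r > items.size()) {
            return out; // no valid combination of that size
        }
        collect(items, r, 0, new ArrayList<String>(), out);
        return out;
    }

    private static void collect(List<String> items, int r, int start,
                                List<String> current, List<List<String>> out) {
        if (current.size() == r) {
            out.add(new ArrayList<>(current));
            return;
        }
        // Stop early when too few items remain to complete the combination.
        int needed = r - current.size();
        for (int i = start; i <= items.size() - needed; i++) {
            current.add(items.get(i));
            collect(items, r, i + 1, current, out);
            current.remove(current.size() - 1);
        }
    }

    /** Picks up to k distinct combinations at random by shuffling a copy. */
    static List<List<String>> pickRandom(List<List<String>> all, int k, Random rng) {
        List<List<String>> copy = new ArrayList<>(all);
        Collections.shuffle(copy, rng);
        int count = Math.min(Math.max(k, 0), copy.size());
        return new ArrayList<>(copy.subList(0, count));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d");
        List<List<String>> all = combinations(items, 2);
        for (List<String> combination : all) {
            System.out.println(combination);
        }
        System.out.println("Random pick: " + pickRandom(all, 3, new Random()));
    }
}
```

Generating every combination first is only practical for small N and R, since the number of combinations grows quickly; for large inputs a sampling approach that avoids full enumeration would be preferable.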

HackerRank: Repeated String

Problem

Lilah has a string, s, of lowercase English letters that she repeated infinitely many times.

Given an integer, n, find and print the number of letter a’s in the first n letters of Lilah’s infinite string.

Input Format

The first line contains a single string, s.
The second line contains an integer, n.

Constraints

  • 1<=|s|<=100
  • 1<=n<=10^12
  • For 25% of the test cases, n <= 10^6

Output Format

Print a single integer denoting the number of letter a’s in the first n letters of the infinite string created by repeating s infinitely many times.

Sample Input 0

aba
10

Sample Output 0

7

Explanation 0

The first n = 10 letters of the infinite string are abaabaabaa. Because there are 7 a’s, we print 7 on a new line.

Sample Input 1

a
1000000000000

Sample Output 1

1000000000000

Explanation 1

Because all of the first n=1000000000000 letters of the infinite string are a, we print 1000000000000 on a new line.

Solution
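One possible approach, sketched under the assumption that n is read into a `long`: count the a’s in one copy of s, multiply by the number of complete copies that fit in the first n letters, and add the a’s in the leftover prefix. The class and method names (`RepeatedString`, `countA`) are hypothetical.

```java
// Hypothetical helper class; names are illustrative only.
final class RepeatedString {

    /** Counts the letter 'a' in the first n letters of s repeated forever. */
    static long countA(String s, long n) {
        long perCopy = s.chars().filter(ch -> ch == 'a').count();
        long fullCopies = n / s.length();          // complete copies of s
        int remainder = (int) (n % s.length());    // letters of the partial copy
        long inPrefix = s.substring(0, remainder)
                         .chars().filter(ch -> ch == 'a').count();
        return fullCopies * perCopy + inPrefix;
    }

    public static void main(String[] args) {
        System.out.println(countA("aba", 10L));
        System.out.println(countA("a", 1_000_000_000_000L));
    }
}
```

Because the work depends only on the length of s, this avoids building or scanning a string of up to 10^12 characters.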

Apache Commons DbUtils Mini Wrapper

This is a very small DB connector in Java, written as a wrapper class around Apache Commons DbUtils.

The Commons DbUtils library is a small set of classes designed to make working with JDBC easier. JDBC resource cleanup code is mundane, error prone work so these classes abstract out all of the cleanup tasks from your code leaving you with what you really wanted to do with JDBC in the first place: query and update data.

Some of the advantages of using DbUtils are:

  • No possibility for resource leaks. Correct JDBC coding isn’t difficult but it is time-consuming and tedious. This often leads to connection leaks that may be difficult to track down.
  • Cleaner, clearer persistence code. The amount of code needed to persist data in a database is drastically reduced. The remaining code clearly expresses your intention without being cluttered with resource cleanup.
  • Automatically populate Java Bean properties from Result Sets. You don’t need to manually copy column values into bean instances by calling setter methods. Each row of the Result Set can be represented by one fully populated bean instance.

DbUtils is designed to be:

  • Small – you should be able to understand the whole package in a short amount of time.
  • Transparent – DbUtils doesn’t do any magic behind the scenes. You give it a query, it executes it and cleans up for you.
  • Fast – You don’t need to create a million temporary objects to work with DbUtils.

DbUtils is not:

  • An Object/Relational bridge – there are plenty of good O/R tools already. DbUtils is for developers looking to use JDBC without all the mundane pieces.
  • A Data Access Object (DAO) framework – DbUtils can be used to build a DAO framework though.
  • An object oriented abstraction of general database objects like a Table, Column, or Primary Key.
  • A heavyweight framework of any kind – the goal here is to be a straightforward and easy to use JDBC helper library.

Wrapper:

HackerRank: Circular Array Rotation

Problem

John Watson performs an operation called a right circular rotation on an array of integers, [a(0), a(1), a(2), ..., a(n-2), a(n-1)]. After performing one right circular rotation operation, the array is transformed from

[a(0), a(1), a(2), ..., a(n-2), a(n-1)]

to

[a(n-1), a(0), a(1), a(2), ..., a(n-2)].

Watson performs this operation k times. To test Sherlock’s ability to identify the current element at a particular position in the rotated array, Watson asks q queries, where each query consists of a single integer, m, for which you must print the element at index m in the rotated array (i.e., the value of a(m)).

Input Format

The first line contains 3 space-separated integers, n, k, and q, respectively.
The second line contains n space-separated integers, where each integer i describes array element a(i) (where 0 <= i < n).
Each of the q subsequent lines contains a single integer denoting m.

Constraints

  • 0 <= i <= 10^5
  • 0 <= a(i) <= 10^5
  • 0 <= k <= 10^5
  • 0 <= q <= 500
  • 0 <= m <= n-1

Output Format

For each query, print the value of the element at index m of the rotated array on a new line.

Sample Input

3 2 3
1 2 3
0
1
2

Sample Output

2
3
1

Explanation

After the first rotation, the array becomes [3,1,2].
After the second (and final) rotation, the array becomes [2,3,1].

Let’s refer to the array’s final state as array b. For each query, we just have to print the value of b(m) on a new line:

  • m=0 , so we print 2 on a new line.
  • m=1 , so we print 3 on a new line.
  • m=2 , so we print 1 on a new line.

Solution
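One way to answer the queries without physically rotating the array, offered as a sketch: after k right rotations, the element now at index m originally sat at index (m - k) mod n. The class and method names (`CircularRotation`, `elementAfterRotation`) are hypothetical.

```java
// Hypothetical helper class; names are illustrative only.
final class CircularRotation {

    /** Returns the element at index m after k right circular rotations of a. */
    static int elementAfterRotation(int[] a, int k, int m) {
        int n = a.length;
        int shift = k % n;               // rotating n times restores the array
        return a[(m - shift + n) % n];   // adding n keeps the index non-negative
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        for (int m = 0; m < a.length; m++) {
            System.out.println(elementAfterRotation(a, 2, m));
        }
    }
}
```

Each query is answered in constant time, so the total cost is proportional to n + q rather than n * k.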

HackerEarth: Battle Of Bots 6: Draughts

Problem:

Sample Game

Draughts is a two-player board game played on an 8x8 grid of cells, with the players on opposite sides of the game board. Each player has an allocated color, Red (First Player) or White (Second Player) being conventional. Players take turns making diagonal moves of uniform game pieces in the forward direction only, with mandatory captures made by jumping over opponent pieces.

Rules:

  • A player can only move diagonally to an adjacent cell in the forward direction, and only if that cell is vacant.
  • A player may not move an opponent’s piece.
  • If the diagonally adjacent cell contains an opponent’s piece, and the cell immediately beyond it is vacant, the opponent’s piece may be captured (and removed from the game) by jumping over it in the forward direction only.
  • If a player makes a jump, it is mandatory to keep jumping as long as a jump is possible.
  • A player cannot move to a diagonally adjacent cell after making a jump.
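The jump rules above can be illustrated with a small sketch. It rests on several assumptions that are not stated in the rules themselves: coordinates are treated as [row, column], player 1 moves toward higher row numbers and player 2 toward lower ones (as the starting position suggests), and when two jumps are available the left diagonal is simply tried first rather than choosing the better capture. The class and method names (`DraughtsJumps`, `jumpChain`, `inside`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names are illustrative only.
final class DraughtsJumps {

    /**
     * Follows forward jumps from (row, col) until no further jump is possible
     * and returns the landing cells in order. An empty list means no jump exists.
     * Assumption: player 1 moves toward higher rows, player 2 toward lower rows.
     */
    static List<int[]> jumpChain(int[][] board, int player, int row, int col) {
        int opponent = (player == 1) ? 2 : 1;
        int forward = (player == 1) ? 1 : -1;
        List<int[]> chain = new ArrayList<>();
        boolean jumped = true;
        while (jumped) {
            jumped = false;
            for (int dc = -1; dc <= 1; dc += 2) {      // left diagonal, then right
                int midR = row + forward;
                int midC = col + dc;
                int landR = row + 2 * forward;
                int landC = col + 2 * dc;
                // If the landing cell is on the board, the jumped cell is too.
                if (inside(landR, landC)
                        && board[midR][midC] == opponent
                        && board[landR][landC] == 0) {
                    row = landR;
                    col = landC;
                    chain.add(new int[] {row, col});
                    jumped = true;
                    break;                              // continue from the new cell
                }
            }
        }
        return chain;
    }

    static boolean inside(int r, int c) {
        return r >= 0 && r < 8 && c >= 0 && c < 8;
    }
}
```

The board is not modified here: because jumps only go forward, a captured piece can never be jumped a second time within the same chain.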

The game ends when either player has no move left. At the end of the game, the player with the majority of pieces wins.

We will play it on an 8X8 grid. The top left of the grid is [0,0] and the bottom right is [7,7].

Input:
The input will be an 8x8 matrix consisting only of 0, 1, and 2. Another line will follow containing a number, 1 or 2, which is your player id. In the given matrix, top-left is [0,0] and bottom-right is [7,7]. The x-coordinate increases from left to right, and the y-coordinate increases from top to bottom.

The cell marked 0 means it doesn’t contain any stones. The cell marked 1 means it contains first player’s stone which is Red in color. The cell marked 2 means it contains second player’s stone which is white in color.

Output:
In the first line, print the space-separated coordinates of the cell containing the piece you want to move.
In the second line, print an integer N, the number of steps or jumps the piece will make in one move.
In the next N lines, print the coordinates of the cells the piece will move or jump to.
You must take care not to print invalid coordinates. For example, [1,1] might be a valid coordinate during play if [1,1] is diagonal to the piece that is going to jump, but [9,10] never will be. Also, if you play an invalid move or your code exceeds the time/memory limit while determining the move, you lose the game.

Starting state
The starting state of the game is the state of the board before the game starts.

0 1 0 1 0 1 0 1
1 0 1 0 1 0 1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 2 0 2 0 2 0 2
2 0 2 0 2 0 2 0

First Input
This is the input given to the first player at the start of the game.

0 1 0 1 0 1 0 1
1 0 1 0 1 0 1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 2 0 2 0 2 0 2
2 0 2 0 2 0 2 0
1

SAMPLE INPUT
0 1 0 1 0 1 0 1
1 0 1 0 1 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 2 0 0 0
0 0 0 0 0 0 0 0
0 0 2 0 0 0 0 0
0 0 0 2 0 0 0 2
2 0 2 0 2 0 2 0
1

SAMPLE OUTPUT
2 5
2
4 3
6 1

Explanation

This is player 1’s turn, and the player will move the piece at [2,5] and will make two jumps. The first jump will be to [4,3] and the second jump will be to [6,1].

After his/her move the state of game becomes:

0 1 0 1 0 1 0 1
1 0 1 0 1 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 1 0 2 0 2 0 2
2 0 2 0 2 0 2 0

This state will be fed as input to program of player 2.

Another valid move for the first player is:

2 5
1
3 6

But the following are invalid moves.
Case 1:

2 5
1
4 3

Because after making a jump it is possible to jump again, and it is mandatory to keep jumping as long as a jump is possible.

Case 2:

2 5
2
4 3
5 4

Because after making a jump it is invalid to move to a diagonally adjacent cell.

Here is the code of the Random Bot.

Time Limit:1.0 sec(s) for each input file.
Memory Limit:256 MB
Source Limit:1024 KB

Solution

This is the solution submitted by me