#StackBounty: #python #performance #algorithm #strings #search Naive implementation of KMP algorithm

Bounty: 50

After reading this answer to the question “High execution time to count overlapping substrings”, I decided to implement the suggested Knuth-Morris-Pratt (KMP) algorithm. I used the pseudo-code listed on Wikipedia for the functions kmp_table and kmp_search.

However, when running it on some corner-cases, I have observed that it is a lot slower than the standard str.find, which apparently uses a modified Boyer-Moore-Horspool algorithm and should thus have worse worst-case performance.

The specific case I looked at is:

$ ipython -i kmp.py
In [1]: text = "A"*1000000 + "B"
In [2]: word = "A"*100 + "B"
In [3]: %timeit kmp_search(text, word)
1 loop, best of 3: 410 ms per loop
In [4]: %timeit text.find(word)
1000 loops, best of 3: 703 µs per loop

So the difference is a factor of roughly 600 for this input. This is probably because the native implementation is written in C while mine is written in Python, but I still wanted to see whether I did anything stupid here or missed an obvious optimization.

def kmp_table(word):
    table = [0] * len(word)
    position, candidate = 2, 0
    table[0] = -1

    while position < len(word):
        if word[position - 1] == word[candidate]:
            table[position] = candidate + 1
            candidate += 1
            position += 1
        elif candidate > 0:
            candidate = table[candidate]
        else:
            table[position] = 0
            position += 1
    return table


def kmp_search(text, word):
    m, i = 0, 0
    table = kmp_table(word)
    while m + i < len(text):
        if word[i] == text[m + i]:
            if i == len(word) - 1:
                return m
            i += 1
        else:
            if table[i] > -1:
                m += i - table[i]
                i = table[i]
            else:
                m += 1
                i = 0
    return len(text)
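
For comparison, here is a minimal sketch of another common way to write KMP in Python, driving the scan with a for loop over the text instead of manual index arithmetic. The names kmp_failure and kmp_find are hypothetical and not part of the code above. Unlike kmp_search, this version returns -1 when there is no match, following the str.find convention. It has not been timed against the original, so treat it as an illustration rather than a proven speed-up.

```python
def kmp_failure(word):
    """fail[i] = length of the longest proper prefix of word[:i + 1]
    that is also a suffix of it."""
    fail = [0] * len(word)
    k = 0
    for i in range(1, len(word)):
        # Fall back to shorter borders until the next character fits.
        while k and word[i] != word[k]:
            k = fail[k - 1]
        if word[i] == word[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find(text, word):
    """Return the index of the first occurrence of word in text, or -1."""
    if not word:
        return 0
    fail = kmp_failure(word)
    k = 0  # number of characters of word matched so far
    for i, ch in enumerate(text):
        while k and ch != word[k]:
            k = fail[k - 1]
        if ch == word[k]:
            k += 1
            if k == len(word):
                return i - k + 1
    return -1
```

A quick sanity check is to compare kmp_find(text, word) with text.find(word) on a handful of inputs, including the corner case from the timing session above.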



Convert Comma separated String to Rows in Oracle SQL

We often need to take a comma-separated list of terms held in a single string and convert it into rows in a SQL query.

For example:

 India, USA, Russia, Malaysia, Mexico

Needs to be converted to:

 Country
 India
 USA
 Russia
 Malaysia
 Mexico

The following SQL script can help with this:

Just replace the required values with your string and your delimiter.

HackerRank: Repeated String

Problem

Lilah has a string, s, of lowercase English letters that she repeated infinitely many times.

Given an integer, n, find and print the number of letter a's in the first n letters of Lilah's infinite string.

Input Format

The first line contains a single string, s.
The second line contains an integer, n.

Constraints

  • 1<=|s|<=100
  • 1<=n<=10^12
  • For 25% of the test cases, n <= 10^6

Output Format

Print a single integer denoting the number of letter a's in the first n letters of the infinite string created by repeating s infinitely many times.

Sample Input 0

aba
10

Sample Output 0

7

Explanation 0

The first n = 10 letters of the infinite string are abaabaabaa. Because there are 7 a's, we print 7 on a new line.

Sample Input 1

a
1000000000000

Sample Output 1

1000000000000

Explanation 1

Because all of the first n=1000000000000 letters of the infinite string are a, we print 1000000000000 on a new line.

Solution
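
The original solution is not reproduced here. As a minimal sketch of one possible approach (the function name repeated_string is hypothetical): since n can be as large as 10^12, building the string is not an option. Instead, count the a's in one copy of s, multiply by the number of complete copies that fit into n letters, and add the a's in the leftover prefix.

```python
def repeated_string(s, n):
    """Count the letter 'a' in the first n letters of s repeated forever."""
    # full = number of complete copies of s, rest = length of the partial copy
    full, rest = divmod(n, len(s))
    return s.count("a") * full + s[:rest].count("a")


print(repeated_string("aba", 10))        # sample input 0
print(repeated_string("a", 10 ** 12))    # sample input 1
```

The work is proportional to the length of s, not to n, which is what makes the large test cases feasible.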

CodeEval: Penultimate Word

Challenge Description:

Write a program which finds the next-to-last word in a string.

Input Sample:

Your program should accept as its first argument a path to a filename. An example input is the following:

some line with text
another line

Each line has more than one word.

Output Sample:

Print the next-to-last word in the following way.

with
another

Solution:
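
The original solution is not reproduced here. A minimal sketch of one way to do it, assuming words are separated by whitespace (the names penultimate_word and print_penultimate_words are hypothetical):

```python
def penultimate_word(line):
    """Return the next-to-last whitespace-separated word, or None."""
    words = line.split()
    return words[-2] if len(words) >= 2 else None


def print_penultimate_words(path):
    """Print the next-to-last word of every line in the file at path."""
    with open(path) as handle:
        for line in handle:
            word = penultimate_word(line)
            if word is not None:
                print(word)
```

To match the challenge's calling convention, the script would pass its first command-line argument to print_penultimate_words.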

 

CodeEval: Shortest Repetition

Challenge Description:

Write a program to determine the shortest repetition in a string.
A string is said to have period p if it can be formed by concatenating one or more repetitions of another string of length p. For example, the string “xyzxyzxyzxyz” has period 3, since it is formed by 4 repetitions of the string “xyz”. It also has periods 6 (two repetitions of “xyzxyz”) and 12 (one repetition of “xyzxyzxyzxyz”).

Input Sample:

Your program should accept as its first argument a path to a filename. Each line will contain a string of up to 80 non-blank characters. E.g.

abcabcabcabc
bcbcbcbcbcbcbcbcbcbcbcbcbcbc
dddddddddddddddddddd
adcdefg

Output Sample:

Print out the smallest period of the input string. E.g.

3
2
1
7

Solution:
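
The original solution is not reproduced here. A minimal sketch of a brute-force approach, which is sufficient for strings of up to 80 characters (the function name shortest_period is hypothetical): try every candidate length p that divides the string length, smallest first, and return the first one whose prefix rebuilds the whole string.

```python
def shortest_period(s):
    """Return the smallest p such that s is s[:p] repeated len(s) // p times."""
    n = len(s)
    for p in range(1, n + 1):
        # Only divisors of n can be periods in the sense defined above.
        if n % p == 0 and s[:p] * (n // p) == s:
            return p
    return n  # only reached for the empty string


for sample in ("abcabcabcabc", "bcbcbcbcbcbcbcbcbcbcbcbcbcbc",
               "dddddddddddddddddddd", "adcdefg"):
    print(shortest_period(sample))
```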

 

HackerRank: Alternating Characters

Problem

Shashank likes strings in which consecutive characters are different. For example, he likes ABABA, while he doesn’t like ABAA. Given a string containing characters A and B only, he wants to change it into a string he likes. To do this, he is allowed to delete the characters in the string.

Your task is to find the minimum number of required deletions.

Input Format

The first line contains an integer T, i.e. the number of test cases.
The next T lines contain a string each.

Output Format

For each test case, print the minimum number of deletions required.

Constraints

1 ≤ T ≤ 10
1 ≤ length of string ≤ 10^5

Sample Input

5
AAAA
BBBBB
ABABABAB
BABABA
AAABBB

Sample Output

3
4
0
0
4

Explanation

AAAA -> A, 3 deletions
BBBBB -> B, 4 deletions
ABABABAB -> ABABABAB, 0 deletions
BABABA -> BABABA, 0 deletions
AAABBB -> AB, 4 deletions because to convert it to AB we need to delete 2 A's and 2 B's

Solution
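
The original solution is not reproduced here. A minimal sketch of one approach (the function name min_deletions is hypothetical): every character that equals the one immediately before it has to go, so the answer is the number of adjacent equal pairs.

```python
def min_deletions(s):
    """Count adjacent equal pairs; each one costs exactly one deletion."""
    return sum(1 for prev, cur in zip(s, s[1:]) if prev == cur)


for sample in ("AAAA", "BBBBB", "ABABABAB", "BABABA", "AAABBB"):
    print(min_deletions(sample))
```

This runs in a single pass over the string, which comfortably handles the 10^5 length limit.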

Combinations of a String

Problem:
Write an algorithm to print all possible combinations of characters in a string.

Solution:
Since we need to generate combinations, we can start with a single character and then continue to add a character to combinations we have seen so far.

Let’s use “XYZ” as an example.

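// Prints every non-empty combination of input's characters that can be
// formed from index k onward, appended to whatever is already in output.
// Declared static so that it can be called from main.
public static void buildTree(String input, StringBuffer output, int k)
{
    for (int i = k; i < input.length(); i++)
    {
        output.append(input.charAt(i));
        System.out.println(output);                // emit the current combination
        buildTree(input, output, i + 1);           // extend it with later characters
        output.deleteCharAt(output.length() - 1);  // backtrack
    }
}

public static void main(String[] args){
     buildTree("XYZ", new StringBuffer(), 0);
}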

Dry run, just to give an idea (each node is a combination the recursion builds):

X -> XY -> XYZ
  -> XZ

Y -> YZ

Z

How to replace in Javascript, when replacement string is a variable


Very often we use the replace method in Javascript to replace one string literal with another string literal.
But what if we need to replace a string whose value is held in a variable?

Here is the solution…

You can use a regular expression (often referred to as a RegEx or a RegExp). Regular expressions are much more powerful than standard string matching, as they can express very complicated matching logic.

// Let's take a look at the above example using regular expressions.
strReplaceSimple = strText.replace( new RegExp( "th", "" ), "[X]" );

alert( strReplaceSimple );

As you can see, we have the same replace happening. So let’s take a look at what’s going on. Instead of passing a simple target string to the replace() method, we are passing in a regular expression (new RegExp()) object instance. The RegExp() takes two arguments, the expression and the search flags (left blank in our example). There are two universally valid flags: [g], which means globally replace, and [i], which means case INsensitive. By default, the regular expression is NOT global and is case sensitive.

// So let's try to do a global replace using regular expressions.
strReplaceAll = strText.replace( new RegExp( "th", "g" ), "[X]" );

alert( strReplaceAll );

We just did a global replace in ONE line of code.

strReplaceAll = strText.replace( new RegExp( "th", "gi" ), "[X]" );

alert( strReplaceAll );

We just replaced out that additional “Th” simply by adding the flag [i] into the regular expression. That’s how powerful regular expressions are. But there’s more. Regular expressions are more than just flags. Much more!

Imagine that, for some reason, you knew about regular expressions, but you didn’t know about the case insensitive flag [i]. You could have performed the same replace using this:

strReplaceAll = strText.replace( new RegExp( "(T|t)(H|h)", "g" ), "[X]" );

alert( strReplaceAll );

This groups the two letters together, and for each letter it tells the replacing algorithm to match t OR T followed by h OR H. There is sooo much more that regular expressions can do. Unfortunately, that is outside the scope of this entry. You should really look into regular expressions both in Javascript and in ColdFusion / Java. They are amazing.

But what happens if you don’t want to do a simple replace? The replace method allows some very interesting flexibility. Up until now, we have been passing in a simple string as the “replace-in” argument ([X]). But you don’t have to. You can pass in a function pointer instead.

For this example, let’s replace the target string with a random letter in the brackets, not necessarily the X. First we have to create a function that will return the resultant random string:

function RandomString(){
// Create an array of letters.
var arrLetters = ["A","B","C","D","E","V","W","X","Y","Z"];

// Use the random() method to pick a random index into the above
// array. Scaling by the array length gives every letter, including
// the last one, the same chance of being picked.
var intLetter = Math.floor( Math.random() * arrLetters.length );

// Return the random letter string we get from the
// array of letters above.
return( "[" + arrLetters[ intLetter ] + "]" );
}

Try calling the function on its own a few times, just to see how it behaves.

alert(
RandomString() + "\n" + RandomString() + "\n" +
RandomString() + "\n" + RandomString() + "\n" +
RandomString() + "\n" + RandomString() + "\n" +
RandomString() + "\n" + RandomString() + "\n"
);

As you can see, it randomly (as random as possible) picks a letter to return. Now, let’s call the replace with the RandomString() method sent as the second argument. We will do this a few times so you can see the randomness in effect.

alert( strText.replace( "th", RandomString ) );
alert( strText.replace( "th", RandomString ) );
alert( strText.replace( "th", RandomString ) );

Notice that we are passing in a POINTER to the function but not actually calling it. RandomString vs. RandomString(). There’s one thing I did not mention yet. Not only can you pass in a function as an argument, but when the replace method is taking place, it passes in the target match as an argument to this function. We could have re-written the function as such:

function RandomString2( strTargetInstance ) // This is the target string match instance.
{
var arrLetters = ["A","B","C","D","E","V","W","X","Y","Z"];
// Scale by the array length so that every letter can be picked.
var intLetter = Math.floor( Math.random() * arrLetters.length );

// Return the random letter string we get from the
// array of letters above. This time, though, we are
// going to include the target string to demonstrate
// that it has been passed in.
return( "[" + strTargetInstance + " : " + arrLetters[ intLetter ] + "]" );
}

Now, we will run it again, just once, so you can see it in action.

alert( strText.replace( "th", RandomString2 ) );
