#StackBounty: #mysql Implementing a table that maintains a max heap property

Bounty: 50

For the last few days I was thinking about implementing data structures in MySQL. My idea was to create a table (or a set of tables) that allows me to implement the properties of a binary max heap. In a binary max heap, the largest node is at the top of the tree (the root node). Each insertion takes O(N)=lg(N) approximately. To test this theory I created the following case:

I created a table leaderboard_2 with the following structure:

CREATE TABLE leaderboard_2 (
  _id TINYINT(4) NOT NULL AUTO_INCREMENT,
  screen_name VARCHAR(50) NOT NULL,
  score TINYINT(4) NOT NULL,
  PRIMARY KEY (_id) 
) ENGINE = InnoDB;

To maintain the heap, I created a second table leaderboard_heap with the following structure:

CREATE TABLE leaderboard_heap (
  _id TINYINT(4) NOT NULL AUTO_INCREMENT,
  player_id TINYINT(4) NOT NULL,
  PRIMARY KEY(_id),
  FOREIGN KEY(player_id) REFERENCES leaderboard_2(_id)
) ENGINE = InnoDB;

Every time a row is inserted into leaderboard_2, I run a trigger to insert a row into leaderboard_heap with the _id of the newly created row. The trigger definition is as follows:

DELIMITER ;;
CREATE TRIGGER heap_insert
AFTER INSERT ON leaderboard_2
FOR EACH ROW
BEGIN 
  DECLARE heap_len TINYINT;
  INSERT INTO leaderboard_heap VALUES (NULL, NEW._id);
  CALL check_length_heap(heap_len);

  IF heap_len > 1 THEN
  BEGIN
    DECLARE k, comp TINYINT(4);
    SET k = heap_len;
    CALL comp_row(k, FLOOR(k/2), comp);

    -- swim
    WHILE (k > 1 AND comp > 0) DO
      CALL exch_row(k, FLOOR(k/2));
      SET k = FLOOR(k/2);
      CALL comp_row(k, FLOOR(k/2), comp);
    END WHILE;
  END;
  END IF;
END;;
DELIMITER ;

I also created the following procedures as helpers:

CREATE PROCEDURE check_length_heap(OUT heap_len TINYINT)
BEGIN
  SELECT COUNT(lh._id) INTO heap_len FROM leaderboard_heap AS lh;
END

The procedure above checks the current number of rows in leaderboard_heap table.

CREATE PROCEDURE comp_row(IN row_a_id TINYINT, IN row_b_id TINYINT, OUT comp TINYINT)
BEGIN
  DECLARE rowAScore, rowBScore TINYINT;
  SELECT score INTO rowAScore FROM leaderboard_2
    WHERE _id = row_a_id;
  SELECT score INTO rowBScore FROM leaderboard_2
    WHERE _id = row_b_id;

  IF rowAScore < rowBScore THEN
    SET comp = -1;
  ELSEIF rowAScore > rowBScore THEN
    SET comp = 1;
  ELSE
    SET comp = 0;
  END IF;
END

Using the _id value, the scores from two players are selected from leaderboard_2 table and a comparison value is set (OUT comp).

CREATE PROCEDURE exch_row(IN row_a_id TINYINT, row_b_id TINYINT)
BEGIN
  DECLARE aPlayerId, bPlayerId TINYINT(4);

  -- make temp copy of player_id
  SELECT player_id INTO aPLayerId FROM leaderboard_heap
    WHERE _id = row_a_id;
  SELECT player_id INTO bPlayerId FROM leaderboard_heap
    WHERE _id = row_b_id;

  -- update row with id = row_a_id
  UPDATE leaderboard_heap as lh
    SET lh.player_id = bPlayerId
  WHERE lh._id = row_a_id;
  -- update row with id = row_b_id
  UPDATE leaderboard_heap as lh
    SET lh.player_id = aPlayerId
  WHERE lh._id = row_b_id;


END

The current row’s player_id is exchanged with (swapped) the parent row where, if the _id of the current node (or, row) is k, then parent node (or, row) _id is FLOOR(k/2). The values are swapped as long as k != 1 AND player in current row has score greater than player in parent row.

I used this same algorithm in both Java and JavaScript and the method works fine. In this case however, whenever there’s an insert, the _id in leaderboard_heap is swapped instead of player_id.

For example, let’s say we run the following query:

INSERT INTO leaderboard_2 VALUES ('Liu Kang', 43);
INSERT INTO leaderboard_2 VALUES ('Kung Lao', 47);

The leaderboard_2 table looks as follows:

SELECT * FROM `leaderboard_2`
_id screen_name score   
1   Liu Kang    43  
2   Kung Lao    47  
Showing rows 0 -  1 (2 total, Query took 0.0001 seconds.)

After this insert, the table should look like this (if the code for heapify works as expected):

SELECT * FROM `leaderboard_heap` WHERE 1
_id player_id   
1   2   
2   1   
Showing rows 0 -  1 (2 total, Query took 0.0001 seconds.)

Instead, I get the following:

SELECT * FROM `leaderboard_heap` WHERE 1
_id player_id   
 2  1   
 1  2   
 Showing rows 0 -  1 (2 total, Query took 0.0001 seconds.)

At this point I am not sure what’s happening. From my code I can see no reason why the exchange wouldn’t work. I tested the exchange code in another table and there it worked as expected. Any suggestion or help will be greatly appreciated. I added all the necessary code to make the problem as clear as possible.


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.