#StackBounty: Efficient extraction of patch features over an image

Bounty: 50

Input:

  • data – a matrix of size n*m*3*3 (complex values)
  • indices – a list of coordinates (x,y), where x < n and y < m
  • fp – a feature parameter which is a tuple of ((fp11, fp12), (fp21, fp22)), id)
  • reference – a list of 3*3 matrices
  • swrd – a function which computes a similarity value between two complex valued 3*3 matrices

Output:

  • feature_values – a list of features – one feature for each index in (indices)

Functionality:

Given an image (data) were each pixel is a 3*3 matrix. And there is a list of target pixels (indices). For each target pixel, I want to extract features of the patch surrounding it.

A patch feature is either:
a) the swrd of a pixel in the patch with a reference matrix or
b) the swrd of two pixels in the patch

Thus a feature can be described by the relative coordinates fp11, fp12 (x and y offset of pixel of interest 1) and fp21, fp22 (x and y offset of pixel of interest 2).
If fp11 == fp21 and fp12== fp22, then i want to compute a), else i want to compute b).
The reference matrix of interest is defined by the feature parameter called id.

Note that the indices of interest are already filtered so that the sum x+fp__ < n and y+fp__< m for all possible fp__.

Code

Computing the symetric revised wishart distance with regularization in case a matrix A or B is not invertible

def srwd(A, B):
    """This function computes the symetric revised wishart distance as from the paper
    SPECTRAL CLUSTERING OF POLARIMETRIC SAR DATA WITH WISHART-DERIVED DISTANCE MEASURES"""
    try:
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)      
    except:
        A, B = A.reshape(3, 3) + np.eye(3) * 0.000001, B.reshape(3, 3) + np.eye(3) * 0.000001
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)      
    return abs(dist)

Getting the features with the input as given above:

def feature(data, indices, fp, reference):
    # fp is a tuple of 2 coordinates in a patch ((x1,x2),(y1,y2),ref),
    # where ref is an index of a random reference matrix in reference only relevant in case x1=y1 and x2=y2
    res = []
    if fp[0] != fp[1]:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], data[x + fp[1][0]][y + fp[1][1]]))
    else:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], reference[fp[2]]))
    return res

Finally there is another loop such as:

for fp in feature_params:
    feature_values = feature(data, indices, fp, reference)
    #here work on feature_values

The current implementation is rather inefficent and a bottleneck of the whole process. How could I improve it?

Is there a chance to efficiently compute a feature matrix efficiently and operate on it afterwards?

An executable toy example including the whole code is given here (allowing copy-paste)

import numpy as np
from numpy.linalg import inv

#toy example
data = np.random.rand(1000, 1000, 3, 3) #an image of 1000*1000 pixels, each pixel a 3*3 matrix
indices = np.random.randint(3,96, size = (10000,2)) # a list of 10000 target pixels (lets assume they are unique)
reference = [np.random.rand(3,3)] # a single reference matrix in a list (in actual application there are multiple reference matrices)
feature_params = [((0,0),(-1,-1), 0), ((0,0), (0,0), 0), ((0,1), (0,0), 0), ((1,0), (0,0), 0), ((1,1), (0,0), 0)] 



def srwd(A, B):
    """This function computes the symetric revised wishart distance as from the paper
    SPECTRAL CLUSTERING OF POLARIMETRIC SAR DATA WITH WISHART-DERIVED DISTANCE MEASURES"""
    try:
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)      
    except:
        A, B = A.reshape(3, 3) + np.eye(3) * 0.000001, B.reshape(3, 3) + np.eye(3) * 0.000001
        dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)      
    return abs(dist)


def feature(data, indices, fp, reference):
    # fp is a tuple of 2 coordinates in a patch ((x1,x2),(y1,y2),ref),
    # where ref is an index of a random reference matrix in reference only relevant in case x1=y1 and x2=y2
    res = []
    if fp[0] != fp[1]:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], data[x + fp[1][0]][y + fp[1][1]]))
    else:
        for i in indices:
            x, y = i
            res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], reference[fp[2]]))
    return res

for fp in feature_params:
    feature_values = feature(data, indices, fp, reference)        
    #here work on feature_values

A final note about the dimensions of the actual problem:

  • Image of size 6000*1700,
  • around 500 features in feature_params
  • indices is list of around 8.000.000 target indices


Get this bounty!!!