#StackBounty: #python #tensorflow #deep-learning #glcm #gabor-filter Tensorflow custom filter layer definition like glcm or gabor

Bounty: 50

I want to apply filters such as the GLCM or a Gabor filter bank as a custom layer in TensorFlow, but I could not find enough custom-layer examples. How can I apply these types of filters as a layer?
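For context, the general pattern for applying a fixed (non-trainable) filter bank as a custom Keras layer looks roughly like this; a minimal sketch, assuming a single-channel float input and kernels already brought to a common shape, not a complete solution:

import numpy as np
import tensorflow as tf
from skimage.filters import gabor_kernel

class FixedFilterBank(tf.keras.layers.Layer):
    # applies a fixed bank of 2-D kernels as a non-trainable convolution
    def __init__(self, kernels, **kwargs):
        # kernels: list of 2-D numpy arrays, all with the same (kh, kw) shape
        super().__init__(**kwargs)
        bank = np.stack(kernels, axis=-1)[:, :, np.newaxis, :]  # (kh, kw, 1, n)
        self.bank = tf.constant(bank, dtype=tf.float32)

    def call(self, x):
        # x: (batch, H, W, 1) float tensor
        return tf.nn.conv2d(x, self.bank, strides=1, padding='SAME')

Since gabor_kernel outputs vary in size with sigma and frequency, the kernels would need to be center-cropped or zero-padded to one common shape before stacking.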

The process of generating a GLCM is defined in the scikit-image library as follows:

from skimage.feature import greycomatrix, greycoprops
from skimage import data

# load the sample image
img = data.brick()
# compute the grey-level co-occurrence matrix (GLCM)
glcm = greycomatrix(img, distances=[5], angles=[0], levels=256, symmetric=True, normed=True)
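The GLCM itself is a histogram rather than a convolution, so it cannot be expressed as convolution weights. One option, sketched here under the assumption of a single unbatched uint8 image, is to call back into Python from the graph with tf.numpy_function; note that no gradients flow through such an op:

import numpy as np
import tensorflow as tf
from skimage.feature import greycomatrix

def glcm_np(img):
    # img: (H, W) uint8 array, as delivered by tf.numpy_function
    g = greycomatrix(img, distances=[5], angles=[0], levels=256,
                     symmetric=True, normed=True)
    return g.astype(np.float32)  # shape (256, 256, 1, 1)

glcm_layer = tf.keras.layers.Lambda(
    lambda img: tf.numpy_function(glcm_np, [img], tf.float32))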

The use of a Gabor filter bank is as follows:

import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage as ndi
from skimage import data
from skimage.util import img_as_float
from skimage.filters import gabor_kernel

shrink = (slice(0, None, 3), slice(0, None, 3))
brick = img_as_float(data.brick())[shrink]
grass = img_as_float(data.grass())[shrink]
gravel = img_as_float(data.gravel())[shrink]
image_names = ('brick', 'grass', 'gravel')
images = (brick, grass, gravel)

def power(image, kernel):
    # Normalize images for better comparison.
    image = (image - image.mean()) / image.std()
    return np.sqrt(ndi.convolve(image, np.real(kernel), mode='wrap')**2 +
                   ndi.convolve(image, np.imag(kernel), mode='wrap')**2)

# Plot a selection of the filter bank kernels and their responses.
results = []
kernel_params = []
for theta in (0, 1):
    theta = theta / 4. * np.pi
    for sigmax in (1, 3):
        for sigmay in (1, 3):
            for frequency in (0.1, 0.4):
                kernel = gabor_kernel(frequency, theta=theta, sigma_x=sigmax, sigma_y=sigmay)
                params = 'theta=%d,f=%.2f\nsx=%.2f sy=%.2f' % (theta * 180 / np.pi, frequency, sigmax, sigmay)
                kernel_params.append(params)
                # Save kernel and the power image for each image
                results.append((kernel, [power(img, kernel) for img in images]))

fig, axes = plt.subplots(nrows=6, ncols=4, figsize=(5, 6))
plt.gray()
fig.suptitle('Image responses for Gabor filter kernels', fontsize=12)
axes[0][0].axis('off')
# Plot original images
for label, img, ax in zip(image_names, images, axes[0][1:]):
    ax.imshow(img)
    ax.set_title(label, fontsize=9)
    ax.axis('off')
for label, (kernel, powers), ax_row in zip(kernel_params, results, axes[1:]):
    # Plot Gabor kernel
    ax = ax_row[0]
    ax.imshow(np.real(kernel))
    ax.set_ylabel(label, fontsize=7)
    ax.set_xticks([])
    ax.set_yticks([])
    # Plot Gabor responses with the contrast normalized for each filter
    vmin = np.min(powers)
    vmax = np.max(powers)
    for patch, ax in zip(powers, ax_row[1:]):
        ax.imshow(patch, vmin=vmin, vmax=vmax)
        ax.axis('off')
plt.show()

How do I define these and similar filters in TensorFlow?

I tried the code below, but it did not give the same results as the scikit-image example: https://scikit-image.org/docs/dev/auto_examples/features_detection/plot_gabor.html

[image: expected filter responses from the scikit-image Gabor example]

I got this instead:

[image: my model's output, which does not match]

import numpy as np
import matplotlib.pyplot as plt
import tensorflow.keras.backend as K
from tensorflow.keras import Input, layers
from tensorflow.keras.models import Model
from scipy import ndimage as ndi

from skimage import data
from skimage.util import img_as_float
from skimage.filters import gabor_kernel

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'


def gfb_filter(shape, size=3, tlist=[1, 2, 3], slist=[2, 5], flist=[0.01, 0.25], dtype=None):
    # Keras passes the expected kernel shape, e.g. (3, 3, 1, 12); it is only
    # printed here, and the kernels below are built independently of it
    print(shape)
    fsize = np.ones([size, size])
    kernels = []
    # 3 orientations x 2 sigmas x 2 frequencies = 12 Gabor kernels
    for theta in tlist:
        theta = theta / 4. * np.pi
        for sigma in slist:
            for frequency in flist:
                kernel = np.real(gabor_kernel(frequency, theta=theta,
                                              sigma_x=sigma, sigma_y=sigma))
                kernels.append(kernel)
    # collapse each full-size kernel onto a size x size patch by convolving
    # a ones matrix with it
    gfblist = []
    for k, kernel in enumerate(kernels):
        ck = ndi.convolve(fsize, kernel, mode='wrap')
        gfblist.append(ck)

    # (kh, kw, in_channels, out_channels), as Conv2D expects
    gfblist = np.asarray(gfblist).reshape(size, size, 1, len(gfblist))
    print(gfblist.shape)
    return K.variable(gfblist, dtype='float32')


dimg=img_as_float(data.brick())
input_mat = dimg.reshape((1, 512, 512, 1))

def build_model():
    input_tensor = Input(shape=(512,512,1))
    x = layers.Conv2D(filters=12, 
                      kernel_size = 3,
                      kernel_initializer=gfb_filter,
                      strides=1, 
                      padding='valid') (input_tensor)

    model = Model(inputs=input_tensor, outputs=x)
    return model

model = build_model()
out = model.predict(input_mat)
print(out)

# move the channel axis first: (1, 510, 510, 12) -> (12, 510, 510)
# (a plain reshape(12, 510, 510) would scramble the filter responses)
o1 = np.transpose(out[0], (2, 0, 1))
plt.subplot(2, 2, 1)
plt.imshow(dimg)

plt.subplot(2, 2, 2)
plt.imshow(o1[0, :, :])

plt.subplot(2, 2, 3)
plt.imshow(o1[6, :, :])

plt.subplot(2, 2, 4)
plt.imshow(o1[10, :, :])

plt.show()
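A likely reason the outputs differ from the scikit-image gallery is that gfb_filter collapses each full-size Gabor kernel onto a 3×3 patch (by convolving a ones matrix with it), which discards most of the kernel's spatial structure. Below is a sketch of an initializer that instead respects the shape Keras passes in; gabor_initializer is a hypothetical helper reusing the imports above, with illustrative parameter values, and it assumes a single input channel:

def gabor_initializer(shape, dtype=None):
    # shape is (kh, kw, in_channels, out_channels), as supplied by Conv2D
    kh, kw, _, out_ch = shape
    bank = np.zeros(shape, dtype='float32')
    for i, theta in enumerate(np.linspace(0, np.pi, out_ch, endpoint=False)):
        k = np.real(gabor_kernel(0.25, theta=theta, sigma_x=2, sigma_y=2))
        # zero-pad the kernel if it is smaller than the target window...
        src = np.zeros((max(kh, k.shape[0]), max(kw, k.shape[1])), dtype='float32')
        oy, ox = (src.shape[0] - k.shape[0]) // 2, (src.shape[1] - k.shape[1]) // 2
        src[oy:oy + k.shape[0], ox:ox + k.shape[1]] = k
        # ...then center-crop to exactly (kh, kw)
        cy, cx = (src.shape[0] - kh) // 2, (src.shape[1] - kw) // 2
        bank[:, :, 0, i] = src[cy:cy + kh, cx:cx + kw]
    return K.variable(bank, dtype='float32')

For the kernels to keep their structure, kernel_size in the Conv2D call would also need to be considerably larger than 3 (e.g. 15).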


Get this bounty!!!

#StackBounty: #deep-learning #pipelines Deep Learning Pipeline motivation

Bounty: 50

A Deep Learning Pipeline consists of the following 5 points:

  1. Define and prepare problem
  2. Summarize and understand data
  3. Process and prepare data
  4. Evaluate algorithms
  5. Improve results

Here is the source:

[image: figure from the book Deep Learning Pipeline: Building a Deep Learning Model with TensorFlow]

What is the exact motivation behind the Deep Learning Pipeline?

I think it is because everything is an automated process, which saves money and time.

Does anyone know of, or has anyone come across, a good book or paper about the motivation behind deep learning pipelines?


Get this bounty!!!

#StackBounty: #python #keras #deep-learning Problem with KerasRegressor & multiple output

Bounty: 50

I have 3 inputs and 3 outputs. I am trying to use KerasRegressor and cross_val_score to get my prediction score.

My code is:

# Function to create the model, required for KerasRegressor
# (assumes: from tensorflow.keras import layers; from tensorflow.keras.layers import Dense;
#  from tensorflow.keras.models import Model)
def create_model():

    # start by defining the input tensor
    input_data = layers.Input(shape=(3,))

    # create the layers and pass them the input tensor to get the output tensor
    layer = [2, 2]
    hidden1Out = Dense(units=layer[0], activation='relu')(input_data)
    finalOut = Dense(units=layer[1], activation='relu')(hidden1Out)

    u_out = Dense(1, activation='linear', name='u')(finalOut)
    v_out = Dense(1, activation='linear', name='v')(finalOut)
    p_out = Dense(1, activation='linear', name='p')(finalOut)

    # define the model's start and end points
    model = Model(input_data, outputs=[u_out, v_out, p_out])

    model.compile(loss='mean_squared_error', optimizer='adam')

    return model

#load data
...

input_var = np.vstack((AOA, x, y)).T
output_var = np.vstack((u,v,p)).T

# evaluate model
estimator = KerasRegressor(build_fn=create_model, epochs=num_epochs, batch_size=batch_size, verbose=0)
kfold = KFold(n_splits=10)

I tried:

results = cross_val_score(estimator, input_var, [output_var[:,0], output_var[:,1], output_var[:,2]], cv=kfold)

and

results = cross_val_score(estimator, input_var, [output_var[:,0:1], output_var[:,1:2], output_var[:,2:3]], cv=kfold)

and

results = cross_val_score(estimator, input_var, output_var, cv=kfold)

I got error messages like:

Details:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 3 array(s), but instead got the following list of 1 arrays: [array([[ 0.69945297, 0.13296847, 0.06292328],

or

ValueError: Found input variables with inconsistent numbers of samples: [72963, 3]

So how do I solve this problem?

Thanks.
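One workaround, sketched under the assumption that a plain mean-squared error over all three targets is acceptable: since all three heads are scalar, concatenate them so the model emits a single (None, 3) tensor, which matches the (n, 3) output_var that cross_val_score passes through:

from tensorflow.keras import layers
from tensorflow.keras.models import Model

def create_model():
    input_data = layers.Input(shape=(3,))
    hidden1Out = layers.Dense(2, activation='relu')(input_data)
    finalOut = layers.Dense(2, activation='relu')(hidden1Out)
    u_out = layers.Dense(1, activation='linear', name='u')(finalOut)
    v_out = layers.Dense(1, activation='linear', name='v')(finalOut)
    p_out = layers.Dense(1, activation='linear', name='p')(finalOut)
    # single (None, 3) output instead of a list of three (None, 1) outputs
    merged = layers.concatenate([u_out, v_out, p_out])
    model = Model(input_data, outputs=merged)
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

With this change, the plain call results = cross_val_score(estimator, input_var, output_var, cv=kfold) should line up, because scikit-learn hands a single 2-D target array through to Keras.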


Get this bounty!!!

#StackBounty: #deep-learning #cnn #training #computer-vision #pytorch Troubles Training a Faster R-CNN RPN using a Resnet 101 backbone …

Bounty: 100

Training problems for an RPN

I am trying to train a network for region proposals, as in the anchor-box concept from Faster R-CNN.

I am using a pretrained ResNet-101 backbone with three layers popped off: the conv5_x layer, the average pooling layer, and the softmax layer.

As a result, the convolutional feature map fed to the RPN head for images of size 600×600 has a spatial resolution of 37×37 with 1024 channels.

I have set the gradients of only block conv4_x to be trainable. From there I am using the torchvision.models.detection rpn code: the rpn.AnchorGenerator, rpn.RPNHead, and ultimately rpn.RegionProposalNetwork classes. The call to forward returns two losses: the objectness loss and the regression loss.

The issue I am having is that my model is training very, very slowly (the loss is improving very slowly). In Girshick's original paper he says he trains over 80k mini-batches (roughly 8 epochs, since the Pascal VOC 2012 dataset has about 11,000 images), where each mini-batch is a single image with 256 anchor boxes, but my network improves its loss VERY SLOWLY from epoch to epoch, and I am training for 30+ epochs.

Below is my class code for the network.

class ResnetRegionProposalNetwork(torch.nn.Module):
    def __init__(self):
        super(ResnetRegionProposalNetwork, self).__init__()
        self.resnet_backbone = torch.nn.Sequential(*list(models.resnet101(pretrained=True).children())[:-3])
        non_trainable_backbone_layers = 5
        counter = 0
        for child in self.resnet_backbone:
            if counter < non_trainable_backbone_layers:
                for param in child.parameters():
                    param.requires_grad = False
                counter += 1
            else:
                break

        anchor_sizes = ((32,), (64,), (128,), (256,), (512,))
        aspect_ratios = ((0.5, 1.0, 2.0),) * len(anchor_sizes)
        self.rpn_anchor_generator = rpn.AnchorGenerator(
            anchor_sizes, aspect_ratios
        )
        out_channels = 1024
        self.rpn_head = rpn.RPNHead(
            out_channels, self.rpn_anchor_generator.num_anchors_per_location()[0]
        )

        rpn_pre_nms_top_n = {"training": 2000, "testing": 1000}
        rpn_post_nms_top_n = {"training": 2000, "testing": 1000}
        rpn_nms_thresh = 0.7
        rpn_fg_iou_thresh = 0.7
        rpn_bg_iou_thresh = 0.2
        rpn_batch_size_per_image = 256
        rpn_positive_fraction = 0.5

        self.rpn = rpn.RegionProposalNetwork(
            self.rpn_anchor_generator, self.rpn_head,
            rpn_fg_iou_thresh, rpn_bg_iou_thresh,
            rpn_batch_size_per_image, rpn_positive_fraction,
            rpn_pre_nms_top_n, rpn_post_nms_top_n, rpn_nms_thresh)

    def forward(self,
                images,       # type: ImageList
                targets=None  # type: Optional[List[Dict[str, Tensor]]]
                ):
        feature_maps = self.resnet_backbone(images)
        features = {"0": feature_maps}
        image_sizes = getImageSizes(images)
        image_list = il.ImageList(images, image_sizes)
        return self.rpn(image_list, features, targets)

I am using the Adam optimizer with the following parameters:
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, ResnetRPN.parameters()), lr=0.01, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, amsgrad=False)

My training loop is here:

running_loss = 0.0  # not initialized in the original snippet; assumed to start at zero
for epoch_num in range(epochs):  # will train `epochs` number of times per execution of this program
    loss_per_epoch = 0.0
    dl_iterator = iter(P.getPascalVOC2012DataLoader())
    current_epoch = epoch + epoch_num
    saveModelDuringTraining(current_epoch, ResnetRPN, optimizer, running_loss)
    batch_number = 0
    for image_batch, ground_truth_box_batch in dl_iterator:
        #print(batch_number)
        optimizer.zero_grad()
        boxes, losses = ResnetRPN(image_batch, ground_truth_box_batch)
        losses = losses["loss_objectness"] + losses["loss_rpn_box_reg"]
        losses.backward()
        optimizer.step()
        running_loss += float(losses)
        batch_number += 1
        if batch_number % 100 == 0:  # print the loss on every batch of 100 images
            print('[%d, %5d] loss: %.3f' %
                  (current_epoch + 1, batch_number + 1, running_loss))
            string_to_print = ("\n epoch number:" + str(epoch + 1) + ", batch number:"
                               + str(batch_number + 1) + ", running loss: " + str(running_loss))
            printToFile(string_to_print)
            loss_per_epoch += running_loss
            running_loss = 0.0
    print("finished epoch with epoch loss " + str(loss_per_epoch))
    printToFile("Finished Epoch: " + str(epoch + 1) + " with epoch loss: " + str(loss_per_epoch))
    loss_per_epoch = 0.0

I am considering trying the following ideas to fix the network training very slowly:

  • trying various learning rates (although I have already tried 0.01, 0.001, and 0.003 with similar results)
  • various batch sizes (so far the best results have been batches of 4, i.e. 4 images × 256 anchors per image)
  • freezing more/fewer layers of the ResNet-101 backbone
  • using a different optimizer altogether (see the sketch after this list)
  • different weightings of the loss function
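On the optimizer point: the original Faster R-CNN setup used SGD rather than Adam. A sketch of that configuration, with the hyperparameter values reported by Ren et al. (2015), reusing the ResnetRPN model from above:

import torch

trainable_params = [p for p in ResnetRPN.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=0.001,
                            momentum=0.9, weight_decay=0.0005)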

Any hints or anything obviously wrong with my approach would be MUCH APPRECIATED. I would be happy to give more information to anyone who can help.

Edit: My network is training on a fast GPU, with the images and bounding boxes as torch tensors.


Get this bounty!!!

#StackBounty: #neural-network #deep-learning #lstm #natural-language-process #gan How is the Gaussian noise given to this BLSTM based G…

Bounty: 50

In a conditional GAN, we give random noise along with a label to the generator as input. In this paper, I don't understand why in one section they say they give the random noise as input, while in another section they say it is concatenated to the output.

[screenshot: page 2]

[screenshot: page 2 footnote]

[screenshot: page 3, model setup section]

A little overview of the paper: code-switching is a phenomenon in spoken language where speakers switch between two different languages. Mixed-language models improve the accuracy of automatic speech recognition to a higher degree, but the problem is the low availability of written mixed-language sentences. Thus, as a data augmentation technique, a conditional GAN is developed to synthesize mixed English-Mandarin sentences from pure Mandarin sentences. The trained generator acts as an agent telling which words in the Mandarin sentence have to be translated: it outputs a binary array (of length equal to the input Mandarin sentence length). Both the generator and the discriminator are BLSTM networks.
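For concreteness, the two readings of where the noise enters can be sketched in Keras; this is purely illustrative (the shapes and layer sizes are my own assumptions, not the paper's code):

import tensorflow as tf
from tensorflow.keras import layers

T, D, Z = 20, 128, 100  # sequence length, embedding size, noise size (illustrative)
tokens = layers.Input(shape=(T, D))
noise = layers.Input(shape=(Z,))
noise_seq = layers.RepeatVector(T)(noise)  # broadcast the noise across time steps

# reading (a): noise concatenated to the generator's *input*
h_a = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(
    layers.Concatenate()([tokens, noise_seq]))

# reading (b): noise concatenated to the BLSTM *output* before the projection
h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(tokens)
h_b = layers.Concatenate()([h, noise_seq])

# per-word binary decision (translate or keep), as in the paper's overview
out = layers.TimeDistributed(layers.Dense(1, activation='sigmoid'))(h_b)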


Get this bounty!!!

#StackBounty: #deep-learning #supervised-learning How to combine human-labelled data with user behavior data?

Bounty: 50

I am working on a supervised learning problem for a web-search task, where I have access to a relatively small set of human-labeled examples and lots of user-behavior data.

Now, user behavior data is biased because of presentation bias, position bias, etc., so it is likely that its distribution will differ from that of the human-labeled data.

I am planning to use both to train a Neural Network model.

Now I am confused about how to combine the two datasets.
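One pattern I am considering, sketched here as one option rather than a recommendation: keep both sources in a single training set but down-weight the biased behavioral examples via per-sample weights (all names and the toy data below are illustrative):

import numpy as np
import tensorflow as tf

# toy setup: 100 human-labeled rows followed by 900 behavioral rows
X = np.random.randn(1000, 8).astype('float32')
y = np.random.randint(0, 2, size=(1000, 1)).astype('float32')
is_human = np.arange(1000) < 100

# down-weight the (biased) behavioral rows; 0.3 is an arbitrary choice
sample_weight = np.where(is_human, 1.0, 0.3)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X, y, sample_weight=sample_weight, epochs=2, verbose=0)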


Get this bounty!!!

#StackBounty: #neural-network #deep-learning #backpropagation #cost-function #math Chain function in backpropagation

Bounty: 100

I’m reading a Neural Networks Tutorial. In order to answer my question you might have to take a brief look at it.

I understand everything until the point where they introduce something they call the "chain function":

[image: the "chain function" equation from the tutorial]

It’s located under 4.5 Backpropagation in depth.

I know the chain rule says that the derivative of a composite function equals the product of the outer function's derivative and the inner function's derivative, but I still don't understand how they arrived at this equation.
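In symbols, the rule I am referring to is (generic notation, since the tutorial's exact equation is only in the image above):

$\frac{d}{dx} f(g(x)) = f'(g(x)) \, g'(x)$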

Update:

Please read the article and try to use only the terms and notation declared there. It's confusing and tough enough as it is, so I really don't need extra knowledge right now. Just use the same "language" as the article and explain how they arrived at that equation.


Get this bounty!!!

#StackBounty: #machine-learning #deep-learning #information-theory Calculating the entropy of a neural network

Bounty: 50

I am looking to calculate the information contained in a neural network, and also the maximum information that can be contained by any neural network in a certain number of bits. These two measures should be comparable (so that I can check whether my current neural network has reached the maximum, or is below it and by how much).

Information is relative, so I define it relative to the true a priori distribution of the data that the neural network is trying to estimate.

I have come across the Von Neumann entropy, which can be applied to a matrix, but because it is not additive I can't apply it to a series of weight matrices (assuming the weight matrices encode all the information of a neural network).
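For a single weight matrix, the Von Neumann entropy can be made concrete from the spectrum of the trace-normalized Gram matrix; a sketch (my own construction, measured in bits):

import numpy as np

def von_neumann_entropy(W):
    # eigenvalues of W W^T / tr(W W^T) are the squared singular values
    # of W normalized to sum to 1; take the Shannon entropy of that spectrum
    s2 = np.linalg.svd(W, compute_uv=False) ** 2
    p = s2 / s2.sum()
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

print(von_neumann_entropy(np.random.randn(64, 32)))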

I found three other papers: Entropy-Constrained Training of Deep Neural Networks, Entropy and Mutual Information in Models of Deep Neural Networks, and Deep Learning and the Information Bottleneck Principle. The second contains a link to this GitHub repo, but that method requires the activation functions and weight matrices to be known, which is not the case when finding the maximum entropy of any neural network in n bits.

How can I calculate the amount of information contained in (the entropy of) a neural network? And how can I calculate the maximum of that measure over all neural networks that fit in n bits?


Get this bounty!!!

#StackBounty: #neural-network #deep-learning #mathematics How propagate the error delta in backpropagation in convolutional neural netw…

Bounty: 50

My CNN has the following structure:

  • Output neurons: 10
  • Input matrix (I): 28×28
  • Convolutional layer (C): 3 feature maps with a 5×5 kernel (output dimension is 3×24×24)
  • Max pooling layer (MP): size 2×2 (output dimension is 3×12×12)
  • Fully connected layer (FC): 432×10 (3×12×12 = 432, the max pooling output flattened and vectorized)

After making the forward pass, I calculate the error delta in the output layer as:

$\delta^L = (a^L - y) \odot \sigma'(z^L) \quad (1)$

where $a^L$ is the predicted value and $z^L$ is the weighted input (the dot product of the weights with the previous activations, plus the biases).

I calculate the error deltas for the next layers with:

$\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l) \quad (2)$

The derivative of the error w.r.t. the weights is

$\frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \delta^l_j \quad (3)$

I'm able to update the weights (and biases) of $FC$ with no problem. At this point, the error delta $\delta$ is 10×1.

To calculate the error delta for $MP$, I take the dot product of the (transposed) $FC$ weights and the error delta itself, as defined in equation (2). That gives me an error delta of 432×1. Because there are no parameters in this layer, only flattening and vectorization, I just need to reverse that process and reshape it to 3×12×12, and that is the error in $MP$.

To find the error delta for $C$, I upsample the error delta by reversing the max pooling, ending with a 3×24×24 delta. Taking the Hadamard product of each of those matrices with the $\sigma'$ of the corresponding feature map gives me the error delta for $C$.

But now, how am I supposed to update the kernels if they're 5×5 and $I$ is 28×28? I have the error delta for the layer, but I don't know how to update the weights with it. The same question applies to the bias, as it's a single value shared across the whole feature map.
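For reference, the standard result (which I have not yet verified against my setup) is that each kernel's gradient is the valid cross-correlation of the layer's input with the corresponding delta map, which for a 28×28 input and a 24×24 delta yields exactly a 5×5 gradient, and the bias gradient is the sum of the deltas over the feature map:

$\frac{\partial C}{\partial K^l} = a^{l-1} \star \delta^l, \qquad \frac{\partial C}{\partial b^l} = \sum_{i,j} \delta^l_{i,j}$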


Get this bounty!!!
