#StackBounty: #matlab #deep-learning #neural-network #time-series #conv-neural-network What is the purpose of a sequence folding layer …

Bounty: 50

When designing a CNN for 1-D time series signal classification in MATLAB, I get an error saying that the 2-D convolutional layer does not accept sequences as input. From my understanding, it is perfectly possible to convolve an "array" with a 3×1 filter. To resolve this issue, MATLAB suggests using a "sequence folding layer". What would be the function of such a sequence folding layer, and how would the architecture need to be changed?

I get the following error message:
[screenshot of the MATLAB error message]

Get this bounty!!!

#StackBounty: #conv-neural-network #tensorflow LeNet-5 Subsample Layer in Tensorflow

Bounty: 100

In Tensorflow, how do you implement the LeNet-5 pooling layers with trainable coefficient and bias terms?

Reading through the LeNet-5 paper, the subsample layers are described as follows:

Layer S2 is a sub-sampling layer with 6 feature maps of size 14×14. Each unit in each feature map is connected to a 2×2 neighborhood in the corresponding feature map in C1. The four inputs to a unit in S2 are added, then multiplied by a trainable coefficient, and added to a trainable bias. The result is passed through a sigmoidal function. The 2×2 receptive fields are non-overlapping, therefore feature maps in S2 have half the number of rows and columns as feature maps in C1. Layer S2 has 12 trainable parameters and 5,880 connections.
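As I read it, that description could be expressed as a custom Keras layer. Here is a minimal sketch of my understanding (the LeNetSubsample name, the initializers, and the use of avg_pool2d are my own assumptions, not from the paper or any reference implementation):

```python
import tensorflow as tf

class LeNetSubsample(tf.keras.layers.Layer):
    """Sketch of the S2 sub-sampling described above: sum each 2x2 window,
    scale by one trainable coefficient per feature map, add one trainable
    bias per feature map, then pass the result through a sigmoid."""
    def build(self, input_shape):
        channels = input_shape[-1]
        self.coeff = self.add_weight(shape=(channels,), initializer='ones', name='coeff')
        self.bias = self.add_weight(shape=(channels,), initializer='zeros', name='bias')
    def call(self, x):
        # average pooling times 4 equals the sum over the 2x2 window
        window_sum = 4.0 * tf.nn.avg_pool2d(x, ksize=2, strides=2, padding='VALID')
        return tf.sigmoid(window_sum * self.coeff + self.bias)
```

With 6 input feature maps this gives one coefficient and one bias per map, i.e. 12 trainable parameters, matching the count quoted for S2.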


However, in my search for examples of implementing LeNet-5 in Tensorflow, I haven’t seen this pooling layer implemented with the trainable coefficient and bias. Instead, something like the following is used:

model = keras.Sequential()
model.add(layers.Conv2D(filters=6, kernel_size=(5, 5), activation='tanh',
                        input_shape=(32, 32, 1)))
model.add(layers.AveragePooling2D(pool_size=(2, 2),
                                  strides=(2, 2)))
model.add(layers.Conv2D(filters=16, kernel_size=(5, 5), activation='tanh'))
model.add(layers.AveragePooling2D(pool_size=(2, 2),
                                  strides=(2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(units=120, activation='tanh'))
model.add(layers.Dense(units=84, activation='tanh'))
model.add(layers.Dense(units=10, activation='softmax'))

Calling model.summary() on a model like this yields:

Model: "sequential"
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 28, 28, 6)         156       
average_pooling2d (AveragePo (None, 14, 14, 6)         0         
conv2d_1 (Conv2D)            (None, 10, 10, 16)        2416      
average_pooling2d_1 (Average (None, 5, 5, 16)          0         
flatten (Flatten)            (None, 400)               0         
dense (Dense)                (None, 120)               48120     
dense_1 (Dense)              (None, 84)                10164     
dense_2 (Dense)              (None, 10)                850       
Total params: 61,706
Trainable params: 61,706
Non-trainable params: 0

The pooling layers have no trainable parameters. Maybe those parameters aren’t so important for performance, but I’m curious how to implement the original pooling layers in Tensorflow.


#StackBounty: #deep-learning #caffe #conv-neural-network #imagenet #deep-dream How to use classes to "control dreams"?

Bounty: 50


I have been playing around with Deep Dream and Inceptionism, using the Caffe framework to visualize layers of GoogLeNet, an architecture built for the Imagenet project, a large visual database designed for use in visual object recognition.

You can find Imagenet here: Imagenet 1000 Classes.

To probe into the architecture and generate ‘dreams’, I am using three notebooks:

  1. https://github.com/google/deepdream/blob/master/dream.ipynb

  2. https://github.com/kylemcdonald/deepdream/blob/master/dream.ipynb

  3. https://github.com/auduno/deepdraw/blob/master/deepdraw.ipynb

The basic idea here is to extract some features from each channel in a specified layer from the model or a ‘guide’ image.

Then we input the image we wish to modify into the model and extract features from the same specified layer (for each octave), enhancing the best-matching features, i.e., those with the largest dot product between the two feature vectors.
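For concreteness, the "largest dot product" matching step can be illustrated with a small NumPy sketch (the channel and spatial sizes here are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)
ch = 4                        # channels in the chosen layer (arbitrary)
x = rng.normal(size=(ch, 6))  # input-image features, flattened spatially
y = rng.normal(size=(ch, 3))  # guide-image features, flattened spatially
A = x.T.dot(y)                # (6, 3) matrix of dot products
best = y[:, A.argmax(1)]      # for each input location, the best-matching guide feature
assert best.shape == x.shape  # one target feature vector per input location
```

This is exactly the selection that objective_guide performs on dst.diff in the code further down.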

So far I’ve managed to modify input images and control dreams using the following approaches:

  • (a) applying layers as 'end' objectives for the input image optimization (see Feature Visualization);
  • (b) using a second image to guide the optimization objective on the input image;
  • (c) visualizing GoogLeNet model classes generated from noise.

However, the effect I want to achieve sits in between these techniques, and I haven't found any documentation, paper, or code for it.

Desired result

To have one single class or unit belonging to a given 'end' layer (a) guide the optimization objective (b) and have this class visualized (c) on the input image:

An example where class = 'face' and input_image = 'clouds.jpg':

[example image: a 'face' class visualized on clouds.jpg]
Please note: the image above was generated using a model for face recognition, which was not trained on the ImageNet dataset. It is for demonstration purposes only.

Working code

Approach (a)

from cStringIO import StringIO
import numpy as np
import scipy.ndimage as nd
import PIL.Image
from IPython.display import clear_output, Image, display
from google.protobuf import text_format
import matplotlib.pyplot as plt
import caffe
model_name = 'GoogLeNet' 
model_path = 'models/dream/bvlc_googlenet/' # substitute your path here
net_fn   = model_path + 'deploy.prototxt'
param_fn = model_path + 'bvlc_googlenet.caffemodel'
model = caffe.io.caffe_pb2.NetParameter()
text_format.Merge(open(net_fn).read(), model)
model.force_backward = True
open('models/dream/bvlc_googlenet/tmp.prototxt', 'w').write(str(model))
net = caffe.Classifier('models/dream/bvlc_googlenet/tmp.prototxt', param_fn,
                       mean = np.float32([104.0, 116.0, 122.0]), # ImageNet mean, training set dependent
                       channel_swap = (2,1,0)) # the reference model has channels in BGR order instead of RGB

def showarray(a, fmt='jpeg'):
    a = np.uint8(np.clip(a, 0, 255))
    f = StringIO()
    PIL.Image.fromarray(a).save(f, fmt)
    display(Image(data=f.getvalue()))
# a couple of utility functions for converting to and from Caffe's input image layout
def preprocess(net, img):
    return np.float32(np.rollaxis(img, 2)[::-1]) - net.transformer.mean['data']
def deprocess(net, img):
    return np.dstack((img + net.transformer.mean['data'])[::-1])
def objective_L2(dst):
    dst.diff[:] = dst.data 

def make_step(net, step_size=1.5, end='inception_4c/output', 
              jitter=32, clip=True, objective=objective_L2):
    '''Basic gradient ascent step.'''

    src = net.blobs['data'] # input image is stored in Net's 'data' blob
    dst = net.blobs[end]

    ox, oy = np.random.randint(-jitter, jitter+1, 2)
    src.data[0] = np.roll(np.roll(src.data[0], ox, -1), oy, -2) # apply jitter shift

    net.forward(end=end)
    objective(dst)  # specify the optimization objective
    net.backward(start=end)
    g = src.diff[0]
    # apply normalized ascent step to the input image
    src.data[:] += step_size/np.abs(g).mean() * g

    src.data[0] = np.roll(np.roll(src.data[0], -ox, -1), -oy, -2) # unshift image
    if clip:
        bias = net.transformer.mean['data']
        src.data[:] = np.clip(src.data, -bias, 255-bias)

def deepdream(net, base_img, iter_n=20, octave_n=4, octave_scale=1.4, 
              end='inception_4c/output', clip=True, **step_params):
    # prepare base images for all octaves
    octaves = [preprocess(net, base_img)]
    for i in xrange(octave_n-1):
        octaves.append(nd.zoom(octaves[-1], (1, 1.0/octave_scale,1.0/octave_scale), order=1))
    src = net.blobs['data']
    detail = np.zeros_like(octaves[-1]) # allocate image for network-produced details
    for octave, octave_base in enumerate(octaves[::-1]):
        h, w = octave_base.shape[-2:]
        if octave > 0:
            # upscale details from the previous octave
            h1, w1 = detail.shape[-2:]
            detail = nd.zoom(detail, (1, 1.0*h/h1,1.0*w/w1), order=1)

        src.reshape(1,3,h,w) # resize the network's input image size
        src.data[0] = octave_base+detail
        for i in xrange(iter_n):
            make_step(net, end=end, clip=clip, **step_params)
            # visualization
            vis = deprocess(net, src.data[0])
            if not clip: # adjust image contrast if clipping is disabled
                vis = vis*(255.0/np.percentile(vis, 99.98))

            print octave, i, end, vis.shape
        # extract details produced on the current octave
        detail = src.data[0]-octave_base
    # returning the resulting image
    return deprocess(net, src.data[0])

I run the code above with:

end = 'inception_4c/output'
img = np.float32(PIL.Image.open('clouds.jpg'))
_=deepdream(net, img)

Approach (b)

Use one single image to guide the optimization process. This affects the style of the generated images without using a different training set.

def dream_control_by_image(optimization_objective, end):
    # this image will shape input img
    guide = np.float32(PIL.Image.open(optimization_objective))  
    h, w = guide.shape[:2]
    src, dst = net.blobs['data'], net.blobs[end]
    src.data[0] = preprocess(net, guide)
    net.forward(end=end)  # populate dst.data with the guide's features

    guide_features = dst.data[0].copy()
    def objective_guide(dst):
        x = dst.data[0].copy()
        y = guide_features
        ch = x.shape[0]
        x = x.reshape(ch,-1)
        y = y.reshape(ch,-1)
        A = x.T.dot(y) # compute the matrix of dot-products with guide features
        dst.diff[0].reshape(ch,-1)[:] = y[:,A.argmax(1)] # select ones that match best

    _=deepdream(net, img, end=end, objective=objective_guide)

and I run the code above with:

end = 'inception_4c/output'
# image to be modified
img = np.float32(PIL.Image.open('img/clouds.jpg'))
guide_image = 'img/guide.jpg'
dream_control_by_image(guide_image, end)

Failed approach

And this is how I tried to access individual classes, by one-hot encoding the matrix of classes and focusing on a single one (so far to no avail):

def objective_class(dst, class_i=50):
    # according to the ImageNet classes:
    # 50: 'American alligator, Alligator mississipiensis'
    # ('class' is a reserved word in Python, so the parameter is renamed)
    one_hot = np.zeros_like(dst.data)
    one_hot.flat[class_i] = 1.
    dst.diff[:] = one_hot.flat[class_i]

Could someone please guide me in the right direction here? I would greatly appreciate it.


#StackBounty: #conv-neural-network #pooling How effective is using the Fractional Max Pooling vs Normal Max Pooling

Bounty: 50

I was reading the paper "Fractional Max-Pooling" by Benjamin Graham. I was looking into creating better models for the CIFAR-10 dataset, and that is how I found this paper.

Now, as I delved deeper into the paper and came to understand it, I came across this answer on SO, which states that:

In summary, I personally think the good performance in the Fractional
Max-Pooling paper is achieved by a combination of using
spatially-sparse CNN with fractional max-pooling and small filters
(and network in network) which enable building a deep network even
when the input image spatial size is small. Hence in regular CNN
network, simply replace regular max pooling with fractional max
pooling does not necessarily give you a better performance.

Has anyone else looked into FMPs? Are they really useful?

Any help is appreciated.


#StackBounty: #neural-networks #conv-neural-network #tensorflow #invariance High resolution in style transfer

Bounty: 100

I’m investigating neural style transfer and its practical applications, and I’ve encountered a major issue. Are there methods for high-resolution style transfer? The original optimization-based algorithm by Gatys et al. is obviously capable of producing high-resolution results, but it’s a slow process, so it isn’t viable for practical use.

What I’ve seen is that all pretrained neural style transfer models are trained with low-resolution images. For example, the TensorFlow example is trained with 256×256 style images and 384×384 content images. The example explains that the content size can be arbitrary, but if you use 720×720 images or larger, the quality drops a lot, showing only small patterns of the style massively repeated. If you upscale the content and style sizes accordingly, the result is even worse; it vanishes. Here are some examples of what I mean:

[image] The original 384×384 result with a 250×250 style size.

[image] The 1080×1080 result with a 250×250 style size. Notice that it just repeats a lot of those small yellow circles.

[image] The 1080×1080 result with a 700×700 style size. Awful result.

So my question is: is there a way to train any of these models with size invariance? I don’t mind training the model myself, but I don’t know how to get good, fast results at arbitrary sizes.


#StackBounty: #neural-networks #conv-neural-network How to add appropriate noise to a neural network with constant weights so that back…

Bounty: 50

I have a neural network in a synthetic experiment where scale matters and I do not wish to remove it, and where the initial network is initialized with a prior that is non-zero and equal everywhere.

How do I add noise appropriately so that it trains well with the gradient descent rule?

$$ w^{<t+1>} := w^{<t>} - \eta \nabla_W L(W^{<t>}) $$
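For concreteness, by "adding noise" I mean perturbing the constant prior with a small zero-mean term before training, something like this NumPy sketch (the layer shape and the 0.01 scale are placeholders, not values I am committed to):

```python
import numpy as np

rng = np.random.default_rng(0)
prior = 0.5                       # the constant, non-zero prior (placeholder value)
shape = (64, 32)                  # hypothetical layer shape
noise_scale = 0.01 * abs(prior)   # noise kept small relative to the prior's scale
W = prior + rng.normal(0.0, noise_scale, size=shape)
# the weights now differ slightly, so gradients are no longer identical everywhere
```

The question is whether a perturbation like this is appropriate here, and how to choose its scale.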



#StackBounty: #neural-networks #conv-neural-network Convolutional neural network fails even when given answer

Bounty: 50

I was having problems with a CNN predicting true for everything regardless of input. Taking advice from this forum, I simplified the problem by including the output in the input, and it’s still unable to make the prediction correctly! The input shape is (99, 25, 2); the output boolean is in the 3rd dimension of the input.

Here’s an example of 1 sample of the input: https://pastebin.com/jCVU3brn to predict the output as 0.

def CNN(train_X, train_y, test_X, test_y):

    model = Sequential([
        Conv2D(30, kernel_size=3, activation="relu", input_shape=(99, 25, 2)),
        Conv2D(64, kernel_size=3, activation="relu"),
        Flatten(),
        Dense(1, activation='softmax')
    ])

    # Compile the model.

    # Train the model.
    preds = np.round(model.predict(test_X), 0)

    return preds

Model summary:

Layer (type)                 Output Shape              Param #   
conv2d_11 (Conv2D)           (None, 97, 23, 30)        570       
conv2d_12 (Conv2D)           (None, 95, 21, 64)        17344     
flatten_4 (Flatten)          (None, 127680)            0         
dense_4 (Dense)              (None, 1)                 127681    
Total params: 145,595
Trainable params: 145,595
Non-trainable params: 0


#StackBounty: #conv-neural-network Simple understanding of Convolutional Neural Network

Bounty: 50

I want to ask about a basic understanding of CNNs.

Let's say I have 1 dataset (100 pictures) with

  1. Class A (Picture of Cat: 40 pictures)
  2. Class B (Picture of Dog: 60 pictures)

And then, I input the 100 pictures into the CNN and run it.

My question is:

  1. What output should I look at?
  2. Does that mean that if I input a picture (of either a cat or a dog), I can tell whether it is a cat or a dog by looking at the output?
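For concreteness, here is how I imagine reading the output, assuming a 2-unit softmax output layer (the numbers are made up, not from a real model):

```python
import numpy as np

probs = np.array([[0.2, 0.8]])  # e.g. the model's softmax output for one picture
class_names = ['cat', 'dog']
# the predicted class is the one with the highest probability
predicted = class_names[int(np.argmax(probs, axis=1)[0])]
print(predicted)  # -> dog
```

Is this the right way to interpret the network's output?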

Thank you.


#StackBounty: #time-series #multiple-regression #predictive-models #conv-neural-network #keras Use CNN to forecast time series value ac…

Bounty: 50

I would like to use a CNN to predict a value based on some historical data. The concept is simple: I have a numerical value (the label) that depends on some other numerical values (the features). Each set of features is linked to exactly one label, and from a set of features I want to predict the label.
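For concreteness, I build each sample by sliding a window over the history, something like this sketch (the sizes here are placeholders, not my real Params values):

```python
import numpy as np

series = np.random.rand(200, 3)   # 200 time steps, 3 features (placeholder data)
labels = np.random.rand(200)      # one label per time step (placeholder data)
time_frame = 10

# stack overlapping windows: each sample is (time_frame, features)
X = np.stack([series[i:i + time_frame] for i in range(len(series) - time_frame)])
y = labels[time_frame:]           # each window predicts the label that follows it
# X.shape == (190, 10, 3), matching the (time_frame, features) input layer below
```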

This is the model schema I’ve imagined for the situation:

model = tf.keras.Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=(Params.time_frame, Params.features)))

#model.add(tf.keras.layers.Conv1D(kernel_size=2, filters=256, strides=1, use_bias=True, activation='relu', kernel_initializer='VarianceScaling'))
#model.add(tf.keras.layers.AveragePooling1D(pool_size=2, strides=1))
model.add(tf.keras.layers.Conv1D(kernel_size=2, filters=128, strides=1, use_bias=True, activation='relu', kernel_initializer='VarianceScaling'))
model.add(tf.keras.layers.AveragePooling1D(pool_size=2, strides=1))
model.add(tf.keras.layers.Conv1D(kernel_size=2, filters=64, strides=1, use_bias=True, activation='relu', kernel_initializer='VarianceScaling', padding='SAME'))
model.add(tf.keras.layers.AveragePooling1D(pool_size=2, strides=1))
model.add(tf.keras.layers.Conv1D(kernel_size=2, filters=32, strides=1, use_bias=True, activation='relu', kernel_initializer='VarianceScaling', padding='SAME'))
model.add(tf.keras.layers.AveragePooling1D(pool_size=2, strides=1))
model.add(tf.keras.layers.Conv1D(kernel_size=2, filters=16, strides=1, use_bias=True, activation='relu', kernel_initializer='VarianceScaling', padding='SAME'))
model.add(tf.keras.layers.AveragePooling1D(pool_size=2, strides=1))
model.add(tf.keras.layers.Conv1D(kernel_size=2, filters=8, strides=1, use_bias=True, activation='relu', kernel_initializer='VarianceScaling', padding='SAME'))
model.add(tf.keras.layers.AveragePooling1D(pool_size=2, strides=1))
model.add(tf.keras.layers.Conv1D(kernel_size=2, filters=4, strides=1, use_bias=True, activation='relu', kernel_initializer='VarianceScaling', padding='SAME'))


model.add(tf.keras.layers.Dense(units=128, kernel_initializer='VarianceScaling',activation='relu'))
model.add(tf.keras.layers.Dense(units=128, kernel_initializer='VarianceScaling',activation='relu'))
model.add(tf.keras.layers.Dense(units=1, kernel_initializer='VarianceScaling',activation=tf.keras.activations.sigmoid))

model.compile(optimizer=tf.keras.optimizers.Adam(0.0001), loss=tf.keras.losses.mean_squared_error, metrics=['accuracy'])

I’m using TensorFlow with Keras. Since I’m working with a time series, I’ve used 1-D conv layers. At the end, I’ve tried to preserve the historical information with two fully connected Dense layers.

Now when I try to train the model, I notice that the accuracy metric never updates after the first two epochs:

Train on 4889 samples
Epoch 1/50
4889/4889 [==============================] - 1s 151us/sample - loss: 0.0416 - accuracy: 2.0454e-04
Epoch 2/50
4889/4889 [==============================] - 0s 80us/sample - loss: 0.0011 - accuracy: 4.0908e-04
Epoch 3/50
4889/4889 [==============================] - 0s 79us/sample - loss: 8.2775e-04 - accuracy: 4.0908e-04
Epoch 50/50
4889/4889 [==============================] - 0s 86us/sample - loss: 4.3382e-04 - accuracy: 4.0908e-04

The accuracy is also very low, so I think there must be a problem with the model structure. My knowledge of ML is currently basic, so I need advice to keep me headed in the right direction.
