To train a neural network, I modified code I found on YouTube. It looks as follows:
    import os
    import random

    import cv2
    import numpy as np

    def data_generator(samples, batch_size, shuffle_data=True, resize=224):
        num_samples = len(samples)
        while True:
            random.shuffle(samples)
            for offset in range(0, num_samples, batch_size):
                batch_samples = samples[offset: offset + batch_size]
                X_train = []
                y_train = []
                for batch_sample in batch_samples:
                    img_name = batch_sample[0]
                    label = batch_sample[1]
                    img = cv2.imread(os.path.join(root_dir, img_name))
                    # img, label = preprocessing(img, label, new_height=224, new_width=224, num_classes=37)
                    img = preprocessing(img, new_height=224, new_width=224)
                    label = my_onehot_encoded(label)
                    X_train.append(img)
                    y_train.append(label)
                X_train = np.array(X_train)
                y_train = np.array(y_train)
                yield X_train, y_train
Now, I tried to train a neural network using this code. The training sample size is 105,000 image files, each containing 8 characters out of 37 possibilities (A–Z, 0–9, and blank space).
I used a relatively small batch size (32; I think that is already too small) to make it more efficient, but it nevertheless took forever to train even a quarter of the first epoch: I had 826 steps per epoch (steps_per_epoch = num_train_samples // batch_size), and 199 steps took 90 minutes.
The following functions are included in the data generator:
    def shuffle_data(data):
        random.shuffle(data)  # shuffles in place; random.shuffle returns None
        return data
I don’t think this function can be made any more efficient, or excluded from the generator.
    def preprocessing(img, new_height, new_width):
        # note: cv2.resize expects the target size as (width, height)
        img = cv2.resize(img, (new_width, new_height))
        img = img / 255
        return img
For preprocessing/resizing the data I use this code to bring the images to a uniform size of e.g. (224, 224, 3). I think this part of the generator takes the most time, but I don’t see a way to move it out of the generator (my memory would be full if I resized all the images outside the batches).
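A possible middle ground (a sketch, not the original code: the cache file name and shapes are made up for illustration) is to resize and normalize everything once, write the result to a memory-mapped file on disk, and let the generator read slices from it instead of re-resizing every epoch:

```python
import numpy as np

# Illustration only: small sample count so the example runs quickly;
# in practice this would be (num_train_samples, 224, 224, 3).
num_samples, h, w, c = 8, 224, 224, 3

# One-time preprocessing pass: write each resized, normalized image
# into a memory-mapped array on disk instead of keeping it all in RAM.
cache = np.memmap('img_cache.dat', dtype=np.float32, mode='w+',
                  shape=(num_samples, h, w, c))
for i in range(num_samples):
    # stand-in for cv2.imread(...) + resize + /255 on a real image
    cache[i] = np.random.rand(h, w, c).astype(np.float32)
cache.flush()

# Later, a generator can read one batch at a time without re-resizing:
cache_ro = np.memmap('img_cache.dat', dtype=np.float32, mode='r',
                     shape=(num_samples, h, w, c))
batch = np.array(cache_ro[0:4])  # copy a single batch into RAM
print(batch.shape)  # (4, 224, 224, 3)
```

Whether this beats per-batch resizing depends on disk speed, but it avoids holding all 105,000 images in memory at once.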
    # One-hot encoding of the labels
    def my_onehot_encoded(label):
        # define universe of possible input values
        characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
        # define a mapping of chars to integers
        char_to_int = dict((c, i) for i, c in enumerate(characters))
        # integer encode input data
        integer_encoded = [char_to_int[char] for char in label]
        # one hot encode
        onehot_encoded = list()
        for value in integer_encoded:
            character = [0 for _ in range(len(characters))]
            character[value] = 1
            onehot_encoded.append(character)
        return onehot_encoded
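Incidentally, the same encoding can be written more compactly with NumPy (a sketch using the same character set; `onehot_np` is a name I made up, not part of the original code):

```python
import numpy as np

characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
char_to_int = {c: i for i, c in enumerate(characters)}

def onehot_np(label):
    # row k of the identity matrix is the one-hot vector for character k
    return np.eye(len(characters), dtype=int)[[char_to_int[ch] for ch in label]]

print(onehot_np('A0').shape)  # (2, 37)
```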
I think this part offers one opportunity to make it more efficient. I am thinking about moving this code out of the generator and producing the array y_train in advance, so that the generator does not have to one-hot encode the labels every time.
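That idea could be sketched roughly like this (assuming each sample carries its label string; `encode_all_labels` is a hypothetical helper, not existing code):

```python
import numpy as np

characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
char_to_int = {c: i for i, c in enumerate(characters)}

def encode_all_labels(labels):
    # one-time pass over all labels:
    # shape (num_samples, label_length, num_characters)
    encoded = np.zeros((len(labels), len(labels[0]), len(characters)),
                       dtype=np.float32)
    for i, label in enumerate(labels):
        for j, ch in enumerate(label):
            encoded[i, j, char_to_int[ch]] = 1.0
    return encoded

# The generator would then just slice this array per batch instead of
# calling my_onehot_encoded for every sample in every epoch.
y_all = encode_all_labels(['AB', 'C '])
print(y_all.shape)  # (2, 2, 37)
```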
What do you think? Or should I maybe go for a completely different approach?