#StackBounty: #python #tensorflow #keras Where does keras actually initialize the dataset?

Bounty: 50

I’m trying to understand the implementation of SGD in TensorFlow.

In the constructor (__init__ method) of class TensorLikeDataAdapter, self._dataset is initialized by this line

https://github.com/tensorflow/tensorflow/blob/r2.5/tensorflow/python/keras/engine/data_adapter.py#L346

self._dataset = dataset

I tried to print the value out with this line

print('enumerate_epochs self._dataset', list(self._dataset))

and I got

<_OptionsDataset shapes: ((None, 2), (None,)), types: (tf.float32, tf.float32)>

which seems to indicate that the dataset hasn’t yet been actually loaded.
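
To illustrate the difference between seeing only a description of a dataset and seeing its contents, here is a minimal pure-Python sketch. It is not TensorFlow's implementation: `LazyDataset` is a hypothetical stand-in, and the only assumption is that the real dataset object behaves similarly, in that printing it shows a summary while iterating over it produces the elements.

```python
class LazyDataset:
    """Hypothetical stand-in for a lazily evaluated dataset (not a TensorFlow class)."""

    def __init__(self, rows):
        self._rows = rows
        self.iterations = 0  # counts how many times the data was actually read

    def __repr__(self):
        # Only describes the dataset; no element is read here.
        return "<LazyDataset rows=%d>" % len(self._rows)

    def __iter__(self):
        # Elements are produced only when something iterates over the object.
        self.iterations += 1
        return iter(self._rows)


ds = LazyDataset([(0.1, 0.2), (0.3, 0.4)])
description = repr(ds)  # what print(ds) would show
contents = list(ds)     # what print(list(ds)) would show
print(description)
print(contents)
```

In this sketch, a summary-style printout says nothing about whether the data exists yet; it only means nothing has iterated over the object at that point.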

At the very beginning of the enumerate_epochs method

https://github.com/tensorflow/tensorflow/blob/r2.5/tensorflow/python/keras/engine/data_adapter.py#L1196

I added this line

def enumerate_epochs(self):
    print('enumerate_epochs self._dataset', list(self._dataset))

and I got the actual dataset printed 3 times (I set epoch=3), which means the dataset had already been initialized and randomized somewhere before this point.

I went through the whole data_adapter.py but failed to locate where the dataset is actually initialized.


I also tried this line

  print('data_handler._dataset', data_handler._dataset)
  for epoch, iterator in data_handler.enumerate_epochs():

and I got

data_handler._dataset <_OptionsDataset shapes: ((None, 2), (None,)), types: (tf.float32, tf.float32)>

However, this line

def _truncate_execution_to_epoch(self):
    print('_truncate_execution_to_epoch self._dataset', list(self._dataset))

prints the actual dataset 3 times (epoch=3), which means the dataset is actually initialized somewhere in between, though I can’t work out where that could be!

I also tried printing inside class DataHandler, just before the call to _configure_dataset_and_inferred_steps

print('DataHandler self._dataset', list(self._dataset))
self._configure_dataset_and_inferred_steps(strategy, x, steps_per_epoch,
                                           class_weight, distribute)

and I got this error

AttributeError: 'DataHandler' object has no attribute '_dataset'
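
One possible reading of this error, sketched in plain Python: an attribute that is assigned inside a helper method does not exist until that helper has run. The sketch assumes that `_configure_dataset_and_inferred_steps` is where `self._dataset` gets assigned. The `Handler` class, its `_configure` method, and the `peek_early` flag are hypothetical names used only for illustration.

```python
class Handler:
    """Hypothetical class; it only mimics the ordering inside a constructor."""

    def __init__(self, data, peek_early=False):
        if peek_early:
            # Reading the attribute before it is assigned raises AttributeError.
            _ = self._dataset
        self._configure(data)

    def _configure(self, data):
        # The attribute comes into existence only here.
        self._dataset = list(data)


ok = Handler([1, 2, 3])
print(ok._dataset)

try:
    Handler([1, 2, 3], peek_early=True)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)
```

Under that assumption, moving the print to just after the configuring call would be the natural next experiment.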

Could someone help me see the light at the end of the tunnel?

