#StackBounty: #python-3.x #tensorflow #tensorflow-datasets Scaling tf.io.gfile.GFile over 100MB/s throughput

Bounty: 50

To keep a GPU fully utilized during training I need to feed about 250 MB/s of raw data to the GPU (the data is incompressible). I am accessing the data over a fast network that can deliver well over 2 GB/s without a problem. Python’s GIL makes it rather hard to get those speeds into the same process that runs TensorFlow without negatively impacting the training loop. Python 3.8’s shared memory may alleviate this, but it isn’t supported by TensorFlow just yet.

So I’m using tf.io.gfile.GFile to read data over the network (the data is stored behind a high-bandwidth S3-compatible interface). The value of GFile is that it doesn’t hold the GIL during reads, and thus plays nicely with the training loop. To achieve high throughput, the network IO needs significant parallelization.
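For reference, a single read through GFile looks like this (a minimal sketch; the path is a placeholder):

```python
import tensorflow as tf

def read_blob(path: str) -> bytes:
    # GFile dispatches to TensorFlow's C++ filesystem layer (local paths,
    # s3://, gs://, ...), so the read itself happens outside the GIL.
    with tf.io.gfile.GFile(path, "rb") as f:
        return f.read()
```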

I only seem to be able to get about 75–100 MB/s out of this approach, though.

I’ve timed two approaches:

  • Create a tf.data.Dataset and use tf.data.Dataset.map(mymapfunc, num_parallel_calls=50) (I’ve tried many values of num_parallel_calls including AUTOTUNE).
  • Create a function that reads data using tf.io.gfile.GFile and simply run it using multiple threads in a concurrent.futures.ThreadPoolExecutor, attempting thread counts up to about 100 (there’s no improvement above about 20, and eventually more threads slow it down).
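Roughly, the two timed approaches look like this (a sketch; `paths`, the worker count, and the parallelism values are illustrative):

```python
import concurrent.futures

import tensorflow as tf

def read_blob(path):
    # The underlying network/disk read releases the GIL, so threads overlap.
    with tf.io.gfile.GFile(path, "rb") as f:
        return f.read()

# Approach 2: plain thread pool over GFile reads.
def read_all(paths, workers=20):
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_blob, paths))

# Approach 1: tf.data pipeline; tf.py_function hops back into Python
# for every read, which is where the GIL can re-enter the picture.
def make_dataset(paths, parallel=50):
    def _load(path_t):
        def _read(p):
            return read_blob(p.numpy().decode())
        return tf.py_function(_read, [path_t], tf.string)
    return tf.data.Dataset.from_tensor_slices(paths).map(
        _load, num_parallel_calls=parallel)
```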

In both cases I’m topping out at 75-100 MB/sec.

I’m wondering if there’s a reason for GFile to hit an upper limit that is perhaps more obvious to someone else.

I’m also making an assumption I should validate: that tf.io.gfile.GFile operates in numpy/Python land. In both cases above I’m invoking the GFile operations from Python (in the tf.data.Dataset case via tf.py_function). If GFile is meant to run more efficiently as part of the graph operations, I’m unaware of this and need to be corrected.
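For what it’s worth, one way to keep the reads entirely inside the graph (no tf.py_function, no Python in the hot path) is tf.io.read_file, which is a regular op executed by TensorFlow’s C++ runtime. A sketch, with the parallelism values as placeholder assumptions:

```python
import tensorflow as tf

def make_dataset(paths, parallel=tf.data.AUTOTUNE):
    # tf.io.read_file goes through the same filesystem registry as GFile
    # (local paths, s3://, gs://), but executes as a graph op in C++, so
    # tf.data can parallelize and prefetch it without touching the GIL.
    return (tf.data.Dataset.from_tensor_slices(paths)
            .map(tf.io.read_file, num_parallel_calls=parallel)
            .prefetch(tf.data.AUTOTUNE))
```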

