The TensorFlow project announced the release of version 2.3.0, featuring new mechanisms for reducing input-pipeline bottlenecks, Keras layers for preprocessing, and memory profiling.
TensorFlow developer advocate Josh Gordon outlined the highlights of the new release in a recent blog post. The tf.data package includes a new service API that distributes input preprocessing across a cluster of worker machines, increasing the throughput of data during training. Additionally, a snapshot API can persist the results of the preprocessing pipeline to disk, reducing the work required on subsequent training runs. An experimental Keras Preprocessing Layers API allows some preprocessing operations to be incorporated directly into deep-learning models, simplifying deployment. The TF Profiler includes new tools for memory profiling and Python tracing to assist in debugging. Gordon commented on Twitter that TensorFlow 2.3 is a "solid, user-focused release."
The tf.data package was introduced in TensorFlow version 1.4 as a way to define input preprocessing, or extract-transform-load (ETL), pipelines. This allowed the training machine's CPU to perform I/O and compute activities on a batch of data, such as loading from disk and transforming features, in parallel with the GPU operating on the previous batch of data. In some cases, however, preprocessing a batch may take longer than the GPU's work on it, leaving the GPU idle. The new release provides a mechanism for distributing the preprocessing to a cluster of machines, for example on Google Kubernetes Engine. For training scenarios where the preprocessing is very compute-intensive, the new release also provides a snapshot API to persist the preprocessed data to disk. The snapshot API also detects changes to the input pipeline, such as modified transform code, and recomputes the snapshot when needed.
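As a rough sketch of how a pipeline might opt into these features, the example below applies the tf.data service and (alternatively) the snapshot transformation; the dispatcher address, snapshot path, and the parsing logic are illustrative assumptions, not taken from the release notes.

```python
import tensorflow as tf

# Hypothetical input pipeline; the file name and parsing step are placeholders.
dataset = tf.data.TFRecordDataset("train.tfrecord")
dataset = dataset.map(lambda x: tf.io.parse_tensor(x, tf.float32),
                      num_parallel_calls=tf.data.experimental.AUTOTUNE)

# Offload preprocessing to a tf.data service cluster; the dispatcher
# address "grpc://dispatcher:5000" is an assumed placeholder.
dataset = dataset.apply(tf.data.experimental.service.distribute(
    processing_mode="parallel_epochs",
    service="grpc://dispatcher:5000"))

# Alternatively, persist the preprocessed elements to disk so later runs
# can reuse them; the snapshot path is a placeholder.
# dataset = dataset.apply(tf.data.experimental.snapshot("/tmp/train_snapshot"))

dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)
```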
Any preprocessing logic applied to training data must also be applied to data fed to the trained model at inference time in production. This can complicate deployment, as the input-pipeline code must be replicated in the system that serves the model. With the new experimental Keras Preprocessing Layers API, some preprocessing tasks can be included in the model definition and are applied automatically to data at inference time. For example, the new TextVectorization layer encapsulates the logic for tokenizing and vectorizing strings. Other new layers are available for image processing and for handling categorical data.
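A brief sketch of how such a layer might be used inside a model; the vocabulary size, sequence length, toy corpus, and downstream layers are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

# Toy corpus; in practice this would be the real training text.
train_text = np.array(["the quick brown fox", "jumped over the lazy dog"])

# Configure the layer; max_tokens and output_sequence_length are illustrative.
vectorizer = TextVectorization(max_tokens=1000, output_sequence_length=8)
vectorizer.adapt(train_text)  # learn the vocabulary from the training data

# Because the layer is part of the model, raw strings can be fed directly
# at inference time, with no separate preprocessing step to replicate.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,), dtype=tf.string),
    vectorizer,
    tf.keras.layers.Embedding(input_dim=1000, output_dim=16),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```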
The TF Profiler, introduced in TensorFlow 2.2, has two new features. A memory profiler can monitor memory usage during training and display results in the Profiler dashboard "with no extra work." The memory profiler can assist with debugging out-of-memory and memory fragmentation issues. The Profiler also includes a new Python Tracer to help view the Python call stack during execution.
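A minimal sketch of programmatic profiling with the Python tracer enabled; the log directory and the profiled workload are placeholders chosen for illustration.

```python
import tensorflow as tf

# Enable the Python tracer via ProfilerOptions; the log directory is a placeholder.
options = tf.profiler.experimental.ProfilerOptions(python_tracer_level=1)
tf.profiler.experimental.start("logs/profile", options=options)

# Placeholder workload standing in for a training step.
x = tf.random.normal([1024, 1024])
for _ in range(10):
    x = tf.matmul(x, x)

tf.profiler.experimental.stop()
# Memory usage and the Python call stack can then be inspected
# in TensorBoard's Profile dashboard.
```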
TensorFlow's main competitor, PyTorch, also had a recent release featuring, among other things, a memory profiler. While TensorFlow was still the dominant framework in industry as of late last year, a Reddit discussion about the new release included several comments expressing frustration with TensorFlow. One user noted:
As an industry professional and not a researcher, TensorFlow X is still great and one of [TensorFlow's] biggest strengths but I find people saying this increasingly online - becoming sick of TensorFlow.
The TensorFlow source code and 2.3 release notes are available on GitHub.