Uber open-sourced Neuropod, an abstraction layer for machine learning frameworks that lets researchers build models in the framework of their choice while reducing integration effort, so that the same production system can swap out models implemented in different frameworks. Neuropod currently supports several frameworks, including TensorFlow, PyTorch, Keras, and TorchScript.
Vivek Panyam, senior autonomy engineer at Uber's Advanced Technologies Group (ATG), described Neuropod in a blog post. ATG researchers have adopted several deep-learning frameworks over the years, including Caffe2, TensorFlow, PyTorch, and JAX. Using a new framework in production required integration work on several different system components. ATG engineers developed Neuropod as a way to reduce the integration burden. According to Panyam,
Over the last year, we have deployed hundreds of Neuropod models across Uber ATG, Uber AI, and the core Uber business.
ATG's deep-learning framework of choice has evolved over time, as researchers found that different frameworks may be more or less suitable for a given task. In 2016, ATG's primary framework was Caffe2. Over the next several years, the team added support for TensorFlow, PyTorch, JAX, and others. Supporting a new framework required integration with each component in Uber's infrastructure and tooling, and often introduced problems such as dependency conflicts or memory corruption. It was also difficult to compare the performance of models from different frameworks intended to solve the same problem, since each model had framework-specific metrics pipelines.
The solution was to build Neuropod, an abstraction layer for models. To use Neuropod, model developers first create a "problem API" definition, or spec, that defines the model's interface, such as the 2D object-detection spec below. Any model that implements the spec can be executed by Neuropod, which invokes the appropriate framework. This lets developers swap out different models, even ones written in different frameworks, without rewriting the consuming code. Neuropod currently supports consuming models only from Python and C++ code, but Panyam says it is "straightforward" to add support for additional languages.
# Example spec for 2D object detection (source: https://neuropod.ai/)
INPUT_SPEC = [
# BGR image
{"name": "image", "dtype": "uint8", "shape": (1200, 1920, 3)},
]
OUTPUT_SPEC = [
# shape: (num_detections, 4): (xmin, ymin, xmax, ymax)
# These values are in units of pixels. The origin is the top left corner
# with positive X to the right and positive Y towards the bottom of the image
{"name": "boxes", "dtype": "float32", "shape": ("num_detections", 4)},
# The list of classes that the network can output
# This must be some subset of ['vehicle', 'person', 'motorcycle', 'bicycle']
{"name": "supported_object_classes", "dtype": "string", "shape": ("num_classes",)},
# The probability of each class for each detection
# These should all be floats between 0 and 1
{"name": "object_class_probability", "dtype": "float32", "shape": ("num_detections", "num_classes")},
]
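Any model that implements this spec can then be executed through the same consuming code, regardless of the framework it was built in. The snippet below is a minimal sketch of what consuming such a model from Python might look like, based on the load_neuropod API shown in the project's documentation; the model path and the zero-filled test image are placeholder assumptions:

# Loading and running a model that implements the spec above
import numpy as np
from neuropod.loader import load_neuropod

# Placeholder path to a packaged model (a "neuropod"); the package could wrap
# a TensorFlow, PyTorch, Keras, or TorchScript implementation of the spec
detector = load_neuropod("/models/2d_object_detection.neuropod")

# A dummy 1200 x 1920 BGR image matching INPUT_SPEC
image = np.zeros((1200, 1920, 3), dtype=np.uint8)

# Inputs and outputs follow the names, dtypes, and shapes in the spec,
# not the underlying framework's API
results = detector.infer({"image": image})
boxes = results["boxes"]                      # shape: (num_detections, 4)
probs = results["object_class_probability"]   # shape: (num_detections, num_classes)

Because the consuming code depends only on the spec, swapping in a different model is a matter of pointing load_neuropod at a different package.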
The models themselves are deployed as a package, called a "neuropod," which can also contain test code and custom framework operations. Models can be executed either in the calling process or out-of-process. The Neuropod documentation highlights several potential benefits of out-of-process execution, such as preventing multiple models from contending for the Python global interpreter lock (GIL).
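Packaging is done with framework-specific packagers in the neuropod Python library. The sketch below follows the simple "addition model" pattern from the project's documentation for a TorchScript model; the model, output path, and test data are illustrative. The bundled test data lets the packager run a self-test of the exported model at packaging time:

# Packaging a simple TorchScript model as a neuropod, with bundled test data
from typing import Dict
import numpy as np
import torch
import torch.nn as nn
from neuropod.packagers import create_torchscript_neuropod

class AdditionModel(nn.Module):
    # A trivial placeholder model that adds two tensors
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> Dict[str, torch.Tensor]:
        return {"out": x + y}

create_torchscript_neuropod(
    neuropod_path="/tmp/addition_model.neuropod",
    model_name="addition_model",
    module=torch.jit.script(AdditionModel()),
    input_spec=[
        {"name": "x", "dtype": "float32", "shape": ("batch_size",)},
        {"name": "y", "dtype": "float32", "shape": ("batch_size",)},
    ],
    output_spec=[
        {"name": "out", "dtype": "float32", "shape": ("batch_size",)},
    ],
    # Test data packaged with the model; the packager runs the model on
    # test_input_data and verifies the result against test_expected_out
    test_input_data={
        "x": np.arange(5, dtype=np.float32),
        "y": np.arange(5, dtype=np.float32),
    },
    test_expected_out={
        "out": np.arange(5, dtype=np.float32) * 2,
    },
)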
Panyam joined a discussion about Neuropod on Hacker News, answering several questions. One user wondered how Neuropod differs from ONNX, an open interchange format for machine-learning models. Panyam replied:
Neuropod is an abstraction layer so it can do useful things on top of just running models locally. For example, we can transparently proxy model execution to remote machines...Including GPUs in all our cluster machines doesn’t make sense from a resource efficiency perspective so instead, if we proxy model execution to a smaller cluster of GPU-enabled servers, we can get higher GPU utilization while using fewer GPUs.
Panyam also noted that the project's roadmap includes adding support for ONNX as well as TensorRT.
Neuropod joins several other open-source deep-learning projects released by Uber. In 2017, Uber announced Michelangelo, an end-to-end ML platform, and open-sourced Horovod, a distributed deep-learning training framework. Last year, Uber released Ludwig, a code-free deep-learning toolbox. Neuropod's source code is available on GitHub.