Facebook AI Research (FAIR) has released Detectron2, a PyTorch-based computer vision library that brings a series of new research and production capabilities to the popular framework.
Since its release in 2018, the original Detectron object detection platform has become one of FAIR’s most widely adopted open-source projects. While the first Detectron was written in Caffe2, Detectron2 is a ground-up rewrite in PyTorch that adds several new object detection capabilities.
At the time of its initial release, Detectron was a huge boost for the AI community, enabling many to build state-of-the-art object detection models quickly and easily. Yet it had a few limitations that quickly became deal-breakers for many AI practitioners:
- Caffe2 was complicated, so implementing custom object detection models was a big challenge
- Detectron was designed from the beginning for object detection only, not for other computer vision tasks such as semantic segmentation or pose estimation
- As deploying machine learning models to production became a hot topic over the past couple of years, Detectron fell behind because it could not export inference models
Detectron2 was built to tackle those deal-breakers, making for a more robust and modern library. From the Detectron2 team at FAIR:
“We built Detectron2 to meet the research needs of Facebook AI and to provide the foundation for object detection in production use cases at Facebook. We are now using Detectron2 to rapidly design and train the next-generation pose detection models that power Smart Camera, the AI camera system in Facebook’s Portal video-calling devices. By relying on Detectron2 as the unified library for object detection across research and production use cases, we are able to rapidly move research ideas into production models that are deployed at scale.”
The move to PyTorch aligns with the AI community’s growing need and desire for a flexible yet easy-to-use library. PyTorch itself is modular by design, making it far easier to extend than Caffe2. The vast majority of the AI community already uses just two libraries: TensorFlow and PyTorch.
Detectron2 has expanded to handle computer vision tasks beyond object detection, including semantic segmentation, panoptic segmentation, pose estimation, and DensePose. It also ships with pre-trained state-of-the-art models such as Cascade R-CNN, Panoptic FPN, and TensorMask.
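To give a feel for how the multi-task support looks in practice, here is a minimal sketch of loading a pre-trained model for a chosen task. The helper `build_predictor` and the `MODEL_ZOO_CONFIGS` mapping are hypothetical names introduced for illustration; the config paths are model zoo entries as we understand them, and the sketch assumes Detectron2 is installed and the weights can be downloaded.

```python
# Illustrative mapping from task name to a Detectron2 model zoo config.
# The task keys are our own labels, not part of the library.
MODEL_ZOO_CONFIGS = {
    "object_detection": "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml",
    "instance_segmentation": "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",
    "pose_estimation": "COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml",
    "panoptic_segmentation": "COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml",
}


def build_predictor(task, score_threshold=0.5, device="cpu"):
    """Return a DefaultPredictor for `task` (hypothetical helper).

    Raises KeyError if `task` is not in MODEL_ZOO_CONFIGS.
    """
    # Resolve the config first so an unknown task fails fast.
    config_name = MODEL_ZOO_CONFIGS[task]

    # Imported lazily so the mapping above is usable without Detectron2.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Load the architecture settings shipped with the library.
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    # Point at the matching pre-trained checkpoint.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    # Discard low-confidence detections at inference time.
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    cfg.MODEL.DEVICE = device
    return DefaultPredictor(cfg)
```

With Detectron2 installed, something like `outputs = build_predictor("pose_estimation")(image)` should run inference on `image`, where `image` is assumed to be a BGR array such as the one returned by OpenCV's `cv2.imread`. Switching tasks only changes the config name, which is the kind of unified interface the article describes.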
The FAIR team hinted in its official blog post that it plans to release an additional component for the library, Detectron2go, to make it easier to deploy models to production. It is said to include features such as network quantization, model optimization, and formatting for mobile deployment.