Researchers at MIT's Quantum Photonics Laboratory have developed the Digital Optical Neural Network (DONN), a prototype deep-learning inference accelerator that uses light to transmit activation and weight data. At the cost of a few percentage points of accuracy, the system can achieve a data-transmission energy advantage of up to 1000x over traditional electronic devices.
The team described the system and several experiments in a paper published in Nature's Scientific Reports. By using optical signals instead of electric currents, DONN consumes a constant amount of energy to transmit data between layers of the neural network; by contrast, an electronic accelerator chip's energy consumption increases with transmission distance. This allows DONN to scale to handle larger deep learning models while keeping energy costs low. DONN requires 3 femtojoules (fJ) to perform a single 8-bit multiply-and-accumulate (MAC) operation, compared to more than 1,000 fJ for an electronic chip. According to the researchers,
[T]he efficient optical data distribution provided by the DONN architecture will become critical for continued growth of DNN performance through increased model sizes and greater connectivity.
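The impact of these per-operation figures compounds with model size. The back-of-the-envelope Python sketch below uses the per-MAC energy numbers quoted above; the model MAC counts are illustrative assumptions, not figures from the paper.

```python
# Rough comparison of data-movement energy per inference, using the per-MAC
# figures quoted above (3 fJ optical vs. >1,000 fJ electronic).
# The model MAC counts below are illustrative assumptions.

FJ_PER_MAC_OPTICAL = 3          # femtojoules, DONN (as reported)
FJ_PER_MAC_ELECTRONIC = 1_000   # femtojoules, electronic accelerator (as reported)

MODELS = {
    "small CNN (~100M MACs)": 1e8,
    "ResNet-50-class model (~4G MACs)": 4e9,
    "large transformer (~100G MACs)": 1e11,
}

for name, macs in MODELS.items():
    optical_uj = macs * FJ_PER_MAC_OPTICAL * 1e-9      # fJ -> microjoules
    electronic_uj = macs * FJ_PER_MAC_ELECTRONIC * 1e-9
    print(f"{name}: optical ~{optical_uj:.1f} uJ, "
          f"electronic ~{electronic_uj:.1f} uJ "
          f"({electronic_uj / optical_uj:.0f}x advantage)")
```

Even for the modest model sizes assumed here, the gap grows linearly with the number of operations, which is the scaling argument the researchers make.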
Large deep-learning models require correspondingly large compute and memory resources, for both training and inference; often the models also require accelerator hardware such as GPUs or TPUs to perform the computation in a timely manner. These hardware resources consume a lot of energy, and a large percentage of that is spent on memory access and data movement. This pushes chip designers to keep memory as close as physically possible to the computation elements, as these energy costs increase with distance.
However, optical data transmission does not incur the same distance-dependent energy costs; this is one reason fiber optics has become a major conduit for long-distance data transmission. While optical computing, especially for digital signal processing, has been an active area of research for decades, the development of photonic integrated circuits has spurred renewed interest in applying it to deep learning. Much of this work focuses on optical implementations of the core linear algebra operations used by neural networks. For example, in 2019 the MIT team published a paper describing a system that encodes the input and weight values as light intensities and uses an optical receiver to compute the values' product.
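As a rough illustration of how a receiver can compute a product in the analog domain (this is not necessarily the authors' exact scheme), a pair of detectors measuring the intensities of the sum and difference of two amplitude-encoded signals recovers their product, since (a + b)^2 - (a - b)^2 = 4ab. A minimal NumPy sketch:

```python
import numpy as np

# Illustrative sketch (not the authors' exact scheme): recovering the product
# of two amplitude-encoded signals with a balanced, homodyne-style detector pair.
# Photodetectors measure intensity, and the identity
# (a + b)**2 - (a - b)**2 == 4*a*b yields a value proportional to a*b.

rng = np.random.default_rng(0)
activations = rng.uniform(0, 1, size=8)   # assumed field-amplitude encoding
weights = rng.uniform(0, 1, size=8)

intensity_sum = (activations + weights) ** 2      # detector 1: |E_x + E_w|^2
intensity_diff = (activations - weights) ** 2     # detector 2: |E_x - E_w|^2
products = (intensity_sum - intensity_diff) / 4   # proportional to x * w

assert np.allclose(products, activations * weights)
print("recovered dot product:", products.sum())
```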
That approach is essentially an analog computation, which can be noisy and can therefore degrade the accuracy of the neural-network model's output. By contrast, DONN retains digital computation and instead uses optical pathways to "fan out" the neural network's activation values and weights to electronic multiply-accumulate operators. Although digital computation avoids the noise sources of an analog method, the optical transmission still introduces bit errors. In a series of experiments, the team determined the bit error rate and measured its effect on model accuracy for an MNIST image classification task. Although some of the bit errors can be mitigated with error-correction schemes, the worst-case effect on accuracy was less than 3 percentage points.
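To get a feel for what transmission bit errors do to 8-bit data, the following Python sketch flips individual bits of quantized activations at a given bit error rate and reports the resulting distortion; it is a generic simulation, not the team's experimental setup.

```python
import numpy as np

# Generic simulation of transmission bit errors on 8-bit quantized activations;
# an illustrative sketch, not the team's experimental setup.

def inject_bit_errors(values_u8: np.ndarray, bit_error_rate: float,
                      rng: np.random.Generator) -> np.ndarray:
    """Flip each of the 8 bits of every value independently with probability bit_error_rate."""
    corrupted = values_u8.copy()
    for bit in range(8):
        flips = rng.random(values_u8.shape) < bit_error_rate
        corrupted ^= (flips.astype(np.uint8) << bit)
    return corrupted

rng = np.random.default_rng(42)
activations = rng.integers(0, 256, size=10_000, dtype=np.uint8)

for ber in (1e-4, 1e-3, 1e-2):
    noisy = inject_bit_errors(activations, ber, rng)
    mean_abs_err = np.abs(noisy.astype(int) - activations.astype(int)).mean()
    print(f"BER={ber:.0e}: mean absolute error per activation ~ {mean_abs_err:.2f} / 255")
```

A flip in a high-order bit distorts a value far more than one in a low-order bit, which is why the accuracy impact depends on more than the raw bit error rate and why error-correction schemes can recover part of the loss.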
In addition to the ongoing research at the Quantum Photonics Laboratory, MIT has spun out two startups that are working on optical deep-learning accelerators. One of these, Lightmatter, is developing a prototype system that implements matrix multiplication optically. In a discussion about Lightmatter's prototype on Hacker News, one commenter stated:
But to put massive neural networks orders of magnitude larger than GPT-3 into a robot requires a significant step change in the efficiency and scale of neural network compute...their tech could be a boon for machine learning and robotics some day.
MIT has open-sourced its code for running image classification experiments on the DONN; the code is available on GitHub.