The expansive world of deep neural networks is at the forefront of today’s most sophisticated machine-learning applications. However, the rapid growth in these models is testing the capabilities of conventional electronic computing hardware. Enter photonic hardware — a promising alternative that harnesses light to carry out machine-learning computations, providing a potential solution that is both faster and more energy-efficient.
Yet, the challenge remains: certain neural network computations cannot be executed entirely on photonic devices, frequently necessitating off-chip electronics, which can slow down processes and reduce efficiency. A new breakthrough from scientists at MIT and collaborators aims to tackle these limitations. They’ve devised a state-of-the-art photonic chip capable of executing all essential computations for deep neural networks directly on-chip.
This innovative optical device performs crucial computations necessary for machine-learning classification tasks in under half a nanosecond while maintaining an impressive accuracy exceeding 92 percent — results that rival traditional computational methods. The chip consists of interconnected modules that create an optical neural network and is produced using commercially viable foundry techniques, paving the way for potential scalability and integration into existing electronic systems.
Looking ahead, this photonic processor may transform the landscape of deep learning, enhancing the efficiency and speed of computationally intensive applications such as lidar technology, cutting-edge scientific research in fields like astronomy and particle physics, as well as rapid telecommunications.
“In many scenarios, the speed of obtaining results is just as critical as model performance,” explains Saumil Bandyopadhyay, a visiting scientist in the Quantum Photonics and AI Group. “Now that we have a full optical system capable of executing neural networks on a nanosecond scale, we can begin exploring more advanced applications and algorithms.”
Bandyopadhyay collaborated with an array of experts in this research, including Alexander Sludds, Nicholas Harris, Darius Bunandar, and others from institutions such as the University of Massachusetts and NTT Research. Their research is published today in Nature Photonics.
Advancing Machine Learning Through Photonics
At the core of deep neural networks are numerous layers of interconnected nodes or neurons, which process input data to create output. A key operation in these networks involves linear algebra for matrix multiplication, essential for transforming data through various layers. Beyond these linear functions, nonlinear operations also play a vital role in enabling networks to discern intricate patterns — through activation functions, for instance — offering the capability to tackle complex problems.
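The interplay described above can be sketched numerically. The short Python example below (an illustration of the general principle, not the photonic hardware) shows a single network layer as a matrix multiplication followed by a ReLU activation; without the nonlinear step, stacked layers would collapse into one linear map and could not capture intricate patterns.

```python
import numpy as np

def layer(x, W, b):
    """One neural-network layer: linear algebra (matrix multiply)
    followed by a nonlinear activation function (ReLU)."""
    z = W @ x + b            # linear transformation of the input
    return np.maximum(z, 0)  # nonlinearity lets the network model complex patterns

# Small two-layer example with arbitrary random weights.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((2, 4)), rng.standard_normal(2)

x = np.array([0.5, -1.0, 2.0])   # input data
hidden = layer(x, W1, b1)        # first layer transforms the input
output = layer(hidden, W2, b2)   # second layer produces the output
print(output.shape)  # (2,)
```

Each layer shrinks or reshapes the data as it passes through; the activation after each matrix multiply is what lets the stack represent more than a single linear function.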
In a significant prior advancement in 2017, Dirk Englund's team, working alongside Marin Soljačić's lab, successfully demonstrated an optical neural network on a single chip that could handle matrix multiplication using light. However, the challenge remained: nonlinear operations had to revert to electrical signals sent to processors, slowing the system down.
“Creating optical nonlinearity presents challenges, as photons interact only weakly with one another,” Bandyopadhyay notes. “This makes it energy-intensive to trigger nonlinear effects, complicating the development of scalable systems.”
The team addressed these hurdles with nonlinear optical function units (NOFUs), which merge electronics and optics to seamlessly incorporate nonlinear operations onto the chip.
The design of their optical deep neural network involved three layers of devices capable of performing both linear and nonlinear functions. Initially, the system encodes deep neural network parameters into light. Then, an array of programmable beamsplitters, established in their previous research, executes matrix multiplication on these inputs. Following this, programmable NOFUs implement nonlinear functions by diverting a small fraction of the light to photodiodes, which convert the optical signals into electrical current, all while consuming minimal energy.
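The three-stage flow above can be mimicked in a toy numerical model. In the sketch below, the function names, the tap ratio, and the saturating response are illustrative assumptions, not the actual device parameters: light amplitudes stand in for encoded inputs, an orthogonal matrix stands in for a programmed beamsplitter mesh, and the NOFU stand-in taps off a small fraction of optical power to drive a nonlinearity.

```python
import numpy as np

def encode(values):
    """Encode input data as optical field amplitudes (here: a plain vector)."""
    return np.asarray(values, dtype=float)

def beamsplitter_mesh(field, U):
    """The programmable beamsplitter array acts as a matrix multiplication."""
    return U @ field

def nofu(field, tap_ratio=0.1):
    """NOFU-style sketch (assumed form): divert a small fraction of the light
    to a photodiode and let the detected power modulate what is transmitted."""
    tapped_power = tap_ratio * field**2         # photodiode measures intensity
    transmitted = np.sqrt(1 - tap_ratio) * field
    return transmitted / (1 + tapped_power)     # saturating nonlinear response

# A random orthogonal matrix stands in for one programmed mesh setting.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))

signal = encode([1.0, 0.5, -0.5, 0.2])      # stage 1: encode parameters into light
signal = beamsplitter_mesh(signal, U)       # stage 2: matrix multiplication
out = nofu(signal)                          # stage 3: on-chip nonlinearity
print(out.shape)  # (4,)
```

The point of the sketch is the ordering: the signal stays "optical" through both the linear and nonlinear stages, and only a small tapped fraction is ever converted to current, which is why the chip can keep latency and energy use low.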
“We remain in the optical realm continuously until the final reading, achieving ultra-low latency,” Bandyopadhyay emphasizes.
This capability to maintain low latency facilitates efficient in situ training, a process that typically demands substantial energy in digital hardware. “This holds particular benefits for systems focusing on real-time optical signal processing, such as in navigation or telecommunications,” he adds.
The photonic system achieved an outstanding 96 percent accuracy during training and maintained over 92 percent accuracy during inference, comparable to mainstream hardware. Moreover, the chip executes essential calculations in under half a nanosecond.
“This research presents computing not just as a process of mapping inputs to outputs, but as a transformation that can be adapted to new architectures of linear and nonlinear physics, ushering in a new realm of computational efficiency,” states Englund.
All components were fabricated using conventional processes typical of CMOS chip manufacturing, suggesting that this technology could be mainstreamed using existing, reliable methods that minimize fabrication errors.
Future endeavors will focus on scaling this technology and integrating it with practical electronics, such as cameras and telecommunications systems. Additionally, the researchers are eager to investigate algorithms that optimize the advantages of optics for speedier and more energy-efficient training systems.
This pioneering research received support from the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, and NTT Research.
Photo credit & article inspired by: Massachusetts Institute of Technology