Hardware Accelerators for AI
A. Hardware-Accelerator Matched Compression Algorithms
A major research focus of the Stojanović group has been developing hardware-accelerator-matched compression algorithms for deep neural networks (DNNs) so that they can be deployed on edge devices in real time. For example, the team introduced the MPDCompress algorithm, based on matrix permutation decomposition: applying random binary masks during training yields a more efficient block-based implementation and compression.1 Recently, the Stojanović group also introduced BagNet (Berkeley Analog Generator with Layout Optimizer), a new optimization framework.4 BagNet uses a DNN discriminator to screen the candidate designs generated by an evolutionary combinatorial optimizer before they are simulated, reducing the number of simulations required (Figure 1). With this approach, the discriminator achieves at least two orders of magnitude improvement in sample efficiency on several large circuit examples.
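The in-training masking idea can be sketched as follows. This is a minimal illustrative example, not the published MPDCompress implementation: it assumes a block-diagonal binary mask hidden behind random row/column permutations, and the dimensions, block size, and variable names are invented.

```python
import numpy as np

# Illustrative sketch: a fixed random binary mask applied to a weight
# matrix during training forces zeros in known positions, and permuting
# rows/columns gathers the surviving weights into dense blocks that
# hardware can process efficiently.
rng = np.random.default_rng(42)
rows, cols, block = 8, 8, 4

# Build a block-diagonal binary mask, then hide its structure behind
# random row/column permutations -- this is the mask applied in training.
base = np.kron(np.eye(rows // block), np.ones((block, block)))
row_p = rng.permutation(rows)
col_p = rng.permutation(cols)
mask = base[np.ix_(row_p, col_p)]

W = rng.standard_normal((rows, cols))
W_masked = W * mask                      # applied at every training step

# At deployment, the inverse permutations recover the dense blocks, so
# the sparse matrix can be stored and multiplied block by block.
inv_row = np.argsort(row_p)
inv_col = np.argsort(col_p)
W_blocked = W_masked[np.ix_(inv_row, inv_col)]
```

Because the nonzeros land in a few dense blocks after un-permuting, a hardware accelerator can skip the zero blocks entirely instead of handling unstructured sparsity.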
Figure 1. High-level architecture of the BagNet (Berkeley Analog Generator with Layout Optimizer) optimizer.
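The discriminator-before-simulation loop can be sketched as below. Everything here is a toy stand-in: `simulate()` replaces the expensive circuit simulator, and a nearest-neighbour lookup plays the role of BagNet's trained DNN discriminator; the population sizes and mutation scale are invented.

```python
import numpy as np

# Sketch of discriminator-guided evolutionary search: before paying for
# an expensive simulation, a predictor trained on already-simulated
# designs estimates a candidate's merit, and likely losers are rejected
# for free.
rng = np.random.default_rng(3)

def simulate(x):
    """Expensive black-box merit (toy: peak at x = 0.5)."""
    return -float(np.sum((x - 0.5) ** 2))

seen_x = [rng.uniform(0, 1, 4) for _ in range(16)]
seen_y = [simulate(x) for x in seen_x]        # initial simulated pool
n_sims = len(seen_x)

def predicted_merit(x):
    """Stand-in discriminator: merit of the nearest simulated design."""
    d = [np.linalg.norm(x - a) for a in seen_x]
    return seen_y[int(np.argmin(d))]

best = max(seen_y)
median = float(np.median(seen_y))
for _ in range(100):
    parent = seen_x[int(np.argmax(seen_y))]
    child = parent + rng.normal(0.0, 0.2, 4)  # mutate the best design
    if predicted_merit(child) < median:       # predicted loser:
        continue                              #   no simulation spent
    y = simulate(child)                       # only now pay for a sim
    n_sims += 1
    seen_x.append(child)
    seen_y.append(y)
    best = max(best, y)
```

The saving comes from the `continue` branch: every rejected candidate costs one cheap prediction instead of one full simulation, which is where the reported sample-efficiency gains originate.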
B. Physics-Based Digital Optimization
The Yablonovitch group investigates continuous-time analog machines for digital optimization, such as solving the computationally hard Ising Hamiltonian optimization problem, which asks for the spin configuration minimizing the total interaction energy of a given network of coupled magnetic spins. The motivation for designing analog machines for the Ising problem is that physical principles in optics and electrical circuits can achieve orders-of-magnitude speedups over digital algorithms while matching the quality of their solutions. Indeed, physics itself performs optimization in the normal course of dynamical evolution, and Nature provides a number of optimization principles. As an example, the team designed and analyzed an electrical LC-oscillator-based Ising machine (Figure 2) based on the principle of Minimum Entropy Generation.2 The Yablonovitch group also found that this system, as well as many previously developed systems, is in fact a physical implementation of the well-known method of Lagrange multipliers from optimization theory.5 In addition, a much larger class of merit functions beyond the Ising Hamiltonian can be optimized by appropriate modifications of the LC-oscillator-based system. This reinterpretation led the team to conclude that fast analog methods for Lagrange multiplier optimization are now available, with immediate applications in areas such as control systems, operations research, and artificial intelligence.
Figure 2. Schematic of the LC oscillator system for a 2-spin problem.
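The Lagrange-multiplier view can be illustrated with a simple digital simulation rather than the analog circuit itself. This is a hedged sketch, not the paper's LC-oscillator model: it minimizes the Ising energy H(x) = -(1/2) x^T J x subject to the binary constraint x_i^2 = 1 by running gradient descent on the spins and gradient ascent on the multipliers; the step size, iteration count, and clipping are invented numerical choices.

```python
import numpy as np

# Descent-ascent dynamics on the Lagrangian
#   L(x, lam) = -1/2 x^T J x + sum_i lam_i * (x_i^2 - 1),
# so the multipliers lam_i gradually enforce the spin constraint.
rng = np.random.default_rng(0)
n = 8
J = np.triu(rng.choice([-1.0, 1.0], size=(n, n)), 1)
J = J + J.T                       # symmetric +/-1 couplings, zero diagonal

x = rng.uniform(-0.1, 0.1, n)     # continuous "soft spins"
lam = np.ones(n)                  # one Lagrange multiplier per spin
dt = 0.01

for _ in range(20000):
    grad_x = -J @ x + 2.0 * lam * x           # dL/dx
    x = np.clip(x - dt * grad_x, -2.0, 2.0)   # descend in x (clipped for stability)
    lam += dt * (x**2 - 1.0)                  # ascend in lam to enforce x_i^2 = 1

spins = np.where(x >= 0.0, 1.0, -1.0)         # round to a +/-1 configuration
energy = -0.5 * spins @ J @ spins             # Ising energy of the result
```

In the analog machine the same roles are played by physical quantities (oscillator amplitudes as the soft spins, circuit variables as the multipliers), which is what makes the hardware a physical implementation of this textbook method.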
C. Restricted Boltzmann Machine Neural Networks
The Salahuddin group successfully used the Restricted Boltzmann Machine (RBM), a stochastic neural network, to solve NP-hard combinatorial optimization problems efficiently.6 By exploiting the intrinsic parallelism of the RBM architecture on a flexible Field Programmable Gate Array (FPGA)-based accelerator, the team showed that this sampling framework is competitive with state-of-the-art computing machines based on novel physics. In agreement with the findings of the Yablonovitch group,5 this work showed that accelerators based on novel physics offer no scaling advantage, indicating that classical hardware is sufficient for solving these computationally difficult problems. The RBM structure and sampling algorithm were demonstrated on an Ising model problem. The input graph for the Ising model is fully connected, with no restrictions on the size or magnitude of the weight matrix (Figure 3A). The Ising model is mapped to an RBM by making two copies of each graph node and edge and arranging them into a bipartite graph (Figure 3B): one copy resides in the "visible" layer and one in the "hidden" layer, with no intra-layer edges. The two copies of each node are connected by a "coupling" parameter, C, which constrains them to take the same value. Because there are no intra-layer connections, the layers can be sampled in parallel (Figure 3C): all neurons in one layer are sampled simultaneously and used to compute the values of the opposite layer, creating a two-step sampling procedure. Sampling proceeds until the algorithm reaches the ground state, or until its output is of sufficient quality.
Figure 3. Demonstration of RBM structure and sampling algorithm. A) Structure of the input graph for an Ising model type algorithm. The graph is fully connected, with no restrictions on the size or magnitude of the weight matrix. B) Mapping of the Ising model to an RBM. C) Parallel sampling layers.
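The mapping and two-step sampling procedure described above can be sketched as follows. This is a minimal software illustration under stated assumptions (a small random coupling matrix, a fixed coupling strength C, and simple best-so-far tracking), not the FPGA implementation.

```python
import numpy as np

# Ising-to-RBM mapping: each Ising node is duplicated into a visible and
# a hidden copy, the Ising couplings J become visible-hidden weights, and
# a coupling strength C on the diagonal ties the two copies of the same
# node together.  The bipartite structure lets each whole layer be
# sampled in parallel (block Gibbs sampling).
rng = np.random.default_rng(1)
n = 6
J = np.triu(rng.choice([-1.0, 1.0], (n, n)), 1)
J = J + J.T                         # symmetric Ising couplings

C = 2.0                             # coupling tying the two copies
W = J + C * np.eye(n)               # RBM weight matrix (visible x hidden)

def sample_layer(other, W):
    """Sample one whole layer in parallel, given the opposite layer."""
    p = 1.0 / (1.0 + np.exp(-2.0 * (W @ other)))   # P(s_i = +1)
    return np.where(rng.random(len(other)) < p, 1.0, -1.0)

v = rng.choice([-1.0, 1.0], n)      # visible layer (one copy of the spins)
h = rng.choice([-1.0, 1.0], n)      # hidden layer (the other copy)

best, best_e = v.copy(), np.inf
for _ in range(2000):
    h = sample_layer(v, W.T)        # step 1: update all hidden units
    v = sample_layer(h, W)          # step 2: update all visible units
    e = -0.5 * v @ J @ v            # Ising energy of the visible copy
    if e < best_e:
        best, best_e = v.copy(), e
```

The two `sample_layer` calls are the two-step procedure of Figure 3C: each call updates an entire layer at once, which is exactly the parallelism the FPGA accelerator exploits in hardware.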
1. L. Supic, R. Naous, R. Sredojevic, A. Faust, and V. Stojanović, “MPDCompress – Matrix Permutation Decomposition Algorithm for Deep Neural Network Compression,” arXiv:1805.12085, May 2018.
2. E. Yablonovitch, T. P. Xiao, and S. K. Vadlamani, “Classical Machines, for Solving the Toughest Problems in Computer Science,” in Frontiers in Optics + Laser Science APS/DLS, OSA Technical Digest (Optical Society of America, 2019), paper JTu4A.114, Sep. 2019.
3. M. Kellman, M. Lustig, and L. Waller, “How to do Physics-based Learning,” arXiv:2005.13531, May 2020.
4. K. Hakhamaneshi, N. Werblun, P. Abbeel, and V. Stojanović, “BagNet: Berkeley Analog Generator with Layout Optimizer Boosted with Deep Neural Networks,” 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Westminster, CO, USA, 2019, pp. 1-8, Jan. 2020.
5. S. K. Vadlamani, T. P. Xiao, and E. Yablonovitch, “Physics Successfully Implements Lagrange Multiplier Optimization,” Proceedings of the National Academy of Sciences, vol. 117, no. 43, pp. 26639-26650, Oct. 2020.
6. S. Patel, L. Chen, P. Canoza, and S. Salahuddin, “Ising Model Optimization Problems on a FPGA Accelerated Restricted Boltzmann Machine,” arXiv:2008.04436, Oct. 2020.