#### 6100812

# Optical and Electrical Memories for Analog Optical Computing

Sadra Rahimi Kari, *Student Member, IEEE*, Carlos A. Ríos Ocampo, *Member, IEEE*, Lei Jiang<sup>®</sup>, *Member, IEEE*, Jiawei Meng<sup>®</sup>, *Student Member, IEEE*, Nicola Peserico, *Member, IEEE*, Volker J. Sorger<sup>®</sup>, *Senior Member, IEEE*, Juejun Hu<sup>®</sup>, *Member, IEEE*, and Nathan Youngblood<sup>®</sup>, *Member, IEEE* 

(Invited Paper)

Abstract-Key to recent successes in the field of artificial intelligence (AI) has been the ability to train a growing number of parameters which form fixed connectivity matrices between layers of nonlinear nodes. This "deep learning" approach to AI has historically required an exponential growth in processing power which far exceeds the growth in computational throughput of digital hardware as well as trends in processing efficiency. New computing paradigms are therefore required to enable efficient processing of information while drastically improving computational throughput. Emerging strategies for analog computing in the photonic domain have the potential to drastically reduce latency but require the ability to modify optical processing elements according to the learned parameters of the neural network. In this point-of-view article, we provide a forward-looking perspective on both optical and electrical memories coupled to integrated photonic hardware in the context of AI. We also show that for programmed memories, the READ energy-latency-product of photonic random-access memory (PRAM) can be orders of magnitude lower than that of electronic SRAMs. Our intent is to outline a path for PRAMs to become an integral part of future foundry processes and give these promising devices relevance for emerging AI hardware.

Manuscript received 7 July 2022; revised 23 January 2023; accepted 23 January 2023. Date of publication 1 February 2023; date of current version 8 February 2023. The work of Volker J. Sorger was supported by the Air Force Office for Scientific Research PECASE under Grant FA9550-1-20-0193. This work was supported by the U.S. National Science Foundation under Grants ECCS-2028624, ECCS-2132929, DMR-2003325, CISE-2105972, and ECCS-2210168/2210169. (*Corresponding author: Nathan Youngblood.*)

Sadra Rahimi Kari and Nathan Youngblood are with the Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15261 USA (e-mail: sar247@pitt.edu; nathan.youngblood@pitt.edu).

Carlos A. Ríos Ocampo is with the Department of Materials Science and the Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742 USA (e-mail: riosc@umd.edu).

Lei Jiang is with the Department of Intelligent Systems Engineering, Indiana University Bloomington, Bloomington, IN 47408 USA (e-mail: jiang60@iu.edu).

Jiawei Meng and Nicola Peserico are with the Department of Electrical and Computer Engineering, School of Engineering & Applied Science, George Washington University, Washington, DC 20052 USA (e-mail: mengj@email.gwu.edu; npeserico@email.gwu.edu).

Volker J. Sorger is with the Department of Electrical and Computer Engineering, School of Engineering & Applied Science, George Washington University, Washington, DC 20052 USA, and also with the Optelligence LLC, Upper Marlboro, MD 20772 USA (e-mail: sorger@gwu.edu).

Juejun Hu is with the Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139-4307 USA (email: hujuejun@mit.edu).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSTQE.2023.3239918.

Digital Object Identifier 10.1109/JSTQE.2023.3239918

*Index Terms*—Artificial intelligence, neural network hardware, analog computers, optical computing, analog processing circuits.

#### I. INTRODUCTION

RECENT progress in the field of AI has been fueled by two major research thrusts: 1) finding ways to train increasingly large deep neural networks (DNNs) and 2) applying new insights from neuroscience to computing algorithms and hardware, commonly known as "neuromorphic computing." These approaches to AI make the shift from specialized "expert models" which rely on a human understanding of the data to generalized "neural networks" which typically use a very large number of free parameters to statistically fit the data [1]. In fact, the performance of a DNN has been shown to improve when the number of free parameters exceeds that of the available training data [2]. The vast and tunable 3D connectivity of billions of neurons in the brain is similarly considered a key contributor to intelligence in humans and other animals. Thus, the immense number of trainable parameters in biological and deep neural networks leads to both their generality and their computational complexity [3].

In both deep learning and neuromorphic computing, the required compute operations differ drastically from the precise, sequential arithmetic operations that have driven digital hardware design for the past half century. Instead, computation is limited by memory access bottlenecks rather than processor speed, leading to memory-centric design approaches (e.g., weight stationary systolic arrays [4], in-memory computation [5], etc.). These approaches typically minimize the movement of fixed parameters to improve latency and energy efficiency. However, since all electrical processors are fundamentally limited by an energy-bandwidth tradeoff stemming from the capacitance of their interconnects [6], this ultimately limits the maximum compute efficiency achievable (typically measured in operations per watt, "OPS/W").

Computation in the optical domain is an exciting alternative to electrical processors which side-steps the energy-bandwidth tradeoff [7], [8], [9]. The bandwidth of an optical channel (waveguide, fiber, or even free space) is independent of modulation frequency and therefore extremely high data throughput can be achieved in the optical domain. Additionally, the wave nature of optical signals allows passive elements to achieve unitary linear transformations with no power penalty in lossless materials [10].

1077-260X © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Optical computing can be divided into digital and analog approaches. Digital optical computing represents input and output as discrete values (e.g., 0s and 1s) and has several advantages over traditional digital electronic computing, including higher clock speeds and lower power consumption [11], [12]. In contrast, analog optical computing represents input and output as continuous values and co-locates compute and memory, which has the potential to be lower latency and more energy efficient than digital optical computing at the expense of precision and accuracy. These properties make optical analog computing highly attractive for ultrafast, low-power linear operations, the major computational bottleneck in today's neural networks. However, for these optical computations to be useful, they must be coupled to the trained parameters of the neural network through analog optical memories. Given the importance of energy efficiency, latency, and the recent exciting advancements in analog optical computing, we have chosen to focus solely on memories for analog optical computing in this article.

Storing optical data in digital format requires the use of additional components such as an optical digital-to-analog converter (DAC), which increases energy consumption and footprint. Photonic integrated systems already have a significant footprint disadvantage relative to their electronic counterparts, and storing digital information further increases the size of the memory block according to the number of bits allocated to each input. In addition, the technologies developed for optical memories to date can attain multi-state operation. Thus, analog optical memories are a realistic option as they do not require additional components and are capable of long-term data storage.

In this point-of-view article, we first identify important features needed for analog optical memories (i.e., memory devices which modulate an analog optical signal according to a predetermined input) and their respective challenges. We then discuss different approaches used by the community to implement optical memory for processing information in the optical domain. Next, we perform an energy-latency analysis to identify the applications where these various approaches have a distinct advantage. Finally, we end our discussion with an outlook on the current state of optical memory technologies and present a roadmap identifying the key technological challenges where continued innovation is most needed.

#### II. KEY REQUIREMENTS OF PHOTONIC MEMORY TECHNOLOGY

At the highest level, photonic computing strategies can be most generally divided into two main categories—coherent or incoherent. These distinct strategies place important physical constraints on the optical memory cells used since in the case of coherent photonic architectures, both the amplitude and phase of the optical signals are used to perform computation [13], [14], [15]. Incoherent architectures instead use only the amplitude of the optical signal to perform computation, but require sources with many different optical frequencies to prevent unwanted interference effects [16], [17], [18]. Therefore, for coherent architectures, the insertion loss (IL), amplitude-independent phase control, and fabrication variability of the memory cell directly impact the compute accuracy [19]. These strict requirements are largely reduced for incoherent architectures, but extinction ratio (ER), crosstalk, and precision of the memory cell still limit the ultimate accuracy that can be achieved [12]. Despite these architecture-specific requirements, several key metrics of the memory cell have similar impact on the performance of the photonic processor regardless of the computing strategy. Here, we summarize these metrics and their importance for photonic computing.

*Insertion loss (IL).* The IL of the memory cell impacts the maximum optical power that can be transmitted and read out by detection circuitry when the memory is in the fully "on" state. Since computation occurs in the analog domain, the precision of the optical readout is fundamentally limited by photon shot noise. Improving the IL therefore reduces the optical power required to perform computation. For coherent architectures, if the IL differs between two interfering optical paths, the interference contrast will be reduced and limit compute accuracy (sometimes also referred to as "fidelity" [13]).

*Precision.* Optical memory cells are typically tuned with a continuous parameter since they are analog in nature. Therefore, the maximum achievable precision is typically limited by either the stability of the memory cell itself, the noise of the control circuitry, or the optoelectronic noise at detection. Fortunately, many studies have shown that neural networks require relatively low precision memory (even as low as 1 or 2 bits [20], [21], [22]) and that uncorrelated noise can serve as a method for regularization and improved resilience [23], [24], [25].

*Extinction ratio (ER).* The ER of the memory cell is linked to the precision and determines the maximum optical contrast between the "on" and "off" states. Improving the ER will help to distinguish between neighboring analog levels of transmission or phase, increasing the maximum compute precision (and typically accuracy) achievable. Detecting the difference in intensity between the add and drop ports of a microring resonator (MRR) or in the relative transmission of two memory cells are methods for improving ER while also achieving both positive and negative values for weights [18].

As IL and ER both affect the dynamic range of the output, they directly affect the signal-to-noise ratio (SNR) and the precision is therefore dependent on both. Furthermore, the presence of noise will have a detrimental impact on the precision of computations performed in an analog system which requires higher optical power levels to overcome. Thus, bit precision ( $N_b$ ), IL, and ER are all interrelated and together determine the energy consumption of the analog processor. The effects of noise present at the electro-optic (E/O) and opto-electronic (O/E) interfaces have been explored in detail in prior work [12], [26] and we point the reader to these resources for further information. In the simple case of a shot-noise-limited optical signal, the minimum optical power swing required to resolve a given precision is [12]:

$$\Delta P = \frac{2h\upsilon}{\eta} \cdot f_{mod} \cdot 2^{2N_b} \tag{1}$$

where $h\upsilon$ is the photon energy, $\eta$ is the overall quantum efficiency of the optical circuit (e.g., waveguide loss) and photodetector, and $f_{mod}$ is the modulation frequency. The following can be derived for the power swing due to an amplitude-modulated optical memory cell:

$$P_{max} = P_{in} \cdot 10^{-\frac{IL}{10}} \tag{2}$$

$$P_{min} = P_{in} \cdot 10^{-\frac{IL + ER}{10}} \tag{3}$$

$$\Delta P = P_{in} \cdot 10^{-\frac{IL}{10}} \left( 1 - 10^{-\frac{ER}{10}} \right) \tag{4}$$

Combining (1) and (4) gives the following result for the minimum input power required to achieve a precision of  $N_b$ :

$$P_{in} = \frac{2h\upsilon}{\eta} \cdot \frac{f_{mod}}{10^{-\frac{IL}{10}} \left(1 - 10^{-\frac{ER}{10}}\right)} \cdot 2^{2N_b} \tag{5}$$

Based on (5), as IL increases and ER decreases, the observable change in the output power of the system decreases, and the minimum input power required for computation increases. Therefore, IL and ER directly impact the precision and efficiency of the optical processing unit.
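As a rough numerical illustration of (1) through (5), the sketch below evaluates the minimum input power for a given bit precision. The function name, the 1550 nm wavelength, the unity quantum efficiency, and the example IL, ER, and modulation frequency are all assumptions chosen for illustration and are not taken from the article.

```python
H = 6.62607015e-34   # Planck constant (J*s)
C = 299_792_458.0    # speed of light in vacuum (m/s)

def min_input_power(n_bits, f_mod_hz, il_db, er_db, eta=1.0, wavelength_m=1550e-9):
    """Minimum input power in watts per Eq. (5), shot-noise-limited case.

    eta and wavelength_m are illustrative defaults (assumptions).
    """
    photon_energy = H * C / wavelength_m  # h*upsilon
    # Eq. (1): minimum power swing needed to resolve n_bits
    delta_p = 2.0 * photon_energy / eta * f_mod_hz * 2 ** (2 * n_bits)
    # Eq. (4) divided by P_in: fraction of input power available as swing
    swing_fraction = 10 ** (-il_db / 10) * (1.0 - 10 ** (-er_db / 10))
    return delta_p / swing_fraction

# Hypothetical operating point: 4 bits at 10 GHz, IL = 1 dB, ER = 10 dB
p_4bit = min_input_power(n_bits=4, f_mod_hz=10e9, il_db=1.0, er_db=10.0)
p_5bit = min_input_power(n_bits=5, f_mod_hz=10e9, il_db=1.0, er_db=10.0)
p_lossy = min_input_power(n_bits=4, f_mod_hz=10e9, il_db=4.0, er_db=10.0)
print(f"{p_4bit * 1e6:.2f} uW for 4 bits")
```

Under these assumptions, each additional bit of precision raises the required power by a factor of four, and every 3 dB of extra insertion loss roughly doubles it.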

*Programming latency.* While access and read latency can be a bottleneck for electronic memory cells, the write speed of the memory cell is usually the limiting factor for photonics. Reading the state of memory in the optical domain is fundamentally limited by the speed of light traveling through the bus waveguides, but in practice readout is limited by the speed of the detection circuitry at the output. In the case of frequent weight updates, the programming latency could therefore dominate (especially in the limit of large matrix operations which exceed the available on-chip photonic memory [27]). Minimizing the latency of weight updates is thus crucial for maximizing throughput when faced with realistic constraints on physical optical hardware.

*Programming energy and static power*. Similar to the case of latency, if the computing application requires frequent updating of the optical weights (e.g., in a photonic tensor core [28]), the optical memory cell programming energy could potentially dominate the power consumption of the chip. Additionally, when using volatile optical responses to store data—such as thermo-optic, electro-absorptive, or plasma-dispersion effects—the static power consumption needed to hold a fixed weight can contribute a significant amount to the overall power budget of the computing system [12].

*Cycling endurance.* The minimum number of cycles required for an optical memory cell will vary greatly depending on the use case. For example, a fixed-weight architecture that does not require frequent weight updates (e.g., a small convolutional layer implemented optically [16], [17]) will have a much lower cycling requirement compared to a neuromorphic architecture where accumulation of optical pulses occurs in the memory cells themselves [29], [30]. As a point of reference, NAND flash memory used in consumer-grade USB flash drives typically has an endurance ranging from $10^4$ to $10^6$ cycles [31], but these devices are used for storage rather than computation.

TABLE I

COMPARISON OF METRICS FOR VARIOUS OPTICAL MODULATORS

| Technology                 | Speed                              | Energy/Power            | IL (ER)        | DAC? |
|----------------------------|------------------------------------|-------------------------|----------------|------|
| Segmented P-N MRR [33]     | 20 Gb/s (NRZ)<br>40 Gb/s (PAM-4)   | 155 fJ/bit<br>42 fJ/bit | 5.5 dB (3 dB)  | No   |
| Segmented SIS-CAP MZI [36] | 20 Gb/s (NRZ)<br>40 Gb/s (PAM-16)  | 4.5 pJ/bit<br>250 fJ/bit | NA<br>NA      | No   |
| Single P-N MRR [35]        | 44 Gb/s (NRZ)                      | 17.4 fJ/bit             | 0.9 dB (8 dB)  | Yes  |
| Thermal MZI [47]           | 2.4 μs                             | 12.7 mW ($P_{\pi}$)     | 0.5 dB (20 dB) | Yes  |
| Thermal MRR [48]           | 1.3 μs                             | 1.47 nm/mW              | NA (15 dB)     | Yes  |

TABLE II

METRICS OF VARIOUS WAVEGUIDE-INTEGRATED NONVOLATILE OPTICAL MEMORIES

| Technology                            | Area<br>per bit<br>(μm²) | Switching<br>Energy<br>(nJ) | Switching<br>Latency<br>(μs) | Insertion<br>Loss<br>(dB) |
|---------------------------------------|--------------------------|-----------------------------|------------------------------|---------------------------|
| PCM (Electrical switching) [72], [73] | 6<br>0.04\*              | 9<br>4\*                    | 0.1<br>0.4\*                 | 4.03<br>11.5\*            |
| PCM (Optical switching) [74]          | 0.4                      | 1                           | 0.5                          | 1                         |
| Ferroelectric [75]                    | 2,200                    | 0.027                       | 1                            | 0.07                      |
| Charge Trapping [76]                  | 19.6                     | 0.029                       | 600,000                      | 2                         |

\*Demonstration using a hybrid plasmonic-photonic phase-change memory cell.

*Footprint.* The footprint of the optical memory cell limits the integration density on chip and can be the limiting factor for scalability. This has important implications on the efficiency and latency of the photonic processor since smaller memory arrays will require more frequent weight updates than large-scale memory arrays for the same matrix operation [27]. With CMOS technology pushing the bounds of nanometer-scale channel lengths, electronic memory banks (SRAM banks) clearly occupy a smaller area in comparison to their optical counterparts. Due to the diffraction limit, evanescent coupling, and scattering losses related to minimum bend radii, classical photonic components typically have dimensions on the micron scale rather than the nanometer scale. This leads to optical memory banks with a footprint that is many orders of magnitude larger than electronic memory banks. While optical memory may not be as efficient in terms of area compared to electronic memory, this may be an acceptable tradeoff for certain applications when considering read latency and energy efficiency, as shown in Table III. Additionally, the compute density can be much greater in the optical domain due to high-speed analog operations [12].
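To make the read-performance comparison concrete, the following sketch computes a READ energy-latency product from the values listed in Table III. Treating entries given as bounds ("< 50", ">1,800") at their bound value is an assumption made here for illustration, so the resulting ratios are indicative only.

```python
# (read energy in pJ/bit, read latency in ps), copied from Table III.
# Assumption: entries quoted as bounds are evaluated at the bound value.
memories = {
    "eDRAM cell": (19.0, 1800.0),
    "SRAM cell": (5.0, 210.0),
    "SRAM cache": (0.35, 1660.0),
    "P-RAM": (0.01, 50.0),
    "O-SRAM": (0.0032, 50.0),
    "Optical cache (simulated)": (0.143, 33.7),
}

def energy_latency_product(energy_pj, latency_ps):
    """READ energy-latency product in pJ*ps (lower is better)."""
    return energy_pj * latency_ps

elp = {name: energy_latency_product(*vals) for name, vals in memories.items()}

# How much lower the P-RAM product is than that of a single SRAM cell
sram_to_pram_ratio = elp["SRAM cell"] / elp["P-RAM"]

for name, value in sorted(elp.items(), key=lambda item: item[1]):
    print(f"{name:28s} {value:10.3f} pJ*ps")
```

With these inputs the P-RAM product is roughly three orders of magnitude below that of the SRAM cell. Area is deliberately left out of this product.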

#### III. CURRENT IMPLEMENTATIONS OF PHOTONIC MEMORY

#### A. Electronic Memories Coupled to Optical Components

One common method for implementing optical memory is to use an optical modulator coupled to electrical memory (see Fig. 1a). This first involves digital-to-analog conversion (DAC) of the digital weight, followed by electrical-to-optical conversion (E/O) of the analog electrical signal. E/O conversion is most commonly achieved by modulating the real or imaginary refractive index of a material through different physical effects, such as thermo-optic, electro-absorption, or plasma-dispersion [32], [33], [34], [35], [36], [37]. This approach to optical memory has the notable benefit of foundry compatibility which has enabled several key proof-of-concept demonstrations of photonic processors [8]. It is worth noting, however, that incorporating electronic memories coupled with optical modulators will increase the complexity and overall cost of fabrication. For a fully integrated system, there are two viable options. The first is to use foundries that support hybrid CMOS-silicon photonics fabrication [38]. The second option is to fabricate two separate chips, one for the electronic memory and related circuitry (e.g., DACs and drivers) and the other for the photonic system [39]. While this places more complexity on packaging, a multi-chip approach may be beneficial in terms of eliminating thermal crosstalk or the impact of heat generated by the electronic system (specifically drivers) on the performance of the photonic system.

TABLE III

PERFORMANCE TABLE OF PHOTONIC RANDOM-ACCESS MEMORIES (P-RAM, O-SRAM, AND OPTICAL CACHE) AS COMPARED TO RELEVANT ELECTRONIC MEMORIES. OPTICAL MEMORIES CAN SHOW SEVERAL ORDERS-OF-MAGNITUDE HIGHER READ PERFORMANCE THAN THEIR ELECTRONIC COUNTERPARTS. THIS IS PARTICULARLY RELEVANT FOR NETWORK EDGE AI WITH SELDOM-UPDATED WEIGHTS (I.E., RARE WRITE OPERATIONS) BUT FREQUENT READS. NOTE THAT THIS DOES NOT INCLUDE ADC ENERGY OR LATENCY FOR OPTICAL READ OPERATIONS SINCE COMPUTATION CAN OCCUR OPTICALLY ACROSS MULTIPLE MEMORY CELLS BEFORE THE ADC. INCLUDING THE AREA ALONGSIDE THE READ ENERGY AND LATENCY SHOWS AN APPROXIMATELY 5× HIGHER FIGURE OF MERIT BASED ON (AREA × READ ENERGY × READ LATENCY)<sup>-1</sup>.

|                                           | Area<br>per bit<br>(μm²) | Read energy<br>(pJ/bit) | Read latency<br>(ps) |
|-------------------------------------------|--------------------------|-------------------------|----------------------|
| eDRAM cell (28nm<br>HKMG) [89]            | 0.035                    | 19                      | >1,800               |
| SRAM cell [87], [90]<br>(7nm Fin-FET, 6T) | ~0.01                    | 5                       | 210                  |
| SRAM cache [88]<br>(64-byte block size)   | 0.055                    | 0.35                    | 1,660                |
| P-RAM [81]                                | 1                        | 0.01                    | < 50                 |
| O-SRAM [49]                               | 6.2                      | 0.0032                  | < 50                 |
| Optical Cache (8KB)<br>[91]               | 120\*                    | 0.143\*                 | 33.7\*               |

\*Simulated rather than experimental values.

An additional benefit of this approach is that, by decoupling the device used for optical modulation from that used for data storage, each device can be independently optimized for important metrics that could be highly challenging to optimize in a single material platform (e.g., programming speed and cycling endurance). However, most physical effects used for optical modulation are both volatile and weak (e.g., $\Delta n \sim 10^{-3}$ to $10^{-4}$ per volt, °C, etc.). This translates to constant external biasing (e.g., P-N junction) or power dissipation (e.g., resistive microheater) to maintain the state of an optical weight, as well as large device footprints for non-resonant devices such as MZIs and electro-absorptive modulators. Below, we briefly describe the most common devices used to implement optical memory and their operation.

Fig. 1. Overview figure illustrating various state-of-the-art memory technologies introduced in this section: (a) electronic memory coupled to optical components; (b) on-chip memories based on nonvolatile photonics; and (c) passive optical memories.

A Mach-Zehnder Interferometer (MZI) is a reconfigurable $2 \times 2$ photonic coupler that uses two pairs of phase shifters and bidirectional couplers to implement a $2 \times 2$ unitary weight matrix U (as illustrated in Fig. 2a). Normalized incident field amplitudes are used to represent the elements of an input vector $\vec{A}$. The optical output vector from the MZI is then equal to $\vec{B} = U\vec{A}$. To reconfigure the weight matrix U, a pair of phase shifters is arranged on any two arms of the MZI to control both the interference and relative phase of the two outputs. Assuming coherent inputs, 50:50 couplers, and two phase shifters $\varphi$ and $\theta$, the output amplitudes can be described as:

$$\vec{B} = \begin{bmatrix} e^{j\varphi}\sin\left(\theta\right) & \cos\left(\theta\right) \\ e^{j\varphi}\cos\left(\theta\right) & -\sin\left(\theta\right) \end{bmatrix} \vec{A}. \tag{6}$$
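The transfer matrix in (6) can be checked numerically. The sketch below builds the matrix for given phase settings, confirms that it is unitary, and applies it to an input vector. The helper names and the example phase values are illustrative assumptions.

```python
import cmath
import math

def mzi_matrix(theta, phi):
    """2x2 MZI transfer matrix from Eq. (6)."""
    p = cmath.exp(1j * phi)
    return [
        [p * math.sin(theta), complex(math.cos(theta))],
        [p * math.cos(theta), complex(-math.sin(theta))],
    ]

def apply(u, a):
    """Matrix-vector product B = U A for a 2x2 matrix."""
    return [sum(u[i][k] * a[k] for k in range(2)) for i in range(2)]

def is_unitary(u, tol=1e-12):
    """Check that U^dagger U equals the identity within tol."""
    for i in range(2):
        for j in range(2):
            gram = sum(u[k][i].conjugate() * u[k][j] for k in range(2))
            expected = 1.0 if i == j else 0.0
            if abs(gram - expected) > tol:
                return False
    return True

# Example setting (assumed): theta = pi/4 splits the input power equally
U = mzi_matrix(theta=math.pi / 4, phi=math.pi / 2)
A = [1.0 + 0j, 0.0 + 0j]            # all light enters the first port
B = apply(U, A)
output_power = [abs(b) ** 2 for b in B]
unitary_ok = is_unitary(U)
```

Because the matrix is unitary, the total output power equals the input power for any choice of $\theta$ and $\varphi$, which is the lossless behavior assumed in (6).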

MZIs can be organized into a mesh to serve as an optical linear unit that performs matrix multiplications [40]. An $N \times N$ arbitrary unitary matrix can be deployed on MZIs connected in various mesh topologies, e.g., triangular [41], rectangular [42], and binary tree [43]. While mathematically elegant, one drawback of this approach is the requirement of $\sim N^2$ MZIs to implement arbitrary $N \times N$ matrices through the singular value decomposition approach [40] which can lead to large footprints and low compute density [12]. In addition to unitary operations, MZIs can also be used to directly modulate the optical amplitude of transmitted light in alternative architectures to the ones mentioned above. This has been used in the case of coherent crossbar arrays which have the potential to extend to multiple wavelengths more easily [44].

Fig. 2. Electronic memories coupled to optical modulators. (a) Schematic of a reconfigurable MZI implementing the $2 \times 2$ unitary matrix U. (b) Schematic of a programmable add-drop MRR using differential weighting to implement positive and negative weights.
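The $\sim N^2$ scaling of the singular value decomposition approach follows from a simple count. The sketch below assumes that each $N \times N$ unitary mesh (triangular or rectangular) uses $N(N-1)/2$ MZIs and that each of the $N$ singular values is set by one additional MZI acting as an amplitude modulator. The function names are illustrative.

```python
def mzis_per_unitary(n):
    """MZIs in a triangular or rectangular mesh realizing an n x n unitary."""
    return n * (n - 1) // 2

def mzis_for_arbitrary_matrix(n):
    """MZIs for M = U * Sigma * V^dagger.

    Assumption: two unitary meshes plus one amplitude-modulating MZI
    per singular value on the diagonal of Sigma.
    """
    return 2 * mzis_per_unitary(n) + n

counts = {n: mzis_for_arbitrary_matrix(n) for n in (4, 16, 64)}
for n, count in counts.items():
    print(f"N = {n:3d}: {count} MZIs")
```

Under these assumptions the total is exactly $N^2$, so a $64 \times 64$ matrix already requires 4,096 MZIs.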

A *Micro-Ring Resonator* (MRR) is a reconfigurable optical device that can be used to tune the relative transmission of its through and drop ports at specific optical frequencies which depend on the radius of the ring (illustrated in Fig. 2b) [45]. To implement matrix multiplication, an  $N \times N$  array of MRRs can be used in a wavelength-division multiplexing (WDM) scheme to form a "broadcast and weight" architecture [18]. Input vectors are encoded as the modulated light intensities of multiple wavelengths, while each MRR acts as a filter to selectively apply attenuation to a specific input wavelength according to a corresponding matrix element [46]. Crosstalk between MRRs of similar optical resonance and free spectral range limit the ultimate size of the  $N \times N$  matrix which can be implemented. Moreover, MRRs also suffer from high sensitivity to temperature and fabrication variations.
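A minimal sketch of differential weighting in one row of a "broadcast and weight" bank is given below. It assumes an idealized lossless add-drop ring with a Lorentzian drop response, so that the through and drop transmissions sum to one and the weight is their difference. The function names, the normalized linewidth, and the example values are assumptions for illustration only.

```python
import math

def drop_transmission(detuning, linewidth):
    """Lorentzian drop-port transmission (idealized, lossless ring)."""
    return 1.0 / (1.0 + (2.0 * detuning / linewidth) ** 2)

def differential_weight(detuning, linewidth):
    """Weight seen by a balanced detector: T_drop - T_through."""
    t_drop = drop_transmission(detuning, linewidth)
    t_through = 1.0 - t_drop  # assumption: no loss inside the ring
    return t_drop - t_through

def detuning_for_weight(weight, linewidth):
    """Detuning that yields the requested weight in (-1, 1]."""
    if not -1.0 < weight <= 1.0:
        raise ValueError("weight must lie in (-1, 1]")
    t_drop = (weight + 1.0) / 2.0
    return 0.5 * linewidth * math.sqrt(1.0 / t_drop - 1.0)

def weighted_sum(weights, inputs, linewidth=1.0):
    """One matrix row: each ring weights the intensity on its own wavelength."""
    detunings = [detuning_for_weight(w, linewidth) for w in weights]
    return sum(differential_weight(d, linewidth) * x
               for d, x in zip(detunings, inputs))

row = [0.5, -0.5, 0.0]      # target weights, one ring per wavelength
x = [1.0, 0.5, 0.25]        # input intensities on three wavelengths
y = weighted_sum(row, x)
```

In this model a ring tuned on resonance gives a weight of +1 and a far-detuned ring approaches -1, so both signs are available without a separate reference signal. Inter-channel crosstalk and thermal drift are ignored.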

Resistive heaters and P-N junctions are most commonly used as phase shifters in MZIs and MRRs [33], [34], [35], [37]. These two modulation approaches have certain advantages and disadvantages for optical memory. For instance, despite having very low insertion losses, resistive heaters suffer from slow switching speeds (hundreds of kHz) and high static power consumption (several mW). On the other hand, P-N junctions offer high switching speeds and typically dissipate very little static power. However, their insertion loss is high due to free-carrier absorption and also dependent on the applied bias, making them unsuitable for photonic processors using the coherent schemes mentioned above.

When using these volatile optical modulators as memory units, each modulator requires designated control circuitry to read digital data from memory and then hold the transmission or phase of the modulator constant. This not only introduces complexity to the integrated system, but it also increases static power dissipation from the DAC and driver blocks needed to hold the state of each modulator. When combined with the energy and latency of high-speed DACs, this can increase the overall power consumption and latency of the photonic processor and is analyzed in more detail in Section IV.

In recent years several methods have been used to eliminate the need for DACs and directly use binary data with E/O modulators. Examples include directly modulating light with binary inputs using segmented MZIs [36] and MRRs [33] with up to 4 bits of resolution. This is a promising approach for optical memory as such schemes can even improve the DAC linearity [33]. We compare the various modulation schemes described above in Table I.
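One generic way a segmented modulator can accept binary data directly is to give its segments binary-weighted lengths, so that the accumulated phase shift is proportional to the digital code. The sketch below illustrates that idea only. The segment lengths and the phase shift per unit length are hypothetical values, and this is not necessarily the segment layout used in [33] or [36].

```python
def segment_lengths_um(n_bits, lsb_length_um):
    """Binary-weighted segment lengths: 1x, 2x, 4x, ... the LSB length."""
    return [lsb_length_um * 2 ** k for k in range(n_bits)]

def phase_shift_rad(code, n_bits, lsb_length_um, phase_per_um):
    """Total phase when each bit of `code` drives its own segment on or off."""
    lengths = segment_lengths_um(n_bits, lsb_length_um)
    bits = [(code >> k) & 1 for k in range(n_bits)]  # bits[0] is the LSB
    return sum(b * length * phase_per_um for b, length in zip(bits, lengths))

# Hypothetical 4-bit device: 10 um LSB segment, 0.001 rad per um when driven
N_BITS = 4
phases = [phase_shift_rad(code, N_BITS, 10.0, 0.001) for code in range(2 ** N_BITS)]
step = phases[1] - phases[0]
```

Each code maps to a phase that is an integer multiple of the LSB step, which is why no electrical DAC is needed in front of the modulator in this idealized picture.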

We also wish to note that there has been a substantial amount of research conducted on optically-addressable digital memory cells which are analogous to various electronic memory cell architectures. This includes optical SRAM [49], optical DRAM [50], and optical RAM [51], which can be coupled to other optical components. This opens up the possibility of replacing electronic memories with their optical counterparts. Section V is dedicated to various memory technologies, and a comparison of these memory types is presented in Table III.

#### B. On-chip Memories Based on Nonvolatile Photonics

A second approach for implementing on-chip photonic memories involves nonvolatile optical materials or phenomena, where the stored weights are recorded in the form of erasable refractive index and/or optical absorption changes (see Fig. 1b). Examples include: 1) phase change materials (PCMs), which exhibit a giant optical property change upon undergoing a nonvolatile amorphous-crystalline structural transition [52]; 2) ferroelectric (FE) crystals exemplified by BaTiO<sub>3</sub> (BTO) whose electric polarization can be switched by an external electrical field in a nonvolatile manner [53]; and 3) charge accumulation in a floating gate or charge trapping in a dielectric layer, the mechanism responsible for data storage in electronic flash memories, which modifies the optical attributes in a Si waveguide via free carrier plasma dispersion [54] (Fig. 3). All the schemes are amenable to electrical writing and optical reading [55], [56], [57], [58]. Another key feature of these memories is multilevel operation capacity, where the presence of intermediate states (corresponding to e.g., mixtures of amorphous/crystalline phases in PCMs [59] or partial FE domain switching in FE crystals [60]) can be used to encode multi-bit information in one single memory cell [61], [62], [63]. In-memory computing based on nonvolatile photonic memories has been demonstrated in single memory cells [64] as well as in large crossbar arrays [65].
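Multilevel operation can be pictured as mapping each trained weight onto the nearest of a finite set of transmission states. The sketch below assumes evenly spaced levels between a minimum and maximum transmission, which is a simplification since real devices may have non-uniform levels. The function names and numerical values are illustrative.

```python
def transmission_levels(t_min, t_max, n_bits):
    """Evenly spaced transmission states (assumed) for an n_bits memory cell."""
    n_levels = 2 ** n_bits
    step = (t_max - t_min) / (n_levels - 1)
    return [t_min + i * step for i in range(n_levels)]

def program_weight(target, levels):
    """Index and value of the stored level closest to the target transmission."""
    index = min(range(len(levels)), key=lambda i: abs(levels[i] - target))
    return index, levels[index]

# Hypothetical 2-bit cell with transmission between 0.2 and 0.8
levels = transmission_levels(t_min=0.2, t_max=0.8, n_bits=2)
index, stored = program_weight(target=0.55, levels=levels)
quantization_error = abs(stored - 0.55)
```

With these assumed values the worst-case quantization error is half a level spacing, which links the number of resolvable intermediate states to the bit precision discussed in Section II.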

Compared to the electronic-memory-driven approaches discussed in the previous section, nonvolatile photonic memories allow fixed weight storage with zero static power dissipation while affording improved long-term data retention. These nonvolatile photonic memory technologies each boast unique advantages along with respective technical limitations.

In addition to using variable attenuation to represent weights as is illustrated in Fig. 3(b), low-loss PCMs [67] can execute phase-only encoding functions in a coherent network [68]. PCM photonic memory cells are also ultra-compact, only a few microns in length. However, they require relatively large switching energy (sub-nJ for all-optical switching [52] and a few nJ for electrothermal switching [69]). Moreover, their cycling endurance must be further improved [70]. In comparison, FE devices claim considerably reduced switching energy, down to tens of pJ [53], as well as enhanced endurance [71], although they require a much larger footprint and a constant DC bias to maintain the electro-optic index change during readout. Both PCM and FE devices also involve new materials and special processes (backend deposition for PCMs and wafer bonding for FE crystals) for integration with standard Si photonic foundry processes. The charge accumulation or trapping devices hold the advantage of full CMOS compatibility, although they suffer from the same limitations as their electronic flash memory counterparts, namely low write/erase speed and limited endurance. Table II summarizes these various optical memory technologies with relevant performance metrics.

Fig. 3. Nonvolatile optical memory technologies. (a) Schematic illustration of a PCM-integrated photonic memory; (b) operating mechanism of the PCM-integrated memory: less optical power is transmitted through the waveguide if the PCM is in the crystalline state than when it is in the amorphous state [52]. Write pulses are used to alter the structure of PCMs from crystalline to amorphous and vice versa, resulting in changes in the amount of optical power transmitted through these devices. By using modified write pulses, it is possible to achieve multiple intermediate states between the crystalline and amorphous phases, enabling a range of optical transmission levels and thus allowing for variable attenuation. (c) Cross-section structure of a nonvolatile waveguide phase shifter integrated with FE BTO crystal, which can serve as a basic building block for photonic memory; (d) schematics depicting progressive FE domain switching with increasing voltage applied between the electrodes [53]; (e) tilted and (f) cross-sectional schematics of a photonic memory device based on charge accumulation in a floating gate. The black arrows indicate the charge carrier flow directions during write and erase operations [66].

## C. Passive Optical Memories

Controlling signal propagation through delay lines is another promising approach to implementing optical memory. This approach has been used as volatile optical memory for computing in both recurrent and convolutional photonic neural networks [16], [77], [78], [79]. When combined with time-multiplexing and wavelength dispersion, optical delay lines have been used to achieve extremely high computational throughput with ultra-low latencies [16]. The facts that they are fully passive and have minimal latency (i.e., the time of flight of the optical signal) are two major advantages of using optical delay lines for temporary data storage. However, optical delay lines require significant on-chip area (limited by the bending radius and the spacing between neighboring waveguides), which increases with the required delay. In addition, it is challenging to efficiently tune these delays after fabrication. Heterogeneous approaches that integrate multiple optical degrees of freedom using WDM, optical memories, and delay lines are a promising direction for photonic computing [79].
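The area penalty of delay lines can be made concrete with a rough estimate. The sketch below is illustrative only: the group index of 4.2 (a typical value for a silicon strip waveguide near 1550 nm) and the 10 µm waveguide pitch are assumptions that do not come from this article, and bends are ignored, so the area is a lower-bound style estimate.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum


def delay_line_length_m(delay_s: float, group_index: float = 4.2) -> float:
    """Waveguide length needed to delay a signal by `delay_s`.

    group_index = 4.2 is an assumed, typical value for a silicon strip
    waveguide; it is not taken from the article.
    """
    if delay_s < 0 or group_index <= 0:
        raise ValueError("delay must be >= 0 and group index must be > 0")
    return C_VACUUM_M_PER_S * delay_s / group_index


def packed_area_mm2(length_m: float, pitch_um: float = 10.0) -> float:
    """Rough footprint of a tightly folded waveguide: length x pitch.

    pitch_um (centre-to-centre spacing) is a hypothetical value; the
    extra area taken by bends is ignored.
    """
    length_mm = length_m * 1e3
    pitch_mm = pitch_um * 1e-3
    return length_mm * pitch_mm


if __name__ == "__main__":
    for delay_ns in (0.1, 1.0, 10.0):
        length = delay_line_length_m(delay_ns * 1e-9)
        area = packed_area_mm2(length)
        print(f"{delay_ns:5.1f} ns -> {length * 100:7.2f} cm, ~{area:6.3f} mm^2")
```

Under these assumptions a 1 ns delay already needs about 7 cm of waveguide, and both length and footprint grow linearly with the stored delay.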

# IV. ENERGY-LATENCY ANALYSIS

In order to establish a comparison between emerging memory technologies in the optical domain (O) with their electronic (E) counterparts, we can utilize the figure of merit defined as the READ-WRITE operations ratio, as well as the overall energy and latency cost when considering E/O and O/E conversions.

## A. READ Operation

For an ideal photonic memory based on PCMs or another nonvolatile material platform, the READ operation requires the energy for the creation and detection of a single photon to access the stored data [80]. Considering a laser source, a memory insertion loss (0.0075 dB/bit [81]), and photodetector readout, the READ (access) energy of a photonic random-access memory (P-RAM) is <1 fJ/bit for an on-off-keyed signal at 30 GHz data rates, or about 10 fJ/bit for a higher bit resolution (e.g., PAM-16 for 4-bit resolution) [82], [83], [84], [85]. State-of-the-art SRAM using two inverters, which can be in one of two bistable states, has an access latency of 0.21 ns and costs about 5 pJ/bit per access [86], [87]. Energy and latency penalties increase when accessing data stored in SRAM cache memories, costing around 180 pJ and 1.66 ns per access for FinFET-based technologies [88]. Thus, for MAC operations with memory access, a generic photonic link offers $10$ to $100\times$ higher MAC/s/J/access than SRAM, highlighting how a P-RAM can improve the performance of a computational processor.
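The READ figures quoted above can be combined into an energy-latency product (ELP) for a quick side-by-side comparison. The following is a minimal sketch, not the authors' analysis: the energies and the SRAM latencies are the values quoted in the text, while the P-RAM latency is an assumption, taken here as one bit period at the 30 GHz data rate. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadCost:
    name: str
    energy_j: float   # READ energy per bit (per access for the cache entry)
    latency_s: float  # access latency

    @property
    def energy_latency_product(self) -> float:
        return self.energy_j * self.latency_s


# <1 fJ/bit is an upper bound from the text; the latency of one bit
# period at 30 GHz is an ASSUMPTION made only for this sketch.
PRAM_OOK = ReadCost("P-RAM (OOK, 30 GHz)", 1e-15, 1.0 / 30e9)
SRAM_CELL = ReadCost("SRAM cell", 5e-12, 0.21e-9)
SRAM_CACHE = ReadCost("SRAM cache (FinFET)", 180e-12, 1.66e-9)


def elp_ratio(reference: ReadCost, candidate: ReadCost) -> float:
    """How many times larger the reference ELP is than the candidate ELP."""
    return reference.energy_latency_product / candidate.energy_latency_product


if __name__ == "__main__":
    for cost in (PRAM_OOK, SRAM_CELL, SRAM_CACHE):
        print(f"{cost.name:22s} ELP = {cost.energy_latency_product:.3e} J*s")
    print(f"SRAM cell / P-RAM ELP ratio: {elp_ratio(SRAM_CELL, PRAM_OOK):.3g}")
```

With these inputs the SRAM-cell ELP is roughly four orders of magnitude larger than the P-RAM ELP; the result scales directly with whatever P-RAM latency is assumed.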

Table III presents a comparison of various relevant electronic and optical memory technologies. It is worth noting that these technologies should be compared on an apples-to-apples basis, i.e., electronic DRAM (eDRAM cell) with optical DRAM (P-RAM), SRAM with optical SRAM (O-SRAM), and electronic cache (SRAM cache) with optical cache. According to the data, it is clear that optical memories generally have larger footprints by two or three orders of magnitude compared to their electronic counterparts. On the other hand, optical memories tend to have lower latency by two or three orders of magnitude. Finally, optical memories tend to have lower read energy by an order of magnitude.

## B. WRITE Operation

When writing data to a P-RAM cell, it is necessary to trigger the phase transition of the chalcogenide material, switch the ferroelectric domains, etc. This leads to a strong modulation of optical properties (phase for materials such as $Sb_2Se_3$ and BTO, or amplitude for materials such as GST, GSST, and GSSe). In the case of PCMs, local annealing is used to switch the material, typically using either all-optical heating or an on-chip electro-thermal microheater (e.g., ITO, doped silicon, or metal heaters [82], [84], [92]). This multilevel, ultra-compact approach using PCMs with low IL (such as GSST and GSSe [80], [81]) enables highly efficient fixed weight banks with low power consumption. Compared with writing to SRAM cells, the writing of P-RAM based on (Joule) heating is limited by heat propagation and thus requires higher writing energies (few pJ to sub-nJ for all-optical approaches [92] and few nJ for integrated microheaters [93]), as well as higher latency (sub-$\mu$s). In comparison, the SRAM address line, which opens and closes the access switch and controls the transistors that permit reading, supports a writing speed of $\sim 1$ to 2 ns per access with an associated energy down to <10 pJ/bit. However, unlike volatile SRAM, which needs a constant external voltage after the information is written to compensate for current leakage ($\sim 2$ nW/bit [88], [90]), PCM-based nonvolatile P-RAM does not require continuous external energy after the information is written. Thus, a given state of the PCM can be maintained passively over the long term. From an energy perspective, PCM-based P-RAM is more suitable for applications that do not require frequent updates and instead require low-cost, long-term data storage that can be rapidly accessed once the information is written. In fact, there is a point beyond which P-RAM becomes more energy efficient than SRAM for storing information (Fig. 4). For novel PCM materials, researchers might look for compounds with lower switching temperatures to further reduce the WRITE energy of the P-RAM, thereby reducing the threshold time beyond which P-RAM is more efficient than SRAM for storing information.

Fig. 4. Trend of total energy consumption for writing over time for P-RAM and SRAM. PCM-based P-RAM does not require additional energy once written, while FE-based P-RAM requires a DC voltage to read the information. SRAM requires constant power to overcome internal leakage, a cost that becomes more prominent when a DAC and E/O conversion are required to interface with the optical waveguides.
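The crossover described above (and sketched in Fig. 4) can be estimated with simple arithmetic: the one-time P-RAM write energy is set against the SRAM write energy plus leakage accumulated over the holding time. The sketch below is a simplified illustration, not the model behind Fig. 4. It uses the SRAM values quoted in the text (<10 pJ/bit write, about 2 nW/bit leakage) as defaults, treats the P-RAM write energy as a per-bit input chosen by the caller, and ignores DAC and E/O conversion costs; the function names are hypothetical.

```python
def storage_energy_j(write_j: float, static_w: float, hold_time_s: float) -> float:
    """Energy to write once and then hold the value for `hold_time_s`."""
    return write_j + static_w * hold_time_s


def break_even_time_s(pram_write_j: float,
                      sram_write_j: float = 10e-12,
                      sram_leak_w: float = 2e-9) -> float:
    """Holding time beyond which nonvolatile P-RAM uses less energy than SRAM.

    Assumes zero static power for PCM-based P-RAM, as stated in the text.
    Returns 0.0 if the P-RAM write is already cheaper than the SRAM write.
    """
    if sram_leak_w <= 0:
        raise ValueError("SRAM leakage power must be positive")
    extra_write_energy = pram_write_j - sram_write_j
    return max(extra_write_energy, 0.0) / sram_leak_w


if __name__ == "__main__":
    # Illustrative write energies within the ranges quoted in the text
    # ("sub-nJ" all-optical, "few nJ" microheater); the exact values are assumptions.
    for label, energy in (("all-optical, 0.5 nJ", 0.5e-9), ("microheater, 2 nJ", 2e-9)):
        print(f"{label:22s} break-even after {break_even_time_s(energy):.3f} s")
```

For write energies in the nJ range and nW-level leakage, the break-even holding time is on the order of a second, which is why infrequently updated weights favor the nonvolatile option.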

## C. Electrical-Optical Conversion

Conversion between the electrical and optical domains is already an overhead cost that many systems pay every day. Assessing the cost of these conversions in terms of power and latency shapes the system design and choice of memory, especially when considering neural networks. For electronic memories such as SRAM, the electrical signal needs to go through a DAC ($\sim$1 nJ and $\sim$3 ns [94]), a driving amplifier, and an electro-optical modulator to be converted into an optical signal. In the same fashion, the detected optical signal requires a trans-impedance amplifier (TIA) and an ADC to convert the processed data back to the electronic domain [39]. In this kind of architecture, where each step of the network has to perform an E/O/E conversion, scaling to multiple processing layers introduces several problems, such as the need to buffer intermediate information in an SRAM cache, as well as limits on the latency and efficiency of the network imposed by the DACs and ADCs.

Fig. 5. A roadmap for optical computing is presented, emphasizing the current technologies and future advancements that are necessary to achieve a high-performance, state-of-the-art computing unit.

A fully optical network, in which the weights are stored in a nonvolatile fashion by means of P-RAM elements [28], [74], [95] and the signals are converted to the optical domain once and converted back once at the end of the network, would take full advantage of the wide bandwidth, extremely low latency, and low energy consumption offered by the optical domain. However, a lack of efficient, nonlinear optical elements with low optical threshold powers currently limits the practicality of this approach for deep neural networks.
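To illustrate how conversion overhead scales with network depth, the sketch below counts only the DAC cost quoted in the text ($\sim$1 nJ and $\sim$3 ns per conversion [94]). It is a deliberately incomplete estimate: driver, modulator, TIA, and ADC costs are omitted because no figures are given for them here, and conversions within one layer are assumed to run in parallel so that each conversion stage adds latency only once. The function and parameter names are hypothetical.

```python
DAC_ENERGY_J = 1e-9   # ~1 nJ per conversion, as quoted in the text
DAC_LATENCY_S = 3e-9  # ~3 ns per conversion, as quoted in the text


def dac_overhead(num_layers: int,
                 convert_every_layer: bool,
                 conversions_per_layer: int = 1) -> tuple:
    """Return (total DAC energy in J, total DAC latency in s).

    convert_every_layer=True models an E/O/E conversion at each layer;
    False models a single conversion at the network input.
    Assumes conversions inside a layer happen in parallel.
    """
    if num_layers < 1 or conversions_per_layer < 1:
        raise ValueError("layer and conversion counts must be >= 1")
    stages = num_layers if convert_every_layer else 1
    energy = stages * conversions_per_layer * DAC_ENERGY_J
    latency = stages * DAC_LATENCY_S
    return energy, latency


if __name__ == "__main__":
    for per_layer in (True, False):
        energy, latency = dac_overhead(num_layers=8, convert_every_layer=per_layer)
        mode = "E/O/E at every layer" if per_layer else "single E/O at input"
        print(f"{mode:22s}: {energy * 1e9:.1f} nJ, {latency * 1e9:.1f} ns")
```

In this simplified count the per-layer scheme pays the DAC cost once per layer, while the fully optical scheme pays it once regardless of depth.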

# V. OUTLOOK AND ROADMAP FOR DATA STORAGE IN OPTICAL COMPUTING

## A. Roadmap for Electronic Memories for Optical Computing

Efficient integration of high-density electronic storage with analog optical computing platforms is a challenge that requires alleviating (or removing) the energy-consuming digital-to-analog and electro-optical conversions. The simplest solution is seemingly to adopt a completely analog technology using, for instance, memristors in the electrical domain directly integrated with photonic waveguides [96], [97], [98]. DACs for data input and ADCs for data output are not needed if the optical processor is communicating with an analog environment, and E/O conversion can be realized using the same memristive element. However, the world runs on digital technology, and computing with an analog architecture would certainly require data type conversion. The prospect of E/O conversion of digital signals using optical DACs (see Section III-A and Table I), and ideally also ADCs, opens the possibility of faster operations with simplified circuitry. The latency can also be further optimized by bringing the electronic memory bank closer to the photonic processor using monolithic co-integration of nanoelectronics and photonics rather than using two separate chiplets [38].

Moreover, novel modulation approaches for electro-optical conversion are necessary to avoid the widespread use of thermo-optical control, which faces serious heating issues when scaling to hundreds of simultaneously operating devices. Similarly, faster carrier-based modulation faces high IL and large form factors, both of which are detrimental to computing tasks since the complexity of the photonic circuitry can afford neither. Optomechanical modulators [99], while still volatile unless using latches or bi-stability [100], [101], are potential CMOS-compatible platforms given their low insertion losses, low powers, and form factors comparable to thermo-optic modulators. Provided that CMOS integration is achieved in the future, optical modulators based on 2D materials could provide an even closer-to-optimal platform for energy-efficient modulation [102].

## B. Roadmap for Photonic Memories Based on Nonvolatile Materials

Photonic integrated technologies, as available in current commercial foundries, must deal with large form factors due to waveguide footprints, a limitation that could be eased in the future by adopting smaller-node CMOS fabrication processes to achieve reliable nanophotonic structures [12]. The current form-factor limitation means that the storage densities of electronics, 10 Gb/mm<sup>2</sup> [103], are likely unachievable with photonic memories, especially those based on material platforms directly embedded into the photonic circuits. Yet, the prospect of a novel optical memory class that, despite the lower storage density, can contribute to and enhance the performance of the memory hierarchy in hybrid optoelectronic architectures (especially photonic computational memory) is enough to motivate the development of an "ideal" photonic memory. The target performance metrics for optical memories (described in detail in Section II) are ultimately determined by the computing task at hand, just like the different electronic technologies in a von Neumann computer's memory hierarchy. Whether volatile or nonvolatile, written with higher or lower frequency, etc., some features that any ideal photonic memory should have include:

- 1. CMOS compatibility for guaranteed scalability
- 2. Low IL comparable to the propagation loss of the platform (<1 dB/cm)
- 3. READ and WRITE energy consumption of <fJ and fJ-pJ, respectively
- 4. Large modulation depths >10 dB for amplitude modulation and at least  $2\pi$  for phase modulation
- 5. WRITE cyclability  $>10^8$
- 6. Precision and stability that are not compromised by environmental effects such as temperature or material degradation.
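The quantitative targets in this list can be expressed as a simple screening check for a candidate device. The sketch below is only one possible reading of the list: it interprets "<fJ" as below 1 fJ and "fJ-pJ" as at most 1 pJ, treats the amplitude and phase modulation targets as alternatives (either one suffices), and leaves out the qualitative item on precision and stability. The class, field names, and example device values are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MemoryCandidate:
    cmos_compatible: bool
    insertion_loss_db_per_cm: float
    read_energy_j: float
    write_energy_j: float
    amplitude_depth_db: float  # 0.0 for a phase-only device
    phase_range_rad: float     # 0.0 for an amplitude-only device
    write_cycles: float


def unmet_targets(c: MemoryCandidate) -> List[str]:
    """Return the names of the listed targets that the candidate misses."""
    missed = []
    if not c.cmos_compatible:
        missed.append("CMOS compatibility")
    if not c.insertion_loss_db_per_cm < 1.0:
        missed.append("insertion loss")
    if not c.read_energy_j < 1e-15:    # assumed reading of "<fJ"
        missed.append("READ energy")
    if not c.write_energy_j <= 1e-12:  # assumed reading of "fJ-pJ"
        missed.append("WRITE energy")
    if not (c.amplitude_depth_db > 10.0 or c.phase_range_rad >= 2 * math.pi):
        missed.append("modulation depth")
    if not c.write_cycles > 1e8:
        missed.append("WRITE cyclability")
    return missed


if __name__ == "__main__":
    # Hypothetical device: meets every target except endurance and write energy.
    device = MemoryCandidate(True, 0.5, 0.5e-15, 2e-9, 12.0, 0.0, 5e5)
    print(unmet_targets(device))
```

A check of this kind makes it easy to see which metric is the limiting one for a given technology before comparing it against the needs of a specific computing task.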

Despite the challenges described in Section III, there is still ample room for improved performance in nonvolatile photonic memory technologies. For instance, even though PCM photonic memories come with limited endurance today ($> 5 \times 10^5$ cycles [81]), there does not appear to be any intrinsic limitation that precludes them from reaching the endurance levels attained in PCM-based RF switches ($1.5 \times 10^8$ cycles [104]) and electronic memories ($> 2 \times 10^{12}$ cycles [105]). Their energy consumption can also be minimized by searching for new PCM compositions with reduced liquidus temperature and fast crystallization kinetics or by reducing the device's effective area through thermal engineering [106]. On the other hand, the development of new FE crystals compatible with CMOS backend processing, such as HfO<sub>2</sub>-based oxide alloys [107], [108], could potentially facilitate their integration with standard photonic integrated circuits. Finally, other emerging nonvolatile integrated photonics platforms may also prove useful for photonic memory applications [109], [110], [111]. Whether backend, frontend, or eventually fully integrated into CMOS fabrication processes, the novel active-material-based approaches require scalable fabrication to guarantee high-density photonic architectures and mass production.

## C. Optical Memories in Edge/Cloud Computing

Alleviating the von Neumann bottleneck, especially when fiber optics are used to store and fetch data (as is commonly done in data centers for cloud computing), is the longstanding promise of optical memories in conventional computers. This task is yet to be demonstrated given the complexity of realizing high-density optical storage, mostly due to the lack of fully CMOS-compatible platforms and their large footprints. On the other hand, the development of fully integrated optical or electronic memory with a photonic processor in either von Neumann [112] or brain-inspired architectures [8], [13], [113], together with integrated light sources and photodetectors, can lead to the development of packaged devices with the portability and processing capacity required to enhance edge computing. Inference [13], [29] and high-throughput matrix-vector multiplications [12], [95] have already led to outstanding, high-performance demonstrations using on-chip photonic processors, systems that can be integrated into future edge computing devices.

#### ACKNOWLEDGMENT

N.Y. acknowledges support from the University of Pittsburgh Momentum Fund. C.R. acknowledges support from the Minta Martin Foundation through the University of Maryland. V.J.S. acknowledges support from the George Washington University Nanofabrication and Imaging Center (GWNIC).

#### REFERENCES

- [1] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," *Nature*, vol. 521, no. 7553, pp. 436–444, May 2015, doi: 10.1038/nature14539.
- [2] M. Soltanolkotabi, A. Javanmard, and J. D. Lee, "Theoretical insights into the optimization landscape of over-parameterized shallow neural networks," *IEEE Trans. Inf. Theory*, vol. 65, no. 2, pp. 742–769, Feb. 2019, doi: 10.1109/TIT.2018.2854560.

- [3] N. C. Thompson, K. Greenewald, K. Lee, and G. F. Manso, "The computational limits of deep learning," MIT Initiative on the Digital Economy Research Brief, vol. 4, 2020. [Online]. Available: http://arxiv.org/abs/2007.05558
- [4] N. P. Jouppi et al., "In-datacenter performance analysis of a tensor processing unit," in *Proc. 44th Annu. Int. Symp. Comput. Architecture*, 2017, pp. 1–12, doi: 10.1145/3079856.3080246.
- [5] A. Sebastian, M. le Gallo, R. Khaddam-Aljameh, and E. Eleftheriou, "Memory devices and applications for in-memory computing," *Nature Nanotechnol.*, vol. 15, pp. 529–544, 2020, doi: 10.1038/s41565-020-0655-z.
- [6] D. A. B. Miller, "Are optical transistors the logical next step?," *Nature Photon.*, vol. 4, no. 1, pp. 3–5, Jan. 2010, doi: 10.1038/nphoton.2009.240.
- [7] D. A. B. Miller, "Attojoule optoelectronics for low-energy information processing and communications," *J. Lightw. Technol.*, vol. 35, no. 3, pp. 346–396, Feb. 2017, doi: 10.1109/JLT.2017.2647779.
- [8] B. J. Shastri et al., "Photonics for artificial intelligence and neuromorphic computing," *Nature Photon.*, vol. 15, no. 2, pp. 102–114, Feb. 2021, doi: 10.1038/s41566-020-00754-y.
- [9] G. Wetzstein et al., "Inference in artificial intelligence with deep optics and photonics," *Nature*, vol. 588, no. 7836, pp. 39–47, Dec. 2020, doi: 10.1038/s41586-020-2973-6.
- [10] N. C. Harris et al., "Linear programmable nanophotonic processors," *Optica*, vol. 5, no. 12, pp. 1623–1631, Dec. 2018, doi: 10.1364/OPTICA.5.001623.
- [11] K. Hinton, G. Raskutti, P. M. Farrell, and R. S. Tucker, "Switching energy and device size limits on digital photonic signal processing technologies," *IEEE J. Sel. Topics Quantum Electron.*, vol. 14, no. 3, pp. 938–945, May/Jun. 2008, doi: 10.1109/JSTQE.2008.916242.
- [12] M. A. Nahmias et al., "Photonic multiply-accumulate operations for neural networks," *IEEE J. Sel. Topics Quantum Electron.*, vol. 26, no. 1, Jan./Feb. 2020, Art. no. 7701518, doi: 10.1109/JSTQE.2019.2941485.
- [13] Y. Shen et al., "Deep learning with coherent nanophotonic circuits," *Nature Photon.*, vol. 11, no. 7, pp. 441–446, Jun. 2017, doi: 10.1038/nphoton.2017.93.
- [14] T. Zhou et al., "Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit," *Nature Photon.*, vol. 15, no. 5, pp. 367–373, May 2021, doi: 10.1038/s41566-021-00796-w.
- [15] X. Lin et al., "All-optical machine learning using diffractive deep neural networks," *Science*, vol. 361, no. 6406, pp. 1004–1008, Sep. 2018, doi: 10.1126/science.aat8084.
- [16] X. Xu et al., "11 TOPS photonic convolutional accelerator for optical neural networks," *Nature*, vol. 589, no. 7840, pp. 44–51, Jan. 2021, doi: 10.1038/s41586-020-03063-0.
- [17] J. Feldmann et al., "Parallel convolutional processing using an integrated photonic tensor core," *Nature*, vol. 589, no. 7840, pp. 52–58, Jan. 2021, doi: 10.1038/s41586-020-03070-1.
- [18] A. N. Tait, M. A. Nahmias, B. J. Shastri, and P. R. Prucnal, "Broadcast and weight: An integrated network for scalable photonic spike processing," *J. Lightw. Technol.*, vol. 32, no. 21, pp. 4029–4041, Nov. 2014, doi: 10.1109/JLT.2014.2345652.
- [19] M. Y.-S. Fang, S. Manipatruni, C. Wierzynski, A. Khosrowshahi, and M. R. DeWeese, "Design of optical neural networks with component imprecisions," *Opt. Exp.*, vol. 27, no. 10, pp. 14009–14029, May 2019, doi: 10.1364/OE.27.014009.
- [20] F. Zokaee et al., "LightBulb: A photonic-nonvolatile-memory-based accelerator for binarized convolutional neural networks," in *Proc. IEEE/ACM Des., Automat. Test Europe Conf. Exhib.*, 2020, pp. 1438–1443.
- [21] I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio, "Binarized neural networks," in *Proc. Adv. Neural Inf. Process. Syst.*, 2016, vol. 29, pp. 4107–4115.
- [22] Y. Umuroglu et al., "FINN," in Proc. ACM/SIGDA Int. Symp. Field-Programmable Gate Arrays, 2017, pp. 65–74, doi: 10.1145/3020078.3021744.
- [23] H. Noh, T. You, J. Mun, and B. Han, "Regularizing deep neural networks by noise: Its interpretation and optimization," in *Proc. Adv. Neural Inf. Process. Syst.*, vol. 30, 2017, Paper 2653. [Online]. Available: https://proceedings.neurips.cc/paper/2017/file/217e342fc01668b10cb1188d40d3370e-Paper.pdf
- [24] Z. You, J. Ye, K. Li, Z. Xu, and P. Wang, "Adversarial noise layer: Regularize neural network by adding noise," in *Proc. IEEE Int. Conf. Image Process.*, 2019, pp. 909–913, doi: 10.1109/ICIP.2019.8803055.
- [25] C. Wu et al., "Harnessing optoelectronic noises in a photonic generative network," *Sci. Adv.*, vol. 8, no. 3, Jan. 2022, Art. no. eabm2956, doi: 10.1126/sciadv.abm2956.

- [26] A. N. Tait, "Quantifying power in silicon photonic neural networks," *Phys. Rev. Appl.*, vol. 17, no. 5, May 2022, Art. no. 054029, doi: 10.1103/PhysRevApplied.17.054029.
- [27] N. Youngblood, "Coherent photonic crossbar arrays for large-scale matrix-matrix multiplication," *IEEE J. Sel. Topics Quantum Electron.*, vol. 29, no. 2, Mar./Apr. 2023, Art. no. 6100211, doi: 10.1109/JSTQE.2022.3171167.
- [28] M. Miscuglio and V. J. Sorger, "Photonic tensor cores for machine learning," *Appl. Phys. Rev.*, vol. 7, no. 3, Sep. 2020, Art. no. 031404, doi: 10.1063/5.0001942.
- [29] J. Feldmann, N. Youngblood, C. D. Wright, H. Bhaskaran, and W. H. P. Pernice, "All-optical spiking neurosynaptic networks with self-learning capabilities," *Nature*, vol. 569, no. 7755, pp. 208–214, May 2019, doi: 10.1038/s41586-019-1157-8.
- [30] J. Feldmann et al., "Calculating with light using a chip-scale all-optical abacus," *Nature Commun.*, vol. 8, no. 1, Dec. 2017, Art. no. 1256, doi: 10.1038/s41467-017-01506-3.
- [31] A. Spinelli, C. Compagnoni, and A. Lacaita, "Reliability of NAND flash memories: Planar cells and emerging issues in 3D devices," *Computers*, vol. 6, no. 2, Apr. 2017, Art. no. 16, doi: 10.3390/computers6020016.
- [32] R. Amin et al., "ITO-based electro-absorption modulator for photonic neural activation function," *APL Mater.*, vol. 7, no. 8, Aug. 2019, Art. no. 081112, doi: 10.1063/1.5109039.
- [33] S. Moazeni et al., "A 40-Gb/s PAM-4 transmitter based on a ring-resonator optical DAC in 45-nm SOI CMOS," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3503–3516, Dec. 2017, doi: 10.1109/JSSC.2017.2748620.
- [34] P. Dong et al., "Thermally tunable silicon racetrack resonators with ultralow tuning power," *Opt. Exp.*, vol. 18, no. 19, pp. 20298–20304, Sep. 2010, doi: 10.1364/OE.18.020298.
- [35] E. Timurdogan et al., "An ultralow power athermal silicon modulator," *Nature Commun.*, vol. 5, no. 1, Sep. 2014, Art. no. 4008, doi: 10.1038/ncomms5008.
- [36] X. Wu et al., "A 20Gb/s NRZ/PAM-4 1V transmitter in 40nm CMOS driving a Si-photonic modulator in 0.13μm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, 2013, pp. 128–129, doi: 10.1109/ISSCC.2013.6487667.
- [37] N. C. Harris et al., "Efficient, compact and low loss thermo-optic phase shifter in silicon," *Opt. Exp.*, vol. 22, no. 9, pp. 10487–10493, May 2014, doi: 10.1364/OE.22.010487.
- [38] A. H. Atabaki et al., "Integrating photonics with silicon nanoelectronics for the next generation of systems on a chip," *Nature*, vol. 556, no. 7701, pp. 349–354, 2018, doi: 10.1038/s41586-018-0028-z.
- [39] F. Ashtiani, A. J. Geers, and F. Aflatouni, "An on-chip photonic deep neural network for image classification," *Nature*, vol. 606, no. 7914, pp. 501–506, Jun. 2022, doi: 10.1038/s41586-022-04714-0.
- [40] D. A. B. Miller, "Self-configuring universal linear optical component," *Photon. Res.*, vol. 1, no. 1, pp. 1–15, Jun. 2013, doi: 10.1364/PRJ.1.000001.
- [41] M. Reck, A. Zeilinger, H. J. Bernstein, and P. Bertani, "Experimental realization of any discrete unitary operator," *Phys. Rev. Lett.*, vol. 73, no. 1, pp. 58–61, Jul. 1994, doi: 10.1103/PhysRevLett.73.58.
- [42] W. R. Clements, P. C. Humphreys, B. J. Metcalf, W. S. Kolthammer, and I. A. Walmsley, "Optimal design for universal multiport interferometers," *Optica*, vol. 3, no. 12, pp. 1460–1465, Dec. 2016, doi: 10.1364/OPTICA.3.001460.
- [43] D. A. B. Miller, "Self-aligning universal beam coupler," Opt. Exp., vol. 21, no. 5, pp. 6360–6370, Mar. 2013, doi: 10.1364/OE.21.006360.
- [44] A. Totovic, G. Giamougiannis, A. Tsakyridis, D. Lazovsky, and N. Pleros, "Programmable photonic neural networks combining WDM with coherent linear optics," *Sci. Rep.*, vol. 12, no. 1, Apr. 2022, Art. no. 5605, doi: 10.1038/s41598-022-09370-y.
- [45] W. Bogaerts et al., "Silicon microring resonators," *Laser Photon. Rev.*, vol. 6, no. 1, pp. 47–73, Jan. 2012, doi: 10.1002/lpor.201100017.
- [46] C. Huang et al., "Demonstration of scalable microring weight bank control for large-scale photonic integrated circuits," *APL Photon.*, vol. 5, no. 4, Apr. 2020, Art. no. 040803, doi: 10.1063/1.5144121.
- [47] M. R. Watts et al., "Adiabatic thermo-optic Mach–Zehnder switch," Opt. Lett., vol. 38, no. 5, pp. 733–735, Mar. 2013, doi: 10.1364/OL.38.000733.
- [48] A. H. Atabaki, A. A. Eftekhar, S. Yegnanarayanan, and A. Adibi, "Sub-100-nanosecond thermal reconfiguration of silicon photonic devices," *Opt. Exp.*, vol. 21, no. 13, pp. 15706–15718, Jul. 2013, doi: 10.1364/OE.21.015706.
- [49] T. Alexoudi et al., "III–V-on-Si photonic crystal nanocavity laser technology for optical static random access memories," *IEEE J. Sel. Topics Quantum Electron.*, vol. 22, no. 6, pp. 295–304, Nov./Dec. 2016, doi: 10.1109/JSTQE.2016.2593636.

- [50] G. Berrettini, L. Poti, and A. Bogoni, "Optical dynamic RAM for all-optical digital processing," *IEEE Photon. Technol. Lett.*, vol. 23, no. 11, pp. 685–687, Jun. 2011, doi: 10.1109/LPT.2011.2123087.
- [51] A. Tsakyridis, T. Alexoudi, A. Miliou, N. Pleros, and C. Vagionas, "10 Gb/s optical random access memory (RAM) cell," *Opt. Lett.*, vol. 44, no. 7, pp. 1821–1824, Apr. 2019, doi: 10.1364/OL.44.001821.
- [52] C. Rios et al., "Integrated all-photonic non-volatile multi-level memory," *Nature Photon.*, vol. 9, no. 11, pp. 725–732, Oct. 2015, doi: 10.1038/nphoton.2015.182.
- [53] J. Geler-Kremer et al., "A ferroelectric multilevel non-volatile photonic phase shifter," *Nature Photon.*, vol. 16, pp. 491–497, May 2022, doi: 10.1038/s41566-022-01003-0.
- [54] C. A. Barrios and M. Lipson, "Silicon photonic read-only memory," J. Lightw. Technol., vol. 24, no. 7, pp. 2898–2905, Jul. 2006, doi: 10.1109/JLT.2006.875964.
- [55] H. Zhang et al., "Miniature multilevel optical memristive switch using phase change material," ACS Photon., vol. 6, no. 9, pp. 2205–2212, Sep. 2019, doi: 10.1021/acsphotonics.9b00819.
- [56] J. Zheng et al., "Nonvolatile electrically reconfigurable integrated photonic switch enabled by a silicon PIN diode heater," *Adv. Mater.*, vol. 32, no. 31, Jun. 2020, Art. no. 2001218, doi: 10.1002/adma.202001218.
- [57] N. Farmakidis et al., "Electronically reconfigurable photonic switches incorporating plasmonic structures and phase change materials," *Adv. Sci.*, vol. 9, 2022, Art. no. 2200383, doi: 10.1002/ADVS.202200383.
- [58] M. Grajower, N. Mazurski, J. Shappir, and U. Levy, "Nonvolatile silicon photonics using nanoscale flash memory technology," *Laser Photon. Rev.*, vol. 12, no. 4, Apr. 2018, Art. no. 1700190, doi: 10.1002/LPOR.201700190.
- [59] Y. Zhang et al., "Electrically reconfigurable non-volatile metasurface using low-loss optical phase-change material," *Nature Nanotechnol.*, vol. 16, pp. 661–666, 2021, doi: 10.1038/s41565-021-00881-9.
- [60] M. Mishra, N. R. Das, A. Melloni, and F. Morichetti, "Modelling domain switching of ferroelectric BaTiO3 integrated in silicon photonic waveguides," *Opt. Commun.*, vol. 448, pp. 19–25, 2019, doi: 10.1016/J.OPTCOM.2019.05.001.
- [61] X. Li et al., "Fast and reliable storage using a 5 bit, nonvolatile photonic memory cell," *Optica*, vol. 6, no. 1, pp. 1–6, Jan. 2019, doi: 10.1364/OPTICA.6.000001.
- [62] D. Yao et al., "Energy-efficient non-volatile ferroelectric based electrostatic doping multilevel optical readout memory," *Opt. Exp.*, vol. 30, no. 8, pp. 13572–13582, Apr. 2022, doi: 10.1364/OE.456048.
- [63] C. Wu et al., "Programmable phase-change metasurfaces on waveguides for multimode photonic convolutional neural network," *Nature Commun.*, vol. 12, 2021, Art. no. 96, doi: 10.1038/s41467-020-20365-z.
- [64] C. Ríos et al., "In-memory computing on a photonic platform," Sci. Adv., vol. 5, no. 2, Feb. 2019, Art. no. eaau5759, doi: 10.1126/sciadv.aau5759.
- [65] J. Feldmann et al., "Parallel convolutional processing using an integrated photonic tensor core," *Nature*, vol. 589, no. 7840, pp. 52–58, Jan. 2021, doi: 10.1038/s41586-020-03070-1.
- [66] J. F. Song et al., "Integrated photonics with programmable nonvolatile memory," *Sci. Rep.*, vol. 6, no. 1, Mar. 2016, Art. no. 22616, doi: 10.1038/srep22616.
- [67] C. Ríos et al., "Ultra-compact nonvolatile photonics based on electrically reprogrammable transparent phase change materials," *PhotoniX*, vol. 3, 2022, Art. no. 26, doi: 10.1186/s43074-022-00070-4.
- [68] N. Youngblood, "Coherent photonic crossbar arrays for large-scale matrix-matrix multiplication," *IEEE J. Sel. Topics Quantum Electron.*, vol. 29, no. 2, Mar./Apr. 2023, Art. no. 6100211, doi: 10.1109/JSTQE.2022.3171167.
- [69] Y. Zhang et al., "Myths and truths about optical phase change materials: A perspective," *Appl. Phys. Lett.*, vol. 118, no. 21, May 2021, Art. no. 210501, doi: 10.1063/5.0054114.
- [70] L. Martin-Monier et al., "Endurance of chalcogenide optical phase change materials: A review," *Opt. Mater. Exp.*, vol. 12, no. 6, pp. 2145–2167, Jun. 2022, doi: 10.1364/ome.456428.
- [71] R. Cao et al., "Improvement of endurance in HZO-based ferroelectric capacitor using Ru electrode," *IEEE Electron Device Lett.*, vol. 40, no. 11, pp. 1744–1747, Nov. 2019, doi: 10.1109/LED.2019.2944960.
- [72] H. Zhang et al., "Miniature multilevel optical memristive switch using phase change material," ACS Photon., vol. 6, no. 9, pp. 2205–2212, Sep. 2019, doi: 10.1021/acsphotonics.9b00819.
- [73] N. Farmakidis et al., "Electronically reconfigurable photonic switches incorporating plasmonic structures and phase change materials," *Adv. Sci.*, vol. 9, 2022, Art. no. 2200383, doi: 10.1002/advs.202200383.

- [74] C. Wu et al., "Programmable phase-change metasurfaces on waveguides for multimode photonic convolutional neural network," *Nature Commun.*, vol. 12, no. 1, Dec. 2021, Art. no. 96, doi: 10.1038/s41467-020-20365-z.
- [75] J. Geler-Kremer et al., "A ferroelectric multilevel non-volatile photonic phase shifter," *Nature Photon.*, vol. 16, no. 7, pp. 491–497, Jul. 2022, doi: 10.1038/s41566-022-01003-0.
- [76] J.-F. Song et al., "Integrated photonics with programmable nonvolatile memory," *Sci. Rep.*, vol. 6, no. 1, Mar. 2016, Art. no. 22616, doi: 10.1038/srep22616.
- [77] K. Vandoorne et al., "Experimental demonstration of reservoir computing on a silicon photonics chip," *Nature Commun.*, vol. 5, pp. 1–6, 2014, doi: 10.1038/ncomms4541.
- [78] D. Brunner et al., "Tutorial: Photonic neural networks in delay systems," J. Appl. Phys., vol. 124, no. 15, Oct. 2018, Art. no. 152004, doi: 10.1063/1.5042342.
- [79] S. Xu, J. Wang, S. Yi, and W. Zou, "High-order tensor flow processing using integrated photonic circuits," vol. 13, Dec. 2021, Art. no. 7970.
- [80] Y. Zhang et al., "Broadband transparent optical phase change materials for high-performance nonvolatile photonics," *Nature Commun.*, vol. 10, no. 1, Dec. 2019, Art. no. 4279, doi: 10.1038/s41467-019-12196-4.
- [81] J. Meng et al., "Electrical programmable low-loss high cyclable nonvolatile photonic random-access memory," 2022. Accessed: Jul. 05, 2022. [Online]. Available: https://arxiv.org/abs/2203.13337v4
- [82] R. Chen et al., "Broadband nonvolatile electrically controlled programmable units in silicon photonics," ACS Photon., vol. 9, no. 6, pp. 2142–2150, Jun. 2022, doi: 10.1021/acsphotonics.2c00452.
- [83] J. Meng, M. Miscuglio, and V. J. Sorger, "Multi-level nonvolatile photonic memories using broadband transparent phase change materials," in *Proc. OSA Adv. Photon. Congr.*, 2021, Paper IF3A.2, doi: 10.1364/IPRSN.2021.IF3A.2.
- [84] K. Kato, M. Kuwahara, H. Kawashima, T. Tsuruoka, and H. Tsuda, "Current-driven phase-change optical gate switch using indium-tin-oxide heater," *Appl. Phys. Exp.*, vol. 10, no. 7, 2017, Art. no. 072201, doi: 10.7567/APEX.10.072201.
- [85] X. Li et al., "Fast and reliable storage using a 5 bit, nonvolatile photonic memory cell," *Optica*, vol. 6, no. 1, pp. 1–6, Jan. 2019, doi: 10.1364/OP-TICA.6.000001.
- [86] X. Liang, K. Turgay, and D. Brooks, "Architectural power models for SRAM and CAM structures based on hybrid analytical/empirical techniques," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Des.*, 2007, pp. 824–830, doi: 10.1109/ICCAD.2007.4397367.
- [87] A. Biswas and A. P. Chandrakasan, "CONV-SRAM: An energy-efficient SRAM with in-memory dot-product computation for low-power convolutional neural networks," *IEEE J. Solid-State Circuits*, vol. 54, no. 1, pp. 217–230, Jan. 2019, doi: 10.1109/JSSC.2018.2880918.
- [88] D. P. Ravipati et al., "FN-CACTI: Advanced CACTI for Fin-FET and NC-FinFET technologies," *IEEE Trans. Very Large Scale Integration Syst.*, vol. 30, no. 3, pp. 339–352, Mar. 2022, doi: 10.1109/TVLSI.2021.3123112.
- [89] H. Li, M. Bhargava, P. N. Whatmough, and H.-S. P. Wong, "On-chip memory technology design space explorations for mobile deep neural network accelerators," in *Proc. 56th Annu. Des. Automat. Conf.*, 2019, pp. 1–6, doi: 10.1145/3316781.3317874.
- [90] A. Shafaei, Y. Wang, X. Lin, and M. Pedram, "FinCACTI: Architectural analysis and modeling of caches with deeply-scaled FinFET devices," in *Proc. IEEE Comput. Soc. Annu. Symp. VLSI*, 2014, pp. 290–295, doi: 10.1109/ISVLSI.2014.94.
- [91] P. Maniotis, D. Fitsios, G. T. Kanellos, and N. Pleros, "Optical buffering for chip multiprocessors: A 16GHz optical cache memory architecture," *J. Lightw. Technol.*, vol. 31, no. 24, pp. 4175–4191, Dec. 2013, doi: 10.1109/JLT.2013.2290741.
- [92] C. Ríos et al., "Integrated all-photonic non-volatile multi-level memory," *Nature Photon.*, vol. 9, no. 11, pp. 725–732, Sep. 2015, doi: 10.1038/nphoton.2015.182.
- [93] J. Zheng et al., "Nonvolatile electrically reconfigurable integrated photonic switch enabled by a silicon PIN diode heater," *Adv. Mater.*, vol. 32, no. 31, Aug. 2020, Art. no. 2001218, doi: 10.1002/adma.202001218.
- [94] W. S. Juanda and J. S. Chang, "A calibration-free/DEM-free 8-bit 2.4-GS/s single-core digital-to-analog converter with a distributed biasing scheme," *IEEE Trans. Very Large Scale Integration Syst.*, vol. 26, no. 11, pp. 2299–2309, Nov. 2018, doi: 10.1109/TVLSI.2018.2850919.
- [95] J. Feldmann et al., "Parallel convolutional processing using an integrated photonic tensor core," *Nature*, vol. 589, no. 7840, pp. 52–58, Jan. 2021, doi: 10.1038/s41586-020-03070-1.

- [96] B. M. Tossoun, X. Sheng, J. P. Strachan, D. Liang, and R. G. Beausoleil, "Hybrid memristor optoelectronic integrated circuits for optical computing," *Proc. SPIE*, vol. 12005, pp. 32–36, 2022, doi: 10.1117/12.2614073.
- [97] B. Tossoun, X. Sheng, J. P. Strachan, D. Liang, and R. G. Beausoleil, "Memristor Photonics," in *Proc. Photon. Switching Comput.*, 2021, Paper Tu5B.3, doi: 10.1364/PSC.2021.Tu5B.3.
- [98] K. Portner et al., "Analog nanoscale electro-optical synapses for neuromorphic computing applications," *ACS Nano*, vol. 15, no. 9, pp. 14776–14785, Sep. 2021, doi: 10.1021/acsnano.1c04654.
- [99] C. Errando-Herranz et al., "MEMS for photonic integrated circuits," *IEEE J. Sel. Topics Quantum Electron.*, vol. 26, no. 2, Mar./Apr. 2020, Art. no. 8200916, doi: 10.1109/JSTQE.2019.2943384.
- [100] H. Sattari, A. Toros, T. Graziosi, and N. Quack, "Bistable silicon photonic MEMS switches," *Proc. SPIE*, vol. 10931, 2019, Art. no. 109310D, doi: 10.1117/12.2507192.
- [101] P. Edinger et al., "A bistable silicon photonic MEMS phase switch for nonvolatile photonic circuits," in *Proc. IEEE 35th Int. Conf. Micro Electro Mech. Syst. Conf.*, 2022, pp. 995–997, doi: 10.1109/MEMS51670.2022.9699739.
- [102] I. Datta et al., "Low-loss composite photonic platform based on 2D semiconductor monolayers," *Nature Photon.*, vol. 14, no. 4, pp. 256–262, Apr. 2020, doi: 10.1038/s41566-020-0590-4.
- [103] A. Goda, "Recent progress on 3D NAND flash technologies," *Electronics*, vol. 10, no. 24, Dec. 2021, Art. no. 3156, doi: 10.3390/electronics10243156.
- [104] J.-S. Moon et al., "Reconfigurable infrared spectral imaging with phase change materials," *Proc. SPIE*, vol. 10982, 2019, Art. no. 109820X-1, doi: 10.1117/12.2519492.
- [105] W. Kim et al., "ALD-based confined PCM with a metallic liner toward unlimited endurance," in *Proc. IEEE Int. Electron Devices Meeting*, 2016, pp. 4.2.1–4.2.4, doi: 10.1109/IEDM.2016.7838343.
- [106] Z. Fang et al., "Ultra-low-energy programmable non-volatile silicon photonics based on phase-change materials with graphene heaters," *Nature Nanotechnol.*, vol. 17, no. 8, pp. 842–848, Jul. 2022, doi: 10.1038/s41565-022-01153-w.
- [107] M. Halter et al., "Back-end, CMOS-compatible ferroelectric field-effect transistor for synaptic weights," *ACS Appl. Mater. Interfaces*, vol. 12, no. 15, pp. 17725–17732, Apr. 2020, doi: 10.1021/acsami.0c00877.
- [108] J. Qin et al., "Enhanced second harmonic generation from ferroelectric HfO<sub>2</sub>-based hybrid metasurfaces," *ACS Nano*, vol. 13, no. 2, pp. 1213–1222, Feb. 2019, doi: 10.1021/acsnano.8b06308.
- [109] C. Lian et al., "Photonic (computational) memories: Tunable nanophotonics for data storage and computing," *Nanophotonics*, 2022, doi: 10.1515/NANOPH-2022-0089.
- [110] J. Parra, I. Olivares, A. Brimont, and P. Sanchis, "Toward nonvolatile switching in silicon photonic devices," *Laser Photon. Rev.*, vol. 15, no. 6, Jun. 2021, Art. no. 2000501, doi: 10.1002/LPOR.202000501.
- [111] Y. Zhai et al., "Toward non-volatile photonic memory: Concept, material and design," *Mater. Horiz.*, vol. 5, no. 4, pp. 641–654, 2018, doi: 10.1039/C8MH00110C.
- [112] A. Narayan, Y. Thonnart, P. Vivet, A. K. Coskun, and A. Joshi, "Architecting optically-controlled phase change memory," *ACM Trans. Archit. Code Optim.*, vol. 19, no. 4, Dec. 2022, Art. no. 48, doi: 10.1145/3533252.
- [113] V. Bangari et al., "Digital electronics and analog photonics for convolutional neural networks (DEAP-CNNs)," *IEEE J. Sel. Topics Quantum Electron.*, vol. 26, no. 1, Jan. 2020, Art. no. 7701213, doi: 10.1109/JSTQE.2019.2945540.



**Sadra Rahimi Kari** (Student Member, IEEE) received the B.S. degree in electrical and computer engineering from the University of Tabriz, Tabriz, Iran, in 2019. From 2020 to 2021, he was a Research Assistant with Istanbul Technical University, Istanbul, Turkey, where his primary focus was on efficient hardware implementation of ANNs. In 2022, he joined the University of Pittsburgh, Pittsburgh, PA, USA, to start his Ph.D. studies. He is currently working on developing photonic devices and architectures for machine learning.



**Carlos A. Ríos Ocampo (Carlos Ríos)** (Member, IEEE) received the B.S. degree in physics from the University of Antioquia, Medellín, Colombia, in 2010, the M.Sc. degree from the Karlsruhe Institute of Technology, Karlsruhe, Germany, in 2014, and the Ph.D. degree from the University of Oxford, Oxford, U.K., in 2017. He is currently an Assistant Professor with the University of Maryland, College Park, MD, USA. Prior to joining UMD, he was a Postdoctoral Associate with MIT between 2018 and 2021. His scientific interests focus on studying and developing new on-chip technologies driven by optical nanomaterials and nanophotonics.



**Lei Jiang** (Member, IEEE) received the B.S. and M.S. degrees from Shanghai Jiao Tong University, Shanghai, China, in 2006 and 2008, respectively, and the Ph.D. degree from the University of Pittsburgh, Pittsburgh, PA, USA, in 2014. He is currently an Assistant Professor with Indiana University Bloomington, Bloomington, IN, USA. His research interests include photonic hardware accelerator design and emerging nonvolatile memory technologies, e.g., PCM, STT-MRAM, and ReRAM.



**Jiawei Meng** (Student Member, IEEE) received the B.S. degree in computer engineering from Miami University, Oxford, OH, USA, in 2017, and the M.S. degree in electrical engineering in 2019 from George Washington University, Washington, DC, USA, where he is currently working toward the Ph.D. degree in electrical engineering. His research interests include photonic integrated circuit analysis and design and phase-change materials on integrated circuits, specifically for the design of photonic random-access memory.



**Dr. Nicola Peserico** (Member, IEEE) received the Ph.D. degree from the Politecnico di Milano, Milan, Italy, in 2018. In 2019, he joined Femtorays, Italy, a silicon photonics startup for biosensing. He is currently a Postdoc Researcher with the Department of Electrical and Computer Engineering, George Washington University, Washington, DC, USA. His research interests include silicon photonics, AI/ML accelerators, optoelectronic devices and components, and biosensing with photonic integrated circuits.



**Dr. Volker J. Sorger** (Senior Member, IEEE) is currently an Associate Professor with the Department of Electrical and Computer Engineering, the Director of the Institute on AI & Photonics, and the Head of the Devices and Intelligent Systems Laboratory, George Washington University, Washington, DC, USA. His research interests include devices and optoelectronics, AI/ML accelerators, mixed-signal ASICs, quantum matter and quantum processors, and cryptography. For his work, Dr. Sorger was the recipient of multiple awards, including the Presidential PECASE Award, AFOSR YIP Award, Emil Wolf Prize, and National Academy of Sciences award of the year. Dr. Sorger is currently an Associate Editor for OPTICA, serves on the board of Chip, and was formerly the Editor-in-Chief of Nanophotonics. He is a Fellow of Optica (formerly OSA), a Fellow of SPIE, and a Fellow of the German National Academic Foundation. He is a Co-Founder of Optelligence Company.



**Juejun Hu** (Member, IEEE) received the B.S. degree in materials science and engineering from Tsinghua University, Beijing, China, in 2004, and the Ph.D. degree in materials science and engineering from Massachusetts Institute of Technology (MIT), Cambridge, MA, USA, in 2009. He is currently a Professor of materials science and engineering with MIT. Prior to joining MIT, he was an Assistant Professor with the University of Delaware, Newark, DE, USA, from 2010 to 2014. His primary research interests include new optical materials exemplified by chalcogenide compounds, and enhanced photon-matter interactions in nanophotonic structures.



**Nathan Youngblood** (Member, IEEE) received the B.S. degree in physics from Bethel University, St. Paul, MN, USA, in 2011, and the Ph.D. degree in electrical engineering from the University of Minnesota, Minneapolis, MN, USA, in 2016, where he was involved in the integration of 2-D materials with silicon photonics for optoelectronic applications. After postdoctoral training with the University of Oxford, Oxford, U.K., he joined the Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA, in 2019. His research interests include integrated photonics, high-speed optoelectronics, artificial intelligence, and novel computing methods with light.