1 Introduction

Reservoir computing simplifies model training by keeping the randomly generated intermediate layer fixed and training only the output connections. Physical reservoirs, which replace this intermediate layer with a physical phenomenon, are attracting attention because they may be more efficient than conventional digital circuits [1]. To be effective, a physical reservoir requires nonlinearity, short-term memory, and high dimensionality. Various physical reservoirs have been studied, including optical [2], oscillator-based [3], and mechanical reservoirs [4].

With the rise of IoT, edge computing is becoming more prevalent in applications where centralized data processing is not suitable. Deep learning requires significant computational power, which makes it difficult to run on edge devices with limited hardware and energy resources. In addition, the demand for computing close to users calls for smaller devices.

Förster resonance energy transfer (FRET) is a phenomenon in which excited states are transferred between adjacent QDs depending on the QD types and the distance between them, as discussed in Chap. 4. A densely populated, randomly generated QD network therefore inherently exhibits nonlinear relationships between the input excitation light and the output fluorescence, and the excited states act as memory, which makes FRET behavior promising for physical reservoirs. However, the tiny size of QDs makes it difficult for photodiode (PD) arrays or image sensors to identify which QD emitted each photon. In addition, the short state-holding time and fluorescence lifetime require repeated sensing. Optical devices often use lenses and delay lines to overcome these issues, but they increase the device size.

This study proposes a simple structure that utilizes FRET for computing without lenses or delay lines and that can be easily miniaturized to address the issues mentioned above. The structure comprises tiny LEDs as excitation light sources, a sheet containing a QD network, a filter that blocks the excitation light while transmitting the fluorescence, and a photodiode array. As a first step, this study mainly considers a network of a single QD type for FRET-based reservoir computing. Using a FRET simulator, we confirm the feasibility of the device by mapping onto it several tasks that require nonlinearity. A proof-of-concept (PoC) device is then implemented using a commercial image sensor and a droplet sheet of QDs, and experimental validation shows that XOR and MNIST tasks can be performed with it. Finally, we discuss the energy advantage of the proposed computation.

2 Proposed Device Structure

2.1 Device Structure

We propose a device structure for FRET-based reservoir computing, which is shown in Fig. 1 [5]. The physical reservoir, which serves as the intermediate layer in reservoir computing, corresponds to the gray box in the figure. The device consists of a 2D-array light source (input), a sheet containing numerous randomly placed QDs, and a 2D PD array (output) arranged in a straight line. The light source provides the optical input that excites the QDs, and the PD measures the intensity of the fluorescence. To minimize the form factor of the device, lenses are not used in this structure. The light source, QDs, and PDs are intended to be stacked and housed in a single package in the expected final implementation.

Fig. 1 Proposed device structure. ©[2022] IEEE. Reprinted, with permission, from [5]

Fig. 2 Closer view near QDs. ©[2022] IEEE. Reprinted, with permission, from [5]

As mentioned earlier, the excited-state memory of QDs is short, and detecting fluorescence on a per-QD basis is impractical due to the size mismatch between QDs and PDs. To address these issues, the proposed structure includes a digital memory that forms a recurrent network. Additionally, single-photon detection is challenging due to the low sensitivity of typical PDs and the isotropic photon emission, so a stable reservoir output requires repeated input and accumulation. Taking this accumulation time into account, the reservoir operates discretely in time like a sequential circuit, and its recurrent behavior progresses via feedback through the memory. Finally, the digitized reservoir output is fed to a lightweight machine learning model, such as a linear support vector machine or ridge regression, to obtain the final output.

A closer view of the light source, QDs, and PDs is shown in Fig. 2, where a filter is used to eliminate the excitation light. High rejection-ratio bandpass filters, such as those in [6], are recommended for this purpose. Since QDs are much smaller than PDs and the photon emission is isotropic, each PD receives photons emitted by many QDs. Nonetheless, even in this configuration, the proposed structure can exploit a nonlinear input-output relationship suitable for reservoir computing.

Fig. 3 Network construction. ©[2022] IEEE. Reprinted, with permission, from [5]

2.2 Network Mapping

The proposed device structure enables a switching-matrix function through the feedback memory, allowing an echo state network (ESN) to be selectively mapped onto the device. An ESN is a non-physical reservoir implementation consisting of artificial neurons whose recurrent connections and weights are randomly determined and whose nonlinear transformation is performed by a nonlinear activation function [7].
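For reference, a minimal discrete-time ESN update can be sketched as follows; the reservoir size, weight ranges, spectral-radius scaling, and tanh activation are illustrative assumptions, not parameters of the proposed device.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20                                   # number of reservoir nodes (illustrative)
W_in = rng.uniform(-1.0, 1.0, (N, 1))    # random, fixed input weights
W = rng.uniform(-0.5, 0.5, (N, N))       # random, fixed recurrent weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # keep spectral radius below 1

def esn_step(x, u):
    """One discrete-time ESN update with a tanh activation."""
    return np.tanh(W @ x + W_in @ np.atleast_1d(u))

# Collect reservoir states for a scalar input sequence; only a linear readout
# trained on these states is learned, while W and W_in stay fixed.
x, states = np.zeros(N), []
for u in rng.random(100):
    x = esn_step(x, u)
    states.append(x.copy())
```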

This work proposes to adjust the memory switching matrix so that a compact, high-performing ESN is selectively mapped onto the proposed device structure. Figure 3 illustrates the correspondence between the ESN and the network on the proposed device, where a set of a 3\(\times \)3 light source array and a 3\(\times \)3 PD array is treated as one ESN node. The spatial and temporal overlap of the light from multiple light sources provides interaction among them, and the arrows from the PDs back to the light sources represent delayed PD outputs fed back as recurrent inputs. The weights can be set by adjusting the relative positions of the light sources and PDs. In the device, the nodes are separated in time using the external memory and placed sufficiently far apart in space to avoid unintended FRET between nodes. In addition to the conventional training of the output part, the feedback matrix also needs to be determined.

2.3 Experiments

2.3.1 Setup

For the experiments, QD networks are generated randomly based on the conditions described in Table 1, assuming the use of QD585 [8]. Other device parameters are also listed in Table 1. Two types of light sources are used: DC and pulsed sources. The intensity of the DC source corresponds to the input value, while for the pulsed source the input value is represented by the pulse count per unit time. Specifically, the input light is pulsed with a period of 10/(input value) [ns] and a constant pulse width of 1 ns.
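As a small illustration of the pulsed encoding, the sketch below converts a normalized input value into pulse start times; the observation window and the handling of a zero input are our own assumptions.

```python
def pulse_schedule(value, duration_ns=100.0, width_ns=1.0):
    """Pulse start times (ns) for the encoding: period = 10 / value [ns].

    A zero input is assumed to produce no pulses (our assumption).
    """
    if value <= 0:
        return []
    period_ns = 10.0 / value
    starts, t = [], 0.0
    while t + width_ns <= duration_ns:
        starts.append(t)
        t += period_ns
    return starts

# Example: value = 0.5 gives a 20 ns period, i.e., a 50 MHz pulse train.
print(pulse_schedule(0.5)[:3])   # [0.0, 20.0, 40.0]
```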

Table 1 Simulation setup

In this study, we utilized the simulator [9] introduced in Chap. 5 to simulate the behavior of QDs. The simulator employs a Monte Carlo method based on tRSSA [10] to stochastically simulate the state transitions of QDs, including excitation, FRET, fluorescence, and inactivation. Unlike the original FRET simulator, which only simulates the QD states, our simulation framework replicates the proposed device structure and simulates it as a complete device, with the FRET simulator as its core engine. For instance, the input light and the fluorescence decay with the square of the distance in our framework.

2.3.2 Memory-Unnecessary Tasks

We begin by evaluating two tasks that approximate nonlinear functions without the need for memory.

Logistic map

The logistic map is a chaotic system that is highly sensitive to small changes in initial conditions. It can be expressed by the equation \(x_{t+1} = \alpha x_t (1 - x_t)\), where our experiment assumes a fixed value of \(\alpha = 4.0\) and an initial value of \(x_0 = 0.2\).

In the logistic map, each output becomes the next input, so prediction proceeds by feeding each predicted value back as the next input. Ridge regression is employed to learn the function. With a 5\(\times \)5 PD array and a DC light source, training achieved a mean squared error (MSE) of 9.61\(\times 10^{-9}\).
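The training pipeline can be sketched as follows; the reservoir function below is a hypothetical stand-in for the 25 PD outputs of the 5\(\times \)5 array, since the actual mapping is produced by the FRET simulator.

```python
import numpy as np
from sklearn.linear_model import Ridge

def logistic_series(n, alpha=4.0, x0=0.2):
    xs = [x0]
    for _ in range(n - 1):
        xs.append(alpha * xs[-1] * (1.0 - xs[-1]))
    return np.array(xs)

def reservoir(u, n_out=25, seed=0):
    """Hypothetical stand-in for the 5x5 PD response to input intensity u."""
    w = np.random.default_rng(seed).uniform(0.5, 2.0, n_out)
    return np.tanh(np.outer(u, w))        # fixed nonlinear expansion

xs = logistic_series(200)
model = Ridge(alpha=1e-6).fit(reservoir(xs[:100]), xs[1:101])

# Closed-loop prediction: each predicted value is fed back as the next input.
x, preds = xs[100], []
for _ in range(50):
    x = model.predict(reservoir(np.array([x])))[0]
    preds.append(x)
```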

Fig. 4 Prediction result of logistic map in time domain. ©[2022] IEEE. Reprinted, with permission, from [5]

Fig. 5 Prediction result of logistic map in input-output domain. ©[2022] IEEE. Reprinted, with permission, from [5]

The performance of the trained model is shown in Fig. 4, with the original function Y, the training data Train, and the prediction Test. Training is conducted over the first 100 steps, and prediction is performed after the 100th step. Although the system is chaotic, the first 17 prediction steps (steps 100–117) are well approximated, indicating that the approximation is viable. In Fig. 5, we plot an X-Y diagram with the input on the horizontal axis and the output on the vertical axis; the function and the prediction are nearly identical.

XOR

We conducted an experiment to test whether the proposed structure can provide nonlinearity, using the two-input XOR function \(y = \textrm{XOR}(x_1, x_2)\). To classify the output, we used a linear support vector machine (SVM), which cannot solve XOR on its own. We used 200 cases for training and an additional 50 cases for evaluation, with the inputs \(x_1\) and \(x_2\) given from two locations using pulsed light sources. For inputs 0 and 1, the pulse frequencies were set to 50 and 100 MHz, respectively.

We tested two configurations: 2\(\times \)2 and 3\(\times \)3 PD arrays. In both cases, the input lights were given at the most distant diagonal locations. The 3\(\times \)3 PD array achieved 100% accuracy in both training and evaluation, whereas in the 2\(\times \)2 case both accuracies were 75%, indicating poor approximation of the XOR function. We attribute this difference to the number of distinct distances between the light sources and the PDs: six in the 3\(\times \)3 case but only three in the 2\(\times \)2 case.
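The readout stage of this experiment can be sketched as follows, assuming the 9 PD counts per sample have already been produced by the FRET simulator; the synthetic features used here are purely illustrative (the product term stands in for the reservoir's nonlinear mixing).

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def fake_pd_counts(x1, x2, n_pd=9):
    """Placeholder for the 3x3 PD counts; x1*x2 mimics nonlinear mixing."""
    base = np.array([x1, x2, x1 * x2] + [x1 + x2] * (n_pd - 3), dtype=float)
    return base + rng.normal(0.0, 0.05, n_pd)    # shot-noise-like jitter

inputs = rng.integers(0, 2, size=(250, 2))       # 200 training + 50 evaluation cases
X = np.array([fake_pd_counts(a, b) for a, b in inputs])
y = inputs[:, 0] ^ inputs[:, 1]                  # XOR labels

clf = LinearSVC(max_iter=10000).fit(X[:200], y[:200])
print("evaluation accuracy:", clf.score(X[200:], y[200:]))
```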

2.3.3 Memory-Necessary Temporal Tasks

Next, we evaluate tasks that require memory in the reservoir.

Time-series XOR

In the time-series XOR experiment, random 0/1 inputs are used to predict the XOR of the current and previous inputs, i.e., \(d(n) = \textrm{XOR}( u(n),u(n-1) )\). The input is generated by the same pulsed light source as in the previous XOR experiment. The feedback input is adjusted based on the number of photons received by the associated PD, with the pulse period inversely proportional to that number.
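For reference, the target sequence and the delayed input supplied by the one-step memory can be generated as below; this is only the task definition, not the device behavior.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.integers(0, 2, size=250)                 # random binary input sequence
d = np.array([u[n] ^ u[n - 1] if n > 0 else 0    # d(n) = XOR(u(n), u(n-1)),
              for n in range(len(u))])           # with d(0) set to 0 by convention
u_prev = np.concatenate(([0], u[:-1]))           # u(n-1), held by the memory node
```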

Figure 6 shows one network configuration tested, where there is a one-step memory from the right node to the left node. The right node provides the previous input to the left node, which can then process both the current and previous inputs. Each node has 3 \(\times \) 3 PDs, resulting in 18 outputs as a reservoir, with a linear support vector machine applied to the reservoir outputs. Both training and evaluation achieved 100% accuracy in the experiment.

Fig. 6 Network structure for time-series XOR. ©[2022] IEEE. Reprinted, with permission, from [5]

Fig. 7 NARMA10 network structure. ©[2022] IEEE. Reprinted, with permission, from [5]

NARMA10

Next, we evaluate the capability of reservoir computing using NARMA [11], which is a standard benchmark for testing reservoir performance. NARMA is expressed by the following equation:

$$\begin{aligned} d(n+1) = a_1 d(n) + a_2 d(n) \sum _{i=0}^{m-1} d(n-i) + a_3 u(n-m+1) u(n) + a_4, \end{aligned}$$
(1)

where \(a_i\) are constants, and u(n) is the input at time n. The output d depends on the inputs of the previous m steps, which means m-step memory is necessary. In this experiment, we evaluate the widely used NARMA10 with \(m=10\).
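For reference, the NARMA10 target sequence of Eq. (1) can be generated as follows; the coefficients \(a_1=0.3\), \(a_2=0.05\), \(a_3=1.5\), \(a_4=0.1\) and the input range are the commonly used choices and are assumptions here, since the chapter evaluates three parameter sets of its own.

```python
import numpy as np

def narma10(u, a1=0.3, a2=0.05, a3=1.5, a4=0.1, m=10):
    """Generate the NARMA10 target d for input sequence u (Eq. (1))."""
    d = np.zeros(len(u))
    for n in range(m - 1, len(u) - 1):
        d[n + 1] = (a1 * d[n]
                    + a2 * d[n] * d[n - m + 1:n + 1].sum()
                    + a3 * u[n - m + 1] * u[n]
                    + a4)
    return d

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 0.5, 600)   # inputs are commonly drawn from [0, 0.5]
d = narma10(u)
```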

Because NARMA10 involves a large number of steps and thus a long simulation time, only 100 FRET simulation trials were conducted. The experiment is designed for online training, where the weights are updated at every step using the sequential least-squares method. The total number of training steps was 600 for three different sets of NARMA10 parameters, with 200 training steps per set.
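A minimal sketch of the per-step readout update is given below; it uses the textbook recursive least-squares form, and the forgetting factor and initialization are our assumptions, since the chapter only states that a sequential least-squares method was used.

```python
import numpy as np

class SequentialLeastSquares:
    """Recursive least-squares readout, updated once per reservoir step (sketch)."""

    def __init__(self, n_features, lam=0.999, delta=100.0):
        self.w = np.zeros(n_features)            # readout weights
        self.P = np.eye(n_features) * delta      # inverse-correlation estimate
        self.lam = lam                           # forgetting factor (assumption)

    def update(self, x, d):
        """x: reservoir output vector at this step, d: target value."""
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)             # gain vector
        self.w += k * (d - self.w @ x)           # correct by the a-priori error
        self.P = (self.P - np.outer(k, Px)) / self.lam
        return self.w @ x                        # a-posteriori prediction

# Usage: per step, call update(reservoir_output, narma10_target).
```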

To find appropriate network structures, we generated multiple ESNs by varying the number of nodes and edges. The models were then trained and evaluated using the RMSE metric. Among them, we selected the 20-node network with cyclic structures depicted in Fig. 7, which provided accurate results. We then mapped this network onto the proposed device structure following the procedure explained in Sect. 2. Each node in the network was assumed to have a 3 \(\times \) 3 PD array and a 3 \(\times \) 3 light source array.

Fig. 8 NARMA10 prediction result. ©[2022] IEEE. Reprinted, with permission, from [5]

The results of the NARMA10 experiment are presented in Fig. 8, where the blue line corresponds to the target function d(n) and the orange line represents the predicted values. At the beginning of the training, there is some distance between the blue and orange lines, but they gradually overlap as the training progresses. The root mean square error (RMSE) between the target and predicted values is 0.020.

3 Proof-of-Concept Prototyping

To realize reservoir computing using QDs, it is necessary to confirm whether QD fluorescence can be observed in a small experimental system and whether its output can be used for learning. In this section, we construct a small experimental system using a commercial image sensor, observe the fluorescence of QDs, and evaluate the possibility of learning from the output.

Fig. 9 Experimental setup with LEDs for XOR task

3.1 Implementation

As part of the experimental setup, we installed a device incorporating an image sensor in a darkroom. Figures 9 and 10 depict the experimental systems used for the XOR and MNIST tasks, respectively; the main difference between them is the light source. A thin coating of QDs was applied to a cover glass, which was then placed directly on the image sensor. Images were obtained by illuminating the QDs from above with excitation light from a light source fixed to the XYZ stage substrate and capturing the fluorescence with the image sensor. For the XOR task, two 430 nm LEDs were used as the light sources, while for the MNIST task, a laser micro-projector with a resolution of 1280 \(\times \) 720 (HD301D1, Ultimems, Inc.) was employed.

Fig. 10 Experimental setup with a projector for MNIST

The QDs used in the XOR task (CdSe/ZnS, ALDRICH) have a single center wavelength of 540 nm and are fabricated in thin-film form to realize a network structure. A volume of 100 \(\upmu \)L of a solution consisting of 30 \(\upmu \)L of QDs and 270 \(\upmu \)L of thermosetting resin (Sylgard 184) is poured onto a cover glass, the film is deposited by spin coating, and the resin is then cured by heating. For the MNIST task, two types of CdSe/ZnS QDs were used, with center wavelengths of 600 nm and 540 nm.

The performance of the filter used to separate the excitation light from the fluorescence is essential for observing fluorescence with a compact lensless image sensor. The image sensor is a commercial one (CMV 20000, AMS) with 2048 \(\times \) 1088 pixels and a 5.5 \(\upmu \)m pixel size. We implemented a bandpass filter on the image sensor consisting of a long-pass interference filter, a fiber optic plate (FOP), a short-pass interference filter, and an absorption filter. Interference filters have a high rejection ratio for perpendicular light but are angle-dependent, allowing scattered light to pass through. Therefore, by combining the interference filters with an absorption filter, whose absorption is angle-independent, a high excitation-light rejection is achieved: a transmittance of \(10^{-8}\) at the assumed excitation wavelengths of 430–450 nm [6]. This allows the 540 nm and 600 nm QD fluorescence to pass through the filter while only the excitation light is removed.

3.2 Evaluation

3.2.1 XOR

Nonlinearity in the reservoir layer is necessary for the device to function as a reservoir computer. To evaluate this nonlinearity, we conducted an experiment to check whether the XOR problem can be solved by a linear learner when the two inputs are given from the light sources. The captured images are used to train a linear SVM. An example input image is shown in Fig. 11. Sixty images are taken for each of the inputs 00, 01, 10, and 11 (240 images in total): 40 for training and 20 for inference. A comparison is made between the case with QDs and the case without QDs (glass only, no excitation filter).

Fig. 11 Example of fluorescence pictures captured by the image sensor for XOR task

The experimental procedure involves determining the DC light intensity required to represent input values of 0 and 1 so as to achieve maximum task accuracy. The LEDs are turned off for input 0, while a constant current is applied for input 1. To ensure equal light intensity from both LEDs, the magnitude of the constant current for input 1 is adjusted to compensate for any spectral shift that may cause the two LEDs to exhibit different intensities at the same current.

The SVM was given a fixed window of 32 \(\times \) 32 pixels, and training was performed at each location by shifting the window across the entire image (2048 \(\times \) 1088 pixels). To address the issue that areas with lower pixel values yielded higher accuracy, images were captured with varying exposure times so that the pixel values in the bright areas of the image were approximately the same with and without QDs. The accuracy at each location for the 32 \(\times \) 32 pixel case is depicted in Fig. 12.
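The per-location evaluation can be sketched as follows; the non-overlapping window grid, the train/test split (160 training and 80 evaluation images), and the image array layout are our assumptions.

```python
import numpy as np
from sklearn.svm import LinearSVC

def accuracy_map(images, labels, win=32, n_train=160):
    """Train a linear SVM per win x win window and return its test accuracy.

    images: (n_images, 1088, 2048) pixel array; the first n_train images are
    used for training and the rest for evaluation (assumed data layout).
    """
    h, w = images.shape[1], images.shape[2]
    acc = np.zeros((h // win, w // win))
    for i in range(h // win):
        for j in range(w // win):
            patch = images[:, i*win:(i+1)*win, j*win:(j+1)*win]
            X = patch.reshape(len(images), -1)
            clf = LinearSVC(max_iter=10000).fit(X[:n_train], labels[:n_train])
            acc[i, j] = clf.score(X[n_train:], labels[n_train:])
    return acc
```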

It was observed that using a shorter exposure time of 4 ms resulted in improved accuracy, both with and without QDs. Specifically, with a 4 ms exposure time, the QDs enhanced accuracy across a wider range of the image sensor, indicating their contribution to nonlinear computation. Additionally, the difference in accuracy between exposure times of 4 and 40 ms suggests that the image sensor itself may also exhibit nonlinearity. This finding underscores the necessity for image sensors dedicated to reservoir computing to possess nonlinear pixels, which may not be suitable for conventional image sensing applications.

Fig. 12 Spatial distributions of XOR task accuracy across the image sensor. Each 32 \(\times \) 32 pixel patch is given to the SVM

Fig. 13 Example of fluorescence pictures captured by the image sensor for MNIST task

3.2.2 MNIST

We employed the Newton conjugate gradient method to train the reservoir output for MNIST using logistic regression. Experiments were conducted on four different input image sizes: 28 \(\times \) 28, 140 \(\times \) 140, 280 \(\times \) 280, and 600 \(\times \) 600, as illustrated in Fig. 13. The fluorescence images shown in Fig. 13 serve as representative examples. For the 28 \(\times \) 28 image size, we used 900 fluorescence pictures for training and 150 pictures for testing. For the other sizes, we used 1,500 pictures for training and 500 pictures for testing. Distant pixels that did not observe fluorescence were excluded from the training data. To prevent pixel value saturation in the image sensor, we modified the color of the projector light when QDs were not employed.
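The readout training can be sketched with scikit-learn as follows; the flattening of the fluorescence images into feature vectors and the variable names are our assumptions.

```python
from sklearn.linear_model import LogisticRegression

def train_readout(X_train, y_train, X_test, y_test):
    """Logistic-regression readout trained with the Newton-CG solver.

    X_*: flattened fluorescence images with non-fluorescent pixels removed;
    y_*: MNIST digit labels (assumed data layout).
    """
    clf = LogisticRegression(solver="newton-cg", max_iter=200)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)
```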

Table 2 MNIST accuracy. The highest accuracy with linear regressor is 87.0%. The accuracies in bold are better than that of the linear regressor

Table 2 presents the accuracy results, indicating that the accuracy decreases as the input image size decreases. For an input image size of 600 \(\times \) 600, the accuracies with and without QDs were 88.8 and 87.6%, respectively. In comparison, the accuracy achieved using the original MNIST data was 87.0% under the same training conditions. Therefore, in both cases, the PoC implementation achieved higher accuracy than the linear regressor. Furthermore, the accuracy was higher when QDs were used compared to when they were not. This improvement in accuracy can be attributed to the nonlinearity of QDs and the image sensor. We will further investigate the conditions under which QDs can offer a more significant advantage.

4 Discussion on Energy Advantage

This section aims to investigate the potential power-saving benefits of the proposed reservoir computing device. To accomplish this, we conduct an analysis of the energy consumption of the physical reservoir computer and compare it with that of a digital circuit implementation.

4.1 Power Estimation Approach

Consider a structure consisting of a light source and a sensor directly below it, as shown in Fig. 14. In this structure, energy is consumed by the light source and the sensor. For simplicity, we consider the energy consumed by the photodiode directly below the light source, which receives the most light. Photons emitted from the light source excite the QDs, and the resulting fluorescence photons stochastically enter the photodiode at the origin directly below. The sensor part that receives the fluorescence and converts it into an output consists of a photodiode, a comparator, and an 8-bit counter. The comparator detects a certain voltage drop on the photodiode and converts it into a pulse, which is counted by the 8-bit counter. We define the operating time as the time it takes for the sensor at the origin to count 256 pulses; it can be obtained from the number of photons incident on the photodiode per unit time together with the sensitivity and the photon-to-voltage conversion efficiency. The energy consumption is calculated by multiplying the power consumption of the sensor and the light source by the operating time.

Fig. 14 Assumed structure

4.1.1 Probability of Photon Incidence

It is assumed that photons emitted from the light source follow a normal angular distribution whose half-width is the divergence angle of the light source, i.e., the angle at which the intensity is halved; each photon is stochastically directed according to this distribution. The probability of a photon entering a 10 \(\upmu \)m \(\times \) 10 \(\upmu \)m section of the QD surface located 1 mm away from the light source is illustrated in Fig. 15, where the probability density is integrated over each section of the QD surface to obtain the per-section probability. The photons emitted from the QDs as fluorescence are assumed to travel in random directions.
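This probability can be estimated with a short Monte Carlo sketch like the one below; the divergence angle, the section grid extent, and the conversion from half-width at half maximum to a standard deviation are assumptions.

```python
import numpy as np

def incidence_probability(divergence_deg=10.0, z_mm=1.0, section_um=10.0,
                          half_range_um=200.0, n_photons=1_000_000, seed=0):
    """Monte Carlo estimate of per-section incidence probability on the QD surface."""
    rng = np.random.default_rng(seed)
    # Treat the divergence angle (half-width at half maximum) as defining sigma.
    sigma = np.deg2rad(divergence_deg) / np.sqrt(2.0 * np.log(2.0))
    theta = np.abs(rng.normal(0.0, sigma, n_photons))    # polar angle from the axis
    phi = rng.uniform(0.0, 2.0 * np.pi, n_photons)       # azimuth, uniform
    r_um = z_mm * 1e3 * np.tan(theta)                    # radius on the QD plane
    x, y = r_um * np.cos(phi), r_um * np.sin(phi)
    edges = np.arange(-half_range_um, half_range_um + section_um, section_um)
    hist, _, _ = np.histogram2d(x, y, bins=[edges, edges])
    return hist / n_photons          # probability per 10 um x 10 um section
```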

Figure 16 shows the probability of a photon being incident on the photodiode at the origin, which is 1 \(\upmu \)m away from the QD surface. For greater distances, photons would enter the origin photodiode from a broad range of locations, but at 1 \(\upmu \)m, photons enter it primarily from the QDs directly above.

Fig. 15 Incidence probability from the light source to the QD surface (Distance 1 mm)

Fig. 16 Incidence probability from the QD surface to the origin PD (Distance 1 \(\upmu \)m)

4.1.2 Photon Input/Output Ratio in QD network

Let us calculate the input-output ratio of photons in a QD. The decay of the fluorescence intensity I(t) in a QD is expressed as an exponential decay as follows [12]:

$$\begin{aligned} \frac{dI(t)}{dt} = -\frac{1}{\tau _0} I(t), \end{aligned}$$
(2)
$$\begin{aligned} I(t) = I_0 \textrm{exp}(-\frac{t}{\tau _0}). \end{aligned}$$
(3)

In this study, \(\tau _{0}\) is assumed to be constant, although the average fluorescence lifetime of the QD ensemble is known to vary with density and light intensity [13]. FRET is usually discussed between donors and acceptors, but it can also occur between QDs of a single type. For simplicity, we consider fluorescence in a network of a single QD type.

The QDs that are newly excited by incoming photons are assumed to be QDs that are not currently excited. The QDs are also assumed to be arranged in an equally spaced lattice. Adding the excitation term, we obtain the following equation:

$$\begin{aligned} \frac{dI(t)}{dt} = -\frac{1}{\tau _0} I(t) + (N_A -I(t)) \times \frac{\sigma _A}{S} \times N_{photon}, \end{aligned}$$
(4)

where I(t) is the number of excited QDs, \(N_A\) is the number of QDs in the region of interest, \(\sigma _A\) is the absorption cross section of a QD, S is the region area, and \(N_{photon}\) is the number of photons injected into the region per unit time.

Since \(N_{photon}\) is constant in the case of DC incidence, setting \(\frac{dI(t)}{dt}=0\) at equilibrium gives

$$\begin{aligned} I(t) = \frac{N_A\times \frac{\sigma _A}{S}\times N_{photon}}{\frac{1}{\tau _0}+\frac{\sigma _A}{S}\times N_{photon}}. \end{aligned}$$
(5)

In the case of pulsed input, \(N_{photon}\) becomes a square wave, and we use the transient response given by the following equation. For the pulsed input in this study, the period was set to 20 ns, with an on-time of 1 ns and an off-time of 19 ns:

$$\begin{aligned} \frac{dI(t)}{dt}= \left\{ \begin{array}{llll} &{} -\frac{1}{\tau _0} I(t), &{}\quad &{} \text {(light source off)} \\ &{} -\frac{1}{\tau _0} I(t) + (N_A -I(t)) \times \frac{\sigma _A}{S}\times N_{photon}. &{} &{} \text {(light source on)} \end{array} \right. \end{aligned}$$
(6)

At this time, the number of photons emitted from the QD surface as fluorescence per unit time is given by substituting I(t) into Eq. (2) and multiplying it by the QD’s emission efficiency.
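Equations (4) and (6) can be integrated numerically with a simple explicit scheme, as sketched below; all default parameter values are placeholders, with the values actually used given in Table 4.

```python
import numpy as np

def excited_population(t_end_ns, dt_ns=0.01, tau0_ns=20.0, n_a=1e4,
                       sigma_over_s=1e-6, n_photon_per_ns=1e2,
                       period_ns=20.0, on_ns=1.0, pulsed=True):
    """Integrate dI/dt = -I/tau0 + (N_A - I)*(sigma_A/S)*N_photon (Eqs. (4), (6)).

    Default parameter values are placeholders; see Table 4 for the actual ones.
    """
    steps = int(t_end_ns / dt_ns)
    I = np.zeros(steps)
    for k in range(1, steps):
        t = k * dt_ns
        on = (t % period_ns) < on_ns if pulsed else True
        pump = (n_a - I[k - 1]) * sigma_over_s * n_photon_per_ns if on else 0.0
        I[k] = I[k - 1] + (-I[k - 1] / tau0_ns + pump) * dt_ns
    return I

# Fluorescence photon rate is I(t)/tau0 times the emission efficiency (cf. Eq. (2)).
```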

4.1.3 Sensor Energy

The sensor part is assumed to consist of a photodiode, a comparator, and an 8-bit counter; the assumed structure is shown in Fig. 17. Under reverse bias, the voltage across the photodiode gradually decreases according to the number of incident photons. The overall operation of the sensor is as follows.

Fig. 17 Sensor structure and waveforms in it

  1. Photodiode: The photodiode accumulates incident photons for a certain period, and the accumulated charge lowers its voltage from the supply voltage.

  2. Source follower: The voltage drop of the photodiode is reduced to a voltage in the range appropriate for the operation of the next-stage amplifier.

  3. Differential amplifier: The voltage change at the photodiode is amplified.

  4. Inverter: The output voltage is converted into pulses.

  5. 8-bit counter: Up to 256 pulses are counted with 8 bits.

The input voltage to the differential amplifier is determined by the number of photons incident on the photodiode multiplied by the voltage conversion efficiency. In this evaluation, the voltage drop required to output one pulse is 100 mV, so the operating time is 256 times the time required for this 100 mV drop at the highest photon intensity. The power consumption of the sensor unit is obtained by simulating its energy consumption when a triangular wave representing the voltage drop is applied to it.
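The operating-time and energy calculation reduces to the short sketch below; all numerical arguments are placeholders standing in for the values in Tables 3 and 4.

```python
def operating_time_s(photons_per_s, volts_per_photon,
                     dv_per_pulse=0.1, pulses=256):
    """Time for the origin PD to accumulate `pulses` drops of `dv_per_pulse` volts."""
    return pulses * dv_per_pulse / (photons_per_s * volts_per_photon)

def energy_j(photons_per_s, volts_per_photon, p_source_w, p_sensor_w, duty=1.0):
    """Total energy: the source burns power only while on (duty); the sensor always does."""
    t = operating_time_s(photons_per_s, volts_per_photon)
    return (p_source_w * duty + p_sensor_w) * t
```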

4.2 Result

4.2.1 Energy Dissipation

Using the probability of incidence on the photodiode at the origin and the assumed conversion efficiency, we calculate the time required for the comparator to output one pulse, and multiply it by the power consumption of the sensor and the light source to obtain the energy consumption, as shown in Table 3. The parameters used in the calculation are shown in Table 4. In this study, a laser diode (NDV4316, NICHIA) is assumed as the light source, and the sensitivity and conversion efficiency are based on the values of existing image sensors (S11639, Hamamatsu Photonics).

Table 3 Operating time and energy consumption of DC light and pulsed light
Table 4 Parameters used for energy calculation

As shown in Table 3, using pulsed input light results in a longer operating time. The light source does not consume energy while it is off, but the sensor continues to operate, so the sensor's energy consumption grows roughly in proportion to the operating time even when the light source is off. From an energy standpoint, DC input is therefore better than pulsed input for the light source, but pulsed input might be preferable for exploiting the short-term memory of the QDs.

4.2.2 Energy Comparison with Digital Circuit Implementation

To assess the energy efficiency of the computation discussed in the previous section, we compare it with the energy consumption of a digital circuit implementation. In neural networks, the energy consumed by multiply-accumulate (MAC) operations and memory accesses becomes a concern as the network grows. The energy consumption of 32-bit floating-point operations and memory accesses is presented in Table 5 [14]; we use these values for the comparative evaluation in this section.

Table 5 Energy consumption of 32-bit floating-point arithmetic and memory access

Let m be the number of light sources and n the number of PDs, and consider summing weighted inputs in a fully connected layer. We assume that the input transformation in the QD network is equivalent to the computation of a fully connected layer. If the weights are read from RAM (DRAM or SRAM), addition, multiplication, and weight reading must each be performed \(m\times n\) times. On the other hand, the energy consumption of the light sources and sensors is proportional to their respective numbers. Therefore, the energy consumption of the digital circuit implementation and of the physical reservoir computing implementation can be estimated as follows:

  • SRAM: (0.9 + 3.7 + 5) \(\times \) mn = 9.6 pJ \(\times \) mn

  • DRAM: (0.9 + 3.7 + 640) \(\times \) mn = 644.6 pJ \(\times \) mn

  • DC light source + sensor: 8.4 \(\upmu \)J \(\times \) m + 11.4 nJ \(\times \) n

  • pulse light source + sensor: 9.14 \(\upmu \)J \(\times \) m + 238 nJ \(\times \) n.

Tables 6 and 7 show the values of m and n for which the energy consumption of the light sources and sensors is smaller than that of the digital circuit implementation, comparing against both SRAM and DRAM weight reads. In this evaluation, we assume that the comparator outputs one pulse per 100 mV of voltage drop; if 10 mV corresponded to one pulse, the operating time would become 1/10. Tables 6 and 7 show that increasing the number of light sources m and PDs n makes the proposed device more energy efficient than the digital circuit implementation, because its energy consumption grows only with the sum of the light-source and sensor energies as m and n increase. Comparing the DC and pulsed light sources, the DC source reaches the crossover at a smaller m, meaning that it is more energy efficient.
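For a given n, the crossover value of m can be derived directly from the cost expressions above, as sketched below; this closed-form comparison need not reproduce Tables 6 and 7 exactly, since those also cover the 10 mV-per-pulse case.

```python
import math

def breakeven_m(n, per_mac_pj, src_j_per_source, sensor_j_per_pd):
    """Smallest m with src*m + sensor*n < per_mac * m * n, or None if impossible."""
    c = per_mac_pj * 1e-12                    # J per MAC including the weight read
    if c * n <= src_j_per_source:
        return None                           # no m makes the optical side cheaper
    return math.ceil(sensor_j_per_pd * n / (c * n - src_j_per_source))

# Example: DC source (8.4 uJ per source, 11.4 nJ per PD) against DRAM reads (644.6 pJ).
print(breakeven_m(20_000, 644.6, 8.4e-6, 11.4e-9))
```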

In this evaluation, we assumed that the input transformation in the QD network corresponds to the computation of a fully connected layer, but greater energy savings can be expected when more complex transformations are performed.

Table 6 m, n values achieving lower power dissipation than digital implementation (DC light)
Table 7 m, n values achieving lower power dissipation than digital implementation (pulsed light)

5 Summary

In this chapter, we explored the viability of a compact implementation of FRET-based optical reservoir computing. The proposed device can be integrated into a single package containing the light source, QDs, filters, photodetectors, and digital signal processing. Our simulation-based and proof-of-concept evaluations demonstrated that the proposed device can perform the targeted tasks and is energy efficient for computations with many inputs and outputs. Moving forward, we plan to develop a dedicated chip for FRET-based reservoir computing and integrate it into a single package.