1 Introduction

Flexibly manipulating electromagnetic waves and information is of central importance to daily life in modern society. Intelligent metasurfaces, as conceptually illustrated in Fig. 1a, are emerging as smart platforms that control wave–information–matter interactions without manual intervention, responding at the proper time and under the proper conditions. Intelligent metasurfaces evolve from engineered composite materials [1,2,3,4,5,6], including metamaterials and metasurfaces [7,8,9,10,11,12,13,14,15,16,17,18,19,20], and particularly from information metamaterials and metasurfaces [19, 21,22,23,24]. Over the past decades, we have witnessed great progress in metamaterials and metasurfaces of different forms and characteristics, such as artificial dielectrics [2,3,4], left-handed materials [7,8,9,10], plasmonic metamaterials [6], zero-index metamaterials [11], spoof surface plasmon polaritons (or designer SPPs) [12], Huygens’ metamaterials [16], digital metamaterials [18], coding metamaterials [19], reprogrammable metamaterials [19], time-varying or temporally modulated metasurfaces [20, 25], and so on [26,27,28], together with their unprecedented success in tailoring wave–matter interactions that cannot be achieved in nature, as summarized in Fig. 1b. Metamaterials and metasurfaces have remarkably refreshed human insight into many fundamental laws, for instance, Snell’s law [7, 9], the diffraction limit [7, 8, 29,30,31,32,33,34,35,36,37], and reciprocity [38,39,40], and have unlocked many novel functional devices and systems, such as cloaks [41,42,43,44,45,46], tunneling [47, 48], holograms [49, 50], and so on. Recently, conventional structure-alone (passive) metasurfaces have made strides towards intelligent metasurfaces [51,52,53,54,55,56,57,58,59,60,61,62] through integration with algorithms and nonlinear materials [63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83] (or active devices [19, 25, 84]). 
Similar to conventional metasurfaces, the intelligent metasurface is composed of a two-dimensional array of judiciously designed unit cells (called meta-atoms). However, each meta-atom of an intelligent metasurface is integrated with tunable functional materials or active devices, and is designed to be in-situ reprogrammable under the control of proper algorithms.

Fig. 1

Conceptual illustration and historical background of the intelligent metasurface. a The intelligent metasurface, an AI-empowered artificial material, is a smart platform enabling various functions (for instance, data mining, communication, energy harvesting, control, sensing, etc.) by processing the waves illuminating it on the physical level. b About 70 years of evolution of Artificial Intelligence (AI) and Artificial Material (AM), in which some milestones of new principles, mechanisms and physical phenomena are marked. Here, the column-like histograms report the trends of published papers in AI and AM, respectively. The data for this statistical analysis come from the Web of Science, where the advanced-search keywords are set as ‘machine learning or deep learning’ for AI and ‘left-handed materials or metamaterials or metasurfaces’ for AM, respectively

In comparison to conventional metasurfaces, the intelligent metasurface exhibits three crucial properties: digitalization, programmability and intelligence, providing a transformative opportunity to control wave–information–matter interactions without human intervention. Here, digitalization enables the intelligent metasurface to encode, decode and store digital information on the physical level; programmability means that the intelligent metasurface can realize distinct functions with one physical entity and switch among them by changing the control code sequences; and intelligence indicates that the intelligent metasurface has onsite or cloud algorithms as its brain, and is capable of decision-making, self-programming and performing a series of successive tasks without human supervision. Therefore, reconfigurable and reprogrammable metasurfaces [18, 19] can be regarded as the infancy stage of the intelligent metasurface, since strictly speaking they are not intelligent according to the definition above. In short, intelligent metasurfaces provide smart platforms for manipulating wave–information–matter interactions; they hold promising potential for setting up a direct connection between the physical world and the digital world, and naturally serve to merge a physical entity with its digital twin.

Intelligence is the core of intelligent metasurfaces, and algorithms (especially deep learning techniques) can fill this role well. In parallel with the development of artificial material (AM), artificial intelligence (AI), in particular deep learning [85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109], has achieved great success in data mining and knowledge discovery (see Fig. 1b for some important breakthroughs over the past 70 years in this area). Pioneered by Pitts and McCulloch in the 1940s, with the term coined by Hinton in 2006 [96], deep learning has proven extraordinarily useful in nearly every field of science and engineering, making possible tasks that were impossible with traditional methods. Certainly, deep learning has considerably impacted metamaterials and metasurfaces, from metamaterial design to intelligent devices and systems. The integration of AM and AI will undoubtedly give birth to very broad and active research directions. Here, we specifically focus on novel wave-information-critical architectures for dealing with the data crisis, rather than on research that leverages AI as a design tool for AM [107,108,109,110,111,112,113,114,115]. Although deep learning has achieved tremendous success in science, engineering and military applications, the most fundamental yet still mysterious problem is why the functions learned by artificial neural networks from a finite set of training data generalize so well to unseen data. More specifically, artificial neural networks (ANNs) have usually been treated as black boxes, lacking a theoretical understanding of when and why they work well or fail in terms of training and generalization. 
Recently, this notable problem has begun to be explored by some researchers [116,117,118,119,120,121], and we sincerely expect that the mystery of deep learning can be unlocked in the near future, providing theoretical foundations for intelligent metasurfaces.

In this article, we review a collection of recent progress on intelligent metasurfaces, including wave–information–matter interaction control, novel wireless communications, and wave-based computing. We begin by providing the historical background of intelligent metasurfaces, outlining the underlying physical mechanisms, and giving several representative applications of wave–information–matter control. Then we explore the use of intelligent metasurfaces in novel wireless communication architectures, with particular emphasis on metasurface-modulated backscatter communication. Here, ambient metasurface-modulated backscatter communication uses wireless signals already available in our daily lives and works in a direct-modulation manner; it requires neither the allocation of new frequency spectrum nor the associated microwave devices, helping us develop a ‘green’ information society. We also explore the applications of intelligent metasurfaces in wave-based computing, focusing on the emerging research direction of intelligent sensing strategies. Finally, we comment on the challenges and perspectives of this emerging interdisciplinary research area, with its potential to create new science and engineering paradigms for designing a smart society.

2 Control of wave–information–matter interactions

Metamaterials have evolved from engineered structures to intelligent wave agents, making huge strides in controlling wave–information–matter interactions that cannot be achieved with natural materials. Historically, there are two milestone events in the evolution of intelligent metasurfaces: active metamaterials [64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81] and programmable coding metamaterials [19, 21,22,23,24, 82, 84, 122,123,124,125]. In contrast to passive (i.e., structure-only) metamaterials, active metamaterials are hybrid structures embedded with tunable functional materials (like Ge2Sb2Te5) and active elements (like PIN diodes), allowing us to control wave–information–matter interactions dynamically at the cost of power consumption. Active metamaterials can be optimized to be nearly passive in terms of energy consumption, since a power supply is needed only to control the embedded active devices or functional nonlinear materials. Conventional active metamaterials (e.g., tunable and reconfigurable metamaterials) come with two important limitations: limited reconfigurability and physics-alone wave–matter control. To resolve these limitations, Cui et al. proposed a novel kind of active metamaterial, i.e., reprogrammable coding metamaterials [19], in which each meta-atom has a finite number of quantized physical states and can be used to encode digital information on the physical level, bridging the digital world and the physical world. In this way, the reprogrammable coding metasurface can be regarded as a general-purpose device for manipulating wave–information–matter interactions, in the sense that it can realize a large number of distinct functionalities and switch among them in real time. 
Nonetheless, the reprogrammable coding metasurface has the limitation that it must work in a trial-and-error mode in order to meet customized requirements, which hinders its real-time use. The intelligent metasurface remarkably extends the reprogrammable coding metasurface by integrating an algorithm (especially a deep learning solution) as its ‘brain’, and thus has the intelligence, in terms of self-programming and decision-making, to adapt to changes in the surrounding environment without human supervision. Thus, we expect that the intelligent metasurface holds promising potential for merging a physical entity with its digital twin as a whole.

2.1 Basic concepts and principles

The intelligent metasurface can be regarded as a computing device based on weak wave–matter interactions [126,127,128,129,130,131,132,133,134], which processes the input temporal–spatial illumination (amplitude, phase, polarization, ...) and outputs the desired wave. On the topic of metamaterials-based wave computing, we refer interested readers to [126] for a comprehensive review. Here, we provide a non-rigorous but helpful theoretical insight into the physical mechanism of wave–information–matter interactions with the intelligent metasurface. Assume that, for a time-varying metasurface, the meta-atom at location \({\varvec{r}}\) and time t has the scattering coefficient \(\Gamma \left( {{\varvec{r}},t} \right)\). Given an s-linearly polarized plane-wave illumination with the wavevector \(\left( {{\varvec{\kappa}}_{i} ,\sqrt {k_{i}^{2} - \left| {{\varvec{\kappa}}_{i} } \right|^{2} } \hat{\user2{z}}} \right)\) and angular frequency \(\omega_{i}\), the p-polarized wave scattered by the metasurface can be represented as \(H_{ps} \left( {{\varvec{\kappa}}_{o} ,\omega_{o} } \right) = {\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{o} - {\varvec{\kappa}}_{i} ,\omega_{o} - \omega_{i} } \right)\), where the scattering wavevector is \(\left( {{\varvec{\kappa}}_{o} , \sqrt {k_{o}^{2} - \left| {{\varvec{\kappa}}_{o} } \right|^{2} } \hat{\user2{z}}} \right)\) and \(\omega_{o}\) denotes the scattering angular frequency. Here, \({\tilde{\Gamma }}\) represents the Fourier transform of \({\Gamma }\) from the \(\left( {{\varvec{r}},t} \right)\)-space to the (\({\varvec{\kappa}},\omega\))-space. 
In addition, \({\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{o} - {\varvec{\kappa}}_{i} ,\omega_{o} - \omega_{i} } \right)\) represents the spatial–temporal behavior of the metasurface-modulated electromagnetic (EM) wave and forms the principal foundation of wave–information–matter interactions with the intelligent metasurface, which can be understood as an extension of the generalized Snell’s law [14, 15]. From the basic properties of the Fourier transform, we know that the conjugate symmetry of \({\tilde{\Gamma }}\) is broken when \({\Gamma }\) is not real-valued, which gives rise to the violation of Lorentz reciprocity. More specifically, when \({\Gamma }\left( {{\varvec{r}},t} \right)\) is not real, we can observe nonreciprocal behavior of the propagating waves in the momentum domain, in the frequency domain, and in both simultaneously, i.e., \({\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{o} - {\varvec{\kappa}}_{i} ,\omega_{o} - \omega_{i} } \right) \ne {\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{i} - {\varvec{\kappa}}_{o} ,\omega_{o} - \omega_{i} } \right)\), \({\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{o} - {\varvec{\kappa}}_{i} ,\omega_{o} - \omega_{i} } \right) \ne {\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{o} - {\varvec{\kappa}}_{i} ,\omega_{i} - \omega_{o} } \right)\) and \({\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{o} - {\varvec{\kappa}}_{i} ,\omega_{o} - \omega_{i} } \right) \ne {\tilde{\Gamma }}\left( {{\varvec{\kappa}}_{i} - {\varvec{\kappa}}_{o} ,\omega_{i} - \omega_{o} } \right)\), which is consistent with the results in [36] (see Fig. 2e). Furthermore, it can be deduced that the metasurface achieves a maximum spatial bandwidth of \(n\omega_{o} /c\) when its constituent meta-atoms are arranged on the 1/\(n\)-wavelength scale, where c is the speed of light in free space. 
If the spatial wavevector imparted by the metasurface is \({\varvec{\kappa}}\), then the transverse component of the scattering wavevector is \({\varvec{\kappa}}_{o} = {\varvec{\kappa}} + {\varvec{\kappa}}_{i}\). This expression leads to an important conclusion: the scattered wave is a surface wave propagating along the metasurface if \(\left| {{\varvec{\kappa}}_{o} } \right| = \left| {{\varvec{\kappa}} + {\varvec{\kappa}}_{i} } \right| > \omega_{o} /c\), since \(\sqrt {k_{o}^{2} - \left| {{\varvec{\kappa}}_{o} } \right|^{2} }\) is then purely imaginary. Therefore, when the meta-atoms of the metasurface are well designed and arranged on the 1/n-wavelength scale with \(n \ge 4\), the incident plane waves can be efficiently converted into surface waves propagating along the metasurface, regardless of the incidence direction.
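The surface-wave condition above can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original analysis: it uses a one-dimensional scalar transverse wavenumber, assumes the total spatial bandwidth \(n\omega_{o}/c\) is centred on zero (so the largest shift is \(n\omega_{o}/(2c)\)), and picks a hypothetical 10 GHz illumination. The function names are illustrative only.

```python
import numpy as np

C = 299_792_458.0  # speed of light in free space (m/s)


def max_transverse_shift(n, omega):
    """Largest |kappa| the metasurface can impart (1D scalar simplification).

    Assumption: the total spatial bandwidth n*omega/c is centred on zero,
    so the shift lies in [-n*omega/(2c), +n*omega/(2c)].
    """
    return n * omega / (2.0 * C)


def longitudinal_wavenumber(kappa_o, omega_o):
    """Return sqrt(k_o^2 - kappa_o^2); purely imaginary means a surface wave."""
    k_o = omega_o / C
    return np.sqrt(complex(k_o**2 - kappa_o**2))


def is_surface_wave(kappa, kappa_i, omega_o):
    """True when |kappa + kappa_i| exceeds the free-space wavenumber."""
    return abs(kappa + kappa_i) > omega_o / C


omega = 2.0 * np.pi * 10e9                        # hypothetical 10 GHz illumination
k0 = omega / C
incidences = np.linspace(-0.99, 0.99, 199) * k0   # kappa_i for non-grazing incidence

for n in (2, 4, 6):
    kappa = max_transverse_shift(n, omega)
    always = all(is_surface_wave(kappa, ki, omega) for ki in incidences)
    print(f"n = {n}: surface wave for every sampled incidence? {always}")
```

Under these assumptions, the worst case is an incident transverse wavenumber that opposes the imparted shift, which is why the shift must exceed twice the free-space wavenumber for the conversion to hold at every incidence angle.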

Fig. 2

Examples of wave–information–matter manipulations with intelligent metasurfaces. a The first programmable coding metasurface controlled with an FPGA. The meta-atom embedded with a PIN diode and its binary EM responses are plotted. Additionally, selected experimental results of dynamic beam manipulation are shown [19]. b Smart self-reprogrammable metasurface integrated with a gyroscope sensor for orientation tracking and selected experimental results [52]. c Non-linear harmonic manipulation with a time–space-coding metasurface and selected experimental results [185]. d Smart Doppler cloaking with a time–space-coding metasurface [59]. e Surface-wave-assisted nonreciprocity with a spatio-temporally modulated metasurface, where the constituent meta-atom with its geometrical parameters is detailed [40]. f Smart invisibility cloaking with a metasurface. Experimental setup and selected results at different illumination frequencies are shown [55]. g Mathematical differential operations with a time–space-coding metasurface and selected experimental results [133]. h Smart wireless power transfer with the 1-bit and 2-bit intelligent metasurfaces. The experimental configuration is shown in the inset. The efficiencies of WPT with different setups are compared (bottom) [60]. i Dynamic holograms with a 1-bit programmable coding metasurface. Selected experimental holographic images and the corresponding coding patterns of the metasurface are provided at the left-bottom and right-bottom corners, respectively [84]. Figures (a)–(i) adapted with permission under a CC BY 4.0 license

The intelligent metasurface is composed of controllable meta-atoms, and each meta-atom has a number of quantized physical states. Among the options for designing the intelligent metasurface, the one-bit coding strategy has been widely explored and shown to be favorable in terms of design process, fabrication complexity, cost, and energy consumption. By a one-bit coding metasurface, we mean that each meta-atom has two physical states [19]. For instance, the binary phase-coding meta-atom has two distinct phase responses with a difference of 180° when it is illuminated by a normally incident plane wave, so that one state of the binary meta-atom can be treated as having a phase response of 0° and the other 180°. Note that the binary code is not necessarily restricted to reflection phase responses; it can also represent the phase or amplitude of the EM transmission, two distinct EM boundaries, etc. As is well known in the community of microwave antennas, the phase quantization of the meta-atoms gives rise to quantization noise [135,136,137,138], i.e., leakage of energy from the main lobes into side lobes. In order to overcome this drawback, reprogrammable coding metasurfaces with two-bit quantization and beyond have also been investigated [139, 182]. So far, most phase-only reprogrammable coding metasurfaces are of the reflection type, which not only require a high-gain feeding source, but also suffer from the feed blockage effect. In the community of microwave antennas [140, 141], various reconfigurable transmit-arrays have been developed to mitigate the difficulties of reconfigurable reflect-arrays by exploring multilayered frequency-selective surfaces, microstrip-line-coupled patches, aperture-coupled stacked patches, and so on. In principle, such strategies are applicable to the design of transmission-type reprogrammable metasurfaces. 
A few transmission-type reprogrammable metasurfaces have been proposed recently; however, working bandwidth and final cost have to be traded off carefully to avoid the drawback of low efficiency. Besides, in the design of transmission-type programmable metasurfaces, the effect of the bias lines on the EM performance needs to be carefully considered.
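The effect of phase quantization discussed above can be illustrated with a simple array-factor calculation. This is a minimal sketch under assumptions that are not taken from the cited works: a uniform linear array of 64 isotropic elements at half-wavelength spacing, normal plane-wave illumination, and a hypothetical steering angle of 20°. It compares the main-lobe peak for ideal continuous phases with the peaks obtained after one-bit and two-bit quantization.

```python
import numpy as np


def quantize_phase(phase, bits):
    """Round phases (radians) to the nearest of 2**bits uniformly spaced levels."""
    step = 2.0 * np.pi / (2 ** bits)
    return np.mod(np.round(phase / step) * step, 2.0 * np.pi)


def array_factor(phases, theta_deg, spacing_wl=0.5):
    """|AF| of a uniform linear array under normal plane-wave illumination."""
    n = np.arange(phases.size)
    u = np.sin(np.deg2rad(theta_deg))
    steering = np.exp(1j * 2.0 * np.pi * spacing_wl * np.outer(u, n))
    return np.abs(steering @ np.exp(1j * phases))


N = 64          # number of meta-atoms (assumed)
theta0 = 20.0   # hypothetical steering angle in degrees
theta = np.linspace(-90.0, 90.0, 1801)
n = np.arange(N)
ideal = -2.0 * np.pi * 0.5 * n * np.sin(np.deg2rad(theta0))

peak_ideal = array_factor(ideal, theta).max()
peak_1bit = array_factor(quantize_phase(ideal, 1), theta).max()
peak_2bit = array_factor(quantize_phase(ideal, 2), theta).max()

for label, peak in (("1-bit", peak_1bit), ("2-bit", peak_2bit)):
    loss_db = 20.0 * np.log10(peak / peak_ideal)
    print(f"{label} main-lobe level relative to ideal: {loss_db:.2f} dB")
```

In this toy model the energy missing from the main lobe reappears in side lobes (for one-bit coding, largely in a mirror lobe), which is the leakage the text refers to; moving to two bits recovers most of the main-lobe level.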

2.2 Beam manipulations

The intelligent metasurface is usually regarded as a reprogrammable device that can convert the illumination beam into one with the desired wavefront and/or waveform by reprogramming its digital control coding. We here consider the first prototype of the one-bit reflection-type reprogrammable coding metasurface invented by Cui et al. [19] as an illustrative example, as shown in Fig. 2a. The reprogrammable coding metasurface consists of 30 × 30 one-bit phase-coding meta-atoms, and each meta-atom is embedded with a PIN diode (SMP-1320) to switch between the ‘0’ and ‘1’ states, leading to a 180° phase difference between their EM reflections. Specifically, ‘0’ means that the PIN diode is ‘OFF’ because the applied DC voltage is smaller than its threshold value, while ‘1’ means that it is ‘ON’. The minimum recovery time of the PIN diode can be designed to be as small as 10 ns, which corresponds to a maximum radiation-pattern switching rate on the order of 100 MHz. To reduce the complexity of the control circuits, the 30 × 30 meta-atoms were divided into six groups, each having 5 × 30 digital meta-atoms and being independently controlled by the FPGA. When the reprogrammable metasurface is illuminated by a plane wave, it can produce distinct radiation patterns by changing its digital coding properly through the FPGA. Figure 2a plots two distinct radiation patterns from different coding sequences of the reprogrammable metasurface, which shows its in-situ wave programmability. Here, for the “010101” coding sequence, the normally incident beam is deflected into two symmetric radiation directions, whereas when the coding sequence changes to “001011”, the radiation pattern has multiple beams. 
We can see from the above discussion that, much as an FPGA offers field programmability of digital logic at lower frequencies, the reprogrammable metasurface offers in-situ physical programmability for tailoring the wavefield at its operating frequencies. Although this system is a one-dimensional (1D) proof-of-principle demonstration, it could be extended to the 2D case and beyond with independently controllable meta-atoms. Then, the intelligent metasurface could be optimized to be a general-purpose wave-based analog computer (more details are provided in Sect. 4), which can perform complicated mathematical operations on the illuminating waves on the physical level, from fundamental algebraic operations to modern deep learning processing (see Fig. 2g). Note that such a wave computer can be nearly universal for wave-information processing, in the sense that these distinct operations can be accomplished with one physical entity, without any hardware modification, by reprogramming the digital control coding sequences.
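The link between a coding sequence and its radiation pattern can be sketched with a scalar array-factor model. The snippet below is an illustrative simplification, not the measured response of the prototype: it treats each column as an isotropic scatterer with reflection coefficient +1 (‘0’) or −1 (‘1’), assumes normal plane-wave illumination, and uses a hypothetical column period of half a wavelength, which is not stated in the text. The six-digit code is expanded into 30 columns with five columns per group, as described above.

```python
import numpy as np


def expand_code(group_code, columns_per_group=5):
    """Repeat each digit of the group code over the columns of its group."""
    return np.repeat([int(b) for b in group_code], columns_per_group)


def far_field(group_code, theta_deg, period_wl=0.5, columns_per_group=5):
    """Normalised-free |array factor| of a 1D one-bit coding sequence.

    period_wl is the column period in wavelengths (hypothetical value).
    """
    bits = expand_code(group_code, columns_per_group)
    weights = np.exp(1j * np.pi * bits)   # '0' -> +1, '1' -> -1
    n = np.arange(bits.size)
    u = np.sin(np.deg2rad(np.atleast_1d(theta_deg)))
    steering = np.exp(1j * 2.0 * np.pi * period_wl * np.outer(u, n))
    return np.abs(steering @ weights)


theta = np.linspace(-90.0, 90.0, 1801)
pattern_a = far_field("010101", theta)
pattern_b = far_field("001011", theta)

peak_angle = abs(theta[np.argmax(pattern_a)])
print(f"'010101': strongest lobes near +/-{peak_angle:.1f} deg")
print(f"'001011': broadside level = {far_field('001011', 0.0)[0]:.3f}")
```

In this model any one-bit code under normal incidence yields a pattern that is symmetric about broadside, because the weights are real-valued; the periodic “010101” code additionally cancels the broadside direction and splits the energy into two symmetric beams, in line with the behavior described for Fig. 2a.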

In the pioneering work by Cui et al., the beam-manipulation task is known a priori, implying that the digital control sequences are calculated offline under human supervision and loaded into the FPGA beforehand. Now, a natural question arises: can the programmable metasurface change its control sequences in situ to adapt to changes in the ambient environment without human intervention? Here, we provide two encouraging examples. The first is the so-called smart Doppler cloak (see Fig. 2d) proposed by Zhang et al. [59], which consists of a time-modulated intelligent metasurface embedded with a velocity sensor, an arbitrary waveform generator (AWG) and an unmanned feedback system. The velocity sensor detects the velocity of a moving target and then sends this information to a microcontroller unit (MCU). After receiving the velocity information, the MCU instructs the AWG through a host computer to generate a signal modulated with the detected Doppler frequency for driving the time–space-coding metasurface. Zhang et al. demonstrated that their intelligent metasurface system can achieve the Doppler cloaking effect over a relative frequency band of 40% for arbitrarily polarized incoming EM waves, and that such a cloak can respond self-adaptively to the changing velocity of the moving object and cancel different Doppler shifts in real time, without any human intervention. Another example is the smart self-programmable metasurface by Ma et al., which incorporates a gyroscope sensor and an online unmanned feedback algorithm, as shown in Fig. 2b. Ma et al. [52] demonstrated that their intelligent metasurface maintains its beam direction as its orientation changes, without human supervision. The proposed scheme can be extended by equipping the intelligent metasurface with other kinds of sensors so that it adapts to changes in the surrounding environment, including humidity, temperature, and illuminating light.
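The feedback chain of the Doppler cloak (velocity measurement, Doppler frequency, time-modulated reflection) can be sketched at the signal level. The code below is a hedged illustration rather than the implementation in [59]: it assumes the standard non-relativistic two-way Doppler shift for a reflector, models the time modulation as an ideal continuous linear phase ramp (a real time-coding metasurface uses quantized states, which also generate harmonics), and uses hypothetical values for the carrier frequency, velocity and sampling rate.

```python
import numpy as np

C = 299_792_458.0  # speed of light in free space (m/s)


def doppler_shift(v_radial, f0):
    """Non-relativistic two-way Doppler shift (Hz) of a reflector approaching at v_radial."""
    return 2.0 * v_radial * f0 / C


def compensation_phase(f_doppler, t):
    """Reflection-phase ramp (radians) that shifts the echo by -f_doppler."""
    return -2.0 * np.pi * f_doppler * t


def dominant_frequency(signal, fs):
    """Frequency (Hz) of the strongest FFT bin of a complex baseband signal."""
    spectrum = np.abs(np.fft.fft(signal))
    freqs = np.fft.fftfreq(signal.size, d=1.0 / fs)
    return freqs[np.argmax(spectrum)]


fs = 10_000.0                      # hypothetical baseband sampling rate (Hz)
t = np.arange(10_000) / fs         # one second of samples, 1 Hz resolution
f0 = 3.0e9                         # hypothetical carrier frequency (Hz)
v = 15.0                           # velocity reported by the sensor (m/s)

fd = doppler_shift(v, f0)
echo = np.exp(1j * 2.0 * np.pi * fd * t)                  # uncloaked echo
cloaked = echo * np.exp(1j * compensation_phase(fd, t))   # metasurface modulation

print(f"Doppler shift: {fd:.1f} Hz")
print(f"dominant offset without cloak: {dominant_frequency(echo, fs):.1f} Hz")
print(f"dominant offset with cloak:    {dominant_frequency(cloaked, fs):.1f} Hz")
```

When the sensor reports a new velocity, only `fd` and the phase ramp need to be recomputed, which mirrors the role of the MCU and AWG in the described system.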

Before closing this subsection, we remark that the aforementioned beam-steering strategies could be extended to other frequencies by developing corresponding active metasurfaces [57, 58, 75,76,77,78,79,80,81]. For instance, Wu et al. reported an all-dielectric electro-optic active metasurface based on the quantum-confined Stark effect at near-infrared wavelengths [58]. Holsteen et al. proposed electrically reconfigurable metasurfaces switched through the microelectromechanical movement of silicon antenna arrays created in standard silicon-on-insulator technology [57]. Zhang et al. proposed an electrically reconfigurable metasurface using a low-loss optical phase-change material, i.e., Ge2Sb2Se4Te1 [79]. In short, we expect that the intelligent metasurface will play a critical role in designing future unmanned devices that adapt to the ambient environment.

2.3 Wireless power transfer

Wireless power transfer (WPT) and energy harvesting are of great importance in the ever-growing number of energy-hungry practical scenarios in the IoT society [142,143,144,145,146,147]. Since Tesla’s pioneering invention in 1904, WPT has advanced remarkably in the realm of non-radiative transfer, showing promising potential in cardiac pacemakers, electric vehicles, consumer electronics, and so on. For instance, Assawaworrarit et al. proposed a WPT scheme using a nonlinear parity-time-symmetric circuit, which is robust over a distance variation of approximately one meter [147]. Such WPT solutions rely on magnetic coupling in the near field, implying that high-efficiency power transfer cannot be achieved for non-cooperative devices moving over long distances, such as aerial vehicles [148]. As mentioned previously, the intelligent metasurface can track moving devices with dynamic beamforming, and the resulting system supports robust WPT to moving devices in the realm of radiative transfer, offering a promising route for solving the above-mentioned problems.

Recently, Han et al. proposed a WPT system with the intelligent metasurface for dynamic charging applications [60], as shown in Fig. 2h, in which portable electronic devices, including cellphones and laptops, need to be charged. This smart WPT system consists of two major parts: the physical layer and the application layer. On the physical layer, a reflection-type two-bit intelligent metasurface, along with an input wireless energy source, is deployed as the power station, which is responsible for focusing the wireless power towards multiple intended devices individually. Here, the intelligent metasurface is equipped with several active sensors (e.g., camera, magnetic coil, receiving antenna, etc.) for localizing electronic devices and receiving their charging requests. On the application layer, when the smart WPT system detects a charging request, it makes a series of successive decisions (including device localization, coding pattern calculation and assignment, and others) and then performs the wireless charging or power transfer. In contrast to current WPT solutions that require the charged device to be fixed, the smart WPT system by Han et al. exploits the unique capability of the intelligent metasurface to dynamically steer the focused beams towards the intended devices, and thus supports wireless power transfer to moving devices in the radiative region. Thereby, annoying charging cables or charging pads can be avoided. Han et al. demonstrated experimentally that their WPT system could improve the harvested energy by about 16.3 dB compared to the case using the measured fixed near-field-focusing (NFF) form [60].
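The coding-pattern calculation on the application layer can be sketched with textbook phase conjugation. This is a minimal illustration under stated assumptions and is not claimed to be the procedure used in [60]: meta-atoms are modeled as isotropic point scatterers, the feed is a point source, the ideal reflection phase compensates the feed-to-element plus element-to-focus path, and that phase is rounded to two bits. The operating frequency (5.8 GHz), the 16 × 16 aperture with half-wavelength period, and all positions are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in free space (m/s)


def element_positions(n_side, period):
    """Centres of an n_side x n_side array lying in the z = 0 plane."""
    idx = (np.arange(n_side) - (n_side - 1) / 2.0) * period
    x, y = np.meshgrid(idx, idx, indexing="ij")
    return np.stack([x.ravel(), y.ravel(), np.zeros(x.size)], axis=1)


def focusing_code(pos, source, focus, wavelength, bits=2):
    """Quantized phase-conjugation code that focuses the feed onto `focus`."""
    k = 2.0 * np.pi / wavelength
    d_in = np.linalg.norm(pos - source, axis=1)
    d_out = np.linalg.norm(pos - focus, axis=1)
    ideal = np.mod(k * (d_in + d_out), 2.0 * np.pi)
    levels = 2 ** bits
    return np.mod(np.round(ideal / (2.0 * np.pi / levels)), levels).astype(int)


def received_power(pos, code, source, point, wavelength, bits=2):
    """Relative power at `point` from a scalar point-scatterer model."""
    k = 2.0 * np.pi / wavelength
    d_in = np.linalg.norm(pos - source, axis=1)
    d_out = np.linalg.norm(pos - point, axis=1)
    phase = 2.0 * np.pi * code / (2 ** bits)
    field = np.sum(np.exp(1j * (phase - k * (d_in + d_out))) / (d_in * d_out))
    return np.abs(field) ** 2


wavelength = C / 5.8e9                          # hypothetical operating frequency
pos = element_positions(16, wavelength / 2.0)
source = np.array([0.0, 0.0, 0.5])              # feed position (m), assumed
device = np.array([0.2, 0.1, 0.6])              # localized device (m), assumed
moved = np.array([-0.1, 0.15, 0.7])             # device after it moves (m), assumed

code = focusing_code(pos, source, device, wavelength)
uniform = np.zeros(code.size, dtype=int)
p_focus = received_power(pos, code, source, device, wavelength)
p_uniform = received_power(pos, uniform, source, device, wavelength)
print(f"focusing gain over uniform code: {10 * np.log10(p_focus / p_uniform):.1f} dB")

new_code = focusing_code(pos, source, moved, wavelength)
p_old = received_power(pos, code, source, moved, wavelength)
p_new = received_power(pos, new_code, source, moved, wavelength)
print(f"gain from re-coding after the move: {10 * np.log10(p_new / p_old):.1f} dB")
```

The second comparison mirrors the dynamic-charging scenario: once the sensors report a new device position, only the coding pattern has to be recomputed and reassigned.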

Several potential advantages of the proposed smart WPT system with the intelligent metasurface are worth noting. Firstly, the smart WPT system can keep human exposure below the EM safety level. For instance, the system can be designed to shut off power delivery instantly when it detects a person moving close to or entering the charging region, and to resume once the person leaves. Secondly, the smart WPT system can be optimized as a simultaneous wireless information and power transfer system when it is integrated with wireless sensor networks, communication modules and advanced algorithms. Thirdly, the WPT can be further extended to meet various needs such as automatic charging, monitoring, and microwave hyperthermia. In short, the proposed WPT strategy could open a new avenue for WPT with high efficiency, safety, and intelligence.

2.4 Dynamic holograms

Here, we discuss another interesting application of intelligent metasurfaces, i.e., dynamic holograms. For this purpose, the intelligent metasurface is designed to convert the illumination into a wave with the desired wavefront profile. Metasurfaces have been widely used for designing holograms in the past decade [49, 50, 149,150,151,152], and metasurface holograms have been proposed in various frequency regimes to achieve holographic images with high efficiency, good image quality and full colors. However, most of them are limited to the “static” scenario, in the sense that only one or a few specific images can be generated once the metasurface is fabricated. Li et al. proposed the first reprogrammable metasurface hologram [84], which addressed several critical issues associated with static metasurface holograms, featuring simplicity, rewritability, high image quality and high efficiency. Figure 2i shows a sketch of the programmable holographic imaging based on a one-bit programmable metasurface with 20 × 20 macro meta-atoms, where each constituent meta-atom is a hybrid structure integrated with a PIN diode. By incorporating a PIN diode into the meta-atom, the EM state of the meta-atom can be tailored by switching the PIN diode between its ‘ON’ and ‘OFF’ states with different bias voltages. Thus, the desired phase responses across the metasurface hologram can be achieved in an inexpensive and dynamic way. As such, a single metasurface hologram can accomplish various functions dynamically. Figure 2i reports a set of experimental holographic images, namely, the sentence “LOVE PKU! SEU! NUS!”, together with the corresponding coding patterns of the metasurface hologram, with a total hologram reconfiguration time of around 33 ns. Here, we make three remarks on the proposed reprogrammable metasurface hologram. 
Firstly, the reprogrammable metasurface hologram can be extended to multiple digital bits for both phase and amplitude modulation, which leads to more versatile devices with adaptive and rewritable functionalities. Secondly, an immediate and interesting application of such a hologram is the design of AI-empowered sensing systems, where the measurement modes desired by machine learning techniques can be generated by the intelligent metasurface on the physical level, as detailed in Sect. 4. Thirdly, the reprogrammable metasurface hologram could be extended to other frequencies by exploring dynamic modulation masks. Several switchable elements may facilitate the frequency scaling, e.g., MEMS [57, 72], silicon, magnesium (Mg) [149] and the thermal VO2 diode [63] at visible frequencies. For instance, Li et al. invented the Mg-based dynamic metasurface platform and used it for dynamic holography and optical information encryption [149].

The control coding sequences of the metasurface are usually designed with iterative approaches, including the Gerchberg–Saxton (GS) algorithm [153] and stochastic optimization algorithms [154], which limits the deployment of intelligent metasurfaces in many practical settings with strong demands on efficiency and capability. Here, we point out a general framework for metasurface holograms in the context of deep generative networks. Mathematically, the desired radiation pattern \(y\) produced from an illumination \(x\) by the intelligent metasurface can be represented as \(y = f_{{\Theta }} \left( x \right)\), where \({\Theta } \in \left\{ {0,1} \right\}^{N}\) encapsulates the states of the N meta-atoms of the intelligent metasurface under the assumption of one-bit coding. Typically, the function \(f\) behaves in a nonlinear way and can be trained in the context of deep generative networks, e.g., with a generative adversarial network (GAN) strategy. Motivated by this observation, we proposed an efficient non-iterative solution to the design of intelligent metasurface holograms [56], i.e., VAE-cGAN. In remarkable contrast to a conventional cGAN, which requires a large amount of labeled training data, our VAE-cGAN is trained under the supervision of the physical mechanism of wave–matter interaction rather than labeled data, and thus avoids this difficulty of the conventional cGAN. Specifically, the physical relationship between the electric-field distribution and the intelligent metasurface is used to model the VAE decoding module of the VAE-cGAN. After the VAE-cGAN is well trained, we call the generator part to generate the desired intelligent metasurface holograms. The non-iterative property of the generator enables high-quality holographic imaging with high efficiency, which has been validated numerically and experimentally. 
It can be expected that the smart holograms can be developed by deploying our VAE-cGAN on neural network chips, finding more valuable applications in wireless communications, microscopy, and so on.
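
For comparison with the non-iterative generator, the iterative baseline mentioned above can be sketched as a GS-style loop for a one-bit phase coding pattern. This is a minimal illustration, not the implementation used in [153] or [56]: it assumes that the far field is a 2-D FFT of the aperture field (ignoring the meta-atom element pattern and mutual coupling), and the function name `gs_one_bit` and the target pattern are hypothetical.

```python
import numpy as np

def gs_one_bit(target_amp, n_iter=50, seed=0):
    """GS-style search for a one-bit (0 / pi) phase coding pattern.

    Assumption: the far field is modeled by a 2-D FFT of the aperture field.
    """
    rng = np.random.default_rng(seed)
    phase = rng.choice([0.0, np.pi], size=target_amp.shape)  # random initial coding
    for _ in range(n_iter):
        far = np.fft.fft2(np.exp(1j * phase))            # aperture -> far field
        far = target_amp * np.exp(1j * np.angle(far))    # impose the target amplitude
        back = np.fft.ifft2(far)                         # far field -> aperture
        # one-bit quantization: pick the state (0 or pi) closest to the
        # back-propagated phase
        phase = np.where(np.cos(np.angle(back)) >= 0, 0.0, np.pi)
    return (phase > 0).astype(np.uint8)                  # coding bits in {0, 1}

target = np.zeros((32, 32))
target[4, 8] = 1.0   # hypothetical single focal spot on the far-field grid
coding = gs_one_bit(target)
```

Each call repeats the forward and backward transforms `n_iter` times for every new target, which is the cost that a trained generator avoids by producing the coding pattern in a single pass.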

2.5 Invisibility cloak

Intelligent metasurfaces are powerful in manipulating wave–information–matter interactions on the physical level, and thus they are capable of controlling the EM temporal–spatial response of an object by changing the surrounding environment. For example, when properly designed, they can guide waves around the object, rendering it invisible to EM detectors; this is known as the invisibility cloak. The invisibility cloak remained a fantasy until the emergence of metamaterials and transformation optics [41,42,43,44,45,46, 155,156,157,158,159,160]. Ideally, an invisibility cloak is able to self-adaptively adjust its internal structure to make hidden objects invisible. In the past decade, various metamaterial-based invisibility cloaks have been proposed, but little progress has been made on some of the most crucial challenges that hinder their utility. For instance, they all face the fundamental difficulty of implementing bulky composite materials with anisotropy and inhomogeneity. More importantly, they all lack intelligence, in the sense that they work in a static manner and cannot adapt to changes in the object or the ambient environment. Here, we argue that the intelligent metasurface is a good candidate for addressing the above difficulties, owing to its ultrathin structure, programmability, and intelligence.

We take the intelligent invisibility cloak (see Fig. 2f) developed by Qian et al. [55] as an example to illustrate the critical ingredients and operational principle. The intelligent metasurface in [55] consists of 24 × 28 electronically controllable meta-atoms; each meta-atom is integrated with a varactor diode and has a reflection spectrum controlled by its bias voltage. In designing an intelligent invisibility cloak, a fundamental but challenging task is to determine how the cloak structure depends on the illumination wave and the surrounding environment. To resolve this difficulty, Qian et al. proposed a deep learning solution (i.e., a pre-trained ANN) that approximates the intricate relationship between the incident wave and reflection spectrum on the one hand and the bias voltage applied to each meta-atom on the other, so that all bias voltages can be calculated automatically and supplied instantly to the invisibility cloak. In addition, two active detectors were introduced to monitor changes in the illumination(s) and the surrounding environment, respectively. For instance, changes in the incident waves or the background are detected in real time and translated instantly by the ANN into the cloak configuration. The intelligent metasurface cloak, under the control of the detectors and the pretrained ANN, can then hide object(s) even when the incident wave and ambient environment change rapidly on a millisecond timescale. In short, embedded with active detectors and a pretrained ANN, the intelligent metasurface cloak exhibits effective and robust self-adaptability in response to a rapidly changing incident wave and background, without human intervention.

3 Wireless communications

Wireless communication has become an essential tool for meeting the ever-expanding demand for wireless information transfer in modern society [161,162,163,164,165,166]. A fundamental measure quantifying the performance (e.g., communication rate, security, and so on) of wireless communication systems is the information capacity [161, 166]. Given a wireless communication system with the frequency spectrum and spatial–temporal channels, the well-known Shannon’s information theory states that the ultimate bound on the information capacity it can achieve is determined by the signal-to-noise ratio (SNR) [161, 166]. Conventionally, the capacity is also bounded by the number of available channels, e.g., the independent spectral and spatial degrees of freedom. To meet the ever-expanding demand for more information transfer, notably given the advent of IoT, a wide range of solutions have been proposed, including elaborate coding schemes (e.g., OFDM), elaborate antenna designs (e.g., massive MIMO) [163,164,165], and even the engineering of the propagation medium’s disorder [167]. However, these approaches face many practical challenges in energy- and infrastructure-limited cases, where bulky systems are hard to deploy and portable devices such as IoT sensors and handhelds are in demand. A natural question arises: given a deployed wireless communication system, can we efficiently improve its information capacity? The answer is encouraging. From the discussions in Sect. 2, we can envision two scenarios of metasurface-aided wireless communications: (i) the radio signal energy, which is emitted from the transmitter but dispersed in space, is recycled and directed towards the desired user, leading to a remarkably improved SNR and thus communication performance; (ii) additional information is encoded into the intelligent metasurface on the physical level and transferred to the users. 
Along this research route, many metasurface-aided wireless communication architectures [168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187] have been suggested recently. In the communication community, the intelligent metasurface is usually referred to as the reconfigurable intelligent surface (RIS) [168,169,170,171,172,173,174,175,176,177,178] and has gained ever-increasing interest in the past years. Here, we categorize them into three major types: (A) non-modulated-metasurface backscatter communications, (B) modulated-metasurface backscatter communications, and (C) ambient modulated-metasurface backscatter communications, as shown in Fig. 3.
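
The dependence of capacity on SNR and on the number of channels can be made concrete with the Shannon–Hartley formula, \(C = B\log_{2}(1 + \mathrm{SNR})\) per channel. The sketch below is only illustrative: the bandwidth and SNR values are hypothetical, and multiplying by `n_channels` assumes independent, identical channels.

```python
import math

def shannon_capacity(bandwidth_hz, snr_db, n_channels=1):
    """Shannon-Hartley capacity in bit/s for n independent, identical channels."""
    snr_linear = 10 ** (snr_db / 10)          # convert dB to a linear power ratio
    return n_channels * bandwidth_hz * math.log2(1 + snr_linear)

# Scenario (i): the metasurface redirects dispersed energy and raises the SNR.
base = shannon_capacity(20e6, snr_db=10)      # hypothetical 20 MHz link at 10 dB
boosted = shannon_capacity(20e6, snr_db=20)   # same link after a 10 dB SNR gain

# More effective spatial channels scale the capacity linearly in this idealized model.
multi = shannon_capacity(20e6, snr_db=10, n_channels=4)
```

Because the SNR enters through a logarithm while the channel count enters linearly, increasing the effective number of channels is the stronger lever at high SNR, which is consistent with the interest in using metasurfaces to shape the propagation environment.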

Fig. 3

Three novel wireless communication architectures based on intelligent metasurfaces. a Conceptual illustration of non-modulated-metasurface backscatter wireless communication connecting the digital world and physical world, which is achieved by deploying an RF source, a nonlinear mixer, a wideband power amplifier, and other necessary circuits. Here, the intelligent metasurface, usually called RIS in the communication community, can be regarded as a passive relay or an extension of conventional antenna arrays. Several examples of NBWC are provided in b–d. b Potential applications of RIS in smart radio environment [171], c system model for IRS-assisted secure communication system [174], d illustration of the RIS-based computation off-loading in mobile edge computing system [176]. e Conceptual illustration of modulated-metasurface backscatter wireless communication by deploying an RF source and intelligent metasurface. Here, the digital information to be transferred is directly encoded into the metasurface on the physical level, and the signal modulation is achieved by the wave-metasurface interaction. Specifically, the intelligent metasurface plays two roles: mixer and antenna. Several examples of MMBWC are provided in f–h. f Schematic illustration of the operational mechanism of the direct modulation wireless communication system in [183], g experimental scenario of the dual-channel wireless communication system based on the space–time-coding digital metasurface [184]. h A photo of the metasurface-based MIMO wireless communication prototype [180]. i Conceptual illustration of ambient modulated-metasurface backscatter wireless communication by an intelligent metasurface. The intelligent metasurface plays three critical roles: mixer, antenna, and energy-harvesting collector. j–l The demo system of AMMBC proposed by Zhao et al. and some selected results [186]. 
Figures adapted with permission from: (b)–(d) IEEE, (f)–(g) under a CC BY 4.0 license, (h) IEEE, (j)–(l) under a CC BY 4.0 license

3.1 Non-modulated-metasurface backscatter communications (NMMBCs)

NMMBCs work similarly to conventional wireless communication systems, but use intelligent metasurfaces to recycle the energy dispersed in space, which was conventionally thought to be useless, and thereby further improve the SNR. As in conventional wireless communication systems, NMMBC requires a dedicated RF carrier to carry the information to be transferred, and the signal modulation or demodulation is performed with nonlinear RF mixers. However, as opposed to the conventional wireless system, in NMMBC the intelligent metasurface is deployed to shape the ambient environment such that the effective number of information channels can be increased (see Fig. 3a, b). In NMMBC, the intelligent metasurface is used to extend the antenna aperture of the conventional wireless communication system in a distributed manner. In other words, the intelligent metasurface can be regarded as an extension of the antenna arrays of conventional wireless communication systems, one that is connected to the RF source through air rather than transmission lines [177].

Besides, the intelligent metasurface has several distinctive properties. First, the intelligent metasurface can be optimized to match any RF source and associated modules, since it improves the communication performance by tailoring the surrounding environment for all nearby devices instead of modifying the transmitting and receiving devices. Second, unlike the transmission lines in conventional communication systems, the intelligent metasurface does not involve high-speed signals [177], and thus it can be easily incorporated into the ambient environment to remarkably improve the SNR and thus the information capacity of conventional systems. For instance, Tang et al. demonstrated theoretically that intelligent metasurfaces help improve the energy efficiency of power allocation at the base station [177]. Hougne et al. demonstrated that a one-bit reconfigurable metasurface can be optimized to remarkably increase the equivalent number of channels of MIMO wireless communication systems [167]. More recently, in the wireless communication community, the RIS has been numerically demonstrated to be helpful in enhancing secure transfer [173, 174] (see Fig. 3c), assisting computation off-loading in mobile edge computing [175, 176] (see Fig. 3d), and so on. Overall, interest in this topic is growing rapidly, and we refer interested readers to Refs. [170,171,172] for more comprehensive reviews of recent progress.

3.2 Modulated-metasurface backscatter communications (MMBCs)

It has been demonstrated in NMMBCs that the introduction of the intelligent metasurface helps improve the performance of conventional wireless communication systems. However, there are several critical challenges related to the need for active carrier signal generation: costly and heavy hardware (e.g., oscillators, nonlinear mixers, and wideband power amplifiers), power consumption, spectrum allotment issues, and security. By security we mean that, for NMMBCs, the information distributed in space is independent of the radiation direction, implying that it can be eavesdropped if the eavesdropper’s detector is sensitive enough. These issues are particularly pressing for IoT connectivity, since IoT devices are required to be lightweight, cheap, and green. In sharp contrast to NMMBCs, in MMBCs the sequence of digital information is directly encoded into the time–space-coding intelligent metasurface on the physical layer [168, 179,180,181,182,183,184,185, 188], so that the information-carrying intelligent metasurface directly modulates the radio signal from the RF source, as illustrated in Fig. 3e–h. MMBCs evidently work similarly to the backscatter communications developed in the RFID area [188,189,190,191], but use an inexpensive large-aperture intelligent metasurface. From the perspective of classical MIMO wireless communications, the inexpensive intelligent metasurface has a large number of independently controllable antenna elements, and thus supports massive spatial communication channels. Therefore, with the aid of the time–space-coding intelligent metasurface, the information capacity can be drastically improved compared to that of conventional backscatter communication systems. 
We remark that the MMBC can also be considered a direct-modulation communication scheme [182, 183], where the intelligent metasurface, as a part of the transmitter, produces information-dependent radiation beams. In this sense, compared with conventional backscatter communication systems, MMBCs can ensure communication secrecy very well. Recently, this idea has also been explored in millimeter-wave wireless communication by using a custom-designed spatio-temporal phased array [188], which is demonstrated to be resilient to distributed eavesdropper attacks. Besides, since the intelligent metasurface is powerful in shaping arbitrary wavefronts and waveforms simultaneously, novel wireless communication schemes with more flexible information modulations can be developed, for instance, controllable orbital angular momentum (OAM) modulations [182].

We take the MMBC system developed by Zhang et al. as an example to demonstrate the operational mechanisms behind MMBC. The intelligent metasurface serves not only as a large-aperture antenna radiating the information-carrying radio signals towards space, but also as an analog mixer replacing the costly analog–digital converter and RF network required by conventional heterodyne architectures. We would like to provide some insights into the MMBC system [25, 184]. To that end, we assume that the radio source emits a single-tone incident signal \(E_{i}\left( t \right) = \cos\left( \omega_{c} t \right)\) with the operational angular frequency \(\omega_{c}\), and that the meta-atom has the time-dependent reflection coefficient \(\Gamma\left( t \right)\). Then, the reflected signal of the meta-atom is \(E_{r}\left( t \right) = E_{i}\left( t \right)\Gamma\left( t \right)\), and its frequency-domain representation is \(\tilde{E}_{r}\left( \omega \right) = \frac{1}{2}\left[ \tilde{\Gamma}\left( \omega + \omega_{c} \right) + \tilde{\Gamma}\left( \omega - \omega_{c} \right) \right]\), i.e., the baseband spectrum is shifted to \(\pm\omega_{c}\). Recall that the baseband information \(\Gamma\left( t \right)\) has been directly encoded into the meta-atom on the physical level. It is clear that the signal up-conversion is achieved by the physical interaction between the meta-atom and the carrier signal, which completely avoids the RF components required in conventional super-heterodyne wireless communication systems, for instance, nonlinear mixing components, wideband power amplifiers, and the associated accessory circuits. In this way, MMBC has a significant advantage over NMMBC in terms of complexity, cost, and energy consumption. Figure 3g sketches the MMBC process designed by Zhang et al. [183], where two destination users are placed at two different locations, and on–off-keying modulation and demodulation are considered. During the experimental test, two color pictures (the logo of the State Key Laboratory of Millimeter Waves and the logo of Southeast University) were first encoded and then translated into control signals to drive the intelligent metasurface. The carrier signal emitted from a feeding horn antenna was modulated and reflected by the intelligent metasurface, and the radiated signals in free space were then individually received by two horn antennas and demodulated into digital signals by the receiver to recover the two pictures simultaneously. When user #1 moves to an undesired direction (for example, θ = 0°), the correct digital information cannot be retrieved from the received signal, regardless of the transmitting power level and the sensitivity of the detectors. Hence, this demonstrates that the proposed time-modulated intelligent metasurface supports high-quality wireless communications with a significantly simplified system architecture, and that it can provide secure wireless communications via directional modulation.
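
The up-conversion by a time-varying reflection coefficient can be checked numerically. The sketch below multiplies a single-tone carrier by a baseband \(\Gamma(t)\) and locates the resulting spectral lines; all frequencies are illustrative values chosen for a short simulation, and a sinusoidal \(\Gamma(t)\) is an assumption made only to obtain two clean sidebands.

```python
import numpy as np

fs = 1000.0                      # sampling rate in Hz (illustrative)
fc = 100.0                       # carrier frequency in Hz (illustrative)
fm = 10.0                        # baseband modulation frequency in Hz (illustrative)
n = 1000                         # 1 s of samples -> 1 Hz frequency resolution
t = np.arange(n) / fs

gamma = np.cos(2 * np.pi * fm * t)   # assumed time-varying reflection coefficient
e_i = np.cos(2 * np.pi * fc * t)     # single-tone incident signal E_i(t)
e_r = e_i * gamma                    # reflected signal E_r(t) = E_i(t) * Gamma(t)

spectrum = np.abs(np.fft.rfft(e_r)) / n
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
# the two strongest spectral lines of the reflected signal
peaks = sorted(freqs[np.argsort(spectrum)[-2:]])
```

The two lines appear at `fc - fm` and `fc + fm`, i.e., the baseband content of \(\Gamma(t)\) is translated to the carrier without any dedicated mixer in the signal chain.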

3.3 Ambient modulated-metasurface backscatter communications (AMMBCs)

We now turn to the ambient modulated-metasurface backscatter communications (AMMBCs). As implied by the name, AMMBCs work similarly to ambient backscatter communications [189,190,191], and thus they share a common advantage: neither dedicated RF sources nor new frequency spectrum is needed, because the carrier is provided by ambient RF sources, such as TV towers, Bluetooth, cellular base stations, and Wi-Fi. Compared with conventional ambient backscatter communications with one or a few controllable antennas, the AMMBC exploits the intelligent metasurface with massive controllable elements for wave-information manipulation, and thus has three unique strengths: nearly no effect on background wireless communications, multi-user secure communications, and a higher data rate. Conceptually, the intelligent metasurface in AMMBC is used to program the propagation environment of a wave with unknown characteristics (source location, angle of arrival, and shape of the incident wavefront), which differs sharply from existing communication schemes. In particular, the intelligent metasurface in AMMBC has three major purposes: (i) encoding the digital information to be transferred on the physical level; (ii) directly modulating the ambient stray signals with high SNR; and (iii) assigning users’ information via information-dependent beamforming.

Zhao et al. proposed the first AMMBC framework by manipulating commodity 2.4 GHz Wi-Fi signals (called MBWC), and demonstrated secure wireless communications without any active radio components at data rates on the order of hundreds of kbps, as shown in Fig. 3i–l [186]. In their demo, an inexpensive intelligent metasurface with 768 independently controllable meta-atoms is deployed. For ambient backscatter communications, the critical challenge is the difficulty of modulating and demodulating the baseband signal, since the ‘carrier’ is an unknown, non-stationary stray wireless signal. To address this difficulty, Zhao et al. proposed an efficient approach that deploys two receiving antennas (referred to as the master and slave antennas, respectively) and controls the intelligent metasurface such that the energy of the wireless signal is focused towards the master antenna alone. As such, the intelligent metasurface is visible only to the master antenna, and is invisible to others, including the slave antenna. Then, the demodulated signal, denoted by \(\hat{H}_{{{\text{s}} \to {\text{meta}} \to {\text{mr}}}}\), can be easily obtained by a simple normalized coherence computation on the signals collected by the two coherent receiving antennas. In this way, the information encoded into the metasurface can be transferred to the user with very high SNR and security, while having nearly no effect on the background communications. As a proof-of-concept demonstration, we considered a three-channel AMMBC with binary phase shift keying (BPSK) modulation and demodulation. The amplitude and phase distributions of \(\hat{H}_{{{\text{s}} \to {\text{meta}} \to {\text{mr}}}}\) at a distance of z = 3 m from the metasurface are reported in Fig. 3l. Note that the intensities \(\left| {\hat{H}_{{{\text{s}} \to {\text{meta}} \to {\text{mr}}}} } \right|\) are focused around the three intended users, and the phases of \(\hat{H}_{{{\text{s}} \to {\text{meta}} \to {\text{mr}}}}\) are well controlled in the desired manner. Clearly, the three-channel MBWC-BPSK modulation is easily achieved, and the sequence of BPSK digital information can be independently controlled for each channel. Based on the resulting 8 (i.e., \(2^{3}\)) MBWC-BPSK coding patterns, we can realize the AMMBC transmission of a full-color image from Alice to Bob.
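
Why two coherent antennas remove the unknown carrier can be illustrated with a toy model. The estimator below (cross-correlation of the two antenna signals normalized by the slave-antenna power) is an assumed form of the normalized coherence computation, not necessarily the exact expression used in [186]; the model is also idealized (noise-free, unit channel gains, the master antenna sees only the metasurface-modulated copy), and all names are hypothetical.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

def normalized_coherence(y_master, y_slave):
    """Assumed estimator: cross-correlation normalized by the slave-antenna power."""
    return np.vdot(y_slave, y_master) / np.vdot(y_slave, y_slave)

# Unknown, non-stationary ambient carrier (stand-in: complex Gaussian samples).
s = rng.standard_normal(256) + 1j * rng.standard_normal(256)

# Three channels with one BPSK bit each give 2**3 = 8 coding patterns.
patterns = list(product([0, 1], repeat=3))

decoded = []
for bits in patterns:
    rx_bits = []
    for b in bits:
        y_slave = s                             # slave antenna: ambient signal only
        y_master = np.exp(1j * np.pi * b) * s   # master antenna: BPSK phase 0 or pi applied
        h = normalized_coherence(y_master, y_slave)
        rx_bits.append(int(h.real < 0))         # phase pi -> bit 1, phase 0 -> bit 0
    decoded.append(tuple(rx_bits))
```

In this idealized setting the estimate depends only on the phase imposed by the metasurface, because the unknown carrier samples cancel in the ratio; a realistic link would add noise, unequal channel gains, and the direct path to the master antenna.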

Now, we can observe that AMMBCs have interesting advantages in comparison with MMBCs and NMMBCs: they fundamentally remove the harsh requirements on allocating new frequency spectrum and deploying dedicated RF sources, which can remarkably improve the spectrum resource utilization and reduce the hardware cost and power consumption. Of course, the current AMMBC scheme can be further improved in several aspects, for instance, to develop an optimal coding strategy of the intelligent metasurface for modulation and information-dependent beamforming [187], and to design more specialized meta-atoms for faster switching. In addition, the AMMBC strategy can be extended to other frequencies and other types of wave phenomena for more applications.

4 Computing and sensing

There is no doubt that computing is of fundamental importance to people’s daily lives, and that electronic digital processors currently prevail over other strategies. Over the past 60 years, electronic digital computing has evolved from CPUs for scalar computations to GPUs for tensor computations. However, the further development of electronic digital processors is constrained as Moore’s law approaches its limits. To overcome this limitation, many computing platforms have been proposed over the past decades, e.g., wave-based computing, quantum computing, AI-based chips, and application-specific integrated circuits. Among these platforms, wave-based analog computing is particularly attractive, since it can be optimized to process high-dimensional data in parallel through the weak wave–matter interaction on the physical level [128,129,130,131,132,133,134, 186, 187, 192,193,194,195]. Compared with electronic digital computing, wave-based analog computing has two more advantages: parallelism, through the use of the many available wave-domain division multiplexing techniques, and negligible energy consumption, owing to the weak wave–matter interaction. Wave-based analog computing is not a new concept; it can be traced back to the pioneering work of Foucault in 1859 [193], and thus has more than 160 years of history. To date, many optical analog computing schemes have been explored, and we refer interested readers to the excellent review papers [192, 194,195,196]. In this work, we are particularly interested in the scenario of the robot–human alliance: how does the robot process and understand data? For instance, the robot is designed to understand human behaviors in the physical world and build a digital recognition map of the physical world in a contactless way, which can be achieved by remotely sensing where people in the physical world are, what they are doing, what they want to express, what their physiological states are, and so on. 
To examine these features, we consider the EM sensing as an illustrative example in this review.

EM sensing has been widely demonstrated to be a powerful nondestructive examination tool under all-weather and all-time operational conditions [197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212], serving as a fundamental asset in science, engineering, and the military. In the picture of the robot–human alliance, the robot forms the digital recognition map of the physical world (humans plus the surrounding environment) via sensing, implying that sensing plays a crucial role in bridging the gap between the digital world and the physical world, as conceptually shown in Fig. 4a–c. Typically, an entire sensing chain has two major building blocks: data acquisition and data postprocessing. To date, three kinds of popular sensing schemes have been proposed: real-aperture imaging [141, 201], synthetic aperture imaging [197, 199, 202], and coding aperture imaging [203,204,205,206,207,208,209,210,211,212]. Conventional sensing systems have to trade off the cost and performance of data acquisition against those of data postprocessing, especially when dealing with the ‘data crisis’. For instance, the coding-aperture and synthetic-aperture sensing strategies can produce high-quality images with one or a few sensors, but at the cost of computationally inefficient digital algorithms. In contrast, the real-aperture strategy can be optimized to place nearly negligible pressure on the digital data processing, but requires a costly array of massive sensors for data acquisition. To tackle these formidable challenges, intelligent metasurfaces, which combine ultrathin artificial materials (AM) for wave-based analog computing on the physical level with artificial intelligence (AI) for powerful data processing on the digital level [25, 156, 158, 159, 185, 188, 189], emerge as intelligent hybrid-computing-based sensing platforms in response to the proper time and conditions, and have attracted growing interest over the past decade. Here, we would like to highlight three representative advances.

Fig. 4

Three kinds of computation-enabled intelligent sensing using intelligent metasurface. a Conceptual illustration of nearly digital-computational-free intelligent sensing featuring its role in bridging the physical world and digital world. The range of the scene probed (marked in blue) is small in comparison to the whole scene in yellow, due to the use of metasurface-based linear data computation. b Programmable sensing system invented by Li et al. [122]. c The reprogrammable artificial intelligence machine (PAIM) based on an array of intelligent metasurfaces proposed by Liu et al. [219]. d All-optic reconstruction using diffractive networks proposed by Rahman and Ozcan [218]. e Conceptual illustration of hybrid-computing-based intelligent sensing, featuring its role in bridging the physical world and digital world. Here the hybrid computing consists of analog computing on the physical level and digital computing on the digital level. The analog computing for high-dimensional data reduction is achieved by the metasurface, while the digital computing for data postprocessing is done by using advanced signal processing techniques or artificial neural networks. f Computational metamaterial imager proposed by Hunt et al. [221]. g Intelligent EM metasurface imager and recognizer working at 2.4 GHz proposed by Li et al. [51], which has active and passive operational modes. For the passive mode, the intelligent metasurface is used to manipulate the ambient stray wireless signal already available in our daily lives. h The operational flowchart of intelligent metasurface sensor by Li et al. [51]. i Conceptual illustration of the hybrid-computing-based intelligent integrated sensing system, where the data acquisition on the physical level and data postprocessing on the digital level are integrated as a whole and are learned simultaneously. j–l Correspond to the schemes proposed by Li et al. [54], Hougne et al. [53] and Tseng et al. [232], respectively. 
Figures (b)–(d), (f)–(h) and (j)–(l) adapted with permission under a CC BY 4.0 license

4.1 Nearly digital-computing-free intelligent sensing

Nowadays, for most practical sensing systems, the most important yet challenging problem is dealing with high-dimensional data, or the ‘data crisis’. Fortunately, high-dimensional data have structured representations in many practical scenarios; the well-known Johnson–Lindenstrauss lemma states that structured high-dimensional data can be projected into a low-dimensional feature space with nearly negligible information loss through a properly designed linear transform [213]. In other words, the essential information of the high-dimensional data can be retrieved from a remarkably reduced set of measurements in most practical settings. This is theoretically grounded and has been widely explored, especially since the emergence of compressed sensing theory in the mid-2000s [203, 214,215,216]. By now, there are many popular linear embedding transforms with the so-called restricted isometry property (RIP). Among them, some embedding transforms, for instance, principal component analysis (PCA) [217], allow for low-dimensional representations with mathematically or physically meaningful features, implying that the target information can be retrieved from these low-dimensional features in an almost digital-computation-free way.
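
The distance-preserving behavior behind the Johnson–Lindenstrauss lemma can be observed with a random Gaussian projection. The following sketch uses synthetic data and hypothetical dimensions; it illustrates the lemma itself and is not tied to any specific metasurface measurement.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, dim_high, dim_low = 50, 2000, 400   # hypothetical sizes

x = rng.standard_normal((n_points, dim_high))                   # high-dimensional data
a = rng.standard_normal((dim_high, dim_low)) / np.sqrt(dim_low)  # random linear embedding
y = x @ a                                                       # low-dimensional features

def pairwise_dist(z):
    """Euclidean distances between all distinct pairs of rows of z."""
    g = z @ z.T
    sq = np.diag(g)[:, None] + np.diag(g)[None, :] - 2 * g
    iu = np.triu_indices(len(z), k=1)
    return np.sqrt(np.maximum(sq[iu], 0.0))

# ratio of embedded to original distance for every pair of points
ratio = pairwise_dist(y) / pairwise_dist(x)
```

With a five-fold reduction in dimension, the ratios stay close to one, which is the sense in which a small number of linear measurements can retain the geometry of the data.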

We consider the use of linear embedding techniques in intelligent metasurface sensors [122]. As explored in Sect. 2.4, the intelligent metasurface is capable of generating nearly arbitrary radiation patterns, including the measurement modes desired by machine learning techniques. Inspired by this, we proposed the concept of a machine-learning reprogrammable imager (see Fig. 4b), in which the intelligent metasurface is trained on a large set of training data using PCA such that the machine-learning-desired radiation patterns can be achieved on the physical level. The intelligent metasurface then serves as a physical computing device, which outputs the low-dimensional PCA features of the high-dimensional raw input data by analog computing. As such, the resultant sensing strategy is almost free of digital computation.

Figure 4b illustrates the principle behind the intelligent sensing scheme proposed by Li et al. [122], where a two-bit reprogrammable coding metasurface is used, and each meta-atom is integrated with three PIN diodes. In Ref. [122], we considered two classical linear machine learning techniques, random projection and PCA, which were used to train the machine-learning reprogrammable imager. Of course, other linear machine learning techniques are also applicable. It is relatively straightforward to realize the random projection by independently and randomly controlling the PIN diodes of the reprogrammable coding metasurface. However, for the PCA measurements, the PIN diodes need to be carefully controlled in order to achieve the desired measurement modes, and a modified Gerchberg–Saxton (G–S) algorithm was used in our implementation. Li et al. demonstrated experimentally that the machine-learning imager can recover not only the gestures of the test person but also the glass scissors the person is armed with, even when the target is behind an opaque wall. The PCA-guided sensing scheme, enabled by meaningful low-dimensional measurements, clearly has considerably better imaging and recognition performance than the random scheme when only a small number of measurements is available. In passing, such an intelligent imager can be used for multiple distinct sensing functions on the same physical entity without any hardware modification, for instance, the high-quality imaging and recognition of digit-like targets and others [122]. We can now conclude that the machine-learning-guided sensing strategy enables real-time, high-quality imaging with nearly negligible digital computation, and that such a strategy provides a promising route to smart sensing at various frequencies and beyond.
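
The advantage of PCA modes over random modes for a small number of measurements can be reproduced on synthetic data. In this sketch, each row of `modes` stands for one measurement mode (idealized here as an arbitrary real-valued pattern, whereas a coding metasurface can only approximate such patterns), the scenes are hypothetical low-rank data, and reconstruction uses a plain least-squares back-projection.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, dim, rank, k = 200, 64, 5, 5   # hypothetical sizes; k = number of measurements

# Synthetic "scenes" lying near a low-dimensional subspace, then centered.
basis = rng.standard_normal((rank, dim))
data = rng.standard_normal((n_samples, rank)) @ basis
data += 0.01 * rng.standard_normal(data.shape)
data -= data.mean(axis=0)

# PCA measurement modes: the k leading right singular vectors of the training data.
_, _, vt = np.linalg.svd(data, full_matrices=False)
pca_modes = vt[:k]
random_modes = rng.standard_normal((k, dim))

def reconstruct(modes, x):
    meas = x @ modes.T                        # k linear measurements per scene
    return meas @ np.linalg.pinv(modes).T     # least-squares back-projection

def rel_error(modes):
    return np.linalg.norm(reconstruct(modes, data) - data) / np.linalg.norm(data)

err_pca, err_random = rel_error(pca_modes), rel_error(random_modes)
```

Because the PCA modes span the subspace that captures the most training-data energy, the same number of measurements leaves a much smaller residual than random modes do, which mirrors the behavior reported for the PCA-guided scheme.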

It is worth mentioning that the above imager still involves data processing on the digital level, i.e., matrix–vector multiplication, although its computational cost was claimed to be small. Recalling the wave-based computing discussed earlier, we can envision that this algebraic operation could instead be accomplished on the physical level by metamaterial- or metasurface-based computing devices [130,131,132]. For instance, an optical platform (i.e., diffractive deep neural networks, D2NN [171]) has recently been designed to exploit the wave property of photons to realize parallel linear data processing at the speed of light. Such physical networks have also been used to design an all-optical imaging system [218] (see Fig. 4d). However, the wave-based D2NN is a passive device whose network architecture is fixed once fabricated. Hence, it cannot be re-trained for other targets, which limits its functions. To establish a re-trainable wave-based D2NN, we proposed a programmable artificial intelligence machine (PAIM) using an array of intelligent metasurfaces [219], where the multi-layer metasurfaces act as the programmable physical layers of the D2NN. We designed the PAIM (see Fig. 4c) to be a real-time re-trainable system whose parameters can be set digitally to realize live artificial neurons. On the physical layer, the PAIM hierarchically manipulates the energy distribution of the transmitted EM waves with a five-layer array of intelligent metasurfaces, in which the amplitude of the wave transmitted through each meta-atom can be enhanced or attenuated by controlling the values of the digital parameters. The PAIM is an on-site programmable D2NN platform that operates through real-time digital control of the EM waves and performs computations based on the parallelism of EM wave propagation at the speed of light. It can be optimized to be a general-purpose wave-based intelligence machine that not only handles traditional deep learning tasks such as image recognition and feature detection, but also provides an on-site and user-friendly way to manipulate spatial EM waves, such as multi-channel coding and decoding in the CDMA scheme and dynamic multi-beam focusing. It may therefore find potential applications in wireless communications, image processing, remote control, IoT, and other intelligent applications.
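The layered structure described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the inter-layer propagation operators are random complex matrices standing in for the real diffraction operators, the per-meta-atom amplitude gains are continuous rather than digitally quantized, and the names `propagate`, `forward` and `detect` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_atoms, n_layers = 16, 5   # five programmable layers, as in the PAIM description

# Fixed complex propagation matrices between consecutive layers
# (hypothetical random stand-ins for free-space diffraction).
propagate = [
    (rng.standard_normal((n_atoms, n_atoms))
     + 1j * rng.standard_normal((n_atoms, n_atoms))) / np.sqrt(2 * n_atoms)
    for _ in range(n_layers)
]

def forward(field, gains):
    """Pass a complex input field through the layered network.

    gains[l, i] is the programmable amplitude gain of meta-atom i in layer l.
    """
    for layer in range(n_layers):
        field = propagate[layer] @ (gains[layer] * field)
    return field

def detect(field):
    """Detectors record intensity; this is the only nonlinear step in the sketch."""
    return np.abs(field) ** 2

x = rng.standard_normal(n_atoms) + 1j * rng.standard_normal(n_atoms)
gains_a = rng.uniform(0.5, 1.5, size=(n_layers, n_atoms))
gains_b = rng.uniform(0.5, 1.5, size=(n_layers, n_atoms))

out_a = detect(forward(x, gains_a))
out_b = detect(forward(x, gains_b))   # same hardware, re-programmed weights
```

Re-training corresponds to changing `gains` while the propagation matrices stay fixed. Because `forward` is linear in the input field, a network of this kind realizes linear processing unless a nonlinear element is added.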

4.2 Hybrid-computing-based intelligent sensing

The linear-machine-learning-driven metasurface imager relies on the assumption of a linear mapping from the data to the results, which to some extent restricts it to relatively simple sensing tasks. Deep networks are believed to have much more powerful representation capability than shallow networks, let alone linear networks [96]. Recently, we have witnessed rapid progress in all-wave (specifically, all-optical) physical deep networks in optics that are optimized to match modern deep artificial networks [194]. However, one of the remaining challenges is the difficulty of physically implementing nonlinear activation functions, although nonlinear materials (e.g. crystals, polymers, semiconductor materials) are available. We therefore turned to the powerful capability of deep learning in the digital world and proposed an intelligent sensing scheme based on hybrid computing [129, 220]: analog preprocessing of high-dimensional data (e.g., data compression) with the intelligent metasurface on the physical level, followed by digital postprocessing with modern deep artificial neural networks on the digital level. Note that the compressive-sensing-inspired computational metasurface sensors [207,208,209,210,211, 221] can be treated as hybrid-computing-based intelligent sensing, in the sense that the data compression is accomplished on the metasurface level, and the sparsity-aware data processing is implemented on the digital level.

Inspired by the above insights, Li et al. proposed the concept of an intelligent EM camera by integrating ANNs into the intelligent metasurfaces [51]. The EM camera has two operational modes (see Fig. 4g, h): active and passive. In the passive mode, the EM camera is excited by the ambient stray wireless signals (such as Wi-Fi signals) that exist ubiquitously in daily life. Recently, the use of wireless signals for sensing, especially for probing human behaviors, has attracted intensive research attention [222,223,224,225], but these strategies suffer from limited spatial–temporal image resolution and recognition accuracy due to the limited field of view. In contrast to these techniques, we highlight here three critical roles of the intelligent metasurface. First, the intelligent metasurface is used to probe people in real time in a full-viewing scene with high temporal–spatial resolution. Second, the intelligent metasurface steers the EM wavefields (e.g. ambient wireless signals) towards the local spots of interest to efficiently recognize fine-grained body signs, which remarkably suppresses the undesired interference from the ambient environment and other body parts, as shown in Fig. 4h. We found that the intelligent metasurface was capable of redirecting commodity Wi-Fi waves towards the desired spots (e.g. the left hand of the subject person) with an energy enhancement of more than 20 dB, similar to the aforementioned smart WPT. Finally, the body signs (e.g. hand signs) and vital signs of non-cooperative people can be clearly identified in real time. As such, we can envision that the target person virtually ‘wears’ a monitoring device with safe radio exposure, which is capable of monitoring body information like physical wearable devices [226].
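The focusing role of the metasurface can be illustrated with a minimal sketch of phase-conjugate coding. The aperture size, the random path phases and the two-bit phase states are assumptions chosen for illustration, not the parameters of the EM camera in Ref. [51], so the printed enhancement is not the measured value.

```python
import numpy as np

rng = np.random.default_rng(2)
n_atoms = 400                                  # hypothetical aperture size
# Phase accumulated from source to meta-atom to target spot (random stand-in).
path_phase = rng.uniform(0, 2 * np.pi, n_atoms)
levels = np.arange(4) * np.pi / 2              # assumed 2-bit states: 0, 90, 180, 270 deg

def spot_intensity(code):
    """Intensity at the target spot for a given per-atom phase-state index."""
    return np.abs(np.sum(np.exp(1j * (path_phase + levels[code])))) ** 2

# Focusing code: pick the state closest to the conjugate of each path phase.
wanted = (-path_phase) % (2 * np.pi)
focus_code = np.round(wanted / (np.pi / 2)).astype(int) % 4

# Baseline: average intensity at the spot over many random codes.
baseline = np.mean([spot_intensity(rng.integers(0, 4, n_atoms)) for _ in range(500)])
gain_db = 10 * np.log10(spot_intensity(focus_code) / baseline)
print(f"enhancement over random coding: {gain_db:.1f} dB")
```

In this idealized model the contributions add coherently under the focusing code and incoherently under a random code, so the enhancement grows roughly in proportion to the number of meta-atoms.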

Li et al. demonstrated experimentally that hand signs and respiration can be recognized with very high accuracy, even when the targets are behind a 5-cm-thick wall and the EM camera works in the passive mode. It is thus clear that the deep-learning-driven intelligent EM camera exhibits robust performance in remotely monitoring notable human movements, subtle body gesture languages, and physiological states of multiple non-cooperative people in real-world settings. Similarly, the EM camera can also work in the active mode, in which it is excited by a radiation source such as a horn antenna. We expect that the intelligent metasurface, i.e., a synergistic exploration of artificial materials and artificial intelligence, could achieve goals that conventional systems cannot easily achieve, and that such a methodology can be extended over the entire EM spectrum, which will be helpful in the future smart society, for example in human-device interactive interfaces.

4.3 Hybrid-computing-based intelligent integrated sensing

Here, we briefly discuss the intelligence in the aforementioned sensing schemes. The intelligence in Sect. 4.2 refers to the capability of the intelligent metasurface to adaptively perform a series of successive sensing tasks without human supervision. However, like a large number of deep-learning-based sensing strategies on the digital level [227,228,229,230,231], it remains not intelligent enough, in the sense that it indiscriminately acquires all information and ignores the available knowledge about the scene, the sensing task, and the hardware constraints. Yet, using the available a-priori knowledge is critical to limit the data acquisition to the relevant information, which is the crucial conceptual improvement needed to reduce the latency and computational burden. On the other hand, the sensing strategy in Sect. 4.1 also lacks intelligence, since it considers the data acquisition and processing separately and hence fails to highlight the salient features in the processing layer, although the scene illuminations are adapted to the knowledge of what scene is explored. To fully reap the benefits of intelligence in the whole sensing chain, the three constituent components of intelligent sensing (the scene in the physical world, the data in the digital world, and the measurement connecting the two worlds) need to be jointly considered in a single learnable pipeline. Such an idea has recently been explored in optics [232] (see Fig. 4l) and ultrasound [233].

Hougne et al. proposed the idea of learned EM sensing with programmable metasurface hardware (see Fig. 4k) [53]. We proposed two frameworks for an intelligent integrated sensing pipeline using the intelligent metasurface, the variational autoencoder [54] and free-energy minimization [61], which enable us to jointly learn the optimal measurements on the physical level and the processing settings on the digital level for the given hardware, task and expected scene. For instance, a measurement network (called m-ANN) and a reconstruction network (called r-ANN) are introduced and jointly optimized in a variational autoencoder framework [54] to achieve the optimal compressive measurements and extract the desired information, as shown in Fig. 4j. In this way, we made use of all available a-priori knowledge about the probed scene, the specific sensing task, and the constraints on the measurement setting and postprocessing pipeline. Thus, we can expect such a learnable sensing strategy to outperform the conventional strategies that optimize the measurement and postprocessing separately. The performance improvement is particularly significant when the number of measurements is limited. This strategy could drastically reduce the number of measurements, which remarkably improves many critical metrics, for instance, the speed, processing burden and energy consumption. Such strategies could have an important impact by showing how to merge AM-based analog computing and AI-empowered digital computing in the design of learnable sensing architectures, and pave the path to low-latency sensing (e.g. biological systems) in the human–robot alliance.
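The benefit of learning the measurement and the reconstruction jointly can be illustrated with a deliberately simplified sketch. It replaces the variational autoencoder of Ref. [54] with a linear encoder/decoder pair fitted by alternating least squares, and it uses synthetic scenes with a decaying spectrum. The matrices `M` (playing the role of the m-ANN) and `R` (playing the role of the r-ANN), as well as all function names, are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
d, m = 32, 4                                   # scene dimension, number of measurements

# Synthetic scenes with a decaying spectrum (hypothetical stand-in for real data).
scales = 1.0 / np.arange(1, d + 1)
rotation, _ = np.linalg.qr(rng.standard_normal((d, d)))

def make_scenes(n):
    return (rng.standard_normal((n, d)) * scales) @ rotation

train, test = make_scenes(2000), make_scenes(500)

def fit_decoder(M, scenes):
    """Digital side: least-squares decoder R for fixed measurement modes M."""
    y = scenes @ M.T
    Rt, *_ = np.linalg.lstsq(y, scenes, rcond=None)
    return Rt.T                                # shape (d, m)

def rel_error(M, R, scenes):
    recon = scenes @ M.T @ R.T
    return np.linalg.norm(recon - scenes) / np.linalg.norm(scenes)

# Separate design: random measurement modes, decoder fitted afterwards.
M_rand = rng.standard_normal((m, d))
R_rand = fit_decoder(M_rand, train)

# Joint design: alternate between the decoder update and the measurement update.
M_joint = M_rand.copy()
for _ in range(30):
    R_joint = fit_decoder(M_joint, train)
    M_joint = np.linalg.pinv(R_joint)          # least-squares optimal modes for this decoder
R_joint = fit_decoder(M_joint, train)

err_separate = rel_error(M_rand, R_rand, test)
err_joint = rel_error(M_joint, R_joint, test)
print(f"separate design: {err_separate:.3f}, joint design: {err_joint:.3f}")
```

Each alternating step can only lower the training reconstruction error, so the jointly fitted pair is never worse on the training scenes than the random modes with their best decoder. In this linear setting the joint solution approaches the principal subspace of the scenes; a nonlinear m-ANN/r-ANN pair and hardware constraints on the modes would be needed to reflect the actual framework.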

5 Summary and outlook

Manipulation of waves and information is a long-standing topic, and the demand for it is now urgent with the advent of 6G wireless communications, green IoT, and digital twins. The intelligent metasurfaces, evolved from composite materials and information metasurfaces, respond at the proper time and under the proper conditions, and could serve as smart platforms for wave–information–matter interactions by synergizing AMs with AIs. In sharp contrast to conventional metasurfaces, the intelligent metasurface integrated with algorithms and active devices has three unique properties: digitization, programmability and intelligence. In this article, we review the recent progress of intelligent metasurfaces in controlling wave–information–matter interactions by providing the historical background and the physical mechanisms. Afterward, we explore the use of intelligent metasurfaces in novel wireless communication architectures and wave-based analog computing. From these results, we can envision that intelligent metasurfaces, similar to biological systems, are capable of learning the environment, making decisions, self-programming and learning continuously throughout their ‘lifetime’. Although this paper predominantly focuses on intelligent metasurfaces at microwave frequencies, applications of the intelligent metasurface are also thriving at other frequencies (e.g. terahertz, infrared, and optical) and in other wave phenomena. The intelligent metasurface is an emerging research direction involving various disciplines, including physics, mathematics, materials, data science, computer science, and information science, and many open questions remain to be carefully addressed in the future. Below we present four important research lines.

  • Designing more specialized intelligent metasurfaces

    From the standpoint of hardware design, the intelligent metasurface remains far from mature for practical applications, especially the large-scale intelligent metasurface. In this regard, the first issue is energy consumption. Taking the PIN-based metasurface in the microwave regime as an example, although each meta-atom alone consumes power on the order of a few \(\mu\)W, the energy consumption of the entire intelligent metasurface with \(N^2\) meta-atoms is much larger than \(N^2\) \(\mu\)W, since associated modules (e.g. the driving circuits and diagnostic circuits for the PIN diodes) are needed. Therefore, it is necessary to design intelligent metasurfaces with low power consumption by introducing new approaches to control the states of meta-atoms, such as energy harvesting techniques [144], MEMS [72], and microfluidics [234]. Second, since the intelligent metasurface is designed to encode and process digital information, it is helpful to explore more controllable degrees of freedom to represent the quantized physical information of the meta-atom, for instance, anisotropic coding, polarization coding, frequency coding, and amplitude-phase coding, in addition to the space-domain phase coding and space–time-domain phase coding mainly used at the current stage. Third, miniaturization is a research goal for the design of the intelligent metasurface, to be pursued by exploring more advanced microelectronics techniques.

  • Making AIs and AMs understand each other

    The intelligent metasurface relies on the synergetic exploration of artificial intelligence (i.e., deep learning strategies) in the digital world and active artificial materials (i.e., metamaterials) in the physical world. Although the heuristic use of AI-empowered or computation-enabled metasurfaces is increasing at an incredible rate, and this trend is expected to keep accelerating, the most fundamental unresolved problem is why and when such integration works. When can an intelligent metasurface trained on a vast amount of training data generalize well to unseen data? In fact, for deep learning itself, learning and generalization are also its most mysterious aspects so far. Recently, researchers have begun to explore this question. For instance, Jacot et al. introduced the concept of the neural tangent kernel (NTK) [116], and demonstrated that, in the infinite-width limit, a deep ANN trained by gradient descent with mean-squared-error loss can be well approximated by its first-order Taylor expansion with respect to the ANN’s learnable parameters, and that the ANN’s evolution can be described with classical kernel regression methods. Simon et al. extended the results of Jacot et al. by examining the NTK’s eigensystem, and proved a new no-free-lunch theorem [119]: improving a network’s generalization for a given target function must worsen its generalization for orthogonal functions. Interestingly, although the results by Jacot et al. and Simon et al. are rigorously derived in the infinite-width limit of fully-connected networks, they have been empirically shown to apply to many modest-width networks, including deep convolutional neural networks, transformers, and beyond [90, 93]. In addition, interpretability is another critical and open problem for deep ANNs, since ANNs are typically treated as ‘black boxes’ whose governing equations remain unknown. The deep learning considered here will inevitably involve physics-data interactions instead of data alone. Intuitively, a physics-informed deep learning scheme may help with interpretability. Meanwhile, physics-informed deep learning could favor training from scarce data [206, 235, 236] if additional knowledge from the physical constraints is taken into account. Finally, the intelligent metasurface deployed in physical environments has very strong EM coupling with the ambient surroundings and targets. So far, however, the surrounding environment has largely been treated as free space, for instance, in the area of wireless communications, which is obviously unrealistic and has an important negative impact on the performance of information acquisition and processing. Therefore, it is necessary to model the realistic interaction between the intelligent metasurfaces and the ambient environment. We expect that, through the above efforts, the intelligent metasurface can be designed to understand the mathematical and physical mechanisms behind deep learning in the near future.

  • Approaching all-wave information systems

    As pointed out in Sect. 4, the current silicon-based von Neumann digital computing architecture is suffering from the curse of Moore’s law, and wave-based analog computing is uniquely positioned among the options for tackling this challenge. Although various wave-based computing schemes have been suggested so far, they are either under the control of conventional electronic digital computing or limited to a few simple pre-specified functionalities. It is appealing to develop all-wave general-purpose computing systems. For instance, the aforementioned D2NNs [130] and PAIM [219] are essentially limited to linear problems, since no nonlinear activation function is involved. A simple remedy may be to introduce nonlinear materials or devices into the D2NNs and PAIM, but at the cost of energy efficiency and computation speed. When nonlinear materials or devices are unavailable, one possible approach could be conceived in the context of reservoir computing [237]. Specifically, the controllable meta-atoms of the intelligent metasurface are divided into two groups: one encodes the input information to be processed, and the other forms the adjustable ANN’s weights. In light of the fundamental EM principles [238], the relation between the matter (i.e., data-carrying meta-atoms) and its EM response is nonlinear. General nonlinear operations can then be reached through the weak interaction of the intelligent metasurface with the illumination waves. We could envision that all-wave information systems will enjoy many elegant ‘green’ properties in terms of cost, power consumption, complexity, and efficiency.
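The statement that the EM response depends nonlinearly on the data-carrying meta-atoms, while remaining linear in the illumination, can be illustrated with a toy coupled-dipole model. This model is an assumption made for illustration and is not taken from Ref. [238]: the coupling matrix `G`, the detector weights `readout` and the polarizability ranges are hypothetical random values.

```python
import numpy as np

rng = np.random.default_rng(4)
n_data, n_weight = 4, 4
n = n_data + n_weight

# Hypothetical weak, symmetric coupling between meta-atoms (zero self-coupling).
G = 0.05 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
G = (G + G.T) / 2
np.fill_diagonal(G, 0)

e_inc = np.ones(n, dtype=complex)                 # uniform illumination
readout = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # detector coupling
weight_alpha = rng.uniform(0.2, 1.0, n_weight)    # states of the "weight" meta-atoms

def response(data_alpha, e=e_inc):
    """Detector signal for given polarizabilities of the data-carrying meta-atoms.

    Solves p = alpha * (e + G p), i.e. (I - diag(alpha) G) p = alpha * e.
    """
    alpha = np.concatenate([data_alpha, weight_alpha])
    p = np.linalg.solve(np.eye(n) - alpha[:, None] * G, alpha * e)
    return readout @ p

a = rng.uniform(0.1, 0.5, n_data)
b = rng.uniform(0.1, 0.5, n_data)

# For a map that is affine in the data, this mixed second difference vanishes.
second_diff = response(a + b) - response(a) - response(b) + response(np.zeros(n_data))
print(abs(second_diff))
```

The matrix inverse makes the detector signal a nonlinear function of the meta-atom states through multiple scattering, even though every element is a linear material. Doubling the illumination `e` simply doubles the signal.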