Introduction

Fiber lasers feature good beam quality, high efficiency, and a compact structure. They can be tuned extensively and operate efficiently from continuous-wave operation to ultrashort optical pulses [1, 2] and from low-power to high-power schemes [3,4,5], and they have been widely applied in nonlinear microscopy [6], optical communication [7, 8], and materials processing [9]. In the past several decades, the performance enhancement of fiber lasers has mainly relied on fiber development, system optimization, algorithm improvements, and other means [10,11,12,13,14]. Among them, the role of machine learning is becoming ever more prominent.

Machine learning (ML) is an umbrella term, broadly defined as the “field of study that gives computers the ability to learn without being explicitly programmed” [15]. It has been introduced into many fields and has achieved gratifying results in speech recognition [16], object classification [17], chemical health and safety study [18], computational imaging [19,20,21], optical metrology [22], optical communications and networking [23, 24], sensing [25], and photonic design [26,27,28,29]. Recently, various machine learning methods, particularly deep neural networks (DNNs), have attracted increasing attention for solving problems in fiber lasers. For example, learning can yield approximate models of the underlying physics or dynamic processes of complex fiber laser systems in the form of a “black box”, which then serve for proxy measurement and tracking control of physical parameters.

The purpose of this review is to highlight the recent progress utilizing machine learning techniques for developing advanced fiber lasers in terms of design and manipulation for on-demand laser output, prediction and control of nonlinear effects, reconstruction and evaluation of laser properties, as well as robust control for lasers and laser systems. Challenges and perspectives are considered in the end.

General description

The field of machine learning draws on multiple sources across disciplines as diverse as probability theory, statistics, adaptive control theory, psychological models, and complexity theory. Different sources bring different methods and terms into the field. At the same time, machine learning continues to develop rapidly, and new techniques continue to emerge, so it is not easy to summarize all of its content. Here we give a general description of machine learning and its application in fiber lasers, aiming to provide a reference for readers in the fiber laser community.

Machine learning basics

This section first introduces the concept of machine learning, followed by the learning algorithm taxonomy, and emphasizes a widely adopted algorithm, artificial neural networks (ANNs).

Concept

The fields of machine learning and optimization are intertwined. Most machine learning problems can ultimately be cast as optimization problems. Some researchers therefore place work based on purely adaptive and robust optimization algorithms in the category of machine learning, for example, evolutionary algorithms (typically genetic algorithms) for coherent control of ultrafast dynamics [30], intelligent breathing soliton generation [31], and self-tuning mode-locked fiber lasers [32]. More common definitions of machine learning emphasize “learning” and “gaining knowledge” from data. A classical one is “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E” [33]. Experience is usually presented in the form of data, and learning algorithms are methods of generating models from data. With the learned model, the machine can make predictions or take actions in tasks. Datasets, models, and learning algorithms are thus the three core elements of machine learning.

The collection of data from experiments or numerical simulations of a specific task is called a dataset, denoted \(D=\{(x_i,y_i)\}_{i=1}^{N}\), where \((x_i,y_i)\) is an example and N is the number of examples. \(x_i\) is a property description of an example, usually called a ‘sample’ or ‘feature vector’. For example, \(x_i=\{x_{ij}\}_{j=1}^{d}\) is a feature vector with dimensionality d, where each component \(x_{ij}\) describes some aspect of the example. \(y_i\) is the label of \(x_i\), which can take the form of one of a finite set of classes, a vector, a matrix, a graph, or others. In some tasks, \(y_i\) may not exist. Training (also known as learning) is the process of using data to generate models through learning algorithms. The undetermined parameters of the model are modified during training. Therefore, the model can be regarded as the parameterized representation of the learning algorithm on the given data and model parameter space. The data used in training is called the training dataset. Sometimes, a validation dataset is split proportionally from the training dataset to monitor the performance of the model during training. After training, the model needs to be tested on an independent dataset, the testing dataset, drawn from the same or a similar statistical distribution as the training dataset, to evaluate how well it generalizes to new data. Figure 1 shows the general working framework of machine learning, including data preparation, algorithm selection, training, and testing.

Fig. 1 The working framework of machine learning
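
The data-preparation step described above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function name `split_dataset`, the split fractions, and the toy dataset are all assumptions made here, and for simplicity the testing subset is carved out of the same pool rather than collected independently.

```python
import random

def split_dataset(examples, val_fraction=0.1, test_fraction=0.2, seed=0):
    """Shuffle (x_i, y_i) pairs and split them into train/validation/test lists."""
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be in [0, 1)")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n = len(shuffled)
    n_test = int(round(n * test_fraction))
    n_val = int(round(n * val_fraction))
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Toy dataset D = {(x_i, y_i)}: feature vectors with d = 2 and scalar labels (hypothetical).
dataset = [((i, i * i), 2 * i + 1) for i in range(100)]
train, val, test = split_dataset(dataset)
```

The three subsets are disjoint, so the test examples are never seen during training, which is what allows the test error to serve as an estimate of generalization.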

Learning algorithm taxonomy

Machine learning covers a very broad field, and it has developed a variety of learning algorithms to handle different types of learning tasks. We describe four rough classifications of machine learning algorithms. In different tasks, the available data take different forms: labeled or unlabeled, the research object itself or only a metric value of it. According to the form of data that the algorithms use, machine learning can be divided into supervised, unsupervised, semi-supervised, and reinforcement learning (RL) [34,35,36,37,38,39,40]. The data for supervised learning is labeled, that is, \(D=\{(x_i,y_i)\}_{i=1}^{N}\). Using the difference between the actual label \(y_i\) and the model output, the model parameters are iteratively modified so that the output matches the label better. Supervised learning aims to find a mapping \(f:\mathcal{X}\to\mathcal{Y}\), where \(x_i\in\mathcal{X}\) (sample space), \(y_i\in\mathcal{Y}\) (label space), and \(D\subset\mathcal{X}\times\mathcal{Y}\), so that \(f(x_i)\approx y_i\). Typical supervised learning problems include classification and regression. Unsupervised learning specializes in learning the internal representation, potential relationships, or structures of samples without labels, where \(D=\{x_i\}_{i=1}^{N}\). Clustering and dimensionality reduction are two common unsupervised learning problems. Semi-supervised learning adopts partially labeled datasets, \(D=D_1\cup D_2\), with \(D_1=\{(x_i,y_i)\}_{i=1}^{N}\) and \(D_2=\{x_i\}_{i=1}^{M}\), where \(M\gg N\). Reinforcement learning attempts to learn what to do, that is, how to map situations to actions, to maximize a reward function [41]. To some degree, deep reinforcement learning is a control strategy that does not require accurate object models because it can adapt to the environment by interacting with it [42].
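
As a minimal sketch of supervised regression, the snippet below fits a mapping f(x) = w·x + b to labeled pairs by least squares. The helper name `fit_line` and the toy data are assumptions made for illustration; real tasks use far richer models, but the idea of choosing parameters so that f(x_i) matches y_i is the same.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of f(x) = w * x + b to labeled data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

# Labeled dataset: the labels were generated by y = 2x + 1 (hypothetical toy data).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)

def predict(x):
    """Learned mapping f: sample space -> label space."""
    return w * x + b
```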

Machine learning algorithms can also be classified according to the learning task, such as classification algorithms, regression algorithms, clustering algorithms, and dimensionality reduction algorithms. For example, principal component analysis and manifold learning are popular dimensionality reduction algorithms. Some learning algorithms can handle more than one kind of task, like support vector machines for classification and regression tasks [43] and ANNs for almost all machine learning tasks [44,45,46,47,48].

Depending on whether physical knowledge is involved, machine learning algorithms can be categorized as physics-informed or physics-free. Physics-free machine learning is a purely data-driven method. Its core is data-driven modeling, which extracts hidden physical and mathematical models from available system data and represents them by learned models [49]. Unlike physical and mathematical models represented by explicit equations, data-driven models are empirical models that can act as universal function approximators, working as a black box that allows people to solve problems without a professional background or expertise. Generally, physics-free machine learning requires big data for training and is not suitable for tasks where data acquisition costs are prohibitive. By contrast, physics-informed machine learning integrates data-driven modeling and prior knowledge [50]. For example, a physics-informed neural network (PINN) is designed to satisfy some physical constraints automatically, improving accuracy and enhancing generalization in small data regimes [51]. In some cases, prior physical laws can act as a regularization term that constrains the space of admissible solutions to a manageable size, steering the model towards the right solution and helping it converge quickly [51,52,53].
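
The role of a physical law as a regularization term can be illustrated with a toy loss function. The sketch below is an assumption-laden example rather than any cited method: the governing equation du/dt = −k·u, the function name `physics_informed_loss`, and the two candidate solutions are all chosen here for illustration. Both candidates fit the two labeled points exactly, and only the physics term tells them apart.

```python
import math

def physics_informed_loss(u, dt, labeled_points, decay_rate, weight=1.0):
    """Data misfit plus weight * mean squared residual of du/dt = -decay_rate * u."""
    # Data term: mean squared error on the few grid indices that carry labels.
    data_loss = sum((u[i] - y) ** 2 for i, y in labeled_points) / len(labeled_points)
    # Physics term: midpoint finite-difference residual of the governing equation.
    residuals = [
        (u[i + 1] - u[i]) / dt + decay_rate * 0.5 * (u[i + 1] + u[i])
        for i in range(len(u) - 1)
    ]
    physics_loss = sum(r * r for r in residuals) / len(residuals)
    return data_loss + weight * physics_loss

decay_rate = 2.0
n_points = 21
dt = 1.0 / (n_points - 1)
t = [i * dt for i in range(n_points)]

# Only the two end points are labeled (a small-data regime).
labeled_points = [(0, 1.0), (n_points - 1, math.exp(-decay_rate))]

# Candidate A obeys the physics; candidate B is a straight line through the labels.
candidate_a = [math.exp(-decay_rate * ti) for ti in t]
candidate_b = [1.0 + (math.exp(-decay_rate) - 1.0) * ti for ti in t]

loss_a = physics_informed_loss(candidate_a, dt, labeled_points, decay_rate)
loss_b = physics_informed_loss(candidate_b, dt, labeled_points, decay_rate)
```

With data alone the two candidates are indistinguishable; the residual term penalizes the straight line, which is the sense in which prior physics shrinks the space of admissible solutions.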

Machine learning algorithms feature either shallow or deep architectures. The performance of machine learning methods depends heavily on the choice of data representation (or features) [54]. In the early stage, machine learning worked with shallow architectures, for example, hidden Markov models, maximum entropy models, conditional random fields, and perceptrons or ANNs with a single hidden layer [55]. They all have only a few nonlinear feature transformations, resulting in a limited ability to extract features from raw data and requiring engineering expertise for their design [56]. In recent years, deep learning (DL), based on deep architectures represented by various deep neural networks (DNNs), has become a hot subfield of machine learning. Deep learning shows remarkable power in discovering intricate structures in high-dimensional data by transforming raw data into more abstract and ultimately more useful representations through multiple simple but nonlinear models [56].

Artificial neural networks

Here, we provide more information about ANNs because of their notable impact on fiber laser research. An ANN is a mathematical model that imitates the structure and function of biological neural networks and is usually used to estimate or approximate functions [38]. ANNs consist of three types of layers: input, hidden, and output. Each layer consists of many processing elements, known as neurons or nodes, each of which has a bias (also called a threshold) b and an activation function f that is usually nonlinear (such as softmax, ReLU, or sigmoid). According to the McCulloch-Pitts (MP) model [57], when node j in the network has n inputs, and \(x_i\) (i = 1, 2, …, n) denotes the ith input with interconnection weight \(w_{ij}\), the output of node j is \({y}_j={f}_j\left({\sum}_{i=1}^n{w}_{ij}{x}_i-{b}_j\right)\), where \(b_j\) and \(f_j\) are the bias and activation function of node j. Many nodes are arranged in a certain hierarchical structure to form a network.
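
The node equation above translates directly into code. The following is a minimal sketch with hypothetical input values, weights, and bias; the function name `neuron_output` is also an assumption made here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def neuron_output(inputs, weights, bias, activation):
    """Compute y_j = f_j(sum_i w_ij * x_i - b_j) for a single node j."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum - bias)

x = [1.0, 2.0, 3.0]        # inputs x_i (hypothetical values)
w = [0.5, -1.0, 0.25]      # interconnection weights w_ij
b = 0.25                   # bias (threshold) b_j

y_relu = neuron_output(x, w, b, relu)        # weighted sum - bias = -1.0, so ReLU gives 0.0
y_sigmoid = neuron_output(x, w, b, sigmoid)  # sigmoid(-1.0)
```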

The architecture of an ANN can be classified by its topological structure, i.e., the overall connectivity and the activation functions of its nodes. ANNs can be divided into feedforward and recurrent classes according to their connectivity. Feedforward neural networks are the most common type, with a unidirectional multilayer structure in which data flows from the input layer through the hidden layers to the output layer. The simplest feedforward neural network is the fully connected neural network (FCNN), in which the nodes in each layer are connected with all the nodes in the previous layer. The recurrent neural network (RNN) was developed mainly to process sequence data, such as video and text, in which the current output is related to previous outputs. RNNs memorize previous information and apply it to the calculation of the current output: the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous step. Theoretically, RNNs can process sequence data of any length. In practice, however, to reduce complexity, it is often assumed that the current state is related only to the previous few states. Mainstream RNN variants are long short-term memory (LSTM) and gated recurrent unit (GRU) networks [58] (Fig. 2).

Fig. 2 Architectures of artificial neural networks
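
The recurrence that distinguishes an RNN from a feedforward network can be sketched as follows. This is a plain (Elman-style) recurrent layer, not an LSTM or GRU, and the weights and input sequence are hypothetical values chosen here to make the memory effect visible.

```python
import math

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rnn_forward(sequence, w_in, w_rec, bias):
    """Run h_t = tanh(W_in x_t + W_rec h_{t-1} + b) over a sequence of inputs."""
    hidden = [0.0] * len(bias)          # initial hidden state h_0
    states = []
    for x_t in sequence:
        from_input = matvec(w_in, x_t)      # contribution of the current input
        from_memory = matvec(w_rec, hidden) # contribution of the previous hidden state
        hidden = [math.tanh(a + r + b) for a, r, b in zip(from_input, from_memory, bias)]
        states.append(hidden)
    return states

w_in = [[1.0], [-1.0]]                  # 1 input feature -> 2 hidden units
w_rec = [[0.5, 0.0], [0.0, 0.5]]        # hidden-to-hidden (recurrent) weights
no_memory = [[0.0, 0.0], [0.0, 0.0]]    # zero recurrent weights: purely feedforward
bias = [0.0, 0.0]

sequence = [[1.0], [0.0], [0.0]]        # a single impulse followed by zeros
states = rnn_forward(sequence, w_in, w_rec, bias)
states_feedforward = rnn_forward(sequence, w_in, no_memory, bias)
```

After the impulse, the recurrent network still produces a nonzero hidden state at the second step even though the input is zero, whereas the version without recurrent weights returns to zero immediately.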

Training an ANN means determining its weights and biases with search operators. Optimization is the core of training, and most machine learning problems boil down to optimization problems [59]. In practice, a great variety of gradient descent algorithms, for example, stochastic gradient descent (SGD), Adam, AdaGrad, and RMSProp [60,61,62], combined with the backpropagation algorithm, are used to train ANNs. Backpropagation is essentially an application of the chain rule for derivatives [56]. In recent years, in addition to gradient descent, there has been great interest in combining learning with metaheuristic optimization algorithms, like evolutionary algorithms [63,64,65] and simulated annealing algorithms [66].
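
To make the link between gradient descent and the chain rule concrete, the sketch below trains a single sigmoid node with full-batch gradient descent. It is a toy illustration under assumptions made here (the squared-error loss, the learning rate, and targets generated by a reference node with w = 2 and b = 1), not a description of how any cited network was trained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, b, xs, targets):
    """Mean of 0.5 * (y - t)^2 with y = sigmoid(w * x - b)."""
    return sum(0.5 * (sigmoid(w * x - b) - t) ** 2 for x, t in zip(xs, targets)) / len(xs)

def train(xs, targets, learning_rate=0.5, steps=2000):
    """Full-batch gradient descent; gradients follow from the chain rule."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, t in zip(xs, targets):
            y = sigmoid(w * x - b)
            dloss_dy = y - t             # d(0.5 * (y - t)^2) / dy
            dy_dz = y * (1.0 - y)        # derivative of the sigmoid
            grad_w += dloss_dy * dy_dz * x        # dz/dw = x
            grad_b += dloss_dy * dy_dz * (-1.0)   # dz/db = -1 (bias is subtracted)
        w -= learning_rate * grad_w / len(xs)
        b -= learning_rate * grad_b / len(xs)
    return w, b

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
targets = [sigmoid(2.0 * x - 1.0) for x in xs]   # generated by a reference node

initial_loss = mean_loss(0.0, 0.0, xs, targets)
w, b = train(xs, targets)
final_loss = mean_loss(w, b, xs, targets)
```

In a multilayer network the same product of local derivatives is propagated backwards layer by layer, which is what backpropagation automates.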

ANNs with a multilayer structure rather than a single hidden layer are expected to have a better learning ability. However, the weights of multilayer networks are difficult to optimize because of gradient diffusion (commonly known as the vanishing gradient problem), which becomes more serious as the number of network layers increases. These problems restricted the development of multilayer networks. In 2006, Geoffrey E. Hinton et al. proposed improved training methods for deep architectures, which is regarded as the beginning of deep learning [67]. Nowadays, the DNN, a feedforward neural network (FNN) with more than one hidden layer [16], is still the mainstream deep learning framework. Popular DNNs include the restricted Boltzmann machine (RBM), deep belief network (DBN), and convolutional neural network (CNN). Studies that exploit supervised, unsupervised, and semi-supervised learning have developed various architectures like the autoencoder (AE), generative adversarial network (GAN), variational autoencoder (VAE), and graph convolutional network (GCN) [68]. Besides, the deep Q-network (DQN), trained with a variant of Q-learning, is a representative algorithm in deep reinforcement learning [69].

Learning-enabled fiber laser

We first analyze typical problems in fiber lasers and then explain what machine learning can do for them.

The learning problems in the field of fiber lasers can be divided into identification (learning the input-output prediction model), estimation (learning how to characterize unmeasured parameters, such as reconstructed inputs, predicted theoretical outputs, and inferred evaluation metrics of outputs), design (learning how to obtain the target), and control (learning the control law). In practice, these problems are interrelated. For example, the identified prediction model can help solve estimation (including prediction, reconstruction, and evaluation), design, and control problems. For convenience of description, a general formulation of the data relationship is considered, y = Ax, where x and y are the input and corresponding output of the fiber laser system, and A is the forward operator or transfer function of the fiber or fiber laser setup, which describes the explicit relationship (e.g., physical principles and rules) or implicit relationship (without enough physical knowledge) between the input x and output y. Sometimes, additional terms are considered, such as Δx, the disturbance of the input coming from the environment, n, the noise included in the output, and E(y), an evaluation function of the output y. Table 1 and Fig. 3 illustrate the typical problems in fiber laser systems.

Table 1 Typical problems in the fiber laser systems (* denotes a specific value)

Fig. 3 Typical problems in the fiber laser systems

Prediction 

Machine learning has demonstrated an outstanding system identification ability to reproduce physical models by identifying hidden structures and learning input-output functions based on data analysis. It can even distill theories of dynamic processes, transforming observed data into predictive models [53]. For example, recurrent neural networks have been applied successfully because of their ability to represent sequentially dependent data, such as forecasting the spatiotemporal dynamics of high-dimensional and reduced-order complex systems [70], modeling the large-scale structure and low-order statistics of turbulent convection [71], and inferring high-dimensional chaos [72]. In the fiber laser field, nonlinear dynamic systems described by nonlinear partial differential equations (PDEs), e.g., the nonlinear Schrödinger equation (NLSE), usually have no analytical solutions, so numerical methods and related calculation strategies are used to obtain numerical solutions. There is a strong interest in finding data-driven solutions through machine learning. In recent years, machine learning has shown its power in predicting complex nonlinear evolution governed by the NLSE [73,74,75]. PINNs guided by specific theories can also be an effective analytical tool to solve PDEs from incomplete models and limited data [76].
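
For reference, a commonly used scalar form of the NLSE for the slowly varying pulse envelope \(A(z,T)\) is given below. It is quoted here as the standard textbook form, not as the exact model of any cited work; sign conventions differ between references, and generalized versions add higher-order dispersion, self-steepening, and Raman terms.

\[
\frac{\partial A}{\partial z} = -\frac{\alpha}{2}A - \frac{i\beta_2}{2}\frac{\partial^2 A}{\partial T^2} + i\gamma |A|^2 A
\]

Here z is the propagation distance, T is time in the frame moving with the pulse, \(\alpha\) is the loss coefficient, \(\beta_2\) is the group-velocity dispersion parameter, and \(\gamma\) is the nonlinear coefficient. The dispersive term is linear in A, while the last term depends on the instantaneous power \(|A|^2\), which is why closed-form solutions exist only in special cases.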

Reconstruction and design (inverse problem)

The inverse problems in the fiber laser field can be divided into two categories. The first is the reconstruction problem: recovering x* from measurement data y*, where y* = Ax* + n, for example, pulse reconstruction from a speckle pattern through a multimode fiber, or mode decomposition from measured intensity patterns. The noise n might be an obstacle to achieving high-precision reconstruction. The second is the design and manipulation problem: given a specific design target y* (e.g., a gain profile), determine the required input of the fiber laser system x* (such as the input voltages, currents, powers, and wavelengths), or the laser system itself A* (e.g., a fiber with a specific structure), where y* = Ax* or y* = A*x. The noise n is usually ignored during the design process. Typical design problems include finding suitable geometric parameters during fiber structure design and shaping signals to produce target temporal and spatial characteristics. In some special cases, the target y* is too ideal to be achieved because of physical limits or restricted experimental conditions, and only an output close to it can be found.

It should be noted that the forward operator A can be completely known, partly known, or unknown in different applications. When A is well known, conventional methods can transform the inverse problem into an optimization problem and solve it with an iterative process. For each new y*, a similar optimization must be solved from scratch. However, this scheme performs poorly or fails when the forward operator A is complex and requires a time-consuming calculation, or when it is only partly known or totally unknown. Machine learning is a powerful tool for solving inverse problems: it learns the inverse mapping \(A^{-1}\) and then obtains a solution \(x^*=A^{-1}(y^*)\) in a single step. Further, additional feedback and control can help improve the accuracy of the result, and a well-trained model can accelerate this process by replacing the complex computation in A.
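
The contrast between the two strategies can be sketched on a toy linear problem. Everything in the snippet is an assumption made for illustration: A is a hypothetical, fully known 2×2 matrix, the iterative solver is plain gradient descent on the squared residual, and the exact matrix inverse stands in for a trained inverse model that would normally be learned from data.

```python
def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def solve_iteratively(forward, target, learning_rate=0.1, steps=200):
    """Minimize 0.5 * ||A x - y*||^2 by gradient descent, starting from zero."""
    x = [0.0] * len(forward[0])
    forward_t = transpose(forward)
    for _ in range(steps):
        residual = [p - t for p, t in zip(matvec(forward, x), target)]
        gradient = matvec(forward_t, residual)      # A^T (A x - y*)
        x = [xi - learning_rate * g for xi, g in zip(x, gradient)]
    return x

A = [[2.0, 1.0], [1.0, 3.0]]              # hypothetical, fully known forward operator
A_INVERSE = [[0.6, -0.2], [-0.2, 0.4]]    # stand-in for a learned inverse mapping

y_star = matvec(A, [1.0, -2.0])           # target generated from a known x*

x_iterative = solve_iteratively(A, y_star)   # repeated from scratch for every new y*
x_one_step = matvec(A_INVERSE, y_star)       # a single evaluation of the inverse map
```

The iterative route needs hundreds of forward evaluations per target, while the inverse map answers in one evaluation once it is available; the cost is moved to the training stage.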

Control

When dynamic environmental disturbances impose high requirements on control accuracy and speed, a feedback loop and a corresponding control unit are required to follow the changes. Learning and optimization are two primary means of improving robustness. They usually involve computational processes incorporated within the system that trigger parameter updates and knowledge or model enhancement, so that performance improves progressively. Machine learning provides new insights for feedback and control [77, 78], particularly in dynamic, complex, and disturbance-sensitive systems, where conventional control algorithms show low control bandwidth and weak robustness. An exciting finding in the published literature is that learned models can automatically reject instrumental or environmental noise. Some applications combine machine learning with traditional algorithms to enhance performance [79, 80].

Denoising

This part is closely related to image and signal processing. Machine learning techniques can overcome data errors to some extent, such as removing bad points and blur in raw data [81, 82] and completing tasks when the measurement device introduces strong noise [83]. The denoising ability of machine learning is significant in many practical applications.

Other applications

Machine learning can be used to reduce manual engineering in experimental operations of laboratories by modifying the hardware, such as the alignment of laser beams [84].

Fiber and laser design

Different applications require laser output with specific characteristics in the time, space, and frequency domains. In design problems that aim at on-demand output, factors like fiber structure, laser type, and experimental beam path usually come into consideration. This section reviews typical applications of machine learning techniques in fiber structure design and fiber amplifier design. Machine learning can complement iterative design methods based on physical principles and optimization algorithms, in which the whole procedure must be repeated for each design problem. Besides, nonlinear effects enable laser shaping in an optical fiber with many degrees of freedom. A prediction model of nonlinear phenomena and laser propagation can help with shaping laser properties. The related content on property manipulation based on the study of nonlinear effects can be found in Section 4 (Fig. 4).

Fig. 4 Design and manipulation for targeted laser properties

Fiber design

Photonic crystal fiber (PCF) is an important new optical waveguide. Unlike conventional fibers, whose structure consists of two concentric regions (core and cladding) with different doping levels, PCFs have air holes arranged periodically in the cladding and running along the fiber’s length, which makes the cladding index wavelength-dependent [85]. The optical properties of a PCF result from a series of structure parameters, such as the hole size, hole spacing, and the number of air-hole rings. Therefore, designing the parameters of a PCF structure relies on high-precision modeling of these structure parameters. Conventional numerical methods like the finite element method, block-iterative frequency-domain method, and plane wave expansion method need to be run multiple times for a specific fiber design, which requires significant computing resources when dealing with complex fiber structures.

In 2019, Sunny Chugh et al. adopted an FCNN with supervised learning to model a solid-core PCF [86]. PCF geometric parameters, including the diameter of the holes (d), the separation between the centers of two adjacent holes (pitch, Λ), the refractive index of the core (nc), the wavelength (λ), and the number of rings (Nr), are taken as the inputs of the ANN. The optical properties, including the effective index (neff), effective mode area (Aeff), dispersion (D), and confinement loss (αc), are the corresponding labels, which are calculated using Lumerical Mode Solutions. Quantitative analyses in simulation show that this method can support accurate and quick design of PCF structure parameters. The optical properties predicted by the trained ANN model show an acceptable mean squared error (MSE) with respect to their labels. As for the computation runtimes, Lumerical Mode Solutions takes a few minutes for each parameter, while the ANN model only needs a few milliseconds (Fig. 5).

Fig. 5 Photonic crystal fiber modeling with a fully connected neural network. Figure adapted with permission from ref. [86] (© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement)
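
The input-output structure of such a surrogate can be sketched as a small fully connected network that maps five inputs (d, Λ, nc, λ, Nr) to four outputs (neff, Aeff, D, αc). The snippet is only a structural sketch under assumptions made here: the hidden-layer sizes, the random untrained weights, and the normalized feature values are hypothetical and are not taken from ref. [86].

```python
import random

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every output node sees every input node."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) - b)  # bias subtracted, as in the MP model
        for row, b in zip(weights, biases)
    ]

def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def mean_squared_error(predicted, target):
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

rng = random.Random(0)
layer_sizes = [5, 8, 8, 4]   # 5 inputs, two hidden layers (hypothetical sizes), 4 outputs
layers = [init_layer(a, b, rng) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

def surrogate(features):
    """Forward pass: ReLU in the hidden layers, linear output layer."""
    hidden = features
    for index, (weights, biases) in enumerate(layers):
        is_last = index == len(layers) - 1
        activation = (lambda z: z) if is_last else (lambda z: max(0.0, z))
        hidden = dense(hidden, weights, biases, activation)
    return hidden

# Hypothetical normalized inputs in the order (d, pitch, n_c, wavelength, N_r).
features = [0.5, 0.8, 1.45, 1.55, 4.0]
prediction = surrogate(features)   # four outputs standing for (n_eff, A_eff, D, alpha_c)
```

Once trained, a forward pass of this kind involves only a few small matrix-vector products, which is consistent with the millisecond-scale runtime reported for the ANN model.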

Fiber amplifier design

Machine learning plays an increasingly important role in innovative fiber amplifier design. The Raman amplifier (RA) is a typical example [79, 87,88,89,90]. The RA is an attractive optical amplification scheme that offers gain across a broad range of wavelengths while maintaining low noise due to distributed amplification. Inverse design of Raman amplifiers focuses on selecting the pump powers and wavelengths that result in a targeted gain profile. The challenge of this problem lies in the highly complex interaction between the pumps and the Raman gain [87].

In 2019, Darko Zibar et al. demonstrated an ANN method for the highly accurate design of arbitrary Raman gain profiles, numerically in the C and C+L bands and experimentally in the C band [87]. The architecture resembles an autoencoder and includes two FNNs. The first is the backward neural network NNbw, which maps the Raman gain profile to the required pump power and wavelength configuration. The second is the forward neural network NNfw, which represents the forward mapping from the pump powers and wavelengths to the Raman gain profile and, combined with a gradient descent algorithm, can be used to fine-adjust the predictions of NNbw.
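
The two-stage idea, an initial guess from a backward model followed by gradient-descent refinement through a forward model, can be sketched with toy stand-ins. Nothing below reproduces the networks of ref. [87]: `forward_model` is a hypothetical smooth function playing the role of NNfw, the initial guess plays the role of an imperfect NNbw prediction, only pump powers are adjusted, and the gradient is estimated by finite differences instead of backpropagation.

```python
import math

CHANNELS = [0.0, 1.0, 2.0]   # normalized signal frequencies (hypothetical)
PUMP_CENTERS = [0.5, 1.5]    # normalized gain-peak positions of two pumps (hypothetical)

def forward_model(pump_powers):
    """Toy stand-in for NN_fw: pump powers -> gain at each channel."""
    return [
        sum(p * math.exp(-0.5 * (f - c) ** 2) for p, c in zip(pump_powers, PUMP_CENTERS))
        for f in CHANNELS
    ]

def gain_error(pump_powers, target_gain):
    """Sum of squared differences between predicted and target gain."""
    return sum((g - t) ** 2 for g, t in zip(forward_model(pump_powers), target_gain))

def refine(pump_powers, target_gain, learning_rate=0.1, steps=500, eps=1e-6):
    """Fine-adjust an initial pump guess by gradient descent through the forward model."""
    powers = list(pump_powers)
    for _ in range(steps):
        gradient = []
        for j in range(len(powers)):
            up = powers[:j] + [powers[j] + eps] + powers[j + 1:]
            down = powers[:j] + [powers[j] - eps] + powers[j + 1:]
            # Central finite difference; a neural forward model would use backpropagation.
            gradient.append((gain_error(up, target_gain) - gain_error(down, target_gain)) / (2 * eps))
        powers = [p - learning_rate * g for p, g in zip(powers, gradient)]
    return powers

target_gain = forward_model([0.8, 0.5])   # target profile produced by known pump powers
initial_guess = [0.6, 0.7]                # plays the role of an imperfect NN_bw prediction
refined = refine(initial_guess, target_gain)
```

Starting the refinement from the backward prediction rather than from a random point is what keeps the number of gradient steps small in this kind of scheme.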

Prediction and control of nonlinear effect

Machine learning provides physics-free and physics-informed approaches to modeling nonlinear fiber laser systems. On the one hand, with sufficient data representing the system behavior and powerful computation hardware, machine learning techniques can find the relationship between system state variables (input, internal, and output variables), providing new avenues for exploring high-dimensional dynamical systems without solving complex mathematical and physical equations. On the other hand, incorporating physical principles into neural networks can help regularize the training in small data regimes. Further, the obtained model of the nonlinear effect can also be used to design and control the laser properties (Fig. 6).

Fig. 6 Prediction of laser properties governed by the nonlinear effects

Pulse prediction of nonlinear dynamics

High nonlinearity in pulse evolution is an obstacle to establishing accurate numerical propagation simulations. Machine learning can provide an alternative solution by modeling the propagation and evolution of laser properties based on collected data from the nonlinear system.

In 2021, Lauri Salmela et al. used an RNN to achieve model-free prediction of complex nonlinear propagation in optical fibers governed by an NLSE system [73]. The trained network was shown to work for higher-order soliton compression and ultra-broadband supercontinuum generation, predicting the temporal and spectral evolution of ultrashort pulses in highly nonlinear fibers solely from the input pulse intensity profile. The approach can also be generalized to other propagation scenarios covering a wider range of input conditions and fiber systems, including multimode propagation.

Hao Sui et al. demonstrated a compressed convolutional neural network as an inverse computation tool to predict the initial pulse distribution from a series of discrete power profiles at different propagation distances [75]. Two nonlinear dynamics, the pulse evolution in fiber optical parametric amplifier systems and the soliton pair evolution in highly nonlinear fiber, are studied in simulations. The simulation results on the test datasets show a small and stable deviation, which indicates the potential of this method for optimizing the initial pulse of fiber optics systems (Fig. 7).

Fig. 7 Initial pulse distribution prediction with a convolutional neural network. Figure adapted with permission from ref. [75] (© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement)

Xiaotian Jiang et al. presented a physics-informed neural network to solve the NLSE and characterize the pulse evolution of different input waveforms [91]. The network is trained with an initial pulse and its label (the corresponding NLSE solution). The NLSE solution predicted by the network is used to calculate its corresponding pulse via physical theory. The loss combines two terms: one measures the difference between the initial pulse and the calculated pulse, and the other measures the difference between the NLSE solution predicted by the network and the label. In this way, the predictions of the network are constrained to satisfy the NLSE. The network has lower computational complexity than a commonly used numerical method, the split-step Fourier method.

Spatiotemporal nonlinearities prediction and control

In 2020, Uğur Teğin et al. studied spatiotemporal nonlinearities in multimode fibers for spectrum shaping. The results show that a multilayer neural network can learn nonlinear frequency conversion dynamics and can be used to generate a target beam spectrum [92]. Two other highly nonlinear phenomena, spectral broadening based on cascaded stimulated Raman scattering and supercontinuum generation, were also considered. Later, in 2021, they extended the method of ref. [73] (see Pulse Prediction of Nonlinear Dynamics) to predict spatiotemporal nonlinear propagation for an arbitrary number of modes in graded-index multimode fibers through an RNN [74] (Fig. 8).

Fig. 8 Spectrum shaping with a fully connected neural network. Figure adapted with permission from ref. [92] (© The Author(s) 2020, distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/))

Prediction of spatiotemporal nonlinearities is significant for generating a white-light continuum (WLC). An accurate model of the underlying spatiotemporal nonlinear optical process helps generate a broad and stable WLC, replacing the time-consuming empirical optimization of WLC properties (such as bandwidth, energy, and stability). In 2021, Carlo M. Valensise et al. adopted deep reinforcement learning to control the spatiotemporal dynamics of WLC generation [93]. The learning agent learns an effective control policy for three parameters (the energy of the pump pulse, the numerical aperture of the focused beam, and the position of the nonlinear plate with respect to the beam waist), achieving stable and broadband WLC generation experimentally (Fig. 9).

Fig. 9 White-light continuum generation with deep reinforcement learning. Figure adapted with permission from ref. [93] (© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement)

Pulse nonlinear shaping

Wave chopping and nonlinear control are two efficient ways to shape pulses in optical fibers and tailor laser properties on demand. Wave chopping is a primary method for generating pulsed laser output by extra-cavity modulation of a continuous-wave laser. It commonly relies on chopper devices such as electro-optic and acousto-optic modulators [94, 95] and enables flexible shaping of temporal properties to obtain arbitrary pulse shapes and durations. Limited by the response time of the driver, extra-cavity modulated pulsed fiber lasers can hardly produce ultrashort pulses, and their pulse durations are usually at the μs and ns levels [96,97,98,99,100]. The pulse power obtained from extra-cavity modulation is limited by the power-handling capability of the modulation device. A multi-stage amplifier is required to reach a high pulse output, but it leads to pulse distortion because of the gain saturation effect [99]. One way to overcome this problem is to use an optimization algorithm to find a modulation signal that pre-compensates the distortion [97].

An accurate model of nonlinear pulse propagation offers another route to pulse shaping [101, 102]. In 2020 and 2021, Sonia Boscolo et al. used artificial neural networks to model nonlinear pulse propagation in fibers with normal and anomalous dispersion [103, 104]. An FCNN is trained to learn the relationship between the temporal and spectral intensity profiles of the pulses and the fiber parameters. Further, the network can identify the initial pulse shape from the pulse shape at the fiber output.

Reconstruction and evaluation of laser properties

In laser property reconstruction and evaluation, recent research involving machine learning focuses on indirect methods, which offer low experimental cost and high immunity to instrumental and environmental noise. Specifically, measured images (intensity patterns such as pulse intensity, spectral intensity, and near-field beam intensity) are mapped to the required laser properties (such as the spectral amplitude, phase, and temporal duration of ultrashort pulses) by one-step inference rather than direct measurement. For example, a deep learning method has been explored to map the speckle pattern of a single-mode fiber followed by a disordered medium directly to the wavelength of a diode laser [105]. Phase detection may suffer from a phase ambiguity problem (e.g., multiple phases result in the same intensity pattern). To eliminate the ambiguity, the speckle pattern produced by a scattering device, e.g., a multimode fiber, can be used to break the degeneracy of the data and build a one-to-one correspondence to the required laser properties, as in a single-shot full-field pulse measurement technique enabled by deep learning [106] (Fig. 10).

Fig. 10

Reconstruction and evaluation of laser properties

Ultrashort pulses reconstruction

Ultrashort laser pulse reconstruction is a challenging topic in ultrafast science, including ultrafast imaging, femtochemistry, coherent control, and high-harmonic spectroscopy [107]. Typically, the duration of an ultrashort pulse is below the picosecond level, too short to be measured directly by photodiodes. Frequency-resolved optical gating (FROG) [108] and dispersion scan (d-scan) [109] are two widespread indirect methods. With a recovery algorithm, such as the principal component general projections algorithm [110] or the ptychographic reconstruction algorithm [111], the measured 2D trace supports the reconstruction of ultrafast pulses. In recent years, deep learning has been introduced into ultrafast pulse reconstruction. In 2018, Tom Zahavy et al. first applied a DNN to FROG to characterize the phase of ultrashort femtosecond optical pulses [112]. Later, further work on deep-learning reconstruction of the ultrashort pulse phase was demonstrated for attosecond pulses [113, 114].

One study in the fiber laser field demonstrated temporal duration characterization of mode-locked pulses using the dispersive Fourier transform trace [115]. The trained artificial neural network predicts the pulse duration with an average consistency of 95%. The proposed technique can be adapted to create a compact and low-cost feedback loop in complex laser systems.

Mode decomposition

Mode decomposition (MD) of multimode fibers, which aims to calculate the amplitude and phase of each eigenmode, is essential for analyzing the complete optical field and its beam properties. A challenge of complete mode decomposition is that different combinations of modal weights and phases may result in the same near-field intensity pattern [116]. In few-mode fibers, the main ambiguity comes from the modal phase.

In 2019, Yi An et al. used a CNN modified from VGG-16, with supervised training, to predict modal weights and relative phases from only the near-field intensity pattern for the first time [117]. Because of the phase ambiguity, the cosine values of the relative phases, rather than the relative phases themselves, are adopted as part of the labels so that the network can converge under a one-to-one mapping. Because the signs of the relative modal phases must then be determined, converting the predicted cosine values to relative phases becomes more time-consuming as the number of modes increases. Restricted by the capture rate of the CCD (30 Hz), the real-time decomposition rate is experimentally limited to 29.9 Hz for both the 3-mode and 6-mode cases when only the modal weights are required [118, 119]. When both modal weights and relative phases are predicted, the real-time decomposition rate is 29.9 Hz for the 3-mode case and 24 Hz for the 6-mode case. Later, this work was extended to modal analysis of Hermite–Gaussian beams emitted from solid-state lasers [120] (Fig. 11).

Fig. 11

Mode decomposition with a convolutional neural network modified from VGG-16. Figure adapted with permission from ref. [117] (© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement)
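The phase ambiguity and the cosine label used in [117] can be illustrated with a two-mode toy model. The sketch below is a simplified illustration under stated assumptions: the two real-valued one-dimensional profiles are hypothetical stand-ins for fiber eigenmodes, and the function names and sample values are assumptions, not taken from the cited work.

```python
import cmath
import math

def mode_a(x):
    """Even, fundamental-like profile (hypothetical stand-in for an eigenmode)."""
    return math.exp(-x * x)

def mode_b(x):
    """Odd, higher-order-like profile (hypothetical stand-in for an eigenmode)."""
    return x * math.exp(-x * x)

def near_field_intensity(weight_a, rel_phase, xs):
    """Intensity of a two-mode superposition with a given relative phase."""
    amp_a = math.sqrt(weight_a)
    amp_b = math.sqrt(1.0 - weight_a)
    return [
        abs(amp_a * mode_a(x) + amp_b * cmath.exp(1j * rel_phase) * mode_b(x)) ** 2
        for x in xs
    ]

def max_diff(p, q):
    """Largest pointwise difference between two intensity patterns."""
    return max(abs(a - b) for a, b in zip(p, q))

xs = [-2.0 + 0.1 * i for i in range(41)]
pattern_plus = near_field_intensity(0.6, +0.9, xs)
pattern_minus = near_field_intensity(0.6, -0.9, xs)   # same intensity as +0.9
pattern_other = near_field_intensity(0.6, 2.0, xs)    # different cosine value
```

For real-valued mode profiles the interference term depends only on the cosine of the relative phase. Phases of +0.9 rad and -0.9 rad therefore give the same pattern, while the cosine label remains unique for that pattern.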

In 2020, Xiaojie Fan et al. handled the phase ambiguity of modal coefficients using two labels: near-field and far-field images [121]. The convolutional neural network is trained with a combined loss that accounts for the reconstruction losses of the near-field and far-field images together. Simulation results show that the training accuracy can be improved at the cost of more labels in the datasets.

In 2021, with more powerful graphics processing units, Stefan Rothe et al. presented mode decomposition with another type of CNN, DenseNet, and decomposed 10 modes experimentally [122]. Like [117,118,119,120], they also used the cosine values of the relative phases as labels. The trained network can work on datasets with unknown modes, implementing mode decomposition on a subset of 10 modes of a 55-mode fiber.

Han Gao et al. used optimized datasets for network training to reduce the computational complexity [123]. Principal component analysis, a dimensionality reduction algorithm, was adopted to remove redundant information and noise from the near-field beam patterns. A 3-layer FCNN is trained to map the pre-processed near-field beam patterns to their labels (the cosine values of the modal phases and the modal weights). In a 3-mode simulation, dataset optimization reduces the time of complete mode decomposition from 40 ms per frame to 5 ms per frame. In the 3-mode experimental test, the averaged correlation between the reconstructed images and the target images of 300 samples is 0.9224.
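The dimensionality-reduction step can be sketched as follows. This is a minimal principal component analysis based on the singular value decomposition, not the pipeline of [123]. The function names, the image size, and the number of retained components are assumptions. The synthetic dataset is built from three random basis patterns purely for illustration.

```python
import numpy as np

def fit_pca(images, n_components):
    """Return the mean pattern and the leading principal directions."""
    x = images.reshape(len(images), -1).astype(float)  # flatten each image
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]

def transform(images, mean, components):
    """Project flattened images onto the principal directions."""
    x = images.reshape(len(images), -1).astype(float)
    return (x - mean) @ components.T

def inverse_transform(codes, mean, components):
    """Map low-dimensional codes back to flattened images."""
    return codes @ components + mean

# Hypothetical dataset: 50 beam patterns of 16 x 16 pixels, each a mix of 3 patterns.
rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 16 * 16))
weights = rng.normal(size=(50, 3))
images = (weights @ basis).reshape(50, 16, 16)

mean, components = fit_pca(images, n_components=3)
codes = transform(images, mean, components)       # compact inputs for a small FCNN
recon = inverse_transform(codes, mean, components)
```

The compressed codes, rather than the full images, would then be fed to the fully connected network, which is what shortens the per-frame processing time.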

Beam quality evaluation

The beam propagation factor M² is an important parameter for assessing laser beam quality. The standard M² measurement defined by the International Organization for Standardization is experimentally complex and relatively time-consuming. Improved techniques for fiber lasers include a single-shot scheme with a Fabry-Perot resonator [124], complex amplitude reconstruction methods with interferometers [125, 126] and with two identical charge-coupled devices (CCDs) [127], and mode decomposition methods [128,129,130]. Although the relationship between the M² factor and a single near-field pattern is implicit, deep learning can extract a direct mapping through data analysis.
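For comparison with the learning-based approach, the conventional evaluation can be sketched as a fit to the beam caustic. The example below assumes that second-moment beam radii have been measured at several positions along the propagation axis and that they follow the hyperbolic law w(z)² = w0² + (M² λ / (π w0))² (z − z0)². The function name, the units (mm), and the synthetic values are assumptions for illustration.

```python
import numpy as np

def m2_from_caustic(z, w, wavelength):
    """Fit w(z)^2 = a + b z + c z^2 and return (M2, waist radius, waist position)."""
    c, b, a = np.polyfit(z, np.asarray(w) ** 2, 2)
    m2 = np.pi / wavelength * np.sqrt(a * c - b**2 / 4.0)
    z0 = -b / (2.0 * c)
    w0 = np.sqrt(a - b**2 / (4.0 * c))
    return m2, w0, z0

# Hypothetical synthetic caustic, all lengths in mm.
wavelength = 1.064e-3
w0_true, z0_true, m2_true = 0.05, 2.0, 1.3
theta = m2_true * wavelength / (np.pi * w0_true)      # far-field half-angle
z = np.linspace(-30.0, 30.0, 21)
w = np.sqrt(w0_true**2 + theta**2 * (z - z0_true) ** 2)

m2, w0, z0 = m2_from_caustic(z, w, wavelength)
```

The need to record the beam at many axial positions is what makes this procedure slow, and it is the step that a single-pattern prediction avoids.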

In 2019, Yi An et al. used a trained CNN to determine the M² of fiber beams in about 5 ms from only one near-field beam pattern captured by the CCD, which is highly competitive for real-time measurement of time-varying beams [131]. This method is also robust to imperfect beam patterns, such as noisy patterns and patterns from a CCD with vertical blooming [83] (Fig. 12).

Fig. 12

M² evaluation with a convolutional neural network modified from VGG-16. Figure adapted with permission from ref. [131] (© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement)

Robust control for laser and laser system

Introducing a feedback mechanism into the architecture of a fiber laser or fiber laser system, so that external and internal factors are controlled in a closed loop through servo drive components, is a feasible way to maintain stable operation and state locking. Ongoing efforts in machine learning have advanced the control of complex and sensitive fiber lasers and fiber laser systems.

Mode locking in mode-locked laser

Mode-locked fiber lasers (MLFLs) based on nonlinear polarization rotation are the mainstream commercial products. Their performance is extremely sensitive to perturbations inside or outside the cavity, so strict environmental control is required to maintain robust performance. Cavity sensitivity to birefringence has a significant impact on mode-locking dynamics. However, quantitative modeling of the stochastic and sensitive birefringence remains unclear. Traversal and optimization algorithms have been widely studied for automatic mode-locking techniques [30, 132,133,134,135,136,137].

In 2014, a research group at the University of Washington achieved birefringence characterization of MLFLs based on machine-learning sparse representation in a numerical simulation [138]. Further, by combining adaptive extremum-seeking controllers with machine-learning-based birefringence classification, they proposed a self-tuning fiber laser in numerical simulations [139]. In 2018, they developed a self-tuning laser based on a deep learning and model predictive control (DL-MPC) algorithm [140]. The centerpiece of the DL-MPC algorithm is the model prediction module, a recurrent neural network that predicts future laser states. When the difference between the predicted and actual laser states exceeds a certain threshold, a VAE first infers the birefringence. A simple FCNN then maps the result to the control input (the angles of a polarizer and three quarter-waveplates) to maintain mode-locking (Fig. 13).

Fig. 13

Mode-locking with deep-learning model predictive control algorithm. Figure adapted with permission from ref. [140] (© 2018 Optical Society of America, Open Access)

In 2021, Qiuquan Yan et al. demonstrated a deep reinforcement learning algorithm with low latency (DELAY) for automatic mode-locked operation in a saturable absorber-based ultrafast fiber laser [141]. The DELAY algorithm has four deep neural networks: two select the appropriate action (adjusting the input voltages of the electrical polarization controller) according to the laser state, and the other two evaluate the effect of the executed actions. The experimental results show that the fastest recovery time of the algorithm after vibration is 0.472 s, and the average recovery time is 1.948 s (Fig. 14).

Fig. 14

Mode-locking with low-latency deep-reinforcement learning algorithm. Figure adapted with permission from ref. [141] (© 2021 Chinese Laser Press, Open Access)

Phase locking in coherent laser combination

Fiber lasers have attracted research interest in many fields because of their compact structure, high efficiency, high portability, and good beam quality. As the power increases, the beam quality of a single fiber laser degrades because of physical limitations [3,4,5]. Coherent beam combination (CBC) of multiple fiber lasers is a practical approach to breaking the power limitation [142, 143]. With phase synchronization between the sub-beams, a coherent output can be realized, which increases the power of the combined beam while maintaining beam quality and improving brightness.

Because of mechanical and thermal perturbations in actual engineering, a phase control technique is required to ensure phase synchronization and stabilization of the sub-beams and thus maximize the combination efficiency. The active phase-locking method uses a phase-detection and feedback servo control system to compensate for the dynamic phase noise and realize the in-phase coherent output of all sub-beams. Phase detection in classic active phase-locking can be divided into two categories: direct and indirect detection. Direct methods yield high accuracy but require complex experimental structures; examples are the heterodyne detection method [144, 145], the interferometric phase measurement method [146], the phase-intensity mapping method [147], and the pattern recognition method [148]. Indirect detection techniques use electrical modulation and demodulation to find the phase information; typical examples are dithering techniques [149,150,151,152] and the stochastic parallel gradient descent (SPGD) algorithm [153].
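As a reference point for the learning-based methods discussed next, the SPGD idea can be sketched in a few lines. The example below is a toy simulation under stated assumptions: the piston errors are static, the combining metric is the normalized on-axis intensity of equal-amplitude beams, and the gain, perturbation amplitude, and function names are hypothetical choices rather than values from the cited work.

```python
import cmath
import random

def combining_metric(phases):
    """Normalized on-axis intensity of equal-amplitude beams (1.0 when in phase)."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) ** 2 / len(phases) ** 2

def spgd_lock(piston_errors, gain=15.0, delta=0.1, steps=2000, seed=0):
    """Two-sided SPGD: perturb all channels at once and follow the metric change."""
    rng = random.Random(seed)
    control = [0.0] * len(piston_errors)
    for _ in range(steps):
        # Random +/- delta perturbation applied to every channel in parallel.
        du = [delta * rng.choice((-1.0, 1.0)) for _ in control]
        j_plus = combining_metric(
            [e + c + d for e, c, d in zip(piston_errors, control, du)])
        j_minus = combining_metric(
            [e + c - d for e, c, d in zip(piston_errors, control, du)])
        # Move each control phase along its perturbation, scaled by the metric change.
        control = [c + gain * (j_plus - j_minus) * d for c, d in zip(control, du)]
    final = combining_metric([e + c for e, c in zip(piston_errors, control)])
    return control, final

# Hypothetical static piston errors (rad) for an 8-channel array.
errors = [0.0, 0.8, -1.1, 1.5, -0.4, 2.0, 0.6, -1.6]
initial_metric = combining_metric(errors)
control, final_metric = spgd_lock(errors)
```

Each iteration requires two metric measurements regardless of the channel count, but the number of iterations needed to converge grows with the number of channels. This is one way to see why the control bandwidth of such schemes drops in large arrays.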

A common question in CBC is how to combine a large number of sub-beams efficiently to achieve a high output power. However, the control bandwidth of most classic active methods decreases as the number of sub-beams increases, so the control bandwidth of phase control remains a challenging problem in large-scale CBC systems. Machine learning has recently been introduced to extend the classic control methods, with reinforcement learning and supervised learning as the two main approaches.

In 2019, Henrik Tünnermann et al. demonstrated deep reinforcement learning for a Mach-Zehnder interferometer CBC setup and for tiled-aperture beam combining [77, 78]. The result of the critic network is regarded as a reward for beam quality, and it decides the action of the control network. The control strategy is similar to that of an optimization scheme. Since taking an action requires time, the robustness still needs to be enhanced before practical applications (Fig. 15).

Fig. 15

Tiled aperture beam combining with deep reinforcement learning. Figure adapted with permission from ref. [78] (© The Author(s) 2019, distributed under the terms of the Creative Commons Attribution license, http://creativecommons.org/licenses/by/4.0/)

In 2021, Maksym Shpakovych et al. proposed a quasi-reinforcement learning algorithm for an array of up to 100 laser beams [154]. The piston aberration in this system is created by a spatial light modulator (SLM) rather than coming from the actual environment. In this case, the influence of dynamic random noise in real systems was not considered.

In the same year, Xi Zhang et al. applied the Q-learning algorithm, an iterative reinforcement learning method, to CBC [155]. When the number of channels is low, its performance is similar to that of the SPGD algorithm, while its parameters are more convenient to tune.

A feasible supervised deep-learning-based CBC relies on a well-trained neural network that can inversely map an intensity profile to the phase of each sub-beam. During the phase control process, the phase errors are compensated using the phases predicted by the neural network. The premise of this method is a one-to-one correspondence between input and output. Only in this way can the network converge and thus acquire phase prediction capability. In the symmetrically arranged beam array of tiled-aperture systems, the same far-field intensity distribution can correspond to different phase distributions in the near field [156], so it cannot be directly paired with the phase as training data.

In 2019, Tianyue Hou et al. incorporated a CNN based on supervised learning into tiled-aperture CBC systems to learn the relationship between the intensity profile of the combined beam and the relative phases of the array elements for the first time [157]. In this way, the phase required for compensation can be obtained directly, which is quite different from the methods based on reinforcement learning [77, 78, 154, 155]. This work adopted non-focal-plane rather than focal-plane images to train the deep-learning model and thereby avoid the data collision problem. Later, this method was extended to generate orbital angular momentum beams with different topological charges from a CBC system [80]. It should be noted that structured light [158,159,160,161], one of the research frontiers, can be generated from a beam array with similar methods (Fig. 16).

Fig. 16

Coherent beam combination with a deep convolutional neural network and non-focal-plane images. Figure adapted with permission from ref. [157] (© The Author(s) 2019, distributed under the terms of the Creative Commons Attribution license, http://creativecommons.org/licenses/by/4.0/)
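The data collision in the focal plane, and the way an out-of-focus observation plane can lift it, can be illustrated with a one-dimensional toy model. The sketch below treats three point-like emitters placed symmetrically in the pupil and models defocus as a quadratic pupil phase. The emitter positions, the defocus strength, and the function names are assumptions for illustration and do not reproduce the optical system of [157].

```python
import cmath

POSITIONS = (-1.0, 0.0, 1.0)  # symmetric three-emitter array (arbitrary units)

def intensity(phases, angles, defocus=0.0):
    """Pattern of point emitters; defocus adds a quadratic phase across the pupil."""
    pattern = []
    for theta in angles:
        field = sum(
            cmath.exp(1j * (p + defocus * x * x + x * theta))
            for p, x in zip(phases, POSITIONS)
        )
        pattern.append(abs(field) ** 2)
    return pattern

def twin(phases):
    """Conjugate the phases and mirror them about the array centre."""
    return tuple(-p for p in reversed(phases))

def max_diff(p, q):
    """Largest pointwise difference between two intensity patterns."""
    return max(abs(a - b) for a, b in zip(p, q))

angles = [-3.0 + 0.05 * i for i in range(121)]
phases = (0.3, 0.0, 1.1)  # hypothetical piston phases (rad)

focal_gap = max_diff(intensity(phases, angles),
                     intensity(twin(phases), angles))
defocused_gap = max_diff(intensity(phases, angles, defocus=0.8),
                         intensity(twin(phases), angles, defocus=0.8))
```

In the focal plane the two phase sets produce identical patterns, so a network trained on such images would see one input paired with two different labels. With the quadratic phase included, the patterns differ and the mapping becomes unambiguous in this toy model.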

Furthermore, to obtain a straightforward optical design that could adopt the focal-plane intensity image, in 2021, Qi Chang et al. considered breaking the degeneracy of the combined beam pattern in the focal plane with a diffuser [162]. A CNN maps the scattering intensity images in the focal plane to the phase error of the fiber array. To some extent, the role of the diffuser is equivalent to applying both intensity and phase modulation.

Similarly, Renqi Liu et al. applied amplitude modulation to the sub-beams of a tiled-aperture coherent beam combination system [163]. In a two-beam coherent combination experiment, the system simultaneously measures the sub-beam pointing and the phase difference with RMS accuracies of about 0.2 μrad and λ/250, respectively.

Another study adopted a four-layer FCNN to combine 81 channels on a two-dimensional, 9×9 beam diffractive optical element (DOE) combiner [164]. Similar to [154], this system also works in an ideal situation. The network was trained to map the far-field interference pattern to the phase of the DOE. Since nearly identical interference patterns might come from different beam phases, phase ambiguity is an obstacle to the convergence of the network. The core idea of this work is to generate the training data from a limited phase perturbation range (less than 180 degrees), which is regarded as the unambiguous region. The trained network can quickly predict the phase for small yet frequent perturbations, which is the usual case. If the phase moves outside the trained range, a feedback loop pulls it back into the trained region through a random walk. This work also discussed why training on a limited region is adequate for the full phase perturbation range.

Discussions and prospects

Over the last decade, machine learning has dramatically boosted the development of fiber lasers, leading to new paradigms for advanced research and practical engineering in fiber lasers. Even though many impressive results have been presented in the available studies, potential problems and challenges remain. For example, some work is based only on numerical data or numerical verification. Further work is needed, such as uncovering the governing models from experimental responses, validating learning models in laboratories, and online learning that accounts for environmental changes in real-time applications. Besides, the high burden of data collection, the expensive computation cost, and poor interpretability might severely restrict the application possibilities of machine learning, especially deep learning methods with a black-box mechanism. Industrial applications that require high interpretability still prefer classical optimization algorithms. When should we choose a machine learning method under the trade-off between effectiveness and cost? What makes one machine learning method better than another? To what extent can we trust machine learning results and conclusions? The exploration of fundamental questions like these will drive machine learning research.

Opportunities and challenges lie ahead. Going forward, effective tools are expected to accelerate machine learning research. Mature open-source frameworks such as TensorFlow [165] and PyTorch [166] have brought great convenience to the popularization of machine learning. However, standardized benchmarks for fiber and fiber laser research are rare. Further work could consider creating open datasets for specific topics that serve as standards for evaluating and comparing methods, which would benefit machine learning research in the related field.

Novel machine learning techniques keep emerging. Various new frameworks and new mathematics for scalable, robust, and rigorous next-generation learning machines are under development, which will continue to promote the development of lasers and achieve more brilliant results in the foreseeable future. For example, algorithm design by hand is a laborious process. To improve it, the concept of learning to optimize, which shows that algorithm design can be cast as a learning problem, deserves more attention [167, 168]. Related techniques might benefit the creation of autonomous fiber lasers with self-learning and intelligence, featuring unlimited scalability and resistance to disruption.