Introduction

Synchrotron light sources, with their high intensity, brightness, and broad spectrum, have empowered many research fields, including materials science1,2,3,4, biology5,6,7, and earth science8,9,10. X-ray absorption spectroscopy (XAS), especially the X-ray absorption near-edge structure (XANES), is particularly useful for characterizing the valence state and local chemical environment of elements, and has been widely applied to energy storage materials11,12,13 and catalysts14,15,16. Conventional XANES records the signal from bulk samples and overlooks the heterogeneity of composition and chemical reaction inside the material5,17. This has led to the development of spatially resolved XANES imaging techniques11,18,19,20.

In recent years, two-dimensional (2D) XANES imaging has been incorporated into scanning fluorescence X-ray microscopy and full-field transmission X-ray microscopy and is capable of resolving chemical species at the nanoscale21,22,23. In scanning fluorescence X-ray microscopy, monochromatic X-rays are focused to a small spot and scanned over the sample point-by-point while X-ray fluorescence (XRF) signals are collected from each point. By repeating scans over a range of incident X-ray energies across the absorption edge of a specific element, a 2D XRF-XANES dataset is assembled. State-of-the-art scanning X-ray microscopy can provide nanoscale resolution down to tens of nm, limited by the size of the focused spot24. As a dark-field imaging technique, XRF-XANES has very high sensitivity in detecting low-concentration elements, mainly because its zero background yields a high signal-to-noise ratio (SNR). However, the slow data collection makes it unsuitable for studying large samples and dynamic systems. For example, at the hard X-ray nanoprobe (HXN) at NSLS-II, with a 40 nm focused spot, it takes ~20 min to scan a 10 µm × 10 µm area, equivalent to ~20 h for collecting a XANES dataset with 60 energy points.

As an alternative to scanning X-ray fluorescence microscopy, full-field transmission X-ray microscopy (TXM) operates in a similar way to optical microscopy25,26. A single exposure on the order of 0.01 s is used to capture a snapshot of the entire absorption contrast image of the sample inside the field of view. By assembling all the images taken at different incident energies, a 2D XANES image stack is created for further analysis and quantification27. TXM-XANES has distinct advantages including high efficiency (data collection within minutes instead of hours), large field of view (up to hundreds of micrometers), and high spatial resolution (down to 20–30 nm)28. However, TXM exhibits relatively poor sensitivity owing to the inherently low SNR of the full-field imaging technique. As described in the following, a reliable TXM-XANES analysis depends heavily on the accuracy of extracting sample signals from the background.

In a typical TXM-XANES measurement (Fig. 1a), we take a pair of images at each incident X-ray energy, one with the sample and one without the sample (referred to as the background or white-field image). The image is normalized by dividing the sample image by the background image, which extracts the absorption contrast of the sample only. In practice, variation in the beam profile between the two exposures induces errors in the background normalization. This represents the system noise, which depends on the characteristics of the X-ray source and beamline hardware. While many existing methods, such as non-local means29, TV-regularization30, BM3D31, and others32, can remove random noise (e.g., Poisson noise from finite photon counts) from the XANES images, they cannot eliminate the system noise related to beam profile variation because they lack a priori knowledge of it, which leads to error-prone data interpretation.
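The effect of beam-profile variation on the normalization can be illustrated with a minimal NumPy sketch. Everything in it is hypothetical and chosen only for illustration: the Gaussian beam profile, the 5% absorbing disk, and the 2% horizontal drift of the beam between the two exposures are assumptions, not values from the measurements.

```python
import numpy as np

def normalize(sample_img, background_img, eps=1e-12):
    """Background normalization: sample image divided by background image."""
    return sample_img / np.maximum(background_img, eps)

ny, nx = 64, 64
yy, xx = np.mgrid[0:ny, 0:nx]

# Hypothetical smooth beam profile, brighter at the centre of the field of view.
beam = 1000.0 * np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 40.0 ** 2))

# Hypothetical weakly absorbing sample: a disk that attenuates the beam by 5%.
transmission = np.ones((ny, nx))
transmission[(yy - 32) ** 2 + (xx - 32) ** 2 < 10 ** 2] = 0.95

# Ideal case: the beam is identical for the sample and background exposures.
ideal = normalize(beam * transmission, beam)

# Assumed drift: the beam profile tilts by up to 2% before the sample exposure.
drift = 1.0 + 0.02 * (xx - 32) / 32
drifted = normalize(beam * drift * transmission, beam)

# Error in the empty region, where the normalized value should be exactly 1.
outside = transmission == 1.0
ideal_err = np.abs(ideal[outside] - 1.0).max()
drift_err = np.abs(drifted[outside] - 1.0).max()
print(f"max background error, ideal: {ideal_err:.2e}, with drift: {drift_err:.2e}")
```

In this toy case the residual background error (2%) is of the same order as the sample contrast (5%), and because it is smooth rather than random, a noise filter applied afterwards would not remove it.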

Fig. 1: Challenges in quantitative analysis for TXM-XANES.

a Scheme of the TXM measurement. The normalized image is obtained by dividing the projection image taken with the sample by the background image taken without the sample. b, c Measurement 1: X-ray absorption images of the LiNiO2 precursor (pre-heated mixture of Ni(OH)2 and LiOH) at 8.3 and 8.6 keV, respectively. d, e XANES spectra extracted from the regions of interest ROI 1 and ROI 2 in measurement 1, as indicated by the black squares shown in (b). Two reference spectra measured from NiO (ref. Ni2+) and LiNiO2 (ref. Ni3+) are used to fit the ROI spectrum by linear regression. The fitted spectrum is shown as the red curve. Each reference spectrum is scaled by its fitted coefficient and shown as the green and yellow lines. f Pseudo-colored map showing the Ni3+ concentration fitted from the data collected in measurement 1. g, h Measurement 2: X-ray absorption images of the LiNiO2 precursor at 8.3 and 8.6 keV, respectively. i, j XANES spectra extracted from the regions of interest ROI 1 and ROI 2 in measurement 2, as indicated by the black squares shown in (g). k Pseudo-colored map showing the Ni3+ concentration fitted from the data collected in measurement 2. Note: colorbars in (b, c, g, h) represent the normalized intensity. Colorbars in (f, k) represent the ratio of the fitted coefficient for Ni3+, calculated by CNi3+/(CNi2+ + CNi3+). Scale bars are 10 µm.

In this study, we demonstrate a deep-learning approach to perform reliable background normalization in full-field TXM-XANES images. Machine learning (ML) based on deep neural networks has been widely used in various image processing tasks such as object detection33, image denoising34, and image inpainting35, where it often outperforms traditional image processing algorithms. Here we designed and trained a deep learning model based on the residual-in-residual dense block (RRDB), a neural network architecture initially proposed for image super-resolution36,37,38, to extract the beam profile from the measured XANES images. Note that 2D TXM-XANES generates three-dimensional data with absorption images stacked along the energy dimension. The absorption images at different energies are not independent. Instead, they are strongly correlated because the absorption has a well-defined, element-specific dependence on the photon energy away from the X-ray absorption edge. A specific feature of our approach is to utilize such domain knowledge in the training process to greatly improve prediction accuracy. In experiments, the ML approach is found to significantly improve the accuracy of chemical state mapping, especially when the element of interest is at low concentration.

Results and discussion

Challenge of XANES imaging

We use an example to highlight the challenge of analyzing low-absorption samples caused by imperfect background normalization. The sample is a mixture of Ni(OH)2 and LiOH after heating at 400 °C for 4 h. The mixture is a precursor for the synthesis of LiNiO2, a cathode material for lithium-ion batteries. We conducted a 2D TXM-XANES measurement to map the Ni valence state in the heterogeneous sample and reveal the morphology and chemical uniformity of the precursor, which helps to fine-tune the synthesis process. Two measurements were conducted on the same sample at different times at the FXI beamline at National Synchrotron Light Source II (NSLS-II). In both measurements, the beam energy varies from 8.2 to 8.8 keV with a total of 90 energy points, covering the K-edge absorption of Ni. The absorption images at 8.3 and 8.6 keV from the first and second measurements are shown in Fig. 1b, c, g, h, respectively. More examples of the absorption images of the two measurements can be found in Supplementary Note 1 and Supplementary Figs. 1 and 2. To derive the valence state of Ni, we use linear regression to fit the measured spectrum (pixel-wise) using two reference spectra from NiO and LiNiO2. It is worth noting that although the dynamic Jahn-Teller (J-T) effect in LiNiO2-type materials has been suggested by some experiments and first-principles studies39,40,41, the variation in the Ni-O bond length has little effect on the K-edge fitting results of the absorption spectra. Additionally, the exposure time for capturing absorption images at each energy point is 0.05 s, far longer than the picosecond timescale of the dynamic J-T distortion39,41. Consequently, the distortions are time-averaged and would not be reflected in our measurements. The spectra extracted from two regions of interest (ROI) from the two measurements are plotted in Fig. 1d, e, i, j, respectively. ROI #1 is inside a large particle with relatively high absorption. The Ni valence state averaged over ROI #1 has similar values (+2.48 vs. +2.39) for the two measurements. ROI #2 is inside a small particle with low absorption. The Ni valence state averaged over ROI #2 is very different between the two measurements (+2.77 vs. <+2). The result from the second measurement (Fig. 1j) is clearly unphysical, as Ni should always have a valence state >+2 during material synthesis (see “Methods” section). Inspection of the XANES spectra from ROI #2 (Fig. 1e, j) shows they are very noisy, which is the reason for the large discrepancy and error in the oxidation state quantification. Such noise originates from the random variation of the incident beam intensity and profile, which makes the background normalization uneven. Without correction, the obtained chemical mapping is prone to significant errors. Similarly, a few particles in Fig. 1k exhibit a core-shell-like structure. This artificial feature is attributed to the noisy spectra from small particles and from the periphery of large particles, where the volumetric X-ray absorption is weak. Fitting such noisy spectra is more susceptible to improper background normalization (see Supplementary Note 2 and Supplementary Fig. 3).
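The two-reference linear regression described above can be sketched as follows. This is a minimal illustration, not the analysis code used for Fig. 1: the tanh-shaped reference spectra, their edge positions, and the noise level are placeholders, and converting the Ni3+ fraction to an average valence as 2 + fraction is an assumption made here for illustration.

```python
import numpy as np

def fit_two_references(spectrum, ref_a, ref_b):
    """Least-squares fit of spectrum ~ c_a * ref_a + c_b * ref_b."""
    design = np.column_stack([ref_a, ref_b])
    coeffs, *_ = np.linalg.lstsq(design, spectrum, rcond=None)
    return coeffs

def fraction_of_b(coeffs):
    """Ratio c_b / (c_a + c_b), the quantity mapped in Fig. 1f, k."""
    c_a, c_b = coeffs
    return c_b / (c_a + c_b)

# 90 energy points from 8.2 to 8.8 keV, as in the measurements.
energy = np.linspace(8.2, 8.8, 90)

def toy_edge(e0, width=0.006):
    """Placeholder step-like spectrum; real references are measured spectra."""
    return 0.5 * (1.0 + np.tanh((energy - e0) / width))

ref_ni2 = toy_edge(8.345)  # stands in for the NiO reference (hypothetical edge)
ref_ni3 = toy_edge(8.348)  # stands in for the LiNiO2 reference (hypothetical edge)

# Synthetic pixel spectrum with 40% Ni3+.
spectrum = 0.3 * ref_ni2 + 0.2 * ref_ni3
frac = fraction_of_b(fit_two_references(spectrum, ref_ni2, ref_ni3))
valence = 2.0 + frac  # assumed mapping from Ni3+ fraction to average valence

# The same spectrum with additive noise, mimicking a low-absorption ROI.
rng = np.random.default_rng(0)
noisy = spectrum + rng.normal(0.0, 0.05, energy.size)
frac_noisy = fraction_of_b(fit_two_references(noisy, ref_ni2, ref_ni3))
print(f"clean fit: {frac:.3f}, noisy fit: {frac_noisy:.3f}")
```

Because the two references differ only near the edge, the fitted fraction is sensitive to noise there, and nothing in an unconstrained fit keeps the fraction between 0 and 1. This is one way a value below +2 can arise from a noisy spectrum.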

ML training dataset and model architecture

To address the problem highlighted in the previous section, we implemented a physics-informed deep learning model to improve background normalization for accurate 2D XANES analysis. Figure 2 illustrates the model architecture. The model comprises a Residual-in-Residual Dense Block (RRDB) network, which uses residual dense blocks (RDB) as its building module. Each RDB consists of densely connected layers with local residual learning (Fi to Fi+1). Multiple RDBs are assembled in a residual network (F0 to F0 + F3) to form the RRDB network. This neural network takes noisy XANES images captured at multiple X-ray energies as input and predicts the beam backgrounds embedded in the images. Instead of predicting denoised particle images, we focus on estimating the background intensity profile. This choice is driven by the fact that samples studied in experiments can exhibit diverse morphologies and features. It may not be feasible to predict the foreground of such images without an extensive dataset to sufficiently train the ML model. In contrast, the background profile of the beam exhibits less variation, allowing our model to make reliable predictions with a relatively small training dataset.

Fig. 2: The structure of the RRDB network.

The RRDB network contains three Residual-Dense-Blocks (RDB). Conv is the abbreviation of convolutional layer.

We generated the training dataset by fusing synthetic and experimental data. A training image (Fig. 3a) is obtained by superimposing three components: a simulated Ni K-edge absorption image from a synthetic sample (Fig. 3c and Supplementary Fig. 4g–r), and an experimentally acquired beam intensity profile (Fig. 3b and Supplementary Fig. 4a–f) that is further decomposed into a smooth background profile (Fig. 3d) and random high-frequency noise (Fig. 3e). A synthetic sample is used here because it provides exact knowledge of the ground truth of the foreground. The background images were collected at the FXI beamline at NSLS-II over the course of a few months. A Gaussian filter with a 7 × 7 kernel size was used to remove the high-frequency noise (originating from the Poisson noise) to give a “Clean background” image (Fig. 3d), which serves as the ground truth during network training. The X-ray absorption images of synthetic particles are simulated according to the Beer-Lambert law, with absorption coefficients taken from the Xraylib database. More details on the generation of absorption images from synthetic samples are described in Supplementary Notes 3 and 4. The goal of the training process is to extract the background image (Fig. 3d) from the input, which is subsequently removed together with the random noise from the input to obtain a prediction of the denoised XANES image.
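The composition of one training sample can be sketched in NumPy as below. This is an illustration under stated assumptions rather than the actual data pipeline: the background is a synthetic stand-in for a measured one, the Gaussian sigma (only the 7 × 7 kernel size is given), the spherical particle, and the attenuation value `mu` are hypothetical, and the multiplicative combination of sample and background is assumed from the normalization being a division.

```python
import numpy as np

def gaussian_kernel_1d(size=7, sigma=1.5):
    """Normalized 1D Gaussian kernel (sigma is an assumed value)."""
    x = np.arange(size) - size // 2
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_smooth(img, size=7, sigma=1.5):
    """Separable size x size Gaussian filter with edge padding."""
    k = gaussian_kernel_1d(size, sigma)
    padded = np.pad(img, size // 2, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

rng = np.random.default_rng(1)
ny, nx = 48, 48
yy, xx = np.mgrid[0:ny, 0:nx]

# Stand-in for a measured, normalized background: smooth profile plus noise.
smooth_part = 1.0 + 0.03 * np.sin(xx / 15.0) * np.cos(yy / 20.0)
raw_background = smooth_part + rng.normal(0.0, 0.005, (ny, nx))

# Decompose into the smooth "Clean background" label and high-frequency noise.
clean_bkg = gaussian_smooth(raw_background)
noise = raw_background - clean_bkg

# Synthetic spherical particle; Beer-Lambert law: I / I0 = exp(-mu * t).
radius = 12.0
r2 = (yy - 24) ** 2 + (xx - 24) ** 2
thickness = 2.0 * np.sqrt(np.clip(radius ** 2 - r2, 0.0, None))  # path length, pixels
mu = 0.01  # hypothetical attenuation per pixel of path length at one energy
clean_img = np.exp(-mu * thickness)

# Assumed composition of the network input: sample image times raw background.
noisy_img = clean_img * raw_background
```

Repeating the last steps with an energy-dependent `mu` would give one image per energy, that is, a synthetic XANES stack whose ground-truth foreground and background are both known.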

Fig. 3: Training dataset for TXM-XANES denoising.

a Synthetic X-ray absorption image. b Intensity profile of image (a) viewed in 3D. c Intensity profile of a clean (ground truth) absorption image. d Intensity profile of a smoothed background (Clean background). e Intensity profile of high-frequency random noise. For (a–e) the colorbar represents the pixel intensity after background normalization. Notice that some regions (pixels) have intensities larger than 1, indicating the system errors induced in the process of background normalization due to the fluctuation of the incident beam.

Protocol used for network training

As shown in Fig. 4a, a set of synthetic “Noisy images” (depicted in Fig. 3b), which represent the calculated absorption patterns of a synthetic sample at different energies, is fed into the network. The background images, which are free of high-frequency noise (as shown in Fig. 3d), serve as ground truth and are labeled as “Clean Bkg” in Fig. 4. Detailed information regarding the preparation of training data can be found in Supplementary Notes 3 and 4. Our task is to predict the backgrounds corresponding to the “Noisy images” at each absorption energy. The predicted backgrounds are denoted as “Output Bkg”. Additionally, in Fig. 4, we introduce the “Cal Img”, the image obtained after background normalization. The normalization process involves dividing the input “Noisy images” by the corresponding “Output Bkg”. To improve the prediction accuracy, multiple loss functions are employed to regularize the training process. These loss functions can be categorized into two groups:

  (1) Mean squared error loss (MSE loss) and feature loss (VGG loss), labeled as L1 and L2 in Fig. 4

Both loss functions quantify the discrepancy between the predicted background (“Output Bkg”) and the ground truth (“Clean Bkg”). They regulate the background prediction for the XANES image at each energy independently.

  (2) Spectrum fitting loss

2D XANES images represent 3D data with two spatial dimensions and one energy dimension. However, a limitation of the L1 and L2 losses lies in their inability to capture the inherent correlation between absorption images in the energy domain. In a typical XANES spectrum, the attenuated intensity I exhibits a stepwise increase across an element’s absorption edge. If we exclude the “white line” and the wiggles in the post-edge region, which are sensitive to the chemical environment, the blue curve in Fig. 4b approximates this intensity I, and can be described by the following function:

$$I(E)={\boldsymbol{A}}\,\cdot\, \mu \left(E\right)+{\boldsymbol{B}}\,\cdot\, {E}^{-3}+{\boldsymbol{C}} \tag{1}$$

where µ(E) approximates the linear attenuation coefficient as a function of the X-ray energy E, which can be read from a database, e.g., Xraylib. µ(E) is an intrinsic property of the target element and dictates the step-like shape of the spectrum. Parameter A is proportional to the amount of material integrated along the X-ray beam path and is referred to as the “absorption thickness”. In Eq. (1), the second term \({E}^{-3}\) represents the attenuation caused by other elements in the material, which is valid when E is far away from the absorption edges of those other elements, e.g., by a few hundred eV. In cases where this assumption is not valid, additional step function terms can be easily incorporated into Eq. (1) to account for the absorption edges of the other elements. The last term characterizes additional offsets of the spectrum background. In XANES analysis, A, B, and C are parameters to be fitted from the spectrum. It is worth noting that A is insensitive to the valence state of the element, and so can be robustly inferred from the spectrum even in the presence of noise. Clearly, the domain knowledge (Eq. (1)) and sample information (A, B, C) could be utilized to improve the separation of the background from XANES images. To achieve this objective, we included several additional loss functions in the training process:

  • Consistency in thickness fitting

As shown in Fig. 4a, Eq. (1) is fitted to the predicted denoised XANES spectra to obtain a mapping of the A, B, and C of the sample. The discrepancy between the predicted A (“Fitted thickness”) and its value fitted from the ground truth XANES image (“Thickness”) forms the MSE loss (L5).

  • Consistency in fitted images

The low-frequency part of the spectroscopic images (“Fitted Img.”) is reconstructed from the predicted parameters (A, B, C) using Eq. (1). They are then compared with the predicted denoised images (“Cal Img.”) to compose an MSE loss (L3) and VGG-based feature loss (L4).
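Because Eq. (1) is linear in A, B, and C, the pixel-wise fit and the two MSE-type consistency terms can be sketched with ordinary least squares. The sketch below is a NumPy illustration only: in actual training these operations would need to be differentiable inside the network framework, the tanh step stands in for the tabulated µ(E), and the parameter values and the 1% residual error are hypothetical. The VGG-based term L4 is not shown.

```python
import numpy as np

def fit_edge_model(stack, mu, energy):
    """Pixel-wise least squares for I(E) = A*mu(E) + B*E**-3 + C (Eq. 1).

    stack: array of shape (n_energy, ny, nx).
    Returns the parameter maps (3, ny, nx) and the fitted stack.
    """
    design = np.column_stack([mu, energy ** -3.0, np.ones_like(energy)])
    n_e, ny, nx = stack.shape
    params, *_ = np.linalg.lstsq(design, stack.reshape(n_e, -1), rcond=None)
    fitted = (design @ params).reshape(n_e, ny, nx)
    return params.reshape(3, ny, nx), fitted

def mse(a, b):
    return float(np.mean((a - b) ** 2))

energy = np.linspace(8.2, 8.8, 90)                       # keV
mu = 0.5 * (1.0 + np.tanh((energy - 8.345) / 0.004))     # placeholder for mu(E)

rng = np.random.default_rng(2)
ny = nx = 8
thickness = rng.uniform(0.1, 0.5, (ny, nx))              # ground-truth A map

# Ground-truth stack ("Clean Img") built from Eq. (1) with hypothetical B and C.
clean_stack = (thickness[None] * mu[:, None, None]
               + 50.0 * energy[:, None, None] ** -3.0 + 0.02)

# Stand-in for "Cal Img": clean stack with a 1% residual normalization error.
cal_stack = clean_stack * (1.0 + rng.normal(0.0, 0.01, clean_stack.shape))

params_true, _ = fit_edge_model(clean_stack, mu, energy)
params_cal, fitted_cal = fit_edge_model(cal_stack, mu, energy)

loss_thickness = mse(params_cal[0], params_true[0])  # analogue of L5
loss_fitted_img = mse(fitted_cal, cal_stack)          # analogue of L3
print(f"L5-like: {loss_thickness:.2e}, L3-like: {loss_fitted_img:.2e}")
```

Both terms vanish when the normalized stack follows Eq. (1) exactly, so they penalize background residuals that vary from one energy to the next, which the per-image L1 and L2 losses cannot detect.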

Fig. 4: The training processes of the RRDB model.

a The ML training architecture. In the figure, the gray boxes are the synthetic image data used in the training. Blue boxes are the direct and derived outputs from the network. Green boxes are the spectrum fitting results using Eq. (1). The yellow diamonds represent the loss functions used. More specifically, “Noisy images” represents the synthetic X-ray absorption images at different incident X-ray energies E1, E2, …, En, as shown in Fig. 3a, which are the input of the neural network. “Output Bkg.” is the predicted background image output from the neural network. “Clean Bkg.” represents the smoothed background images, as shown in Fig. 3d. “Clean Img.” represents the ground truth absorption images of the sample, as shown in Fig. 3c. “Cal. Img.” represents the background-normalized images, calculated as “Noisy images”/“Output Bkg.”. b Scheme illustrating how the “thickness” is fitted from the XANES spectrum.

The L3, L4, and L5 loss terms serve to penalize the prediction errors in the spectroscopic dimension. By incorporating these spectrum fitting losses, we observed several benefits during the training process: (1) Faster convergence: the loss functions converge more rapidly, leading to efficient model training. (2) Enhanced stability: the model becomes more stable overall, as evidenced by the profiles of the loss functions for both the training and validation datasets (refer to Supplementary Fig. 5 and Supplementary Note 6). (3) Improved denoising performance: the peak signal-to-noise ratio (PSNR) of the denoised images exhibits steadier improvement at the early stage of training upon introducing the spectrum fitting losses (Supplementary Fig. 6). Through numerical simulations, we found that including these loss terms significantly reduces noise in the XANES spectra, particularly in low-absorption regions such as those depicted in Fig. 1e, j.

ML performance evaluation

To test the performance of our ML method, we apply the trained model to the same set of XANES imaging data previously presented in Fig. 1. The reprocessed results from the first measurement are shown in Fig. 5a–d, and those from the second measurement of the same sample area are in Fig. 5e–h. Compared to the traditional approach, the ML method yields markedly cleaner images and spectra. Figure 5a, b, e, f shows a very even background outside the particle area, and the spectra from the two particle ROIs (squared regions in Fig. 5a, e) exhibit significant improvement in quality over the original results (see Fig. 5c, d, g, h vs. Fig. 1d, e, i, j). In particular, the spectrum of ROI #2, which has a low Ni concentration, is no longer noisy and can be fitted well by the Ni2+ and Ni3+ reference spectra (Fig. 5d, h). Furthermore, ML denoising also generates consistent analysis results from the XANES images. This can be seen in the almost identical mappings of the Ni3+ fraction (Fig. 5i vs. 5j) calculated from the two separate measurements of the same area, which is in sharp contrast to the vastly different Ni3+ mappings (Fig. 1f vs. 1k) when the traditional approach is used. The changes in the Ni3+ mapping results of the two measurements before and after ML correction are compared in Supplementary Note 7 and Supplementary Fig. 7. Figure 5k shows the ratio of the ML-corrected Ni3+ fractions obtained from the two measurements. The pixel-wise values extracted from the particle area in Fig. 5k are plotted in the histogram (Fig. 5l), which approximates a Gaussian distribution with a mean value of 1 and a standard deviation (σ) of 0.006. The narrow distribution implies that the difference between the two measurements is less than 1.8% (3σ) for 99.7% of the particle area. The above test demonstrates that our physics-informed ML model successfully learned how to separate the smooth background from the noisy TXM data and could significantly enhance the accuracy of XANES analysis.
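The 1.8% figure follows from the three-sigma rule for a Gaussian distribution: 3 × 0.006 = 0.018, and about 99.7% of a Gaussian population lies within three standard deviations of the mean. A short check on synthetic numbers is sketched below; the samples are drawn from a Gaussian with the reported mean and standard deviation and are not the pixel values of Fig. 5k.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in for the pixel-wise ratio map: Gaussian, mean 1, sigma 0.006.
ratio = rng.normal(1.0, 0.006, 200_000)

sigma = ratio.std()
bound = 3.0 * sigma  # three-sigma deviation from the mean
within = np.mean(np.abs(ratio - ratio.mean()) <= bound)

print(f"sigma = {sigma:.4f}, 3*sigma = {100 * bound:.2f}%, "
      f"fraction of pixels within 3*sigma = {100 * within:.2f}%")
```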

Fig. 5: XANES images after ML correction.

a, b ML-corrected XANES images from Measurement I (Fig. 1b–d) at 8.3 and 8.6 keV. Colorbars represent the normalized intensity. c, d Spectra of ROI 1 and ROI 2 extracted from Measurement I after ML correction. Spectra are fitted by least-squares fitting using NiO (ref. Ni2+) and LiNiO2 (ref. Ni3+) as reference spectra. e, f ML-corrected XANES images from Measurement II (Fig. 1g, h) at 8.3 and 8.6 keV. g, h Spectra of ROI 1 and ROI 2 extracted from Measurement II after ML correction. i Ni3+ fraction from Measurement I after ML correction. j Ni3+ fraction from Measurement II after ML correction. Colorbars in (i, j) represent the ratio of the fitted coefficient for Ni3+, calculated by CNi3+/(CNi2+ + CNi3+). k Pixel-wise ratio of the Ni3+ fractions, calculated by dividing the pixel value from (i) by the pixel value from (j). l Histogram of the pixel-wise Ni3+ fraction ratio extracted from the particle area as shown in (k). The x-axis is the fraction ratio, and the y-axis indicates the number of pixels. Scale bars are 10 µm.

Improve the model transferability with self-supervised learning

Although the model was trained using simulated Ni XANES image stacks, it can be applied to many other XANES measurements at different absorption edges. Supplementary Note 8 and Supplementary Figs. 8–11 present the results of using the pre-trained model to correct noisy experimental XANES images for Ni (LiNi0.85Co0.1Mn0.05O2), Co (LiCoO2), Mn (LiNi0.5Co0.2Mn0.3O2), and Fe (Fe + Fe2O3), respectively, demonstrating the generality of the model. However, when applying the model to another Co XANES dataset taken for the material LiNi0.85Co0.1Mn0.05O2, the Co spectrum does not come out as desired. Only slight improvements are found after ML correction (Fig. 6b vs. Fig. 6c). We hypothesize that this relatively poor performance originates from the small signal response (change) across the Co absorption edge due to its low concentration (~10%). In this case, the residual error from imperfect background normalization after ML correction may have a magnitude comparable to the signal variation in the Co spectrum, since the model was not trained for this particular dataset (background pattern). To further enhance the ML performance on Co, rather than generating entirely new labeled synthetic noisy images based on Co, we explore the potential of the model in self-supervised learning. While supervised learning typically yields higher accuracy when provided with abundant labeled data representative of the target task, self-supervised learning stands out for its superior data efficiency. Therefore, in this section, we introduce a secondary “production” step aimed at constructing the new training dataset directly from the Co XANES images and fine-tuning the model parameters accordingly.

Fig. 6: Self-supervised learning to improve the ML transferability.

a X-ray absorption image of a LiNi0.85Co0.1Mn0.05O2 particle at 8 keV (above Co K-edge). Scale bar is 10 µm. b Original XANES spectrum of the ROI indicated by the open square in (a). c ROI spectrum after background removal using the trained ML model. d–f ROI spectrum after 1, 2, and 3 iterations of the self-supervised learning process. Reference spectra from CoO and LiCoO2 were used in fitting the Co XANES shown in (f). g Schematic of the pre-edge (E1 to Ep) and post-edge (Eq to Em) energies used in self-supervised learning. h Diagram of the iterative self-supervised learning procedure.

In the production step, we do not have the ground truth images needed to compose the loss terms L1, L2, and L5 in Fig. 4a. Instead, we extract a portion of the image stack selected from the pre-edge and post-edge of the XANES spectrum to form the new training dataset and use the idea of the invariant absorption “thickness” to design our ML workflow (Fig. 6h). First, raw projection images are fed into the existing ML model to get preliminarily denoised images (“Cal. Img1” in Fig. 6h). Secondly, a “thickness” image corresponding to the parameter A in Eq. (1) is fitted using the current XANES image stack at all energies. Thirdly, we generate a few new training batches (e.g., 20 batches), each composed of 16 images randomly selected from the pre-edge and post-edge of the XANES spectrum (Fig. 6g), and feed them into the neural network. Two loss terms are constructed to update the model. The first loss term (L1 in Fig. 6h) minimizes the total variation of the background images predicted from the model (“Cal. Img” in Fig. 6h), which regulates the image smoothness. The second loss term (L2 in Fig. 6h) minimizes the difference between “Cal. Img” and the fitted image (“Fitted Img” in Fig. 6h), which is reconstructed using the fitting parameters (A, B, C) while the parameter A is fixed and substituted by the “thickness” fitted from the whole stack as described in the second step. In this way, we keep the “thickness” unchanged at each learning step. After the first iteration, the model shows significant improvement in the spectral SNR (Fig. 6d). The results after the second and third iterations are shown in Fig. 6e, f. Higher spectral SNR indicates improved performance. For example, when compared to the Co3+ concentration fitted from the raw dataset (Supplementary Fig. 12a), the Co3+ concentration obtained from the ML-corrected dataset shows reduced fitting variance and, consequently, less uncertainty (Supplementary Fig. 12e–h). Additional information can be found in Supplementary Notes 5 and 9.
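The ingredients of the production step (batch sampling from pre- and post-edge energies, a total-variation term, and a fit of Eq. (1) with A held fixed) can be sketched in NumPy as follows. This is a sketch under assumptions, not the training code: the energy grid, the pre-edge and post-edge windows, the tanh stand-in for µ(E), and all parameter values are hypothetical, and a real implementation would evaluate these terms on network outputs inside a differentiable framework.

```python
import numpy as np

def total_variation(img):
    """Anisotropic total variation: mean absolute neighbour difference."""
    return float(np.abs(np.diff(img, axis=0)).mean()
                 + np.abs(np.diff(img, axis=1)).mean())

def fit_with_fixed_thickness(stack, mu, energy, thickness):
    """Fit only B and C of Eq. (1) per pixel; A is held at `thickness`."""
    n_e, ny, nx = stack.shape
    edge_part = thickness[None] * mu[:, None, None]
    design = np.column_stack([energy ** -3.0, np.ones_like(energy)])
    bc, *_ = np.linalg.lstsq(design, (stack - edge_part).reshape(n_e, -1), rcond=None)
    return (design @ bc).reshape(n_e, ny, nx) + edge_part

def sample_batch(rng, pre_idx, post_idx, batch_size=16):
    """Randomly pick `batch_size` distinct energies from pre- and post-edge."""
    pool = np.concatenate([pre_idx, post_idx])
    return np.sort(rng.choice(pool, size=batch_size, replace=False))

rng = np.random.default_rng(4)
energy = np.linspace(7.6, 7.9, 60)                      # keV, hypothetical Co scan
mu = 0.5 * (1.0 + np.tanh((energy - 7.709) / 0.004))    # placeholder for mu(E)
pre_idx = np.where(energy < 7.69)[0]                    # assumed pre-edge window
post_idx = np.where(energy > 7.75)[0]                   # assumed post-edge window

ny = nx = 8
thickness = rng.uniform(0.02, 0.08, (ny, nx))           # "thickness" from full stack
cal_stack = (thickness[None] * mu[:, None, None]
             + 40.0 * energy[:, None, None] ** -3.0 + 0.01)

idx = sample_batch(rng, pre_idx, post_idx)
fitted = fit_with_fixed_thickness(cal_stack[idx], mu[idx], energy[idx], thickness)
loss_fit = float(np.mean((fitted - cal_stack[idx]) ** 2))   # analogue of L2 in Fig. 6h

smooth_bkg = np.ones((ny, nx))
noisy_bkg = smooth_bkg + rng.normal(0.0, 0.01, (ny, nx))
loss_tv_smooth = total_variation(smooth_bkg)                # analogue of L1 in Fig. 6h
loss_tv_noisy = total_variation(noisy_bkg)
```

Holding A fixed is what lets a batch containing only pre-edge and post-edge images constrain the fit: the edge jump between the two groups is set by the thickness map rather than refitted from 16 images.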

Using self-supervised learning, we successfully determined the distribution of Co valence states in pristine LiNi0.85Co0.1Mn0.05O2 particles from Co XANES data (Fig. 7a). Surprisingly, we found that the Co valence state varied from ~+2.7 to +3, deviating from the common belief that Co has a +3 valence in LiNixCoyMnzO2. Similarly, Fig. 7b shows that Ni also displayed a non-uniform distribution of its valence states. We used the K-means clustering method to identify three major clusters. Their spatial distribution (Fig. 7c) reveals the correlation of valence states between Co and Ni. The first cluster, in the central part of the particle (green color), had an average valence state of +2.93 for Co and +2.96 for Ni. The second cluster, mainly from the bottom of the particle (orange color), had average valence states of +2.84 for Co and +2.89 for Ni. The third cluster, primarily from the left and right sides of the particle (blue color), had average valence states of +3 for Co and +2.9 for Ni. Figure 7d plots the atomic concentration ratio Co/(Co + Ni), calculated from the “thickness” fitting for both Co and Ni (illustrated in Fig. 4b). Notably, Co had a relatively low concentration in the region of cluster 3 and a relatively high concentration in the region of cluster 2. It is interesting to observe the correlation between the metal valence state (especially Co) and the atomic concentration, which is apparent in the combined plot shown in Fig. 7e. For example, higher concentrations of Co tended to coincide with lower Co valence states (cluster 2). The heterogeneous distribution of valence states and the underlying correlation with the atomic ratio of metal species warrant further study. A comparison of the XANES fitting using raw data and denoised data can be found in Supplementary Note 10 and Supplementary Figs. 13 and 14. This example demonstrates how XANES imaging can provide valuable insights into the relationship between synthesis processes and material properties. It also shows that ML techniques can critically improve the data quality and enable detailed analysis that could not otherwise be accomplished by conventional methods.
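The clustering step can be illustrated with a small K-means routine applied to pixel-wise (Co valence, Ni valence) pairs. The sketch below is a plain NumPy implementation of Lloyd's algorithm with a deterministic farthest-point initialization, which is a choice made here for reproducibility; the original analysis may have used a library implementation with different settings. The input points are synthetic, generated around the three cluster averages quoted above with an assumed spread of 0.005.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

rng = np.random.default_rng(5)

# Cluster averages (Co valence, Ni valence) quoted in the text.
reported = np.array([[2.93, 2.96], [2.84, 2.89], [3.00, 2.90]])

# Synthetic pixels: 200 per cluster with an assumed spread of 0.005.
points = np.vstack([c + rng.normal(0.0, 0.005, (200, 2)) for c in reported])

labels, centers = kmeans(points, k=3)
centers_sorted = centers[np.argsort(centers[:, 0])]
print(np.round(centers_sorted, 2))
```

In practice each row of `points` would be one pixel inside the particle, and `labels` would be reshaped back onto the image grid to produce a segmentation map like Fig. 7c.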

Fig. 7: Valence (oxidation) state of Co and Ni in LiNi0.85Co0.1Mn0.05O2.

a Distribution of Co valence state fitted from the Co XANES dataset after denoising by self-supervised learning (after 3 iterations). Reference spectra from CoO and LiCoO2 were used in fitting the Co XANES. b Distribution of Ni valence state fitted from the XANES dataset after denoising using the pre-trained neural network. Reference spectra from NiO and LiNiO2 were used in fitting the Ni XANES. c Segmented clusters generated using the K-means clustering method applied to the Co and Ni valence state maps shown in (a) and (b). d Atomic ratio of Co/(Co + Ni) derived from the thickness fitting of Co and Ni XANES. e Pixel-wise correlation of the Co and Ni valence states. The color of the dots indicates the atomic ratio of Co/(Co + Ni). The three areas indicated by “1”, “2” and “3” refer to the three clusters shown in (c). Scale bars in (a–d) are 5 µm.

Discussion on model generality

At present, we use a simulated Ni absorption spectrum to train our neural network model. While it may seem tempting to broaden the training dataset to encompass absorption spectra from various elements to enhance the model’s generality, we argue that this is unnecessary. This is primarily because there is no characteristic difference in the simulated absorption spectrum for Ni, Mn, Co, or other elements. By employing the absorption coefficients derived from the Xraylib database, only the step-like edge jump is retained, while all high-frequency features (the wiggles) are discarded. As illustrated in Fig. 4b, the blue curve represents the simulated spectrum, which can mimic the absorption spectrum of any element if we merely adjust the energy value on the x-axis. This explains why the model trained on Ni data can be directly applied to Co, Fe, and Mn in most instances, as shown in Supplementary Figs. 8–10.

In some cases, applying the pre-trained model directly to new data does not produce satisfactory outcomes, as demonstrated in Fig. 6c. We speculate that there are distinct morphological features in the new sample data that were not identified during training. This prompts us to proceed to the second step of training, which we term the “production” step. This is a self-supervised learning step, specifically tailored for the new experimental data without the need for labeled data and ground truth, thereby substantially boosting the model’s transferability.

In summary, we have presented a physics-informed ML method to recover the noisy XANES images typically collected at synchrotron beamlines. The method successfully delivered consistent and robust quantitative analysis of the Ni oxidation states from two measurements with distinct noise levels. Furthermore, we developed a self-supervised machine learning architecture that could adapt to new datasets without introducing additional training data. This approach enhances the signal-to-noise ratio of the spectra from low-concentration species, which is beyond the capabilities of conventional techniques. We demonstrated this by successfully characterizing the valence states of low-content Co in LiNi0.85Co0.1Mn0.05O2, revealing insights into the correlation between composition and valence state of metal species. Looking forward, the proposed ML method not only improves the detection sensitivity of TXM-XANES but also has the potential to substantially increase the speed of data collection. Conventionally, a pair of object and background images is collected at each energy, incurring significant time overhead and sample-registration error from moving the sample in and out of the beam. With the proposed ML method, it is possible to collect the background images at all energies in advance, and then normalize all subsequent XANES images using the pre-taken background images. This approach eliminates the need for moving the sample in and out at each energy, which reduces the time overhead and registration errors and allows for faster data collection. This is particularly useful for experiments studying dynamic changes in materials, such as electrode materials under battery operation at high charging/discharging speeds. It is estimated that the TXM-XANES experiment time can be reduced by half, providing significant benefits to these types of studies. Finally, although we focused our ML approach on TXM-XANES, the idea of exploring the underlying physics and incorporating it into ML may inspire many other scientific imaging analyses to take advantage of the rapid development in machine learning.

Methods

Preparation for Ni(OH)2 + LiOH precursor

The Ni(OH)2 precursor was mixed with LiOH·H2O at a molar ratio of 1:1.05 using an acoustic mixer. The mixed powders were preheated with continuous oxygen flow in a tube furnace at 400 °C for 4 h. A heating rate of 1 °C/min was used to reach the temperature set point. During the preheat treatment, the hydroxide precursors underwent thermal decomposition and partial oxidation of Ni atoms.

Preparation of LiNi0.85Co0.1Mn0.05O2

The secondary-particle precursors of Ni0.85Co0.1Mn0.05(OH)2 were synthesized via a co-precipitation method. Appropriate amounts of NiSO4·6H2O, CoSO4·7H2O and MnSO4·H2O were used to prepare a mixed aqueous solution with a concentration of 2.0 mol l−1. This mixed aqueous solution was pumped into a homemade continuously stirred tank reactor (4 l) under an N2 atmosphere. Concurrently, a 4.0 mol l−1 NaOH solution (aqueous) and the desired amounts of NH4OH solution (aqueous) as a chelating agent were separately pumped into the reactor. The pH value of the precursor solution was kept at 11, the temperature at 60 °C, and the stirring speed at 1000 r.p.m. The precursor powders were obtained by filtering, washing, and vacuum drying in an oven overnight. LiNi0.85Co0.1Mn0.05O2 was prepared by thoroughly mixing the corresponding precursors with an appropriate amount of LiOH·H2O (Sigma-Aldrich, ≥99%, molar ratio of Li:[Ni + Mn + Co] = 1.03:1), followed by calcination at 720 °C for 12 h under an oxygen atmosphere.