Introduction

Non-negative matrix factorization (NMF) is an unsupervised machine learning method used for decomposing compressed data. NMF extracts distinct components from related signal sets in various research fields, including signal processing1, biomedical engineering2, pattern recognition3, and image engineering4. NMF differs from principal component analysis (PCA)5 by applying positivity constraints on the extracted components and their weights. It is therefore attractive for finding components that resemble physical signals in cases where the positivity constraints are expected to hold. In crystallography, NMF has demonstrated significant potential in finding physically plausible structural signals from diffraction data collected from in situ chemical reactions6,7,8. Recently, NMF has also been used for in situ time-dependent diffraction measurements9,10, spatially resolved electron diffraction maps11, single-layer nanosheets12, integrated multimodal analysis13, and metal-organic frameworks14,15.

In many scientific fields, including chemistry and materials science, the need to analyze data exhibiting stretching phenomena is paramount. For instance, in temperature series experiments, the stretching of peak positions in diffraction patterns or atomic pair distribution function (PDF) data can provide crucial insights into structural changes induced by varying temperatures. Conventional NMF, constrained by the assumption of fixed components, struggles to model such stretching behavior, which makes it difficult for researchers to use the extracted mathematical components for subsequent component analysis and structural identification.

To address this limitation, extended NMF models have been proposed. One such model is Shifted NMF, which accounts for shifts in the onset of a frequency profile, such as those induced by the Doppler effect in spectrometry data16. However, Shifted NMF cannot solve the temperature series data problem because the change in the component is a stretch, not a shift. Another approach is to incorporate stretching regression steps into the analysis workflow17. There is also a method that uses statistical information to obtain the stretching and align peak positions, combined with PCA, to decompose X-ray diffraction (XRD) data18. Despite these improvements, an NMF algorithm that itself accounts for stretching, which is the most direct and effective approach, is still lacking.

In this paper, we propose a new extended NMF model called stretchedNMF that addresses stretching within the algorithm itself. We introduce a stretching factor matrix that describes the stretching scale of each component, so that each component may have a different overall stretching factor at each point in the series. stretchedNMF thereby accounts for a simple stretching of the measured signal and returns only components that explain variability beyond this stretching.

We first develop the mathematical formulas of stretchedNMF in the form of functional optimization. We present the method of discretization and the optimization algorithm. An enhanced version, sparse-stretchedNMF, optimized for powder diffraction data from crystalline materials, leverages signal sparsity for accurate extraction, especially with small stretches. Then, using both simulated and real data, we show that stretchedNMF and sparse-stretchedNMF significantly outperform conventional NMF in the case of diffraction data with thermal expansion. Furthermore, we show that the algorithm may be used to extract different chemical components from the data if there are multiple components that have differential thermal expansivities. This gives an interesting possibility for extracting the components in a multi-phase sample from a temperature-dependent measurement of that sample, even when those components are not changing chemically during the measurement. Although we focused on diffraction signals from temperature series data, the algorithm may be used for any case where part of the changes to the signal are exactly, or approximately, a stretch of its dependent variable.

Results

Diffraction use case

Here we test the approach using both simulated and real X-ray powder diffraction (PXRD) data19,20 and atomic pair distribution function (PDF)21 data. PXRD and PDF patterns are continuous 1D signals that encode the 3D arrangement of atoms in a material. PDF data represents the probability distribution of atomic distances, while XRD patterns exist in reciprocal space, meaning that the positions and intensities of their diffraction peaks vary with diffraction angle. We assume a situation where the PDF and PXRD patterns have been measured for samples as a function of temperature and are undergoing thermal expansion, where the thermal expansion coefficient of each phase is different. The thermal expansion causes Bragg peaks in the PXRD, and peaks in the PDF, to change their positions. In principle, thermal expansion can be different along different directions of the crystal, but often it is quite isotropic and appears as a stretching of the pattern where the peak shifts increase with increasing distance along the independent variable axis, as required for this algorithm to work. This makes it an interesting use-case for stretchedNMF, though we note that stretchedNMF may be applied to any series of signals where one aspect of the variability is a continuous stretching on the axis of the independent variable.

An intuitive explanation regarding thermal expansion and the stretching of PDF data is that under thermal expansion, if the overall atomic structure of a material enlarges, then all inter-atomic distances become a multiple a of the original distances, which is reflected in the PDF data as multiplying the horizontal positions of all PDF peaks by a. XRD data functions in reciprocal space. Theoretically, there exists a Fourier transform between the PDF and XRD. Therefore, the stretching factor a in PDF data corresponds to the reciprocal of the stretching factor, 1/a, in XRD data for the same material.
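
The reciprocal relation can be checked numerically for Bragg peak positions. The sketch below assumes the standard relation Q = 2π/d between a d-spacing and its peak position in Q. The lattice parameter, the reflections, and the value of a are illustrative choices, not values from this paper.

```python
import numpy as np

# Illustrative d-spacings (in Å) of a cubic lattice with a hypothetical lattice parameter.
lattice_parameter = 4.0
hkl = np.array([[1, 0, 0], [1, 1, 0], [1, 1, 1], [2, 0, 0]])
d_spacings = lattice_parameter / np.sqrt((hkl ** 2).sum(axis=1))

# Bragg peak positions in reciprocal space, Q = 2*pi/d.
q_peaks = 2 * np.pi / d_spacings

# An isotropic expansion by a factor a multiplies every real-space distance by a ...
a = 1.02
d_expanded = a * d_spacings

# ... so every peak position in Q is multiplied by 1/a.
q_expanded = 2 * np.pi / d_expanded
stretch_in_q = q_expanded / q_peaks
```

Every entry of `stretch_in_q` equals 1/a, independent of the reflection, which is why an isotropic expansion appears as a uniform stretch in both representations.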

The goal of our testing use-case is to see if we can use NMF in general, but stretchedNMF in particular, to separate the chemical components in a binary chemical mixture where the two components have different thermal expansion coefficients. For example, this could be used by a materials scientist to discover the chemical components in a synthesis product by measuring the mixture as a function of temperature and running stretchedNMF on the mixture, where the algorithm returns mathematical components that resemble the PXRD or PDF signals of the actual chemical components. These mathematical components could then be given to algorithms such as the structureMining22 or spacegroupMining23 algorithms that are implemented as a service on the PDFitc.org website24. These algorithms, given an uploaded PDF, will return a rank-ordered list of candidate structures consistent with that PDF. Our test of the algorithm will, therefore, consist of taking either simulated or actual measured data over a wide temperature range from binary mixtures where the components have different thermal expansion coefficients. These signals will be fed to NMF and stretchedNMF to extract two components, which will then be analyzed to see if they resemble the signals from the actual chemical components. In the case of the PDF, an interesting test of this is to take the extracted mathematical components and give them to the structureMining algorithm to see if it correctly identifies the chemical component from the stretchedNMF and conventional NMF extraction.

Simulated data

To evaluate the performance of the stretchedNMF and sparse-stretchedNMF algorithms, we test them on the simulated datasets. The algorithms are designed to be applied to any signal undergoing stretching and we want to understand their performance compared to conventional NMF algorithms. We could choose any simulated signals to do this, but since a primary motivation for the development was to decompose temperature-dependent powder diffraction data we have simulated powder diffraction patterns and PDFs. However, since we are interested in testing the performance when stretches are large and when they are small we simulate data with stretches that are not realistic for real powder data, as well as some more realistic examples later.

We simulate signals by computing the PXRD and the PDF patterns of a weighted sum of a cubic perovskite BaTiO3 and a cubic zinc-blende ZnSe phase. To ensure that both signals contribute comparably we chose an atomic concentration ratio of 0.61:1.00 when initial lattice parameters of BaTiO3 = 4.18 Å and ZnSe = 5.62 Å were used. The crystal structures used for BaTiO3 and ZnSe were taken from refs. 25 and 26, respectively, and downloaded from the Springer Materials database (https://materials.springer.com/isp/crystallographic/docs/sd_0304044, https://materials.springer.com/isp/crystallographic/docs/sd_1929775).

In this example we considered component signals that stretched at a ratio of rates of αBTO/αZS = 2 across the series of signals, with the further assumption that α does not vary with temperature. Although we are not initially using realistic thermal expansions of these materials, this is the approximate ratio of expansivities of these two materials27,28. In detail, we simulated total expansions across the series, in percent, with αBTO:αZS of 20:10, 4:2, and 2:1.

We note that to test purely the effects of stretching, which is the basis of the current NMF modification, we fixed and did not vary the atomic displacement parameters (ADPs) that would result in changes in the attenuation of the PXRD Bragg peaks and broadening of peaks in the simulated PDF. Such effects are likely to be present in real data and may require a further modification to the NMF algorithm in the future but this is beyond the scope of the current paper. This set of simulations assumed no phase transition or chemical reaction to be occurring and the relative weights of the components were not varied in the computed dataset.
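
A dataset of this kind can be sketched as follows. This is a minimal illustration and not the simulation code used for the paper: the two component signals are hypothetical sums of Gaussian peaks rather than computed PXRD or PDF patterns, the weights are set to 1 for simplicity, and the helper names (`gaussian_peaks`, `stretch`, `simulate_series`) are assumptions made for the example. A stretch by a factor a is applied as x(r/a) using linear interpolation.

```python
import numpy as np

def gaussian_peaks(grid, centers, width=0.05):
    """Hypothetical component signal: a sum of unit-height Gaussian peaks."""
    return sum(np.exp(-0.5 * ((grid - c) / width) ** 2) for c in centers)

def stretch(signal, grid, a):
    """Return the signal stretched by a factor a, i.e. x(r / a), on the same grid."""
    return np.interp(grid / a, grid, signal, left=0.0, right=0.0)

def simulate_series(grid, components, weights, final_stretches, n_signals):
    """Fixed-weight sum of components, each stretched linearly from 1 to its final value."""
    data = np.zeros((grid.size, n_signals))
    for m in range(n_signals):
        frac = m / (n_signals - 1)
        for x, w, a_final in zip(components, weights, final_stretches):
            a = 1.0 + frac * (a_final - 1.0)
            data[:, m] += w * stretch(x, grid, a)
    return data

grid = np.linspace(0.0, 10.0, 1001)
comp_1 = gaussian_peaks(grid, [2.0, 3.5, 5.0])  # stands in for BaTiO3
comp_2 = gaussian_peaks(grid, [2.6, 4.2, 6.1])  # stands in for ZnSe

# 20% and 10% total expansions with constant weights, as in the largest-stretch test.
Z = simulate_series(grid, [comp_1, comp_2], [1.0, 1.0], [1.20, 1.10], n_signals=20)
```

Each column of `Z` is one signal in the series. Only the stretch factors change from column to column, which mirrors the constant-weight, constant-ADP design described above.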

The PXRD patterns were simulated using Dans-Diffraction29. Pseudo-Voigt lineshapes were used. The PDFs were simulated using Diffpy-CMI30. The code used to generate the PDFs can be found at https://github.com/yevgenyr/diffpysim. The static set of parameters used for the simulations is reproduced in Table 1. Representative PXRD and PDF patterns are shown in Fig. 1.

Table 1 The static set of parameters that were used for PXRD and PDF simulations
Fig. 1: Example simulated signals used in the tests.

These were computed from a linear combination of the signals of BaTiO3 and ZnSe, where each was computed with a linearly expanding cubic lattice parameter. The curves are plotted offset from one another, expanding from the bottom to the top of the figure. In the curves shown, overall expansion values of 20% and 10% were used from the first to the last curve. The top panel shows the simulated XRD and the bottom panel shows the PDF.

Results on Simulated PDF

First, we compare the performance of the conventional NMF and the stretchedNMF on simulated PDF data. The PDFs were generated by a combination of two components, namely simulated BaTiO3 and ZnSe. The weight coefficients for each component were set as constants. We also assigned different linearly increasing rates for the thermal expansion of BaTiO3 and ZnSe. Specifically, we used artificially generated rates such that BaTiO3 and ZnSe expand linearly from the first PDF to the last, with 20% and 10% expansions, respectively.

We then applied the conventional NMF and stretchedNMF methods to extract two components from the simulated PDF data. These could then be compared with the ground-truth PDFs. In principle, any of the ground-truth PDFs could be picked, as we apply a stretching factor to the NMF component signal before the comparison. In this study, we selected the ground-truth PDF that resulted in the minimal residual when using only the scale factor variable. We further optimized the agreement between the NMF component and the selected ground-truth PDF by varying both the scale-factor and stretch-factor variables.
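
The comparison step can be sketched as below. This is an illustration under stated assumptions rather than the procedure used to produce the tables: Rw is taken here to be the unweighted residual sqrt(Σ(truth − s·comp)² / Σ truth²), the scale factor s is obtained in closed form by least squares, and the stretch factor is found by a simple grid search. All function names and the test signal are hypothetical.

```python
import numpy as np

def stretch(signal, grid, a):
    """Return the signal stretched by a factor a, i.e. x(r / a), on the same grid."""
    return np.interp(grid / a, grid, signal, left=0.0, right=0.0)

def best_scale(component, truth):
    """Least-squares scale s that minimises ||truth - s * component||."""
    return float(component @ truth) / float(component @ component)

def rw(component, truth):
    """Residual between the truth and the optimally scaled component (assumed Rw form)."""
    s = best_scale(component, truth)
    return float(np.sqrt(np.sum((truth - s * component) ** 2) / np.sum(truth ** 2)))

def fit_scale_and_stretch(component, truth, grid, stretches):
    """Grid search over stretch factors; the scale is refitted for every trial stretch."""
    scores = [rw(stretch(component, grid, a), truth) for a in stretches]
    best = int(np.argmin(scores))
    return float(stretches[best]), scores[best]

grid = np.linspace(0.0, 10.0, 1001)
truth = (np.exp(-0.5 * ((grid - 3.0) / 0.05) ** 2)
         + np.exp(-0.5 * ((grid - 6.0) / 0.05) ** 2))

# A stand-in "extracted component" that differs from the truth only by scale and stretch.
component = 0.5 * stretch(truth, grid, 1.0 / 1.05)

a_fit, rw_fit = fit_scale_and_stretch(component, truth, grid, np.linspace(0.9, 1.1, 201))
```

Because the scale has a closed-form optimum for each trial stretch, only the stretch factor needs to be searched. A gradient-based optimizer could replace the grid search when finer resolution is needed.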

We first evaluated the outcomes of the conventional NMF approach. These findings are illustrated in Fig. 2a–d and Table 2.

Fig. 2: Comparison of NMF-extracted PDF signals (red) and ground truth PDFs (blue).

The top 2 × 2 block (a–d) shows the extraction for the conventional NMF algorithm and the bottom block (e–h) is the extraction for the stretchedNMF algorithm. The exact curves are indicated by the legend in each case, where Truth I and Truth II are the simulated ground truth curves I and II and Comp I and II indicate the first and second extracted components in each case.

Table 2 Results of the comparison between the NMF extracted components and the ground-truth PDFs on simulated PDF data test with 20% and 10% expansions

Figure 2 (a–d) depicts the resulting PDFs in a matrix layout, with the NMF extracted components being represented as rows (in red) and the ground-truth PDFs as columns (in blue). The difference curves (ground-truth - NMF component) are plotted below in green. Large residuals and large Rw factors are evident between all the NMF components and the ground-truth curves and the NMF extraction has failed to produce components that resemble the actual signals. This is not surprising since the weights of the two components are not varying in the test.
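
The role of the constant weights can be seen from the rank of the data matrix. In the sketch below, which uses hypothetical Gaussian-peak components rather than the simulated PDFs, a fixed-weight mixture without stretching gives identical columns, so the matrix has rank one and no two-component factorization can identify the individual phases. Once the components stretch, far more than two fixed components are needed to describe the matrix, so a two-component conventional NMF can only return mixtures of displaced peaks.

```python
import numpy as np

def stretch(signal, grid, a):
    """Return the signal stretched by a factor a, i.e. x(r / a), on the same grid."""
    return np.interp(grid / a, grid, signal, left=0.0, right=0.0)

def peaks(grid, centers, width=0.05):
    """Hypothetical component signal: a sum of unit-height Gaussian peaks."""
    return sum(np.exp(-0.5 * ((grid - c) / width) ** 2) for c in centers)

def numerical_rank(matrix, rel_tol=1e-3):
    """Number of singular values above rel_tol times the largest one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

grid = np.linspace(0.0, 10.0, 1001)
x1 = peaks(grid, [2.0, 3.5, 5.0])
x2 = peaks(grid, [2.6, 4.2, 6.1])
fracs = np.linspace(0.0, 1.0, 20)

# Constant weights and no stretching: every column is the same signal.
static = np.column_stack([x1 + x2 for _ in fracs])

# Constant weights with 20% and 10% linear stretches across the series.
stretched = np.column_stack(
    [stretch(x1, grid, 1 + 0.20 * f) + stretch(x2, grid, 1 + 0.10 * f) for f in fracs]
)

rank_static = numerical_rank(static)
rank_stretched = numerical_rank(stretched)
```

Here `rank_static` is one while `rank_stretched` is well above two, which is consistent with the large residuals seen for the conventional NMF components.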

The same test was applied using the stretchedNMF algorithm and the results are shown in Fig. 2e–h and Table 2. In this case we can see that the stretchedNMF extracted signal I is closely related to the ground truth component I and likewise for the component II. This is evident as a very flat difference curve in Fig. 2e, h and small Rw for these pairings in Table 2.

This shows that even in the absence of changes in component weights the stretchedNMF algorithm can extract components just from a differential stretching of the structure signal.

Results on simulated PXRD

We carry out the same comparison of NMF vs stretchedNMF for the case of powder diffraction signals. Similar to the simulated PDF case, the data consist of a combination of simulated BaTiO3 and ZnSe, where BaTiO3 and ZnSe have 20% and 10% linearly varying expansions, respectively.

The results of the comparison are presented in Fig. 3 and Table 3.

Fig. 3: NMF and stretchedNMF solutions on simulated PXRD data.

Truth I is the simulated powder diffraction pattern for BaTiO3 and Truth II is for ZnSe. The blue and red curves represent the ground-truth PXRD patterns and the extracted NMF components, respectively. Panels (a–d) show the comparisons for components extracted with conventional NMF and (e–h) those for components extracted using the stretchedNMF algorithm.

Table 3 Comparison between the NMF extracted components and the ground-truth PXRDs on the simulated PXRD data set with 20% and 10% expansions

As is evident in Fig. 3a–d, none of the extracted conventional NMF components resemble the ground-truth curves. Again, this is not a surprise because the weights of the components are not changing. However, for the stretchedNMF extraction we see that the first extracted component (Comp I) corresponds well to the BaTiO3 pattern (Truth I), and the second extracted component (Comp II) corresponds well to the ZnSe diffraction pattern (Truth II) (Fig. 3e, h).

As with the simulated PDF data, the stretchedNMF algorithm can extract components resembling the physical signals from a phase mixture where the weights are not changing but there is a variable thermal expansion.

Results on simulated PDF and PXRD data with small expansion coefficients

The tests above show that even in the presence of large stretches of signals stretchedNMF can automatically extract signals that resemble real physical signals whereas conventional NMF cannot, at least in the case where the component weights are not changing.

We now would like to see how well stretchedNMF can perform for smaller stretching factors, for example, for magnitudes that might occur in physical systems due to thermal expansion. The simulated data is still taken as the combination of BaTiO3 and ZnSe. The weights are set to constants as before. However, in this example we set the thermal expansion rates of BaTiO3 and ZnSe to 4% and 2%, respectively. Both simulated PDF and PXRD are tested.

First, we compare the performance of the conventional NMF and the stretchedNMF on simulated PDF data. The results are presented in Table 4 and Fig. 4.

Table 4 Results of the comparison between the NMF extracted components and the ground-truth PDFs on simulated PDF data sets with 2% expansions on ZnSe
Fig. 4: NMF and stretchedNMF solutions on simulated PDF data with 2% expansions on ZnSe.

Extracted components compared to the ground-truth components for PDFs computed from the structures of BaTiO3 (blue curve in a and c) and ZnSe (blue curve in b and d). Extracted signals done with conventional NMF (red curves in a and b) and stretchedNMF (red curves in c and d).

Unlike the previous figures we just plot the agreement of the extracted component and the ground-truth curve that shows the best agreement. The poor performance of the conventional NMF is evident in Fig. 4a, b, whereas again, even for this much smaller stretch, the stretchedNMF algorithm still gives a good extraction of the physical components (Fig. 4c, d).

We get the same overall result for the test on simulated PXRD data as for the PDF data. The results are shown in Fig. 5 and Table 5. Again, stretchedNMF gives a very good extraction of the physical components even for this small relative expansion coefficient (Fig. 5c, d), whereas conventional NMF does not (Fig. 5a, b).

Fig. 5: NMF and stretchedNMF solutions on simulated PXRD data with 2% expansions on ZnSe.

Extracted components compared to the ground-truth components for PXRD patterns computed from the structures of BaTiO3 (blue curve in a and c) and ZnSe (blue curve in b and d). Signals extracted with conventional NMF (red curves in a and b) and stretchedNMF (red curves in c and d).

Table 5 Results of the comparison between the NMF extracted components and the ground-truth PXRDs on simulated PXRD data test with 2% expansions on ZnSe

The results are less ideal when the expansions are reduced further, with the stretch factors of BaTiO3 and ZnSe changing linearly from 1 to 1.02 and 1.01, respectively. The results are summarized in Table 6 and Fig. 6.

Table 6 Results of the comparison between the NMF extracted components and the ground-truth PXRDs on simulated PXRD data test with a 1% differential expansion between the components
Fig. 6: Comparison of NMF-extracted solutions on simulated PXRD data with 1% expansions on ZnSe.

Extracted components compared to the ground-truth components for PXRD patterns with 2% and 1% expansions, computed from the structures of BaTiO3 (blue curve in a, c, and e) and ZnSe (blue curve in b, d, and f). Signals extracted with conventional NMF (red curves in a and b), stretchedNMF (red curves in c and d) and sparse-stretchedNMF (red curves in e and f).

At this level of expansion, even the stretchedNMF is not correctly extracting the physical components. For example, it incorrectly assigns peaks in the spectrum of its extracted components in red at around Q = 1.5, 2, and 2.5 Å−1 (Fig. 6c, d). These same peaks are partially misassigned by the conventional NMF algorithm.

However, the sparse-stretchedNMF algorithm does a good job of extracting physical components from the PXRD simulations (Fig. 6e and f) even in this challenging case with a relatively small (1%) differential expansion. The components extracted by sparse-stretchedNMF are close to the ground truth. This indicates that sparse-stretchedNMF can enhance the performance of stretchedNMF.

These tests show that the stretchedNMF algorithm is able to extract physically meaningful PDF and PXRD signals from sets of data where the signals are unchanged except for a different relative stretch between the two curves. If there is a large differential change in lattice parameter across the dataset, stretchedNMF can still extract ground-truth PDF and PXRD signals. For relative stretches of a few percent, comparable to what might be expected for a mixture of compounds with a differential thermal expansion, this is also true for both PDF and PXRD data. When the differential thermal expansion gets to around 1%, stretchedNMF starts to struggle to extract physical components. However, for PXRD data the sparse-stretchedNMF algorithm still performs well. We note that the PDF data is not sparse, and therefore the sparse-stretchedNMF algorithm is applied only to PXRD data.

We should note that in these ground-truth tests on simulated data we wanted to test how well stretchedNMF can handle datasets that contain stretches, for example, as might come from thermal expansion. We, therefore, did not include in the simulation other effects of temperature changes, such as increases in atomic displacement factors (ADPs). In principle, we would like to develop a new algorithm that can eliminate changes in ADP in the same way as stretchedNMF eliminates stretches. This problem will be left for a future paper. Preliminary tests on simulated data with combined stretching and increased-ADP effects indicate that stretchedNMF and sparse-stretchedNMF still perform reasonably well and clearly outperform the conventional NMF algorithm, but with larger errors than in the constant-ADP tests reported here. Despite this known shortcoming, we would still like to see whether stretchedNMF and sparse-stretchedNMF can perform well on experimental data from a variable temperature experiment, and this is discussed in the following section.

Experimental PXRD - thermal expansion

To test the stretchedNMF and sparse-stretchedNMF algorithms on real data, we use part of an in situ solid-state synthesis reaction dataset where no phase-transition or chemical reaction occurred but which spanned a rather broad temperature range. This allows us to evaluate how the algorithms perform for the effect of thermal expansion of a phase mixture from real data. The PXRD experiment was done at the 28-ID-2 beamline (XPD instrument) at the NSLS-II facility at Brookhaven National Laboratory. A large area 2D Perkin Elmer detector was used to acquire the data. To gain high spectral resolution in the PXRD, the distance between the sample and the detector was set to 144 cm. The beam wavelength was 0.1949 Å.

A stoichiometric mixture of 2:1 YOCl (>98% tetragonal phase) and MgMn2O4 (spinel phase) was uniformly mixed and sealed in a quartz capillary. It was then heated in a gradient furnace, meaning that each location on the quartz tube had a different temperature31. The absolute temperatures at each point along the sample were calibrated from the lattice expansion of a known calibration material, Ni. The data ranged from a lowest temperature of 368 °C to a highest temperature of 668 °C, with a total of 20 individual temperature points. Using ‘pyFAI’32, the collected 2D diffraction patterns were then cleaned by masking the beam-stop and over-bright/dead pixels, followed by an azimuthal integration to obtain 1D PXRD patterns. The 1D PXRD data were then used as inputs to the different NMF algorithms.

Multi-component Rietveld refinements were carried out and indicate that the chemical components in this reaction are MgMn2O4, orthorhombic YMnO3, and rhombohedral and tetragonal YOCl (rYOCl and tYOCl, respectively) where MgMn2O4 and tYOCl are the dominant phases. The results of the Rietveld refinements for the two majority phases were used as ground truth against which to compare the performance of the NMF algorithms.

The results are shown in Fig. 7 and the resulting Rw and PC are listed in Table 7.

Fig. 7: Solutions on real PXRD data using conventional NMF, stretchedNMF, and sparse-stretchedNMF.

The first row (a, b, and c) shows MgMn2O4 and the second row (d, e, and f) shows tYOCl. The blue curves are the diffraction patterns obtained for those phases by multi-phase Rietveld refinement and are used as the desired component signal. The red curves represent the extracted NMF components. Signals extracted with conventional NMF (red curves in a and d), stretchedNMF (red curves in b and e) and sparse-stretchedNMF (red curves in c and f).

Table 7 Results of the comparison between the NMF extracted components and the ground truth from Rietveld refinement on real PXRD data test

In Fig. 7, the blue curves in the top row (a, b, c) are from the diffraction pattern of MgMn2O4 and the blue curves in the bottom row (d, e, f) are from tYOCl. The red curves in each panel show the relevant extracted component from the NMF algorithm used. The columns are sorted by the NMF algorithm used. The first column (a, d) used regular NMF, the second (b, e) used the stretchedNMF algorithm, and the third column (c, f) used the sparse-stretchedNMF algorithm.

All NMF solvers give reasonable results for the tYOCl chemical component. The peak positions are consistent with the ground truth, and the inconsistency in intensity is acceptable. For MgMn2O4, however, the NMF and stretchedNMF derived components are poor. They are much better using the sparse-stretchedNMF algorithm, which gives better agreement both visually and in terms of the Rw between the ground-truth and the extracted components. In this test, from the perspective of separation ability, sparse-stretchedNMF is superior to stretchedNMF, which in turn is superior to the conventional NMF.

The scaled weights from all NMF solvers are compared to the weights from Rietveld refinement which can be considered as ground-truth. The results are shown in Fig. 8.

Fig. 8: Weights comparison on real PXRD data.

The component weights vs. temperature of the data for (a) MgMn2O4 and (b) tYOCl. The results obtained by a multi-phase Rietveld refinement can be considered as ground-truth and are shown as the dashed purple line. The conventional NMF (red line) cannot capture the temperature evolution of the chemical components. stretchedNMF (green) does much better and sparse-stretchedNMF (blue) yields almost exactly the Rietveld result.

The weights of the chemical components do not show stark changes, but only fluctuate by about 5% around a constant average, and so we would expect the weights to be largely independent of temperature. The conventional NMF clearly does not return constant weights and is getting confused by the thermal expansion in the data. The stretchedNMF and sparse-stretchedNMF methods do yield almost constant weights. Rietveld refinements were carried out on these data-sets and can be treated as a ground-truth. The results of the Rietveld refinement are shown as the dashed curve. stretchedNMF is doing quite well, but sparse-stretchedNMF is doing very well in reproducing the results of the Rietveld refinement.

Experimental PXRD - thermal expansion and reaction

We also tested the NMF algorithms on another PXRD dataset, this time one in which a solid-state chemical reaction occurred together with the thermal expansion, so that the weights of the components as well as the thermal expansion were varying during the experiment. The data were measured as the temperature changed from 28 °C to 370 °C in 215 steps during the reaction of

$${{\rm{CuCl}}}_{2}+{{\rm{Na}}}_{2}{{\rm{Se}}}_{2}\longrightarrow {{\rm{CuSe}}}_{2}+2{\rm{NaCl}}.$$

Here the components involved in the reaction are NaCl, CuSe, Cu2Se, Se, pyrite, and marcasite, as determined by a multi-phase Rietveld refinement33 on the full dataset carried out previously. Rietveld refinement is a process of local optimization of a structural model to give the best agreement between calculated and measured PXRD patterns. The full details of the experiment are published in ref. 34.

The top panel in Fig. 9 depicts the measured PXRD data obtained during the in situ reaction experiment. At a position of 2.2 Å−1, a peak exhibiting an expansion coefficient of approximately 2% is observed. This expansion coefficient closely matches the 2% expansion coefficient we set in our tested simulations in section “Results on simulated PDF and PXRD data with small expansion coefficients”.

Fig. 9: The upper subplot shows the 215 experimental raw data PXRD curves, offset for clarity.

The subplot below plots the ground-truth curves obtained from multi-phase Rietveld refinements34 on the left and the corresponding Rietveld extracted weights in the right hand column, both in blue. The sparse-stretchedNMF extracted components and weights are overlaid in the respective panel in red.

The curves obtained by a multi-phase Rietveld refinement fit34 are shown in blue in the panels below. The Rietveld refined phase weights are shown in blue in the right hand column below34. The components extracted from a sparse-stretchedNMF decomposition are shown in red, plotted on top of the ground-truth components, and the extracted weights are shown in red on top of the Rietveld extracted weights in the right hand column. The results are very good and indicate that, except for Se, the components obtained from sparse-stretchedNMF match the ground truth well in terms of peak positions, as do the extracted weights. The majority of the extracted weights exhibit the same increasing or decreasing trends as the weights obtained from Rietveld refinements. NaCl and CuSe show some stronger deviations, but the other four components show an almost perfect match between the sparse-stretchedNMF weights and those obtained by Rietveld refinements.

This shows that the sparse-stretchedNMF algorithm can be used as a rapid way to extract reliable components and weights from data collected at different temperatures. This approach can be very helpful for screening large amounts of data rapidly as they are being collected, to look for known and unknown phases without having to carry out a complex multicomponent Rietveld campaign in real time.

Discussion

This paper presents a novel functional optimization model called stretchedNMF, which is an extension to the traditional NMF model. The initial experimental results indicate that for data where stretches in the signal are observed, such as diffraction data where thermal expansion has taken place, the proposed stretchedNMF model outperforms the conventional NMF. This is true even for PXRD and PDF data with small stretching degrees corresponding to realistic thermal expansivities. However, a further enhancement to stretchedNMF, which makes use of the sparsity of powder diffraction patterns, called sparse-stretchedNMF allows correct extractions even for very small stretches where stretchedNMF struggled.

Assumptions of the algorithm are that signal stretches are uniform on all features in the signal, as would be the case for an isotropic thermal expansion. Strictly speaking, the algorithm is not valid for the case of anisotropic expansions. However, it can be expected to perform better than conventional NMF even when there is anisotropic thermal expansion, especially if it is small. For example, in Fig. 9, we find a very good match to the Rietveld refinements of Marcasite, which is orthorhombic. More work is needed to establish the robustness of the algorithm in these cases. However, we note that it is not the goal of the work to replace a fully quantitative model dependent analysis of the data, such as a multi-phase Rietveld refinement, but to give a useful rapid, model independent, assessment of large sets of data that can help guide any later model dependent analysis.

Numerous factors influence the anisotropic expansion of materials, with symmetry playing a crucial role. Lower symmetry crystal systems are more prone to displaying significant anisotropies during expansion. Moreover, materials with low dimensionality, like layered materials, are highly inclined towards exhibiting anisotropic expansion behavior. A deeper study into the significance of symmetry, especially regarding its influence on anisotropic expansion in low-symmetry crystals and layered materials, will be the topic of future work.

At the algorithmic level, incorporating additional transformations beyond stretching may help explore the anisotropic expansion further. The current model only considers stretching; adding a shift turns it into a first-order polynomial transformation. In this case, only one new block is added to the computation, but a better approximation can be obtained. Incorporating higher-order polynomial transformations could further balance the computational cost and the approximation accuracy of the model. Further research is needed to investigate and optimize the stretchedNMF model’s potential in overcoming these challenges.

We also note that experimental noise can affect the outcome. This has not been studied in detail in this paper, but we note that we obtained good results from real data that included noise. To further address the noise issue, different regularization techniques can be utilized.

Finally, we note that although the motivation for the development, and all the tests, were on diffraction data where underlying structures have undergone thermal expansion, the stretchedNMF algorithm will work on any signal decomposition problem in which smooth, continuous variations in a stretching factor are a characteristic of the signal. It is not limited to use on diffraction data.

Methods

Stretched non-negative matrix factorization

Non-negative matrix factorization (NMF) is a mathematical tool to approximate a given matrix \(Z\in {{\mathbb{R}}}^{N\times M}\) by the product of two low-rank non-negative matrices,

$$Z\,\approx\, XY,$$
(1)

where \(X\in {{\mathbb{R}}}^{N\times K}\) and \(Y\in {{\mathbb{R}}}^{K\times M}\), and \(K\ll N,M\)35. The method and its use are described in detail elsewhere36,37. The common NMF model uses the squared Euclidean distance (SED) as the objective function, and the corresponding optimization problem is written as

$$\begin{array}{rcl}&\mathop{\min }\limits_{X\in {{\mathbb{R}}}^{N\times K},Y\in {{\mathbb{R}}}^{K\times M}}&\frac{1}{2}{\left\Vert XY-Z\right\Vert }_{F}^{2},\\ &\,\text{s.t.}\,&X\ge 0\,\,\text{and}\,\,Y\ge 0.\end{array}$$
(2)

Similar to principal component analysis38, the NMF decomposition will find components that explain variability in the signals in the set of data. Unlike PCA, a constraint of positivity is applied to both the components and the weights. Since many real physical signals, and their weights, obey positivity, NMF is more likely to find components that resemble signals from different physical components contributing to a compound signal coming from multiple sources. As such, it is finding extensive use in scientific applications9,39,40.

Here we address a situation where one aspect of the variability, a stretching of the signal on the axis of its independent variable, is not of scientific interest, for example, due to the thermal expansion of a material affecting its diffraction pattern. We formulate an approach named stretchedNMF which extends the conventional NMF decomposition whilst accounting for the stretching in the algorithm.

Suppose the experimental signals, which are columns of Z, zm for m = 1…M, and the components, which are columns in X, xk for k = 1…K, are continuous functions of an independent variable r. Then the conventional NMF optimization problem may be written as

$$\min\limits_{y_k^m\geq 0,x_k\geq 0} \quad {\sum\limits_{m=1}^M} \left\|\sum\limits_{k=1}^K y_k^m x_k(r)-z^m(r) \right\|_{L_2}^2,$$
(3)

where \({y}_{k}^{m}\) is the weight of the kth component at the mth position in the dataset. Now, we assume that there is an m-dependent stretching of the signal along the r axis. The component signals stretch with component-dependent rates that we capture in a stretching factor, \({\{{a}_{k}^{m}\}}_{m = 1,\ldots ,M}\). We add the stretching factors \({a}_{k}^{m}\) into Eq. (3) and the optimization problem becomes

$$\min\limits_{a_k^m\geq 0,y_k^m\geq 0,x_k\geq 0} \quad{\sum\limits_{m=1}^M} \left\|\sum\limits_{k=1}^K y_k^m x_k\left(r/a_k^m\right)-z^m(r) \right \|_{L_2}^2.$$
(4)

Notice that if \({a}_{k}^{m} \,>\, 1\), \({x}_{k}\) is stretched, and if \({a}_{k}^{m} \,<\, 1\), \({x}_{k}\) is compressed. In practice, we consider a finite r range \([0,{r}_{\max }]\). Therefore, without loss of generality, we define xk(r) = 0 for \(r\ge {r}_{\max }\). Thus, when \({a}_{k}^{m} \,<\, 1\), \({x}_{k}(r/{a}_{k}^{m})=0\) for \(r\ge {a}_{k}^{m}{r}_{\max }\). Now we are able to expand the L2 norm in Eq. (4) as an integral over the r range as

$$\begin{array}{rcl}&\mathop{\min }\limits_{{a}_{k}^{m}\ge 0,{y}_{k}^{m}\ge 0,{x}_{k}\ge 0}&\mathop{\sum}\limits_{m = 1}^{M}\mathop{\int}\nolimits_{\!0}^{{r}_{\max }}{\left(\mathop{\sum}\limits_{k = 1}^{K}{y}_{k}^{m}{x}_{k}(r/{a}_{k}^{m})-{z}^{m}(r)\right)}^{2}dr,\\ &\,\text{s.t.}\,&{x}_{k}(r)=0,\,\text{if}\,\,r\ge {r}_{\max }.\end{array}$$
(5)
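The effect of the stretching factor on a sampled component can be illustrated with a minimal sketch. It assumes linear interpolation through `np.interp` (not the quadratic spline used later in this paper), and the function name `stretch`, the grid, and the Gaussian test peak are hypothetical choices made only for illustration.

```python
import numpy as np

def stretch(x, r, a):
    """Sample x(r / a) on the grid r, taking x as zero beyond r[-1]."""
    return np.interp(r / a, r, x, right=0.0)

r = np.linspace(0.0, 10.0, 1001)
x = np.exp(-0.5 * ((r - 4.0) / 0.1) ** 2)   # a single peak centred at r = 4

stretched = stretch(x, r, 1.02)    # a > 1: the peak moves to larger r
compressed = stretch(x, r, 0.98)   # a < 1: the peak moves to smaller r

print(r[np.argmax(stretched)], r[np.argmax(compressed)])
print(compressed[-1])              # r / a exceeds r_max here, so the value is zero
```

A peak at position r0 in the component appears at a·r0 in the stretched signal, which is the behavior expected for a uniform expansion of all interatomic distances or lattice spacings.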

For fixed component \(k,{\{{a}_{k}^{m}\}}_{m = 1,\cdots ,M}\) is a series of stretching factors, which usually change smoothly with time m. However, the optimization problem in Eq. (5) is non-convex, and hence the smoothness of \({\{{a}_{k}^{m}\}}_{m = 1,\cdots ,M}\) may be violated when we solve it numerically. Therefore, we add a regularization term to the objective function to make it favor smooth ak, i.e.,

$$\begin{array}{rcl}&\mathop{\min }\limits_{{a}_{k}^{m}\ge 0,{y}_{k}^{m}\ge 0,{x}_{k}\ge 0}&\mathop{\sum}\limits_{m = 1}^{M}\mathop{\int}\nolimits_{\!0}^{{r}_{\max }}{\left(\mathop{\sum}\limits_{k = 1}^{K}{y}_{k}^{m}{x}_{k}(r/{a}_{k}^{m})-{z}^{m}(r)\right)}^{2}dr\\ &&+\rho \mathop{\sum}\limits_{k = 1}^{K}\mathop{\sum}\limits_{m = 1}^{M-2}{({a}_{k}^{m}-2{a}_{k}^{m+1}+{a}_{k}^{m+2})}^{2},\\ &\,\text{s.t.}\,&{x}_{k}(r)=0,\,\text{if}\,\,r\ge {r}_{\max },\end{array}$$
(6)

where \(\mathop{\sum }\nolimits_{k = 1}^{K}\mathop{\sum }\nolimits_{m = 1}^{M-2}{({a}_{k}^{m}-2{a}_{k}^{m+1}+{a}_{k}^{m+2})}^{2}\) is the smoothness regularization and ρ is the parameter that controls the strength of the regularization. In our numerical tests, we start with a large ρ and gradually decrease it in subsequent iterations.
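The smoothness term in Eq. (6) is a sum of squared second differences along m, which can be sketched as follows. The array layout (stretching factors stored as a K × M array `A`) and the function name are assumptions made for this illustration.

```python
import numpy as np

def smoothness_penalty(A, rho):
    """rho * sum_k sum_m (a_k^m - 2 a_k^{m+1} + a_k^{m+2})^2 for A of shape (K, M)."""
    second_diff = np.diff(A, n=2, axis=1)   # second difference along the m axis
    return rho * np.sum(second_diff ** 2)

m = np.arange(20)
linear = 1.0 + 0.002 * m                 # steady expansion: zero second difference
jagged = linear + 0.01 * (-1.0) ** m     # same trend with alternating jumps

print(smoothness_penalty(linear[None, :], rho=1.0))
print(smoothness_penalty(jagged[None, :], rho=1.0))
```

A stretching factor that varies linearly with m incurs essentially no penalty, so the regularization discourages erratic jumps between neighboring measurements without penalizing a steady drift.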

Numerical solution of stretchedNMF

In this section, we describe the numerical implementation of the stretchedNMF.

In order to numerically solve the functional optimization problem Eq. (6), we discretize the functionals and solve the corresponding vector optimization problem. Unlike Shifted NMF16, we cannot get benefits from discretizing the frequency domain of the components after applying the Fourier transform. So we choose to discretize the problem in the real r space, without loss of generality, using a uniform grid on \([0,{r}_{\max }]\). Since we have introduced the stretching factors, when we discretize the functionals \({x}_{k}(r/{a}_{k}^{m})\), on this uniform grid the arguments \(r/{a}_{k}^{m}\) are actually not on the grid nodes. Therefore, we apply a spline interpolation, that is we approximate \({x}_{k}(r/{a}_{k}^{m})\) from xk(r), where the interpolant is a piecewise polynomial. In terms of the order of the spline, we need at least a quadratic order, i.e., a piecewise quadratic polynomial with continuous derivatives on the grid points. The smoothness of the spline will help the convergence of the discretized optimization solution. In this paper, we use a quadratic spline interpolation to approximate \({x}_{k}(r/{a}_{k}^{m})\) in the optimization problem Eq. (6). Explicitly, let 0 = r0 < r1 < ⋯ < rN = rmax be the uniform grid nodes, resulting in an interval of h = rmax/N. The quadratic piecewise polynomial approximation, Si(r), of x(r) for r ∈ [ri, ri+1] is

$${S}_{i}(r)={q}_{i}(r-{r}_{i})(r-{r}_{i+1})+\left[x({r}_{i+1})-x({r}_{i})\right](r-{r}_{i})/h+x({r}_{i}),$$
(7)

where qi is the quadratic coefficient to be determined. The derivatives of the polynomials Si(r) and Si+1(r) are

$${S}_{i}^{{\prime} }(r)={q}_{i}(2r-{r}_{i}-{r}_{i+1})+\left[x({r}_{i+1})-x({r}_{i})\right]/h,$$
(8)
$${S}_{i+1}^{{\prime} }(r)={q}_{i+1}(2r-{r}_{i+1}-{r}_{i+2})+\left[x({r}_{i+2})-x({r}_{i+1})\right]/h.$$
(9)

The second-order spline should have continuous derivatives over the entire domain, which means that \({S}_{i}^{{\prime} }({r}_{i+1})={S}_{i+1}^{{\prime} }({r}_{i+1})\) at positions ri+1 for i = 0, …, N − 2. Using Eqs. (8) and (9), we get

$${q}_{i}+{q}_{i+1}=[x({r}_{i})-2x({r}_{i+1})+x({r}_{i+2})]/{h}^{2}.$$
(10)
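For completeness, the intermediate step can be written out. Because the grid is uniform, \({r}_{i+1}-{r}_{i}={r}_{i+2}-{r}_{i+1}=h\), and evaluating Eqs. (8) and (9) at \(r={r}_{i+1}\) gives

$${S}_{i}^{{\prime} }({r}_{i+1})={q}_{i}h+\left[x({r}_{i+1})-x({r}_{i})\right]/h,\qquad {S}_{i+1}^{{\prime} }({r}_{i+1})=-{q}_{i+1}h+\left[x({r}_{i+2})-x({r}_{i+1})\right]/h.$$

Equating the two expressions and dividing by h yields Eq. (10).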

Since we have x(r) = 0, for r ≥ rmax, we set SN−1(rN) = 0 and \({S}_{N-1}^{{\prime} }({r}_{N})=0\). Then we can write q as

$$\left(\begin{array}{c}{q}_{0}\\ {q}_{1}\\ \vdots \\ {q}_{N-1}\end{array}\right)=\frac{1}{{h}^{2}}{\left(\begin{array}{ccccc}1&1&&&\\ &1&1&&\\ &&\ddots &\ddots &\\ &&&1&1\\ &&&&1\end{array}\right)}^{-1}\left(\begin{array}{ccccc}1&-2&1&&\\ &1&-2&1&\\ &&\ddots &\ddots &\ddots \\ &&&1&-2\\ &&&&1\end{array}\right)\left(\begin{array}{c}x({r}_{0})\\ x({r}_{1})\\ \vdots \\ x({r}_{N-1})\end{array}\right).$$
(11)
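The linear system in Eq. (11) can be sketched numerically as below. This is an illustrative dense implementation under the boundary conditions stated above (x(rN) = 0 and a vanishing derivative at rN); the function name, grid size, and test profile are hypothetical, and a production code would more likely use back-substitution on the bidiagonal matrix.

```python
import numpy as np

def quadratic_spline_coeffs(x, h):
    """Solve Eq. (11) for q_0..q_{N-1} from x(r_0)..x(r_{N-1}), with x(r_N) = 0."""
    n = len(x)
    # Upper bidiagonal matrix of ones: row i encodes q_i + q_{i+1}.
    L = np.eye(n) + np.eye(n, k=1)
    # Second-difference matrix, truncated at the end because x(r_N) = 0.
    D = np.eye(n) - 2.0 * np.eye(n, k=1) + np.eye(n, k=2)
    return np.linalg.solve(L, D @ x) / h**2

N, r_max = 50, 10.0
h = r_max / N
r = np.linspace(0.0, r_max, N + 1)
x_full = np.exp(-0.5 * (r - 4.0) ** 2)
x_full[-1] = 0.0                       # enforce x(r_N) = 0
q = quadratic_spline_coeffs(x_full[:-1], h)

slope = np.diff(x_full) / h            # [x(r_{i+1}) - x(r_i)] / h
left = q[:-1] * h + slope[:-1]         # S_i'(r_{i+1}) from Eq. (8)
right = -q[1:] * h + slope[1:]         # S_{i+1}'(r_{i+1}) from Eq. (9)
print(np.max(np.abs(left - right)))    # derivative mismatch at interior nodes
```

The final check confirms that the piecewise quadratics built from these coefficients have matching first derivatives at the interior grid nodes, which is the condition expressed by Eq. (10).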

Now we can write \({x}_{k}(r/{a}_{k}^{m})\) in terms of xk(ri) as a linear transformation

$${x}_{k}(r/{a}_{k}^{m})={q}_{i}(r/{a}_{k}^{m}-{r}_{i})(r/{a}_{k}^{m}-{r}_{i+1})+[x({r}_{i+1})-x({r}_{i})](r/{a}_{k}^{m}-{r}_{i})/h+x({r}_{i}),$$
(12)

if \(r/{a}_{k}^{m}\in [{r}_{i},{r}_{i+1}]\), and \({x}_{k}(r/{a}_{k}^{m})\) is set to zero if \(r/{a}_{k}^{m}\ge {r}_{\max }\). Since the leading coefficient q is also linearly dependent on x as shown in Eq. (11), we can write the linear transformation \({x}_{k}({r}_{i}/{a}_{k}^{m})={b}_{i,{a}_{k}^{m}}^{T}{x}_{k}\) for i = 0, 1, …, N in matrix form

$$\left(\begin{array}{c}{x}_{k}({r}_{0}/{a}_{k}^{m})\\ {x}_{k}({r}_{1}/{a}_{k}^{m})\\ \vdots \\ {x}_{k}({r}_{N}/{a}_{k}^{m})\end{array}\right)=\left(\begin{array}{ccc}\cdots \,&{b}_{0,{a}_{k}^{m}}^{T}&\cdots \\ \cdots \,&{b}_{1,{a}_{k}^{m}}^{T}&\cdots \\ &\vdots &\\ \cdots \,&{b}_{N,{a}_{k}^{m}}^{T}&\cdots \end{array}\right)\left(\begin{array}{c}{x}_{k}({r}_{0})\\ {x}_{k}({r}_{1})\\ \vdots \\ {x}_{k}({r}_{N})\end{array}\right),$$
(13)

and denote the coefficient matrix as \({B}_{{a}_{k}^{m}}\). Now we are ready to write the discretization of the optimization problem in Eq. (6) as

$$\mathop{\min }\limits_{{a}_{k}^{m}\ge 0,\,{y}_{k}^{m}\ge 0,\,{x}_{k}\ge 0}\mathop{\sum}\limits_{m = 1}^{M}{\left\Vert \mathop{\sum}\limits_{k = 1}^{K}{y}_{k}^{m}{B}_{{a}_{k}^{m}}{x}_{k}-{z}^{m}\right\Vert }^{2}+\rho \mathop{\sum}\limits_{k = 1}^{K}\mathop{\sum}\limits_{m = 1}^{M-2}{({a}_{k}^{m}-2{a}_{k}^{m+1}+{a}_{k}^{m+2})}^{2},$$
(14)

where \({y}_{k}^{m},{x}_{k}\) and zm are discretized functionals on the uniform grid \(0={r}_{0} < {r}_{1} < \cdots < {r}_{N}={r}_{\max }\).

If the theoretical convergence is neglected, linear interpolation may be used as an approximation. In this case, we set qi = 0 in Eq. (12). The final form of the optimization problem is still Eq. (14), but with a different \({B}_{{a}_{k}^{m}}\) with higher sparsity.
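For the linear-interpolation case (qi = 0), the coefficient matrix can be built explicitly, as in the following sketch. The function name is hypothetical, and the code assumes a uniform grid so that only the number of nodes and the stretching factor are needed.

```python
import numpy as np

def stretch_matrix_linear(n_nodes, a):
    """Matrix B with (B @ x)[i] = x(r_i / a) by linear interpolation on a uniform grid.

    The grid spacing cancels, so positions are expressed in units of h.
    Rows whose argument r_i / a lies beyond the last node are left as zero.
    """
    B = np.zeros((n_nodes, n_nodes))
    pos = np.arange(n_nodes) / a                      # r_i / a in units of h
    inside = pos <= n_nodes - 1
    lo = np.minimum(np.floor(pos).astype(int), n_nodes - 2)
    frac = pos - lo
    rows = np.arange(n_nodes)[inside]
    B[rows, lo[inside]] = 1.0 - frac[inside]          # weight on the left node
    B[rows, lo[inside] + 1] = frac[inside]            # weight on the right node
    return B

r = np.linspace(0.0, 10.0, 101)
x = np.exp(-0.5 * ((r - 4.0) / 0.3) ** 2)
B = stretch_matrix_linear(len(r), a=1.03)

print(np.allclose(B @ x, np.interp(r / 1.03, r, x, right=0.0)))
print(np.count_nonzero(B, axis=1).max())              # non-zeros per row
```

Each row holds at most two non-zero entries, which is the higher sparsity referred to above; the quadratic spline version couples every row to many grid values through the inverse matrix in Eq. (11).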

Among the existing methods, a popular approach to solve the conventional NMF is alternating non-negative least squares (ANLS)41,42,43,44,45. ANLS alternately adjusts X and Y to minimize the objective function, and each subproblem can be solved by the non-negative linear least squares method. In fact, this framework is also called the block coordinate descent (BCD) method with two blocks. In our problem Eq. (14), which can be simplified as

$$\mathop{\min }\limits_{A\ge 0,Y\ge 0,X\ge 0}f(A,Y,X),$$
(15)

there are three blocks A, Y and X. Applying the BCD method with three blocks, we can solve problem Eq. (15) using Algorithm 1.

Algorithm 1

Block Coordinate Descent (BCD) Method

1: for t = 1, 2, ⋯ do

2: \(A:= \arg \mathop{\min }\nolimits_{A\ge 0}f(A,Y,X)\)

3: \(Y:= \arg \mathop{\min }\nolimits_{Y\ge 0}f(A,Y,X)\)

4: \(X:= \arg \mathop{\min }\nolimits_{X\ge 0}f(A,Y,X)\)

5: end for

Similar to conventional NMF, the subproblems of Y and X in Lines 3 and 4 are convex quadratic programming problems that can be easily solved by existing solvers. But the subproblem of A in Line 2 is highly non-convex and therefore we cannot efficiently solve it for its global minimum. In practice, we use a subspace trust-region method46 to find a local minimum.

The convergence of the BCD method for 3 blocks is not guaranteed47. Here we use the linearized block coordinate descent method48, which guarantees convergence for a quadratic spline approximation. The outline of the framework is presented in Algorithm 2, where αt is the step size and \(\hat{A}/\hat{X}/\hat{Y}\) are the extrapolations of the current A/X/Y and previous A/X/Y. In each iteration, the algorithm chooses one block, in a deterministic or random manner, and minimizes the corresponding linear approximation plus a proximal term. One can refer to ref. 48 for more information about parameter selection.

Algorithm 2

Linearized Block Coordinate Descent Method

1: for t = 1, 2, ⋯ do

2: pick one of the following to implement in a deterministic or random manner;

3: \(\begin{array}{rcl}A:&=&\arg \mathop{\min }\nolimits_{A\ge 0}\langle {\nabla }_{A}\,f(\hat{A},Y,X),A\rangle +\frac{1}{{\alpha }_{t}}\parallel A-\hat{A}{\parallel }^{2}\\ &=&\max \{\hat{A}-\frac{{\alpha }_{t}}{2}{\nabla }_{A}\,f(\hat{A},Y,X),0\}\end{array}\)

4: \(\begin{array}{rcl}Y:&=&\arg \mathop{\min }\nolimits_{Y\ge 0}\langle {\nabla }_{Y}f(A,\hat{Y},X),Y\rangle +\frac{1}{{\alpha }_{t}}\parallel Y-\hat{Y}{\parallel }^{2}\\ &=&\max \{\hat{Y}-\frac{{\alpha }_{t}}{2}{\nabla }_{Y}f(A,\hat{Y},X),0\}\end{array}\)

5: \(\begin{array}{rcl}X:&=&\arg \mathop{\min }\nolimits_{X\ge 0}\langle {\nabla }_{X}\,f(A,Y,\hat{X}),X\rangle +\frac{1}{{\alpha }_{t}}\parallel X-\hat{X}{\parallel }^{2}\\ &=&\max \{\hat{X}-\frac{{\alpha }_{t}}{2}{\nabla }_{X}\,f(A,Y,\hat{X}),0\}\end{array}\)

6: end for

In Algorithm 2, the gradient projection method is utilized in each iteration, which is a common technique in the ANLS framework for solving the classical NMF problem. The primary computational cost of this method lies in computing gradients, rendering the computation relatively lightweight. Specifically, the computational complexity of computing the gradient for A/X/Y is O(NMK), where the constant factor is influenced by the interpolation method. For example, employing linear interpolation instead of a quadratic spline avoids the computation in Eq. (11) and thus reduces the computational cost. On the one hand, due to the non-convex and non-quadratic nature of the subproblem of A, the gradient projection technique is quite suitable. On the other hand, in the updates of X/Y, the computational cost in each iteration using gradient projection is much smaller than that of other methods such as the interior-point method49, which costs \(O((N+MK){M}^{2}{K}^{2})\).
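The projected update in Algorithm 2 can be sketched for the special case in which all stretching factors equal 1, so that the A block drops out and the objective reduces to that of conventional NMF. The sketch makes several simplifying assumptions that are not taken from this paper: no extrapolation is used (the hatted variables equal the current iterates), the blocks are visited in a fixed cyclic order, and the step size is set from the spectral norm of the relevant Gram matrix.

```python
import numpy as np

def projected_step(block_hat, grad, alpha):
    """max{block_hat - (alpha / 2) * grad, 0}, the form of update in Algorithm 2."""
    return np.maximum(block_hat - 0.5 * alpha * grad, 0.0)

def objective(X, Y, Z):
    return np.linalg.norm(X @ Y - Z, "fro") ** 2

rng = np.random.default_rng(0)
Z = rng.random((40, 2)) @ rng.random((2, 15))    # exactly rank-2, non-negative data
X, Y = rng.random((40, 2)), rng.random((2, 15))

history = [objective(X, Y, Z)]
for t in range(300):
    # Y block: the gradient of ||XY - Z||_F^2 with respect to Y is 2 X^T (XY - Z).
    alpha = 1.0 / np.linalg.norm(X.T @ X, 2)     # assumed step-size rule
    Y = projected_step(Y, 2.0 * X.T @ (X @ Y - Z), alpha)
    # X block: the gradient with respect to X is 2 (XY - Z) Y^T.
    alpha = 1.0 / np.linalg.norm(Y @ Y.T, 2)
    X = projected_step(X, 2.0 * (X @ Y - Z) @ Y.T, alpha)
    history.append(objective(X, Y, Z))

print(history[0], history[-1])
```

Each block update costs a few matrix products, which is consistent with the O(NMK) gradient cost quoted above. With this step-size rule, each block update cannot increase the objective.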

Regarding the overall convergence of the algorithm, ref. 48 guarantees a sublinear convergence rate theoretically, while in the numerical experiments of this paper we observe that the algorithm achieves the desired results within a reasonable number of iterations.

Algorithm developments

In the case of PDF data, we apply stretchedNMF to time-series data according to the workflow shown in the chart in Fig. 10. A common experimental function (for example, the output of xPDFsuite50 and PDFgetX351) is the G(r) function52. This function oscillates above and below zero. NMF works on the basis that signals are positive, and to avoid losing signal where the function goes negative, we need to modify the signal into a non-negative form. Here we use an offset method, taking the smallest of all data values and adding its absolute value to all data. This approach has the advantage of being simple and has been successfully applied to the deep learning method of PDF analysis9.
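The offset pre-processing described above amounts to a single global shift, as in this minimal sketch. The function name and the synthetic G(r)-like signals are hypothetical, and the data are assumed to be stored as an N × M array with one measurement per column.

```python
import numpy as np

def offset_to_nonnegative(G):
    """Add the absolute value of the smallest data value to every data point."""
    offset = np.abs(G.min())
    return G + offset, offset

r = np.linspace(0.1, 20.0, 400)
# Two synthetic damped oscillations standing in for measured G(r) curves.
G = np.column_stack([np.sin(3.0 * r) * np.exp(-0.1 * r),
                     0.5 * np.sin(2.5 * r) * np.exp(-0.1 * r)])

Z, offset = offset_to_nonnegative(G)
print(G.min(), Z.min(), offset)
```

A single offset is applied to the whole data set rather than one per measurement, so the relative scale between measurements is preserved.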

Fig. 10: The workflow of stretchedNMF.

The stretchedNMF workflow of the PDF test.

After running the NMF solvers, we must restore the components to valid G(r) functions (oscillating around zero). To do this we utilize the solved weights and stretching factors to recover the components according to

$$\mathop{\min }\limits_{{x}_{k}}\mathop{\sum}\limits_{m = 1}^{M}{\left\Vert \mathop{\sum}\limits_{k = 1}^{K}{y}_{k}^{m}{B}_{{a}_{k}^{m}}{x}_{k}-{z}^{m}\right\Vert }^{2},$$
(16)

where zm is the original data rather than the data after the offset pre-processing, and the other symbols are described alongside Eq. (14). The weights, y, and stretching factors, a, are fixed to be those obtained from the NMF solution, and we remove the constraint that the components must be non-negative. Functions resembling G(r) are then recovered from the NMF components and may be fit using standard PDF modeling protocols. This is reasonable provided we trust the weights and stretching factors that the NMF solver obtained from the preprocessed data. This approach is highly automated and can be applied to both conventional NMF and stretchedNMF, because the stretching factor of the conventional NMF is always 1.
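Because the weights and stretching factors are fixed, Eq. (16) is an ordinary linear least-squares problem in the component values. The sketch below stacks the blocks for all m into one design matrix and solves it with `np.linalg.lstsq`. It assumes linear interpolation for the stretch matrices; the function names and synthetic data are hypothetical and are not taken from this paper's implementation.

```python
import numpy as np

def stretch_matrix_linear(n_nodes, a):
    """(B @ x)[i] = x(r_i / a) by linear interpolation; zero beyond the last node."""
    B = np.zeros((n_nodes, n_nodes))
    pos = np.arange(n_nodes) / a
    inside = pos <= n_nodes - 1
    lo = np.minimum(np.floor(pos).astype(int), n_nodes - 2)
    rows = np.arange(n_nodes)[inside]
    B[rows, lo[inside]] = 1.0 - (pos - lo)[inside]
    B[rows, lo[inside] + 1] = (pos - lo)[inside]
    return B

def recover_components(Z, Y, A):
    """Unconstrained least squares for X in Eq. (16), with Y and A (both K x M) fixed."""
    n, M = Z.shape
    K = Y.shape[0]
    # Row block m holds [y_1^m B_{a_1^m}, ..., y_K^m B_{a_K^m}].
    design = np.vstack([
        np.hstack([Y[k, m] * stretch_matrix_linear(n, A[k, m]) for k in range(K)])
        for m in range(M)
    ])
    target = Z.T.reshape(-1)                      # z^1, z^2, ... stacked end to end
    solution, *_ = np.linalg.lstsq(design, target, rcond=None)
    return solution.reshape(K, n).T               # columns are the components x_k

rng = np.random.default_rng(0)
n, K, M = 60, 2, 8
r = np.linspace(0.0, 10.0, n)
X_true = np.column_stack([np.sin(3.0 * r) * np.exp(-0.2 * r),
                          np.cos(5.0 * r) * np.exp(-0.3 * r)])   # signed, G(r)-like
Y = rng.random((K, M))
A = 1.0 + 0.01 * rng.random((K, M))
Z = np.column_stack([
    sum(Y[k, m] * stretch_matrix_linear(n, A[k, m]) @ X_true[:, k] for k in range(K))
    for m in range(M)
])

X_rec = recover_components(Z, Y, A)
print(X_rec.shape, X_rec.min())
```

Setting every entry of `A` to 1 reduces the stretch matrices to the identity, which corresponds to the conventional NMF case mentioned above.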

For the case of PXRD data from highly crystalline samples, we have the additional observation that the diffraction patterns consist of a sparse set of sharp peaks. That is, the function value is zero in between the Bragg peaks (neglecting backgrounds and any diffuse scattering). We can make use of this property to enhance our ability to decompose signals by adding a sparse regularization term to the optimization problem. For the case where there are smooth backgrounds in experimental PXRD data, the background can be easily and automatically eliminated to make the data sparse. In this case, we make two modifications to the optimization problem in Eq. (14). The first is adding the \({\ell }_{1/2}\) sparse regularization term to x53. The second is adding an upper bound on y, in order to prevent x from collapsing to zero as a whole, resulting in

$$\begin{array}{ll}\,\mathop{\min }\limits_{{a}_{k}^{m}\ge 0,\,0\le {y}_{k}^{m}\le 1,\,{x}_{k}\ge 0}\,\mathop{\sum}\limits_{m = 1}^{M}{\left\Vert \mathop{\sum}\limits_{k = 1}^{K}{y}_{k}^{m}{B}_{{a}_{k}^{m}}{x}_{k}-{z}^{m}\right\Vert }^{2}\\ \,\,+\rho \mathop{\sum}\limits_{k = 1}^{K}\mathop{\sum}\limits_{m = 1}^{M-2}{\left({a}_{k}^{m}-2{a}_{k}^{m+1}+{a}_{k}^{m+2}\right)}^{2}+\eta \mathop{\sum}\limits_{k = 1}^{K}\mathop{\sum}\limits_{i = 1}^{N}{({x}_{k,i})}^{\frac{1}{2}}.\end{array}$$
(17)

We refer to this as sparse-stretchedNMF.

In this optimization model, there are two regularization parameters, ρ and η. From experience, the results are not sensitive to the smoothness parameter ρ, which is usually adjusted by factors of 10. The sparsity parameter η can be adjusted by doubling. Problem Eq. (17) is still solved using Algorithm 2. However, when updating Y, a constraint of Y ≤ 1 is enforced, leading to the update rule:

$$Y:= \min \left\{\max \left\{\hat{Y}-\frac{{\alpha }_{t}}{2}{\nabla }_{Y}f(A,\hat{Y},X),0\right\},1\right\}.$$

The update for X is formulated as:

$$X:= \arg \mathop{\min }\limits_{X\ge 0}\frac{1}{2}{\Vert X-\hat{X}+\frac{{\alpha }_{t}}{2}{\nabla }_{X}\,f(A,Y,\hat{X})\Vert }^{2}+\eta \parallel X{\parallel }_{\frac{1}{2}}^{\frac{1}{2}},$$

which has a closed-form solution as demonstrated in ref. 54.
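As an illustration of such a closed form, the sketch below implements the half-thresholding formula that is widely used for ℓ1/2 regularization, applied element-wise. Whether this matches the exact expression in ref. 54 is an assumption that has not been checked here, and the function name is hypothetical. Here `t` stands for the gradient-step point that appears inside the norm in the update above, and the penalty weight in the standard formula is taken as 2η to account for the factor of 1/2 in front of the quadratic term.

```python
import numpy as np

def half_threshold(t, eta):
    """Element-wise minimizer over x >= 0 of 0.5 * (x - t)**2 + eta * sqrt(x).

    Uses the standard half-thresholding formula with lam = 2 * eta
    (assumed, not verified, to agree with the closed form in ref. 54).
    """
    t = np.asarray(t, dtype=float)
    lam = 2.0 * eta
    threshold = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    out = np.zeros_like(t)
    keep = t > threshold                  # entries at or below the threshold become zero
    tk = t[keep]
    phi = np.arccos((lam / 8.0) * (tk / 3.0) ** (-1.5))
    out[keep] = (2.0 / 3.0) * tk * (1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))
    return out

t = np.array([-1.0, 0.5, 2.0, 3.0])
print(half_threshold(t, eta=0.5))
```

Small and negative entries are set exactly to zero, while large entries are shrunk slightly, which is how the update promotes sparse components. With η = 0 the function returns the non-negative part of `t`, recovering the projected update of Algorithm 2.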