Analyzing Raman Spectral Data without Separability Assumption

Raman spectroscopy is a well-established tool for the analysis of vibrational spectra, which allow for the identification of individual substances in a chemical sample or of their phase transitions. In time-resolved Raman spectroscopy, the vibrational spectra of a chemical sample are recorded sequentially over a time interval, such that conclusions about intermediate products (transients) of a chemical process can be drawn. The observed data matrix $M$ from a Raman spectroscopy measurement can be regarded as the product of two unknown matrices $W$ and $H$, where the first represents the component spectra and the latter represents the contributions of these spectra to each measurement. One approach for obtaining $W$ and $H$ is non-negative matrix factorization. We propose a novel approach which does not need the commonly used separability assumption. The performance of this approach is shown on a real-world chemical example.


Introduction
In Raman spectroscopy, vibrational spectra are detected. Analysis of those spectra provides insight into chemical and physical properties of molecular structures, which is important in different research areas in biology, medicine and industry [1,2,3]. Nowadays, Raman spectrometers are capable of generating spectral recordings down to the femtosecond time scale. Such time-resolved Raman spectroscopy allows, besides spectral recordings of stable substances, for the monitoring of events like intramolecular rearrangements and chemical reactions [4]. We thereby obtain measured Raman spectra as a function of time, which depict both main characteristics of an observed process: On the one hand, each measured spectrum is a fingerprint of compounds and therefore represents the intrinsic spectra of the individual species or molecular states involved in the reaction. On the other hand, the relative contributions of the involved spectra to each measured spectrum reflect the momentary composition of the sample at the corresponding time. From the full series of generated spectra we can hence draw conclusions about the kinetics of the underlying reaction process. Consequently, the central task of time-resolved Raman data analysis is deciphering the series of measured spectra with respect to the individual component spectra and their temporal evolution.
This article is organized as follows. In Section 2, we give an overview of the non-negative matrix factorization (NMF) approaches and algorithms known so far. In particular, we present the separable NMF method, which found application in the approach for spectral analysis in [5]. Our new NMF approach as well as the algorithmic details of the corresponding computational method are introduced in Section 3. In Section 4, we present numerical results of our novel method. On the one hand, we discuss recovery results for synthetic measurement data with increasing interference of the component spectra and in the presence of measurement noise. On the other hand, we verify the influence of the single components of our adaptable objective function through recovery results for certain choices of weighting coefficients.

Non-Negative Matrix Factorization (NMF)
From a mathematical point of view, the non-negative measurement matrix $M$, which contains the discretized time-resolved Raman spectra, can be expressed as
$$M = W H, \qquad (1)$$
where the columns of $W$ represent the component spectra and $H$ the course of the relative concentrations. A factorization of $M$ into the two matrices $W$ and $H$ is interesting from the chemical point of view: the matrix $W$ gives us the substances involved in the reaction and the matrix $H$ allows inference on the speed of the reaction. Note that this is not possible by considering only one row or column of the matrix $M$. Summing up, time-resolved Raman spectral data can be modeled as the product of two non-negative matrices representing the single component spectra and the underlying reaction kinetics. The matrix $H$ represents the "normed" intensity, which we term relative concentration. The rows of $W$ correspond to the wavenumbers.
Recovering these factorization matrices given only the measured time-resolved spectra requires non-negative matrix factorization (NMF). In general, NMF is a useful tool for the analysis of high-dimensional data and therefore a relevant topic in present-day research in many scientific fields [6,7,8]. Besides detecting a compressed representation, NMF delivers insights into the structure and features of the given data by extracting easily interpretable factors. The goal of NMF (see e.g. [9,8] and the references therein) for a given data matrix $M$ is to solve an optimization problem in order to find matrices $W$ and $H$ with non-negative entries such that the product $WH$ is the best possible approximation of the non-negative input data matrix $M$. NMF is a linear dimension reduction technique for a non-negative data set, which means that each data point (column of $M$) is approximated by a linear combination of the columns of the matrix $W$.
Mathematical Background. The columns of $W$ form a basis for the column space of matrix $M$ and the columns of matrix $H$ are the weights to approximate the data points. The NMF problem is NP-hard [10], due to the non-negativity constraints on $W$ and $H$. Moreover, the solution of an NMF problem is generally not unique. To see this, assume that $W > 0$, $H > 0$, and that there exists an invertible matrix $D$ such that $WD > 0$ and $D^{-1}H > 0$. Then $M = WH = (WD)(D^{-1}H)$, which shows that the NMF is not unique.
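The non-uniqueness argument can be checked numerically on a tiny example. The following Python sketch uses hypothetical 2 × 2 matrices chosen only for illustration (none of the values come from the Raman data) and verifies that two different pairs of positive factors reproduce the same matrix.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical positive factors (illustrative values only).
W = [[1.0, 2.0],
     [3.0, 1.0]]
H = [[2.0, 1.0],
     [1.0, 3.0]]

# D mixes the two components; D_inv is its exact inverse.
D = [[1.0, 0.25],
     [0.0, 1.0]]
D_inv = [[1.0, -0.25],
         [0.0, 1.0]]

W2 = matmul(W, D)       # second factor pair: W D
H2 = matmul(D_inv, H)   # ... and D^{-1} H

M = matmul(W, H)        # product of the first pair
M2 = matmul(W2, H2)     # product of the second pair
```

Both `W2` and `H2` remain entrywise positive although they differ from `W` and `H`, and `M2` coincides with `M`. Any additional structural knowledge about the factors therefore has to come from outside the factorization itself.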
In the absence of the non-negativity constraints the problem could be solved efficiently by using methods such as the truncated singular value decomposition (TSVD) [11]. One of the common approaches for solving the NMF problem is the alternating least squares approach [12,13]. In this approach, one of the two matrices is fixed, for example $H$, and the corresponding optimal solution for $W$ is computed, which is a convex optimization problem with non-negativity constraints. The roles of $W$ and $H$ are then alternated. If the matrix $M$ satisfies a separability condition, then we can solve the NMF problem efficiently. By definition, a matrix $M$ is $r$-separable if there exists an exact non-negative factorization of rank $r$ where each column of $W$ is equal to a column of $M$. This means that each column of $W$, being a basis vector for the column space of $M$, appears somewhere in the data matrix $M$ as a column. Geometrically, the columns of $W$ are the vertices of the convex hull of the columns of $M$. The separability condition means that all columns of $M$ can be reconstructed as convex combinations of the $r$ columns of $W$ [14,15]. This is only possible if the convex hull of the columns of $M$ is a simplex which is spanned by $r$ columns of $M$. This is not necessarily the case.
NMF in the context of measurement data Given a component-wise non-negative matrix M of dimension n × m and an integer r > 0, NMF determines likewise componentwise non-negative matrices W and H of dimensions n × r and r × m, respectively, such that M = W H. Generally, integer r is denoted as rank of the factorization. Assuming M to represent m measurements of n non-negative variables, we interpret the NMF task as follows: We aim to identify r ingredients which allow for recovery of all m measurements by composition according to respective contributions. The ingredients then are reflected by the columns of factorization matrix W while the columns of H contain the corresponding mixing coefficients. In practice, considering measured data and therefore allowing noise or other forms of data uncertainty generally rules out the existence of an exact NMF in terms of M = W H. Thus, from now on we want to compute componentwise non-negative matrices W and H such that W H is an approximation of M .
In the context of Raman data spectral analysis, focusing on the non-negativity of the involved matrices is motivated by the model for time-resolved Raman spectral data of Liesen et al. [5]. They introduce an approach to express a series of spectral recordings of a chemical reaction (matrix $M$) as the matrix product of the component spectra (matrix $W$) and the evolution of relative concentrations of these reaction components (matrix $H$). Based on this model and synthetic spectral data which satisfy the recently much-cited separability assumption, the authors of [5] furthermore present an algorithm to detect a factorization $WH = M$ using separable NMF methods. Inspired by their results, we propose a novel method which does not rely on the separability assumption, since in the context of spectral analysis this assumption is very restrictive. The separability assumption means that the convex hull of the columns of $M$ is spanned by the column vectors of $W$; in other words, the convex hull of the columns of $M$ is a simplex. This is not necessarily the case for real-world data. Of course, we are searching for a simplex that includes all column vectors of $M$, but the convex hull of the columns of $M$ need not be a simplex. Thus, we will exploit additional chemical or physical model aspects in order to find the optimal simplex including the columns of $M$ without the separability assumption. At the center of this new approach stands an adaptable objective function, which takes into account only the common structural properties of the sought-for, process-defining matrices $W$ and $H$.

Solving an Optimization Problem for NMF
In the following we pick up the concepts of both previous sections as we introduce a new NMF approach which is specialized for the analysis of time-resolved Raman spectral data. Recall from (1) that the recovered non-negative matrices represent the component spectra of the involved species ($W$) and the reaction kinetics in terms of the evolution of relative concentrations ($H$). Our novel NMF approach differs from the methods discussed so far as it is mainly based on the minimization of an objective function which directly incorporates all known structural properties of the sought-for matrices $W$ and $H$. Furthermore, our approach does not depend on the restrictive separability assumption. In contrast to Liesen et al. [5], we can hence apply our method even to non-separable measurement data. The additional flexibility and adaptability of the novel approach will be illustrated by the numerical results in Section 4. Here we present the leading ideas of this approach as well as the details of the corresponding computational method.

Optimization Criteria for NMF
In the following we propose a novel approach based on an objective function which includes the required structural properties of the sought-for matrices $W$ and $H$.
Claims on the matrices W and H In the following we assume, that the component spectra are positive, such that W is a positive matrix. The componentwise non-negativity of the kinetics H is also reasonable, since relative concentrations are in general non-negative. Furthermore, because of representing relative concentrations, each column of H is a priori supposed to sum up to 1.
For each chemical species $s = 1, \dots, r$, the relative concentration is given by the relative concentration function
$$h_s : [0, T] \to [0, 1],$$
describing the relative concentration of species $s$ at time $t \in [0, T]$ of the considered reaction.
Since the concentrations $h_s(t)$ are relative, we have
$$\sum_{s=1}^{r} h_s(t) = 1 \quad \text{for all } t \in [0, T].$$
By using $m$ time steps $t_i$ for the discretization of the concentration functions $h_s(t)$, we obtain the column stochastic matrix
$$H = \bigl(h_s(t_i)\bigr)_{s,i} \in \mathbb{R}^{r \times m}.$$
The sequential Raman measurements cannot be modelled as a "random picking of spectra": the temporal order of the measurements is important. Let the columns of $H$ be given by
$$h(t_i) = [h_1(t_i), \dots, h_r(t_i)]^T.$$
Given the initial "concentrations" $h(t_{i-1})$, there is a kinetics (or some Markov process) providing the concentrations $h(t_i)$ of the next time step. This can be modelled by assuming a transition matrix $P$ for the autonomous Markov process, if the time intervals are always constant. Thus, we claim that there exists a (row) stochastic matrix $P \in \mathbb{R}^{r \times r}$ such that
$$h(t_i)^T = h(t_{i-1})^T P \qquad (2)$$
for consecutive time steps. In other words, the change of the relative concentrations between the time steps can be interpreted as a Markov process. The construction of this matrix $P$ will be explained later.
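To illustrate the Markov interpretation, the following Python sketch propagates a concentration vector with a row stochastic transition matrix and collects the results as the columns of $H$. The matrix `P`, the number of species and the number of time steps are hypothetical values chosen for illustration, and the update rule $h(t_i)^T = h(t_{i-1})^T P$ is the assumed form of the relation between consecutive time steps.

```python
def step(h, P):
    """One time step: row vector h multiplied by the transition matrix P."""
    r = len(h)
    return [sum(h[k] * P[k][j] for k in range(r)) for j in range(r)]

# Hypothetical row stochastic transition matrix for r = 3 species.
P = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.2],
     [0.0, 0.0, 1.0]]

h = [1.0, 0.0, 0.0]        # only the first species is present initially
columns = [h]
for _ in range(49):        # m = 50 time steps in total
    h = step(h, P)
    columns.append(h)

# H has one column per time step, i.e. it has shape r x m.
H = [list(row) for row in zip(*columns)]
column_sums = [sum(col) for col in columns]
```

Because every row of `P` sums to 1, each entry of `column_sums` stays at 1 up to rounding, so the resulting `H` is column stochastic, and it remains non-negative as long as `P` is non-negative.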
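Summing up, the objective function in our approach has penalty terms for the following properties: i) $W$ is component-wise non-negative, ii) $H$ is component-wise non-negative, iii) $H$ is column stochastic, iv) $P$ is component-wise non-negative, and v) $P$ is row stochastic.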
We thus arrive at an objective function which is the weighted sum of these five penalty terms. It has to be mentioned here that the constraint iv) is not necessarily valid. The matrix $P$ has to be row stochastic; however, the entries of $P$ can be negative. A Galerkin projection of a Markov process on the basis of microstates to a small set of macrostates can lead to negative entries in the projected matrix $P$. In the real-world example in Section 4.3, we will show a crystallization process with a non-exponential decay of one species, which leads to a matrix $P$ with one negative entry.

Robust Perron Cluster Analysis (PCCA+)
In the computational method of our novel NMF approach we apply the Robust Perron Cluster Analysis (PCCA+) [16] to generate an initialization of the kinetics in matrix $H$. We thus briefly introduce the intention and operating principles of PCCA+ and show its utility in our context. PCCA+ belongs to the family of algorithms that identify objects of similar behaviour and combine them into a certain number of clusters. In several areas of computational life science this kind of task plays a versatile role. PCCA+ arises from the investigation of molecular conformation dynamics, where the main interest lies in the identification of metastable conformations [17,18]. There, metastable conformations are clusters for which the large scale geometric structure of the observed ensemble is conserved under the influence of a spatial transition operator [19]. Translating this approach into matrix terms, we consider a stochastic matrix $T \in \mathbb{R}^{N \times N}$ (representing the discretized version of the spatial transition operator) and we search for a matrix $Y \in \mathbb{R}^{N \times N_C}$, which column-wise contains the clusters $y_i$, $i = 1, \dots, N_C$, and satisfies three requirements: Firstly, $Y$ is non-negative. Secondly, $Y$ is row stochastic in order to meet the partition-of-unity constraint. Thirdly, the vectors $y_i$ belong to the eigenvalue cluster of $T$ near 1.0. This means for each $i = 1, \dots, N_C$ we have
$$T y_i \approx y_i.$$
The main idea of PCCA+ is to generate $Y$ as a linear transformation of the matrix $X \in \mathbb{R}^{N \times N_C}$, where $X$ column-wise contains the $N_C$ eigenvectors of $T$ belonging to the eigenvalues closest to $\lambda_1 = 1$. PCCA+ therefore computes a non-singular transformation matrix $A \in \mathbb{R}^{N_C \times N_C}$ in order to obtain the non-negative, row stochastic matrix $Y$ via
$$Y = X A. \qquad (4)$$
Above, in the paragraph on the claims on the matrices $W$ and $H$, we required the sought-for matrix $H$ of reaction kinetics to be non-negative and column stochastic. Both requirements are satisfied if we consider (4) and choose $H = Y^T$ as an initial guess of the kinetics. Thus, in the computational method of our novel NMF approach, the preprocessing prepares the application of PCCA+ in order to generate a promising initialization of $H$.
Investigating (4), we may in general find several feasible solutions $A \in \mathbb{R}^{N_C \times N_C}$ providing an appropriate matrix $Y$. PCCA+ tackles this issue by computing $A$ as the solution of an optimization problem with respect to a certain objective function. Given that the stochastic matrix $T$ is the discretization of a transition operator (consider e.g. molecular conformation dynamics), maximization of this objective function is equivalent to the maximization of metastability between the generated clusters. In other contexts (consider e.g. geometrical cluster problems) the interpretation of the objective function may be different while still meaningful. See [20,17,21] for exemplary applications and illustrations of PCCA+ in several research areas.

Computational Method
The main work stages in the computational method of our novel NMF approach are summarized in Algorithm 1. Note that we distinguish between the finally recovered matrices (denoted as $W_{rec}$ and $H_{rec}$) and their corresponding interim results (denoted as $\tilde W$ and $\tilde H$). Furthermore, we use the MATLAB function pinv to calculate pseudoinverses of singular or even non-square matrices. We label the pseudoinverse of a matrix $A$ as $A^{\dagger}$. Finally, $A^{+}$ denotes the matrix which is constructed from $A$ by deleting the first row, and $A^{-}$ is the corresponding matrix constructed from $A$ by deleting the last row.
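As a short worked equation, the notation above can be used to write the relation between consecutive time steps in matrix form. The sketch assumes that the Markov relation reads $h(t_i)^T = h(t_{i-1})^T P$ and that the rows of $H^T$ are ordered in time; the indexing $t_1, \dots, t_m$ is chosen here only for illustration.

```latex
\begin{align*}
H^T &= \begin{bmatrix} h(t_1)^T \\ \vdots \\ h(t_m)^T \end{bmatrix}
  \in \mathbb{R}^{m \times r}, \\
(H^T)^{+} &= (H^T)^{-} P
  && \text{(stacking } h(t_i)^T = h(t_{i-1})^T P,\ i = 2, \dots, m\text{)}, \\
P &\approx \bigl((H^T)^{-}\bigr)^{\dagger} (H^T)^{+}
  && \text{(least-squares solution via the pseudoinverse)}.
\end{align*}
```

Deleting the first row of $H^T$ keeps the later time steps and deleting the last row keeps the earlier ones, so the two shifted matrices are related by one application of $P$. For noisy data the relation holds only approximately, which is why the pseudoinverse is used.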

Algorithm 1 Novel NMF for Raman Data Spectral Analysis
Require: data matrix $M \in \mathbb{R}^{n \times m}$ and factorization rank $r$
Ensure: matrix $W_{rec} \in \mathbb{R}^{n \times r}$ of component spectra and $H_{rec} \in \mathbb{R}^{r \times m}$ of reaction kinetics such that $M \approx W_{rec} H_{rec}$
1: Perform SVD for primary factorization $M^T = U \Sigma V^T$ and reshape $U$ into $\tilde U$.
2: Apply PCCA+ in order to initialize $\tilde H = (\tilde U A)^T$.
3: Minimize objective function with respect to transformation matrix $A$.
4: Reconstruct spectra $W_{rec}$ and kinetics $H_{rec}$, $P_{rec}$ according to the result of Step 3.

• Step 1: Preprocessing
In the preprocessing we consider $M^T$. By subtraction of a reference point we transfer the columns of $M^T$ into a linear space. Afterwards we perform a singular value decomposition (SVD) such that we obtain $M^T = U \Sigma V^T$. In order to initialize $\tilde H$ we want to apply PCCA+ to the leading $r - 1$ columns of $U$. Thus we build a matrix $\tilde U$, which takes the role of $X$ in (4), as follows: The first column of $\tilde U$ is equal to $e = [1, \dots, 1]^T \in \mathbb{R}^m$, which is a requirement of PCCA+. We then fill up with the columns $1, \dots, r - 1$ of $U$ such that $\tilde U \in \mathbb{R}^{m \times r}$. Subsequently, for efficiency reasons of PCCA+, we ensure orthogonality among the columns of $\tilde U$ [16].
• Step 2: Initializing $\tilde H$, $\tilde W$, and $\tilde P$
We apply PCCA+ to $\tilde U$. According to (4), we obtain a non-negative, column stochastic matrix $\tilde H$ by setting
$$\tilde H = (\tilde U A)^T, \qquad (5)$$
whereby $A \in \mathbb{R}^{r \times r}$ is the computed PCCA+ transformation matrix. $\tilde H$ is our initial guess of the kinetics of relative concentrations. Accordingly we obtain an initialization of the component spectra $\tilde W$ through the relation
$$\tilde W = M \tilde H^{\dagger}. \qquad (6)$$
From (2), we can see that the matrix $\tilde P$ is given by
$$\tilde P = \bigl((\tilde H^T)^{-}\bigr)^{\dagger} (\tilde H^T)^{+}. \qquad (7)$$
With (5), (6), and (7) we express the initial guesses of the sought-for matrices only in terms of the given and processed data ($M$, $\tilde U$) and the PCCA+ transformation matrix ($A$).
• Step 3: Minimizing objective function
The objective function of our novel NMF approach only incorporates the structural properties of the sought-for matrices as discussed above in the paragraph on the claims on the matrices $W$ and $H$. With respect to each property we compute a penalty value (8), weighted by the coefficients $\alpha$, $\beta$, $\gamma$, $\delta$, and $\mu$, respectively. With regard to the non-negativity of light intensities and relative concentrations, penalties 1, 2, and 4 determine the smallest entries in the matrices $\tilde W$, $\tilde H$, and $\tilde P$. As the sum of penalty values is supposed to increase if these smallest entries are negative, the weighting coefficients $\alpha$, $\beta$, and $\delta$ are generally chosen negative, too. The requirement on $\tilde H$ to be column stochastic is taken into account in penalty 3 by computing the maximal deviation of a column sum from 1.0. Likewise, the requirement on $\tilde P$ to be row stochastic is taken into account in penalty 5 by computing the maximal deviation of a row sum from 1.0.
Consider $\Psi$ to represent the sum of penalty values. As we choose the relations (5) and (6) for initialization, the input arguments for the objective function are the matrices $M$, $\tilde U$ and $A$. Since we perform the optimization with respect to the parameter $A$, the minimization problem can be written in the form
$$\min_{A \in \mathbb{R}^{r \times r}} \Psi^2(M, \tilde U, A).$$
Minimizing $\Psi^2$ hence numerically adjusts the matrices $\tilde W$ and $\tilde H$ according to the claimed structural properties. For the computation we apply the MATLAB function fminsearch, which uses the Nelder-Mead simplex search method as described by Lagarias et al. [22].
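The five penalty values described in Step 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' MATLAB implementation: the function names `penalties` and `psi` are hypothetical, each non-negativity penalty is assumed to be the weight times the smallest matrix entry, and each stochasticity penalty is assumed to be the weight times the maximal deviation of a column or row sum from 1. The values of $\delta$ and $\mu$ in the example are likewise chosen only for illustration.

```python
def penalties(W, H, P, alpha, beta, gamma, delta, mu):
    """Return the five weighted penalty values for interim matrices W, H, P.

    Assumed form: weight * smallest entry for penalties 1, 2, 4 and
    weight * maximal deviation of a column/row sum from 1 for penalties 3, 5.
    """
    min_W = min(min(row) for row in W)
    min_H = min(min(row) for row in H)
    min_P = min(min(row) for row in P)
    col_dev_H = max(abs(sum(col) - 1.0) for col in zip(*H))
    row_dev_P = max(abs(sum(row) - 1.0) for row in P)
    return [alpha * min_W, beta * min_H, gamma * col_dev_H,
            delta * min_P, mu * row_dev_P]

def psi(W, H, P, **weights):
    """Sum of the five penalty values."""
    return sum(penalties(W, H, P, **weights))

# Hypothetical interim matrices (illustrative values only).
W = [[1.0, 2.0], [3.0, 4.0]]
P = [[0.9, 0.1], [0.0, 1.0]]
H_feasible = [[0.7, 0.2], [0.3, 0.8]]      # non-negative, columns sum to 1
H_infeasible = [[1.2, 0.2], [-0.2, 0.8]]   # contains a negative entry

weights = dict(alpha=-0.0001, beta=-1.0, gamma=1.0, delta=-1.0, mu=1.0)
psi_feasible = psi(W, H_feasible, P, **weights)
psi_infeasible = psi(W, H_infeasible, P, **weights)
```

With negative weights on the smallest entries, the negative entry in `H_infeasible` raises the objective value relative to `H_feasible`, which is the behaviour the text describes. In a full implementation the matrices would be recomputed from the transformation matrix $A$ in every evaluation and the resulting scalar would be handed to a derivative-free optimizer.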
• Step 4: Recovering $W_{rec}$, $H_{rec}$, and $P_{rec}$
The minimization in Step 3 finally returns a transformation matrix $A_{opt}$. We then recover the kinetics of relative concentrations $H_{rec}$, the component spectra $W_{rec}$, and the transition matrix $P_{rec}$ according to (5)-(7) as
$$H_{rec} = (\tilde U A_{opt})^T, \qquad W_{rec} = M H_{rec}^{\dagger}, \qquad P_{rec} = \bigl((H_{rec}^T)^{-}\bigr)^{\dagger} (H_{rec}^T)^{+}.$$
With regard to NMF in the context of Raman data spectral analysis, our novel approach offers two main advancements: Firstly, in contrast to the method of Liesen et al. [5], our novel NMF approach does not depend on the separability assumption. Since we only consider the general properties of the sought-for matrices without further demands on the input data, we may apply the novel approach to the broader range of even non-separable spectral data. Secondly, note the possibility to manipulate the decisive objective function in Step 3 by the choice of the weighting coefficients $\alpha$, $\beta$, $\gamma$, $\delta$ and $\mu$ or by the addition of further penalty terms. This flexibility and adaptability of our method allows, for example, for a special focus on certain data properties or even an extension of the recovery objectives. We remark that the approach of optimizing $P_{rec}$ has already been suggested in [23], and recently (7) has been applied in [24]. The next section presents some numerical experiments.

Numerical Results
In this section we present the level of performance of our novel NMF approach by applying it to a sequence of artificial time-resolved Raman spectral data. After describing the reaction data generation in Section 4.1, we show in Section 4.2 that the component spectra are recovered to a high quality and that we even reach meaningful approximations of the underlying reaction kinetics. In Section 4.2 we also present the effectiveness of our method in the case of increased overlap among the individual component spectra and in the presence of measurement noise. In Section 4.3, we present real-world data from Raman spectroscopy measured during a crystallization process of paracetamol in ethanol. We show that our method can help to identify and characterize intermediate states (and their lifetimes) of a chemical process.

Description of the Reaction Data Generation
As in Section 2 for the model of time-resolved Raman spectral data, we here again follow the framework of Liesen et al. [5].
Regarding the generation of artificial time-resolved Raman spectral data we consider a reaction scheme with five involved species A, B, C, D and E which are inter-related by first-order reactions. These first-order reactions are characterized by a rate matrix of transition coefficients as follows: The rows i = 1, . . . , 5 of K reflect the transition behaviour of the corresponding species in the course of the observed reaction. So K 12 says that 53% of the amount of species A merge into species B per arbitrary unit of reciprocal time. The diagonal entries of K represent the sum of relative loss of each species per time unit. Thus we already notice species D to be the only product of this modeled reaction as just this species exclusively absorbs rates.
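As an illustration of how a rate matrix of this kind determines the kinetics, the following Python sketch integrates the first-order rate equations for a hypothetical three-species chain A → B → C. Several points are assumptions made for this example: the sign convention (negative diagonal entries, so that every row of `K` sums to zero), the rate for B → C, and the use of the explicit Euler scheme, which is chosen only to stay within the standard library and is not claimed to be the scheme used to generate the data in this article. Only the value 0.53 for the transition from A to B is taken from the text.

```python
def euler_kinetics(K, h0, dt, steps):
    """Integrate dh/dt = K^T h with the explicit Euler scheme."""
    r = len(h0)
    h = list(h0)
    trajectory = [list(h)]
    for _ in range(steps):
        # (K^T h)_j = sum_i K[i][j] * h[i]: gain of species j from species i
        dh = [sum(K[i][j] * h[i] for i in range(r)) for j in range(r)]
        h = [h[j] + dt * dh[j] for j in range(r)]
        trajectory.append(list(h))
    return trajectory

# Hypothetical rate matrix: row i describes the transitions out of species i.
K = [[-0.53, 0.53, 0.00],   # A -> B with rate 0.53
     [0.00, -0.20, 0.20],   # B -> C with an assumed rate 0.20
     [0.00, 0.00, 0.00]]    # C is the final product and loses nothing

h0 = [1.0, 0.0, 0.0]        # species A is the only educt
trajectory = euler_kinetics(K, h0, dt=0.01, steps=5000)
```

Because every row of `K` sums to zero, the relative concentrations in each entry of `trajectory` keep summing to 1, and with a sufficiently small step size they stay non-negative. Transposing `trajectory` gives a matrix with one column per time step, which plays the role of $H$.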
Here, we let species A be the only educt of the reaction and therefore denote the initial concentration vector as h 0 ∶= h(t 0 ) = [1, 0, 0, 0, 0] T . With h 0 and rate matrix K we obtain the reaction kinetics as a function of time by The spectral overlap among the single component spectra is adjustable. This means we may increase the level of spectral interference by moving all base points x 0 of the generated Lorentzians towards certain focal points. The level of spectral interference decides the level of separability of the measurement data. While the results in [5] are based on near-separability because of low spectral interference, we prove the effectiveness of our method even in the

Recovery Results
Considering the measurement data generated by the artificial reaction scheme introduced in Section 4.1, our goal is now to recover the single component spectra as well as the reaction kinetics given only the matrix $M$. In other words, we compute matrices $W_{rec}$ and $H_{rec}$ by applying our novel NMF approach to $M$. We are especially interested in the reconstruction of the true component spectra $W$ in order to provide a powerful tool for compound identification in real-life Raman spectral analysis. Recall that the objective function in our approach is based on adding up the penalty terms in (8), which represent the structural properties of the sought-for matrices and which are weighted by the choice of the coefficients $\alpha$, $\beta$ and $\gamma$. In this section we present results of our method for the predefinitions
$$\alpha = -0.0001, \quad \beta = -1, \quad \gamma = 1. \qquad (9)$$
Recall additionally that we applied singular value decomposition in the preprocessing of our computational method. That is why the order of species in the recovered matrices W rec and H rec may be permuted in comparison to the order in the exact matrices W and H. For comparative visualization of our recovery results we thus compute the correlation coefficients between the columns ( ∼ species) of W rec and W and associate the spectra as well as the reaction kinetics according to the maximal correlation values.
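The matching of recovered and exact species by correlation can be sketched as follows. The helper names `pearson` and `match_columns` and the small matrices are hypothetical and serve only as an illustration; the sketch assigns to each recovered column the exact column with the largest correlation coefficient and does not enforce a one-to-one assignment.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)   # undefined for constant columns

def match_columns(W_rec, W):
    """For each column of W_rec, return the index of the best matching column of W."""
    cols_rec = list(zip(*W_rec))
    cols_true = list(zip(*W))
    return [max(range(len(cols_true)), key=lambda k: pearson(col, cols_true[k]))
            for col in cols_rec]

# Hypothetical exact spectra (columns) and a permuted, slightly perturbed recovery.
W = [[1.0, 0.0],
     [0.0, 3.0],
     [0.0, 1.0],
     [2.0, 0.0]]
W_rec = [[0.1, 2.1],
         [2.9, 0.0],
         [1.1, 0.1],
         [0.0, 3.9]]

assignment = match_columns(W_rec, W)
```

Here `assignment[j]` is the index of the exact species associated with the `j`-th recovered column; the same index can then be used to pair the rows of the recovered and exact kinetics.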
Exemplary recovery results of our novel method for the noiseless case with low spectral interference are displayed in Figure 4. Especially the recovery of components A, B and D is nearly exact: The positions as well as the heights of the peaks can hardly be distinguished visually from the original data.
In the bottom right panel we also present the recovery result for the matrix H of reaction kinetics.  As in all upcoming illustrations of the reconstructed kinetics the dotted lines are assigned to their species through the corresponding color in the spectral panels. For comparison, the exact kinetics (black lines) represent the kinetics from Figure 2 (right). Indeed our reconstructed kinetics in Figure 4 reflect the general trends of the exact kinetics as in particular species A is recognized to be the only educt and species D to be the exclusive product of the generated reaction scheme.
As the first extension of the data setting we now investigate the effectiveness of our method in the case of increased spectral interference. As mentioned in Section 4.1, we generate increased spectral interference among the component spectra in $W$ by moving the base points $x_0$ in all species towards three focal points. We then obtain component spectra as displayed in Figure 5. In Figure 6 we present the results of our novel approach applied to very interference-rich measurement data. Besides the remaining high quality in the recovery of components A, B and D, the reconstruction of species C and E apparently improved compared to the results in Figure 4. In this interference-rich case our method computes the positions of the peaks in all component spectra quite satisfactorily. Concerning the recovery of the reaction kinetics, displayed in the bottom right panel, we again precisely identify the educt and the product of the reaction.
As the second extension of our data setting we consider the recovery results of our routine in the additional presence of measurement noise. In any practical setting Raman spectral analysis needs to deal with this issue since, for instance, signal shot noise or background noise appear in any real experimental data. Here we assume the noise from all different sources to be adequately represented by additive Gaussian white noise, which disturbs the measurement matrix $M$ according to
$$\tilde M = M + \delta N.$$
The entries of $N$ are drawn from the normal distribution $\mathcal{N}(0, 1)$ and $\delta = 0.5$ is the relative noise level. See Figure 3 (bottom) for an interpolated visualization of the interference-rich and noisy measurement matrix $\tilde M$. Applying our novel NMF approach with the predefinitions in (9) to $\tilde M$, the results in Figure 7 show that the component spectra are still in reasonable agreement with the exact spectra. Furthermore, the main traits of the true reaction kinetics are recognizable in the recovered kinetics as well.
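A noisy measurement matrix of this kind can be generated with a few lines of Python. The sketch assumes the additive form $\tilde M = M + \delta N$ with independent standard normal entries of $N$; the helper name `add_noise`, the seed and the small matrix are hypothetical.

```python
import random

def add_noise(M, delta, seed=0):
    """Return M + delta * N, where N has independent N(0, 1) entries."""
    rng = random.Random(seed)   # fixed seed for reproducibility
    return [[entry + delta * rng.gauss(0.0, 1.0) for entry in row]
            for row in M]

# Hypothetical small measurement matrix (illustrative values only).
M = [[1.0, 2.0, 3.0],
     [0.5, 0.0, 4.0]]

M_noisy = add_noise(M, delta=0.5)
```

Note that additive Gaussian noise can produce negative entries in `M_noisy` even if `M` is non-negative, which is one reason why an exact non-negative factorization can no longer be expected for noisy data.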

Example: Paracetamol in Ethanol
We took experimental time-resolved Raman spectroscopy data of paracetamol as an example to demonstrate the application and usability of our NMF algorithm. Paracetamol crystallizes in two polymorphs, and these polymorphs can behave differently in the processing of the drug into its final tablet formulation. The bioavailability of the drug can also differ between polymorphs [25]. Control over crystallization is required in order to manufacture a desired polymorph, for which crystallization is studied in an empirical manner with different solvents, cooling rates, etc. The effects of the solvents on the crystallization of small drug molecules such as paracetamol are of paramount importance. Different solvent choices yield different polymorphs of paracetamol [26]. Crystallization studies from liquid solutions were performed in a custom-made acoustic levitator [27]. The acoustic levitator allows contact-free crystallization studies and in situ measurements. The droplet of the solution can be fixed in a stable and undisturbed position by means of an ultrasonic field. The environment around the sample can be controlled with respect to the surface, temperature, and humidity by passing a cool or hot stream of nitrogen. During the experiment the solvent evaporates, which leads to a gradual increase of the concentration in the droplet, which finally crystallizes (Fig. 8). Time-resolved Raman spectroscopy is performed with a resolution of 3 seconds during this crystallization process. Various pathways from the solution phase of the drug molecules to the final crystallized phase have been suggested. An intermediate metastable polyamorphic state has been reported, wherein the paracetamol molecules existing in transient disorganised clusters undergo ordering to reach the final, highly ordered crystal structure [28]. With our method, we were able not only to understand the kinetics of the intermediate phase, but also to calculate the spectrum of the intermediate state. This information is crucial for understanding and thus controlling the crystallization of a drug substance. The measurements are shown in Fig. 9.
Figure 9: In real-world applications, sequential measurements of Raman spectra lead to input data for NMF. The intensity of different wavenumbers is measured at different time steps.
Figure 10: During the crystallization, solvated paracetamol (black spectrum) passes through an intermediate amorphous state (red spectrum) which then immediately turns into a crystal structure (green spectrum). The three component spectra of this process are extracted by using NMF.
The following settings are used for the objective function: $\alpha = 0.00001$, $\beta = 100$, $\gamma = 100$, $\delta = 1$, $\mu = 1$. These settings put the focus on feasible concentrations. This means we focus on providing a matrix $H_{rec}$ with non-negative entries and column sums equal to 1, such that Fig. 11 shows mathematically feasible concentration curves. $\alpha$ is set to a very low value because the intensities of the spectra are orders of magnitude higher than the entries in $H_{rec}$ or $P_{rec}$. After applying the optimization approach of Algorithm 1, especially the matrices $H_{rec}$ and $W_{rec}$ are important experimental findings. They show the spectra of the intermediate steps and of the final crystal form of paracetamol (Fig. 10) and the kinetics of the crystallization process (Fig. 11). The matrix $P_{rec}$ is: This matrix represents the approximated Galerkin projection (3 states) of a transition process in a continuous space (microscopic 3D arrangement of the atoms in the droplet). The third row of $P_{rec}$ represents the initial state. The second row is the intermediate state. There is zero probability of going back from this state to the initial state. The first row represents the stable final crystal. The upper right part of $P_{rec}$ is zero, because the crystallization process is directed. Fig. 11 shows a decay of the initial state which is nearly linear. In reaction kinetics we usually expect exponential decay. The matrix is just the optimal fit to a presumed kinetics according to the chosen objective function. Depending on the optimization criterion, one can obtain different results from the NMF of the given raw Raman spectroscopy data. These results can be checked using a cross-validation method to confirm the mathematical interpretation of the chemical process. We compared the results of NMF with simultaneous time-lapse photography of the droplet, which, to our knowledge, is used here for the first time as an independent check of whether the results obtained from NMF correspond to the experimental observations.
Besides the comparison of the time step of the phase change observed in the concentration curves with the experimental time steps, another factor that validates the results is that the peaks reported for the metastable intermediate amorphous state closely match our calculated spectrum. The peaks in the red curve for the measured intermediate state at 1236 cm$^{-1}$, 1326 cm$^{-1}$ and 1618 cm$^{-1}$, to name a few of many, match the calculated peaks at 1235 cm$^{-1}$, 1327 cm$^{-1}$ and 1619 cm$^{-1}$ [28]. Naturally, the peaks for the final moieties can also be verified and are in accordance with reported experimental data. Structural changes which are predicted with NMF are verified on the basis of this recording.

Conclusion
In summary, our novel NMF approach returns robust results in the recovery of component spectra and reaction kinetics, while the method is mainly based on the general structural properties of the sought-for matrices. The recovery results of our approach even indicate that the quality of the recovered component spectra improves as the spectral overlap among the component spectra increases. Our novel approach can therefore be considered as a complement to the method of Liesen et al. [5], since the success of their method especially depends on low spectral interference (near-separability of $M$).