1 Introduction

Magnetic resonance spectroscopic imaging (MRSI) is an in-vivo clinical imaging modality that detects nuclear magnetic resonance signals produced by nuclei in living tissue. Quantifying the amplitude of these signals yields metabolic maps that show the concentration of metabolites in the sample under investigation. Accurate quantification of these metabolites is important for the diagnosis of brain tumors and other diseases. For this purpose, a common practice in the MRS community has been to use non-linear spectral fitting tools such as the LCModel [5], TARQUIN [9], AMARES [8] and ProFit [7], among which the LCModel is regarded as the gold standard. In this study, we present a machine learning alternative to non-linear model fitting.

Non-linear Model Fitting. The LCModel software uses a linear combination of a set of metabolite basis spectra to model the spectral measurement in the frequency domain. It also uses smoothing splines to model the baseline signal and then fits the parameters of the basis set using non-linear optimisation. LCModel incorporates prior knowledge of the data into the fit, which makes the model robust when estimating spectral parameters such as metabolite concentrations. This non-linear fitting model has two main drawbacks: (1) Metabolite quantification can be time-consuming, depending on the dataset size, and requires a lot of manual parameter tuning. (2) Parameter estimates are accurate only for high-SNR spectra, since non-linear voxel-wise fitting to noisy data leads to many local minima and hence to inaccurate quantification [3, 4].

Machine Learning. Machine learning methods such as decision forests and random forests [2] are used extensively in the medical imaging community for tasks such as parameter estimation, disease diagnosis and segmentation. In MRSI, machine learning tools have been used only for specific tasks such as classification of spectra [4] and assessment of spectral quality [1]. This opens up the possibility of using recent advances in machine learning to predict MRSI data parameters while addressing the drawbacks of conventional fitting tools, such as long computation times and poor performance on data with artifacts.

Our Contribution. In this work, we propose a simple yet effective method using random forest regression for multi-parameter estimation in MR spectroscopic imaging. We generate a training set of over 1 million simulated spectra with varying concentration magnitudes, linewidth effects, baseline and lipid artifacts. We also use spectral data from 287 human subjects to create a physical training model to be used in the regression framework (Sect. 3.1). In the following, we present our method, which adapts random forest regression to MRSI (Sect. 2), followed by experiments on the aforementioned datasets. Our proposed method is then validated quantitatively and qualitatively using: (1) synthetic brain spectra, (2) human in-vivo single-voxel spectra acquired with the same protocol as the physical training model and (3) independently acquired human in-vivo 2D MRS images, which serve as a blind test of the physical and synthetic models. We present the results of our experiments (Sect. 3.2), followed by a summary and a discussion of future work in this domain (Sect. 4). To the best of our knowledge, this is the first application of machine learning to determine MRS parameters that were otherwise determined using basis fitting tools.

2 Methods

MR Spectroscopy. Magnetic resonance spectroscopy, based on the concept of nuclear magnetic resonance (NMR), exploits the resonance frequency of a molecule to obtain information about the concentration of a particular metabolite [6]. The time-domain complex signal of a nucleus is given by:

$$\begin{aligned} S(t) = \int \mathrm {p}(\omega )\mathrm {exp}(-i\varPhi )\mathrm {exp}(-t/T^{*}_{2})\,d\omega . \end{aligned}$$
(1)

The frequency-domain signal is given by \(S(\omega )\). \(T^{*}_{2}\) is the time constant of the magnetization decay in the transverse plane due to magnetic field inhomogeneity, and \(\mathrm {p}(\omega )\) comprises Lorentzian absorption and dispersion line-shape functions that carry the spectroscopic information about the sample. \(\varPhi \) represents the phase, \((\omega t + \omega _{0})\), of the acquired signal, where \(\omega t\) is the time-varying phase change and \(\omega _{0}\) is the initial phase. Non-linear fitting tools facilitate the generation of metabolic maps by estimating the concentration of metabolites such as N-acetyl-aspartate (NAA), creatine (Cr) and choline (Cho). An example of the spectra present in the brain is shown in Fig. 1.
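As a rough numerical illustration of Eq. 1, the sketch below discretizes the signal for a single resonance: a complex exponential with phase \(\omega t + \omega _{0}\), damped by \(\mathrm {exp}(-t/T^{*}_{2})\), is Fourier transformed to give a Lorentzian line. The function names, the 200 Hz offset and the 50 ms \(T^{*}_{2}\) are illustrative assumptions, not values from this study; the number of points and spectral width are borrowed from the SVS protocol in Sect. 3.1 only for convenience.

```python
import numpy as np

def simulate_fid(freq_hz, t2_star_s, phase0=0.0, amplitude=1.0,
                 n_points=1024, spectral_width_hz=2500.0):
    """Return (t, fid) for one resonance: A * exp(-i*(w*t + phase0)) * exp(-t/T2*)."""
    t = np.arange(n_points) / spectral_width_hz      # sampling times in seconds
    omega = 2.0 * np.pi * freq_hz                    # angular frequency in rad/s
    phase = omega * t + phase0                       # Phi = omega*t + omega_0
    fid = amplitude * np.exp(-1j * phase) * np.exp(-t / t2_star_s)
    return t, fid

def to_spectrum(fid, spectral_width_hz=2500.0):
    """Fourier transform the FID; return (frequency axis in Hz, complex spectrum)."""
    spectrum = np.fft.fftshift(np.fft.fft(fid))
    freqs = np.fft.fftshift(np.fft.fftfreq(fid.size, d=1.0 / spectral_width_hz))
    return freqs, spectrum

# Assumed example values: a resonance 200 Hz off-centre with T2* = 50 ms.
t, fid = simulate_fid(freq_hz=200.0, t2_star_s=0.05)
freqs, spectrum = to_spectrum(fid)
peak_hz = freqs[np.argmax(np.abs(spectrum))]         # location of the Lorentzian peak
```

With the sign convention of Eq. 1 and NumPy's FFT definition, the peak appears at the negative of `freq_hz`. Shortening `t2_star_s` broadens the line, which is how linewidth variations can be introduced in simulated spectra.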

Fig. 1. Example brain 2D MRSI dataset. (A) The simulated brain with the region of interest (red box). (B) Highlighted regions corresponding to GM, WM and CSF. (C) Corresponding spectra of GM, WM and CSF.

Random Forest Regression. Random forests [2] have been shown to be effective in a wide range of classification and regression problems. A forest comprises a set of binary trees, in which each split is chosen from a random subset of the feature variables during training. Each tree implements a piecewise linear regression over the input data: after searching for the best prediction at every node, data points are sent to the left or right branch by thresholding the selected feature. This process continues until the end of the tree is reached, and the weighted average of the predictions from all trees is then taken to give a single output estimate. The randomness in the training process encourages the trees to give independent estimates, which can be combined to achieve an accurate and robust result.

For MRSI, we adapt the random forest approach to a training dataset \(D = (S_{i} (\omega ), Y_{i})\), \(i\in [1, N]\), where N is the total number of training spectra. \(S_{i}(\omega )\) represents the training spectral data while \(Y_{i}\) represents the corresponding multi-parameter training labels. For our model, we consider the concentrations of NAA, Cho and Cr for simulated data, while for the real data we additionally consider myo-inositol (mI) and glutamate+glutamine (Glx). Therefore, for a given spectrum \(S_{i}(\omega )\), \(Y_{i} = [\) NAA\(_{i}\), Cho\(_{i}\), Cr\(_{i}\), mI\(_{i}\), Glx\(_{i}]\).

Running the random forest regression on this dataset produces a trained model, which can then be used to obtain parameter estimates \(\hat{Y}_{j}\) of test spectra \(S_{j}(\omega )\) with test labels \(Y_{j}\), \(j\in [1, M]\), where M is the total number of test spectra.
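The training and prediction steps described above can be sketched as follows. This is a minimal illustration, assuming scikit-learn's `RandomForestRegressor` as the implementation (the library used in this work is not stated) and a hypothetical toy dataset of three Lorentzian peaks standing in for NAA, Cho and Cr; the forest size and `max_features` value are likewise assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical toy "spectra": three Lorentzian peaks standing in for NAA, Cho and Cr.
n_points = 64
axis = np.linspace(0.0, 1.0, n_points)
centers = np.array([0.25, 0.50, 0.75])
basis = 1.0 / (1.0 + ((axis[:, None] - centers[None, :]) / 0.02) ** 2)  # (n_points, 3)

def make_dataset(n_spectra):
    """Draw random labels Y (n, 3) and build noisy spectra S (n, n_points)."""
    labels = rng.uniform(0.5, 2.0, size=(n_spectra, 3))
    spectra = labels @ basis.T + rng.normal(0.0, 0.01, size=(n_spectra, n_points))
    return spectra, labels

S_train, Y_train = make_dataset(300)   # D = (S_i, Y_i), i in [1, N]
S_test, Y_test = make_dataset(50)      # (S_j, Y_j), j in [1, M]

# Assumed hyper-parameters; a single forest predicts all labels jointly.
forest = RandomForestRegressor(n_estimators=100, max_features=0.5, random_state=0)
forest.fit(S_train, Y_train)
Y_hat = forest.predict(S_test)         # parameter estimates, shape (M, 3)

# Compare against a trivial predictor that always returns the mean training label.
mean_abs_error = np.abs(Y_hat - Y_test).mean()
baseline_error = np.abs(Y_train.mean(axis=0) - Y_test).mean()
```

Passing a two-dimensional label array lets one forest estimate all parameters at once, which matches the multi-parameter labels \(Y_{i}\) defined above.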

Error Calculation. For our experiments, given the estimate \(\hat{Y}_{j}\) and the testing label \(Y_{j}\), the estimate error for the parameter \(Y_{j}\) can be calculated as,

$$\begin{aligned} \hat{E}_{j} = ||\hat{Y}_{j} - Y_{j}||./||Y_{j}|| \end{aligned}$$
(2)

This relative error measures how far the parameter estimate deviates from the testing/ground-truth value.
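A small worked example of Eq. 2 is given below. It assumes that the "./" in Eq. 2 denotes element-wise division, so that one relative error is obtained per parameter; the function name and the numerical values are hypothetical.

```python
def relative_error(y_hat, y_true):
    """Per-parameter relative error |y_hat - y_true| / |y_true| (assumed reading of Eq. 2)."""
    if any(t == 0 for t in y_true):
        raise ValueError("ground-truth values must be non-zero")
    return [abs(h - t) / abs(t) for h, t in zip(y_hat, y_true)]

# Hypothetical ground-truth and estimated ratios for one test spectrum.
y_true = [1.50, 0.30]   # e.g. NAA/Cr, Cho/Cr
y_hat = [1.41, 0.33]
errors = relative_error(y_hat, y_true)   # approximately [0.06, 0.10]
```

Because the error is normalized by the ground-truth value, metabolites with small ratios contribute on the same scale as those with large ratios.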

3 Experiments and Results

3.1 Data

We perform four sets of experiments to assess our proposed method: (1) training and testing on simulated spectra (Synthetic - Synthetic (Spectra)), (2) training and testing on human in-vivo spectral data from different subjects acquired with the same protocol (Real (Spectra) - Real (Spectra)), (3) training and testing on human in-vivo spectral data from different subjects acquired with different protocols (Real (Spectra) - Real (MRS Images)) and (4) using the simulated spectra model to test on MRS images (Synthetic (Spectra) - Real (MRS Images)).

Synthetic (Spectra). A metabolite basis set was generated from the data provided by the ISMRM MRS Fitting Challenge 2016 and then used to simulate over 1 million spectra. To ensure that the simulated spectra were as close as possible to human in-vivo spectra, we incorporate the following features: variations in NAA, Cho and Cr concentrations, macro-molecular baseline, lipids, \(T_{2}\) values (for changes in linewidth) and signal-to-noise ratio (SNR) to account for changes in spectral quality. As a preliminary case study, we simulate only the major metabolites (NAA, Cho and Cr), as these are easily detected by the LCModel and therefore allow a suitable comparison between our approach and the LCModel. A set of over 10,000 independent test spectra was also simulated with varying combinations of the aforementioned features. For both the training and testing sets, we used the basis-set metabolite concentration values as our ground truth.

Real (Spectra). To evaluate our method on in-vivo data, we use LCModel-fitted single-voxel spectroscopy (SVS) data from 287 independent human subjects. The data were obtained using the same standardized imaging protocol with the following acquisition parameters: TE/TR = 35/2000 ms, spectral width = 2500 Hz, number of points = 1024. We apply 10-fold cross-validation together with the random forest regression to generate different training and testing sets with spectra from 259 and 28 subjects respectively. The metabolites assessed were: NAA, Cho, mI and Glx.
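The subject-level 10-fold split can be sketched as below. The splitting routine and seed are assumptions, since the exact partitioning procedure is not described. Note that 287 is not divisible by 10, so in this sketch the test folds contain either 28 or 29 subjects.

```python
import random

def k_fold_subject_splits(n_subjects, n_folds=10, seed=0):
    """Yield (train_ids, test_ids) so that every subject is tested exactly once."""
    ids = list(range(n_subjects))
    random.Random(seed).shuffle(ids)                      # assumed: random assignment
    folds = [ids[i::n_folds] for i in range(n_folds)]     # near-equal fold sizes
    for k in range(n_folds):
        test_ids = folds[k]
        train_ids = [s for j, fold in enumerate(folds) if j != k for s in fold]
        yield train_ids, test_ids

splits = list(k_fold_subject_splits(287))
sizes = [(len(train_ids), len(test_ids)) for train_ids, test_ids in splits]
```

Splitting by subject rather than by spectrum keeps all data from one subject on the same side of the split, so the test error reflects generalization to unseen subjects.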

Real (MRS Images). To further assess our approach, we acquire a standard phase-encoded 2D brain MRSI dataset from a healthy human volunteer on a 3 T scanner using a point-resolved spin-echo localization sequence (PRESS) with voxel size = 10\(\,\times \,\)10\(\,\times \,\)15 mm\(^{3}\), TE/TR = 35/1000 ms, spectral width = 2000 Hz, number of points = 400. For testing, we use 96 spectra from the inner region of the brain, which serves as the region of interest.

Due to the differences in acquisition parameters between the training and testing sets, the resulting spectra differ in amplitude and metabolite peak alignment. We therefore perform a pre-processing spectral alignment step in which all test spectra are cropped to the range from 4.3 to 0.2 ppm and interpolated to the same number of points as the training spectra, to compensate for differences in acquisition bandwidth. The amplitude of the test spectra is then normalized using one of the training spectra as reference.
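The alignment step described above might look like the following sketch. It assumes real-valued spectra, linear interpolation and peak-magnitude scaling as the normalization rule; none of these details are specified in the text, and the function names and example spectra are hypothetical.

```python
import numpy as np

def crop_and_resample(ppm, spectrum, n_target, ppm_high=4.3, ppm_low=0.2):
    """Crop a spectrum to [ppm_low, ppm_high] and resample it to n_target points."""
    ppm = np.asarray(ppm, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    order = np.argsort(ppm)                      # np.interp needs increasing x
    target_ppm = np.linspace(ppm_low, ppm_high, n_target)
    resampled = np.interp(target_ppm, ppm[order], spectrum[order])
    return target_ppm[::-1], resampled[::-1]     # high-to-low ppm, as usually plotted

def normalize_to_reference(spectrum, reference):
    """Scale a spectrum so that its peak magnitude matches that of the reference."""
    return spectrum * (np.max(np.abs(reference)) / np.max(np.abs(spectrum)))

# Hypothetical test spectrum: 400 points, one Lorentzian peak at 2.0 ppm.
test_ppm = np.linspace(6.0, -1.0, 400)
test_spectrum = 3.0 / (1.0 + ((test_ppm - 2.0) / 0.05) ** 2)
# Hypothetical training reference, already cropped to 4.3 to 0.2 ppm.
reference = 1.0 / (1.0 + ((np.linspace(4.3, 0.2, 1024) - 2.0) / 0.05) ** 2)

aligned_ppm, aligned = crop_and_resample(test_ppm, test_spectrum, n_target=reference.size)
aligned = normalize_to_reference(aligned, reference)
```

After this step the test spectrum has the same length and amplitude scale as the training spectra, so it can be passed to a model trained on them.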

3.2 Results

Synthetic - Synthetic (Spectra). We perform an initial experiment to determine the out-of-bag (OOB) error for different numbers of trees and features on a set of 20,000 simulated training and test spectra. Based on the results shown in Fig. 2, we identify the number of trees and features required for the OOB error to converge and use these settings in the parameter estimation experiment. For the regression error estimates, we use metabolite concentration ratios with respect to Cr, a standard means of calibration in MRS. We obtain R scores of 0.968 and 0.962 for the NAA/Cr and Cho/Cr values respectively. The corresponding regression plots are shown in Fig. 3 and the error plots in comparison with the LCModel are shown in Fig. 4.
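A sweep of this kind over the number of trees and features can be sketched as below. The sketch assumes scikit-learn, defines the OOB error as one minus the out-of-bag \(R^{2}\) score (the error definition used in Fig. 2 is not stated) and uses a small hypothetical dataset rather than the simulated spectra.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical regression problem: 32 features, only the first 4 are informative.
X = rng.normal(size=(400, 32))
y = X[:, :4].sum(axis=1) + 0.1 * rng.normal(size=400)

def oob_error(n_trees, n_features):
    """Fit a forest and return 1 - R^2 computed on out-of-bag samples."""
    forest = RandomForestRegressor(
        n_estimators=n_trees,
        max_features=n_features,   # features considered at each split
        bootstrap=True,            # OOB estimates require bootstrap sampling
        oob_score=True,
        random_state=0,
    )
    forest.fit(X, y)
    return 1.0 - forest.oob_score_

# Grid of (number of trees, number of features), mirroring the axes of the sweep.
errors = {
    (n_trees, n_features): oob_error(n_trees, n_features)
    for n_features in (1, 8, 32)
    for n_trees in (25, 50, 100)
}
```

Plotting `errors` against the number of trees, with one curve per feature count, gives a convergence plot from which the forest settings can be read off.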

Fig. 2. Out-of-bag (OOB) error for simulated spectra. The experiment is performed for a varying number of features (from 1 to 256, as shown in the legend) and each setting is assessed for a varying number of trees (shown on the X-axis). The Y-axis represents the OOB error rate. The error rate is minimal for more than 64 features and converges when the number of trees is close to 100.

Fig. 3. Regression scores for the following parameters (from left to right): NAA/Cr concentration estimate and Cho/Cr concentration estimate. The X-axis represents the true values of the parameter while the Y-axis represents the estimated values. Both sets of values are plotted using linear regression.

Fig. 4. Synthetic - Synthetic (Spectra): Estimation error for different metabolite concentration ratios in a given test set. Whiskers span the [min max] values. Median error values are represented by the red line and are as follows: NAA/Cr Regression = 0.064, LCModel = 0.077, Cho/Cr Regression = 0.043, LCModel = 0.070.

Real (Spectra) - Real (Spectra). For the SVS dataset, we use the LCModel concentration ratio estimates as the ground truth. Table 1 reports the mean metabolite concentration estimate error across the 10 folds of the cross-validation process using the random forest regression method. Relative to the corresponding LCModel estimates, the median error is 0.068 for the NAA/Cr estimate, 0.072 for the Cho/Cr estimate, 0.093 for the mI/Cr estimate and 0.070 for the Glx/Cr estimate. These errors are small and indicate that our proposed method and the LCModel give similar assessments. Moreover, low-concentration metabolites such as mI and Glx usually display a fitting error with the LCModel, and the low estimation error for these metabolite concentration ratios indicates that our model works well for these metabolites too.

Table 1. Concentration-ratio estimate errors using random forest regression. Results are for the experiments Real (Spectra) - Real (Spectra) and Real (Spectra) - Real (Images). The errors are calculated relative to the respective LCModel estimates using Eq. 2. The major metabolites (NAA and Cr) show a low error while the lower-concentration metabolites (mI and Glx) show a slightly higher error.

Fig. 5. Left: Synthetic (Spectra) - Real (MRS Images): Estimation error for different metabolite concentration ratios for the same test dataset. Whiskers span the [min max] values. Median error values are represented by the red line and are as follows: NAA/Cr = 0.24, Cho/Cr = 0.34. Right: NAA/Cr and Cho/Cr concentration distribution estimates from random forest regression and non-linear model fit.

Synthetic (Spectra) - Real (Images). We test our synthetic spectra training model on the 2D MRSI data. The results are shown in the boxplot in Fig. 5, along with the concentration distributions obtained from the regression approach and from the non-linear model fit. As our synthetic model is trained only for the NAA and Cho ratios, we show the errors for these two only. Using regression, the median estimation error is 0.24 for NAA/Cr and 0.34 for Cho/Cr. The corresponding concentration values estimated by the LCModel serve as our ground truth.

Real (Spectra) - Real (Images). We perform a blind test with 96 2D MRSI spectra against the training model generated from the 287 SVS spectra; the results are shown in Table 1. The median estimation error is 0.1 for NAA/Cr, 0.18 for Cho/Cr, 0.217 for mI/Cr and 0.13 for Glx/Cr. Although we expect higher errors in the blind test because the acquisition protocols of the training and testing datasets differ, the errors appear to be within a reasonable window. As expected, the estimation error is highest for mI/Cr, while Glx/Cr surprisingly has a lower error than Cho/Cr.

The Real Spectra training model provides a marginally better metabolite concentration estimate than the Synthetic Spectra model. We attribute this to the presence of arbitrary scanning effects and artifacts in the real spectra model, which are absent from the synthetic model. For future experiments, this motivates learning on a large synthetic spectral dataset with similar additional arbitrary effects to obtain a robust model for real data (especially in cases where annotating training data is expensive).

4 Conclusion

Machine learning techniques such as random forest regression provide a new and faster way of quantifying metabolites. Our synthetic training model accounts for spectral features such as macro-molecular baseline, lipids, linewidth and SNR variations in combination with different metabolite concentrations. Additional features such as frequency and/or phase-shift effects and B0 inhomogeneity could be incorporated in the model to improve robustness. For the human in-vivo data, we use training spectra from different subjects, and the random forest regression yields a low estimation error relative to the LCModel fit even in the presence of arbitrary scanning effects. Training on the simulated spectra takes considerable time (around 5–6 h) given that we generate over 1 million spectra, whereas training on the in-vivo spectra takes only a few minutes. Testing and concentration estimation, on the other hand, take only a few seconds and are considerably faster than non-linear model fitting. The machine learning approach may be used directly, or indirectly to initialize LCModel fits, thereby improving their results in the presence of noise and speeding up convergence. Its estimates can also be combined with global decisions about spectral quality that predict whether a spectrum can or cannot be interpreted by the physics model because of the presence of artifacts.

Future work will involve more robust approaches, such as deep-learning-based methods, to improve the accuracy of parameter estimation. Once a framework has been established, disease-specific training models could be developed for parameter estimation, to predict disease progression and the corresponding metabolite maps.