Introduction

Mass spectrometry imaging (MSI) is a rapidly growing research area where mass spectrometric analysis is performed in a spatially resolved manner. MSI data is most commonly represented as heat maps based on ion abundance mapped on the Cartesian coordinates of the analyzed sample. This allows for the generation of images for each spectrally resolved m/z value across the analyzed region. A color scale bar is used to represent the ion abundance. A wide variety of ionization sources and mass analyzers have been used to create MSI datasets. Most commercial software available for analysis of MSI data is vendor-specific and proprietary. However, in recent years there has been increased support for data sharing through a universal data format, imzML [1], and the introduction of open-source and/or free data analysis software such as MSiReader [2], Cardinal [3], BioMAP [4], msIQuant [5], SpectralAnalysis [6], and METASPACE [7]. Each package has unique advantages and disadvantages. Recently, a new version of msIQuant [8] was introduced with the ability to analyze imzML data, with tools for handling MSI quantification data and multimodal imaging capabilities. Cardinal and SpectralAnalysis are software packages that allow advanced statistical analysis of MSI data sets in an untargeted fashion [3, 6]. Recent developments in software packages for analyzing and processing MSI data were outlined in a review article [9].

MSiReader was introduced in 2013 as an open-source, vendor-neutral MSI data analysis software written in MATLAB [2]. At the time of publication, it was the only software capable of analyzing high resolving power mass spectrometry imaging data without data compression. The maximum file size was, and still is, limited only by the amount of RAM available. MSiReader was initially designed to process and analyze imaging data generated in our laboratory using the infrared matrix-assisted laser desorption electrospray ionization (IR-MALDESI) imaging source [10, 11]; however, since then, it has become an essential tool for research in our laboratory as well as others. At the time of the submission of this manuscript, the original manuscript has been cited over 110 times for diverse applications in proteomics [12], plant and animal metabolomics [13,14,15], and forensics [16]. Community feedback has led to numerous improvements in the user interface that enhance workflow, and to tools for spectral export, image export, data binning, colocalization, normalization, polarity switching, filtering, and quantification.

The evolution of MSI in the past several years has required the constant development and improvement of software packages to facilitate the analysis of a wide variety of MSI data. MSiReader has evolved to include tools that allow complex and customized data analysis workflows to be incorporated into the software, allowing unprecedented analysis of MSI data. In this manuscript, the authors present MSiReader v1.0, with an improved user interface, more robust error checking, faster data loading and processing, and newly incorporated features for the analysis of MSI datasets.

Experimental

All experiments mentioned in this manuscript were performed in accordance with local ethical guidelines. Human cervical tissues were obtained from the University of North Carolina Tissue Procurement Facility through UNC IRB #09-0921, and written informed consent was obtained from all patients. Two-d-old whole neonatal mouse pups were obtained from NCSU Department of Molecular Biomedical Sciences. Hen ovarian tissues were acquired from commercial egg laying hens in the NCSU Department of Poultry Science. All husbandry practices were approved by North Carolina State University Institutional Animal Care and Use Committee (IACUC).

All imaging experiments were performed in our laboratory using an infrared matrix-assisted laser desorption electrospray ionization (IR-MALDESI) source coupled to high resolving power mass spectrometers. The IR-MALDESI source design and the steps involved in imaging experiments are described in detail elsewhere [10, 17, 18]. In short, the tissues were sectioned into 10–25 μm thick sections using a Leica CM1950 cryostat (Buffalo Grove, IL, USA) and then thaw-mounted onto clean microscope slides. For quantitative MSI analyses, a calibration series of a stable isotope-labeled version of the analyte was pipetted directly on top of the tissue section prior to IR-MALDESI analysis. The tissue sections were then transferred to the enclosure housing the IR-MALDESI imaging source and placed on a Peltier-cooled stage. The relative humidity inside the enclosure was reduced to ~10% by purging the enclosure with dry nitrogen gas, and the stage temperature was reduced to –10 °C. After roughly 10 min, the enclosure was opened to allow the deposition of a thin layer of ice matrix on the tissue by sublimation of water present in the atmosphere. Once a thin layer of ice was formed over the tissue, the enclosure was closed and the relative humidity was again reduced to ~10%.

Two mid-infrared laser pulses at a wavelength of 2940 nm were used to desorb material from the tissue sections. The neutral material desorbed from the tissue partitions into the charged droplets of the electrospray and is ionized in an ESI-like fashion. Quantitative MSI and whole-body MSI were performed using a Q Exactive mass spectrometer (Thermo Scientific, Bremen, Germany) as described by Bokhart et al. [19] and Rosen et al. [20], respectively. Polarity switching IR-MALDESI MSI was performed using a Q Exactive Plus mass spectrometer (Thermo Scientific, Bremen, Germany) as outlined by Nazari and Muddiman [21]. Since IR-MALDESI is a pulsed ionization source, the automatic gain control (AGC) is disabled and ions are stored in the C-trap for a predetermined amount of time denoted by the maximum injection time (IT). Mass ranges, electrospray solvent composition, and injection times varied for each experiment since these need to be optimized based on the goals of each analysis.

The .RAW files generated by the Thermo instruments were first converted to mzML format using the msConvert tool from ProteoWizard [22], and then converted to imzML using the imzML converter [23]. The imzML files were subsequently loaded into MSiReader v1.0 for visualization and analysis.

Results and Discussions

Overview of MSiReader Features

MSiReader was developed because of the need to process high resolving power MSI data sets without compromising the integrity of the data with software-imposed limitations on dynamic range or resolution. Initial efforts were directed at providing an efficient workflow for untargeted analysis using new instrumentation. The first version could simply load an mzXML data set and display a heatmap image of ion abundance for any m/z value and window size chosen by the user. It also provided a tool to plot the spectrum for a single pixel and another to find putative peaks based on a parabolic fitting and thresholding algorithm. Since then, many additional capabilities have been implemented and performance has been greatly improved. Some examples of new capabilities of the software include: saving and reloading an entire session as a MATLAB binary (.MAT) file; peak picking based on differences between reference and interrogated regions of interest; four channel colocalization plots; m/z abundance rank plots; several normalization options; numerous ways to export, save and reuse data; exporting binned abundance data into a continuous m/z range in preparation for multivariate analysis; batch generation of images for each m/z value in a list; extensive customization options via a preferences file; and an application programming interface (API) that eases the task of prototyping new ideas and implementing custom scripts.

The application programming interface provides up to eight special icons on MSiReader’s toolbar that can be associated with user-written MATLAB functions. When one of these icons is clicked, the loaded data cube along with all current parameter settings and handles to all GUI objects are accessible to the user's function via a structure parameter called MSi. For each function there is an option to preserve this data by making it read-only. The user’s code has access to the library of over 275 functions distributed with MSiReader as well as any installed MATLAB toolbox functions. Several simple examples (e.g., the abundance ratio between two m/z values across an image) and documentation of the function library and MSi data structure are included in the v1.0 MSiReader release.

The focus of this paper is on the performance improvements made since the initial release and five new and useful tools for: loading multiple data sets at once, absolute quantification, polarity switching, image overlay, and mass measurement accuracy QA plots.

Loading and Processing of Imaging Data Sets

In addition to the preferred imzML format, MSiReader can load data stored as mzXML, IMG (Analyze 7.5), and Bruker ASCII files. All features of MSiReader are supported for each of the formats, and internally the spectra are stored as sparse, processed data. That is, the spectra for each pixel are stored as an independent sequence of m/z, abundance pairs using double precision floating-point values following the IEEE 754 standard. As each scan is loaded, it can be filtered by any combination of m/z range, abundance threshold, spatial location, or polarity. This can significantly improve performance, especially by removing large numbers of low abundance values from the data cube of a high resolving power instrument. It also allows the user to restrict the spatial and m/z range of data loaded from very large files. Additionally, a smart abundance filter can be selected whereby any consecutive sequence with more than two abundance values below the threshold is replaced by threshold (or zero) values at each end instead of being removed. This improves the appearance of filtered spectral plots by forcing the lines between spectral data points to return to the baseline abundance instead of connecting the two data points above the threshold by a straight line.
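The smart abundance filter described above can be illustrated with a short sketch. This is not MSiReader's MATLAB code; the function name `smart_abundance_filter` and its parameters are hypothetical, and it assumes one reading of the description: every run of consecutive sub-threshold points is collapsed to its two end points, which are set to a fill value, so runs of one or two points keep all their m/z positions.

```python
def smart_abundance_filter(mz, abundance, threshold, fill=0.0):
    """Collapse each run of sub-threshold points to its end points.

    Hypothetical helper: points at or above `threshold` are kept unchanged.
    For each run of consecutive points below `threshold`, only the first and
    last point are kept, with their abundance set to `fill` (zero or the
    threshold itself), so a plotted line returns to the baseline.
    """
    out_mz, out_ab = [], []
    n = len(mz)
    i = 0
    while i < n:
        if abundance[i] >= threshold:
            out_mz.append(mz[i])
            out_ab.append(abundance[i])
            i += 1
            continue
        # Find the last index j of the current sub-threshold run.
        j = i
        while j + 1 < n and abundance[j + 1] < threshold:
            j += 1
        # Keep the first point of the run at the baseline value.
        out_mz.append(mz[i])
        out_ab.append(fill)
        # Keep the last point too, unless the run has a single point.
        if j > i:
            out_mz.append(mz[j])
            out_ab.append(fill)
        i = j + 1
    return out_mz, out_ab


mz = [100.0, 100.1, 100.2, 100.3, 100.4, 100.5, 100.6]
ab = [50.0, 1.0, 2.0, 1.0, 3.0, 80.0, 2.0]
filtered_mz, filtered_ab = smart_abundance_filter(mz, ab, threshold=10.0)
```

In this example the four low-abundance points between the two peaks are reduced to two baseline points, so the plotted trace drops to the baseline between the peaks rather than joining them with a straight line.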

Table 1 shows the dramatic improvement in data set loading time for the supported formats that has been made since MSiReader release 0.03 in July of 2013. Even though load time is dominated by the speed of the file storage medium and the amount of RAM available, significant improvements have been made. All tests were done on the same computer, a Dell Precision Tower 5810 (Xeon E5-1607 v3, 3.1 GHz processor, 64 GB RAM, 256 GB SSD, and 4 TB HD), in a newly started MATLAB 2017a session. All code was stored on the SSD and the data sets were on the hard drive. The times required for various operations in MSiReader were measured using the built-in MATLAB tic and toc functions and displayed in the MATLAB command window.

Table 1 MSiReader Dataset Loading Time Improvement

The performance of other tasks in MSiReader has also been improved even though new features (hence new code) have been added. Table 2 compares several common operations: updating the heatmap display after changing the m/z value, the normalization option, or the window tolerance; exporting raw spectra for the scans in a region of interest (ROI); peak picking based on differences between a reference and an interrogated ROI; and batch export of heatmap images for a list of m/z values into a folder (png file format). The heatmap update and batch export tasks were done using the imzML(2) file from Table 1, and the pixel export and peak picking tasks were done using the imzML(3) file. The 42 scan ROI was the same for both export operations, and that ROI was also the interrogated region for the peak picking operation. As shown in the table, significant improvements have been made for these tasks, greatly enhancing the user experience.

Table 2 MSiReader Operation Time Improvement

Loading Multiple Data Sets

A commonly requested feature that is now available in MSiReader is the ability to load and process more than one data set simultaneously. Previously, the only way to accomplish this was to merge all the files into one file, which could be a tedious and time-consuming process, especially for files with different spatial dimensions. Now MSiReader can load all the imzML or mzXML files in a folder on-the-fly and distribute the images into a tiled mosaic defined by the user. The files need not have the same dimensions and smaller files are padded with empty scans such that all images in a row have the same number of scan rows and all images in a column have the same number of scan columns. Each image is centered in its tile and surrounded by at least one empty row and column. After selecting to load a folder, the user is prompted to define the number of tile rows and columns (Figure 1). The files are then loaded in alphanumeric order by name or they may be given a sequence number suffix. The source of each pixel (file and scan number) is retained throughout all subsequent processing. All ROI, peak picking analysis, export, and visualization tools are supported across the image mosaic. For example, the interrogated and reference ROIs for peak picking may be in different image tiles and thus from different data sets.
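The padding and centering rules above can be sketched as a small layout calculation. The function `mosaic_layout` is hypothetical and is not taken from MSiReader. It assumes that images fill the tile grid in row-major order and that "at least one empty row and column" means a margin of at least one empty scan on every side of each image.

```python
def mosaic_layout(shapes, tile_rows, tile_cols):
    """Compute mosaic size and the top-left scan position of each image.

    shapes: list of (n_scan_rows, n_scan_cols), one per file, in load order.
    Returns ((total_rows, total_cols), [(top, left), ...]).
    Assumptions: row-major tile filling, one-scan minimum margin per side.
    """
    if len(shapes) > tile_rows * tile_cols:
        raise ValueError("more images than tiles")

    # Every tile in a mosaic row shares one height; every tile in a
    # mosaic column shares one width (largest image plus the margins).
    heights = [0] * tile_rows
    widths = [0] * tile_cols
    for k, (rows, cols) in enumerate(shapes):
        tr, tc = divmod(k, tile_cols)
        heights[tr] = max(heights[tr], rows + 2)
        widths[tc] = max(widths[tc], cols + 2)

    row_starts = [sum(heights[:i]) for i in range(tile_rows)]
    col_starts = [sum(widths[:j]) for j in range(tile_cols)]

    # Center each image inside its tile; the rest is empty padding.
    placements = []
    for k, (rows, cols) in enumerate(shapes):
        tr, tc = divmod(k, tile_cols)
        top = row_starts[tr] + (heights[tr] - rows) // 2
        left = col_starts[tc] + (widths[tc] - cols) // 2
        placements.append((top, left))

    return (sum(heights), sum(widths)), placements


size, placements = mosaic_layout([(4, 6), (2, 2), (3, 5)], tile_rows=2, tile_cols=2)
```

Keeping the placement offsets alongside the original file order is one simple way to retain the source file and scan number of every mosaic pixel, as the text requires.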

Figure 1

Loading multiple imaging data sets. The images presented are for cholesterol [M+H+–H2O]+ in 12 sagittal mouse brain tissue sections

Absolute Quantification in MSiReader for Mass Spectrometry Imaging

Quantification in mass spectrometry imaging is a challenging experiment requiring consideration of many factors that may influence ion abundance during the experiment. This requires the incorporation of a calibration curve and may involve the incorporation of a normalization compound. Visualization and analysis software must support common and advanced data analysis methods. Most MSI software packages allow normalization (ratio) of images to a specific compound or total ion current (TIC), thereby accounting for per-pixel ionization efficiencies. This normalization step has proven itself to be essential for quantitative mass spectrometry imaging [19].

Incorporation of a spatial calibration curve is essential to any quantitative MSI experiment to relate ion abundance to an absolute amount of a compound. Integration of standards with tissues has been performed in various ways [24] but all require the selection of a ROI for each calibration point in addition to the tissue area to be quantified. The newly introduced MSiQuantification tool (Figure 2) in MSiReader allows the user to select up to 10 calibration ROIs and define the concentration of each. Once the tissue ROI and at least three calibration points have been selected, the user can generate a linear least squares regression for the calibration curve and calculate the concentration of analyte across the selected tissue ROI. After the calibration curve has been defined, the tissue image may be represented as absolute concentration based on the calibration curve. To preserve data analyses and allow additional analyses, the quantification process parameters and ROIs can be saved into a binary file and reloaded later to repeat the analysis using the same exact parameters.
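The calibration step can be sketched as an ordinary linear least-squares fit followed by inversion of the fitted line. The names `fit_calibration` and `to_amount` are hypothetical, and the sketch assumes that each calibration ROI has already been reduced to a single response value (for example, a mean normalized abundance); how MSiReader summarizes an ROI is not specified here.

```python
def fit_calibration(amounts, responses):
    """Least-squares fit of response = slope * amount + intercept.

    amounts:   known amount or concentration for each calibration ROI.
    responses: one summary abundance per calibration ROI (assumed input).
    At least three calibration points are required, as in the text.
    """
    n = len(amounts)
    if n != len(responses):
        raise ValueError("amounts and responses must have the same length")
    if n < 3:
        raise ValueError("at least three calibration points are required")

    mean_x = sum(amounts) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, responses))
    if sxx == 0:
        raise ValueError("calibration amounts must not all be identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_amount(response, slope, intercept):
    """Invert the calibration line to convert a response into an amount."""
    return (response - intercept) / slope


slope, intercept = fit_calibration([0.0, 1.0, 2.0, 4.0], [1.0, 3.0, 5.0, 9.0])
tissue_amount = to_amount(7.0, slope, intercept)
```

Applying `to_amount` to every pixel of the tissue ROI gives an image expressed in absolute units rather than ion abundance.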

Figure 2

MSiQuantification tool for absolute quantification MSI experiments. The tool allows ROIs to be drawn for an area of the tissue to be quantified (a), and up to 10 calibration spots (b). Once the ROIs have been drawn, the concentration of each calibration spot can be input in the MSiQuantification pop-up window (c). The ion images presented are for emtricitabine [M+Na+]+ in a human cervical tissue section. The calibration curve spots of stable isotope labeled emtricitabine [M+Na+]+ were spotted directly on top of the tissue section prior to IR-MALDESI MSI

Polarity Switching Mass Spectrometry Imaging Data Processing

Lipids and metabolites are some of the most commonly analyzed biomolecules using MSI. Alterations in the lipidome and metabolome have been linked to several diseases, including hypertension, diabetes, Alzheimer’s disease, and cancer [25, 26]; therefore, assessing the distribution and abundance of lipids and metabolites in biological tissue specimens is of paramount importance in order to provide insight into the onset and progression of disease, as well as discovery of potential biomarkers. Due to their remarkable structural diversity, lipids and metabolites exhibit preferential ionization efficiencies. Generally, metabolites that are detected in high ion abundance as positive ions are poorly detected (or not detected at all) as negative ions, and vice versa. Therefore, to comprehensively characterize the metabolome, analysis of tissue specimens in both polarities is required. One could perform two MSI experiments, one in each polarity, to obtain extensive metabolome coverage; however, this approach would be costly in terms of time, sample, and materials used. Alternatively, polarity switching MSI can be used to obtain the same biochemical information, while shortening the analysis time and reducing sample consumption by almost half. Indeed, polarity switching MSI has become a valuable analysis mode in recent years [21, 27,28,29,30]. The recent surge in reports of polarity switching MSI calls for new software options for analyzing these datasets.

One of the new features introduced in MSiReader is the ability to parse data using the polarity in which the spectra were acquired. There are different approaches to acquiring polarity switching spectra such as alternating polarities from pixel to pixel [21, 30], from one line to the next [29], or even using a “spiral” step [28] to acquire spectra in positive and negative modes. MSiReader can analyze these patterns while parsing the data and load the desired polarity. Furthermore, it can be used to filter out “equilibration spectra” that might have been collected in between polarity switching events during data acquisition. MSiReader implements four polarity switching patterns: [+––+], [–++–], [+–], and [–+]. For the 4-tuple patterns, either the odd (1,3) or the even (2,4) scans are equilibration scans used to allow the electrospray to stabilize when the polarity is changed from one pixel to the next with no advancement of the sample raster stage. An example of a polarity switching MSI method and equilibration spectra was recently demonstrated by our group [21], and a representative image showing the loading process in MSiReader is shown in Figure 3. Additional patterns for data analysis can be readily incorporated into the software using the existing algorithm for reading the file header and scan polarities. In addition, an optional filter parameter allows the user to select the positive image, the negative image, or both images for subsequent processing. If both polarities are retained, polarity is selected when the MSiPeakfinder or MSiSpectrum tools are launched. The filter and polarity selection can also be used without switching on a file with an arbitrary distribution of polarities, for example, a tiled composite imzML image of multiple tissue samples.
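The scan selection logic for a repeating polarity pattern can be sketched as follows. The function `select_scans` and its arguments are hypothetical and do not reproduce MSiReader's parser. The sketch assumes that scan polarities have already been read from the file as "+" or "-" characters, that the pattern repeats from the first scan, and that positions within the pattern are counted from 1 when deciding which scans are equilibration scans.

```python
def select_scans(polarities, pattern, equilibration=None, keep="both"):
    """Return 0-based indices of the scans retained after filtering.

    polarities:    sequence of "+" / "-" per scan, in acquisition order.
    pattern:       repeating pattern as an ASCII string, e.g. "-++-" or "+-".
    equilibration: "odd", "even", or None; which 1-based positions in the
                   pattern are equilibration scans to be discarded.
    keep:          "+", "-", or "both".
    """
    if equilibration not in (None, "odd", "even"):
        raise ValueError("equilibration must be None, 'odd', or 'even'")
    if keep not in ("+", "-", "both"):
        raise ValueError("keep must be '+', '-', or 'both'")

    kept = []
    for i, polarity in enumerate(polarities):
        position = i % len(pattern) + 1  # 1-based position in the pattern
        if polarity != pattern[position - 1]:
            raise ValueError(f"scan {i} does not follow pattern {pattern!r}")
        if equilibration == "odd" and position % 2 == 1:
            continue  # discard equilibration scan
        if equilibration == "even" and position % 2 == 0:
            continue  # discard equilibration scan
        if keep != "both" and polarity != keep:
            continue  # discard the unwanted polarity
        kept.append(i)
    return kept


scans = list("-++-" * 2)
positive_scans = select_scans(scans, "-++-", equilibration="odd", keep="+")
```

Checking each scan against the declared pattern, rather than trusting the pattern alone, lets a mismatch between the acquisition method and the file be reported instead of silently producing a mixed-polarity image.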

Figure 3

Polarity switching and polarity filtering are implemented for the mzXML and imzML file formats as the data is loaded. In this case, healthy (left) and cancerous (right) hen ovarian tissue sections were analyzed using a polarity pattern of [+––+, odd] and only the (+) polarity scans are retained

Image Overlay Tool

Mass spectrometry imaging allows the spatial distribution of ions to be depicted in a two-dimensional array. The overlap of two complementary imaging techniques allows researchers to gain a deeper insight into the significance of spatial localizations. In combining MSI images with other imaging modalities, the molecular distributions gain meaning in the context of the tissue morphology.

The MSiImage tool has been implemented in MSiReader to import, scale, and align an image (e.g., optical, fluorescence) on an MSiReader heatmap. This includes an adjustable transparency of the image along with resizing, cropping, and rotation. Any image file format supported by MATLAB can be imported, allowing results from two analysis methods to be compared directly. Alignment of classic biology techniques such as staining by hematoxylin and eosin (H&E) or immunohistochemistry with an MSI data set allows for histology-directed molecular analysis of tissues [31]. An example of an image overlay with the MSI data is demonstrated in Figure 4, where the spatial distribution of m/z 367.3363, putatively assigned to desmosterol [M+H+–H2O]+, is displayed on a semitransparent optical image of the whole-body section. In addition, the MSiImage function can be used in combination with the colocalization tool in MSiReader to show the distribution of three distinct ions overlaid with another image.

Figure 4

MSiImage tool for overlaying an optical image with the ion image of putatively assigned desmosterol (m/z 367.3363, [M+H+–H2O]+) obtained in a whole-body IR-MALDESI analysis of a 2-d-old neonatal mouse

Mass Measurement Accuracy (MMA) Heatmap and Histograms

The importance of maintaining high mass measurement accuracy (MMA) throughout MS-based experiments is well-known. This point is even more critical in MSI analyses since there are no chromatographic separations, and deviations in MMA could lead to generation of incorrect ion images and/or identification of incorrect analytes in untargeted studies. In our laboratory, the MMA for an analyte observed in MSI analyses was previously calculated by exporting the centroid m/z value for the peak at each scan over the user-defined ROI (often the tissue section) to an Excel sheet using MSiReader and then calculating the MMA for each scan. Another option could be to use a separate program to export the centroid m/z from the raw file. These processes are laborious and time-consuming, and do not provide information about any change in the MMA as a function of spatial location.

To circumvent these issues and speed up the analysis process, we have introduced new quality assurance features in MSiReader that allow users to generate heatmaps of MMA for the scans within the user-identified ROI or all scans in the file. This feature uses the m/z value input by the user as the “true” value and picks the most abundant peak in the m/z window as the “experimental” value to calculate the MMA at each scan. The MMA for each scan is then plotted in a heatmap, where possible deviations at specific locations within the ROI can be easily observed. Figure 5 demonstrates how this function can be useful for assessing the quality of the data. The MMA of glutathione (m/z 306.0766, [M–H+]–) across a healthy ovarian tissue section was plotted to ensure there were no deviations in the course of MSI analysis. In addition, a histogram of the MMAs, as a function of number of scans or the abundance at each scan, can also be exported to a separate figure (along with the MMA heatmap or separately). This tool can serve as a facile method to check the MMA of the analyte(s) of interest throughout the imaging experiment. If needed, the MMA values, along with the scan number from which they were calculated, can be exported into an Excel sheet for further investigation. By confirming MMA across the image, the heatmap feature supports the generation of correct ion images and the putative identification of unknown metabolites based on their accurate mass in untargeted analyses. In addition, this feature is useful in targeted and quantitative MSI experiments in order to ensure there are no outliers that could potentially undermine the results obtained.
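The per-scan MMA calculation can be sketched as below. The function `mma_ppm` is hypothetical. It assumes the conventional definition of MMA in parts per million, (experimental − theoretical) / theoretical × 10^6, and a symmetric m/z window expressed as a ppm tolerance; neither detail is spelled out in the text above.

```python
def mma_ppm(theoretical_mz, mz, abundance, tol_ppm):
    """Mass measurement accuracy (ppm) for one scan, or None if no peak.

    The most abundant peak inside theoretical_mz +/- tol_ppm is taken as
    the experimental value (the window definition is an assumption).
    """
    half_width = theoretical_mz * tol_ppm * 1e-6
    best_mz, best_abundance = None, None
    for m, a in zip(mz, abundance):
        if abs(m - theoretical_mz) > half_width:
            continue  # peak lies outside the m/z window
        if best_abundance is None or a > best_abundance:
            best_mz, best_abundance = m, a
    if best_mz is None:
        return None
    return (best_mz - theoretical_mz) / theoretical_mz * 1e6


def mma_map(theoretical_mz, scans, tol_ppm):
    """Map scan number -> MMA in ppm for a dict of {scan: (mz, abundance)}."""
    return {
        scan: mma_ppm(theoretical_mz, mz, abundance, tol_ppm)
        for scan, (mz, abundance) in scans.items()
    }


# Illustrative values only, not measured data.
scans = {
    1: ([306.0760, 306.0769, 306.0900], [10.0, 100.0, 1000.0]),
    2: ([306.0900], [500.0]),
}
mma_by_scan = mma_map(306.0766, scans, tol_ppm=2.5)
```

Scans that return `None` have no peak in the window and can be left blank in a heatmap, while the remaining values can be binned directly into a histogram.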

Figure 5

Top: Screenshot of MSiReader interface showing the ion image of glutathione (m/z 306.0766, [M–H+]–) in a healthy hen ovary tissue section analyzed with negative mode IR-MALDESI. The mass measurement accuracy function can be accessed by right-clicking on the image axes. Bottom: Heatmap of MMA in ppm of glutathione over the tissue region (denoted by magenta line on the ion image). Calculated MMA values for all interrogated voxels are presented in histogram form with an overlaid Gaussian fit, demonstrating normality of the MMA distribution. The dashed lines on the histogram demonstrate the ±2.5 ppm tolerance with which the ion image was generated

Conclusions

Since its introduction to the MSI community in 2013, MSiReader has been downloaded and used by over 800 researchers worldwide and has been cited in more than 110 publications. Over the past 4 y, new algorithms have been implemented into the software, new features have been added to enhance the analysis workflows, and significant performance improvements have been made based on the feedback from the MSI community. Some of the major features added to v1.0, such as the MSiQuantification tool, the ability to analyze polarity switching files, image overlay, and MMA heatmaps, were discussed in this manuscript. These features have enabled our group and others in the MSI community to carry out important research projects. For instance, the MSiQuantification tool has simplified the data analysis steps required to obtain absolute quantification of analytes in tissue sections, while the MMA feature provides a necessary, and often overlooked, quality assurance step to ensure instrument stability and high data quality over the course of an imaging experiment. The influence of the user community and our own research objectives continue to provide an opportunity for further development of this useful and free software tool.