Introduction

The 10–100 nm size range is challenging for both detection and characterization. It roughly corresponds to masses of around 1 MDa to 1 GDa for spherical biological samples. Conventional mass spectrometry methods (where the m/z spectrum is measured) are usually limited to masses less than around a megadalton, though there have been a few exceptions for highly homogeneous samples [1]. A heterogeneous sample of high mass species leads to a congested m/z spectrum with overlapping charge state envelopes that are difficult to resolve and assign. One solution to this problem is to use a single particle approach where the mass is determined directly for each ion [2]. In charge detection mass spectrometry (CDMS), for example, the m/z and charge are measured simultaneously for individual ions and then multiplied to give the mass [3,4,5,6,7,8,9,10,11,12,13,14,15,16]. The measurement is repeated for several thousand different ions, and the masses are then binned to yield a true mass spectrum without the need to deconvolve the m/z spectrum.
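The per-ion arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in this work: the function names, the three sample ions, and the 20 kDa bin width are assumptions chosen for the example.

```python
from collections import Counter

def ion_mass_da(mz, charge):
    """Mass in Da from m/z (Da per elementary charge) and charge (e)."""
    return mz * charge

def bin_masses(masses_da, bin_width_da):
    """Histogram mapping each bin's lower edge (Da) to the number of ions in it."""
    counts = Counter()
    for m in masses_da:
        counts[int(m // bin_width_da) * bin_width_da] += 1
    return dict(sorted(counts.items()))

# Hypothetical (m/z, charge) pairs for three individual ions.
ions = [(25_000.0, 160), (25_500.0, 158), (20_000.0, 20)]
masses = [ion_mass_da(mz, z) for mz, z in ions]
histogram = bin_masses(masses, bin_width_da=20_000)
print(histogram)
```

Because each ion contributes one mass value directly, no charge state assignment or deconvolution step is needed before binning.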

In CDMS, the ions are detected by means of a metal cylinder. When an ion enters the cylinder, it induces a charge which is sensed by a low-noise charge sensitive amplifier. The induced charge dissipates when the ion leaves the cylinder. The m/z is obtained from the flight time through the cylinder and the charge is determined from the amplitude of the signal (which is proportional to the induced charge). With a single pass through the detection cylinder, the uncertainty in the charge measurement is typically greater than 50 elementary charges (e) because of electrical noise. This large uncertainty in the charge leads to a large uncertainty in the mass.
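How a charge uncertainty carries over to the mass can be illustrated with standard first-order error propagation for a product of two independent quantities. The sketch below assumes independent errors in m/z and charge; the ion (200 e, m/z known to 1%) is hypothetical, and only the 50 e charge uncertainty is taken from the text.

```python
import math

def relative_mass_uncertainty(mz, sigma_mz, charge, sigma_charge):
    """Relative uncertainty of m = (m/z) * z, assuming independent errors."""
    return math.hypot(sigma_mz / mz, sigma_charge / charge)

# Hypothetical ion carrying 200 e, measured with a 50 e charge uncertainty
# and an m/z that is known to 1 %.
rel = relative_mass_uncertainty(mz=20_000.0, sigma_mz=200.0,
                                charge=200.0, sigma_charge=50.0)
print(f"relative mass uncertainty: {rel:.1%}")
```

In this example, the charge term dominates, so the relative mass uncertainty is essentially the relative charge uncertainty.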

One way to reduce the uncertainty is to place the detection cylinder in an electrostatic linear ion trap (ELIT) so that trapped ions oscillate back and forth through the detection cylinder many times [5]. Ions are trapped for a predetermined time and then the trap is opened. In recent work, the time domain signal from the trapped ion is digitized, transferred to a computer, and analyzed by fast Fourier transforms (FFTs) [10]. The m/z is obtained from the fundamental frequency and the charge is determined from the magnitudes of the fundamental and first harmonic. Not all ions are trapped for the full trapping time, so a series of FFTs must be performed for each trapping event to determine how long each ion was trapped. In current implementations, it takes much longer to analyze the data than to collect it. Most of the data collected during the day is analyzed overnight and interpreted a day after collection.

Because it is necessary to trap single ions, the optimum signal is tightly constrained. If the signal is too low, then most trapping events will be empty. If, on the other hand, the signal is too large, then most of the trapping events will contain multiple ions. Multiple ion trapping events are discarded because the ions can interact, which could lead to their oscillation frequencies being perturbed. To optimize the signal with this configuration, the signal intensity is monitored using a set of microchannel plates (MCPs) located behind the trap. The ion trap must be turned off so that ions pass through and strike the MCPs, so it is necessary to stop the data collection to check the signal. In addition, the relationship between the MCP signal and the number of single ion trapping events is not straightforward because the detection efficiency of the MCPs degrades with time and with increasing m/z [17, 18].

The absence of real-time feedback in CDMS measurements wastes considerable time. On many occasions, hours would be spent collecting a data set under what was thought to be optimum conditions, only to find the following morning that most trapping events were either empty or multiple ion. It became clear that for CDMS to become a valuable technique, it was necessary to develop a method of real-time analysis so that the signal intensity and mass distribution could be monitored during data collection. The development and first implementation of CDMS with real-time analysis are described herein.

Experimental Considerations

The charge detection mass spectrometer has been described previously [10, 19,20,21]. Ions are generated by nanoelectrospray and enter the instrument through a heated metal capillary. Three differentially pumped regions containing an ion funnel, an RF hexapole, and an RF quadrupole separate the ions from the ambient gas flow. Ions that exit the quadrupole are accelerated to a nominal ion energy of 100 eV/z and focused by an einzel lens into a dual hemispherical deflection analyzer (HDA) that transmits ions with kinetic energies within a narrow band centered on 100 eV/z. Ions exiting the HDA are focused into the center of a modified cone trap that contains the detection cylinder. Trapped ions oscillate back and forth through the detection cylinder. The resulting time domain signal is amplified, digitized (16 bit, 2.4 MHz), and temporarily stored on a field programmable gate array (FPGA) before being packaged and sent to a computer for analysis.

The voltages on the endcaps of the modified cone trap are switched between trapping and transmission modes under the control of the FPGA. Two trapping modes are available: continuous and triggered. In continuous trapping mode, the trap is closed without knowing whether or not an ion is in the trap. The trapping sequence starts with both end caps in the transmission mode (i.e., grounded). First, the potential on the back end cap is switched to trapping mode, at which point, ions are reflected back through the trap. After a short delay, the potential on the front end cap is switched to trapping mode. Trapped ions then oscillate back and forth for a predetermined time at which point both end caps are set to ground to open the trap and begin the trapping cycle again. The number of trapped ions is given by a Poisson distribution, and the average signal must be adjusted to maximize the number of single ion trapping events. With continuous trapping, with the optimum average signal, the maximum fraction of single ion trapping events is 37% [16]. If the signal is too small, most of the trapping events will be empty, and if it is too large, most of them will contain multiple ions.
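The 37% limit follows directly from the Poisson distribution: the probability of exactly one ion is λ·exp(−λ), which is largest at a mean of λ = 1 ion per trapping event, where it equals 1/e. A short sketch (function names are illustrative):

```python
import math

def poisson_pmf(k, mean):
    """Probability of observing exactly k ions when the mean is `mean`."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

def event_fractions(mean_ions):
    """Fractions of empty, single ion, and multiple ion trapping events."""
    empty = poisson_pmf(0, mean_ions)
    single = poisson_pmf(1, mean_ions)
    return empty, single, 1.0 - empty - single

for mean in (0.2, 1.0, 3.0):
    empty, single, multiple = event_fractions(mean)
    print(f"mean={mean:.1f}: empty={empty:.1%} single={single:.1%} "
          f"multiple={multiple:.1%}")
```

At a mean of one ion per event, the single ion fraction is 36.8%; it falls off on either side, toward mostly empty events at low signal and mostly multiple ion events at high signal.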

In triggered trapping, the back end cap voltage is switched to trapping mode so that ions are reflected back through the trap. However, the front endcap is held in transmission mode until a signal from an ion entering the detection cylinder exceeds a preset threshold and triggers the FPGA to raise the potential on the front endcap. The threshold must be set high enough to avoid false positives, which restricts this approach to highly charged ions. Ions are trapped for a predetermined time, and then, both end caps are set to transmission mode to empty the trap and restart the trapping sequence. It is also necessary to optimize the signal intensity with triggered trapping. If the signal is too high, the number of multiple ion trapping events increases. With triggered trapping, it is possible to exceed the 37% trapping efficiency of continuous trapping and approach 90% under favorable conditions [16]. With a trapping time of 100 ms, it typically takes around an hour to collect the several thousand ions needed to generate a reasonable mass distribution.

In both triggered and continuous trapping modes, CDMS collection must be paused to allow ions to reach the MCPs to evaluate the signal. In this work, we will show how real-time analysis can be used to eliminate the need for the MCP detector, by using the CDMS signals to monitor and optimize the measurement.

Computational Methods for Analysis of CDMS Data

The CDMS data is analyzed by a Fortran program according to the flow chart shown in Figure 1. When the Fortran program is initiated by the user, it creates output files to store the results. The program then opens the data files with an unformatted read into an integer array.

Figure 1

Flow chart of the Fortran program used to analyze the CDMS data. The steps within the red dashed lines are multithreaded using OpenMP directives in the real-time analysis version of the code

Each file contains one trapping event with short pre-trapping and post-trapping periods that contain the noise picked up when the potentials on the endcaps are switched between transmission and trapping modes. The trapping period can range from 10 ms to beyond 30 s; 100 ms is the most widely used value as it provides a good balance between data collection speed and uncertainty in the charge determination. The array is truncated so that it only includes the trapping event and then zero padded to the next power of two to increase the computational efficiency. For a 100-ms trapping period, this gives arrays of 262,144 points. The array is passed through a high pass filter to eliminate low-frequency noise caused by vibrations from a turbomolecular pump attached to the hexapole region of the instrument. The entire time-domain signal from the high pass filter is Fourier transformed into magnitude mode frequency space using the fast Fourier transform (FFT) subroutine of the Intel Math Kernel Library. The resulting frequency domain spectrum is scanned for peaks, which are defined as magnitudes that rise above six times the root mean square deviation (RMSD) of the noise.
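The padding, transform, and thresholding steps can be sketched in pure Python. This is an illustration under stated assumptions, not the Fortran/MKL implementation: the high pass filter is omitted, a small textbook radix-2 FFT stands in for the library routine, and the noise RMSD is estimated from the whole magnitude spectrum because the text does not say how the noise level is obtained.

```python
import cmath
import math

def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    return 1 << (n - 1).bit_length()

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def find_peaks(magnitudes, factor=6.0):
    """Bins whose magnitude exceeds factor x the RMSD of the spectrum."""
    mean = sum(magnitudes) / len(magnitudes)
    rmsd = math.sqrt(sum((m - mean) ** 2 for m in magnitudes) / len(magnitudes))
    return [i for i, m in enumerate(magnitudes) if m > factor * rmsd]

# A 100 ms event digitized at 2.4 MHz has 240,000 points.
print(next_power_of_two(240_000))

# Toy trace: 200 samples of a sine at 0.125 cycles per sample, zero padded.
trace = [math.sin(2 * math.pi * 0.125 * n) for n in range(200)]
padded = trace + [0.0] * (next_power_of_two(len(trace)) - len(trace))
magnitudes = [abs(c) for c in fft(padded)[: len(padded) // 2]]
peaks = find_peaks(magnitudes)
print(peaks)
```

The first print reproduces the 262,144-point array length quoted above for a 100-ms trapping period. In the toy trace, the oscillation falls in bin 32 of the 256-point transform, which is the only bin that clears the threshold.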

At this stage, each trapping event is identified as a no ion event, single ion event, or multiple ion event. A peak finding algorithm first identifies the peak with the largest magnitude and assigns it to the fundamental frequency of ion oscillation. Trapping events with no peaks are considered no ion events. Trapping events with peaks that do not fall at positions that are harmonics of the fundamental are assigned to multiple ion events. All other events with a fundamental and appropriate harmonics are assigned as single ion events.
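The classification rule can be written as a small function. The sketch below is an illustration only: the function name, the list-based inputs, and the 1% relative tolerance used to decide whether a peak sits at a harmonic are assumptions, since the text does not specify how close a peak must be to a harmonic position.

```python
def classify_event(peak_freqs_hz, peak_magnitudes, rel_tol=0.01):
    """Label a trapping event from the peaks found in its full-event FFT."""
    if not peak_freqs_hz:
        return "no ion"
    # The largest peak is taken as the fundamental oscillation frequency.
    largest = peak_magnitudes.index(max(peak_magnitudes))
    fundamental = peak_freqs_hz[largest]
    for f in peak_freqs_hz:
        ratio = f / fundamental
        harmonic = round(ratio)
        # Any peak that is not near an integer multiple of the fundamental
        # is attributed to a second ion.
        if harmonic < 1 or abs(ratio - harmonic) > rel_tol * harmonic:
            return "multiple ion"
    return "single ion"

print(classify_event([], []))
print(classify_event([10_000.0, 20_000.0, 30_010.0], [50.0, 20.0, 5.0]))
print(classify_event([10_000.0, 13_500.0], [50.0, 40.0]))
```

The three calls print "no ion", "single ion", and "multiple ion", respectively.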

For files that are determined to be multiple ion events, the file information is recorded, and the analysis program moves to the next file. For no ion and single ion events, the event is passed to the FFT windowing stages of the analysis. The no ion events are retained at this point because they may contain the signal from an ion with a relatively low charge that was only trapped for a short period of time. The peak in a full event FFT from such a signal may not rise above the threshold.

In the FFT windowing stage, a small section at the beginning of the time domain signal is transformed into magnitude frequency space and subjected to the peak finding algorithm described above. If no peak is found, then the window size is increased, and the process repeated. Once the minimum window size with a frequency peak six times the noise RMSD is located, the file is sent to final processing. If no peak is found during windowing up to the full event, the file is categorized as a no ion event and the file information is recorded.
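The window search can be sketched as a simple loop. The starting window of 1024 points and the doubling schedule are assumptions (the text states only that the window size is increased), and `has_peak` is a hypothetical callable standing in for the FFT and six-times-RMSD test applied to the windowed segment.

```python
def minimum_window(signal, has_peak, start=1024, growth=2):
    """Shortest leading window in which has_peak(segment) is True, else None."""
    n = len(signal)
    window = min(start, n)
    while True:
        if has_peak(signal[:window]):
            return window
        if window == n:
            # Searched up to the full event without finding a peak.
            return None
        window = min(window * growth, n)

# Stand-in detector for the demonstration: a "peak" appears once the
# window holds at least 3000 points.
signal = [0.0] * 10_000
print(minimum_window(signal, lambda seg: len(seg) >= 3000))
print(minimum_window(signal, lambda seg: False))
```

With the stand-in detector, the search tries 1024, 2048, and then 4096 points and returns 4096. A return value of `None` corresponds to the no ion outcome described above.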

During the final processing stages, the minimum window is incrementally scanned across the time domain signal to determine the oscillation frequency and magnitude at each point. The length of the trapping event and the average m/z and charge are then determined. For routine measurements, only ions that remain trapped for the full event are allowed to contribute to the mass distribution. Information for ions trapped less than the full event is recorded for diagnostic and troubleshooting purposes.

The trapping events are independent, and so, most of the data analysis (the portion inside the red dashed line in Figure 1) can be multithreaded to reduce the total analysis time.
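Because each trapping event can be processed without reference to any other, the per-file work maps naturally onto a pool of workers. The sketch below is a Python analogue of that idea, not the OpenMP implementation: `analyze_event` is a placeholder for the real per-event analysis, and a thread pool is used only to keep the example short (CPU-bound Python code would normally use processes instead).

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_event(samples):
    """Placeholder per-event analysis: peak-to-peak amplitude of the trace."""
    return max(samples) - min(samples)

def analyze_all(events, workers=4):
    """Analyze independent trapping events in parallel, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze_event, events))

events = [[1, 5, 3], [2, 2], [0, -4, 4]]
print(analyze_all(events))
```

No locking is needed in the per-event function because the events share no state; only the collection of results is ordered.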

Realizing CDMS with Real-Time Analysis and Visualization

The workflow prior to the implementation of real-time analysis is shown in Figure 2A. The digitized data for each trapping event is accumulated in the FPGA until the end of the trapping event and then sent to a network card on a Windows PC dedicated to data collection, where it is written to a file as an array of 16 bit integers. Periodically, a batch of files is manually transferred to another computer for analysis using a beta version of the Fortran code described above. The batch analysis computer was a Windows PC with an Intel® Core™ processor (i5-2500K, 4 cores, 3.3 GHz).

Figure 2

(a) The old CDMS workflow, in which 100 ms of collection time took on average around 2 s to analyze. (b) The new workflow with the real-time analysis software implementation. Results are relayed to the user in real time with a typical analysis time of only 0.021 s/file when all cores are used

Analysis typically took around 2 s per file (assuming an optimum mix of empty, single ion, and multiple ion trapping events). Thus, batch analysis of a 1-h long data set would take around 20 h on a single core. Multithreading to use all four cores available on this computer reduced the analysis time to around 5 h. We would often transfer data to additional computers so that the analysis of data collected over more than a few hours could be completed overnight.

The first step in realizing real-time analysis was to upgrade the computer used to analyze the CDMS data to a Linux server (OpenSuse Leap 42.1) with four Intel® Xeon™ processors (E5-4657L v2, 12 core, 2.4 GHz). Data was transferred directly from the FPGA boards to the Linux server so that data collection and analysis were performed on the same system. The Linux server has a lower clock speed than the Windows PC used for batch analysis (2.4 GHz versus 3.3 GHz). However, the Linux server has 48 cores compared to 4 on the Windows PC. Further improvements were obtained by optimizing the Fortran code and by using OpenMP directives. These improvements reduced the average analysis time to around 0.021 s/file, which is approximately a 100× improvement over the Windows PC.
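The timing figures quoted for the two systems can be checked with a few lines of arithmetic. The sketch assumes back-to-back 100 ms trapping events with no dead time and ideal scaling across cores; the variable names are illustrative.

```python
TRAP_TIME_S = 0.100                           # trapping period per event
events_per_hour = round(3600 / TRAP_TIME_S)   # files produced in 1 h

def analysis_hours(seconds_per_file, cores=1):
    """Hours needed to analyze one hour of data at a given per-file cost."""
    return events_per_hour * seconds_per_file / cores / 3600

single_core_h = analysis_hours(2.0)           # Windows PC, one core
four_core_h = analysis_hours(2.0, cores=4)    # Windows PC, four cores
# 0.021 s/file is already the all-core figure, so cores stays at 1 here.
server_min = analysis_hours(0.021) * 60
speedup = 2.0 / 0.021

print(events_per_hour, single_core_h, four_core_h)
print(f"{server_min:.1f} min, {speedup:.0f}x")
```

One hour of data corresponds to 36,000 files, which gives the 20 h and 5 h batch times quoted above. At 0.021 s/file, the same data set is analyzed in about 12.6 min, and because 0.021 s is well below the 0.1 s needed to collect each file, the analysis keeps pace with data collection.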

The 100× speedup allowed CDMS data to be analyzed as it was collected. To accomplish real-time analysis, the Fortran analysis code was modified to analyze files on demand and wait for new files to be written. However, the use of the Linux server created a barrier for many users because widely used graphing utilities and data processing programs are limited to Windows operating systems. To bypass the need for experience with the Linux environment and the command line interfaces common to high-performance computing systems, a better user interface needed to be designed to couple with downstream processing programs and to improve user experience. To achieve this, a real-time analysis graphical user interface (RTA GUI) was created to control all aspects of the CDMS experiment. The RTA GUI is run natively on the Linux system, but a Windows-based X server is used to display the application on a Windows PC through a secure shell (ssh) connection. The RTA GUI allows the user to port control of the CDMS experiment to any computer of their choosing while still maintaining the security of the ssh connection. For added security, the RTA GUI is compatible with two-factor authentication to prevent unauthorized access to the experiment. This new workflow can be seen in Figure 2B. Finally, it is important to note that the Linux server is more easily interfaced with high-performance computing storage resources, allowing for over a 100× speedup in the backup of data to permanent storage.
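The "analyze on demand and wait for new files" behavior can be illustrated with a simple polling loop. This is a sketch of the general pattern only, written in Python rather than Fortran; the function name, the polling interval, and the `max_idle_polls` stop condition are assumptions, and a production version would also need to avoid reading files that are still being written.

```python
import os
import time

def watch_directory(path, handle_file, poll_s=0.05, max_idle_polls=None):
    """Process files as they appear in `path`; return how many were handled.

    Runs until `max_idle_polls` consecutive polls find nothing new
    (forever if max_idle_polls is None).
    """
    seen = set()
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        new = sorted(f for f in os.listdir(path) if f not in seen)
        if new:
            idle = 0
            for name in new:
                handle_file(os.path.join(path, name))
                seen.add(name)
        else:
            idle += 1
            time.sleep(poll_s)
    return len(seen)

# Example use (hypothetical directory name):
# watch_directory("/data/cdms/run_001", print, max_idle_polls=100)
```

Keeping a set of already processed names lets the loop pick up where it left off whenever new trapping events are written.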

The three display windows of the RTA GUI are shown in Figure 3. Figure 3A shows the main control panel where the user inputs parameters to control the operation of the ion trap. Figure 3B shows a representative snapshot of the real-time output from the analysis program. Each line represents a single input file (i.e., a single trapping event). The first entry is the filename. Empty trapping events are indicated by a zero, and multiple ion events are indicated by “MULTIPLE ION EVENT”. For single ion events, the m/z, charge, mass, and trapping time are output. In this run, a trapping time of 96.8 ms indicates that the ion was trapped for the full period. The total trapping time was 100 ms but a small section of the time domain signal is discarded to allow the preamplifier to recover from the end cap potentials being switched from transmission to trapping mode. Figure 3C shows a snapshot of a real-time histogram of the data accumulated to this point. The m/z or mass histogram can be displayed. The user controls the upper and lower limits and the bin size. In the lower left-hand corner of this window, there is a small panel where the numbers of empty, single, and multiple ion trapping events are tracked, along with the single ion trapping efficiency, which for continuous trapping is obtained from the number of single ion events divided by the total. As noted above, the maximum single ion trapping efficiency that can be achieved for ions that arrive at random times is 37%.
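The event counters and the single ion trapping efficiency can be derived from the output lines with a short tally. The exact layout of the output is not specified beyond the description above, so the whitespace-separated format, the file names, and the numerical values in this sketch are assumptions.

```python
def tally(lines):
    """Count empty, single, and multiple ion events; return counts and efficiency."""
    counts = {"empty": 0, "single": 0, "multiple": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # First entry is the filename; the remainder describes the event.
        _filename, _, rest = line.partition(" ")
        rest = rest.strip()
        if rest == "0":
            counts["empty"] += 1
        elif rest == "MULTIPLE ION EVENT":
            counts["multiple"] += 1
        else:
            counts["single"] += 1
    total = sum(counts.values())
    efficiency = counts["single"] / total if total else 0.0
    return counts, efficiency

# Hypothetical output lines in the assumed format.
output = [
    "event_0001.dat 0",
    "event_0002.dat 25000.0 160.0 4000000.0 96.8",
    "event_0003.dat MULTIPLE ION EVENT",
    "event_0004.dat 0",
]
counts, efficiency = tally(output)
print(counts, efficiency)
```

For these four lines, the tally gives two empty events, one single ion event, and one multiple ion event, for a single ion trapping efficiency of 25%.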

Figure 3

Screenshots of windows displayed by the graphical user interface for real-time analysis. (a) A screenshot of the main control panel where input parameters to control the operation of the ion trap are entered. (b) A representative screenshot of the output from the real-time analysis program. Each line represents an input file. The first entry is the filename. Empty trapping events are indicated by a zero, and multiple ion events are indicated by “MULTIPLE ION EVENT”. For single ion events, the m/z, charge, mass, and trapping time are output. In this run, a trapping time of 96.8 ms indicates that the ion was trapped for the full period. (c) A screenshot of a real-time histogram of the data accumulated to this point. The m/z or mass histograms can be displayed. In the lower left hand corner, there is a small panel where the number of empty, single, and multiple ion trapping events are tracked, along with the trapping efficiency which is defined as the number of single ion events divided by the total

Benefits of Real-Time Data Analysis and Visualization

There are two main benefits of real-time analysis. First, it allows the user to optimize the number of single ion trapping events so that the time spent measuring the spectrum is minimized. Second, it allows the user to see the mass distribution as it is being accumulated and thus facilitate the optimization of the measurement to maximize the information content. An illustration of maximizing information content is shown in Figure 4. Figure 4A shows the mass distribution measured for the assembly of the hepatitis B virus (HBV) capsid. The spectrum contains 5737 ions from 15,999 trapping events recorded over a period of 26.7 min. There are a large number of low mass species (< 500 kDa) and a small peak at around 4.1 MDa, close to the expected mass for the HBV Cp149 T = 4 capsid [11, 12]. In Figure 4A, most of the detected ions have masses less than 500 kDa, and most of the 26.7 min used to collect the data was spent collecting the low mass ions. If the user is not interested in the low mass species, a large fraction of the measurement time is wasted. CDMS is a single particle technique, so time spent trapping and analyzing the low mass ions cannot be used to trap and analyze the high mass ions. To avoid wasting time on regions of the spectrum that are uninteresting to the user, it is possible to pre-filter the ions so that only those ions with masses in the region of interest are measured. An example of pre-filtering is described below.

Figure 4

(a) CDMS mass spectrum of an HBV assembly reaction where the whole spectrum is measured. (b) CDMS spectrum of an HBV assembly reaction where the frequency of the RF quadrupole has been lowered to cut out the low mass ions that dominate the spectrum in (a). The spectrum in (a) contains 5737 ions from 15,999 trapping events collected over 26.7 min. The spectrum in (b) contains 5207 ions from 16,124 trapping events collected over 26.8 min. Both spectra are plotted with 20 kDa bins. The quadrupole RF frequency was 350 kHz for (a) and 120 kHz for (b)

An RF-only quadrupole acts as a high pass filter, and the lowest m/z transmitted depends on the RF frequency. For Figure 4A, the RF-only quadrupole that precedes the energy analyzer in the CDMS instrument was set to 350 kHz, which is high enough to transmit the Cp149 capsid protein dimer (sequence mass 33,540 Da). For Figure 4B, the RF frequency was lowered to 120 kHz so that most of the ions with masses less than 500 kDa are no longer transmitted by the RF-only quadrupole.

Consequently, most of the trapped ions have masses greater than 400 kDa and most of the time is now spent trapping the higher mass ions. Note that the RF-only quadrupole has an m/z cut-off, not a mass cut-off, and this is why the mass cut-off in Figure 4B is not sharp. With the larger number of counts for high mass ions in Figure 4B, it is now possible to identify a low intensity peak with a mass around 3.1 MDa. This peak was not evident in Figure 4A. It is also clear that the peak at around 4.1 MDa has a high mass tail that extends up to around 4.5 MDa. The high mass tail is largely due to overgrown capsids, that is, capsids that have acquired more than the 120 dimers needed to form an icosahedral T = 4 capsid during the initial assembly reaction. Overgrown capsids slowly anneal into icosahedral capsids [12].
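The masses quoted for the T = 4 capsid can be related to the dimer sequence mass with simple arithmetic. The sketch below uses only the 33,540 Da dimer mass, the 120 dimers per T = 4 capsid, and the approximate 4.5 MDa upper end of the tail given in the text. The conversion of the tail into a number of extra dimers is a rough estimate that assumes the excess mass comes entirely from additional dimers.

```python
DIMER_MASS_DA = 33_540      # Cp149 dimer sequence mass quoted in the text
T4_DIMERS = 120             # dimers in an icosahedral T = 4 capsid
TAIL_END_DA = 4.5e6         # approximate upper end of the high mass tail

t4_mass_da = T4_DIMERS * DIMER_MASS_DA
t4_mass_mda = t4_mass_da / 1e6
extra_dimers = (TAIL_END_DA - t4_mass_da) / DIMER_MASS_DA

print(f"T = 4 sequence mass: {t4_mass_mda:.3f} MDa")
print(f"extra dimers at the end of the tail: about {extra_dimers:.0f}")
```

The sequence mass of 120 dimers is about 4.02 MDa, consistent with the peak observed at around 4.1 MDa, and the tail extending to around 4.5 MDa corresponds to roughly 14 additional dimers under the stated assumption.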

As noted above, one of the advantages of real-time analysis is that the signal can be optimized to maximize the fraction of single ion trapping events, substantially reducing the time required to collect the spectrum. A variety of approaches could be used to regulate the signal intensity. In this work, the signal intensity was controlled by adjusting the size of an aperture placed at the ion beam focus just before the ion trap. The adjustment was accomplished using a thin metal disk attached to a precision rotary positioner. The disk had a series of apertures with different diameters on the same radius so that when the disk was rotated, different sized apertures could be brought into the ion beam focus. A smaller aperture was used to decrease the signal and a larger aperture to increase it. The aperture size could be controlled manually or automatically (i.e., under computer control). In both cases, if the number of multiple ion trapping events increased at the expense of single ion trapping events, then a switch to a smaller aperture would occur in order to reduce the signal and maximize the number of single ion trapping events. Real-time optimization of the signal intensity makes it possible to perform CDMS measurements at close to the maximum efficiency, substantially reducing the time required to collect a spectrum.
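One way such automatic control could be arranged is sketched below. It is a hypothetical rule, not the control logic used in this work: it assumes continuous trapping with Poisson-distributed ion counts, estimates the mean number of ions per event from the fraction of empty events, and steps to a smaller or larger aperture when that mean drifts outside an arbitrary ±20% band around one ion per event. Apertures are assumed to be indexed from smallest (0) to largest.

```python
import math

def estimate_mean_ions(n_empty, n_total):
    """Poisson mean from the empty fraction: P(0) = exp(-mean)."""
    if n_empty == 0:
        return math.inf
    return -math.log(n_empty / n_total)

def next_aperture_index(current, n_apertures, n_empty, n_total, band=0.2):
    """Choose the aperture for the next block of trapping events."""
    mean = estimate_mean_ions(n_empty, n_total)
    if mean > 1.0 + band and current > 0:
        return current - 1          # too many ions: smaller aperture
    if mean < 1.0 - band and current < n_apertures - 1:
        return current + 1          # too few ions: larger aperture
    return current

# Hypothetical tallies from the last 1000 trapping events, 5 apertures.
print(next_aperture_index(2, 5, n_empty=50, n_total=1000))
print(next_aperture_index(2, 5, n_empty=900, n_total=1000))
print(next_aperture_index(2, 5, n_empty=368, n_total=1000))
```

The three calls step down to aperture 1, step up to aperture 3, and stay at aperture 2, respectively. The empty fraction is used here because it is unaffected by how multiple ion events are classified; the single and multiple ion counts reported by the real-time analysis could be used in a similar way.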

Conclusions

The implementation of CDMS with real-time analysis involved overcoming a number of significant engineering hurdles. While it was necessary to use a high-performance computing approach to streamline the analysis of CDMS data, we have maintained a user-friendly interface that can be mastered by operators with little to no technical computing expertise.

Real-time analysis eliminates the time lag between data collection and visualization. While real-time visualization of experimental data is valuable for all experimental approaches, it is particularly valuable for CDMS because it is a single particle method where it is necessary to keep the signal intensity within a narrow range in order to maximize the number of single ion trapping events. With real-time analysis, the signal intensity can be adjusted manually or automatically to maintain its optimum value, minimizing the time required to collect a spectrum. This signal adjustment can be done during data collection completely independent of MCPs.

In addition to minimizing the data collection time, real-time analysis allows the user to optimize the experimental conditions to maximize the information content. Again, this is a consequence of the single particle nature of CDMS measurements, where time spent analyzing ions that are not relevant to the scientific question at hand is wasted time that reduces the bandwidth available for relevant measurements.