Comment on: “An Efficient Method to Calculate the Aggregated Isotopic Distribution and Exact Center-Masses” by Jürgen Claesen, Piotr Dittwald, Tomasz Burzykowski, Dirk Valkenborg, J. Am. Soc. Mass Spectrom. 2012, 23, 753–763

  • Sebastian Böcker
Comment and Reply

Key words

Computational method, Isotopic distribution, Isotope pattern, Accurate mass, Mass spectrometry, Proteomics, Metabolomics

Claesen et al. [1] recently presented an efficient method for computing the isotope pattern of a molecule, that is, both the isotope distribution and the center masses (also called “probability-weighted masses” and “aggregated isotopic variants” in [1]) of the isotope peaks. They compare their approach favorably against five other methods for simulating isotope patterns: their tool BRAIN is more accurate than any of the other tools, and computation times are well below 1 s even for huge molecules of mass above 533403 Da and when computing 1325 peak masses.

Unfortunately, the authors fail to mention SIRIUS [2], which is also capable of performing such calculations. The user interface of SIRIUS does not offer a way to input a molecular formula, but the method is accessible through the source code, which is freely available as open source at http://bio.informatik.uni-jena.de/Sirius. The mathematical details of the method were introduced in 2006 [3]. SIRIUS computations are based on the convolution of isotope distributions: in [3] it is proven that center masses, called “mean peak masses” in [2, 3], can also be computed via such convolutions. The same approach was independently, and somewhat informally, suggested in 2006 by Rockwood and Haimi [4]. Combining this with a smart Russian multiplication scheme allows us to quickly determine, for every element, the isotope pattern of all atoms of that element; these per-element patterns are then convolved to obtain the final isotope pattern of the molecule. The methods implemented in SIRIUS, including the decomposition of monoisotopic masses, are also available via the Bioconductor package “Rdisop” written by Pervukhin and Neumann, http://bioconductor.org/packages/release/bioc/html/Rdisop.html
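
The convolution idea can be illustrated with a minimal Python sketch. This is not the SIRIUS source code: the function names (`convolve`, `power`, `isotope_pattern`), the data layout, and the rounded isotope masses and abundances in `ELEMENTS` are assumptions made for illustration only and should be replaced by the isotope table actually in use. Each pattern stores, per peak, the probability and the mean peak mass; the self-convolution of an element pattern is computed by repeated squaring, which is one way to realize a Russian multiplication scheme.

```python
from typing import Dict, List, Tuple

# A pattern is a list of (probability, mean peak mass) pairs; entry k is the
# peak k nominal mass units above the lightest (monoisotopic) peak.
Pattern = List[Tuple[float, float]]

# Assumed, rounded isotope data for illustration only (abundance, mass in Da).
# Sulfur has no isotope at +3, so that slot carries probability zero.
ELEMENTS: Dict[str, Pattern] = {
    "C": [(0.9893, 12.0), (0.0107, 13.003355)],
    "H": [(0.999885, 1.007825), (0.000115, 2.014102)],
    "N": [(0.99632, 14.003074), (0.00368, 15.000109)],
    "O": [(0.99757, 15.994915), (0.00038, 16.999132), (0.00205, 17.999161)],
    "S": [(0.9493, 31.972071), (0.0076, 32.971459), (0.0429, 33.967867),
          (0.0, 0.0), (0.0002, 35.967081)],
}


def convolve(a: Pattern, b: Pattern, max_peaks: int) -> Pattern:
    """Convolve two patterns, keeping only the first max_peaks peaks."""
    out: Pattern = []
    for k in range(min(max_peaks, len(a) + len(b) - 1)):
        prob = 0.0
        weighted_mass = 0.0
        for i in range(max(0, k - len(b) + 1), min(k, len(a) - 1) + 1):
            pa, ma = a[i]
            pb, mb = b[k - i]
            w = pa * pb
            prob += w
            weighted_mass += w * (ma + mb)  # masses add, weighted by probability
        out.append((prob, weighted_mass / prob if prob > 0.0 else 0.0))
    return out


def power(pattern: Pattern, n: int, max_peaks: int) -> Pattern:
    """n-fold self-convolution by repeated squaring (Russian multiplication)."""
    result: Pattern = [(1.0, 0.0)]  # neutral element of the convolution
    base = pattern
    while n > 0:
        if n & 1:
            result = convolve(result, base, max_peaks)
        n >>= 1
        if n:
            base = convolve(base, base, max_peaks)
    return result


def isotope_pattern(formula: Dict[str, int], max_peaks: int = 10) -> Pattern:
    """Convolve the per-element patterns into the pattern of the molecule."""
    result: Pattern = [(1.0, 0.0)]
    for element, count in formula.items():
        element_pattern = power(ELEMENTS[element], count, max_peaks)
        result = convolve(result, element_pattern, max_peaks)
    return result


if __name__ == "__main__":
    for prob, mass in isotope_pattern({"C": 6, "H": 12, "O": 6}, max_peaks=4):
        print(f"{mass:.6f}  {prob:.6f}")
```

Truncating to `max_peaks` does not change the retained peaks, because peak k depends only on peaks with index at most k in both operands. With repeated squaring, an element with n atoms needs on the order of log n convolutions instead of n.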

By default, SIRIUS uses the masses and abundances of isotopes from the AME2003 tables [5, 6] and abundances from [7]. For the evaluations in this paper, we have instead used masses and abundances from the IUPAC 1997 standard [8], as was done by Claesen et al. We have evaluated SIRIUS on the same set of molecules [9], see Table 1 and Table 2 in [1]. For a fair comparison, we chose the number of computed center masses (mean peak masses) to be identical to that used in [1]. We found that the theoretical average masses of some molecules differ slightly from those reported in [1], with a mass difference of up to 0.000002 Da. To check this, we repeated all calculations of theoretical average masses with arbitrarily high precision (BigDecimal type in Java) but ended up with the same results as reported in Table 1. Also, molecule (7) (Human Na/K ATPase, Renal isoform, subunit) is missing 40 sulfur atoms in Table 2 of [1]; compare Table 3 in [9].
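
The theoretical average mass of a molecule is the sum, over all elements, of the atom count times the abundance-weighted mean isotope mass. The following minimal sketch computes it with Python's `decimal` module, which here plays the role that BigDecimal plays in Java. The function name `average_mass` and the rounded isotope values in `ISOTOPES` are assumptions for illustration and are not the IUPAC 1997 table used for Table 1, so the last digits may differ from the tabulated values.

```python
from decimal import Decimal, getcontext
from typing import Dict

getcontext().prec = 50  # 50 significant digits, far beyond double precision

# Assumed, rounded (mass in Da, abundance) pairs, given as strings so that no
# binary floating-point rounding enters the computation.
ISOTOPES = {
    "C": [("12", "0.9893"), ("13.003355", "0.0107")],
    "H": [("1.007825", "0.999885"), ("2.014102", "0.000115")],
    "N": [("14.003074", "0.99632"), ("15.000109", "0.00368")],
    "O": [("15.994915", "0.99757"), ("16.999132", "0.00038"),
          ("17.999161", "0.00205")],
    "S": [("31.972071", "0.9493"), ("32.971459", "0.0076"),
          ("33.967867", "0.0429"), ("35.967081", "0.0002")],
}


def average_mass(formula: Dict[str, int]) -> Decimal:
    """Sum of atom count times abundance-weighted mean isotope mass."""
    total = Decimal(0)
    for element, count in formula.items():
        mean = sum((Decimal(mass) * Decimal(abundance)
                    for mass, abundance in ISOTOPES[element]), Decimal(0))
        total += count * mean
    return total


if __name__ == "__main__":
    # Molecule (1) of Table 1; the result should be close to the listed value.
    print(average_mass({"C": 50, "H": 71, "N": 13, "O": 12}))
```

Because every product and sum is carried out in decimal arithmetic with 50 digits, the result reflects only the precision of the input table, not rounding in the computation.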
Table 1. Molecular Formulas, Average Masses Computed by SIRIUS, and Running Time of SIRIUS and BRAIN. Average Mass and Mass Delta in Dalton. “No. Peaks” is the Number of Center Masses (Mean Peak Masses) Computed by the Two Methods. Running Times in Milliseconds. Running Times for BRAIN Taken from [1]. “R-BRAIN” is the Running Time of the R Implementation of BRAIN

| No. | Molecular formula | Average mass, SIRIUS | Average mass, theoretical | No. peaks | Running time (ms), SIRIUS | Running time (ms), BRAIN | Running time (ms), R-BRAIN |
|---|---|---|---|---|---|---|---|
| (1) | C50H71N13O12 | 1046.181107 | 1046.181107 | 50 | 5.9 | 37.5 | 18.2 |
| (2) | C254H377N65O75S6 | 5733.510759 | 5733.510759 | 50 | 9.4 | 37.0 | 17.1 |
| (3) | C520H817N139O147S8 | 11624.448751 | 11624.448751 | 50 | 14.8 | 37.6 | 17.7 |
| (4) | C744H1224N210O222S5 | 16823.321352 | 16823.321352 | 100 | 11.3 | 37.0 | 32.6 |
| (5) | C2023H3208N524O619S20 | 45415.679370 | 45415.679370 | 322 | 41.4 | 72.3 | 114.5 |
| (6) | C2934H4615N781O897S39 | 66432.455560 | 66432.455560 | 400 | 62.7 | 75.4 | 146.8 |
| (7) | C5047H8014N1338O1495S48 | 112895.125932 | 112895.125932 | 643 | 125.1 | 156.0 | 280.9 |
| (8) | C8574H13378N2092O2392S77 | 186506.052593 | 186506.052593 | 807 | 164.0 | 216.8 | 388.5 |
| (9) | C17600H26474N4752O5486S197 | 398722.972482 | 398722.972482 | 1163 | 312.4 | 355.7 | 661.1 |
| (10) | C23832H37816N6528O7031S170 | 533735.214649 | 533735.214649 | 1325 | 400.7 | 408.6 | 791.6 |

The accuracy of SIRIUS is practically identical to that of BRAIN: since SIRIUS also uses exact monoisotopic masses, the mass difference between the calculated and the theoretical monoisotopic peak is zero; compare Table 3 in [1]. For the average mass, the mass computed by SIRIUS (by taking the weighted sum over all masses of the isotope pattern) and the theoretical average mass are again identical; see Table 1 and compare Table 4 in [1]. Finally, we also compare the running times of SIRIUS and BRAIN: we report running times from [1] (Table 5), where BRAIN is implemented in Matlab and run on an Intel Core 2 Duo processor with 2.26 GHz and 4 GB RAM. SIRIUS was run on a MacBook Pro with an Intel Core 2 Duo processor at 2.66 GHz and 4 GB RAM, using the Java virtual machine version 1.6.0. One can see that running times are very similar. We also evaluated the R implementation of BRAIN (again on the MacBook Pro) that is available as a Bioconductor package, http://bioconductor.org/packages/devel/bioc/html/BRAIN.html.

Masses in Table 1 have been rounded to six decimal places, and it appears that this is also true for all tables in [1]. In fact, there is a slight mass error for the average mass, which was well below 0.002 ppb (parts per billion) for all 10 molecules. We stress that a certain error is inevitable when computations are carried out using machine numbers, due to the accumulation of rounding errors. When even higher mass accuracies are needed, other data types such as the BigDecimal type in Java can be used, at the expense of increased running times. But this appears to be a wasteful undertaking, given that masses and, in particular, abundances of isotopes are known only with rather limited precision.
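
To put the ppb figures in perspective, the following small sketch converts between absolute and relative mass errors and shows how rounding errors accumulate when summing machine numbers. The helper names `ppb_error`, `mass_tolerance`, and `naive_sum` are hypothetical and introduced only for this illustration.

```python
import math
from typing import Iterable


def ppb_error(computed: float, reference: float) -> float:
    """Relative deviation of computed from reference, in parts per billion."""
    return abs(computed - reference) / reference * 1e9


def mass_tolerance(reference: float, ppb: float) -> float:
    """Absolute mass deviation (same unit as reference) for a ppb bound."""
    return reference * ppb * 1e-9


def naive_sum(values: Iterable[float]) -> float:
    """Left-to-right floating-point summation; rounding errors accumulate."""
    total = 0.0
    for v in values:
        total += v
    return total


if __name__ == "__main__":
    # For molecule (10) of Table 1, a bound of 0.002 ppb corresponds to an
    # absolute deviation of roughly one millionth of a Dalton.
    print(mass_tolerance(533735.214649, 0.002))
    # Ten additions of 0.1 already miss 1.0 in double precision, whereas
    # math.fsum tracks the lost low-order bits and returns 1.0.
    print(naive_sum([0.1] * 10), math.fsum([0.1] * 10))
```

The second part is only a toy case, but the same effect applies to the many multiplications and additions in a convolution over hundreds of peaks.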

The convolution method implemented in SIRIUS [2, 3] is easy to understand and straightforward to implement. Also, this method is very fast when calculations are limited to only a few (say, 10) peaks. This is important when many isotope patterns have to be simulated, for example in the SIRIUS pipeline where, for each decomposition of the monoisotopic mass, an isotope pattern is simulated and compared against the measured isotope pattern [2]. In this way, SIRIUS can decompose a monoisotopic mass, simulate isotope patterns for about 1000 molecular formulas, and match them against the measured pattern in less than a second [2]. On the other hand, the mathematically more involved method of Claesen et al. is possibly faster for very large molecules such as the human dynein heavy chain. We note that SIRIUS is implemented in Java and, hence, compiled to Java bytecode that is just-in-time compiled at run time, whereas BRAIN is implemented in R and, hence, interpreted. Hence, it is likely that a constant-factor improvement in running time can be reached by implementing BRAIN in a compiled language. Conversely, SIRIUS has not been designed to compute isotope patterns of molecules this large, so it is likely that its running times can be further improved if this is required.
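
The simulate-and-compare step can be sketched as follows. This is a simplified stand-in and not the SIRIUS pipeline: the candidate formulas are supplied by hand rather than obtained by decomposing a monoisotopic mass, the sum of squared intensity differences in `score` replaces whatever scoring SIRIUS actually uses, and the rounded abundances in `ABUNDANCES` as well as all function names are assumptions for illustration.

```python
from typing import Dict, List

# Assumed, rounded isotope abundances; entry k belongs to the isotope k
# nominal mass units above the lightest isotope of the element.
ABUNDANCES: Dict[str, List[float]] = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99632, 0.00368],
    "O": [0.99757, 0.00038, 0.00205],
}


def convolve(a: List[float], b: List[float], max_peaks: int) -> List[float]:
    """Convolve two distributions, keeping only the first max_peaks entries."""
    out = [0.0] * min(max_peaks, len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            if i + j < len(out):
                out[i + j] += pa * pb
    return out


def simulate(formula: Dict[str, int], max_peaks: int = 10) -> List[float]:
    """Isotope distribution truncated to max_peaks, normalized to sum 1."""
    dist = [1.0]
    for element, count in formula.items():
        for _ in range(count):  # plain repeated convolution, fine for small molecules
            dist = convolve(dist, ABUNDANCES[element], max_peaks)
    total = sum(dist)
    return [p / total for p in dist]


def score(simulated: List[float], measured: List[float]) -> float:
    """Sum of squared intensity differences (lower is better); a stand-in score."""
    n = max(len(simulated), len(measured))
    s = simulated + [0.0] * (n - len(simulated))
    m = measured + [0.0] * (n - len(measured))
    return sum((x - y) ** 2 for x, y in zip(s, m))


def rank(candidates: Dict[str, Dict[str, int]], measured: List[float],
         max_peaks: int = 10) -> List[str]:
    """Candidate names ordered from best to worst match."""
    return sorted(candidates,
                  key=lambda name: score(simulate(candidates[name], max_peaks), measured))


if __name__ == "__main__":
    candidates = {
        "C6H12O6": {"C": 6, "H": 12, "O": 6},
        "C9H8O4": {"C": 9, "H": 8, "O": 4},
        "C10H12O3": {"C": 10, "H": 12, "O": 3},
    }
    measured = simulate(candidates["C6H12O6"], max_peaks=4)  # idealized measurement
    print(rank(candidates, measured, max_peaks=4))
```

Since only a handful of peaks is kept, each simulation costs a small, fixed amount of work per atom, which is what makes it feasible to score many candidate formulas per measured pattern.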

In summary, it seems to be up to the user’s preference which method to choose, as both methods reach the same high accuracy and running times are very similar. BRAIN and, hence, also SIRIUS outperform all other methods evaluated in [1] (namely, Emass, Mercury, NeutronCluster, IsoPro, and IsoDalton) with respect to accuracy and sometimes even running time; see [1] for details.

Acknowledgments

All computations were carried out by Franziska Hufsky.

References

  1. Claesen, J., Dittwald, P., Burzykowski, T., Valkenborg, D.: An efficient method to calculate the aggregated isotopic distribution and exact center-masses. J. Am. Soc. Mass Spectrom. 23, 753–763 (2012)
  2. Böcker, S., Letzel, M., Lipták, Zs., Pervukhin, A.: SIRIUS: decomposing isotope patterns for metabolite identification. Bioinformatics 25(2), 218–224 (2009)
  3. Böcker, S., Letzel, M., Lipták, Zs., Pervukhin, A.: Decomposing metabolomic isotope patterns. In: Proc. of Workshop on Algorithms in Bioinformatics (WABI 2006), vol. 4175 of Lect. Notes Comput. Sci., pp. 12–23. Springer, Berlin (2006)
  4. Rockwood, A.L., Haimi, P.: Efficient calculation of accurate masses of isotopic peaks. J. Am. Soc. Mass Spectrom. 17(3), 415–419 (2006)
  5. Audi, G., Wapstra, A., Thibault, C.: The AME2003 atomic mass evaluation (II): tables, graphs, and references. Nucl. Phys. A 729, 129–336 (2003)
  6. Wieser, M.E.: Atomic weights of the elements 2005 (IUPAC Technical Report). Pure Appl. Chem. 78(11), 2051–2066 (2006)
  7. de Laeter, J.R., Böhlke, J.K., Biévre, P.D., Hidaka, H., Peiser, H.S., Rosman, K.J.R., Taylor, P.D.P.: Atomic weights of the elements. Review 2000 (IUPAC Technical Report). Pure Appl. Chem. 75(6), 683–800 (2003)
  8. Rosman, K., Taylor, P.: Isotopic compositions of the elements 1997. Pure Appl. Chem. 70(1), 217–235 (1998)
  9. Olson, M.T., Yergey, A.L.: Calculation of the isotope cluster for polypeptides by probability grouping. J. Am. Soc. Mass Spectrom. 20(2), 295–302 (2009)

Copyright information

© American Society for Mass Spectrometry 2012

Authors and Affiliations

  1. Friedrich-Schiller-University, Jena, Germany
