Rule Learning for Disease-Specific Biomarker Discovery from Clinical Proteomic Mass Spectra

  • Vanathi Gopalakrishnan
  • Philip Ganchev
  • Srikanth Ranganathan
  • Robert Bowser
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3916)


A major goal of clinical proteomics is to identify protein biomarkers from mass spectral analyses of readily obtainable samples, such as blood serum, urine, or cerebrospinal fluid, from patient populations. It is hoped that such biomarkers can be used for early detection of disease and examined further for potential therapeutic use. In this paper, we present the process by which we discovered biomarkers that indicate amyotrophic lateral sclerosis, a chronic neurodegenerative disease of motor neurons, by applying rule learning to proteomic mass spectra from cerebrospinal fluid samples. We have implemented a wrapper-based rule learning framework in which the massive number of features that accumulate from mass spectral analyses of clinical samples can be evaluated by repeated invocation of a rule learner. As this case study indicates, our framework facilitates evidence gathering and can speed up disease-specific biomarker discovery from clinical proteomic mass spectra.
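The wrapper idea mentioned in the abstract, scoring candidate feature subsets by repeatedly invoking a learner on them, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the single-threshold rule learner, the majority vote, the leave-one-out scoring, the greedy forward search, and all function names are hypothetical stand-ins, not the rule learner or search procedure used in the paper.

```python
from typing import List, Tuple

# Hypothetical rule form: (feature index, threshold, predict positive when value > threshold)
Rule = Tuple[int, float, bool]


def learn_rule(X: List[List[float]], y: List[bool], feature: int) -> Rule:
    """Toy rule learner: best single-threshold rule for one feature on the training data."""
    values = sorted({row[feature] for row in X})
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best_rule, best_hits = (feature, candidates[0], True), -1
    for t in candidates:
        for above_is_pos in (True, False):
            hits = sum(
                ((row[feature] > t) == above_is_pos) == label
                for row, label in zip(X, y)
            )
            if hits > best_hits:
                best_rule, best_hits = (feature, t, above_is_pos), hits
    return best_rule


def predict(rules: List[Rule], row: List[float]) -> bool:
    """Classify as positive when a strict majority of rules vote positive."""
    votes = sum((row[f] > t) == pos for f, t, pos in rules)
    return votes * 2 > len(rules)


def loo_accuracy(X: List[List[float]], y: List[bool], features: List[int]) -> float:
    """Leave-one-out accuracy of the toy learner restricted to the given features."""
    correct = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        rules = [learn_rule(train_X, train_y, f) for f in features]
        correct += predict(rules, X[i]) == y[i]
    return correct / len(X)


def wrapper_select(X: List[List[float]], y: List[bool], max_features: int = 2):
    """Greedy forward wrapper: add the feature that most improves the learner's score."""
    selected: List[int] = []
    best = 0.0
    remaining = set(range(len(X[0])))
    while remaining and len(selected) < max_features:
        # Each candidate subset is scored by re-running the learner on it.
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in sorted(remaining)]
        acc, f = max(scored, key=lambda s: (s[0], -s[1]))
        if acc <= best:  # stop when no candidate improves the score
            break
        selected.append(f)
        remaining.discard(f)
        best = acc
    return selected, best


if __name__ == "__main__":
    # Made-up intensities for two hypothetical peaks; not data from the study.
    X = [[5.0, 1.0], [1.0, 1.2], [4.0, 0.9], [2.0, 3.1], [5.5, 3.0], [1.5, 2.9]]
    y = [False, False, False, True, True, True]
    print(wrapper_select(X, y))
```

The point of the sketch is only the control flow: the learner is treated as a black box, and the outer loop decides which features to keep based on how well that learner performs when restricted to them.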


Keywords: Amyotrophic Lateral Sclerosis; Rule Learner; Mass Spectral Analysis; Beam Search; Certainty Factor
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Vanathi Gopalakrishnan (1)
  • Philip Ganchev (1)
  • Srikanth Ranganathan (2)
  • Robert Bowser (2)
  1. Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, USA
  2. Department of Pathology, University of Pittsburgh, Pittsburgh, USA
