Analytical and Bioanalytical Chemistry, Volume 397, Issue 1, pp 25–41

The principle of exhaustiveness versus the principle of parsimony: a new approach for the identification of biomarkers from proteomic spot volume datasets based on principal component analysis

  • Emilio Marengo
  • Elisa Robotti
  • Marco Bobba
  • Fabio Gosetti
Paper in Forefront


Biomarker discovery is one of the leading research areas in proteomics. One of the most widely used approaches identifies potential biomarkers from spot volume datasets produced by 2D gel electrophoresis. Problems may arise here because of the large number of spots present in each map and the small number of maps available for each class (control/pathological). Multivariate methods are therefore usually applied together with variable selection procedures to provide a subset of potential candidates. The available variable selection procedures usually pursue the so-called principle of parsimony: they select the smallest set of spots that provides the best classification performance. This approach is not effective in proteomics, since all potential biomarkers must be identified: not only the most discriminating spots, usually related to general responses to inflammatory events, but also the smallest differences and all redundant molecules, i.e. biomarkers showing similar behaviour. The principle of exhaustiveness should be pursued rather than parsimony. To solve this problem, a new ranking and classification method, “Ranking-PCA”, based on principal component analysis and variable selection in forward search, is proposed here for the exhaustive identification of all possible biomarkers. The method is successfully applied to three different proteomic datasets to prove its effectiveness.


A new ranking and classification method, Ranking-PCA, is presented for the identification of pools of potential biomarkers from electrophoretic spot volume datasets. The method represents a new perspective in biomarker identification since it searches for the most exhaustive set of potential candidates rather than the most parsimonious. In this way, all significant candidates can be effectively selected.
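To make the idea of ranking by forward search concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the selection criterion or any stopping rule used by Ranking-PCA, so everything specific here is an assumption: the function names `pc1_separation` and `forward_rank` are hypothetical, the criterion (standardised distance between the two class means along the first principal component) is an illustrative choice, and the data are synthetic. The sketch ranks every variable instead of stopping at the smallest discriminating subset, which is the sense in which it is exhaustive rather than parsimonious.

```python
import numpy as np

def pc1_separation(X, y):
    """Assumed criterion: class separation along the first principal component."""
    Xc = X - X.mean(axis=0)                       # mean-centre each variable
    # The first right singular vector is the first PCA loading vector.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[0]                           # PC1 scores of all samples
    a, b = scores[y == 0], scores[y == 1]
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2) + 1e-12
    return abs(a.mean() - b.mean()) / pooled_sd

def forward_rank(X, y):
    """Rank ALL variables by forward search (no early stop, hence exhaustive)."""
    remaining = list(range(X.shape[1]))
    ranking = []
    while remaining:
        # Add the variable that gives the best separation together with
        # the variables already selected.
        best = max(remaining,
                   key=lambda j: pc1_separation(X[:, ranking + [j]], y))
        ranking.append(best)
        remaining.remove(best)
    return ranking

# Synthetic example: 10 "maps" (5 control, 5 pathological), 6 "spots".
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 6))
y = np.array([0] * 5 + [1] * 5)
X[y == 1, 2] += 10.0   # a discriminating spot
X[y == 1, 4] += 10.0   # a redundant spot with similar behaviour
ranking = forward_rank(X, y)
print(ranking)         # full ordering of all six spots
```

A parsimonious procedure would typically stop once one of the two shifted spots is selected, because the second adds little to the classification. Keeping the full ranking leaves the redundant spot available for inspection, which is the motivation the abstract gives for exhaustiveness.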


Exhaustiveness; Biomarker discovery; Ranking PCA; Variable selection; 2D gel electrophoresis; Classification methods



The authors gratefully acknowledge the collaboration of Prof Pier Giorgio Righetti (Polytechnic of Milan, Italy) and Dr Daniela Cecconi (University of Verona, Italy) who provided the proteomic datasets used in this study.



Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Emilio Marengo (1)
  • Elisa Robotti (1)
  • Marco Bobba (1)
  • Fabio Gosetti (1)
  1. Department of Environmental and Life Sciences, University of Eastern Piedmont, Alessandria, Italy
