The principle of exhaustiveness versus the principle of parsimony: a new approach for the identification of biomarkers from proteomic spot volume datasets based on principal component analysis

  • Paper in Forefront
Analytical and Bioanalytical Chemistry

Abstract

The field of biomarker discovery is one of the leading research areas in proteomics. One of the most widely used approaches for this purpose is the identification of potential biomarkers from spot volume datasets produced by 2D gel electrophoresis. Here, problems may arise from the large number of spots present in each map and the small number of maps available for each class (control/pathological). Multivariate methods are therefore usually applied together with variable selection procedures to provide a subset of potential candidates. The available variable selection procedures usually pursue the so-called principle of parsimony: the smallest set of spots that provides the best classification performance is selected. This approach is not effective in proteomics, since all potential biomarkers must be identified: not only the most discriminating spots, usually related to general responses to inflammatory events, but also the smallest differences and all redundant molecules, i.e. biomarkers showing similar behaviour. The principle of exhaustiveness should therefore be pursued rather than parsimony. To solve this problem, a new ranking and classification method, “Ranking-PCA”, based on principal component analysis and variable selection in forward search, is proposed here for the exhaustive identification of all possible biomarkers. The method is successfully applied to three different proteomic datasets to demonstrate its effectiveness.

A new ranking and classification method, Ranking-PCA, is presented for the identification of pools of potential biomarkers from electrophoretic spot volume datasets. The method represents a new perspective in biomarker identification since it searches for the most exhaustive set of potential candidates rather than the most parsimonious. In this way, all significant candidates can be effectively selected.
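The idea of ranking spots by PCA in a forward search can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the algorithmic details, so the autoscaling step, the use of the first principal component only, the separation criterion (a standardized distance between class means on the scores) and the names `first_pc_scores`, `separation` and `rank_spots` are all assumptions made for illustration. What the sketch does share with the description above is that every spot is ranked, rather than the search stopping at the smallest well-performing subset.

```python
import numpy as np


def first_pc_scores(X):
    """Scores of the samples on the first principal component of X."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions (loadings).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]


def separation(scores, y):
    """Assumed criterion: distance between the class means on the scores,
    divided by the pooled standard deviation."""
    a, b = scores[y == 0], scores[y == 1]
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)
    return abs(a.mean() - b.mean()) / (pooled + 1e-12)


def rank_spots(X, y):
    """Forward search that ranks ALL spots (columns of X).

    At each step, the spot whose addition gives the best class separation
    on PC1 of the selected subset is appended to the ranking. The search
    runs until no spot is left, so redundant spots are ranked too.
    """
    # Autoscaling (assumption): zero mean, unit variance per spot.
    X = (X - X.mean(axis=0)) / (X.std(axis=0, ddof=1) + 1e-12)
    remaining = list(range(X.shape[1]))
    selected, history = [], []
    while remaining:
        best_j, best_val = remaining[0], -np.inf
        for j in remaining:
            val = separation(first_pc_scores(X[:, selected + [j]]), y)
            if val > best_val:
                best_j, best_val = j, val
        selected.append(best_j)
        remaining.remove(best_j)
        history.append(best_val)
    return selected, history


# Synthetic example (made-up data): 6 control and 6 pathological maps,
# 8 spots, of which spots 0 and 1 are both shifted in the second class.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 8))
X_demo[6:, 0] += 8.0
X_demo[6:, 1] += 8.0
y_demo = np.array([0] * 6 + [1] * 6)
order, crit = rank_spots(X_demo, y_demo)
print("ranking of spots:", order)
```

Plotting `crit` against the number of spots included would show where adding further spots stops improving the separation; under the exhaustiveness principle, that curve would be used to decide how far down the ranking to go, instead of keeping only the first few spots.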

Acknowledgements

The authors gratefully acknowledge the collaboration of Prof Pier Giorgio Righetti (Polytechnic of Milan, Italy) and Dr Daniela Cecconi (University of Verona, Italy) who provided the proteomic datasets used in this study.

Author information

Corresponding author

Correspondence to Emilio Marengo.

Additional information

Awarded an ABC Poster Prize on the occasion of ‘Euroanalysis 2009’ held in Innsbruck, Austria, from 6–10 September 2009.

About this article

Cite this article

Marengo, E., Robotti, E., Bobba, M. et al. The principle of exhaustiveness versus the principle of parsimony: a new approach for the identification of biomarkers from proteomic spot volume datasets based on principal component analysis. Anal Bioanal Chem 397, 25–41 (2010). https://doi.org/10.1007/s00216-009-3390-8

  • DOI: https://doi.org/10.1007/s00216-009-3390-8
