Introducing ROC Curves as Error Measure Functions: A New Approach to Train ANN-Based Biomedical Data Classifiers

  • Raúl Ramos-Pollán
  • Miguel Ángel Guevara-López
  • Eugénio Oliveira
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6419)

Abstract

This paper explores the use of the area (Az) under the Receiver Operating Characteristic (ROC) curve as an error measure to guide the training of ANN-based machine learning classifiers for biomedical data analysis. Error measures such as the root mean square error (RMS) guide training algorithms by measuring how far a solution is from the ideal classification, yet it is well known that optimal classification rates do not necessarily yield optimal Az’s. Our hypothesis is that Az-based error measures can guide existing training algorithms to obtain better Az’s than other error measures. We tested this by training 280 different configurations of ANN-based classifiers with simulated annealing, using five biomedical binary datasets from the UCI machine learning repository with different test/train data splits. Each ANN configuration was trained twice, once with the Az-based error measure and once with the RMS-based one. On average, Az improved by 7.98% on testing data (9.32% on training data) when 70% of each dataset's elements were used for training. Further analysis reveals interesting patterns (the Az improvement is greater when the Az values are lower). These results encourage us to further explore the use of Az-based error measures in training methods for classifiers in a more generalized manner.
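
The contrast between classification rate, RMS and Az described above can be illustrated with a small sketch. It is not the authors' implementation: the function names (`az`, `az_error`, `rms_error`, `accuracy`), the toy scores, and the choice of `1 - Az` as the error value are all assumptions made for illustration. Az is computed here with the rank-based (Mann-Whitney) formulation, in which ties count as half a win.

```python
import math


def az(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    Counts the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def az_error(scores, labels):
    """Assumed Az-based error: 0 for a perfect ranking, larger is worse."""
    return 1.0 - az(scores, labels)


def rms_error(scores, labels):
    """Root mean square distance between scores and the 0/1 targets."""
    n = len(labels)
    return math.sqrt(sum((s - y) ** 2 for s, y in zip(scores, labels)) / n)


def accuracy(scores, labels, threshold=0.5):
    """Classification rate when scores are thresholded at `threshold`."""
    hits = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)


# Hypothetical outputs of two classifiers on the same six examples.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
model_b = [0.9, 0.8, 0.4, 0.95, 0.2, 0.1]

for name, scores in (("A", model_a), ("B", model_b)):
    print(
        name,
        "accuracy=%.3f" % accuracy(scores, labels),
        "rms=%.3f" % rms_error(scores, labels),
        "az=%.3f" % az(scores, labels),
        "az_error=%.3f" % az_error(scores, labels),
    )
```

Both toy models misclassify the same two examples at a 0.5 threshold, so their classification rates are identical, yet model A ranks more positive/negative pairs correctly and therefore has the higher Az. A search procedure such as simulated annealing could in principle minimize `az_error` in place of `rms_error`; how the paper wires this into its training loop is not specified in the abstract.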

Keywords

ROC Curves, Artificial Neural Networks, Machine learning Classifiers, Biomedical Data

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Raúl Ramos-Pollán (1)
  • Miguel Ángel Guevara-López (2)
  • Eugénio Oliveira (3)

  1. CETA-CIEMAT Centro Extremeño de Tecnologías Avanzadas, Trujillo, Spain
  2. INEGI Instituto de Engenharia, Mecanica e Gestão Industrial, Universidade do Porto, Porto, Portugal
  3. LIACC-DEI-Faculdade de Engenharia, Universidade do Porto, Porto, Portugal