Heat Map Based Feature Selection: A Case Study for Ovarian Cancer

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9028)

Abstract

Public health is a critical issue, so there is great research interest in finding faster and more accurate methods to detect diseases. In the particular case of cancer, mass spectrometry data has become very popular, but problems arise because the number of mass-to-charge ratios exceeds the number of patients in the samples by a huge margin. To deal with the high dimensionality of the data, most works agree on the need for pre-processing. In this work we propose an algorithm called Heat Map Based Feature Selection (HmbFS) that can work with huge data without pre-processing, thanks to a built-in compression mechanism based on color quantization. Results show that our proposal is very competitive against some of the most popular algorithms and succeeds where other methodologies may fail due to the high dimensionality of the data.

Author information

Corresponding author

Correspondence to Carlos Huertas.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Huertas, C., Juárez-Ramírez, R. (2015). Heat Map Based Feature Selection: A Case Study for Ovarian Cancer. In: Mora, A., Squillero, G. (eds) Applications of Evolutionary Computation. EvoApplications 2015. Lecture Notes in Computer Science, vol 9028. Springer, Cham. https://doi.org/10.1007/978-3-319-16549-3_1

  • DOI: https://doi.org/10.1007/978-3-319-16549-3_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16548-6

  • Online ISBN: 978-3-319-16549-3

  • eBook Packages: Computer Science, Computer Science (R0)
