Handling Label Noise in Microarray Classification with One-Class Classifier Ensemble

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 311)

Abstract

Advances in high-throughput techniques, such as gene microarrays and protein chips, have had a major impact on contemporary biology and medicine. Due to the high dimensionality and complexity of the data, it is impossible to analyze it manually. Machine learning techniques therefore play an important role in dealing with such data. In this paper, we investigate the influence of label noise on the effectiveness of classification systems applied to microarray analysis. Popular methods have no mechanism for handling such difficulties, which are embedded in the nature of the data. To cope with this, we propose to use one-class classifiers which, unlike canonical methods, rely on objects coming from a single class distribution only. They distinguish observations coming from the given class from any other possible examples that were unseen during classifier training. While having less information with which to dichotomize between classes, one-class models can easily learn the specific properties of a given data set and are robust to difficulties embedded in the nature of the data. We show that ensembles of one-class classifiers can give results as good as those of canonical multi-class classifiers, while making it possible to deal with unexpected label noise in the data. Experimental investigations, carried out on public data sets, demonstrate the usefulness of the proposed approach.
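
The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the base classifier, the ensemble construction, or the combination rule, so everything below is an assumption made for illustration. The sketch uses a toy centroid-based one-class model (a median center plus a radius that leaves a fraction of the training objects outside), builds one ensemble per class on random feature subspaces, and assigns a new object to the class whose ensemble gives the highest mean support. The names CentroidOneClass, OneClassEnsemble, make_noisy_data and the parameter reject_fraction are hypothetical.

```python
import random
import statistics


class CentroidOneClass:
    """Toy one-class model (assumed for illustration): accepts objects near the class center."""

    def __init__(self, features, reject_fraction=0.1):
        self.features = features              # feature indices of this random subspace
        self.reject_fraction = reject_fraction

    def _dist(self, x):
        return sum((x[f] - c) ** 2 for f, c in zip(self.features, self.center)) ** 0.5

    def fit(self, rows):
        # Coordinate-wise median, so a few mislabeled objects barely move the center
        self.center = [statistics.median(r[f] for r in rows) for f in self.features]
        dists = sorted(self._dist(r) for r in rows)
        # The radius leaves the farthest reject_fraction of training objects outside
        keep = max(1, len(rows) - int(self.reject_fraction * len(rows)))
        self.radius = dists[keep - 1] or 1e-12
        return self

    def support(self, x):
        # Value in (0, 1]; equals 0.5 exactly on the decision boundary
        return 1.0 / (1.0 + self._dist(x) / self.radius)


class OneClassEnsemble:
    """One ensemble of one-class models per class, each member on a random subspace."""

    def __init__(self, n_members=10, subspace_size=2, reject_fraction=0.1, seed=0):
        self.n_members = n_members
        self.subspace_size = subspace_size
        self.reject_fraction = reject_fraction
        self.seed = seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        n_features = len(X[0])
        self.members = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.members[label] = [
                CentroidOneClass(
                    rng.sample(range(n_features), self.subspace_size),
                    self.reject_fraction,
                ).fit(rows)
                for _ in range(self.n_members)
            ]
        return self

    def predict(self, x):
        def mean_support(label):
            models = self.members[label]
            return sum(m.support(x) for m in models) / len(models)

        # Assign the object to the class whose ensemble supports it most
        return max(self.members, key=mean_support)


def make_noisy_data(n_per_class=20, n_features=4, n_flipped=2, seed=42):
    """Two synthetic Gaussian classes with a few labels deliberately flipped."""
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((0, 0.0), (1, 5.0)):
        for i in range(n_per_class):
            X.append([rng.gauss(mean, 0.5) for _ in range(n_features)])
            # The first n_flipped objects of each class receive the wrong label
            y.append(1 - label if i < n_flipped else label)
    return X, y


if __name__ == "__main__":
    X, y = make_noisy_data()
    model = OneClassEnsemble().fit(X, y)
    print(model.predict([0.0] * 4), model.predict([5.0] * 4))
```

In this sketch the tolerance to label noise comes from two assumed design choices: the median center is insensitive to a small number of mislabeled objects, and the rejection fraction keeps those objects from inflating the class boundary. The synthetic data merely stand in for real microarray measurements.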

Author information

Corresponding author

Correspondence to Bartosz Krawczyk.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Krawczyk, B., Woźniak, M. (2015). Handling Label Noise in Microarray Classification with One-Class Classifier Ensemble. In: Bogdanova, A., Gjorgjevikj, D. (eds) ICT Innovations 2014. ICT Innovations 2014. Advances in Intelligent Systems and Computing, vol 311. Springer, Cham. https://doi.org/10.1007/978-3-319-09879-1_35

  • DOI: https://doi.org/10.1007/978-3-319-09879-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09878-4

  • Online ISBN: 978-3-319-09879-1

  • eBook Packages: Engineering, Engineering (R0)
