Evaluation of Error-Sensitive Attributes

  • Conference paper
Trends and Applications in Knowledge Discovery and Data Mining (PAKDD 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7867)

Abstract

Numerous attribute selection frameworks have been developed to improve performance and results in machine learning and data classification (Guyon & Elisseeff 2003; Saeys, Inza & Larranaga 2007). The majority of this effort has focused on performance and cost factors, with the primary aim of examining and enhancing the logic and sophistication of the underlying components and methods of specific classification models, such as the various wrapper, filter and cluster algorithms for feature selection, which work either as a data pre-processing step or as an integral part of a specific classification process. Taking a different approach, our research studies the relationship between classification errors and data attributes not before or during classification, but after the fact. Based on this post-classification analysis and a proposed attribute-risk evaluation routine, we evaluate the risk levels of attributes and identify those that may be more prone to errors. Possible benefits of this research include helping to develop error reduction measures and investigating specific relationships between attributes and errors more efficiently and effectively. Initial experiments have shown some supportive results, and the unsupportive results can also be explained by a hypothesis extended from this evaluation proposal.
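
The abstract does not spell out the attribute-risk evaluation routine, so the following is only a rough illustration of what a post-classification analysis of this kind might look like, not the authors' method. It assumes discretised attribute values and a per-record flag saying whether the classifier got that record wrong. The function names, the attribute names (`age_band`, `glucose_band`) and the "spread of per-value error rates" score are all hypothetical choices made for this sketch.

```python
from collections import defaultdict


def attribute_error_rates(rows, errors, attribute):
    """Return the misclassification rate for each observed value of one attribute.

    rows   -- list of dicts mapping attribute name to a (discretised) value
    errors -- list of booleans, True where the classifier misclassified the row
    """
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for row, is_error in zip(rows, errors):
        value = row[attribute]
        totals[value] += 1
        if is_error:
            wrong[value] += 1
    return {value: wrong[value] / totals[value] for value in totals}


def rank_attributes_by_error_spread(rows, errors):
    """Rank attributes by the gap between their highest and lowest per-value error rate.

    A large gap means errors concentrate in particular values of that attribute.
    This score is an assumption of the sketch, not a measure taken from the paper.
    """
    spreads = {}
    for attribute in rows[0]:
        rates = attribute_error_rates(rows, errors, attribute)
        spreads[attribute] = max(rates.values()) - min(rates.values())
    return sorted(spreads.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up records: every error falls on glucose_band == "high".
rows = [
    {"age_band": "young", "glucose_band": "low"},
    {"age_band": "young", "glucose_band": "high"},
    {"age_band": "old", "glucose_band": "low"},
    {"age_band": "old", "glucose_band": "high"},
]
errors = [False, True, False, True]

ranking = rank_attributes_by_error_spread(rows, errors)
```

On this toy input, `glucose_band` ranks first because all errors fall on one of its values, while `age_band` shows the same error rate for both values. A real study would need continuous attributes to be binned first and would have to account for small sample sizes per value.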

References

  • Alpaydin, E.: Introduction to Machine Learning. The MIT Press, London (2004)

  • Bredensteiner, E.J., Bennett, K.P.: Feature Minimization within Decision Trees. Computational Optimization and Applications 10(2), 111–126 (1998)

  • Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth International Group, Belmont (1984)

  • Carpenter, G.A., Markuzon, N.: ARTMAP-IC and medical diagnosis: Instance counting and inconsistent cases. Neural Networks 11, 323–336 (1998)

  • Frank, A., Asuncion, A.: UCI Machine Learning Repository. University of California, Irvine (2010), http://archive.ics.uci.edu/ml

  • Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. The Journal of Machine Learning Research 3, 1157–1182 (2003)

  • Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA Data Mining Software: An Update. SIGKDD Explorations 11(1) (2009)

  • Han, J., Kamber, M.: Data mining: concepts and techniques, 2nd edn. Morgan Kaufmann, San Francisco (2006)

  • Kayaer, K., Yıldırım, T.: Medical diagnosis on Pima Indian diabetes using General Regression Neural Networks. Paper presented to the International Conference on Artificial Neural Networks/International Conference on Neural Information Processing, Istanbul, Turkey (2003)

  • Kira, K., Rendell, L.A.: A practical approach to feature selection. In: Proceedings of the Ninth International Workshop on Machine Learning, pp. 249–256. Morgan Kaufmann Publishers Inc. (1992)

  • Kittler, J.: Feature set search algorithms. Pattern recognition and signal processing 41, 60 (1978)

  • Liu, H., Motoda, H., Setiono, R.: Feature Selection: An Ever Evolving Frontier in Data Mining. Journal of Machine Learning Research: Workshop and Conference Proceedings 10, 10 (2010)

  • Mangasarian, O.L., Street, W.N., Wolberg, W.H.: Breast Cancer Diagnosis and Prognosis via Linear Programming, Mathematical Programming Technical Report (1994)

  • Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann (1993)

  • Raymer, M.L., Doom, T.E., Kuhn, L.A., Punch, W.L.: Knowledge Discovery in Medical and Biological Datasets Using a Hybrid Bayes Classifier/Evolutionary Algorithm. In: Proceedings of the IEEE 2nd International Symposium on Bioinformatics and Bioengineering Conference, pp. 236–245 (2001)

  • Saeys, Y., Inza, I., Larranaga, P.: A review of feature selection techniques in bioinformatics. Bioinformatics 23(19), 2507–2517 (2007)

  • Shannon, C.E.: A Mathematical Theory of Communication. The Bell System Technical Journal 27, 379–423, 623–656 (1948)

  • Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., Johannes, R.S.: Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus. In: Proc. Annu. Symp. Comput. Appl. Med. Care., vol. 9, pp. 261–265 (1988)

  • Taylor, J.R.: An Introduction to error analysis: The Study of uncertainties in physical measurements, 2nd edn. University Science Books, Sausalito (1996)

  • Wei, L., Altman, R.B.: An Automated System for Generating Comparative Disease Profiles and Making Diagnoses. IEEE Transactions on Neural Networks 15 (2004)

  • Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)

  • Wolberg, W.H., Mangasarian, O.L.: Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences 87, 9193–9196 (1990)

  • Yoon, K.: The propagation of errors in multiple-attribute decision analysis: A practical approach. Journal of the Operational Research Society 40(7), 681–686 (1989)

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, W., Zhang, S. (2013). Evaluation of Error-Sensitive Attributes. In: Li, J., et al. Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2013. Lecture Notes in Computer Science, vol 7867. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40319-4_25

  • DOI: https://doi.org/10.1007/978-3-642-40319-4_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40318-7

  • Online ISBN: 978-3-642-40319-4

  • eBook Packages: Computer Science, Computer Science (R0)
