Evaluation Measures of the Classification Performance of Imbalanced Data Sets

Gu, Qiong; Zhu, Li; Cai, Zhihua

doi:10.1007/978-3-642-04962-0_53

Part of the book series: Communications in Computer and Information Science (CCIS, volume 51)

Included in the following conference series:

International Symposium on Intelligence Computation and Applications

Abstract

Discriminant measures of classification performance play a critical role in guiding the design of classifiers. Assessment methods and evaluation measures are at least as important as the algorithm itself and are the first key stage of successful data mining. We systematically summarize the evaluation measures for Imbalanced Data Sets (IDS). Several types of measures, such as common performance evaluation measures and measures for visualizing classifier performance, are analyzed and compared. The shortcomings of these measures on IDS may lead to misinterpretation of classification results and even to wrong strategic decisions. In addition, a series of more complex numerical evaluation measures, which can also be used to evaluate classification performance on IDS, are investigated.
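
As a rough illustration of why a single measure can mislead on imbalanced data, the sketch below computes several confusion-matrix measures for two hypothetical classifiers. The abstract does not name specific measures, so the choice here (accuracy, recall, specificity, precision, F1, G-mean, Youden's index) is an assumption, as are the function name `imbalance_metrics` and the made-up counts; this is not the authors' code or their experimental data.

```python
from math import sqrt


def imbalance_metrics(tp, fn, fp, tn):
    """Confusion-matrix measures; the positive class is the minority class.

    Hypothetical helper for illustration only.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity, TPR
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # TNR
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    g_mean = sqrt(recall * specificity)                  # geometric mean of TPR and TNR
    youden = recall + specificity - 1                    # Youden's index
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
        "youden": youden,
    }


# Made-up data set: 10 positives, 990 negatives.
# Classifier A labels every example negative.
trivial = imbalance_metrics(tp=0, fn=10, fp=0, tn=990)

# Classifier B finds 8 of the 10 positives at the cost of 50 false alarms.
useful = imbalance_metrics(tp=8, fn=2, fp=50, tn=940)

for name, result in (("A (all negative)", trivial), ("B", useful)):
    summary = ", ".join(f"{k}={v:.3f}" for k, v in result.items())
    print(f"{name}: {summary}")
```

With these counts, classifier A scores higher on accuracy than classifier B even though it never detects a minority example, while recall, G-mean and Youden's index all rank B above A.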

An Erratum for this chapter can be found at http://dx.doi.org/10.1007/978-3-642-04962-0_55

Author information

Authors and Affiliations

Faculty of Mathematics & Computer Science, Xiangfan University, Xiangfan, Hubei, 441053, China
Qiong Gu
School of Computer, China University of Geosciences, Wuhan, Hubei, 430074, China
Qiong Gu, Li Zhu & Zhihua Cai

Editor information

Editors and Affiliations

School of Computer Science, China University of Geosciences, 430074, Wuhan, China
Zhihua Cai
State Key Laboratory of Novel Software Technology, Nanjing University, P.O. Box, China
Zhenhua Li
Computation Center, Wuhan University, Wuhan, China
Zhuo Kang
School of Computer Science and Engineering, The University of Aizu, Japan
Yong Liu

About this paper

Cite this paper

Gu, Q., Zhu, L., Cai, Z. (2009). Evaluation Measures of the Classification Performance of Imbalanced Data Sets. In: Cai, Z., Li, Z., Kang, Z., Liu, Y. (eds) Computational Intelligence and Intelligent Systems. ISICA 2009. Communications in Computer and Information Science, vol 51. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04962-0_53

DOI: https://doi.org/10.1007/978-3-642-04962-0_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04961-3
Online ISBN: 978-3-642-04962-0
eBook Packages: Computer Science, Computer Science (R0)
