Abstract
We present what is, to the best of our knowledge, the first analysis that uses dataset complexity measures to evaluate case base editing algorithms. We select three different complexity measures and use them to evaluate eight case base editing algorithms. While we might expect the complexity of a case base to decrease, or stay the same, and the classification accuracy to increase, or stay the same, after maintenance, we find many counter-examples. In particular, we find that the RENN noise reduction algorithm may be over-simplifying class boundaries.
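For readers unfamiliar with RENN, the following is a minimal sketch of Repeated Edited Nearest Neighbour as it is commonly described: a case is dropped when the majority label of its k nearest neighbours disagrees with its own label, and the pass is repeated until nothing more is removed. The function names, the choice of k = 3, the Euclidean distance, and the toy data are assumptions made for illustration; they are not taken from the paper's experimental setup.

```python
from collections import Counter
import math

def knn_label(query, cases, k):
    """Majority label among the k cases nearest to query (Euclidean distance)."""
    nearest = sorted(cases, key=lambda c: math.dist(query, c[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def renn(cases, k=3):
    """Repeat ENN passes until a pass removes no case (illustrative sketch)."""
    current = list(cases)
    while True:
        kept = []
        for i, (x, y) in enumerate(current):
            others = current[:i] + current[i + 1:]
            # Keep a case if its neighbours agree with its label
            # (or if it is the only case left).
            if not others or knn_label(x, others, k) == y:
                kept.append((x, y))
        if len(kept) == len(current):
            return kept
        current = kept

# Hypothetical toy case base: two clusters plus one case labelled "B"
# sitting inside the "A" cluster.
case_base = [
    ((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"), ((1, 1), "A"),
    ((0.5, 0.5), "B"),
    ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B"), ((6, 6), "B"),
]
edited = renn(case_base)
```

In this toy case base only the isolated "B" case is removed. The same rule also deletes correctly labelled cases that happen to lie close to a class boundary, which is one way an editing algorithm of this kind could smooth a boundary more than intended.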
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cummins, L., Bridge, D. (2011). On Dataset Complexity for Case Base Maintenance. In: Ram, A., Wiratunga, N. (eds) Case-Based Reasoning Research and Development. ICCBR 2011. Lecture Notes in Computer Science, vol 6880. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23291-6_6
DOI: https://doi.org/10.1007/978-3-642-23291-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23290-9
Online ISBN: 978-3-642-23291-6
eBook Packages: Computer Science (R0)