Abstract
In machine learning and numerical optimization, there has been an ongoing debate about the properties of local optima and their impact on generalization. In this paper, we make a first attempt to address this question for case-based reasoning (CBR) systems, more specifically for instance-based learning as it takes place in the retain phase. To this end, we cast case learning as an optimization problem, develop a notion of local optima, propose a measure for the flatness or sharpness of these optima, and empirically evaluate the relation between sharp minima and the generalization performance of the corresponding learned case base.
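The central quantity in casting instance-based learning as optimization is the error of a case base. A common choice in this setting (and the one suggested by the \(\mathbb E^{loo}_{\mathcal T}\) notation in the notes below) is the leave-one-out error under 1-NN retrieval. The following is a minimal sketch, assuming cases are (feature-vector, label) pairs and Euclidean distance as the similarity measure; both are illustrative assumptions, not the paper's exact setup.

```python
from math import dist  # Euclidean distance between two points

def loo_error(case_base):
    """Leave-one-out error of a case base under 1-NN retrieval:
    each case (x, y) is classified by its nearest neighbour among
    the remaining cases, and we return the fraction misclassified."""
    errors = 0
    for i, (x, y) in enumerate(case_base):
        rest = case_base[:i] + case_base[i + 1:]
        _, nearest_label = min(rest, key=lambda c: dist(x, c[0]))
        errors += (nearest_label != y)
    return errors / len(case_base)

# Two well-separated classes: every case's nearest neighbour
# shares its label, so the leave-one-out error is zero.
cb = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"),
      ((1.0, 1.0), "b"), ((0.9, 1.0), "b")]
print(loo_error(cb))  # 0.0
```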
Notes
- 1.
Learning can also take place in one of the other knowledge containers of a CBR system, e.g., when learning similarity measures or adaptation knowledge.
- 2.
There remains some sensitivity to the presentation order, since in line 4 multiple cases c may reduce \(\mathbb E^{loo}_{\mathcal T}\) equally, in which case one of them must be selected, e.g., randomly or by some convention.
- 3.
A-Balance, B-BanknoteAuth, C-Cancer, D-Car, E-Contraceptive, F-Ecoli, G-Glass, H-Haberman, I-Hayes, J-Heart, K-Iris, L-MammogrMass, M-Monks, N-Pima, O-QualBankruptcy, P-TeachAssistEval, Q-TicTacToe, R-UserKnowledge, S-VertebralCol, T-Wine, U-Yeast.
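The greedy selection mentioned in Note 2 can be sketched as follows: among candidate cases, add the one whose inclusion minimizes the leave-one-out error, breaking ties randomly. This is an illustrative sketch under assumed conventions (1-NN retrieval, Euclidean distance, random tie-breaking), not the paper's exact algorithm.

```python
import random
from math import dist

def loo_error(cb):
    """Leave-one-out 1-NN error rate of case base cb."""
    errors = 0
    for i, (x, y) in enumerate(cb):
        rest = cb[:i] + cb[i + 1:]
        _, nearest_label = min(rest, key=lambda c: dist(x, c[0]))
        errors += (nearest_label != y)
    return errors / len(cb)

def retain_step(cb, candidates, rng=random.Random(0)):
    """One greedy retain step: score each candidate by the LOO error
    of the extended case base, keep all candidates achieving the
    minimum, and break ties randomly (one possible convention)."""
    scored = [(loo_error(cb + [c]), c) for c in candidates]
    best = min(score for score, _ in scored)
    ties = [c for score, c in scored if score == best]
    return rng.choice(ties)

# The "a" candidate fixes the lone misclassified "a" case,
# so it is the unique minimizer and gets retained.
cb = [((0.0, 0.0), "a"), ((1.0, 1.0), "b"), ((0.9, 1.0), "b")]
candidates = [((0.1, 0.0), "a"), ((1.1, 1.0), "b")]
print(retain_step(cb, candidates))  # ((0.1, 0.0), 'a')
```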
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Gabel, T., Godehardt, E. (2019). On the Generalization Capabilities of Sharp Minima in Case-Based Reasoning. In: Bach, K., Marling, C. (eds.) Case-Based Reasoning Research and Development. ICCBR 2019. Lecture Notes in Computer Science, vol. 11680. Springer, Cham. https://doi.org/10.1007/978-3-030-29249-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29248-5
Online ISBN: 978-3-030-29249-2