CBR Meets Big Data: A Case Study of Large-Scale Adaptation Rule Generation

  • Conference paper
Case-Based Reasoning Research and Development (ICCBR 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9343)

Abstract

Adaptation knowledge generation is a difficult problem for CBR. In previous work we developed ensembles of adaptation for regression (EAR), a family of methods for generating and applying ensembles of adaptation rules for case-based regression. EAR has been shown to provide good performance, but at the cost of high computational complexity. When efficiency problems result from case base growth, a common CBR approach is to focus on case base maintenance, to compress the case base. This paper presents a case study of an alternative approach, harnessing big data methods, specifically MapReduce and locality sensitive hashing (LSH), to make the EAR approach feasible for large case bases without compression. Experimental results show that the new method, BEAR, substantially increases accuracy compared to a baseline big data k-NN method using LSH. BEAR’s accuracy is comparable to that of traditional k-NN without using LSH, while its processing time remains reasonable for a case base of millions of cases. We suggest that increased use of big data methods in CBR has the potential for a departure from compression-based case-base maintenance methods, with their concomitant solution quality penalty, to enable the benefits of full case bases at much larger scales.
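The abstract names the ingredients (LSH bucketing, a MapReduce-style partitioning of the work, and ensembles of adaptation rules for regression) without giving the details. The following is a minimal single-process sketch of how those pieces could fit together, not the authors' BEAR implementation. Every function name, the random-projection hash, the pairwise case-difference rules, and the parameters `n_proj`, `width`, `k`, and `r` are assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict
from itertools import permutations


def make_hash(dim, n_proj=4, width=4.0, seed=0):
    """Build one LSH function from random Gaussian projections (assumed scheme)."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_proj)]
    offsets = [rng.uniform(0.0, width) for _ in range(n_proj)]

    def h(x):
        # Nearby points tend to fall into the same quantized cell.
        return tuple(
            math.floor((sum(a * xi for a, xi in zip(p, x)) + b) / width)
            for p, b in zip(planes, offsets)
        )

    return h


def diff(x, z):
    """Feature-wise difference x - z."""
    return tuple(a - b for a, b in zip(x, z))


def build_index(cases, h):
    """'Map' step: key every (features, value) case by its hash bucket."""
    buckets = defaultdict(list)
    for x, y in cases:
        buckets[h(x)].append((x, y))
    return buckets


def rules_for_bucket(bucket):
    """'Reduce' step: one rule per ordered pair of distinct cases in a bucket.

    A rule is (feature difference, value difference).
    """
    return [
        (diff(xa, xb), ya - yb)
        for (xa, ya), (xb, yb) in permutations(bucket, 2)
    ]


def predict(query, buckets, h, k=3, r=3):
    """Adapt the k nearest in-bucket cases with the r best-matching rules each."""
    bucket = buckets.get(h(query), [])
    if not bucket:
        return None  # empty bucket; this sketch uses a single hash table only
    rules = rules_for_bucket(bucket)
    neighbours = sorted(bucket, key=lambda c: math.dist(c[0], query))[:k]
    estimates = []
    for x, y in neighbours:
        needed = diff(query, x)
        best = sorted(rules, key=lambda rule: math.dist(rule[0], needed))[:r]
        adjustment = sum(d for _, d in best) / len(best) if best else 0.0
        estimates.append(y + adjustment)
    return sum(estimates) / len(estimates)
```

In a distributed setting the `build_index` grouping would correspond to the shuffle on the hash key and `rules_for_bucket` to per-key reducer work, so that no step ever compares cases across buckets. That restriction is what keeps the quadratic rule generation bounded by bucket size rather than by the size of the full case base.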

Notes

  1. Big data ensembles of adaptations for regression.

  2. http://aws.amazon.com/elasticmapreduce/.

Author information

Corresponding author

Correspondence to Vahid Jalali.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Jalali, V., Leake, D. (2015). CBR Meets Big Data: A Case Study of Large-Scale Adaptation Rule Generation. In: Hüllermeier, E., Minor, M. (eds) Case-Based Reasoning Research and Development. ICCBR 2015. Lecture Notes in Computer Science, vol 9343. Springer, Cham. https://doi.org/10.1007/978-3-319-24586-7_13

  • DOI: https://doi.org/10.1007/978-3-319-24586-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24585-0

  • Online ISBN: 978-3-319-24586-7

  • eBook Packages: Computer Science (R0)
