Data Mining in Data-Intensive and Cognitively-Complex Settings: Lessons Learned from the Dicode Project
This chapter reports practical lessons learned while developing Dicode's data mining services and using them in data-intensive and cognitively-complex settings. The lessons draw on several sources, including user feedback obtained from evaluation studies, team discussions, and observation of how the services were used. They are presented so as to aid people engaged in the various phases of developing similar systems.
Keywords: Data mining framework, Data mining services, Text mining services, Big data, Hadoop, Storm, Semantic technologies