Summary
Collaborative Data Mining is a setting in which the Data Mining effort is distributed among multiple collaborating agents, human or software. The objective of the collaboration is to produce solutions to the Data Mining problem at hand that are better, by some metric, than those that individual, non-collaborating agents would have achieved. The solutions require evaluation, comparison, and approaches for combination. Collaboration requires communication and implies some form of community. Among humans, collaboration is a social task. Organizing communities effectively is non-trivial and often requires well-defined roles and processes. Data Mining, too, benefits from a standard process. This chapter explores how the standard Data Mining process, CRISP-DM, can be used in a collaborative setting.
Copyright information
© 2009 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Moyle, S. (2009). Collaborative Data Mining. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09823-4_54
DOI: https://doi.org/10.1007/978-0-387-09823-4_54
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-09822-7
Online ISBN: 978-0-387-09823-4
eBook Packages: Computer Science, Computer Science (R0)