Abstract
Data mining techniques are making their way into today's companies, allowing business users to make informed decisions based on their available data. However, these business experts usually lack the knowledge to analyse the data by themselves, so they must rely on experts in the field of data mining. In an attempt to solve this problem, we previously studied the definition of domain-specific languages, which allowed users to specify data mining processes without requiring experience in the applied techniques. The specification was made through high-level language primitives that referred only to familiar concepts and terms from the original domain of the data. Technical details of the mining processes were therefore hidden from the end user. Although these languages are a promising solution, developing them can be challenging and costly. This work describes a development ecosystem devised for the generation of these languages, starting from a generic perspective that can be specialized to the details of each domain.
Acknowledgements
This work has been partially funded by the Government of Cantabria (Spain) under the doctoral studentship program from the University of Cantabria, and by the Spanish Government under grant TIN2014-56158-C4-2-P (M2C2).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
de la Vega, A., García-Saiz, D., Zorrilla, M., Sánchez, P. (2017). A Model-Driven Ecosystem for the Definition of Data Mining Domain-Specific Languages. In: Ouhammou, Y., Ivanovic, M., Abelló, A., Bellatreche, L. (eds) Model and Data Engineering. MEDI 2017. Lecture Notes in Computer Science, vol 10563. Springer, Cham. https://doi.org/10.1007/978-3-319-66854-3_3
DOI: https://doi.org/10.1007/978-3-319-66854-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66853-6
Online ISBN: 978-3-319-66854-3
eBook Packages: Computer Science (R0)