A Model-Driven Ecosystem for the Definition of Data Mining Domain-Specific Languages

de la Vega, Alfonso; García-Saiz, Diego; Zorrilla, Marta; Sánchez, Pablo

doi:10.1007/978-3-319-66854-3_3

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10563)

Included in the following conference series:

International Conference on Model and Data Engineering

Abstract

Data mining techniques are making their way into today's companies, allowing business users to make informed decisions based on their available data. However, these business experts usually lack the knowledge to analyse the data by themselves, which makes it necessary to rely on experts in the field of data mining. In an attempt to solve this problem, we previously studied the definition of domain-specific languages, which allowed data mining processes to be specified without requiring experience in the applied techniques. The specification was made through high-level language primitives, which referred only to familiar concepts and terms from the original domain of the data. Technical details about the mining processes were therefore hidden from the final user. Although these languages are a promising solution, developing them can be challenging and costly. This work describes a development ecosystem devised for the generation of these languages, starting from a generic perspective that can be specialized into the details of each domain.

Notes

1. http://personales.unican.es/delavegaa/files/medi/reviewProtocol.pdf

Acknowledgements

This work has been partially funded by the Government of Cantabria (Spain) under the doctoral studentship program from the University of Cantabria, and by the Spanish Government under grant TIN2014-56158-C4-2-P (M2C2).

Author information

Authors and Affiliations

Dpto. Ingeniería Informática y Electrónica, Universidad de Cantabria, Santander, Spain
Alfonso de la Vega, Diego García-Saiz, Marta Zorrilla & Pablo Sánchez

Corresponding author

Correspondence to Alfonso de la Vega.

Editor information

Editors and Affiliations

ISAE-ENSMA, Chasseneuil, France
Yassine Ouhammou
University of Novi Sad, Novi Sad, Serbia
Mirjana Ivanovic
UPC-Barcelona Tech, Barcelona, Spain
Alberto Abelló
ISAE-ENSMA, Chasseneuil, France
Ladjel Bellatreche

About this paper

Cite this paper

de la Vega, A., García-Saiz, D., Zorrilla, M., Sánchez, P. (2017). A Model-Driven Ecosystem for the Definition of Data Mining Domain-Specific Languages. In: Ouhammou, Y., Ivanovic, M., Abelló, A., Bellatreche, L. (eds) Model and Data Engineering. MEDI 2017. Lecture Notes in Computer Science, vol 10563. Springer, Cham. https://doi.org/10.1007/978-3-319-66854-3_3

DOI: https://doi.org/10.1007/978-3-319-66854-3_3
Published: 06 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66853-6
Online ISBN: 978-3-319-66854-3
eBook Packages: Computer Science, Computer Science (R0)
