A Model-Driven Ecosystem for the Definition of Data Mining Domain-Specific Languages

  • Conference paper
Model and Data Engineering (MEDI 2017)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10563)

Abstract

Data mining techniques are making their way into today's companies, allowing business users to make informed decisions based on their available data. However, these business experts usually lack the knowledge to analyse the data by themselves, which makes it necessary to rely on data mining experts. In an attempt to solve this problem, we previously studied the definition of domain-specific languages that allowed data mining processes to be specified without requiring experience in the applied techniques. The specification was made through high-level language primitives, which referred only to familiar concepts and terms from the original domain of the data, so technical details about the mining processes were hidden from the end user. Although these languages are a promising solution, developing them can be a challenging and costly task. This work describes a development ecosystem devised for the generation of these languages, starting from a generic perspective that can be specialized to the details of each domain.

Notes

  1. http://personales.unican.es/delavegaa/files/medi/reviewProtocol.pdf

Acknowledgements

This work has been partially funded by the Government of Cantabria (Spain) under the doctoral studentship program from the University of Cantabria, and by the Spanish Government under grant TIN2014-56158-C4-2-P (M2C2).

Author information

Corresponding author

Correspondence to Alfonso de la Vega.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

de la Vega, A., García-Saiz, D., Zorrilla, M., Sánchez, P. (2017). A Model-Driven Ecosystem for the Definition of Data Mining Domain-Specific Languages. In: Ouhammou, Y., Ivanovic, M., Abelló, A., Bellatreche, L. (eds) Model and Data Engineering. MEDI 2017. Lecture Notes in Computer Science, vol 10563. Springer, Cham. https://doi.org/10.1007/978-3-319-66854-3_3

  • DOI: https://doi.org/10.1007/978-3-319-66854-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66853-6

  • Online ISBN: 978-3-319-66854-3

  • eBook Packages: Computer Science (R0)
