Harvesting models from web 2.0 databases

  • Theme Section
Software & Systems Modeling

Abstract

Data, rather than functionality, are the source of competitive advantage for Web2.0 applications such as wikis, blogs and social networking websites. This valuable information might need to be exploited by third-party applications, migrated, or analyzed. Model-Driven Engineering (MDE) can be used for these purposes. However, MDE first requires obtaining models from the wiki/blog/website database (a.k.a. model harvesting). This can be achieved through SQL scripts embedded in a program, but that approach leads to laborious code that exposes the iterations and table joins used to build the model. By contrast, a Domain-Specific Language (DSL) can hide these “how” concerns, leaving the designer to focus on the “what”, i.e. the mapping of database schemas to model classes. This paper introduces Schemol, a DSL tailored for extracting models out of databases which considers Web2.0 specifics. First, Web2.0 applications are often built on top of general frameworks (a.k.a. engines) that set the database schema (e.g., MediaWiki, Blojsom). Hence, table names offer little help in automating the extraction process. Second, Web2.0 data tend to be annotated: user-provided data (e.g., wiki articles, blog entries) might contain semantic markup that provides helpful hints for model extraction. Unfortunately, these data end up being stored as opaque strings. Consequently, there is a considerable conceptual gap between the source database and the target metamodel. Schemol offers extractive functions and view-like mechanisms to confront these issues. Examples using Blojsom as the blog engine are available for download.
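
To make the contrast in the abstract concrete, the following is a minimal sketch of the hand-written approach it argues against: SQL embedded in a general-purpose program, with the join and the iteration in plain view. It is not Schemol and not the authors' code. The tables `blog` and `entry`, the classes `Blog`, `Entry` and `Product`, and the `<span class="fn">` annotation are all hypothetical stand-ins chosen for illustration; they are not the Blojsom schema, and the regular expression is only a simplified imitation of microformat-style markup.

```python
import re
import sqlite3
from dataclasses import dataclass, field


# Hypothetical target "metamodel": plain classes standing in for model elements.
@dataclass
class Product:
    name: str


@dataclass
class Entry:
    title: str
    products: list = field(default_factory=list)


@dataclass
class Blog:
    name: str
    entries: list = field(default_factory=list)


# Simplified stand-in for semantic markup embedded in user-provided text.
FN_SPAN = re.compile(r'<span class="fn">(.*?)</span>')


def harvest(conn):
    """Build Blog/Entry/Product objects with an explicit join and loop."""
    blogs = {}
    # Inner join: blogs without entries do not appear in the result.
    rows = conn.execute(
        "SELECT b.id, b.name, e.title, e.body "
        "FROM blog b JOIN entry e ON e.blog_id = b.id "
        "ORDER BY b.id, e.id"
    )
    for blog_id, blog_name, title, body in rows:
        if blog_id not in blogs:
            blogs[blog_id] = Blog(blog_name)
        entry = Entry(title)
        # The annotation lives inside an opaque string, so it is parsed out here.
        entry.products = [Product(n) for n in FN_SPAN.findall(body)]
        blogs[blog_id].entries.append(entry)
    return list(blogs.values())


def demo_database():
    """Create an in-memory database with the hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER,
                            title TEXT, body TEXT);
        INSERT INTO blog VALUES (1, 'Shop news');
        INSERT INTO entry VALUES (1, 1, 'New arrivals',
            'We stock the <span class="fn">Red kettle</span> and the <span class="fn">Blue mug</span>.');
        INSERT INTO entry VALUES (2, 1, 'Opening hours', 'Open 9 to 5.');
        """
    )
    return conn
```

Calling `harvest(demo_database())` returns one `Blog` with two entries, the first of which carries two `Product` objects recovered from the entry text. Even at this size, the "how" (join, grouping, string parsing) is interleaved with the "what" (which table or annotation maps to which class), which is the separation a mapping-oriented DSL is meant to provide.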



Author information

Corresponding author

Correspondence to Gorka Puente.

Additional information

Communicated by Gustavo Rossi, Nora Koch, Geert-Jan Houben, and Antonio Vallecillo.

About this article

Cite this article

Díaz, O., Puente, G., Cánovas Izquierdo, J.L. et al. Harvesting models from web 2.0 databases. Softw Syst Model 12, 15–34 (2013). https://doi.org/10.1007/s10270-011-0194-z

  • DOI: https://doi.org/10.1007/s10270-011-0194-z