MongoDB-Based Modular Ontology Building for Big Data Integration

  • Original Article
Journal on Data Semantics

Abstract

Big Data are collections of data sets too large and complex to process with classical database management tools. Their main characteristics are volume, variety and velocity. Although these characteristics accentuate heterogeneity problems, users still look for a unified view of the data. Consequently, Big Data integration is a new research area that faces new challenges due to the aforementioned characteristics. Ontologies are widely used in data integration since they represent knowledge as a formal description of a domain of interest. With the advent of Big Data, their implementation faces new challenges due to the volume, variety and velocity dimensions of these data. This paper presents an approach to building a modular ontology for Big Data integration that takes into account the large volume, high-speed generation and wide variety of the data. Our approach exploits a NoSQL database, namely MongoDB, and takes advantage of modular ontologies. It follows three main steps: wrapping data sources into MongoDB databases, generating local ontologies and finally composing the local ontologies into a global one. We also focus on the implementation of the last two steps.
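As a rough illustration of the last two steps, the Python sketch below derives a toy local ontology from MongoDB-style documents and then composes two local ontologies into a global one. It is not the authors' algorithm. The function names (`local_ontology`, `compose`), the mapping rules (collection to class, scalar field to datatype property, embedded document to object property), the case-insensitive name matching, and the sample documents are all assumptions made for this example. Documents are plain dictionaries, so no database connection is needed.

```python
from typing import Any

# A toy ontology: class name -> {"datatype": property names, "object": property names}
Ontology = dict[str, dict[str, set[str]]]


def local_ontology(collection: str, documents: list[dict[str, Any]]) -> Ontology:
    """Derive a toy local ontology from the documents of one collection (assumed rules)."""
    onto: Ontology = {collection: {"datatype": set(), "object": set()}}
    for doc in documents:
        for field, value in doc.items():
            if field == "_id":
                continue  # skip the MongoDB identifier field
            if isinstance(value, dict):
                # Embedded document: object property pointing to a class named after the field.
                onto[collection]["object"].add(field)
                for cls, props in local_ontology(field, [value]).items():
                    entry = onto.setdefault(cls, {"datatype": set(), "object": set()})
                    entry["datatype"] |= props["datatype"]
                    entry["object"] |= props["object"]
            else:
                # Scalar (or list) field: datatype property of the collection's class.
                onto[collection]["datatype"].add(field)
    return onto


def compose(*ontologies: Ontology) -> Ontology:
    """Merge local ontologies; classes whose names match case-insensitively are unified."""
    global_onto: Ontology = {}
    for onto in ontologies:
        for cls, props in onto.items():
            entry = global_onto.setdefault(cls.lower(), {"datatype": set(), "object": set()})
            entry["datatype"] |= props["datatype"]
            entry["object"] |= props["object"]
    return global_onto


# Hypothetical sample documents standing in for two wrapped data sources.
orders = [{"_id": 1, "orderDate": "2017-01-01", "customer": {"name": "Ann", "city": "Sfax"}}]
customers = [{"_id": 7, "name": "Ann", "phone": "123"}]

global_ontology = compose(
    local_ontology("Order", orders),
    local_ontology("Customer", customers),
)
```

Here the `customer` class found inside the order documents and the `Customer` class from the second source end up as a single class in `global_ontology`, carrying the union of their properties. A realistic composition step would rely on similarity measures rather than exact name matching.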

Notes

  1. http://dbs.uni-leipzig.de/en/research/projects/schema_and_ontology_matching/coma_3_0/coma_3_0_community_edition.

  2. http://alignapi.gforge.inria.fr/edoal.html.

  3. http://sioc-project.org/ontology.

  4. http://www.w3.org/TR/owl-features/.

  5. https://www.w3.org/.

  6. http://nosql-database.org/.

  7. http://www.mongodb.org/.

  8. https://www.talend.com/.

  9. https://github.com/raynaldmo/northwind-mongodb.

  10. https://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFViewNorthwindOntology.

  11. https://virtuoso.openlinksw.com/.

Author information

Corresponding author

Correspondence to Hanen Abbes.

About this article

Cite this article

Abbes, H., Gargouri, F. MongoDB-Based Modular Ontology Building for Big Data Integration. J Data Semant 7, 1–27 (2018). https://doi.org/10.1007/s13740-017-0081-z

  • DOI: https://doi.org/10.1007/s13740-017-0081-z
