Abstract
Big Data are collections of data sets too large and complex to process using classical database management tools. Their main characteristics are volume, variety and velocity. Although these characteristics accentuate heterogeneity problems, users still expect a unified view of the data. Consequently, Big Data integration is a new research area that faces new challenges arising from these characteristics. Ontologies are widely used in data integration since they represent knowledge as a formal description of a domain of interest. With the advent of Big Data, their implementation must in turn cope with the volume, variety and velocity of these data. This paper presents an approach to build a modular ontology for Big Data integration that accounts for the large volume, high-speed generation and wide variety of the data. Our approach exploits a NoSQL database, namely MongoDB, and takes advantage of modular ontologies. It follows three main steps: wrapping data sources to MongoDB databases, generating local ontologies, and finally composing the local ontologies to obtain a global one. We also focus on the implementation of the last two steps.
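As a purely illustrative sketch of the last two steps (it is not the authors' algorithm), the Python below derives a toy local ontology from MongoDB-style documents and then composes local ontologies by merging classes with similar names. Everything in it is an assumption made for illustration: the function names local_ontology and compose, the dictionary representation of an ontology, the "has" prefix for relations, and the use of a simple string-similarity threshold as the matching criterion. Documents are plain Python dictionaries, so no database connection is needed.

```python
from difflib import SequenceMatcher


def local_ontology(collection_name, documents):
    """Derive a toy local ontology from the documents of one collection.

    Assumed mapping (hypothetical): the collection becomes a class, scalar
    fields become its properties, and each embedded document becomes a new
    class linked to its parent by a "has<Field>" relation.
    """
    classes = {collection_name: set()}
    relations = set()

    def visit(cls, doc):
        for field, value in doc.items():
            if field == "_id":  # skip MongoDB's technical identifier
                continue
            if isinstance(value, dict):
                child = field.capitalize()
                classes.setdefault(child, set())
                relations.add((cls, "has" + child, child))
                visit(child, value)
            else:
                classes[cls].add(field)

    for doc in documents:
        visit(collection_name, doc)
    return {"classes": classes, "relations": relations}


def similar(a, b, threshold):
    """Case-insensitive string similarity test (an assumed matching rule)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def compose(ontologies, threshold=0.8):
    """Compose local ontologies into a global one by merging similar classes."""
    global_classes = {}
    global_relations = set()
    for onto in ontologies:
        rename = {}
        for name, props in onto["classes"].items():
            # Reuse the first sufficiently similar global class, if any.
            match = next(
                (g for g in global_classes if similar(g, name, threshold)),
                name,
            )
            rename[name] = match
            global_classes.setdefault(match, set()).update(props)
        for src, label, dst in onto["relations"]:
            global_relations.add((rename[src], label, rename[dst]))
    return {"classes": global_classes, "relations": global_relations}


# Two hypothetical sources describing the same concept under close names.
o1 = local_ontology("Customer", [{"_id": 1, "name": "Ann", "address": {"city": "Sfax"}}])
o2 = local_ontology("Customers", [{"_id": 7, "email": "ann@example.org"}])
merged = compose([o1, o2])
print(sorted(merged["classes"]))
```

In this sketch the two source classes are merged because their names are nearly identical. A real composition step would rely on richer evidence than name similarity, such as structural or semantic measures.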
Cite this article
Abbes, H., Gargouri, F. MongoDB-Based Modular Ontology Building for Big Data Integration. J Data Semant 7, 1–27 (2018). https://doi.org/10.1007/s13740-017-0081-z
DOI: https://doi.org/10.1007/s13740-017-0081-z