Towards Semantification of Big Data Technology

  • Mohamed Nadjib MamiEmail author
  • Simon Scerri
  • Sören Auer
  • Maria-Esther Vidal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9829)


Much attention has been devoted to support the volume and velocity dimensions of Big Data. As a result, a plethora of technology components supporting various data structures (e.g., key-value, graph, relational), modalities (e.g., stream, log, real-time) and computing paradigms (e.g., in-memory, cluster/cloud) are meanwhile available. However, systematic support for managing the variety of data, the third dimension in the classical Big Data definition, is still missing. In this article, we present SeBiDA, an approach for managing hybrid Big Data. SeBiDA supports the Semantification of Big Data using the RDF data model, i.e., non-semantic Big Data is semantically enriched by using RDF vocabularies. We empirically evaluate the performance of SeBiDA for two dimensions of Big Data, i.e., volume and variety; the Berlin Benchmark is used in the study. The results suggest that even in large datasets, query processing time is not affected by data variety.


Query Processing Resource Description Framework Semantic Data Hadoop Distribute File System Resource Description Framework Data 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. 1.
    Du, J.H., Wang, H.F., Ni, Y., Yu, Y.: HadoopRDF: a scalable semantic data analytical engine. ICIC 2012. LNCS, vol. 7390, pp. 633–641. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  2. 2.
    Franke, C., Morin, S., Chebotko, A., Abraham, J., Brazier, P.: Distributed semantic web data management in HBase and MySQL cluster. In: Cloud Computing (CLOUD), pp. 105–112. IEEE (2011)Google Scholar
  3. 3.
    Gartner, D.L.: 3-D data management: controlling data volume, velocity and variety. 6 February 2001Google Scholar
  4. 4.
    Hogan, A.: Skolemising blank nodes while preserving isomorphism. In: 24th International Conference on World Wide Web, 2015. WWW (2015)Google Scholar
  5. 5.
    Kaoudi, Z., Manolescu, I.: RDF in the clouds: a survey. VLDB J. 24(1), 67–91 (2015)CrossRefGoogle Scholar
  6. 6.
    Khadilkar, V., Kantarcioglu, M., Thuraisingham, B., Castagna, P.: Jena-HBase: a distributed, scalable and efficient RDF triple store. In: 11th International Semantic Web Conference Posters & Demos, ISWC-PD (2012)Google Scholar
  7. 7.
    Martínez-Prieto, M.A., Cuesta, C.E., Arias, M., Fernández, J.D.: The solid architecture for real-time management of big semantic data. Future Gener. Comput. Syst. 47, 62–79 (2015)CrossRefGoogle Scholar
  8. 8.
    Nie, Z., Du, F., Chen, Y., Du, X., Xu, L.: Efficient SPARQL query processing in mapreduce through data partitioning and indexing. In: Sheng, Q.Z., Wang, G., Jensen, C.S., Xu, G. (eds.) APWeb 2012. LNCS, vol. 7235, pp. 628–635. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  9. 9.
    Papailiou, N., Konstantinou, I., Tsoumakos, D., Karras, P., Koziris, N.: H2RDF+: High-performance distributed joins over large-scale RDF graphs. In: BigData Conference. IEEE (2013)Google Scholar
  10. 10.
    Punnoose, R., Crainiceanu, A., Rapp, D.: Rya: a scalable RDF triple store for the clouds. In: Proceedings of the 1st International Workshop on Cloud Intelligence, pp. 4. ACM (2012)Google Scholar
  11. 11.
    Schätzle, A., Przyjaciel-Zablocki, M., Dorner, C., Hornung, T., Lausen, G.: Cascading map-side joins over HBase for scalable join processing. In: SSWS+ HPCSW, pp. 59 (2012)Google Scholar
  12. 12.
    Schätzle, A., Przyjaciel-Zablocki, M., Hornung, T., Lausen, G.: PigSPARQL: a SPARQL query processing baseline for big data. In: International Semantic Web Conference (Posters & Demos), pp. 241–244 (2013)Google Scholar
  13. 13.
    Schätzle, A., Przyjaciel-Zablocki, M., Neu, A., Lausen, G.: Sempala: interactive SPARQL query processing on hadoop. In: Mika, P., Tudorache, T., Bernstein, A., Welty, C., Knoblock, C., Vrandečić, D., Groth, P., Noy, N., Janowicz, K., Goble, C. (eds.) ISWC 2014, Part I. LNCS, vol. 8796, pp. 164–179. Springer, Heidelberg (2014)Google Scholar
  14. 14.
    Sun, J., Jin, Q.: Scalable RDF store based on HBase and mapreduce. In: 3rd International Conference on Advanced Computer Theory and Engineering. IEEE (2010)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Mohamed Nadjib Mami
    • 1
    • 2
    Email author
  • Simon Scerri
    • 1
    • 2
  • Sören Auer
    • 1
    • 2
  • Maria-Esther Vidal
    • 1
    • 2
  1. 1.University of BonnBonnGermany
  2. 2.Fraunhofer IAISSankt AugustinGermany

Personalised recommendations