Multi-level semantic annotation and unified data integration using semantic web ontology in big data processing

Cluster Computing

Abstract

Many potential applications of big data require semantic annotation and unified integration of heterogeneous data. This paper proposes MOUNT, a multi-level annotation and integration framework that processes heterogeneous datasets by exploiting semantic knowledge to improve query processing on large-scale infrastructure. The multi-level annotation comprises coarse-grained and fine-grained annotation models. The coarse-grained annotation employs Yago and SEeds SEarch to categorize the domain information in the big data, and the fine-grained annotation enables semantic enrichment. The MOUNT approach then integrates the structured and unstructured data to form a global Resource Description Framework (RDF) ontology. It also facilitates query processing by translating the natural language user query into structured triples. The experimental results show that the MOUNT approach yields better performance, with a result accuracy of 94%.
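To make the idea of integrating structured and unstructured data into one triple store, and of answering a query as a triple pattern, more concrete, the following is a minimal sketch and not the MOUNT implementation. The function names (`record_to_triples`, `sentence_to_triple`, `match`), the sample hospital data, and the naive word-splitting extraction are all hypothetical stand-ins chosen for illustration; the abstract does not specify how the paper performs these steps.

```python
from typing import Optional

# A triple is (subject, predicate, object), the basic unit of an RDF-style graph.
Triple = tuple[str, str, str]


def record_to_triples(subject: str, record: dict[str, str]) -> set[Triple]:
    """Turn one structured row into triples, one per column (hypothetical helper)."""
    return {(subject, key, value) for key, value in record.items()}


def sentence_to_triple(sentence: str) -> Optional[Triple]:
    """Naive stand-in for triplet extraction: '<subject> <predicate> <object ...>'."""
    words = sentence.rstrip(".").split()
    if len(words) < 3:
        return None
    return (words[0], words[1], " ".join(words[2:]))


def match(graph: set[Triple], pattern: tuple) -> list[Triple]:
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# Structured source: one row of made-up tabular data.
graph = record_to_triples("HospitalA", {"city": "Raleigh", "rating": "4"})

# Unstructured source: one made-up sentence merged into the same graph.
extracted = sentence_to_triple("HospitalA treats cardiac patients.")
if extracted is not None:
    graph.add(extracted)

# A question such as "Which city is HospitalA in?" is assumed to map to this pattern.
print(match(graph, ("HospitalA", "city", None)))
```

In this sketch both sources end up in a single set of triples, so one pattern lookup can draw on either. A real system would replace the word-splitting step with proper linguistic triplet extraction and would typically query the graph with SPARQL.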

Author information

Corresponding author

Correspondence to P. Shobha Rani.

Cite this article

Rani, P.S., Suresh, R.M. & Sethukarasi, R. Multi-level semantic annotation and unified data integration using semantic web ontology in big data processing. Cluster Comput 22 (Suppl 5), 10401–10413 (2019). https://doi.org/10.1007/s10586-017-1029-7

  • DOI: https://doi.org/10.1007/s10586-017-1029-7
