Verification and Validation of Semantic Annotations

  • Conference paper
Perspectives of System Informatics (PSI 2019)

Abstract

In this paper, we propose a framework for the verification and validation of semantically annotated data. Annotations extracted from websites are verified against the schema.org vocabulary and against Domain Specifications to ensure that they are syntactically correct and complete. Domain Specifications make it possible to check whether annotations comply with the corresponding domain-specific constraints. The validation mechanism detects errors and inconsistencies between the content of the analyzed schema.org annotations and the content of the web pages where the annotations were found.
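
The distinction between the two steps can be illustrated with a minimal Python sketch. It is not the framework proposed in the paper: the `DOMAIN_SPEC` dictionary, the `verify` and `validate` functions, and the sample hotel data are all hypothetical, and the specification format is a simplification invented for this example rather than the actual Domain Specification format. Verification here only checks required properties and their value types; validation only checks whether string values literally occur in the page text.

```python
import json

# Hypothetical, simplified domain specification: required properties and the
# Python type expected for each, keyed by schema.org type name.
# This is an assumption for illustration, not the real specification format.
DOMAIN_SPEC = {
    "Hotel": {
        "name": str,
        "telephone": str,
        "address": dict,
    }
}


def verify(annotation, spec=DOMAIN_SPEC):
    """Return a list of verification errors for one JSON-LD annotation."""
    type_name = annotation.get("@type")
    required = spec.get(type_name)
    if required is None:
        return [f"no domain specification for type {type_name!r}"]
    errors = []
    for prop, expected in required.items():
        if prop not in annotation:
            errors.append(f"missing required property {prop!r}")
        elif not isinstance(annotation[prop], expected):
            errors.append(f"property {prop!r} should be {expected.__name__}")
    return errors


def validate(annotation, page_text, props=("name", "telephone")):
    """Return the properties whose string values do not occur in the page text."""
    text = page_text.casefold()
    return [
        p for p in props
        if isinstance(annotation.get(p), str)
        and annotation[p].casefold() not in text
    ]


# Made-up annotation and page content.
annotation = json.loads("""
{
  "@context": "https://schema.org",
  "@type": "Hotel",
  "name": "Hotel Example",
  "telephone": "+43 000 0000"
}
""")
page_text = "Welcome to Hotel Example. Call us at +43 111 1111."

verification_errors = verify(annotation)
validation_mismatches = validate(annotation, page_text)
print(verification_errors)    # the annotation lacks an address
print(validation_mismatches)  # the annotated telephone is not on the page
```

In this toy case the annotation fails verification because a required property is absent, and fails validation because the annotated telephone number differs from the one shown on the page. A real validator would need more robust matching than a literal substring test.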

Notes

  1. https://semantify.it/evaluator.

  2. https://search.google.com/structured-data/testing-tool/.

  3. https://www.google.com/webmasters/markup-tester/.

  4. https://webmaster.yandex.com/tools/microtest/.

  5. https://www.bing.com/toolbox/markup-validator.

  6. The authors use the term “validation” in their paper due to content definition.

  7. https://www.w3.org/TR/shacl-ucr/.

  8. List of available Domain Specifications: https://semantify.it/domainSpecifications/public.

  9. https://semantify.it/evaluator.

  10. The paper is under double-blind review and cannot be revealed.

  11. https://developers.google.com/search/docs/guides/sd-policies.

  12. https://search.google.com/search-console/about.

  13. https://support.google.com/webmasters/answer/9044175?hl=en&visit_id=636862521420978682-2839371720&rd=1#spammy-structured-markup.

  14. https://www.best-of-zillertal.at.

  15. https://www.mayrhofen.at.

  16. https://www.seefeld.com/.

  17. https://www.zillertalarena.com.

Author information

Corresponding authors

Correspondence to Oleksandra Panasiuk, Omar Holzknecht or Umutcan Şimşek.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Panasiuk, O., Holzknecht, O., Şimşek, U., Kärle, E., Fensel, D. (2019). Verification and Validation of Semantic Annotations. In: Bjørner, N., Virbitskaite, I., Voronkov, A. (eds) Perspectives of System Informatics. PSI 2019. Lecture Notes in Computer Science, vol 11964. Springer, Cham. https://doi.org/10.1007/978-3-030-37487-7_19

  • DOI: https://doi.org/10.1007/978-3-030-37487-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37486-0

  • Online ISBN: 978-3-030-37487-7

  • eBook Packages: Computer Science (R0)
