Domain Specific Semantic Validation of Schema.org Annotations

  • Conference paper
Perspectives of System Informatics (PSI 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10742)

Abstract

Since its unveiling in 2011, schema.org has become the de facto standard for publishing semantically described structured data on the web, typically in the form of web page annotations. The increasing adoption of schema.org facilitates the growth of the web of data, as well as the development of automated agents that operate on this data. Schema.org is a large, heterogeneous vocabulary that covers many domains. This is not a bug but a feature, since schema.org aims to describe almost everything on the web, and the web is huge. However, this heterogeneity has a side effect: it becomes challenging to pick the right classes and properties for an annotation in a certain domain and to keep the annotation semantically consistent. In this work, we introduce our rule-based approach, and an implementation of it, for validating schema.org annotations from two aspects: (a) the completeness of the annotations in terms of a specified domain and (b) the semantic consistency of the values based on pre-defined rules. We demonstrate our approach in the tourism domain.
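
The two validation aspects named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or rule language: the `DOMAIN_SPEC` table, the `end_not_before_start` rule, and the `validate` function are hypothetical names chosen for illustration, and the annotation is assumed to be a JSON-LD object already parsed into a Python dictionary.

```python
from datetime import date
from typing import Any, Callable, Dict, List, Optional, Set

Annotation = Dict[str, Any]
# A rule returns an error message, or None when the annotation satisfies it.
Rule = Callable[[Annotation], Optional[str]]

# Hypothetical domain specification: properties each type must carry.
DOMAIN_SPEC: Dict[str, Set[str]] = {
    "Event": {"name", "startDate", "location"},
}


def missing_properties(annotation: Annotation) -> List[str]:
    """Completeness: required properties absent from the annotation."""
    required = DOMAIN_SPEC.get(annotation.get("@type", ""), set())
    return sorted(required - set(annotation))


def end_not_before_start(annotation: Annotation) -> Optional[str]:
    """Consistency: an event must not end before it starts."""
    start, end = annotation.get("startDate"), annotation.get("endDate")
    if start and end and date.fromisoformat(end) < date.fromisoformat(start):
        return "endDate precedes startDate"
    return None


# Hypothetical rule set; a real domain would list many such rules.
RULES: List[Rule] = [end_not_before_start]


def validate(annotation: Annotation) -> Dict[str, List[str]]:
    """Run both checks and collect the findings."""
    messages = (rule(annotation) for rule in RULES)
    return {
        "missing": missing_properties(annotation),
        "violations": [m for m in messages if m is not None],
    }


# Example annotation from the tourism domain (made-up values).
report = validate({
    "@type": "Event",
    "name": "Season opening",
    "startDate": "2017-12-02",
    "endDate": "2017-12-01",
})
print(report)
```

Keeping the domain specification and the consistency rules as separate data structures mirrors the split between aspects (a) and (b): the first only asks which properties are present, while the second inspects their values.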


Notes

  1. https://schema.org.

  2. https://paul.kinlan.me/the-headless-web/.

  3. https://search.google.com/structured-data/testing-tool.

  4. http://shex.io/primer/#rel-to-shacl.

  5. http://sdo-validator.sti2.at.

  6. For the annotations that are created via the editor based on the domain specification, only the semantic consistency validation applies.

  7. http://schema.org/docs/datamodel.html.

References

  1. Fürber, C., Hepp, M.: Using SPARQL and SPIN for data quality management on the semantic web. In: Abramowicz, W., Tolksdorf, R. (eds.) BIS 2010. LNBIP, vol. 47, pp. 35–46. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12814-1_4

  2. Guha, R.V., Brickley, D., Macbeth, S.: Schema.org: evolution of structured data on the web. Commun. ACM 59(2), 44–51 (2016). http://doi.acm.org/10.1145/2844544

  3. Kärle, E., Fensel, A., Toma, I., Fensel, D.: Why are there more hotels in Tyrol than in Austria? Analyzing Schema.org usage in the hotel domain. In: Inversini, A., Schegg, R. (eds.) Information and Communication Technologies in Tourism 2016, pp. 99–112. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-28231-2_8

  4. Kärle, E., Simsek, U., Akbar, Z., Hepp, M., Fensel, D.: Extending the Schema.org vocabulary for more expressive accommodation annotations. In: Schegg, R., Stangl, B. (eds.) Information and Communication Technologies in Tourism 2017, pp. 31–41. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-51168-9_3

  5. Khalili, A., Auer, S.: WYSIWYM authoring of structured content based on Schema.org. In: Lin, X., Manolopoulos, Y., Srivastava, D., Huang, G. (eds.) WISE 2013. LNCS, vol. 8181, pp. 425–438. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41154-0_32

  6. Knublauch, H., Kontokostas, D.: Shapes constraint language (2016). https://w3c.github.io/data-shapes/shacl/

  7. Le Hors, A., Solbrig, H., Prud’hommeaux, E.: RDF validation workshop report, practical assurances for quality RDF data. Technical rep., Cambridge, MA, USA (2013). https://www.w3.org/2012/12/rdf-val/report

  8. Meusel, R., Bizer, C., Paulheim, H.: A web-scale study of the adoption and evolution of the Schema.org vocabulary over time. In: Proceedings of the 5th International Conference on Web Intelligence, Mining and Semantics, WIMS 2015, pp. 15:1–15:11. ACM, New York (2015). http://doi.acm.org/10.1145/2797115.2797124

  9. Patel-Schneider, P.F.: Analyzing Schema.org. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 261–276. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11964-9_17

  10. Prud’hommeaux, E., Labra Gayo, J.E., Solbrig, H.: Shape expressions: an RDF validation and transformation language. In: Proceedings of the 10th International Conference on Semantic Systems - SEM 2014, pp. 32–40 (2014)

  11. Simister, S., Brickley, D.: Simple application-specific constraints for RDF models. In: RDF Validation Workshop. Practical Assurances for Quality of RDF Data, Cambridge, MA, Boston, pp. 1–5 (2013). https://www.w3.org/2001/sw/wiki/images/0/00/SimpleApplication-SpecificConstraintsforRDFModels.pdf

Author information

Corresponding author

Correspondence to Umutcan Şimşek.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Şimşek, U., Kärle, E., Holzknecht, O., Fensel, D. (2018). Domain Specific Semantic Validation of Schema.org Annotations. In: Petrenko, A., Voronkov, A. (eds) Perspectives of System Informatics. PSI 2017. Lecture Notes in Computer Science, vol 10742. Springer, Cham. https://doi.org/10.1007/978-3-319-74313-4_31

  • DOI: https://doi.org/10.1007/978-3-319-74313-4_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-74312-7

  • Online ISBN: 978-3-319-74313-4

  • eBook Packages: Computer Science, Computer Science (R0)
