An Infrastructure for Acquiring High Quality Semantic Metadata

  • Yuangui Lei
  • Marta Sabou
  • Vanessa Lopez
  • Jianhan Zhu
  • Victoria Uren
  • Enrico Motta
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4011)


Because the metadata that underlies semantic web applications is gathered from distributed and heterogeneous data sources, it is important to ensure its quality (i.e., to reduce duplicates, spelling errors, and ambiguities). However, current infrastructures that acquire and integrate semantic data have only marginally addressed the issue of metadata quality. In this paper we present our metadata acquisition infrastructure, ASDI, which pays special attention to ensuring that high-quality metadata is derived. Central to the architecture of ASDI is a verification engine that relies on several semantic web tools to check the quality of the derived data. We tested our prototype in the context of building a semantic web portal for our lab, KMi. An experimental evaluation comparing the automatically extracted data against manual annotations indicates that the verification engine enhances the quality of the extracted semantic metadata.
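
The comparison of extracted data against manual annotations can be illustrated with a minimal sketch. The abstract does not state which measures the evaluation used, so precision and recall over sets of (entity, type) pairs are an assumption here, and the function name and sample data are hypothetical rather than taken from ASDI.

```python
def precision_recall(extracted, manual):
    """Score extracted annotations against a manually built gold set.

    Both arguments are iterables of hashable annotations; here they are
    (entity, type) pairs, which is an assumption made for illustration.
    """
    extracted, manual = set(extracted), set(manual)
    correct = extracted & manual
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(manual) if manual else 0.0
    return precision, recall


# Hypothetical sample data, not results from the paper.
manual = {
    ("Enrico Motta", "Person"),
    ("KMi", "Organization"),
    ("AquaLog", "Project"),
}
# Raw extraction: one wrongly typed entity and one spurious entity.
raw = {
    ("Enrico Motta", "Person"),
    ("KMi", "Person"),
    ("AquaLog", "Project"),
    ("Open", "Organization"),
}
# The same data after a (hypothetical) verification step fixed both errors.
verified = {
    ("Enrico Motta", "Person"),
    ("KMi", "Organization"),
    ("AquaLog", "Project"),
}

raw_scores = precision_recall(raw, manual)
verified_scores = precision_recall(verified, manual)
print("raw:", raw_scores)
print("verified:", verified_scores)
```

Scoring the same extraction before and after verification against one gold set is one simple way to quantify whether a verification step helps; other measures would serve equally well.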


News Story, Domain Ontology, Manual Annotation, Name Entity Recognition, Semantic Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yuangui Lei (1)
  • Marta Sabou (1)
  • Vanessa Lopez (1)
  • Jianhan Zhu (1)
  • Victoria Uren (1)
  • Enrico Motta (1)

  1. Knowledge Media Institute (KMi), The Open University, Milton Keynes
