Domain Specific Data Retrieval on the Semantic Web

  • Tuukka Ruotsalo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7295)

Abstract

Web content increasingly consists of structured, domain-specific data published in the Linked Open Data (LOD) cloud. Data collections in this cloud are by definition from different domains and are indexed with domain-specific ontologies and schemas. Such data requires retrieval methods that are effective for domain-specific collections annotated with semantic structure. Unlike previous research, we introduce a retrieval framework based on the well-known vector space model of information retrieval to fully support retrieval of Semantic Web data described in the Resource Description Framework (RDF) language. We propose an indexing structure, a ranking method, and a way to incorporate reasoning and query expansion into the framework. We evaluate the approach in ad-hoc retrieval using two domain-specific data collections. Compared to a baseline in which no reasoning or query expansion is used, experimental results show up to 76% improvement when an optimal combination of reasoning and query expansion is used.
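
The abstract does not spell out the indexing structure or the ranking method, so the following is only a minimal sketch of the general idea, not the framework evaluated in the paper. It assumes resources are indexed by the ontology concepts that annotate them, weighted with tf-idf and ranked by cosine similarity. "Reasoning" is approximated by adding superclasses at indexing time, and "query expansion" by adding direct subclasses to the query at a reduced weight. All names (`DOCS`, `SUBCLASS_OF`, the `ex:` concepts, the 0.5 weight) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy collection: resource -> concepts from its RDF annotations.
DOCS = {
    "painting1": ["ex:Painting", "ex:Helsinki"],
    "sculpture1": ["ex:Sculpture", "ex:Helsinki"],
    "vase1": ["ex:Vase", "ex:Berlin"],
}

# Hypothetical subclass hierarchy: child concept -> parent concept.
SUBCLASS_OF = {
    "ex:Painting": "ex:Artwork",
    "ex:Sculpture": "ex:Artwork",
    "ex:Vase": "ex:Artifact",
}


def infer_superclasses(concepts):
    """Index-time reasoning (assumed): add all ancestors of each concept."""
    inferred = list(concepts)
    for concept in concepts:
        while concept in SUBCLASS_OF:
            concept = SUBCLASS_OF[concept]
            inferred.append(concept)
    return inferred


def expand_query(concepts, weight=0.5):
    """Query expansion (assumed): add direct subclasses at a lower weight."""
    weighted = {c: 1.0 for c in concepts}
    for child, parent in SUBCLASS_OF.items():
        if parent in concepts and child not in weighted:
            weighted[child] = weight
    return weighted


def tf_idf_index(docs):
    """Build one sparse tf-idf vector per resource, plus the idf table."""
    n = len(docs)
    df = Counter(c for concepts in docs.values() for c in set(concepts))
    idf = {c: math.log(n / df[c]) + 1.0 for c in df}  # +1 keeps weights positive
    vectors = {
        d: {c: tf * idf[c] for c, tf in Counter(concepts).items()}
        for d, concepts in docs.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_concepts, docs, use_reasoning=False, use_expansion=False):
    """Return (resource, score) pairs with non-zero score, best first."""
    indexed = {
        d: infer_superclasses(c) if use_reasoning else list(c)
        for d, c in docs.items()
    }
    vectors, idf = tf_idf_index(indexed)
    if use_expansion:
        weights = expand_query(query_concepts)
    else:
        weights = {c: 1.0 for c in query_concepts}
    query = {c: w * idf[c] for c, w in weights.items() if c in idf}
    scored = [(d, cosine(query, v)) for d, v in vectors.items()]
    return sorted(((d, s) for d, s in scored if s > 0.0),
                  key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for flags in [(False, False), (True, False), (False, True)]:
        print(flags, rank(["ex:Artwork"], DOCS, *flags))
```

In this toy setup a query for `ex:Artwork` matches nothing in the baseline, because no resource is annotated with that concept directly. Either index-time inference or query expansion is enough to retrieve the painting and the sculpture, which is the kind of effect the abstract attributes to reasoning and query expansion.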

Keywords

Resource Description Framework; Query Expansion; Mean Average Precision; Vector Space Model; Indexing Strategy
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tuukka Ruotsalo (1, 2, 3)
  1. School of Information, University of California, Berkeley, USA
  2. Department of Media Technology, Aalto University, Finland
  3. Helsinki Institute for Information Technology (HIIT), Finland
