Advances in Data Management pp 25-48

Part of the Studies in Computational Intelligence book series (SCI, volume 223)

Integrated Retrieval from Web of Documents and Data

  • Krishnaprasad Thirunarayan
  • Trivikram Immaneni

Abstract

The Semantic Web is evolving into a property-linked web of data, conceptually different from but contained in the Web of hyperlinked documents. Data Retrieval techniques are typically used to retrieve data from the Semantic Web, while Information Retrieval techniques are used to retrieve documents from the Hypertext Web. We present a Unified Web model that integrates the two webs and formalizes the connection between them. We then present an approach to retrieving documents and data that captures the best of both worlds. Specifically, it improves recall for legacy documents and provides keyword-based search capability for the Semantic Web. We specify the Hybrid Query Language that embodies this approach and describe the prototype system SITAR that implements it. We conclude with areas of future work.

Keywords

Data Retrieval, Information Retrieval, Hypertext Web, Semantic Web, Unified Web, Hybrid Query Language

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Krishnaprasad Thirunarayan (1)
  • Trivikram Immaneni (1)
  1. Kno.e.sis Center, Department of Computer Science and Engineering, Wright State University, Dayton, USA
