Bridging structured and unstructured data via hybrid semantic search and interactive ontology-enhanced query formulation

  • Regular Paper
Knowledge and Information Systems

Abstract

In this paper, we identify problems of current semantic and hybrid search systems, which seek to bridge structured and unstructured data, and propose solutions. We introduce a novel input mechanism for hybrid semantic search that combines the clean and concise input mechanisms of keyword-based search engines with the expressiveness of the input mechanisms provided by semantic search engines. This interactive input mechanism can be used to formulate ontology-aware search queries without prior knowledge of the ontology. Furthermore, we propose a system architecture that automatically fetches relevant unstructured data to complement the structured data stored in a Knowledge Base, and builds a combined index from both. This combined index can be used to conduct hybrid semantic searches that leverage information from structured and unstructured sources. We present the reference implementation, the Hybrid Semantic Search System (\(HS^3\)), which uses the combined index to put hybrid semantic search into practice and implements the interactive ontology-enhanced keyword-based input mechanism. For demonstration purposes, we apply \(HS^3\) to the tourism domain. We present performance test results and the results of a user evaluation. Finally, we provide instructions on how to apply \(HS^3\) to arbitrary domains.
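
To make the idea of a combined index concrete, the following is a minimal in-memory sketch in Python. It is not the \(HS^3\) implementation, and it does not reproduce the paper's architecture. Every name in it (`Entity`, `CombinedIndex`, `suggest`, `search`, the `ex:` identifiers and the sample tourism data) is a hypothetical illustration. It only shows, under those assumptions, how structured Knowledge Base entries and fetched free text could sit in one index, how ontology concepts could be suggested while the user types, and how a query could combine a structured filter with keywords.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One hypothetical Knowledge Base entry with a concept and properties."""
    uri: str
    concept: str                       # ontology class, e.g. "Hotel"
    properties: dict = field(default_factory=dict)


class CombinedIndex:
    """Toy index holding structured entries plus complementary free text."""

    def __init__(self):
        self.entities = {}                     # uri -> Entity
        self.text_terms = defaultdict(set)     # term -> set of uris

    def add(self, entity, fetched_text=""):
        # Store the structured record and index the unstructured text with it.
        self.entities[entity.uri] = entity
        for token in fetched_text.lower().split():
            term = token.strip(".,;:!?")
            if term:
                self.text_terms[term].add(entity.uri)

    def suggest(self, prefix):
        # Offer ontology concepts matching what the user has typed so far,
        # so no prior knowledge of the ontology is needed.
        concepts = {e.concept for e in self.entities.values()}
        return sorted(c for c in concepts
                      if c.lower().startswith(prefix.lower()))

    def search(self, concept=None, filters=None, keywords=()):
        # Structured part: restrict by concept and exact property values.
        hits = set()
        for uri, e in self.entities.items():
            if concept is not None and e.concept != concept:
                continue
            if any(e.properties.get(k) != v
                   for k, v in (filters or {}).items()):
                continue
            hits.add(uri)
        # Unstructured part: every keyword must occur in the fetched text.
        for kw in keywords:
            hits &= self.text_terms.get(kw.lower(), set())
        return sorted(hits)


index = CombinedIndex()
index.add(Entity("ex:hotel1", "Hotel", {"city": "Vienna"}),
          "Quiet rooms near the opera.")
index.add(Entity("ex:hotel2", "Hotel", {"city": "Graz"}),
          "Lively bar and rooftop pool.")
index.add(Entity("ex:museum1", "Museum", {"city": "Vienna"}),
          "Quiet galleries.")

print(index.suggest("ho"))
print(index.search(concept="Hotel", keywords=["quiet"]))
```

In this sketch the keyword "quiet" alone would match both a hotel and a museum. Restricting the query to the suggested concept `Hotel` narrows the result to the hotel entry, which is the kind of benefit a hybrid query is meant to provide.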

Notes

  1. http://twitter.com/.

  2. http://www.facebook.com/.

  3. http://lucene.apache.org/solr/.

  4. http://www.openrdf.org/.

  5. http://www.imdb.com/.

  6. http://lucene.apache.org/.

  7. http://code.google.com/webtoolkit/.

  8. http://www.tiscover.com/.

  9. http://trec.nist.gov/.

  10. http://www.ifs.tuwien.ac.at/ir/hybridsearch.

  11. http://www.ifs.tuwien.ac.at/ir/hybridsearch.

Author information

Corresponding author

Correspondence to Markus Gärtner.

About this article

Cite this article

Gärtner, M., Rauber, A. & Berger, H. Bridging structured and unstructured data via hybrid semantic search and interactive ontology-enhanced query formulation. Knowl Inf Syst 41, 761–792 (2014). https://doi.org/10.1007/s10115-013-0678-y

  • DOI: https://doi.org/10.1007/s10115-013-0678-y
