The VLDB Journal, Volume 22, Issue 5, pp 641–663

Exploratory search framework for Web data sources

  • Alessandro Bozzon
  • Marco Brambilla
  • Stefano Ceri
  • Davide Mazza
Special Issue Paper

Abstract

Exploratory search is an information seeking behavior where users progressively learn about one or more topics of interest; it departs quite radically from traditional keyword-based query paradigms, as it combines querying and browsing of resources, and covers activities such as investigating, evaluating, comparing, and synthesizing retrieved information. In most cases, such activities are enabled by a conceptual description of information in terms of entities and their semantic relationships. Customized Web applications, where a few application entities and their relationships are embedded within the application logic, typically provide some support to exploratory search, which is, however, specific to a given domain. In this paper, we describe a general-purpose exploratory search framework, i.e., a framework which is neutral to the application logic. Our contribution consists of the formalization of the exploratory search paradigm over Web data sources, accessed by means of services; extracted information is described by means of an entity-relationship schema, which masks the service implementations. Exploratory interaction is supported by a general-purpose user interface including a set of widgets for data exploration, from big tables to atomic tables, visual diagrams, and geographic maps; the user interaction is translated to queries defined in SeCoQL, a SQL-like language and protocol specifically designed for supporting exploratory search over data sources. We illustrate the software architecture of our prototype, which uses the interplay of a query and result management system with an orchestrator, capable of incrementally building queries and of walking through the past navigation history. The distinctive feature of the framework is the ability to extract top solutions, which combine top-ranked entity instances.
We evaluate exploratory search from the end-user perspective in the context of a cognitive model for search, by studying the user’s behavior and the effectiveness of exploratory search in terms of quality of results produced by the search process; we also compare the effectiveness of interaction in using our multi-domain search system with the use of various replicas of the system, each acting upon a single domain, and with the use of conventional search engines.

Keywords

User interfaces, Search, Exploratory search, Structured Web data, Design, Experimentation, Performance

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Alessandro Bozzon (1)
  • Marco Brambilla (1)
  • Stefano Ceri (1)
  • Davide Mazza (1)

  1. Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
