Towards Simulation-Based Similarity of End User Browsing Processes
In increasingly sophisticated use cases, an end user needs to extract, combine, and aggregate information from various (often dynamic) web pages on different websites. Current search engines do not focus on combining information from several web pages to answer the user's overall information need. Semantic Web and Linked Data approaches usually take a static view of the data and rely on the providers' cooperation. Web automation scripts, initially developed for testing websites, allow end users to capture their browsing activities as executable processes and share them with other end users. A script can contain instructions for accessing, extracting, and merging (dynamic) information from various websites for a particular purpose. Existing techniques that let users search for scripts satisfying complex constraints are restricted to the scripts already in the repository; that is, they do not deduce new scripts that might satisfy the request as well. In this paper, we show how semantic descriptions of websites can be derived from such scripts, and how these descriptions, together with the usage information present in the scripts, can be used to obtain new scripts with similar functionality.
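The abstract does not spell out the formalism behind the "simulation-based similarity" of the title. As a rough illustration only, the following minimal sketch assumes that a browsing script can be modeled as a labeled transition system and computes the standard greatest simulation relation between two such systems. The state names, the action labels (`open`, `extract`, `login`), and the function `largest_simulation` are hypothetical and not taken from the paper.

```python
from itertools import product


def largest_simulation(states_a, trans_a, states_b, trans_b):
    """Return the largest relation R such that (p, q) in R means:
    every labeled step of p in system A can be matched by a step of q
    in system B with the same label, leading to a related pair again.

    trans_a / trans_b map a state to a list of (label, next_state) pairs.
    """
    # Start from all pairs and remove those that violate the condition
    # until nothing changes (greatest fixpoint by refinement).
    rel = set(product(states_a, states_b))
    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            for label, p_next in trans_a.get(p, ()):
                matched = any(
                    lbl == label and (p_next, q_next) in rel
                    for lbl, q_next in trans_b.get(q, ())
                )
                if not matched:
                    rel.discard((p, q))
                    changed = True
                    break
    return rel


# Hypothetical script A: open a page, then extract a value.
states_a = {"s0", "s1", "s2"}
trans_a = {"s0": [("open", "s1")], "s1": [("extract", "s2")]}

# Hypothetical script B: same steps, plus an optional login after opening.
states_b = {"t0", "t1", "t2", "t3"}
trans_b = {
    "t0": [("open", "t1")],
    "t1": [("extract", "t2"), ("login", "t3")],
}

# B can mimic everything A does, but A cannot mimic B's login step.
b_simulates_a = ("s0", "t0") in largest_simulation(states_a, trans_a, states_b, trans_b)
a_simulates_b = ("t0", "s0") in largest_simulation(states_b, trans_b, states_a, trans_a)
```

Under this assumed model, script B could stand in for script A wherever A's behavior is requested, while the converse does not hold; a similarity notion built on such a relation is asymmetric by design.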