B-CWB: Bilingual Comparative Web Browser Based on Content-Synchronization and Viewpoint Retrieval

World Wide Web

Abstract

We propose a new way of browsing bilingual web sites: concurrent browsing with automatic similar-content synchronization and viewpoint retrieval. Our prototype browser system, the Bilingual Comparative Web Browser (B-CWB), concurrently presents bilingual web pages and automatically synchronizes their contents. The B-CWB allows users to browse multiple web news sites concurrently and compare the viewpoints of news articles written in different languages (English and Japanese). Our viewpoint retrieval is based on detecting similarities and differences. We describe how pages are categorized in terms of viewpoint: entire similarity, content difference, and subject difference. Content synchronization means that a user operation (scrolling or clicking) on one web page does not necessarily invoke the same operation on the other web page; instead, it invokes whatever operation preserves the similarity of content between the pages. For example, scrolling one web page may invoke passage-level viewpoint retrieval on the other web page. Clicking on one web page (and thereby obtaining a new web page) invokes page-level viewpoint retrieval within the other site's pages, using an English-Japanese dictionary.
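As a rough illustration of dictionary-based page-level retrieval, the following minimal sketch translates the terms of an English page through a bilingual dictionary and ranks candidate Japanese pages by cosine similarity of term counts. This is not the authors' implementation: the similarity measure, the toy dictionary `TOY_DICTIONARY`, and the function names are all assumptions made for illustration, and the Japanese pages are assumed to be already segmented into words.

```python
import math
from collections import Counter

# Hypothetical toy English-to-Japanese dictionary (illustrative only).
TOY_DICTIONARY = {
    "election": ["選挙"],
    "president": ["大統領"],
    "economy": ["経済"],
    "earthquake": ["地震"],
}


def translate_terms(english_terms, dictionary):
    """Map English terms to Japanese term counts, skipping unknown terms."""
    translated = Counter()
    for term in english_terms:
        for japanese_term in dictionary.get(term.lower(), []):
            translated[japanese_term] += 1
    return translated


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_page(english_terms, japanese_pages, dictionary):
    """Return (page_id, score) for the best-matching Japanese page.

    japanese_pages maps a page id to a list of pre-segmented Japanese terms.
    Returns (None, 0.0) when no page shares any translated term.
    """
    query = translate_terms(english_terms, dictionary)
    best_id, best_score = None, 0.0
    for page_id, terms in japanese_pages.items():
        score = cosine_similarity(query, Counter(terms))
        if score > best_score:
            best_id, best_score = page_id, score
    return best_id, best_score
```

In this sketch, clicking through to an English article would correspond to calling `most_similar_page` with that article's terms and the other site's pages; passage-level retrieval on scrolling could reuse the same functions with passages in place of whole pages.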

Author information

Corresponding author

Correspondence to Akiyo Nadamoto.

About this article

Cite this article

Nadamoto, A., Ma, Q. & Tanaka, K. B-CWB: Bilingual Comparative Web Browser Based on Content-Synchronization and Viewpoint Retrieval. World Wide Web 8, 347–367 (2005). https://doi.org/10.1007/s11280-005-1316-8

  • DOI: https://doi.org/10.1007/s11280-005-1316-8
