Discovering Relationships Among Catalogs

  • Ryutaro Ichise
  • Masahiro Hamasaki
  • Hideaki Takeda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3245)


When a large amount of information must be managed, it is usually organized into a hierarchy of categories to which every item is assigned. The Yahoo! Internet directory is one such example. This paper proposes a new method for integrating two catalogs with hierarchical categories. The proposed method uses not only the contents of the information but also the structures of both category hierarchies. To evaluate the proposed method, we conducted experiments using two actual Internet directories, Yahoo! and Google. The results show improved performance compared with previous approaches.
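
As a rough illustration of what combining instance contents with category structure can mean, the sketch below represents each catalog as a mapping from a category path to the set of item URLs filed directly under it, rolls items up from descendant categories to their ancestors, and scores cross-catalog category pairs by Jaccard overlap. This is not the method proposed in the paper: the toy data, the names `rolled_up`, `jaccard` and `best_matches`, and the choice of Jaccard similarity are all assumptions made purely for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: category path -> items filed directly there.
Catalog = Dict[str, Set[str]]


def rolled_up(catalog: Catalog) -> Catalog:
    """Return each category with its own items plus those of all descendants."""
    result: Catalog = {}
    for path in catalog:
        prefix = path + "/"
        items = set(catalog[path])
        for other, other_items in catalog.items():
            if other.startswith(prefix):  # 'other' lies below 'path'
                items |= other_items
        result[path] = items
    return result


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared items divided by all items; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(source: Catalog, target: Catalog) -> Dict[str, Tuple[str, float]]:
    """For every source category, pick the target category with highest overlap."""
    src, tgt = rolled_up(source), rolled_up(target)
    matches: Dict[str, Tuple[str, float]] = {}
    for s_path, s_items in src.items():
        scored = [(jaccard(s_items, t_items), t_path) for t_path, t_items in tgt.items()]
        score, t_path = max(scored, key=lambda pair: pair[0])
        matches[s_path] = (t_path, score)
    return matches


# Toy data (invented): two small directories sharing some URLs.
catalog_a: Catalog = {
    "Recreation": set(),
    "Recreation/Autos": {"u1", "u2"},
    "Recreation/Boats": {"u3"},
}
catalog_b: Catalog = {
    "Hobbies": {"u4"},
    "Hobbies/Cars": {"u1", "u2", "u5"},
    "Hobbies/Boating": {"u3"},
}

matches = best_matches(catalog_a, catalog_b)
for category, (target, score) in matches.items():
    print(f"{category} -> {target} ({score:.2f})")
```

In this toy data the internal node `Recreation` holds no items of its own, so a purely content-based comparison would have nothing to match it on; only after rolling up the items of its subcategories does it align with `Hobbies`. That is the kind of contribution the hierarchy can make alongside the instances themselves.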


Keywords: Internal Node, Hierarchical Category, Concept Hierarchy, Category Instance, Information Instance
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Ryutaro Ichise (1, 2)
  • Masahiro Hamasaki (2)
  • Hideaki Takeda (1, 2)

  1. National Institute of Informatics, Tokyo, Japan
  2. The Graduate University for Advanced Studies, Tokyo, Japan
