Integrating Vocabularies: Discovering and Representing Vocabulary Maps

  • Borys Omelayenko
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2342)


The Semantic Web would enable new ways of doing business on the Web that require the development of advanced business document integration technologies capable of intelligent document transformation. The documents use different vocabularies that consist of large hierarchies of terms. Accordingly, vocabulary mapping and transformation becomes an important task in the whole business document transformation process. This task includes several subtasks (map discovery, map representation, and map execution) that must be seamlessly integrated into the document integration process. In this paper we discuss the process of discovering the maps between two vocabularies, assuming the availability of two sets of documents, each using one of the vocabularies. We take the vocabularies of product classification codes as a playground and propose a reusable map discovery technique based on a Bayesian text classification approach. We show how the discovered maps can be integrated into the document transformation process.
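As a rough illustration of the map discovery idea described above, the following sketch trains a simple multinomial Naive Bayes classifier on documents labeled with codes from one vocabulary, classifies documents labeled with codes from the other, and takes the most frequent prediction for each code as its map. This is not the paper's implementation: the function names (`train_naive_bayes`, `classify`, `discover_map`), the product codes, the toy descriptions, and the majority-vote rule are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Collect class and word counts from (text, code) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, code in docs:
        class_counts[code] += 1
        for word in text.lower().split():
            word_counts[code][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(model, text):
    """Return the most probable code for a text (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_code, best_score = None, -math.inf
    for code, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[code].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            likelihood = (word_counts[code][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


def discover_map(docs_a, docs_b):
    """Map each vocabulary-B code to the vocabulary-A code predicted most often."""
    model = train_naive_bayes(docs_a)
    votes = defaultdict(Counter)
    for text, code_b in docs_b:
        votes[code_b][classify(model, text)] += 1
    return {code_b: counts.most_common(1)[0][0] for code_b, counts in votes.items()}


# Hypothetical toy data: product descriptions with invented codes.
docs_a = [
    ("ballpoint pen blue ink", "A-PEN"),
    ("gel pen black ink", "A-PEN"),
    ("laser printer paper white", "A-PAPER"),
    ("copy paper ream white", "A-PAPER"),
]
docs_b = [
    ("blue ink pen", "B-100"),
    ("black pen", "B-100"),
    ("white copy paper", "B-200"),
]

vocabulary_map = discover_map(docs_a, docs_b)
print(vocabulary_map)
```

The majority vote is only one possible way to turn per-document predictions into a code-to-code map; a real setting with large term hierarchies would likely need confidence thresholds and handling of one-to-many maps.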



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Borys Omelayenko, Division of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands
