Encoding Classifications into Lightweight Ontologies

  • Fausto Giunchiglia
  • Maurizio Marchese
  • Ilya Zaihrayeu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4011)


Classifications have been used for centuries to catalogue and search large sets of objects. In the early days these were mainly books; more recently they have come to include Web pages, pictures, and any kind of electronic information item. Classifications describe their contents using natural language labels, an approach that has proved very effective in manual classification. However, natural language labels show their limitations when one tries to automate the process, as they make it very hard to reason about classifications and their contents. In this paper we introduce the novel notion of Formal Classification: a graph structure in which labels are written in a propositional concept language. Formal Classifications turn out to be a form of lightweight ontology. This, in turn, allows us to reason about them, to associate with each node a normal form formula that univocally describes its contents, and to reduce document classification to reasoning about subsumption.
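The idea of reducing document classification to subsumption can be illustrated with a small sketch. It is not the algorithm of the paper: it assumes, for illustration only, that the formula of a node is the conjunction of the labels on the path from the root to that node, that a document is described by a single propositional formula, and that subsumption is checked by brute-force truth-table enumeration. The toy classification, the atom names, and the helpers `node_formula`, `subsumed_by`, and `classify` are all hypothetical.

```python
from itertools import product

# Formulas are nested tuples:
#   ("atom", name), ("not", f), ("and", f, g), ("or", f, g)
def A(name):
    return ("atom", name)

def atoms(f):
    """Collect the atom names occurring in formula f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(sub) for sub in f[1:]))

def holds(f, world):
    """Evaluate formula f under a truth assignment (dict: atom name -> bool)."""
    op = f[0]
    if op == "atom":
        return world[f[1]]
    if op == "not":
        return not holds(f[1], world)
    if op == "and":
        return holds(f[1], world) and holds(f[2], world)
    if op == "or":
        return holds(f[1], world) or holds(f[2], world)
    raise ValueError(f"unknown operator: {op}")

def subsumed_by(specific, general):
    """True if every assignment satisfying `specific` also satisfies `general`."""
    names = sorted(atoms(specific) | atoms(general))
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        if holds(specific, world) and not holds(general, world):
            return False
    return True

def node_formula(path_labels):
    """Assumption: a node's formula is the conjunction of the labels on its path."""
    formula = path_labels[0]
    for label in path_labels[1:]:
        formula = ("and", formula, label)
    return formula

# Hypothetical toy classification: node path -> label formulas from root to node.
CLASSIFICATION = {
    "Italy": [A("italy")],
    "Italy/Wine and Cheese": [A("italy"), ("or", A("wine"), A("cheese"))],
    "Italy/Pictures": [A("italy"), A("picture")],
}

def classify(document_formula):
    """Return the nodes whose formula subsumes the document's formula."""
    return [
        path
        for path, labels in CLASSIFICATION.items()
        if subsumed_by(document_formula, node_formula(labels))
    ]

# A document about Italian wine.
document = ("and", A("italy"), A("wine"))
print(classify(document))  # ['Italy', 'Italy/Wine and Cheese']
```

Truth-table enumeration is exponential in the number of atoms, so it is suitable only for a toy example; a realistic setting would presumably hand the same check to a propositional satisfiability solver.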


Keywords: Child Node, Parent Node, Description Logic, Formal Concept Analysis, Common Noun



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Fausto Giunchiglia (1)
  • Maurizio Marchese (1)
  • Ilya Zaihrayeu (1)
  1. Department of Information and Communication Technology, University of Trento, Italy
