Conceptual Clustering of Documents for Automatic Ontology Generation

  • Reshmy Krishnan
  • Amir Hussain
  • Sherimon P.C.
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7888)


In information retrieval (IR), keyword-based retrieval often fails to meet user needs because it cannot always retrieve words that are relevant to the intended concept. Different words can represent the same concept (synonymy), and one word can represent different concepts (polysemy and homonymy), so mapping keywords to concepts raises a word sense disambiguation problem. Concept-based IR can be achieved by implementing a domain-dependent ontology. Because extracting semantic concepts from keywords is the initial phase of automatic ontology construction, this paper proposes an effective method for it. The Reuters-21578 collection is used as input, followed by indexing, training, and clustering with a Self-Organizing Map. Based on the feature vectors, documents are clustered using automatic concept selection in order to build the hierarchy. Clusters are represented hierarchically according to the topics assigned to them, and an ontology is generated automatically for each cluster based on its assigned topic.
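The indexing, feature-vector, and Self-Organizing Map steps summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus stands in for Reuters-21578, and the TF-IDF weighting, the 2x2 map size, the learning-rate and radius schedules, and the function names `tfidf_vectors`, `best_matching_unit`, and `train_som` are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into L2-normalised TF-IDF feature vectors."""
    vocab = sorted({term for doc in docs for term in doc})
    doc_freq = Counter(term for doc in docs for term in set(doc))
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [
            (counts[term] / len(doc)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term in vocab
        ]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def best_matching_unit(weights, vec):
    """Return the index of the map unit whose weights are closest to vec."""
    dists = [sum((w - x) ** 2 for w, x in zip(unit, vec)) for unit in weights]
    return dists.index(min(dists))


def train_som(vectors, rows=2, cols=2, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a small rectangular SOM; returns (weights, grid coordinates)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same map
    dim = len(vectors[0])
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in coords]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.1)  # shrinking neighbourhood
        for vec in vectors:
            bmu = coords[best_matching_unit(weights, vec)]
            for unit, (r, c) in zip(weights, coords):
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    unit[i] += lr * h * (vec[i] - unit[i])
    return weights, coords


# Hypothetical toy corpus (two rough topics), already tokenized.
docs = [
    "oil prices rise as crude supply falls".split(),
    "crude oil supply cut lifts prices".split(),
    "wheat harvest grain exports grow".split(),
    "grain exports rise after wheat harvest".split(),
]
vocab, vectors = tfidf_vectors(docs)
weights, coords = train_som(vectors)
# Each document is assigned to the map unit (cluster) that best matches it.
clusters = [best_matching_unit(weights, v) for v in vectors]
print(clusters)
```

Each entry of `clusters` is the index of a map unit, so documents sharing an index fall in the same cluster. Assigning a topic to each cluster and deriving a hierarchy and ontology from those topics, as the abstract describes, is outside the scope of this sketch.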


Keywords: homonymy, polysemy, information retrieval, indexing, feature vector, Self-Organizing Map, clustering





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Reshmy Krishnan (1)
  • Amir Hussain (2)
  • Sherimon P.C. (3)
  1. Department of Computing, Muscat College, Muscat, Sultanate of Oman
  2. Department of Computing Science and Mathematics, University of Stirling, Scotland, UK
  3. Faculty of Computer Studies, Arab Open University, Muscat, Sultanate of Oman
