i²dee: An Integrated and Interactive Data Exploration Environment Used for Ontology Design

  • Fabien Jalabert
  • Sylvie Ranwez
  • Vincent Derozier
  • Michel Crampes
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4248)


Many communities need to organize and structure data to improve its use and sharing, and much research has focused on this problem. Many solutions rely on a Terminological and Ontological Resource (TOR), which represents the domain knowledge for a given application. However, TORs are often designed without taking into account heterogeneous data from specific resources. In the biomedical domain, for example, these resources may be medical reports, bibliographical resources, or biological data extracted from GOA, Gene Ontology or KEGG. This paper presents an integrated visual environment for knowledge engineering that integrates heterogeneous data from domain databases. Relevant concepts and relations are extracted from these data resources using several analysis and processing steps. The resulting ontology embryo is visualized through a user-friendly adaptive interface that displays a knowledge map. The experiments and evaluations reported in this paper concern biological data.


  • Semantic Relation
  • Core Node
  • Ontology Learning
  • Ontology Design
  • Ultrametric Distance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Fabien Jalabert (1)
  • Sylvie Ranwez (1)
  • Vincent Derozier (1)
  • Michel Crampes (1)
  1. EMA/Site EERIE, LGI2P Research Center, Nîmes, France
