A Modeling Approach for the Extraction of Semantic Information from a Maritime Corpus

  • Dieudonné Tsatcha
  • Eric Saux
  • Christophe Claramunt
Part of the Advances in Geographic Information Science book series (AGIS)


This paper introduces an algorithm for retrieving semantic information from a maritime corpus. The method is based on Natural Language Processing (NLP) and combines the segmentation of large documents with the principles of a conceptual vector model (CVM) and word synsets. The research is applied to intelligent transport systems and maritime navigation. Working from documents that regulate maritime traffic, the approach provides an aid for navigational decision-making while significantly reducing the number of entities and relations required in the modeling process.
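The combination of segmentation and conceptual vectors can be illustrated with a minimal sketch. It is not the authors' algorithm: the three concepts, the toy lexicon, the sentence-based segmentation, and all function names are assumptions made for illustration. Each segment and the query are mapped to a vector over concepts, and segments are ranked by angular distance (the arccosine of cosine similarity), a measure commonly used with conceptual vectors. Synonyms such as "ship" and "vessel" are given the same vector to mimic a shared synset.

```python
import math
import re

# Hypothetical concepts and toy lexicon (assumptions, not taken from the paper).
CONCEPTS = ("navigation", "hazard", "regulation")
LEXICON = {
    "vessel": (1.0, 0.0, 0.2),
    "ship": (1.0, 0.0, 0.2),  # same vector as "vessel" to mimic a shared synset
    "channel": (0.8, 0.1, 0.1),
    "shoal": (0.2, 1.0, 0.0),
    "wreck": (0.1, 1.0, 0.0),
    "rule": (0.0, 0.0, 1.0),
    "speed": (0.5, 0.1, 0.6),
}


def segment(text, size=2):
    """Split text into segments of `size` consecutive sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]


def concept_vector(text):
    """Sum the concept vectors of every known term in the text (no stemming)."""
    vec = [0.0] * len(CONCEPTS)
    for token in re.findall(r"[a-z]+", text.lower()):
        for i, weight in enumerate(LEXICON.get(token, ())):
            vec[i] += weight
    return vec


def angular_distance(a, b):
    """Angle in radians between two vectors; pi/2 if either vector is null."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return math.pi / 2
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard rounding error
    return math.acos(cosine)


def rank_segments(text, query, size=2):
    """Return the segments of `text`, closest to the query first."""
    query_vec = concept_vector(query)
    return sorted(
        segment(text, size),
        key=lambda seg: angular_distance(concept_vector(seg), query_vec),
    )
```

Calling `rank_segments(document, "wreck shoal")` on a short regulatory text would place hazard-related segments first. A realistic system would replace the toy lexicon with vectors derived from a thesaurus or lexical database.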


Natural language processing, Conceptual vector model, Semantics, Navigational decision aid



The authors are grateful to the “Établissement Principal du Service Hydrographique et Océanographique de la Marine” (Brest, France) for providing nautical documents and for their assistance in this study. We also thank Thierry Hamon for his support in the use of the Yatea software, Mathieu Lafourcade for his introduction to conceptual vectors, Daniel Howe for his help with RiWordnet, and Alicino Ferreira for his careful proofreading of this article.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Dieudonné Tsatcha (1)
  • Eric Saux (1)
  • Christophe Claramunt (1)
  1. GIS group, Lanvéoc-Poulmic, Naval Academy Research Institute, Brest Cedex 9, France
