An empirically validated, onomasiologically structured, and linguistically motivated online terminology

Re-designing scientific resources on German grammar
  • Karolina Suchowolec
  • Christian Lang
  • Roman Schneider
Abstract

Terminological resources play a central role in the organization and retrieval of scientific texts. Both simple keyword lists and advanced models of the relationships between terminological concepts can contribute substantially to analyzing, classifying, and finding appropriate digital documents, either on the web or within local repositories. This seems especially true for long-established scientific fields with elusive theoretical and historical branches, where the use of terminology across documents of different origins is often far from consistent. In this paper, we report on the progress of a linguistically motivated project on the onomasiological re-modeling of the terminological resources for the grammatical information system grammis. We present the design principles and the results of their application. In particular, we focus on new features for the authoring backend and discuss how these innovations help to evaluate existing, loosely structured terminological content and to deal efficiently with automatic term extraction. Furthermore, we introduce a transformation to a future SKOS representation. We conclude by positioning our resources within the Knowledge Organization discourse and discuss how a highly complex information environment like grammis benefits from the re-designed terminological KOS.

Keywords

Grammatical information system, Grammatical terminology, Grammatical KOS, Concept system visualization, SKOS, Example-based querying

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Karolina Suchowolec (1)
  • Christian Lang (2)
  • Roman Schneider (2)
  1. Technische Hochschule Köln, Cologne, Germany
  2. Institut für Deutsche Sprache (IDS), Mannheim, Germany
