A User-Oriented Approach to Evaluation and Documentation of a Morphological Analyzer

  • Gertrud Faaß
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 100)

Abstract

This article describes a user-oriented approach to evaluating and extensively documenting a morphological analyzer with a view to the normative descriptions of ISO and EAGLES. While current state-of-the-art work in this field often describes task-based evaluation, our users (presumably NLP non-experts, using the tool anonymously as part of a web service) expect extensive documentation of the tool itself, of the test suite used to validate it, and of the results of the validation process. ISO and EAGLES offer a good starting point for identifying the attributes to be evaluated. The documentation introduced in this article describes the analyzer in a way comparable to others by defining its features as attribute-value pairs (encoded in DocBook XML). Furthermore, the evaluation itself and its results are described. All documentation and the created test suites are online and free to use: http://www.ims.uni-stuttgart.de/projekte/dspin.

Keywords

German, documentation, evaluation, validation, verification, eHumanities, ISO 9126, EAGLES, morphological analyzer


References

  1. Bankhardt, C.: D-SPIN – Eine Infrastruktur für Deutsche Sprachressourcen. Sprachreport 25(1), 30–31 (2009)
  2. Baroni, M., Kilgarriff, A.: Large linguistically-processed web corpora for multiple languages. In: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 87–90 (2006)
  3. Barr, V.B., Klavans, J.L.: Verification and Validation of Language Processing Systems: Is it Evaluation? In: ACL 2001 Workshop on Evaluation Methodologies for Language and Dialogue Systems, pp. 34–40 (2001)
  4. Belz, A.: That’s Nice…What Can You Do With It? Comp. Ling. 35(1), 111–118 (2009)
  5. Bevan, N.: Quality in use: Meeting user needs for quality. J. Sys. Software 49(1), 89–96 (1999)
  6. EAGLES: Evaluation of Natural Language Processing Systems, EAG-EWG-PR.2, final report (1996)
  7. Faaß, G., Heid, U.: Nachhaltige Dokumentation virtueller Forschungsumgebungen. In: Tagungsband: 12. Internationales Symposium der Informationswissenschaft (ISI 2011), Hildesheim, Germany, March 9–11 (2011)
  8. Faaß, G., Heid, U., Schmid, H.: Design and application of a Gold Standard for morphological analysis: SMOR in validation. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 803–810 (2010)
  9. Fitschen, A.: Ein Computerlinguistisches Lexikon als komplexes System (PhD Dissertation). AIMS – Arbeitspapiere des Instituts für Maschinelle Sprachverarbeitung, vol. 10. Lehrstuhl für Computerlinguistik, Universität Stuttgart, Stuttgart (2004)
  10. Gonzales, A., Barr, V.: Validation and verification of intelligent systems – what are they and how are they different? J. Exp. Theor. Artif. Intell. 12(4), 407–420 (2000)
  11. Harris, L.E.: Prospects of Practical Natural Language Systems. In: Proceedings of the 18th Annual Meeting of the Association for Computational Linguistics, p. 129 (1980)
  12. Hausser, R. (ed.): Linguistische Verifikation. Dokumentation zur Ersten Morpholympics 1994. Niemeyer, Tübingen (1996)
  13. Hinrichs, M., Zastrow, T., Hinrichs, E.: WebLicht: Web-based LRT Services in a Distributed eScience Infrastructure. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 489–493 (2010)
  14. International Standard ISO/IEC 9126: Information technology – Software product evaluation – Quality characteristics and guidelines for their use. ISO, Geneva (1991)
  15. King, M., Underwood, N.: Evaluating symbiotic systems: the challenge. In: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), pp. 2475–2478 (2006)
  16. Kurimo, M., Varjokallio, M.: Unsupervised morpheme analysis evaluation by a comparison to a linguistic gold standard – Morpho Challenge 2008. In: Peters, C., Deselaers, T., Ferro, N., Gonzalo, J., Jones, G.J.F., Kurimo, M., Mandl, T., Peñas, A., Petras, V. (eds.) CLEF 2008. LNCS, vol. 5706. Springer, Heidelberg (2009)
  17. Kurimo, M., Virpioja, S., Turunen, V.T., Blackwood, G.W., Byrne, W.: Overview and results of Morpho Challenge 2009. In: Peters, C., Di Nunzio, G.M., Kurimo, M., Mostefa, D., Penas, A., Roda, G. (eds.) CLEF 2009. LNCS, vol. 6241, pp. 578–597. Springer, Heidelberg (2010)
  18. Lehmann, S., Oepen, S., Regnier-Prost, S., Netter, K., Lux, V., Klein, J., Falkedal, K., Fouvry, F., Estival, D., Dauphin, E., Compagnion, H., Baur, J., Balkan, L., Arnold, D.: TSNLP – Test Suites for Natural Language Processing. In: Proceedings of the 16th International Conference on Computational Linguistics, vol. 2, pp. 711–716 (1996)
  19. Mahlow, C., Piotrowski, M.: A Target-Driven Evaluation of Morphological Components for German. In: Searching Answers – Festschrift in Honour of Michael Hess on the Occasion of His 60th Birthday, pp. 85–99. MV-Verlag, Münster (2009)
  20. Manzi, S., King, M., Douglas, S.: Working towards User-oriented Evaluation. In: Proceedings of the International Conference on Natural Language Processing and Industrial Applications (NLP+IA 1996), pp. 155–160 (1996)
  21. Schiller, A.: Deutsche Flexions- und Kompositionsmorphologie mit PC-KIMMO. In: Hausser, R. (ed.) Linguistische Verifikation. Dokumentation zur Ersten Morpholympics, pp. 37–52. Niemeyer, Tübingen (1996)
  22. Schiller, A., Teufel, S., Stöckert, C., Thielen, C.: Vorläufige Guidelines für das Tagging deutscher Textcorpora mit STTS. Technical report, Universität Stuttgart, Institut für maschinelle Sprachverarbeitung, and Seminar für Sprachwissenschaft, Universität Tübingen (1995)
  23. Schmid, H.: A programming language for finite state transducers. In: Yli-Jyrä, A., Karttunen, L., Karhumäki, J. (eds.) FSMNLP 2005. LNCS (LNAI), vol. 4002, pp. 308–309. Springer, Heidelberg (2006)
  24. Schmid, H., Fitschen, A., Heid, U.: A German Computational Morphology Covering Derivation, Composition, and Inflection. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), pp. 1263–1266 (2004)
  25. Sparck Jones, K., Galliers, J.R.: Evaluating Natural Language Processing Systems. LNCS (LNAI), vol. 1083. Springer, Heidelberg (1996)
  26. Spiegler, S., Monson, C.: EMMA: A novel Evaluation Metric for Morphological Analysis. In: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pp. 1029–1037 (2010)
  27. Thompson, B.H.: Evaluation of Natural Language Interfaces to Data Base Systems. In: Proceedings of the 19th Annual Meeting of the Association for Computational Linguistics (ACL 1981), pp. 39–42 (1981)
  28. Underwood, N.L.: Issues in Designing a Flexible Validation Methodology for NLP Lexica. In: Rubio, A., Gallardo, N., Castro, R., Tejada, A. (eds.) Proceedings of the First International Conference on Language Resources and Evaluation, vol. 1, pp. 129–134 (1998)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gertrud Faaß
  1. Institut für maschinelle Sprachverarbeitung, Universität Stuttgart, Stuttgart, Germany