The algorithms for preliminary text processing: Decomposition, annotation, morphological analysis

Automatic Documentation and Mathematical Linguistics

Abstract

This paper reviews existing algorithms for preliminary text processing and proposes new ones that improve its quality: a deduction-inversion architecture for decomposition, a modified bidirectional inference algorithm, and morphological analysis based on preliminary annotation with part-of-speech tags.

Additional information

Original Russian Text © V.A. Yatsko, M.S. Starikov, E.V. Larchenko, T.N. Vishnyakov, 2009, published in Nauchno-Tekhnicheskaya Informatsiya, Seriya 2, 2009, No. 11, pp. 24–30.

Cite this article

Yatsko, V.A., Starikov, M.S., Larchenko, E.V. et al. The algorithms for preliminary text processing: Decomposition, annotation, morphological analysis. Autom. Doc. Math. Linguist. 43, 336–343 (2009). https://doi.org/10.3103/S0005105509060041

  • DOI: https://doi.org/10.3103/S0005105509060041
