A Fast Algorithm for Words Reordering Based on Language Model

  • Theologos Athanaselis
  • Stelios Bakamidis
  • Ioannis Dologlou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4132)


In every language, words cannot be ordered randomly within a sentence; they must be arranged in certain ways, both globally and locally. Scrambling the words of a sentence renders it meaningless. Although manually collected grammatical rules can improve a grammar checker's diagnosis of word order errors, repairing those errors remains very difficult. This work proposes a method for repairing word order errors in English sentences by reordering the words of a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of the method is a permutation filtering approach, based on bigram probabilities, that reduces the search space of possible reorderings. The search space is reduced further by applying a threshold to the bigram probabilities. The experimental results show that more than 95% of the test sentences can be repaired using this technique. The comparative advantage of the method is that it is not restricted to a specific set of words and avoids the laborious and costly process of collecting word order errors to create error patterns. Unlike most approaches, the proposed method is applicable to any language, since a language model can be computed for any language. A parser and/or tagger is not necessary.
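The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`train`, `bigram_prob`, `candidates`, `trigram_hits`, `repair`), the toy corpus, the unsmoothed maximum-likelihood bigram estimate, the default threshold of 0.01, and the fallback of returning the input unchanged when no ordering survives are all assumptions made for illustration. The sketch builds orderings word by word, discards any partial ordering whose newest adjacent pair falls below the bigram threshold, and then picks the surviving ordering with the most trigram hits.

```python
from collections import Counter

def train(corpus):
    """Count unigrams, bigrams and trigrams in whitespace-tokenised sentences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    for sentence in corpus:
        w = sentence.lower().split()
        uni.update(w)
        bi.update(zip(w, w[1:]))
        tri.update(zip(w, w[1:], w[2:]))
    return uni, bi, tri

def bigram_prob(uni, bi, a, b):
    # Assumption: plain maximum-likelihood estimate, no smoothing.
    return bi[(a, b)] / uni[a] if uni[a] else 0.0

def candidates(words, uni, bi, threshold):
    """Yield orderings in which every adjacent pair passes the bigram threshold."""
    def extend(prefix, remaining):
        if not remaining:
            yield tuple(prefix)
            return
        tried = set()  # skip repeated words so duplicates are not re-expanded
        for i, w in enumerate(remaining):
            if w in tried:
                continue
            tried.add(w)
            # Prune: drop this branch if the new adjacent pair is too unlikely.
            if prefix and bigram_prob(uni, bi, prefix[-1], w) < threshold:
                continue
            yield from extend(prefix + [w], remaining[:i] + remaining[i + 1:])
    yield from extend([], list(words))

def trigram_hits(order, tri):
    """Number of consecutive word triples that were seen in training."""
    return sum(1 for t in zip(order, order[1:], order[2:]) if tri[t] > 0)

def repair(sentence, uni, bi, tri, threshold=0.01):
    words = sentence.lower().split()
    survivors = list(candidates(words, uni, bi, threshold))
    if not survivors:
        # Assumption: with no surviving ordering, return the input unchanged.
        return " ".join(words)
    return " ".join(max(survivors, key=lambda o: trigram_hits(o, tri)))

# Toy training corpus (hypothetical data, for illustration only).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a chair",
]
model = train(corpus)
print(repair("sat the on cat mat the", *model))  # the cat sat on the mat
```

Pruning during construction, rather than generating all n! permutations first, is what keeps the search small in this sketch: a rejected pair eliminates every ordering that would have continued from it.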


Keywords: Language Model, Fast Algorithm, Confusion Matrix, Training Corpus, Test Sentence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Theologos Athanaselis (1)
  • Stelios Bakamidis (1)
  • Ioannis Dologlou (1)
  1. Institute for Language and Speech Processing, Artemidos 6 and Epidavrou, Maroussi, Greece
