Systematic review of spell-checkers for highly inflectional languages

Artificial Intelligence Review

Abstract

The performance of word processors, search engines and social media platforms relies heavily on language tools such as spell-checkers and grammar checkers. A spell-checker breaks down the text to check it for spelling errors and alerts the user to any unintentional misspelling. The field of spell-checking still lacks an exhaustive study that covers the strengths and limitations of existing systems, the errors they handle, and their performance together with the evaluation parameters used. The literature describes spell-checkers for many languages; they share similar characteristics but differ in design. This study follows the guidelines for systematic literature reviews and applies them to the field of spell-checking. The review steps are applied to 130 selected articles published in leading journals, premier conferences and workshops on spell-checking for different inflectional languages. These steps comprise framing the research questions, selecting the research articles, applying the inclusion/exclusion criteria and extracting the relevant information from the selected articles. The literature on spell-checking is divided into key sub-areas by language, and each sub-area is then described according to the technique used. The selected articles are analyzed against a set of criteria to reach the conclusions. This article suggests how techniques from other domains, such as morphological analysis, part-of-speech tagging, chunking, stemming and hash tables, can be used in the development of spell-checkers. It also highlights the major challenges faced by researchers and the future areas of research in spell-checking.
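To make the detection and suggestion steps mentioned in the abstract concrete, the following is a minimal sketch rather than any method taken from the reviewed articles. It assumes a toy lexicon stored in a Python set (a hash table), a simple regular-expression tokenizer that is adequate only for Latin-script text, and Levenshtein edit distance for ranking candidates. The names `edit_distance`, `check_text` and `LEXICON` are hypothetical and chosen for illustration.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def check_text(text: str, lexicon: set[str], max_distance: int = 1) -> dict[str, list[str]]:
    """Map each word missing from the lexicon to its nearby lexicon entries."""
    report: dict[str, list[str]] = {}
    # Assumption: \w+ tokenization; scripts with combining marks need a
    # script-aware tokenizer instead.
    for word in re.findall(r"\w+", text.lower()):
        if word in lexicon or word in report:
            continue  # hash-table lookup: known word, or already reported
        report[word] = sorted(
            entry for entry in lexicon
            if edit_distance(word, entry) <= max_distance
        )
    return report


# Toy lexicon, purely illustrative.
LEXICON = {"the", "spell", "checker", "flags", "errors"}

if __name__ == "__main__":
    print(check_text("The spel cheker flags erors", LEXICON))
```

This sketch handles only non-word errors, that is, tokens absent from the lexicon. For highly inflectional languages a flat word list is usually insufficient on its own, which is why the abstract points to techniques such as morphological analysis and stemming.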

Abbreviations

FSM: Finite state machine
DLM: Dictionary lookup method
MA: Morphological analysis
ED: Edit distance
MED: Minimum edit distance
US: Unicode splitting
cbLSTM: Character-based long short-term memory
SM: Soundex method
LED: Levenshtein edit distance
CS: Confusion set
rMED: Reverse minimum edit distance
DDLM: Direct dictionary lookup method
EDM: Edit distance method
PEM: Phonetic encoding method
FSR: Finite state representation
STM: State table method
FSA: Finite state automata
PAMC: Partition around medoid clustering
DMEE: Double metaphone encoding
WF: Word frequency
S&SS: Sound and shape similarity
rEDM: Reverse edit distance method
HT: Hash table
TBA: Tree-based algorithm
POS: Parts of speech
HMM: Hidden Markov model
GUI: Graphical user interface
FST: Finite state transition
UWH: Unknown word handling
UPH: Unknown proper noun handling
API: Application programming interface
CW: Constituent word
MBLP: Memory-based language processing
FSTM: Finite state transition model
DA: Dictionary approach
CC: Canti check
CSO: Crowd sourcing


Acknowledgements

The authors thank the reviewers for their insightful comments. The authors would also like to thank the Ministry of Electronics and IT, Government of India, for providing a fellowship under Grant Number: PhD-MLA-4 (69)/2015-16 (Visvesvaraya PhD Scheme for Electronics and IT) to pursue Ph.D. work.

Author information

Corresponding author

Correspondence to Shashank Singh.

About this article

Cite this article

Singh, S., Singh, S. Systematic review of spell-checkers for highly inflectional languages. Artif Intell Rev 53, 4051–4092 (2020). https://doi.org/10.1007/s10462-019-09787-4

  • DOI: https://doi.org/10.1007/s10462-019-09787-4
