Systematic review of spell-checkers for highly inflectional languages
Abstract
The performance of word processors, search engines, and social media platforms relies heavily on language tools such as spell-checkers and grammar checkers. A spell-checker breaks down the text to check it for spelling errors and warns the user of any unintentional misspelling. The field of spell-checking still lacks an exhaustive study that covers strengths, limitations, handled errors, and performance together with the evaluation parameters. The literature describes spell-checkers for many languages; they share similar characteristics but differ in design. This study follows the guidelines for systematic literature reviews and applies them to the field of spell-checking. The steps of the systematic review are applied to 130 selected articles published in leading journals, premier conferences, and workshops on spell-checking for different inflectional languages. These steps include framing the research questions, selecting the research articles, applying inclusion/exclusion criteria, and extracting the relevant information from the selected articles. The literature on spell-checking is divided into key sub-areas by language, and each sub-area is then described in terms of the technique used. The selected articles are analyzed against a set of criteria to reach the conclusions. This article suggests how techniques from other domains, such as morphology, part-of-speech tagging, chunking, stemming, and hash tables, can be used in the development of spell-checkers. It also highlights the major challenges faced by researchers, along with future research directions in the field of spell-checking.
Keywords
Spelling, Spell-check, Non-word errors, Real-word errors, Dictionary lookup, Edit-distance, Recurrent neural network (RNN)
Abbreviations
- FSM
Finite state machine
- DLM
Dictionary lookup method
- MA
Morphological analysis
- ED
Edit distance
- MED
Minimum edit distance
- US
Unicode splitting
- cbLSTM
Character-based long short-term memory
- SM
Soundex method
- LED
Levenshtein edit distance
- CS
Confusion set
- rMED
Reverse minimum edit distance
- DDLM
Direct dictionary lookup method
- EDM
Edit distance method
- PEM
Phonetic encoding method
- FSR
Finite state representation
- STM
State table method
- FSA
Finite state automata
- PAMC
Partition around medoid clustering
- DMEE
Double metaphone encoding
- WF
Word frequency
- S&SS
Sound and shape similarity
- rEDM
Reverse edit distance method
- HT
Hash table
- TBA
Tree-based algorithm
- POS
Parts of speech
- HMM
Hidden Markov model
- GUI
Graphical user interface
- FST
Finite state transition
- UWH
Unknown word handling
- UPH
Unknown proper noun handling
- API
Application programming interface
- CW
Constituent word
- MBLP
Memory-based language processing
- FSTM
Finite state transition model
- DA
Dictionary approach
- CC
Canti check
- CSO
Crowd sourcing
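Several of the abbreviations above (DLM, MED, LED) name techniques that are commonly combined for non-word error handling. The following is a minimal Python sketch of that combination, not taken from any of the reviewed systems: a word is first looked up in a dictionary, and if it is missing, candidates are ranked by Levenshtein edit distance. The names `levenshtein`, `check_word`, and `LEXICON`, the toy word list, and the distance threshold of 2 are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def check_word(word, lexicon, max_distance=2):
    """Dictionary lookup followed by edit-distance ranking.

    Returns (is_known, suggestions). The threshold max_distance is an
    illustrative assumption, not a value from the reviewed literature.
    """
    if word in lexicon:
        return True, []
    scored = sorted((levenshtein(word, entry), entry) for entry in lexicon)
    return False, [entry for dist, entry in scored if dist <= max_distance]


# Hypothetical toy lexicon; a real checker would load a full word list.
LEXICON = {"spell", "spelling", "checker", "language", "error"}

if __name__ == "__main__":
    print(check_word("speling", LEXICON))
    print(check_word("error", LEXICON))
```

This scans the whole lexicon for every unknown word, which is only practical for small word lists. For highly inflectional languages, where one root yields many surface forms, a flat word list of this kind grows quickly, which is one reason the abbreviation list also includes morphological analysis, hash tables, and finite state methods.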
Notes
Acknowledgements
The authors thank the reviewers for their insightful comments. The authors would also like to thank the Ministry of Electronics and IT, Government of India, for providing a fellowship under Grant Number: PhD-MLA-4 (69)/2015-16 (Visvesvaraya PhD Scheme for Electronics and IT) to pursue Ph.D. work.