Stemming is a process by which word endings or other affixes are removed or modified in order that word forms which differ in non-relevant ways may be merged and treated as equivalent. A computer program which performs such a transformation is referred to as a stemmer or stemming algorithm. The output of a stemming algorithm is known as a stem.
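As a rough illustration of this definition, the following minimal sketch strips a small, hand-picked set of English suffixes so that several forms of one word collapse to the same stem. The suffix list, the function name `toy_stem`, and the `min_stem_len` parameter are assumptions made for illustration only. This is not any published stemming algorithm, and a naive rule set like this one will fail to merge many related forms.

```python
# Hypothetical suffix list for illustration; real stemmers use far richer rules.
SUFFIXES = ("ations", "ation", "ings", "ing", "ions", "ion", "ed", "es", "s")


def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Remove the longest matching suffix, keeping at least min_stem_len letters."""
    word = word.lower()
    # Try longer suffixes first so that "ions" is preferred over "s".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


forms = ["connect", "connected", "connecting", "connection", "connections"]
stems = {toy_stem(w) for w in forms}
print(stems)  # all five forms share one stem
```

Here the five word forms are treated as equivalent because they map to a single stem. The stem itself need not be a dictionary word; it only has to be the same for the forms that should be merged.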
The need for stemming first arose in the field of information retrieval (IR), where queries containing search terms need to be matched against document surrogates containing index terms. With the development of computer-based systems for IR, the problem immediately arose that a small difference in form between a search term and an index term could result in a failure to retrieve some relevant documents. Thus, if a query used the term “explosion” and a document was indexed by the term “explosives,” there would be no match on this term (whether or...
Keywords: Retrieval Performance, Information Retrieval System, String Match, Distinct Word, Lexical Chain