A Fuzzy String Matching-Based Reduplication with Morphological Attributes

  • Conference paper
Pattern Recognition and Data Analysis with Applications

Abstract

String matching is a common problem in computer science and a frequent operation in language processing tasks. Several efficient algorithms have been developed for it, such as the Knuth-Morris-Pratt (KMP) algorithm [1], the Rabin-Karp algorithm [2], and matching with a finite-state machine. In natural language processing, however, string matching problems are much more complex, and conventional string matching algorithms do not meet their requirements. This paper presents one such problem: string matching for partial reduplication. To address it, a fuzzy string matching technique is proposed. In this approach, the fuzzy membership is based not only on character/symbol or sub-string matching but also on other grammatical information, such as morphological information and prosodic patterns. The experiments are carried out on a Bengali dataset, and the system is finally tested on real-life text to measure its accuracy.
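To make the idea in the abstract concrete, the following is a minimal Python sketch of a fuzzy membership score that mixes character-level similarity with one grammatical attribute. It is not the authors' membership function, which the abstract does not specify. The function names, the boolean `same_pos` attribute, the weight `weight_chars`, and the romanized word pair are all assumptions chosen for illustration; only `difflib.SequenceMatcher` is a real standard-library API.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], using difflib's ratio."""
    return SequenceMatcher(None, a, b).ratio()


def reduplication_membership(
    first: str,
    second: str,
    same_pos: bool,
    weight_chars: float = 0.75,
) -> float:
    """Hypothetical fuzzy membership of (first, second) in the set of
    reduplicated pairs.

    Assumption: the score is a weighted sum of character similarity and a
    single morphological attribute (whether both halves share a
    part-of-speech tag). The real system may combine more attributes,
    such as prosodic patterns, in a different way.
    """
    morph_score = 1.0 if same_pos else 0.0
    return weight_chars * char_similarity(first, second) + (1.0 - weight_chars) * morph_score


if __name__ == "__main__":
    # Illustrative romanized echo-word pair; exact matching would reject it
    # because the two halves differ in their first character.
    print(reduplication_membership("jol", "tol", same_pos=True))
    print(reduplication_membership("jol", "tol", same_pos=False))
```

An exact matcher such as KMP returns only match or no match, whereas a graded score like this lets a partially matching pair be accepted or rejected against a threshold, with the grammatical attribute raising or lowering the score.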

References

  1. Knuth, D.E., Morris, J.H., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6(2), 323–350 (1977)

  2. Karp, R.M., Rabin, M.O.: Efficient randomized pattern-matching algorithms. IBM J. Res. Dev. 31(2), 249–260 (1987)

  3. Rubino, C.: Reduplication. Max Planck Institute for Evolutionary Anthropology, Leipzig (2013)

  4. Senapati, A., Garain, U.: A computational approach for corpus based analysis of reduplicated words in Bengali. In: International Conference on Intelligent Text Processing and Computational Linguistics, pp. 456–466. Springer, Cham (2015)

  5. Millward, C.M., Hayes, M.: A Biography of the English Language. Nelson Education (2011)

  6. Crystal, D.: The Cambridge Encyclopedia of the English Language. Ernst Klett Sprachen (2004)

  7. Burridge, K.: Gift of the Gob: Morsels of English Language History. Harper Collins (2010)

  8. Chattopadhyay, S.K.: Bhasa-Prakash Bangala Vyakaran, 3rd edn. Pupa publication (1992)

  9. Chaudhuri, B.B.: Bangla Dhwanipratik: Swarup o Abhidhan (Bangla Sound Symbolism: Properties and Dictionary). Paschimbanga Bangla Academy, Kolkata (2010)

  10. Thompson, H.R.: Bengali: A Comprehensive Grammar, pp. 663–672. Routledge publication (2010)

  11. Bandyopadhyay, S.: Identification of reduplication in Bengali corpus and their semantic analysis: a rule-based approach. In: Proceedings of the Workshop on Multiword Expressions: From Theory to Applications (MWE 2010), pp. 72–75. Beijing (2010)

  12. Dash, N.: A Descriptive Study of Bengali Words, pp. 225–251. CUP (2015)

  13. Dolatian, H., Heinz, J.: Modeling reduplication with 2-way finite-state transducers. In: Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology, pp. 66–77 (2018)

  14. Walther, M.: Finite-state reduplication in one-level prosodic morphology. arXiv preprint arXiv:cs/0005025v1 (2000)

  15. Beesley, K.R., Karttunen, L.: Finite-State Morphology: Xerox Tools and Techniques. CSLI, Stanford (2003)

  16. Cohen-Sygal, Y., Wintner, S.: Finite-state registered automata for non-concatenative morphology. Comput. Linguist. 32(1), 49–82 (2006)

  17. Hulden, M.: Finite-state machine construction methods and algorithms for phonology and morphology (2009)

  18. Hulden, M., Bischoff, S.T.: A simple formalism for capturing reduplication in finite-state morphology. In: Proceedings of the 2009 Conference on Finite-State Methods and Natural Language Processing: Post-proceedings of the 7th International Workshop FSMNLP 2008, pp. 207–214 (2009)

  19. Zadeh, L.A.: Fuzzy sets. Inf. Control 8(3), 338–353 (1965)

Author information

Correspondence to Soumen Maji.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Senapati, A., Mondal, A., Maji, S. (2022). A Fuzzy String Matching-Based Reduplication with Morphological Attributes. In: Gupta, D., Goswami, R.S., Banerjee, S., Tanveer, M., Pachori, R.B. (eds) Pattern Recognition and Data Analysis with Applications. Lecture Notes in Electrical Engineering, vol 888. Springer, Singapore. https://doi.org/10.1007/978-981-19-1520-8_14

  • DOI: https://doi.org/10.1007/978-981-19-1520-8_14

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-1519-2

  • Online ISBN: 978-981-19-1520-8

  • eBook Packages: Computer Science, Computer Science (R0)
