A Case Study of Using Domain Engineering for the Conflation Algorithms Domain

  • Conference paper
Formal Foundations of Reuse and Domain Engineering (ICSR 2009)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5791)

Abstract

In this study we used domain engineering as a method for gaining a deeper formal understanding of a class of algorithms. Specifically, we analyzed six stemming algorithms from four sub-domains of the conflation algorithms domain and developed formal domain models and generators based on these models. The application generator produces source code not only for affix removal stemmers but also for successor variety, table lookup, and n-gram stemmers. The generated stemmers were compared with manually developed stemmers in terms of stem similarity, source and executable sizes, and development and execution times. Five of the generated stemmers produced stems that were more than 99.9% identical to those of the manually developed stemmers. Some of the generated stemmers were as efficient as their manual equivalents and some were not.
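The stem similarity comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the two toy stemmers, the lookup table, the word list, and the function name `percent_identical` are all hypothetical, chosen only to show how the share of identical stems between two stemmers might be computed.

```python
from typing import Callable, Iterable

Stemmer = Callable[[str], str]

def strip_suffix_stemmer(word: str) -> str:
    """Toy affix-removal stemmer (hypothetical; not Porter, Lovins, or Paice)."""
    for suffix in ("ing", "ed", "s"):
        # Strip the first matching suffix, keeping at least three characters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical table for a toy table-lookup stemmer.
LOOKUP_TABLE = {"running": "run", "jumped": "jump", "cats": "cat"}

def table_lookup_stemmer(word: str) -> str:
    """Toy table-lookup stemmer: return the stored stem, else the word itself."""
    return LOOKUP_TABLE.get(word, word)

def percent_identical(a: Stemmer, b: Stemmer, words: Iterable[str]) -> float:
    """Percentage of words for which both stemmers return the same stem."""
    word_list = list(words)
    if not word_list:
        raise ValueError("word list must not be empty")
    same = sum(1 for w in word_list if a(w) == b(w))
    return 100.0 * same / len(word_list)

if __name__ == "__main__":
    sample = ["running", "jumped", "cats", "tree"]
    print(percent_identical(strip_suffix_stemmer, table_lookup_stemmer, sample))
```

In the study the comparison is between a generated stemmer and its manually developed counterpart for the same algorithm; here two different toy stemmers stand in for that pair so that the agreement is visibly below 100%.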

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yilmaz, O., Frakes, W.B. (2009). A Case Study of Using Domain Engineering for the Conflation Algorithms Domain. In: Edwards, S.H., Kulczycki, G. (eds) Formal Foundations of Reuse and Domain Engineering. ICSR 2009. Lecture Notes in Computer Science, vol 5791. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04211-9_9

  • DOI: https://doi.org/10.1007/978-3-642-04211-9_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04210-2

  • Online ISBN: 978-3-642-04211-9

  • eBook Packages: Computer Science, Computer Science (R0)
