Identifying Quotations in Reference Works and Primary Materials

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5173)

Abstract

Identifying quotations from reference works in primary materials is an important feature for digital libraries. By adding corresponding citation links to the original text, we can help contextualize the source material. In this paper, we introduce an algorithm that identifies citations automatically, based on an analysis of the structure of quotations from three different reference works of Latin texts. An evaluation shows that this approach can find a large number of quotations that have no machine-actionable citations associated with them. Additionally, the approach can be applied to quotations that have been altered from their source in a range of ways.
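The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of the general task, not the authors' method: locating a quotation in a source text even when the wording has been altered. It swaps in a generic technique (a sliding token window scored with Python's standard `difflib.SequenceMatcher`). The function names, the `threshold` value, and the u/v and i/j spelling normalization are all assumptions made for this example.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation, and fold u/v and i/j spellings (assumed rule)."""
    text = text.lower().replace("v", "u").replace("j", "i")
    return re.findall(r"[a-z]+", text)


def find_quotation(quote, source, threshold=0.75):
    """Return (start_token, end_token, score) for the best window, or None.

    Hypothetical helper: slides a window the length of the quotation over the
    source tokens and keeps the window with the highest similarity ratio.
    """
    q = normalize(quote)
    s = normalize(source)
    if not q or len(s) < len(q):
        return None
    best = None
    for start in range(len(s) - len(q) + 1):
        window = s[start:start + len(q)]
        score = SequenceMatcher(None, q, window).ratio()
        # Strict comparison keeps the earliest window when scores tie.
        if best is None or score > best[2]:
            best = (start, start + len(q), score)
    return best if best[2] >= threshold else None


source = "Gallia est omnis divisa in partes tres, quarum unam incolunt Belgae"
altered = "Gallia omnis diuisa est in partes tres"  # reordered, variant spelling
match = find_quotation(altered, source)
```

Here `match` identifies the opening tokens of `source` despite the changed word order and spelling, while an unrelated string such as "arma virumque cano" yields `None`. A window-by-window scan like this is quadratic in practice and is meant only to make the problem concrete.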

This work was supported by a grant from the Mellon Foundation.

Editor information

Birte Christensen-Dalsgaard, Donatella Castelli, Bolette Ammitzbøll Jurik, Joan Lippincott

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ernst-Gerlach, A., Crane, G. (2008). Identifying Quotations in Reference Works and Primary Materials. In: Christensen-Dalsgaard, B., Castelli, D., Ammitzbøll Jurik, B., Lippincott, J. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2008. Lecture Notes in Computer Science, vol 5173. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87599-4_9

  • DOI: https://doi.org/10.1007/978-3-540-87599-4_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87598-7

  • Online ISBN: 978-3-540-87599-4

  • eBook Packages: Computer Science, Computer Science (R0)
