Part of the book series: Studies in Computational Intelligence (SCI, volume 467)

Abstract

Bibliographic references between scholarly publications contain valuable information for researchers and developers involved with digital repositories. They indicate topical similarity between linked texts and the impact of the referenced document, and they improve navigation in the user interfaces of digital libraries. Consequently, several approaches to extracting, parsing and resolving such references have been proposed to date. In this paper we develop a methodology for evaluating parsing and matching algorithms and choosing the most appropriate one for a document collection at hand. We apply the methodology to evaluate the reference parsing and matching module of the YADDA2 software platform.

Author information

Corresponding author

Correspondence to Mateusz Fedoryszak.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Fedoryszak, M., Bolikowski, Ł., Tkaczyk, D., Wojciechowski, K. (2013). Methodology for Evaluating Citation Parsing and Matching. In: Bembenik, R., Skonieczny, L., Rybinski, H., Kryszkiewicz, M., Niezgodka, M. (eds) Intelligent Tools for Building a Scientific Information Platform. Studies in Computational Intelligence, vol 467. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35647-6_11

  • DOI: https://doi.org/10.1007/978-3-642-35647-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35646-9

  • Online ISBN: 978-3-642-35647-6

  • eBook Packages: Engineering, Engineering (R0)
