Detecting Citation Types Using Finite-State Machines

  • Minh-Hoang Le
  • Tu-Bao Ho
  • Yoshiteru Nakamori
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)


This paper presents a method to extract citation types from scientific articles, viewed as an intrinsic part of emerging trend detection (ETD) in scientific literature. The work makes two main contributions: (1) a definition of six categories (types) of citations in the literature that are extractable, human-understandable, and appropriate for building the interest and utility functions in emerging trend detection models, and (2) a method to classify citation types using finite-state machines that requires neither user interaction nor explicit knowledge. Comparative experimental evaluations show the high performance of the method, and the proposed ETD model shows the crucial role of classified citation types in detecting emerging trends in scientific literature.
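
To make the idea of classifying a citing sentence with a finite-state machine concrete, the following is a minimal sketch only, not the authors' method. It assumes hand-written cue phrases, whereas the abstract states that the actual method needs no explicit knowledge. The labels in `PATTERNS`, the cue phrases, and the function names `build_fsm` and `classify` are all hypothetical; the paper's six citation categories are not listed in this abstract and are not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical labels and cue phrases, chosen only for illustration.
# They are NOT the six citation categories defined in the paper.
PATTERNS: Dict[str, List[str]] = {
    "based_on": ["based", "on"],
    "compared_with": ["compared", "with"],
    "extends": ["we", "extend"],
}


def build_fsm(
    patterns: Dict[str, List[str]],
) -> Tuple[Dict[Tuple[int, str], int], Dict[int, str]]:
    """Build a trie-shaped deterministic FSM from token sequences.

    Returns a transition table (state, token) -> next state and a map
    from accepting states to labels. State 0 is the start state.
    """
    transitions: Dict[Tuple[int, str], int] = {}
    accepting: Dict[int, str] = {}
    next_state = 1
    for label, tokens in patterns.items():
        state = 0
        for tok in tokens:
            key = (state, tok)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting[state] = label
    return transitions, accepting


def classify(
    sentence: str,
    transitions: Dict[Tuple[int, str], int],
    accepting: Dict[int, str],
) -> Optional[str]:
    """Return the label of the first cue phrase the FSM accepts, else None."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    # Restart the machine at every token position (simple substring search).
    for start in range(len(tokens)):
        state: Optional[int] = 0
        for tok in tokens[start:]:
            state = transitions.get((state, tok))
            if state is None:
                break  # no transition: this start position fails
            if state in accepting:
                return accepting[state]
    return None


if __name__ == "__main__":
    transitions, accepting = build_fsm(PATTERNS)
    print(classify("Our method is based on the model of [12].", transitions, accepting))
```

A learned model would replace the hand-written table with transitions estimated from training sentences; the sketch only illustrates how a state machine walks over the tokens of a citing sentence and emits a type.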


Hidden Markov Model, Concept Hierarchy, Linguistic Pattern, Training Sentence, Citation Type
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Minh-Hoang Le
    • 1
  • Tu-Bao Ho
    • 1
  • Yoshiteru Nakamori
    • 1
  1. School of Knowledge Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan
