
Machine Learning, Volume 34, Issue 1–3, pp 211–231

An Algorithm that Learns What's in a Name

  • Daniel M. Bikel
  • Richard Schwartz
  • Ralph M. Weischedel

Abstract

In this paper, we present IdentiFinder™, a hidden Markov model that learns to recognize and classify names, dates, times, and numerical quantities. We have evaluated the model in English (based on data from the Sixth and Seventh Message Understanding Conferences [MUC-6, MUC-7] and broadcast news) and in Spanish (based on data distributed through the First Multilingual Entity Task [MET-1]), and on speech input (based on broadcast news). We report results here on standard materials only to quantify performance on data available to the community, namely, MUC-6 and MET-1. Results have been consistently better than reported by any other learning algorithm. IdentiFinder's performance is competitive with approaches based on handcrafted rules on mixed case text and superior on text where case information is not available. We also present a controlled experiment showing the effect of training set size on performance, demonstrating that as little as 100,000 words of training data is adequate to get performance around 90% on newswire. Although we present our understanding of why this algorithm performs so well on this class of problems, we believe that significant improvement in performance may still be possible.

Keywords: named entity extraction, hidden Markov models

References

  1. Aberdeen, J., Burger, J., Day, D., Hirschman, L., Robinson, P., & Vilain, M. (1995). MITRE: Description of the Alembic system used for MUC-6. Proceedings of the Sixth Message Understanding Conference (MUC-6) (pp. 141–155). Columbia, Maryland: Morgan Kaufmann Publishers, Inc.
  2. Appelt, D.E., Jerry, R.H., Bear, J., Israel, D., Kameyama, M., Kehler, A., Martin, D., Myers, K., & Tyson, M. (1995). SRI International FASTUS system MUC-6 test results and analysis. Proceedings of the Sixth Message Understanding Conference (MUC-6) (pp. 237–248). Columbia, Maryland: Morgan Kaufmann Publishers, Inc.
  3. Bennett, S.W., Aone, C., & Lovell, C. (1997). Learning to tag multilingual texts through observation. Proceedings of the Second Conference on Empirical Methods in Natural Language Processing (pp. 109–116). Providence, Rhode Island: Morgan Kaufmann Publishers, Inc.
  4. Borthwick, A., Sterling, J., Agichtein, E., & Grishman, R. (1998). Description of the MENE named entity system as used in MUC-7. Proceedings of the Seventh Message Understanding Conference (MUC-7). Fairfax, Virginia: Morgan Kaufmann Publishers, Inc.
  5. Brill, E. (1995). Transformation-based error-driven learning and natural language processing: A case study in part-of-speech tagging. Computational Linguistics, 21(4), 543–565.
  6. Chinchor, N. (1995). Statistical significance of MUC-6 results. Proceedings of the Sixth Message Understanding Conference (MUC-6) (pp. 39–43). Columbia, Maryland: Morgan Kaufmann Publishers, Inc.
  7. Chinchor, N. (1998). MUC-7 named entity task definition dry run version, version 3.5, 17 September 1997. Proceedings of the Seventh Message Understanding Conference (MUC-7) (to appear). Fairfax, Virginia: Morgan Kaufmann Publishers, Inc. URL: ftp://online.muc.saic.com/NE/training/guidelines/NE.task.def.3.5.ps.
  8. Church, K. (1988). A stochastic parts program and noun phrase parser for unrestricted text. Proceedings of the Second Conference on Applied Natural Language Processing, Austin, Texas.
  9. Krupka, G. (1995). SRA: Description of the SRA system as used for MUC-6. Proceedings of the Sixth Message Understanding Conference (MUC-6) (pp. 221–235). Columbia, Maryland: Morgan Kaufmann Publishers, Inc.
  10. Merchant, R., Okurowski, M., & Chinchor, N. (1996). The multilingual entity task overview. Proceedings of the Tipster Text Program Phase II (pp. 445–447). Vienna, Virginia: Morgan Kaufmann Publishers, Inc.
  11. Rabiner, L.R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE.
  12. Sundheim, B., & Chinchor, N. (1995). Named entity task definition (version 2.1). Proceedings of the Sixth Message Understanding Conference (MUC-6) (pp. 319–332). Columbia, Maryland: Morgan Kaufmann Publishers, Inc.
  13. Viterbi, A.J. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, IT-13(2), 260–269.
  14. Weischedel, R. (1995). BBN: Description of the PLUM system as used for MUC-6. Proceedings of the Sixth Message Understanding Conference (MUC-6) (pp. 55–69). Columbia, Maryland: Morgan Kaufmann Publishers, Inc.
  15. Weischedel, R., Meteer, M., Schwartz, R., Ramshaw, L., & Palmucci, J. (1993). Coping with ambiguity and unknown words through probabilistic methods. Computational Linguistics, 19(2), 359–382.

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Daniel M. Bikel (1)
  • Richard Schwartz (1)
  • Ralph M. Weischedel (1)
  1. BBN Systems and Technologies, Cambridge
