
Looking for Information in Documents

  • Chapter
Fundamentals of Predictive Text Mining

Part of the book series: Texts in Computer Science ((TCS))


Abstract

A common task in natural language processing and text mining is the extraction and formatting of information from unstructured text. One can think of the end goal of information extraction as filling templates that codify the extracted information. The templates are then stored in a knowledge database for future use. This chapter describes several models and learning methods that can be used for information extraction. We focus on two major subtasks: extracting entities, such as person and organization names, from sentences, and determining the relationships among the extracted entities.
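To make the template-filling view concrete, the following is a minimal sketch in Python. It is not the chapter's method: the template class, its slot names, the small name lists, and the cue phrases are all hypothetical stand-ins for the learned entity and relation models the chapter describes.

```python
import re
from dataclasses import dataclass

# Hypothetical template; the slot names are illustrative assumptions.
@dataclass
class EmploymentTemplate:
    person: str
    organization: str
    relation: str = "works_for"

# Toy name lists standing in for a learned entity extractor (assumption).
PERSONS = {"Alice Smith", "Bob Jones"}
ORGANIZATIONS = {"Acme Corp", "Globex"}

def extract_entities(sentence):
    """Subtask 1: return (text, type, start) for each known name in the sentence."""
    found = []
    for names, label in ((PERSONS, "PERSON"), (ORGANIZATIONS, "ORGANIZATION")):
        for name in names:
            for match in re.finditer(re.escape(name), sentence):
                found.append((name, label, match.start()))
    return sorted(found, key=lambda entity: entity[2])

def fill_templates(sentence):
    """Subtask 2: relate persons to organizations when a cue phrase is present."""
    if not re.search(r"\b(works for|joined|is employed by)\b", sentence):
        return []
    entities = extract_entities(sentence)
    persons = [text for text, label, _ in entities if label == "PERSON"]
    orgs = [text for text, label, _ in entities if label == "ORGANIZATION"]
    return [EmploymentTemplate(p, o) for p in persons for o in orgs]

# Filled templates are collected in a simple list acting as the knowledge database.
knowledge_base = []
knowledge_base.extend(fill_templates("Alice Smith works for Acme Corp."))
```

In practice the name lists and cue phrases would be replaced by trained models; the sketch only shows how entity extraction and relation detection feed a template store.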



Author information

Corresponding author

Correspondence to Sholom M. Weiss.


Copyright information

© 2010 Springer-Verlag London Limited

About this chapter

Cite this chapter

Weiss, S.M., Indurkhya, N., Zhang, T. (2010). Looking for Information in Documents. In: Fundamentals of Predictive Text Mining. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-84996-226-1_6


  • DOI: https://doi.org/10.1007/978-1-84996-226-1_6

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84996-225-4

  • Online ISBN: 978-1-84996-226-1

  • eBook Packages: Computer Science, Computer Science (R0)
