Abstract
A common task in natural language processing and text mining is the extraction and formatting of information from unstructured text. One can think of the end goal of information extraction as filling templates that codify the extracted information. The templates are then stored in a knowledge database for future use. This chapter describes several models and learning methods that can be applied to information extraction. We focus on two major subtasks: extracting entities, such as person and organization names, from sentences, and determining the relationships among the extracted entities.
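To make the template-filling view concrete, here is a minimal sketch of the two subtasks in Python. It is only an illustration under stated assumptions, not one of the models the chapter describes: the gazetteers `PERSONS` and `ORGANIZATIONS`, the `EmploymentTemplate` class, and the cue phrases are all hypothetical stand-ins for a learned entity extractor and relation classifier.

```python
import re
from dataclasses import dataclass

# Hypothetical gazetteers standing in for a trained entity extractor.
PERSONS = {"Alice Smith", "Bob Jones"}
ORGANIZATIONS = {"Acme Corp", "Globex"}


@dataclass
class EmploymentTemplate:
    """A template with two slots, filled from one sentence."""
    person: str
    organization: str


def extract_entities(sentence):
    """Return (text, type, start) for every gazetteer entry found, in text order."""
    found = []
    for names, label in ((PERSONS, "PERSON"), (ORGANIZATIONS, "ORG")):
        for name in names:
            for match in re.finditer(re.escape(name), sentence):
                found.append((name, label, match.start()))
    return sorted(found, key=lambda entity: entity[2])


def extract_employment(sentence):
    """Fill a template when a PERSON is followed by an ORG with a cue phrase between them."""
    entities = extract_entities(sentence)
    templates = []
    for p_text, p_type, p_start in entities:
        if p_type != "PERSON":
            continue
        for o_text, o_type, o_start in entities:
            if o_type != "ORG" or o_start < p_start:
                continue
            between = sentence[p_start + len(p_text):o_start]
            # Assumed cue phrases; a real system would learn this decision.
            if re.search(r"\b(works for|joined|is employed by)\b", between):
                templates.append(EmploymentTemplate(p_text, o_text))
    return templates


# The filled templates play the role of entries in a knowledge database.
knowledge_base = extract_employment("Alice Smith joined Acme Corp in 2009.")
print(knowledge_base)
```

In this sketch the entity step and the relation step are kept as separate functions, mirroring the two subtasks named in the abstract; replacing either with a statistical model would not change the shape of the filled template.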
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Weiss, S.M., Indurkhya, N., Zhang, T. (2010). Looking for Information in Documents. In: Fundamentals of Predictive Text Mining. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-84996-226-1_6
DOI: https://doi.org/10.1007/978-1-84996-226-1_6
Publisher Name: Springer, London
Print ISBN: 978-1-84996-225-4
Online ISBN: 978-1-84996-226-1
eBook Packages: Computer Science, Computer Science (R0)