Text Information Extraction and Retrieval
Every day people spend much time on creating, processing, and accessing information. In fact, most of the information exists in the form of "text", contained in books, emails, web pages, news paper articles, blogs, and reports. How to help people quickly find information from text data and how to help people discover new knowledge from text data has become an enormously important issue. Many research efforts have been made on text information extraction, retrieval, and mining; and significant progress has made in recent years. A large number of new methods have been proposed, and many systems have been developed and put into practical uses. This tutorial is aimed at giving an overview on two central topics of the area: namely Information Extraction (IE) and Information Retrieval (IR). Important technologies on them will be introduced. Specifically, models for IE such as Maximum Entropy Markov Model and Conditional Random Fields will be explained. Models for IR such as Language Model and Learning to Rank will be described. A brief survey on recent work on both IE and IR will be given. Finally, some recent work on the combined uses of IE and IR technologies will also be introduced.