Knowledge base population is the task of discovering new facts about entities in a large text corpus and augmenting a knowledge base with these facts. We start this chapter with a brief overview of the broader problem area of extracting structured information from unstructured data. We then present a two-step approach that facilitates knowledge base population. In step one, an incoming document stream is filtered to identify documents that potentially contain new facts about a given entity. In step two, the filtered documents are processed to extract new facts.
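
To make the two-step structure concrete, the following is a minimal sketch in Python, not the method developed in this chapter. It assumes a toy setting: documents are plain strings, filtering is a case-insensitive string match on the entity name, and extraction uses a single hand-written pattern. The names `Fact`, `filter_stream`, `extract_facts`, and `populate`, as well as the `born_in` predicate, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Fact:
    """A (subject, predicate, object) triple; hypothetical KB record."""
    subject: str
    predicate: str
    obj: str


def filter_stream(docs: Iterable[str], entity: str) -> Iterator[str]:
    """Step one: keep only documents that mention the entity.

    Assumption: a case-insensitive substring match stands in for a real
    document filtering model.
    """
    needle = entity.lower()
    for doc in docs:
        if needle in doc.lower():
            yield doc


def extract_facts(doc: str, entity: str) -> list[Fact]:
    """Step two: extract candidate facts from one filtered document.

    Assumption: a single toy pattern, "<entity> was born in <Place>",
    stands in for a real relation extractor.
    """
    pattern = re.compile(re.escape(entity) + r" was born in ([A-Z][a-z]+)")
    return [Fact(entity, "born_in", m.group(1)) for m in pattern.finditer(doc)]


def populate(kb: set[Fact], docs: Iterable[str], entity: str) -> set[Fact]:
    """Run both steps and add facts not yet in the knowledge base.

    Returns the set of facts that were newly added to `kb`.
    """
    new_facts: set[Fact] = set()
    for doc in filter_stream(docs, entity):
        for fact in extract_facts(doc, entity):
            if fact not in kb:
                kb.add(fact)
                new_facts.add(fact)
    return new_facts
```

Called with an empty knowledge base and a handful of documents, `populate` returns only the facts it added, so a second run over the same documents returns an empty set. Keeping the two steps as separate functions mirrors the separation described above: the cheap filter decides which documents the more expensive extraction step ever sees.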
