Data Sources for Prediction: Databases, Hybrid Data and the Web

Part of the Texts in Computer Science book series (TCS)


Data for automated prediction comes from many sources. In this chapter we broaden our scope to cover both text and structured numerical data. We first review the ideal data representations for prediction from either numerical or text data. We then consider numerous sources of data, including databases, the web, and hybrid forms of text and numerical data, and give prototypical examples of blended numerical and text data. We examine the web as a source of data for prediction, with examples that include downloaded scientific publications formatted in XML, and stock price data with related newswire headlines. Sentiment and opinion analysis are considered with examples from online product reviews and Twitter data. Predictive mining of electronic medical records is presented as an example of mixed-data mining.


Keywords: Text Mining, Information Extraction, Hedge Fund, Text Data, Sentiment Analysis

Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  1. Department of Computer Science, Rutgers University, Piscataway, USA
  2. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
  3. Department of Statistics, Hill Center, Rutgers University, Piscataway, USA
