A Method for Automating the Extraction of Specialized Information from the Web

  • Ling Lin
  • Antonio Liotta
  • Andrew Hippisley
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3801)

Abstract

The World Wide Web can be viewed as a gigantic distributed database comprising millions of interconnected hosts, some of which publish information via web servers or peer-to-peer systems. We present a novel method for extracting semantically rich information from the web in a fully automated fashion. We illustrate our approach with a proof-of-concept application that scrutinizes millions of web pages for clues to the trend of the Chinese stock market. We report the outcomes of a 210-day study, which indicate a strong correlation between the information retrieved by our prototype and the actual market behavior.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Ling Lin (1)
  • Antonio Liotta (1)
  • Andrew Hippisley (2)
  1. Department of Electronic Systems Engineering, University of Essex, Colchester, UK
  2. Department of Computing, University of Surrey, UK
