Table of contents
About this book
The proliferation of massive data sets brings with it a series of special computational challenges. This "data avalanche" arises in a wide range of scientific and commercial applications. With advances in computer and information technologies, many of these challenges are beginning to be addressed by diverse inter-disciplinary groups, that indude computer scientists, mathematicians, statisticians and engineers, working in dose cooperation with application domain experts. High profile applications indude astrophysics, bio-technology, demographics, finance, geographi cal information systems, government, medicine, telecommunications, the environment and the internet. John R. Tucker of the Board on Mathe matical Seiences has stated: "My interest in this problern (Massive Data Sets) isthat I see it as the rnost irnportant cross-cutting problern for the rnathernatical sciences in practical problern solving for the next decade, because it is so pervasive. " The Handbook of Massive Data Sets is comprised of articles writ ten by experts on selected topics that deal with some major aspect of massive data sets. It contains chapters on information retrieval both in the internet and in the traditional sense, web crawlers, massive graphs, string processing, data compression, dustering methods, wavelets, op timization, external memory algorithms and data structures, the US national duster project, high performance computing, data warehouses, data cubes, semi-structured data, data squashing, data quality, billing in the large, fraud detection, and data processing in astrophysics, air pollution, biomolecular data, earth observation and the environment.
Web Crawling algorithms data compression data warehouse image processing information information system optimization performance