Abstract
Databases are now just one building block in complex multi-tier architectures. In general, however, they are still designed and optimized with little regard for the applications that run on top of them. This problem is particularly acute in scientific applications, where the data is usually processed at the client and conventional server-side optimizations are therefore of limited help. In this paper we present a variety of techniques and a novel client/server architecture designed to optimize client-side processing of scientific data. The main building block of our approach is to store frequently accessed data as relatively small, wavelet-encoded segments. These segments can be processed at different qualities and resolutions, thereby enabling efficient processing of very large data volumes. Experimental results demonstrate that our approach significantly reduces overhead (I/O, network transfer, decoding, and analysis), does not require changes to the analysis routines, and provides all possible resolution ranges.
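To make the idea of processing a wavelet-encoded segment at different resolutions concrete, the following is a minimal sketch and not the authors' implementation. It assumes an unnormalized Haar transform (the abstract does not say which wavelet is used), a segment whose length is a power of two, and the hypothetical function names `haar_encode` and `haar_decode`. Coefficients are ordered coarse to fine, so a client could decode any power-of-two prefix of a segment to obtain a lower-resolution view without reading the rest.

```python
def haar_encode(samples):
    """Encode a segment as [overall average, coarse details, ..., fine details].

    Assumption: unnormalized Haar averaging/differencing, chosen only for
    illustration; the segment length must be a power of two.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("segment length must be a power of two")
    approx = [float(x) for x in samples]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        level_details = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        # Coarser levels are placed in front of finer ones.
        details = level_details + details
    return approx + details


def haar_decode(coeffs, size):
    """Reconstruct `size` samples using only the first `size` coefficients.

    `size` must be a power of two no larger than len(coeffs). A smaller
    `size` yields a lower-resolution view in which each value is the average
    of a block of original samples.
    """
    if size < 1 or size & (size - 1) or size > len(coeffs):
        raise ValueError("size must be a power of two within the segment")
    approx = list(coeffs[:1])
    while len(approx) < size:
        # The details for the next level follow the coefficients used so far.
        detail = coeffs[len(approx):2 * len(approx)]
        approx = [v for a, d in zip(approx, detail) for v in (a + d, a - d)]
    return approx


segment = [9, 7, 3, 5]
encoded = haar_encode(segment)
half_resolution = haar_decode(encoded, 2)   # needs only the first 2 coefficients
full_resolution = haar_decode(encoded, 4)   # needs all 4 coefficients
```

In this sketch the half-resolution result consists of pairwise averages of the original samples, which suggests why analysis routines written for the raw data could run unchanged on a coarser view while less data is read, transferred, and decoded.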
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stolte, E., Alonso, G. (2002). Optimizing Scientific Databases for Client Side Data Processing. In: Jensen, C.S., et al. Advances in Database Technology — EDBT 2002. EDBT 2002. Lecture Notes in Computer Science, vol 2287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45876-X_26
DOI: https://doi.org/10.1007/3-540-45876-X_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43324-8
Online ISBN: 978-3-540-45876-0
eBook Packages: Springer Book Archive