Abstract
Views over distributed information sources, such as data warehouses, rely on the stability of the schemas of underlying databases. In the event of meta data changes in the sources, such as the deletion of a table or column, such views may become undefined. Using meta data about information redundancy, views can be evolved as necessary to remain well defined after source meta data changes.
Previous work in view synchronization focused only on deletions of schema elements. We now offer an approach that also makes use of additions. Our algorithm returns view definitions to previous versions by using knowledge about the history of views and meta data. This technology enables us to adapt views to temporary meta data changes by canceling out opposite changes. It also allows undo/redo operations on meta data. Finally, in many cases, the resulting evolved views even have improved information quality. In this paper, we give a formal taxonomy of schema and constraint changes and a full description of the proposed history-driven view-synchronization algorithm for this taxonomy. We also prove the algorithm to be correct. Our approach falls in the global-as-view category of data integration solutions but, unlike prior solutions in this category, it also deals with changes in the information space rather than requiring source schemas to remain constant over time.
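The idea of canceling out opposite changes can be illustrated with a minimal sketch. It is not the algorithm from the paper, which works on view and meta data histories together with redundancy constraints. The `Change` record, the `cancel_opposites` function, and the rule that two changes are opposites whenever they name the same schema element with different operations are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One meta data change in a source (hypothetical record type)."""
    op: str       # "add" or "delete"
    element: str  # schema element name, e.g. "Customer.phone"


def cancel_opposites(history):
    """Return the net changes after dropping add/delete pairs that undo each other.

    Assumption: two changes are opposites if they name the same element
    and have different operations. A real system would also have to check
    that the re-added element carries the same information.
    """
    net = []
    for change in history:
        # Find the most recent surviving change on the same element, if any.
        idx = next(
            (i for i in range(len(net) - 1, -1, -1)
             if net[i].element == change.element),
            None,
        )
        if idx is not None and net[idx].op != change.op:
            del net[idx]        # the new change undoes the earlier one
        else:
            net.append(change)  # nothing to cancel; keep the change
    return net


# Hypothetical history: a column disappears temporarily, a table permanently.
history = [
    Change("delete", "Customer.phone"),
    Change("delete", "Orders"),
    Change("add", "Customer.phone"),
]
net = cancel_opposites(history)
```

Under these assumptions only the deletion of `Orders` survives in `net`, so a view that referenced `Customer.phone` could be returned to its earlier definition instead of remaining rewritten around the temporarily missing column.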
Cite this article
Koeller, A., Rundensteiner, E. A history-driven approach at evolving views under meta data changes. Knowl Inf Syst 8, 34–67 (2005). https://doi.org/10.1007/s10115-004-0171-8