The VLDB Journal

, Volume 22, Issue 1, pp 73–98 | Cite as

Automating the database schema evolution process

  • Carlo Curino
  • Hyun Jin Moon
  • Alin Deutsch
  • Carlo Zaniolo
Special Issue Paper

Abstract

Supporting database schema evolution represents a long-standing challenge of practical and theoretical importance for modern information systems. In this paper, we describe techniques and systems for automating the critical tasks of migrating the database and rewriting the legacy applications. In addition to labor saving, the benefits delivered by these advances are many and include reliable prediction of outcome, minimization of downtime, system-produced documentation, and support for archiving, historical queries, and provenance. The PRISM/PRISM++ system delivers these benefits, by solving the difficult problem of automating the migration of databases and the rewriting of queries and updates. In this paper, we present the PRISM/PRISM++ system and the novel technology that made it possible. In particular, we focus on the difficult and previously unsolved problem of supporting legacy queries and updates under schema and integrity constraints evolution. The PRISM/PRISM++ approach consists in providing the users with a set of SQL-based Schema Modification Operators (SMOs), which describe how the tables in the old schema are modified into those in the new schema. In order to support updates, SMOs are extended with integrity constraints modification operators. By using recent results on schema mapping, the paper (i) characterizes the impact on integrity constraints of structural schema changes, (ii) devises representations that enable the rewriting of updates, and (iii) develop a unified approach for query and update rewriting under constraints. We complement the system with two novel tools: the first automatically collects and provides statistics on schema evolution histories, whereas the second derives equivalent sequences of SMOs from the migration scripts that were used for schema upgrades. These tools were used to produce an extensive testbed containing 15 evolution histories of scientific databases and web information systems, providing over 100 years of aggregate evolution histories and almost 2,000 schema evolution steps.

Keywords

Schema evolution Rewriting Updates Mapping  SMO Integrity constraints management Relational 

Notes

Acknowledgments

The authors would like to thank Fabrizio Moroni, MyungWon Ham for their help in developing the tool, and Letizia Tanca for the great feedback and support.

References

  1. 1.
  2. 2.
  3. 3.
  4. 4.
  5. 5.
  6. 6.
  7. 7.
  8. 8.
  9. 9.
  10. 10.
    Abiteboul, S., Duschka, O.M.: Complexity of answering queries using materialized views. In: PODS, pp. 254–263 (1998)Google Scholar
  11. 11.
    Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison Wesley, Reading (1995)MATHGoogle Scholar
  12. 12.
    Afrati, F.N., Kolaitis, P.G.: Repair checking in inconsistent databases: Algorithms and complexity. In: ICDT, pp. 31–41 (2009)Google Scholar
  13. 13.
    Arenas, M., Bertossi, L., Chomicki, J.: Consistent query answers in inconsistent databases. In: PODS, pp. 68–79 (1999)Google Scholar
  14. 14.
    Bancilhon, F., Spyratos, N.: Update semantics of relational views. ACM Trans. Database Syst. 6(4), 557–575 (1981)MATHCrossRefGoogle Scholar
  15. 15.
    Bernstein, P.A.: Applying model management to classical meta data problems. In: CIDR (2003)Google Scholar
  16. 16.
    Bernstein, P.A., Green, T.J., Melnik, S., Nash, A.: Implementing mapping composition. VLDB J. 17(2), 333–353 (2008)CrossRefGoogle Scholar
  17. 17.
    Bohannon, A., Pierce, B.C., Vaughan, J.A.: Relational lenses: A language for updatable views. In: PODS, pp. 338–347 (2006)Google Scholar
  18. 18.
    Cleve A., Hainaut, J.-L.: Co-transformations in database applications evolution. In: GTTSE, pp. 409–421 (2006)Google Scholar
  19. 19.
    Curino, C., Ham, M., Moroni, F., Zaniolo, C.: Pantha rei data set. http://data.schemaevolution.org/ (2009)
  20. 20.
    Curino, C., Moon, H.J., Deutsch, A., Zaniolo, C.: Update rewriting and integrity constraint maintenance in a schema evolution support system: Prism++. PVLDB 4(2), 117–128 (2010)Google Scholar
  21. 21.
    Curino, C., Moon, H.J., Ham, M., Zaniolo, C.: The prism workbench: Database schema evolution without tears. In: ICDE (2009)Google Scholar
  22. 22.
    Curino, C., Moon, H.J., Tanca, L., Zaniolo, C.: Schema evolution in Wikipedia: Toward a web information system benchmark. ICEIS (2008)Google Scholar
  23. 23.
    Curino, C., Moon, H.J., Zaniolo, C.: Graceful database schema evolution: The prism workbench. PVLDB 1(1), 761–772 (2008)Google Scholar
  24. 24.
    Curino, C., Moon, H.J.: C. Zaniolo. Managing the history of metadata in support for db archiving and schema evolution. In: ECDM (2008)Google Scholar
  25. 25.
    Dayal, U., Bernstein, P.A.: On the correct translation of update operations on relational views. ACM Trans. Database Syst. 7(3), 381–416 (1982)MathSciNetMATHCrossRefGoogle Scholar
  26. 26.
    Deutsch, A., Nash, A., Remmel, J.: The chase revisited. In: PODS, pp. 149–158 (2008)Google Scholar
  27. 27.
    Deutsch, A., Tannen, V.: Mars: A system for publishing xml from mixed and redundant storage. In: VLDB, pp. 201–212 (2003)Google Scholar
  28. 28.
    Ensembl development team. Ensembl Genetic DB http://www.ensembl.org, 2009 (Online)
  29. 29.
    Fagin, R.: Inverting schema mappings. ACM Trans. Database Syst. 32(4), 25:1–25:51 (2007)Google Scholar
  30. 30.
    Fagin, R., Kolaitis, P.G., Popa, L., Tan, W.-C.: Composing schema mappings: Second-order dependencies to the rescue. ACM Trans. Database Syst. 30(4), 994–1055 (2005)CrossRefGoogle Scholar
  31. 31.
    Fagin, R., Kolaitis, P.G., Popa, L., Tan, W.-C.: Quasi-inverses of schema mappings. In: PODS, pp. 123–132 (2007)Google Scholar
  32. 32.
    Fagin, R., Kolaitis, P.G., Popa, L., Tan, W. C.: Reverse data exchange: Coping with nulls. In: PODS, pp. 23–32 (2009)Google Scholar
  33. 33.
    Hartung, M., Terwilliger, J.F., Rahm, E.: Recent advances in schema and ontology evolution. In: Schema Matching and Mapping, pp. 149–190 (2011)Google Scholar
  34. 34.
    Hernández, M.A., Miller, R.J., Haas, L.M.: Clio: A semi-automatic tool for schema mapping. In: SIGMOD, p. 607 (2001)Google Scholar
  35. 35.
    Hick, J.-M., Hainaut, J.-L.: Database application evolution: A transformational approach. Data Knowl. Eng. 59(3), 534–558 (2006)CrossRefGoogle Scholar
  36. 36.
    Hull, R.: Non-finite specifiability of projections of functional dependency families. Theor. Comput. Sci. 39, 239–265 (1985)MathSciNetMATHCrossRefGoogle Scholar
  37. 37.
    Kotidis, Y., Srivastava, D., Velegrakis, Y.: Updates through views: A new hope. In: ICDE, p. 2 (2006)Google Scholar
  38. 38.
    Lenzerini, M.: Data integration: A theoretical perspective. In: PODS, pp. 233–246 (2002)Google Scholar
  39. 39.
    Liu, Z., He, B., Hsiao, H.-I., Chen, Y.: Efficient and scalable data evolution with column oriented databases. In: EDBT (2011)Google Scholar
  40. 40.
    Madhavan, J., Halevy, A.Y.: Composing mappings among data sources. In: VLDB, pp. 572–583 (2003)Google Scholar
  41. 41.
    Melnik, S., Rahm, E., Bernstein, P.A.: Rondo: A programming platform for generic model management. In: SIGMOD (2003)Google Scholar
  42. 42.
    Miller, R.J., Ioannidis, Y.E., Ramakrishnan, R.: The use of information capacity in schema integration and translation. In: VLDB, pp. 120–133 (1993)Google Scholar
  43. 43.
    Miller, R.J., Ioannidis, Y.E., Ramakrishnan, R.: Schema equivalence in heterogeneous systems: Bridging theory and practice. Inf. Syst. 19(1), 3–31 (1994)CrossRefGoogle Scholar
  44. 44.
    Moon, H.J., Curino, C., Deutsch, A., Hou, C.-Y., Zaniolo, C.: Managing and querying transaction-time databases under schema evolution. PVLDB 1(1), 882–895 (2008)Google Scholar
  45. 45.
    Moon, H.J., Curino, C., Zaniolo, C.: Scalable architecture and query optimization for transaction-time dbs with evolving schemas. In: SIGMOD Conference, pp. 207–218 (2010)Google Scholar
  46. 46.
    Moroni, F.: Schema Evolution Toolsuite: Analysis and Interpretation of Relational Schema Changes. Master’s thesis, Politecnico di Milano—Dipartimento di Elettronica e Informazione (2009)Google Scholar
  47. 47.
    Papastefanatos, G., Vassiliadis, P., Simitsis, A., Vassiliou, Y.: Hecataeus: Regulating schema evolution. In: ICDE, pp. 1181–1184, March (2010)Google Scholar
  48. 48.
    Qian, L., LeFevre, K., Jagadish, H.V.: Crius: User-friendly database design. PVLDB 4(2), 81–92 (2010)Google Scholar
  49. 49.
    Ra, Y.-G.: Relational schema evolution for program independency. In: Proceedings of the 7th international conference on Intelligent Information Technology, pp 273–281, Springer, Heidelberg (2004). doi:10.1007/978-3-540-30561-3_29
  50. 50.
    Terwilliger, J.F., Bernstein, P.A., Unnithan, A.: Worry-free database upgrades: Automated model-driven evolution of schemas and complex mappings. In: SIGMOD Conference (2010)Google Scholar
  51. 51.
    Terwilliger, J.F., Fernández-Moctezuma, R., Delcambre, L.M.L., Maier, D.: Support for schema evolution in data stream management systems. J. UCS 16(20), 3073–3101 (2010)MATHGoogle Scholar
  52. 52.
    Ullman, J.: Principles of Database System. Computer Science Press, Rockville (1982)Google Scholar
  53. 53.
    Ullman, J.D.: Information integration using logical views. Theor. Comput. Sci. 239(2), 189–210 (2000)MathSciNetMATHCrossRefGoogle Scholar
  54. 54.
    Velegrakis, Y., Miller, R.J., Popa, L.: Mapping adaptation under evolving schemas. In: VLDB, pp. 584–595 (2003)Google Scholar
  55. 55.
    Wikimedia Foundation. Wikipedia, the free encyclopedia http://en.wikipedia.org/, 2007 (Online)
  56. 56.
    Wikimedia Foundation. The mediawiki http://www.mediawiki.org, 2008
  57. 57.
    Yu, C., Popa, L.: Semantic adaptation of schema mappings when schemas evolve. In: VLDB, pp. 1006–1017 (2005)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Carlo Curino
    • 1
  • Hyun Jin Moon
    • 2
  • Alin Deutsch
    • 3
  • Carlo Zaniolo
    • 4
  1. 1.MicrosoftMountain ViewUSA
  2. 2.Google Inc.Mountain ViewUSA
  3. 3.UCSDLa JollaUSA
  4. 4.UCLALos AngelesUSA

Personalised recommendations