Journal on Data Semantics, Volume 6, Issue 4, pp 221–241

Schema Evolution Survival Guide for Tables: Avoid Rigid Childhood and You’re En Route to a Quiet Life

Original Article

Abstract

In this paper, we study the factors that relate to the survival of a table in the context of schema evolution in open-source software. We study the history of the schema of eight open-source software projects that include relational databases and extract patterns related to the survival or death of their tables. Our study shows that the probability of a table with a wide schema (i.e., a large number of attributes) being removed is systematically lower than average. Activity and duration are related to survival too. Rigid tables, without any change to their schema, are more likely to be removed than tables that sustain changes. Durations of dead and survivor tables demonstrate a mirror image: dead tables’ durations are mostly short, whereas survivor tables gravitate toward higher durations. Our findings are mostly summarized by a pattern, which we call the electrolysis pattern because of its diagrammatic representation, stating that dead and survivor tables live quite different lives: dead tables typically die shortly after birth, with short durations and mostly no updates, whereas survivors mostly live quiet lives with few updates, except for a small group of tables with high update ratios that are characterized by high durations and survival. Equally important is the evidence that schema evolution suffers from the antagonism of gravitation to rigidity, i.e., the tendency to minimize evolution as much as possible in order to minimize the resulting impact on the surrounding code. Several factors contribute to this observation: the absence of long durations in removed tables, the low percentage of tables whose schema size is scaled up or down, and the low number of tables with a high rate of updates, contrasted with the high number of tables with zero or few updates. We complement our findings with explanations and recommendations to developers.

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Department of Computer Science, University of Ioannina, Ioannina, Hellas
