Well-being Changes from Year to Year: A Comparison of Current, Remembered and Predicted Life Satisfaction

I study yearly changes in personal well-being by combining data on current, retrospective and prospective life satisfaction from the German Socio-Economic Panel. Predicted and remembered changes in life satisfaction are both positive on average and match well, whereas the average year-to-year change inferred from reports of current life satisfaction is negative. Retrospective assessments of past well-being are strongly influenced by current life satisfaction, significantly related to past life satisfaction and linked to past predictions of current satisfaction. Given the problems related to the ordinal measurement scale, changes in subjective reference systems and recall ability, the analysis overall suggests that direct reports of intertemporal changes provide valuable additional information for the analysis of individual well-being.


Introduction
Over the last decades, economists have become increasingly interested in the study of individual life satisfaction, subjective well-being or happiness (see, e.g., the reviews by Frey and Stutzer 2002; Di Tella and MacCulloch 2006; Dolan et al. 2008; Clark 2018). These survey data can deliver new insights and offer ways to test economic theories and to advise or evaluate public policy (Stutzer and Frey 2010; Di Tella and MacCulloch 2006), but there are also reasons to be skeptical about their informational content (e.g., Schwarz [...] recalled and predicted changes in well-being are more positive than implied by comparing the corresponding current assessments. Furthermore, the simultaneous look at current, retrospective and prospective assessments suggests that predictions and subsequent recall of life satisfaction are systematically linked and that these data carry additional informational value for the analysis of well-being developments over time. The next section presents the data and the descriptive empirical analysis in detail. The third section provides further discussion and concludes.

Data and Empirical Analysis
The first four waves of the German Socio-Economic Panel (GSOEP) from 1984 to 1987 offer individual-level data from more than 10,000 respondents on annual changes in personal life satisfaction. The GSOEP was started as a longitudinal survey in 1984 (for a detailed description, see Wagner et al. 2007). Since then, a typical question on life satisfaction has been asked each year: "And finally, we would like to ask you about your satisfaction with your life in general. Please answer by using the following scale, in which 0 means totally unhappy, and 10 means totally happy. How happy are you at present with your life as a whole?" In the following, the answers to this question are labeled "current satisfaction" $CS_{it}$, i.e., life satisfaction of individual $i$ in year $t$ as reported in year $t$. Until 1987, the question that directly followed was: "How happy were you a year ago with your life?" The answers to this question are labeled "retrospective satisfaction" $RS_{it}$, i.e., life satisfaction in year $t-1$ as reported in year $t$. Finally, the survey participants were asked: "And what do you think it will be like in a year's time?" These answers are labeled "prospective satisfaction" $PS_{it}$, i.e., life satisfaction in year $t+1$ as expected by individual $i$ in year $t$.
These variables permit the calculation of changes in life satisfaction between years $t-1$ and $t$ in three ways:

$$\Delta CS_{it} = CS_{it} - CS_{it-1} \tag{1}$$

$$\Delta RS_{it} = CS_{it} - RS_{it} \tag{2}$$

$$\Delta PS_{it} = PS_{it-1} - CS_{it-1} \tag{3}$$

Depending on the question asked, respondents can be expected to focus more on temporal, personal or social comparisons (see Dubé et al. 1998; Schwarz and Strack 1999; McBride 2010, for more detailed discussions). The latter two measures probably put relatively more weight on intrapersonal intertemporal comparisons because the time dimension is explicit in the questions. Since these assessments are made at the same time, social reference groups and aspirations should be identical for $CS_{it}$ and $RS_{it}$ as well as for $CS_{it-1}$ and $PS_{it-1}$, or at least much more similar than for the current assessments $CS_{it}$ and $CS_{it-1}$. Differences between $\Delta CS$, $\Delta RS$ and $\Delta PS$ can be due to different comparison standards, but also due to recall problems, memory bias or forecast errors, e.g., people being too optimistic on average (Odermatt and Stutzer 2019; Schwandt 2016; Frijters et al. 2009). The accuracy of forecasts and recall naturally declines as the time horizon lengthens. Gibson and Kim (2010) note that especially long-term retrospective recall poses a serious problem to data accuracy, and Hagerty (2003) suggests that recall only becomes difficult after 5 years. Here, the time period between interviews is approximately 1 year, matching the time horizon of the survey question quite well (see Table 1). For more than 90% of the interviews, the time between surveys is between 9 and 15 months.

Table 1 shows the average current, prospective and retrospective satisfaction scores along with the corresponding average yearly changes. In order to create a consistent sample for which the yearly changes in satisfaction refer to the same time periods for all measures, only individuals for whom all three satisfaction measures are available in at least two consecutive surveys are included in the analysis. On average, current satisfaction declined by 0.15 points annually between 1984 and 1987. In contrast, prospective and retrospective reports show average annual increases of 0.15 and 0.07 points, respectively. Such differences between current, prospective and retrospective, and also between personal and more global/societal assessments are common in the literature (Hagerty 2003; Gandelman and Hernandez-Murillo 2009; Deaton and Stone 2013; Schwandt 2016; Prati and Senik 2020). The correlation of $RS_{it}$ and $CS_{it-1}$ amounts to 0.5, the correlation of $PS_{it-1}$ and $CS_{it}$ to 0.4, which suggests that the intertemporal assessments capture a significant part of past and future current satisfaction (Krueger and Schkade 2008; Pavot and Diener 1993), but t-tests strongly reject the hypothesis that $\Delta CS$, $\Delta RS$ and $\Delta PS$ are equal on average. While the three measures of yearly changes in life satisfaction are clearly not identical, $\Delta RS$ and $\Delta PS$ appear to match much better: $\Delta RS$ and $\Delta PS$ are identical in 52% of all observations, $\Delta CS$ and $\Delta PS$ in 27% and $\Delta CS$ and $\Delta RS$ in 18%. Allowing for a one-point difference, the congruence of the three measures amounts to 77%, 60% and 39%, respectively. A significant part of the better match between $\Delta RS$ and $\Delta PS$ can be traced back to a large fraction of recalled or predicted life satisfaction changes of 0: 67% and 64%, respectively. In contrast, only 29% of satisfaction changes measured by $\Delta CS$ are 0. The significantly larger variation of $\Delta CS$ is also mirrored by the standard deviations of the three measures in Table 1.
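The three change measures and the congruence shares discussed above can be illustrated with a minimal sketch. It assumes the definitions implied by the text, namely that the change in current satisfaction compares $CS_{it}$ with $CS_{it-1}$, the remembered change compares $CS_{it}$ with $RS_{it}$, and the predicted change compares $PS_{it-1}$ with $CS_{it-1}$. The names `Interview`, `yearly_changes` and `share_within` are hypothetical, and the toy records are made up; they are not GSOEP values and this is not the author's code.

```python
from collections import namedtuple

# One interview: current (cs), retrospective (rs) and prospective (ps)
# satisfaction on the 0-10 scale for person `pid` in survey year `year`.
Interview = namedtuple("Interview", "pid year cs rs ps")


def yearly_changes(interviews):
    """Return (pid, year, d_cs, d_rs, d_ps) for every pair of consecutive years."""
    by_key = {(r.pid, r.year): r for r in interviews}
    changes = []
    for (pid, year), cur in sorted(by_key.items()):
        prev = by_key.get((pid, year - 1))
        if prev is None:
            continue  # no interview in the previous year, so no change is defined
        d_cs = cur.cs - prev.cs   # difference of two current reports (t and t-1)
        d_rs = cur.cs - cur.rs    # remembered change, both values reported in t
        d_ps = prev.ps - prev.cs  # predicted change, both values reported in t-1
        changes.append((pid, year, d_cs, d_rs, d_ps))
    return changes


def share_within(pairs, tolerance=0):
    """Share of pairs whose absolute difference is at most `tolerance`."""
    pairs = list(pairs)
    return sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)


# Made-up toy data for illustration only.
toy = [
    Interview(1, 1984, 8, 7, 8),
    Interview(1, 1985, 7, 7, 8),
    Interview(2, 1984, 6, 5, 7),
    Interview(2, 1985, 6, 5, 6),
    Interview(2, 1986, 5, 5, 6),
]
changes = yearly_changes(toy)

# Congruence of remembered and predicted changes, exact and within one point.
rs_vs_ps = share_within((c[3], c[4]) for c in changes)
cs_vs_ps_one_point = share_within(((c[2], c[4]) for c in changes), tolerance=1)
```

Restricting the loop to persons with an interview in the previous year mirrors, in simplified form, the requirement that all three measures be available in at least two consecutive surveys.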
The comparison of retrospective and current data provides a similar picture as in Prati and Senik (2020) and Kaiser (2020) with regard to people overstating their past life satisfaction development. The left panel of Fig. 1 shows the predicted and remembered changes in life satisfaction by the observed change in current satisfaction. In qualitative terms, all three measures coincide, but especially large negative developments are typically not expected. For all measures, reporting problems arise due to the bounded scale. In particular, individuals at the lower and upper ends of the scale cannot predict a further decline or increase in life satisfaction, and such changes cannot be identified from the current assessments. For example, at the upper end of the scale, average $\Delta CS$ and $\Delta PS$ are zero or negative by definition because values higher than 10 cannot be reported. The right panel of Fig. 1 shows that the retrospective assessments suggest positive well-being developments among people who report the same high levels of life satisfaction in two consecutive years. Among those who report a current satisfaction of 10 in two consecutive years, $\Delta RS$ is 0.2 on average (n = 1,743). At the lower end of the scale, $\Delta RS$ is -0.8 on average for those with current satisfaction below 5 and $\Delta CS = 0$ (n = 200). For all those with $\Delta CS = 0$, $\Delta PS$ decreases along the life satisfaction scale while $\Delta RS$ increases. This also points to another issue with the discrete ordinal measurement of unobservable latent continuous satisfaction: In order to change one's current assessments, stronger impulses might be necessary as assessments move closer to the more extreme values of the scale (see also Kaiser 2020).

Figure 2 plots the yearly change in life satisfaction by gender and age groups. Overall, the annual changes in life satisfaction are slightly larger in absolute terms for females. While the negative average change in current life satisfaction $\Delta CS$ is not statistically distinguishable between females and males, females predict and remember significantly higher satisfaction differences $\Delta PS$ and $\Delta RS$. Consequently, the difference between the three satisfaction measures turns out larger for females. Similar to Schwandt (2016), younger people are more optimistic and older people more pessimistic about their future well-being. A similar pattern is visible for remembered changes in well-being: People report well-being improvements when young and well-being reductions when old. From midlife onward, predicted and especially remembered changes are close to zero on average. For the two oldest age groups, all three measures point in the same negative direction. One explanation for this pattern could be stronger changes in reference systems, i.e., individuals rescale their life satisfaction assessments, until roughly the late 30s. Hence, it is possible that individual well-being increases on average at a young age, while reported current satisfaction scores point in the opposite direction. Since younger individuals report higher levels of current satisfaction (more than half of those under the age of 30 report a current life satisfaction of 8 or higher), the issues related to the discrete ordinal measurement discussed above might also play a role.
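The measurement issues raised above can be made concrete with a deliberately simple sketch. It assumes, purely for illustration, that respondents round a latent continuous well-being level to the nearest integer and cannot report values outside the 0 to 10 range; the functions `reported_score` and `reported_change` are hypothetical and not part of the analysis. The sketch only illustrates two points, namely that small latent changes need not show up in current reports and that improvements at the top of the scale cannot be reported. It does not model thresholds that widen towards the extremes.

```python
import math


def reported_score(latent):
    """Map a latent continuous well-being level to a 0-10 integer report.

    Assumption for illustration only: rounding to the nearest integer
    (halves rounded up), followed by clipping to the bounded scale.
    """
    return max(0, min(10, math.floor(latent + 0.5)))


def reported_change(latent_before, latent_after):
    """Change in the reported score implied by a change in the latent level."""
    return reported_score(latent_after) - reported_score(latent_before)


# A latent improvement of 0.3 that does not cross a rounding threshold.
small_shift = reported_change(7.1, 7.4)       # 0
# A latent improvement of 0.2 that happens to cross a threshold.
threshold_shift = reported_change(7.4, 7.6)   # 1
# A latent improvement of 0.8 above the top of the scale.
ceiling_shift = reported_change(10.2, 11.0)   # 0
```

Under these assumptions, a change in the reported score is neither necessary nor sufficient for a sizeable change in latent well-being, which is one reason why directly reported changes can differ from differences of current reports.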
The discussed patterns could be due to people generally being too optimistic when young and, at the same time, overstating their own past development (memory bias), e.g., for self-appraisal. In order to investigate the potential systematic links between the different life satisfaction assessments further, I start from Equations (2) and (3), assuming that $\Delta RS$ and $\Delta PS$ should be equal, i.e., $CS_{it} - RS_{it} = PS_{it-1} - CS_{it-1}$, in the absence of systematic or random errors, e.g., memory or prediction biases or changes in the subjective reference system. As Table 1 and, e.g., Schwandt (2016) or Kaiser (2020) show, this does not appear to be the case. Rearranging yields the following regression equation for retrospective life satisfaction in year $t-1$ as reported by an individual $i$ in year $t$:

$$RS_{it} = \beta_1 CS_{it-1} + \beta_2 CS_{it} + \beta_3 PS_{it-1} + \mu_i + \varepsilon_{it} \tag{4}$$

This equation includes the main global determinants of retrospective life satisfaction: past current satisfaction $CS_{it-1}$ about a year ago, the current level of satisfaction $CS_{it}$, which could be regarded as the anchor for the intertemporal assessments, and the past prediction of current satisfaction $PS_{it-1}$. In addition to a random error term $\varepsilon_{it}$, I include an individual fixed effect $\mu_i$ because certain personality traits (e.g., optimism or extroversion) have the potential to affect retrospective assessments similarly to other life satisfaction assessments (Ferrer-i-Carbonell and Frijters 2004). In case of an intertemporally stable reference system and the absence of any other systematic factors (e.g., recall problems), the coefficient $\beta_1$ on $CS_{it-1}$ should be 1, while the coefficients $\beta_2$ and $\beta_3$ on $CS_{it}$ and $PS_{it-1}$ should be 0, i.e., $RS_{it}$ should only depend on $CS_{it-1}$, the well-being level to be recalled. While this simple regression analysis is not deeply based on psychological and economic theory, it provides a general description of how the global assessments of past, current and future well-being are linked. The panel nature of the data allows me to control for unobserved time-invariant individual factors summarized by $\mu_i$.
Furthermore, the regressions include year dummies and the time difference in months between the two surveys from which the satisfaction assessments stem. Table 2 reports the results of linear random and fixed effects regressions, a fixed effects ordered logistic regression following Baetschmann et al. (2015, 2020) and a Gini regression (Schaffer 2015) as suggested by Schröder and Yitzhaki (2017). Kaiser and Vendrik (2020) argue that sign reversals are not likely to be a serious concern in applied empirical research on life satisfaction, but that relationships between variables might be heterogeneous across the satisfaction scales. A Hausman test clearly rejects the assumption of random effects, but all estimated coefficients are very similar across the model specifications and for different samples as reported in Table 3. The results of the random effects regression are also robust to the inclusion of a female dummy, age and age squared as additional variables. Excluding the year dummies and the months between interviews does not affect any estimation results, either. Still, it must be kept in mind that the regressions first and foremost provide a descriptive analysis of these different general life satisfaction assessments. A more detailed investigation targeting the potentially different and time-varying determinants of current, prospective and retrospective life satisfaction is beyond the scope of this analysis.

Table 2 Determinants of retrospective life satisfaction reported in $t$ about $t-1$. The table reports coefficients from linear random and fixed effects regressions, a fixed effects ordered logit model (Baetschmann et al. 2020) and a Gini regression (Schaffer 2015); standard errors adjusted for clustering on individuals in parentheses (except Gini regression). All regressions include year dummies. Significance levels: *10% **5% ***1%. Source: GSOEP 1984-87, own calculations.

Retrospective satisfaction is significantly related to all three other satisfaction measures. The positive coefficient of life satisfaction in the reference period $CS_{t-1}$ creates confidence that retrospective assessments carry significant information about the survey participants' past well-being. But as the coefficient on $CS_{t-1}$ is far from 1, recall error is very likely, not least due to the imprecise time frame of "a year ago". When limiting the sample to only those surveys that were conducted exactly 12 months apart, the coefficient increases slightly (first column of Table 3). The strongest correlation emerges between retrospective and current life satisfaction in the reporting period $CS_t$. This is unsurprising because the question about current satisfaction is asked directly before the question about retrospective satisfaction. Thus, the former assessment probably serves as the anchor for the latter. This also suggests that the retrospective assessment is less about the recall of the satisfaction score reported a year ago and more about an assessment of life a year ago relative to the current situation.
In all regressions, a significant negative correlation between the past prediction of current satisfaction and the current recall of past satisfaction is found. The size of the coefficient from the linear regressions ranges from -0.04 for the sample excluding high and low satisfaction scores to -0.2 for the sample of individuals surveyed 12 months apart. The coefficients turn out similar across all age groups. This implies that positive predictions are related to future retrospective downgrades of life satisfaction assessments, and vice versa. If predicted and retrospective assessments were both relative to the corresponding current satisfaction scores, the assessments of $CS_{t-1}$ and $PS_{t-1}$ would be based on the reference system of year $t-1$ and the assessments of $CS_t$ and $RS_t$ would be based on the potentially different reference system of year $t$. Since the regressions control for current life satisfaction in both years and individual fixed effects, the results provide evidence that the differences between the direct intertemporal assessments $\Delta PS$ and $\Delta RS$ compared to $\Delta CS$ are most likely not simply due to misremembering and misprediction, but reflect additional information about well-being, its development and changes in the reference system. In particular, small improvements in personal well-being can more easily be reported in predictions and in retrospect, but such changes might not be large enough to induce change in the current assessments, especially towards the end of the life satisfaction scale (right panel of Fig. 1). This is compatible with hedonic adaptation, not necessarily with regard to larger shocks or life events (Clark et al. 2008a), but rather on a small scale and as a continuous process.
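As a rough illustration of the linear fixed effects specification discussed in this section, the following sketch regresses retrospective satisfaction on past current satisfaction, current satisfaction and the past prediction after subtracting individual means (the within transformation). It is a simplified stand-in and not the author's code: it omits the year dummies, the months between interviews and the clustered standard errors, and all function names, coefficients and data points are hypothetical. The toy data are generated without noise so that the estimator recovers the assumed coefficients.

```python
from collections import defaultdict


def within_transform(rows):
    """Subtract each individual's mean from every variable (removes fixed effects)."""
    groups = defaultdict(list)
    for pid, values in rows:
        groups[pid].append(values)
    demeaned = []
    for values_list in groups.values():
        n = len(values_list)
        means = [sum(col) / n for col in zip(*values_list)]
        for values in values_list:
            demeaned.append([v - m for v, m in zip(values, means)])
    return demeaned


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fixed_effects_ols(rows):
    """rows: (pid, [y, x1, ..., xk]); returns the k within-estimator slopes."""
    data = within_transform(rows)
    k = len(data[0]) - 1
    xtx = [[sum(d[i + 1] * d[j + 1] for d in data) for j in range(k)] for i in range(k)]
    xty = [sum(d[i + 1] * d[0] for d in data) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical coefficients on (CS_{t-1}, CS_t, PS_{t-1}) and person effects.
TRUE_COEFS = (0.3, 0.6, -0.1)
PERSON_EFFECT = {"A": 0.5, "B": -0.5}
REGRESSORS = {
    "A": [(7, 8, 8), (8, 6, 7), (6, 7, 6)],
    "B": [(5, 5, 6), (5, 7, 5), (7, 6, 8)],
}

rows = []
for pid, observations in REGRESSORS.items():
    for x in observations:
        # Noise-free retrospective satisfaction under the assumed model.
        rs = PERSON_EFFECT[pid] + sum(b * v for b, v in zip(TRUE_COEFS, x))
        rows.append((pid, [rs, *x]))

estimates = fixed_effects_ols(rows)  # slopes for CS_{t-1}, CS_t, PS_{t-1}
```

Demeaning by person is what allows the sketch to ignore the person-specific intercepts; with real survey data, a dedicated panel estimation routine with clustered standard errors would be the natural choice.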

Discussion and Conclusion
Most empirical research on personal well-being is based on current subjective reports. My study adds to the literature by linking current, retrospective and prospective reports of personal life satisfaction. Relative to satisfaction changes measured by comparing current assessments from two years, people overpredict future well-being ex ante (Odermatt and Stutzer 2019; Schwandt 2016; Frijters et al. 2009) and overstate their well-being development ex post (Prati and Senik 2020; Kaiser 2020; Smith et al. 2008). Predicted and remembered changes in life satisfaction, by contrast, match quite well. This pattern is particularly pronounced at a young age, i.e., young people predict and recall improvements in their well-being whereas their current life satisfaction scores decline on average. This observation could be due to relatively strong changes in the reference system for the assessment of personal life satisfaction. Odermatt and Stutzer (2019) point out for prediction errors that there are several possible methodological, psychological or behavioral explanations why different life satisfaction assessments can produce different empirical results concerning the development of individual well-being. Based on the same GSOEP data on current and remembered satisfaction as my analysis, they conclude that rescaling cannot fully explain the observed life satisfaction adjustments following particular life events like marriage or becoming unemployed (Odermatt and Stutzer 2019, Online Appendix 7). Additionally accounting for life satisfaction predictions, my analysis suggests that intertemporal assessments provide an informational value beyond that of current assessments alone. In particular, the former can help to address adjustments of the measurement scale and thereby shed more light on the influence of reference systems and thus better identify true changes in well-being and adaptation processes (see, e.g., Kaiser 2020; Odermatt and Stutzer 2019).
The discrete ordinal measurement of latent well-being and especially the cardinal treatment of such data must be viewed critically (Kaiser and Vendrik 2020; Bond and Lang 2019; Schröder and Yitzhaki 2017). Small changes in well-being might not be enough to induce changes in current assessments. The stimulus necessary for changing one's assessment might vary across the life satisfaction scale. And positive as well as negative developments cannot be reported at the ends of the measurement scale. The comparison of current and intertemporal assessments provides evidence that these issues exist: Self-reported changes in well-being vary over the life satisfaction scale with the highest absolute values in the lower half of the scale. The remembered (predicted) change in life satisfaction is higher (lower) for high levels of satisfaction and lower (higher) for low levels of satisfaction. Prati and Senik (2020) attribute the latter finding to behavioral factors, e.g., strategic self-enhancement. I suspect that both aspects are relevant, but disentangling measurement problems and behavioral factors further is beyond the scope of this study and left to further research (see also Kaiser 2020).
While recall difficulties and memory biases probably limit the informational content of retrospective assessments of past well-being, current assessments might over-accentuate short-lived life circumstances, mood and affect (Deaton 2012; Wunder 2012; Schwarz and Strack 1999). The GSOEP data show a strong correlation between the contemporaneous assessments of current and past life satisfaction. Since the retrospective and past assessments are in fact significantly correlated (although the time horizon is not perfectly defined), the former might be well suited to study well-being more globally, net of rather noisy short-term influences, especially concerning correlations of well-being with other variables where recall error weighs less heavily (Powers et al. 1978).
For the discussion and interpretation of well-being developments, the reference system used for the corresponding assessments is important. When reports of current life satisfaction from different time points are used, the reference systems can differ, which leads to a mutatis mutandis interpretation of the psychological concept of experienced utility. A stable reference system permits a ceteris paribus interpretation that comes closer to the economic concept of decision utility (Easterlin 2001, 2002). The current, retrospective and prospective data can thus complement each other to address different questions. The significant association of past predictions and current recall of well-being, controlling for fixed personality traits and concurrent life satisfaction, suggests that changes in well-being, in particular if expected or small, are not necessarily mirrored by changes in current reports but by ex post updates of past assessments, which offers an additional perspective on adaptation and response shift. In this case, current life satisfaction data can adequately describe the gross development of personal well-being (including changes in the personal, social or aspirational reference system), while intertemporal, especially retrospective, accounts can provide additional information about which life situation (e.g., now or a year ago) individuals prefer on net. Not least, this distinction is relevant for policy questions about whether a particular development is rooted in externally objective living conditions or in the way the conditions are assessed internally. Overall, a case can be made for eliciting intertemporal and other, e.g., interpersonal, comparisons from survey participants directly (van Praag 2011).
For policy analysis, a combination of different methodological approaches, including different well-being measures and based on decision as well as experienced utility would be preferable (Larsen and Fredrickson 1999;Loewenstein and Ubel 2008).
Funding
Open Access funding enabled and organized by Projekt DEAL. The paper solely reflects the personal views of the author. The author has no financial or proprietary interests in any material discussed in this article. No funds, grants, or other support was received for conducting this study. The data are available to researchers from the German Socio-Economic Panel at the DIW Berlin; the code used in the empirical analysis is available from the author upon request.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.