Estimating transition probabilities between health states using US longitudinal survey data

Jung, Juergen

doi:10.1007/s00181-021-02157-6

Published: 30 January 2022

Volume 63, pages 901–943 (2022)
Empirical Economics

Juergen Jung ORCID: orcid.org/0000-0003-3791-6293¹

Abstract

We use data from two representative US household surveys, the Medical Expenditure Panel Survey (MEPS) and the Health and Retirement Study (RAND-HRS), to estimate transition probability matrices between health states over the lifecycle from age 20 to 95. We compare nonparametric counting methods with parametric methods in which we control for individual characteristics as well as time and cohort effects. We align two-year transition probabilities from the HRS with one-year transition probabilities in MEPS using a stochastic root method that assumes a Markov structure. We find that the nonparametric counting method and the regression specifications based on ordered logit models produce similar results over the lifecycle. However, the counting method overestimates the probabilities of transitioning into bad health states. In addition, we find that young women have worse health prospects than their male counterparts, but at older ages women transition into better health states with higher probabilities than men. We do not find significant differences in the conditional health transition probabilities between African Americans and the rest of the population. We also find that the lifecycle patterns are stable over time. Finally, we discuss issues with controlling for time effects, sample attrition, the Markov assumption, and other modeling issues that can arise with categorical outcome variables.
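
As a rough illustration of the nonparametric counting approach mentioned in the abstract, the following Python sketch tabulates observed moves between health states and normalizes each row into conditional probabilities. The function name `count_transition_matrix`, the three-state coding (0 = good, 1 = poor, 2 = dead), and the toy observations are hypothetical and are not taken from the paper, MEPS, or the HRS. The sketch also ignores survey weights, age groups, and covariates.

```python
import numpy as np


def count_transition_matrix(current, following, n_states):
    """Estimate transition probabilities by counting observed moves.

    current[k] and following[k] are the integer-coded health states of
    observation k in period t and period t+1 (hypothetical coding).
    """
    counts = np.zeros((n_states, n_states))
    for i, j in zip(current, following):
        counts[i, j] += 1  # one observed move from state i to state j

    row_totals = counts.sum(axis=1, keepdims=True)
    # Divide each row by its total; rows without observations stay at zero.
    return np.divide(
        counts,
        row_totals,
        out=np.zeros_like(counts),
        where=row_totals > 0,
    )


# Toy data (illustrative only): state in period t and in period t+1.
state_t = [0, 0, 0, 1, 1, 2]
state_t1 = [0, 0, 1, 1, 2, 2]
P_hat = count_transition_matrix(state_t, state_t1, n_states=3)
print(P_hat)
```

Each row of `P_hat` is the empirical distribution over next-period states conditional on the current state, so rows with at least one observation sum to one.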

Notes

MEPS is a representative household survey of the US working population, while the HRS is a survey representative of the older population in the USA.
Compare Catillon et al. (2018) and life tables from the CDC at: https://www.cdc.gov/nchs/nvss/life-expectancy.htm.
A related approach by Dalgaard and Strulik (2014) models the dynamics of health as a health deficit accumulation process, measurable with a frailty index, that eventually ends in death. An introduction to the frailty index measure can be found in Rockwood and Mitnitski (2007).
Cutler and Richardson (1997) and more specifically Grossman (2000) provide summaries of this empirical literature concerning health capital.
Appendices A–I contain additional results including additional summary statistics, ordered probit models, multinomial logit, and probit models, transition probabilities based on samples from two different time periods, detailed lifecycle transition probabilities by gender, race, and race and time, as well as results from finite mixture and mixed processes models.
Chowdhury et al. (2019) provide details about the MEPS survey designs.
Sect. 5.8 contains a more detailed discussion about attrition bias issues in MEPS and HRS.
The RAND-HRS is developed from the HRS and comprises a cross-wave file with variables derived consistently across waves. The RAND-HRS is maintained by the RAND Center of Aging. More information is available at: https://www.rand.org/well-being/social-and-behavioral-policy/centers/aging/dataprod/hrs-data.html.
Fisher and Ryan (2017) provide a recently published summary of the Health and Retirement Study.
A frequency distribution of the full sample is available in Figure A.1 and Table A.1 in Online Appendix A.
OECD (2018), Inflation (CPI) (indicator). doi: 10.1787/eee82e6e-en (Accessed on 29 June 2018) at https://data.oecd.org/price/inflation-cpi.htm.
It should be noted that the observed frequencies of the respective health categories are much more uneven in MEPS than in the HRS. For instance, only 391 individuals transition to death and 3,172 into poor health states, whereas much larger numbers of individuals transition into the other health states. Uneven counts of observations of different categories in the outcome variable can lead to convergence issues in multinomial models. We discuss some of these issues in Sects. 4.3 and 5.4.
Similar results of very persistent health states have been found in other surveys such as the British Household Panel Survey as shown in Contoyannis et al. (2004).
Increasing the polynomial order of age does not affect our results.
It should be noted that \(t+1\) refers to the future period, which in the econometric implementation means a one-year-ahead variable for MEPS data and a two-year-ahead variable for HRS data.
Other tests for IIA are based on Small and Hsiao (1985). Compare Long and Freese (2014) for further details on testing for IIA.
See Long and Freese (2014) for details.
The number of children in a household is sometimes added to the lifestyle equation as the number of children could affect smoking behavior but not one’s assessment of health.
We will discuss methods to transform two-year transition probabilities into one-year transition probabilities in the next section.
Compare Contoyannis et al. (2004) for a discussion of using initial conditions to control for individual effects in dynamic panel regressions.
Life expectancy numbers are from CDC life tables for the years 2001 and 2016, retrieved in June 2021 from https://www.cdc.gov/nchs/nvss/life-expectancy.htm.
The Python version of the algorithm is available on the author’s website at: https://juejung.github.io/research.htm.
We skip the test for state dependency as it is pretty clear that consecutive health states are not independent from each other as shown by the highly significant coefficients of current health states in the marginal effects estimations of Tables 9 and 10.
A joint hypothesis over all initial health types h can easily be implemented by summing up the individual \(\alpha _{n}\) over all health states; the sum then follows a Chi-square distribution with \(6\left( 6-1\right) \left( T-1\right) \) degrees of freedom.
The joint distribution over all health states h follows a Chi-square distribution with \(6\times \left( 6-1\right) ^{2}\) degrees of freedom.
Tests for subcategories of individuals for whom we control in the parameterized version of the model are difficult to implement as some transitions between rare health states would not show up in the divided sample and the tests would have diminished statistical power.
The age group of 50–60 year olds has good representation in both surveys as can be seen from Figure A.1 in Online Appendix A.
In order to assess the robustness of our results based on the ordered logit model, we also report estimation results from an ordered probit model in Figures B.1–B.5 in Online Appendix B. The results are almost identical to the ordered logit model.
Marginal effects estimates for both the MLM and MPM using HRS data are available in Online Appendix C. They are very similar to the marginal effects based on the ordered logit model from Sect. 5.1.
For more detailed discussions of age, cohort, and time effects see Fernández-Villaverde and Krueger (2007) and Jung and Tran (2014).
This does not contradict our earlier result that finds some significant differences in the early and late time period dummy variables as differential effects from periods of recessions are potentially driving the results.
Adding additional interaction terms of gender with a higher order age polynomial does not change the resulting graphs in a statistically significant way.
Online Appendix G presents the lifecycle profiles of the differences in the conditional transition probabilities as well as summary statistics by race across the two time periods.
Baulch and Quisumbing (2010) contain detailed descriptions, including Stata code, for these types of tests.
See Heeringa and Connor (1995) and Ofstedal et al. (2011) for more detail about the HRS sample design and sample weights.
Attrition on observables occurs when the dependent variable is independent of the attrition process conditional on the explanatory variables. Attrition on unobservables occurs when this conditional independence does not hold. A sample selection model can account for attrition on unobservables but requires an exclusion restriction for identification, that is, an instrumental variable that affects attrition only but not the dependent variable (Hausman and Wise 1979; Ridder 1992). Fitzgerald et al. (1998) point out that it is almost impossible to find plausible exclusion restrictions.
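
The conversion of two-year into one-year transition probabilities referred to in the notes can be illustrated with a small Python sketch. It takes the principal square root of a two-year matrix through an eigendecomposition, then clips negative entries and renormalizes the rows so that the result is again a stochastic matrix. This is a minimal sketch under stated assumptions, not the algorithm used in the paper: the function name `one_year_from_two_year`, the three-state example matrix, and the clip-and-renormalize step are illustrative choices, and the sketch assumes a time-homogeneous Markov chain with a diagonalizable transition matrix.

```python
import numpy as np


def one_year_from_two_year(P2):
    """Approximate a one-year matrix P1 with P1 @ P1 close to P2.

    Assumes P2 is diagonalizable. A stochastic root need not exist or be
    unique, so the result is only an approximation in general.
    """
    eigvals, eigvecs = np.linalg.eig(P2)
    # Principal square roots of the eigenvalues (complex-safe).
    sqrt_vals = np.sqrt(eigvals.astype(complex))
    # V diag(sqrt(lambda)) V^{-1}
    root = (eigvecs * sqrt_vals) @ np.linalg.inv(eigvecs)

    P1 = np.clip(root.real, 0.0, None)  # drop tiny negative entries
    return P1 / P1.sum(axis=1, keepdims=True)  # rows sum to one again


# Hypothetical one-year matrix (states: good, poor, dead), used only to
# build a two-year matrix whose root is known.
P1_true = np.array([
    [0.80, 0.15, 0.05],
    [0.20, 0.60, 0.20],
    [0.00, 0.00, 1.00],
])
P2 = P1_true @ P1_true

P1_est = one_year_from_two_year(P2)
print(np.round(P1_est, 4))
```

In this constructed example all eigenvalues of the one-year matrix are positive, so the principal root recovers it. With estimated matrices the raw root can contain negative or complex entries, which is why some adjustment step such as the clipping used here is needed.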

References

Aiyagari SR (1994) Uninsured idiosyncratic risk and aggregate saving. Q J Econ 109(3):659–684
Alderman H, Behrman JR, Kohler H-P, Maluccio JA, Watkins SC (2001) Attrition in longitudinal household survey data. Demogr Res 5:79–124
Anderson TW, Goodman LA (1957) Statistical inference about Markov chains. Ann Math Stat 28(1):89–110
Balia S (2014) Survival expectations, subjective health and smoking: evidence from SHARE. Empir Econ 47(2):753–780
Balia S, Jones AM (2008) Mortality, lifestyle and socio-economic status. J Health Econ 27(1):1–26
Baulch B, Quisumbing A (2010) Testing and adjusting for attrition in household panel data. Toolkit Note, Chronic Poverty Research Centre, London, UK, pp 1–12
Becketti S, Gould W, Lillard L, Welch F (1988) The panel study of income dynamics after fourteen years: an evaluation. J Labor Econ 6(4):472–492
Bewley T (1986) Stationary monetary equilibrium with a continuum of independently fluctuating consumers. In: Hildenbrand W, Mas-Colell A (eds) Contributions to mathematical economics in honor of Gerard Debreu. North-Holland
Billingsley P (1961) Statistical inference for Markov processes, vol 7. University of Chicago Press, Chicago
Brant R (1990) Assessing proportionality in the proportional odds model for ordinal logistic regression. Biometrics 46(4):1171–1178
Cao H, Hill DH (2005) Active versus passive sample attrition: the health and retirement study. Econometrics 0505006, University Library of Munich, Germany
Catillon M, Cutler D, Getzen T (2018) Two hundred years of health and medical care: the importance of medical care for life expectancy gains. NBER Working Paper 25330
Chhatwal J, Jayasuriya S, Elbasha EH (2016) Changing cycle lengths in state-transition models: challenges and solutions. Med Decis Making 36(8):952–964
Chowdhury SR, Machlin SR, Gwet KL (2019) Sample designs of the medical expenditure panel survey household component, 1996–2006 and 2007–2016. Methodology Report #33 (January 2019), Agency for Healthcare Research and Quality, Rockville, MD
Clarke PM, Ryan C (2006) Self-reported health: reliability and consequences for health inequality measurement. Health Econ 15(6):645–652
Cohen SB, Machlin SR, Branscome JM (2000) Patterns of survey attrition and reluctant response in the 1996 medical expenditure panel survey. Health Serv Outcomes Res Method 1(2):131–148
Contoyannis P, Jones AM (2004) Socio-economic status, health and lifestyle. J Health Econ 23(5):965–995
Contoyannis P, Jones AM, Rice N (2004) The dynamics of health in the British Household Panel Survey. J Appl Economet 19(4):473–503
Crossley TF, Kennedy S (2002) The reliability of self-assessed health status. J Health Econ 21:643–658
Cutler DM, Richardson E (1997) Measuring the health of the U.S. population. Brookings Papers on Economic Activity: Microeconomics, pp 217–282
Dalgaard C-J, Strulik H (2014) Optimal aging and death: understanding the Preston curve. J Eur Econ Assoc 12(3):672–701
Deaton AS, Paxson CH (1998) Aging and inequality in income and health. Am Econ Rev Papers Proceed 88(2):248–253
Deb P, Trivedi PK (1997) Demand for medical care by the elderly: a finite mixture approach. J Appl Economet 12(3):313–336
Diehr P, Patrick DL (2001) Probabilities of transition among health states for older adults. Qual Life Res 10:431–442
Diehr P, Patrick DL, Bild DE, Gregory L, Williamson BJD (1998) Predicting future years of healthy life for older adults. J Clin Epidemiol 51(4):343–353
Engels JM, Diehr P (2003) Imputation of missing longitudinal data: a comparison of methods. J Clin Epidemiol 56:968–976
Fernández-Villaverde J, Krueger D (2007) Consumption over the life-cycle: some facts from consumer expenditure survey data. Rev Econ Stat 89(3):552–565
Fisher GG, Ryan LH (2017) Overview of the health and retirement study and introduction to the special issue. Work, Aging and Retirement 4(1):1–9
Fitzgerald J, Gottschalk P, Moffitt R (1998) An analysis of sample attrition in panel data: the Michigan panel study of income dynamics. J Hum Resour 33(2):251–299
Fonseca R, Michaud P-C, Galama T, Kapteyn A (2021) Accounting for the rise of health spending and longevity. J Eur Econ Assoc 19(1):536–579
French E (2005) The effects of health, wealth, and wages on labour supply and retirement behaviour. Rev Econ Stud 72(2):395–427
French E, Jones JB (2011) The effects of health insurance and self-insurance on retirement behavior. Econometrica 79:693–732
Gerdtham UG, Johannesson M, Lundberg L, Isacson D (1999) A note on validating Wagstaff and van Doorslaer’s health measure in the analysis of inequalities in health. J Health Econ 18(1):117–124
Grossman M (2000) The human capital model. In: Handbook of health economics, vol 1A. Elsevier North Holland, pp 347–408
Grossman M (1972) On the concept of health capital and the demand for health. J Polit Econ 80(2):223–255
Halliday TJ, Mazumder B, Wong A (2020) The intergenerational transmission of health in the United States: a latent variables analysis. Health Econ, pp 1–15
Halliday TJ, Mazumder B, Wong A (2021) Intergenerational mobility in self-reported health status in the US. J Public Econ 193:104307
Hausman JA, McFadden D (1984) Specification tests for the multinomial logit model. Econometrica 52:1219–1240
Hausman JA, Wise DA (1979) Attrition bias in experimental and panel data: the Gary income maintenance experiment. Econometrica 47(2):455–473
Heeringa SG, Connor JH (1995) Technical description of the health and retirement survey sample design. Institute for Social Research, University of Michigan, Ann Arbor, MI
Higham NJ, Lin L (2011) On Pth roots of stochastic matrices. Linear Algebra Appl 435:448–463
Huggett M (1993) The risk-free rate in heterogeneous-agent incomplete-insurance economies. J Econ Dyn Control 17(5–6):953–969
Idler EL, Benyamini Y (1997) Self-rated health and mortality: a review of twenty-seven community studies. J Health Soc Behav 38(1):21–37
Idler EL, Kasl SV (1995) Self-ratings of health: do they also predict change in functional ability? J Gerontol Ser B, Psychol Sci Soc Sci 50(6):S344–S353
İmrohoroğlu S, Kitao S (2012) Social security reforms: benefit claiming, labor force participation, and long-run sustainability. Am Econ J Macroecon 4(3):96–127
İmrohoroğlu A, İmrohoroğlu S, Joines D (1995) A life cycle analysis of social security. Econ Theor 6(1):83–114
Israel RB, Rosenthal JS, Wei JZ (2001) Finding generators for Markov chains via empirical transition matrices, with applications to credit ratings. Math Financ 11(2):245–265
Juerges H (2007) True health vs response styles: exploring cross-country differences in self-reported health. Health Econ 16(2):163–178
Jung J, Tran C (2014) Medical consumption over the life cycle: facts from a U.S. medical expenditure panel survey. Empir Econ 47(3):927–957
Jung J, Tran C (2016) Market inefficiency, insurance mandate and welfare: U.S. health care reform 2010. Rev Econ Dyn 20:132–159
Jung J, Tran C, Chambers M (2017) Aging and health financing in the U.S.: a general equilibrium analysis. Eur Econ Rev 100:428–462
Juster FT, Suzman R (1995) An overview of the health and retirement study. J Human Resour 30(Supplement):S7–S56
Kakwani N, Wagstaff A, van Doorslaer E (1997) Socioeconomic inequalities in health: measurement, computation, and statistical inference. J Econom 77(1):87–103
Kaplan G (2012) Inequality and the life cycle. Quant Econ 3(3):471–525
Kapteyn A, Meijer E (2014) A comparison of different measures of health and their relation to labor force transitions at older ages. In: Discoveries in the economics of aging. NBER Chapters, National Bureau of Economic Research, pp 115–150
Kapteyn A, Michaud PC, Smith JP, Van Soest A (2006) Effects of attrition and non-response in the health and retirement study. RAND Working Paper WR-407
Kerkhofs M, Lindeboom M (1995) Subjective health measures and state-dependent reporting errors. Health Econ 4(3):221–235
Kropko J (2008) Choosing between multinomial logit and multinomial probit models for analysis of unordered choice data. Master's thesis, Department of Political Science, College of Arts and Sciences
Kullback S, Kupperman M, Ku HH (1962) Tests for contingency tables and Markov chains. Technometrics 4(4):573–608
Lillard LA, Panis CWA (1998) Panel attrition from the panel study of income dynamics: household income, marital status, and mortality. J Hum Resour 33(2):437–457
Lin L (2011) Roots of stochastic matrices and fractional matrix powers. PhD thesis, Manchester Institute for Mathematical Sciences, School of Mathematics
Lindeboom M, van Doorslaer E (2004) Cut-point shift and index shift in self-reported health. J Health Econ 23(6):1083–1099
Lindeboom M, Kerkhofs M (2009) Health and work of the elderly: subjective health measures, reporting errors and endogeneity in the relationship between health and work. J Appl Economet 24(6):1024–1046
Long JS, Freese J (2014) Regression models for categorical dependent variables using Stata, 3rd edn. Stata Press, College Station, TX
McLachlan GJ, Basford KE (1988) Mixture models: inference and applications to clustering, vol 38. Dekker, New York
Meijer E, Kapteyn A, Andreyeva T (2011) Internationally comparable health indices. Health Econ 20(5):600–619
De Nardi M, French E, Jones JB (2010) Why do the elderly save? The role of medical expenses. J Polit Econ 118(1):39–75
Ofstedal MB, Weir DR, Chen K-T, Wagner J (2011) Updates to HRS sample weights. Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI
Okun MA, Stock WA, Haring MJ, Witter RA (1984) Health and subjective well-being: a meta-analysis. Int J Aging Human Develop 19(2):111–132
Palumbo MG (1999) Uncertain medical expenses and precautionary saving near the end of the life cycle. Rev Econ Stud 66(2):395–421
Pashchenko S, Porapakkarm P (2013) Quantitative analysis of health insurance reform: separating regulation from redistribution. Rev Econ Dyn 16(3):383–404
Ridder G (1992) An empirical evaluation of some models for non-random attrition in panel data. Struct Chang Econ Dyn 3(2):337–355
Rockwood K, Mitnitski A (2007) Frailty in relation to the accumulation of deficits. J Gerontol: Ser A 62(7):722–727
Ruhm CJ (2000) Are recessions good for your health? Q J Econ 115(2):617–650
Siebert U, Alagoz O, Bayoumi AM, Jahn B, Owens DK, Cohen DJ, Kuntz KM (2012) State-transition modeling: a report of the ISPOR-SMDM modeling good research practices task force-3. Value Health 15(6):812–820
Small KA, Hsiao C (1985) Multinomial logit specification tests. Int Econ Rev 26(3):619–627
van Doorslaer E, Jones AM (2003) Inequalities in self-reported health: validation of a new approach to measurement. J Health Econ 22(1):61–87
Vijverberg WPM (2011) Testing for IIA with the Hausman-McFadden test. IZA Discussion Paper No. 5826
Wagstaff A, Van Doorslaer E (1994) Measuring inequalities in health in the presence of multiple-category morbidity indicators. Health Econ 3(4):281–291
Wallace RB, Herzog RA (1995) Overview of the health measures in the health and retirement study. J Human Resour 30(Supplement):S84–S107
Wilde J (2000) Identification of multiple equation probit models with endogenous dummy regressors. Econ Lett 69(3):309–312
Ziebarth N (2010) Measurement of health, health inequality, and reporting heterogeneity. Soc Sci Med 71(1):116–124

Funding

Not applicable. This study is not funded by any grant.

Author information

Authors and Affiliations

Department of Economics, Towson University, Towson, USA
Juergen Jung

Corresponding author

Correspondence to Juergen Jung.

Ethics declarations

Conflict of interest

Juergen Jung declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We appreciate comments from Gerhard Glomm, Vinish Shrestha, Jialu Streeter, Pravin Trivedi, and an anonymous referee. This paper was formerly circulated as “Estimating Markov Transition Probabilities between Health States in the HRS Dataset.”

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2226 KB)

About this article

Cite this article

Jung, J. Estimating transition probabilities between health states using US longitudinal survey data. Empir Econ 63, 901–943 (2022). https://doi.org/10.1007/s00181-021-02157-6

Received: 03 November 2020
Accepted: 03 October 2021
Published: 30 January 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s00181-021-02157-6
