Abstract
We propose a method to analyze interval-censored data using multiple imputation based on a heteroskedastic interval regression approach. The proposed model aims to obtain a synthetic dataset that can be used for standard analyses, including linear regression, quantile regression, and poverty and inequality estimation. We present two applications to show the performance of our method. First, we run a Monte Carlo simulation to assess the method's performance under the assumption of multiplicative heteroskedasticity, with and without conditional normality. Second, we use the proposed methodology to analyze labor income data in Grenada for 2013-2020, where the salary data are interval-censored according to the salary intervals prespecified in the survey questionnaire. The results obtained are consistent across both exercises.
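As a rough illustration of the imputation idea described above, the following minimal Python sketch draws latent (log) incomes from a normal distribution truncated to each respondent's reported interval, with a scale that varies multiplicatively with a covariate, and then pools an estimate across imputations using Rubin's rules. It is a sketch under stated assumptions, not the authors' implementation: the coefficients `beta` and `gamma`, the toy `records`, and all function names are hypothetical placeholders, and in practice the coefficients would come from a fitted heteroskedastic interval regression. A fuller implementation would typically also redraw the parameters for each imputation; that step is omitted here.

```python
import math
import random
from statistics import NormalDist, mean, variance

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def draw_truncated_normal(mu, sigma, lower, upper, rng):
    """Draw from N(mu, sigma^2) restricted to [lower, upper] via the inverse CDF."""
    p_lo = STD_NORMAL.cdf((lower - mu) / sigma)
    p_hi = STD_NORMAL.cdf((upper - mu) / sigma)
    u = rng.uniform(p_lo, p_hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # inv_cdf requires 0 < u < 1
    y = mu + sigma * STD_NORMAL.inv_cdf(u)
    return min(max(y, lower), upper)  # guard against rounding at the bounds


def impute_once(records, beta, gamma, rng):
    """Impute one latent value per record; each record is (x, lower, upper)."""
    imputed = []
    for x, lower, upper in records:
        mu = beta[0] + beta[1] * x                  # conditional mean
        sigma = math.exp(gamma[0] + gamma[1] * x)   # multiplicative heteroskedasticity
        imputed.append(draw_truncated_normal(mu, sigma, lower, upper, rng))
    return imputed


def rubin_combine(estimates, variances):
    """Pool point estimates and their variances across imputations (Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, denominator m - 1
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


rng = random.Random(12345)
# Hypothetical data: (covariate, lower bound, upper bound) of log-income brackets.
records = [(0.0, 6.0, 6.5), (1.0, 6.5, 7.0), (2.0, 7.0, 7.5), (3.0, 7.5, math.inf)]
beta = (6.2, 0.4)    # placeholder location coefficients (assumed, not estimated)
gamma = (-1.0, 0.1)  # placeholder log-scale coefficients (assumed, not estimated)

M = 20  # number of imputed datasets
estimates, variances = [], []
for _ in range(M):
    y = impute_once(records, beta, gamma, rng)
    estimates.append(mean(y))
    variances.append(variance(y) / len(y))  # sampling variance of the mean

pooled_mean, pooled_var = rubin_combine(estimates, variances)
```

Each imputed dataset can be analyzed with standard tools as if the outcome were fully observed; only the final pooling step needs to account for the between-imputation variation.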
Data availability
The data used for the illustration in the current study are available from the Central Statistical Office of Grenada.
Funding
This research received no funding.
Author information
Contributions
Fernando Rios-Avila developed and implemented the strategy, and wrote the manuscript. Gustavo Canavire-Bacarreza and Flavia Sacco-Capurro wrote the manuscript. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
The opinions expressed in this paper are those of the authors and do not necessarily reflect the views of the World Bank, its Board of Directors, or the countries it represents.
About this article
Cite this article
Rios-Avila, F., Canavire-Bacarreza, G. & Sacco-Capurro, F. Recovering income distribution in the presence of interval-censored data. J Econ Inequal (2024). https://doi.org/10.1007/s10888-023-09617-2
DOI: https://doi.org/10.1007/s10888-023-09617-2