Abstract
In much demographic analysis, it is important to know how occurrence-exposure rates or transition probabilities vary continuously by age or by time. The available data are often coarse or fluctuating, so estimation and smoothing may be needed. Since the distributions of rates or counts across age or another variable are often curved, a nonlinear model is likely to be appropriate. The main focus of this paper is on estimating detailed information from grouped data such as age and income bands; however, the methods we outline could also be applied in other settings, such as smoothing rates where the original data are ragged. The ability to carry out curve fitting is a very useful skill for population geographers and demographers. Curve fitting is not well covered in statistics textbooks, and whilst a large journal literature discusses in detail the functions which define curves, these texts are likely to be inaccessible to researchers who are not specialists in mathematics. We aim here to make nonlinear modelling as accessible as possible. We demonstrate how to carry out nonlinear regression using SPSS, with step-by-step hypothetical and research examples. We note other software in which nonlinear regression can be carried out, and outline alternative methods of curve fitting.
Acknowledgments
Work underpinning this paper was funded by research grants as follows: Paul Norman and Phil Rees by ESRC Research Awards RES-163-25-0032 and RES-189-25-0162, Alan Marshall by ESRC Postdoctoral Fellowship ES/H030328/1, Chris Thompson by ESRC RIBEN CASE studentship with Acxiom, and Lee Williamson by ESRC CASE studentship S42200124001 and Bradford Metropolitan District Council. The case studies reported here have been enabled by Bradford Birth Statistics Database supplied by Bradford Metropolitan District Council; Research Opinion Poll data supplied by Acxiom Ltd; 1991 and 2001 Census data obtained through the MIMAS CASWEB facility, an academic service supported by ESRC and JISC; and midyear estimates and Vital Statistics provided by the Office for National Statistics. The Census and Vital Statistics data are Crown copyright and are reproduced with permission of the Office of Public Sector Information. The authors are grateful to the anonymous referees whose useful comments have helped us to improve this paper.
Appendix: Using SPSS syntax
(a) Quick guide to SPSS syntax
SPSS syntax is a ‘high level’ programming language which can readily be used to automate tasks without the user being a trained computer programmer. Users of SPSS will be familiar with the Data Editor window (comprising Data View and Variable View) through which you can open and analyse files with the .sav extension. Any analyses you carry out have the results (e.g. tables and graphs) presented in the Output window (which can be saved as .spo or .spv files depending on SPSS version). SPSS syntax files have the .sps extension.
For people who are inexperienced in using syntax, there is no need to type the command instructions from scratch. There are two easy ways to obtain syntax, since SPSS will write it for you. If you use dialogue boxes to carry out an analysis, once you have made your variable selections, click on ‘Paste’ and the syntax commands corresponding to your mouse-click selections will be pasted into a new or previously open syntax file. A better way is to select Edit > Options from the menu, then on the Viewer tab tick ‘Display commands in the log’ and then click Apply. Any choices you make through ‘point and click’ are then always recorded in the Output window as syntax. You can then copy and paste the syntax you need into a .sps file and save the syntax file for future use. Existing commands can be edited for a new analysis.
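As an illustration of what ‘Paste’ produces, the sketch below shows syntax of the kind generated by the Frequencies dialogue box. The variable name ageband is hypothetical, and the exact subcommands pasted may differ between SPSS versions.

```spss
* Syntax of the kind pasted from the Frequencies dialogue box.
* The variable name ageband is hypothetical.
FREQUENCIES VARIABLES=ageband
  /ORDER=ANALYSIS.
```

To reuse the command for a new analysis, replace ageband with another variable name in the active dataset, highlight the command in the syntax file and run the selection.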
(b) Explanation of the SPSS syntax to carry out nonlinear regression
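As a minimal sketch of the general shape of nonlinear regression syntax (an illustration under stated assumptions, not the commands used in the case studies), the block below fits a simple exponential curve. The variable names rate and age, the parameter names a and b, their starting values and the choice of model are all hypothetical.

```spss
* Hypothetical example: fit rate = a * EXP(b * age).
* rate and age are assumed variable names in the active dataset.
* a and b are the parameters to estimate, given illustrative starting values.
MODEL PROGRAM a=0.01 b=0.05.
COMPUTE PRED_ = a * EXP(b * age).
NLR rate
  /PRED PRED_
  /SAVE PRED.
```

MODEL PROGRAM declares the parameters and their starting values, COMPUTE defines the predicted value as a function of those parameters, and NLR names the dependent variable. The /SAVE PRED subcommand asks for the fitted values to be saved to the dataset, which is how estimates at a finer level of detail than the original data could then be inspected.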
Cite this article
Norman, P., Marshall, A., Thompson, C. et al. Estimating detailed distributions from grouped sociodemographic data: ‘get me started in’ curve fitting using nonlinear models. J Pop Research 29, 173–198 (2012). https://doi.org/10.1007/s12546-012-9082-9
DOI: https://doi.org/10.1007/s12546-012-9082-9