Estimating detailed distributions from grouped sociodemographic data: ‘get me started in’ curve fitting using nonlinear models

Journal of Population Research

Abstract

In much demographic analysis, it is important to know how occurrence-exposure rates or transition probabilities vary continuously by age or by time. Often we have coarse or fluctuating data so there can be a need for estimation and smoothing. Since the distributions of rates or counts across age or another variable are often curved, a nonlinear model is likely to be appropriate. The main focus of this paper is on the estimation of detailed information from grouped data such as age and income bands; however, the methods we outline could also be applied to other settings such as smoothing rates where the original data are ragged. The ability to carry out curve fitting is a very useful skill for population geographers and demographers. Curve fitting is not well covered in statistics textbooks, and whilst there is a large literature in journals thoroughly discussing the detail of functions which define curves, these texts are likely to be inaccessible to researchers who are not specialists in mathematics. We aim here to make nonlinear modelling as accessible as possible. We demonstrate how to carry out nonlinear regression using SPSS, giving stepped-through hypothetical and research examples. We note other software in which nonlinear regression can be carried out, and outline alternative methods of curve fitting.
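As a minimal sketch of the idea, and not the SPSS procedure demonstrated in the paper, the following Python fragment fits a three-parameter bell-shaped curve to hypothetical rates for five-year age bands and then reads single-year-of-age estimates off the fitted curve. The data, the choice of curve, the use of band midpoints as the x values, all function and variable names, and the simple direct-search minimiser are illustrative assumptions; dedicated nonlinear regression routines use more efficient iterative algorithms.

```python
import math

# Hypothetical grouped data: midpoints of five-year age bands (15-19, ..., 45-49).
# The band rates are generated from a known curve so that the fit can be checked;
# they are not real demographic rates.
midpoints = [17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5]
TRUE_PEAK, TRUE_MODE, TRUE_SPREAD = 0.12, 29.0, 6.0


def curve(age, peak, mode, spread):
    """Bell-shaped (Gaussian-type) schedule: an assumed functional form."""
    return peak * math.exp(-((age - mode) ** 2) / (2.0 * spread ** 2))


band_rates = [curve(x, TRUE_PEAK, TRUE_MODE, TRUE_SPREAD) for x in midpoints]


def sse(params, xs, ys):
    """Sum of squared errors between observed and modelled rates."""
    peak, mode, spread = params
    if spread <= 0:
        return float("inf")  # keep the search away from invalid spreads
    return sum((y - curve(x, peak, mode, spread)) ** 2 for x, y in zip(xs, ys))


def fit(xs, ys, start, steps, tol=1e-9, max_sweeps=20000):
    """Least-squares fit by coordinate-wise direct search with step halving."""
    params, steps = list(start), list(steps)
    best = sse(params, xs, ys)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(params)):
            for direction in (1.0, -1.0):
                trial = list(params)
                trial[i] += direction * steps[i]
                value = sse(trial, xs, ys)
                if value < best:
                    params, best, improved = trial, value, True
                    break
        if not improved:
            # No single-parameter move helps at this step size: refine the steps.
            steps = [s / 2.0 for s in steps]
            if max(steps) < tol:
                break
    return params, best


# Starting values read off the grouped data: highest rate, its age, a guessed spread.
start = [max(band_rates), midpoints[band_rates.index(max(band_rates))], 5.0]
params, error = fit(midpoints, band_rates, start, steps=[0.01, 1.0, 1.0])

# Detailed (single year of age) estimates from the fitted curve.
single_year = {age: curve(age, *params) for age in range(15, 50)}
```

The same pattern carries over to other curves by swapping the function in `curve` and its starting values. Iterative fits can settle on a poor solution when the starting values are far from plausible, so reading them off a plot of the grouped data is a sensible habit.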

Acknowledgments

Work underpinning this paper was funded by research grants as follows: Paul Norman and Phil Rees by ESRC Research Awards RES-163-25-0032 and RES-189-25-0162, Alan Marshall by ESRC Postdoctoral Fellowship ES/H030328/1, Chris Thompson by ESRC RIBEN CASE studentship with Acxiom, and Lee Williamson by ESRC CASE studentship S42200124001 and Bradford Metropolitan District Council. The case studies reported here have been enabled by Bradford Birth Statistics Database supplied by Bradford Metropolitan District Council; Research Opinion Poll data supplied by Acxiom Ltd; 1991 and 2001 Census data obtained through the MIMAS CASWEB facility, an academic service supported by ESRC and JISC; and midyear estimates and Vital Statistics provided by the Office for National Statistics. The Census and Vital Statistics data are Crown copyright and are reproduced with permission of the Office of Public Sector Information. The authors are grateful to the anonymous referees whose useful comments have helped us to improve this paper.

Author information

Corresponding author

Correspondence to Paul Norman.

Appendix: Using SPSS syntax

(a) Quick guide to SPSS syntax

SPSS syntax is a ‘high level’ programming language that can readily be used to automate tasks without the user being a trained computer programmer. Users of SPSS will be familiar with the Data Editor window (comprising Data View and Variable View), through which files with the .sav extension are opened and analysed. The results of any analysis (e.g. tables and graphs) are presented in the Output window, which can be saved as a .spo or .spv file depending on the SPSS version. SPSS syntax files have the .sps extension.

People who are inexperienced in using syntax need not type the command instructions from scratch, since SPSS will write them for you. There are two easy ways to obtain syntax. First, if you use dialogue boxes to carry out an analysis, once you have made your variable selections click on ‘Paste’; the syntax commands corresponding to your mouse-click selections are pasted into a new or already open syntax file. A better way is to select Edit > Options from the menu, go to the Viewer tab, tick ‘Display commands in the log’ and then click Apply. Any choices you make through ‘point and click’ are then always recorded in the Output window as syntax. You can copy and paste the syntax you need into a .sps file and save it for future use. Existing commands can be edited for a new analysis.

(b) Explanation of the SPSS syntax to carry out nonlinear regression

About this article

Cite this article

Norman, P., Marshall, A., Thompson, C. et al. Estimating detailed distributions from grouped sociodemographic data: ‘get me started in’ curve fitting using nonlinear models. J Pop Research 29, 173–198 (2012). https://doi.org/10.1007/s12546-012-9082-9
