Urban Form, Heart Disease, and Geography: A Case Study in Composite Index Formation and Bayesian Spatial Modeling

Population Research and Policy Review

Abstract

Recent studies indicate a relationship between measures of urban form, as applied to urban and suburban areas, and obesity, a risk factor for heart disease. Measures of urban form for exurban and rural areas are scarce; such measures could prove useful in assessing relationships between urban form and both mortality and morbidity in such areas. In modeling area-level mortality, geographic relationships between counties warrant consideration because geographically adjacent areas tend to have more in common than areas farther from each other. We modify county-level indices of urban form found in the literature so that they can be applied to exurban and rural counties. We then use these indices in a Bayesian spatial model that accounts for spatial autocorrelation to determine whether there is a relationship between such measures and cardiovascular disease mortality for white males aged 35 and older for the period 1999–2001. Issues related to the formation and usefulness of the indices, and issues related to the spatial model, are discussed. Maps of observed and expected relative risk of mortality are presented.


References

  • AHRQ (2006). Research on cardiovascular disease in women. Program Brief. AHRQ Publication No. 06-P016. Rockville: Agency for Healthcare Research and Quality. Retrieved March 8, 2007 from http://www.ahrq.gov/research/womheart.htm

  • American Heart Association (2007a). Risk factors and coronary heart disease. Retrieved March 6, 2007 from http://www.americanheart.org/presenter.jhtml?identifier=4726

  • American Heart Association (2007b). Cigarette smoking and cardiovascular disease. Retrieved March 9, 2007 from http://www.americanheart.org/presenter.jhtml?identifier=454

  • Anselin, L. (1988). The maximum likelihood approach to spatial process models. In L. Anselin (Ed.), Spatial econometrics: Methods and models (Chapt. 6, pp. 57–80). Dordrecht: Kluwer Academic.

  • Anselin, L. (1993). Discrete space autoregressive models. In M. Goodchild, B. Parks, & T. Steyaert (Eds.), Environmental modeling with GIS (pp. 454–469). Oxford: Oxford University Press.


  • Anselin, L., & The Regents of the University of Illinois (1998–2004). GeoDa 0.95-i. Available at the University of Illinois GeoDa web site: https://www.geoda.uiuc.edu/default.php

  • Anselin, L., Syabri, I., & Kho, Y. (2006). GeoDa: An introduction to spatial data analysis. Geographical Analysis, 38, 5–22.


  • Arab, A., Hooten, M. B., & Wikle, C. K. (2007). Hierarchical spatial models. In Encyclopedia of geographical information science. New York: Springer. Retrieved March 9, 2007 from Utah State University Department of Mathematics and Statistics web site: http://www.math.usu.edu/~hooten/papers/HSMv4.pdf (in press).

  • Banerjee, S., Carlin, B. P., & Gelfand, A. E. (2004). Hierarchical modeling and analysis for spatial data. Boca Raton: Chapman and Hall/CRC Press.


  • Bao, S. (n.d.). An overview of spatial econometric models. Retrieved June 23, 2005 from China Data Center, University of Michigan website: http://www.umich.edu/~iinet/chinadata/docs/topic_3.pdf

  • Benjamin, S. M., Geiss, L. S., Pan, L., Engelgau, M. M., & Greenlund, K. J. (2003). Self-reported heart disease and stroke among adults with and without diabetes—United States, 1999–2001. Morbidity and Mortality Weekly Report, 52(44), 1065–1070. Retrieved March 6, 2007 from http://www.cdc.gov/mmwR/preview/mmwrhtml/mm5244a2.htm

  • Besag, J. E., & Kooperberg, C. (1995). On conditional and intrinsic autoregressions. Biometrika, 82, 733–746.


  • Besag, J., York, J., & Mollie, A. (1991). Bayesian image restoration with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43, 1–59.


  • Brooks, S. P., & Gelman, A. (1998). Alternative methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434–455.


  • Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment: Sage University paper series on quantitative applications in the social sciences (Vol. 7). London: Sage Publications.


  • Casella, G., & Berger, R. L. (2002). Statistical inference (2nd ed.). Pacific Grove: Duxbury Publishing.


  • Casella, G., & George, E. I. (1992). Explaining the Gibbs sampler. The American Statistician, 46(3), 167–174.


  • Clayton, D. G., & Kaldor, J. (1987). Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics, 43, 671–691.


  • Congdon, P. (2001). Applied Bayesian modeling. Chichester: John Wiley and Sons.


  • Cressie, N. (1993). Statistics for spatial data (revised ed.). New York: John Wiley and Sons.

  • Dobson, A. J. (2002). An introduction to generalized linear models (2nd ed.). Boca Raton: Chapman and Hall.


  • Durham, C. A., Pardoe, I., & Vega, E. (2004). A methodology for evaluating how product characteristics impact choice in retail settings with many zero observations: An application to restaurant wine purchase. Journal of Agricultural and Resource Economics, 29(1), 112–131.


  • Ewing, R. (1997). Is Los Angeles style sprawl desirable? Journal of the American Planning Association, 63(1), 107–126.


  • Ewing, R., Pendall, R., & Chen, D. (2002). Measuring sprawl and its impact (Vol. 1). Retrieved March 9, 2007 from the Smart Growth America website: http://www.smartgrowthamerica.org/sprawlindex/MeasuringSprawlTechnical.pdf

  • Ewing, R., Schieber, R. A., & Zegeer, C. V. (2003). Urban sprawl as a risk factor in motor vehicle occupant and pedestrian fatalities. American Journal of Public Health, 93, 1541–1545.


  • Ewing, R., Schmid, S., Killingsworth, R., Zlot, A., & Raudenbush, S. (2003). Relationship between urban sprawl and physical activity, obesity, and morbidity. American Journal of Health Promotion, 18(1), 47–57.


  • Flegal, K. M., Carroll, M. D., Ogden, C. L., & Johnson, C. L. (2002). Prevalence and trends in obesity among US adults, 1999–2000. Journal of the American Medical Association, 288, 1723–1727.


  • Gelman, A., & Price, P. (1999). All maps of parameter estimates are misleading. Statistics in Medicine, 18, 3221–3234.


  • Hayes, D. K., Greenlund, K. J., Denny, C. H., Croft, J. B., & Keenan, N. L. (2005). Racial/ethnic and socioeconomic disparities in multiple factors for heart disease and stroke—United States, 2003. Morbidity and Mortality Weekly Report, 54(05), 113–117. Retrieved March 9, 2007 from: http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5405a1.htm

  • Holsinger, K. (2006). The deviance information criterion. Retrieved March 9, 2007 from University of Connecticut Department of Ecology and Evolutionary Biology faculty member Kent Holsinger’s website: http://www.darwin.eeb.uconn.edu/eeb348/lecture-notes/testing-hardy-weinberg/node5.html

  • Johnson, G. D. (2004). Smoothing small area maps of prostate cancer incidence in New York state using fully Bayesian hierarchical modeling. International Journal of Health Geographics, 3, 29. Retrieved March 9, 2007 from: http://www.ij-healthgeographics.com/content/3/1/29

  • Kim, J., & Mueller, C. (1978a). Factor analysis: Statistical methods and practical issues. Sage University Paper Series on Quantitative Applications in the Social Sciences, 14. Beverly Hills: Sage Publications.

  • Kim, J., & Mueller, C. (1978b). Introduction to factor analysis: What it is and how to do it. Sage University Paper Series on Quantitative Applications in the Social Sciences, 14. Beverly Hills: Sage Publications.

  • Lawson, A. (2001). Statistical methods in spatial epidemiology. Chichester: John Wiley and Sons.


  • Lawson, A., Browne, W., & Rodeiro, C. (2003). Disease mapping with WinBUGS and MLwiN. Chichester: John Wiley and Sons.


  • Lee, A. H., Stevenson, M. R., Wang, K., & Yau, K. K. W. (2002). Modeling young driver motor vehicle crashes: Data with extra zeroes. Accident Analysis and Prevention, 34, 515–521.


  • Longley, P. A., Goodchild, M. F., Maguire, D. J., & Rhind, D. W. (2001). Geographic information systems and science. Chichester: John Wiley and Sons.


  • Mayo Clinic (2007). Heart disease prevention: 5 strategies keep your heart healthy. January 15. Retrieved March 6, 2007 from Mayo Clinic website: http://www.mayoclinic.com/health/heart-disease-prevention/WO0004

  • Mokdad, A. H., Bowman, B. A., Ford, E. S., Vinicor, F., Marks, J. S., & Koplan, J. P. (2001). The continuing epidemics of obesity and diabetes in the United States. Journal of the American Medical Association, 286, 1195–1200.


  • Moulton, L. H., Foxman, B., Wolfe, R. A., & Port, F. K. (1994). Potential pitfalls in interpreting maps of stabilized rates. Epidemiology, 5(3), 297–301.


  • Nandram, B., Sedransk, J., & Pickle, L. W. (2000). Bayesian analysis and mapping of mortality rates for chronic obstructive pulmonary disease. Journal of the American Statistical Association, 95, 1110–1118.


  • NCHS (1999–2001). Type II multiple cause of death files 1999–2001. Hyattsville: National Center for Health Statistics.

  • NWHIC (2007). Heart disease. Retrieved March 6, 2007 from National Women’s Health Information Center website: http://www.4woman.gov/faq/heartdis.htm

  • Nielsen, P. S., Okkels, H., Sigsgaard, T., Kyrtopoulos, S., & Autrup, H. (1996). Exposure to urban and rural air pollution: DNA and protein adducts and effect of glutathione-S-transferase genotype on adduct levels. International Archives of Occupational and Environmental Health, 68(3), 170–176.


  • NIST (2005). 8.1.10: How can Bayesian methodology be used for reliability evaluation? In NIST/SEMATECH e-Handbook of statistical methods. Retrieved March 7, 2007 from the National Institute of Standards and Technology website: http://www.itl.nist.gov/div898/handbook/apr/section1/apr1a.htm#What%20is%20Bayesian%20Methodology%20and%20why%20is%20it

  • Pickle, L. W., Mungiole, M., Jones, G. K., & White, A. A. (1996). Atlas of United States mortality. Hyattsville: U.S. Department of Health and Human Services.


  • Pope, C. A., Burnett, R. T., Thun, M. J., Calle, E. E., Krewski, D., Ito, K., & Thurston, G. D. (2002). Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. Journal of the American Medical Association, 287(9), 1132–1141.


  • Rashid, M. N., Fuentes, F., Touchon, R. C., & Wehner, P. S. (2003). Obesity and the risk for cardiovascular disease. Preventive Cardiology, 6(1), 42–47.


  • Riccotti, H. (2003). Heart disease: Differences between men and women. Retrieved March 6, 2007 from Beth Israel Deaconess Medical Center website: http://www.bidmc.harvard.edu/display.asp?node_id=4952

  • SAS Institute (1999a). SAS/STAT user’s guide, version 8 (Vol. 1). Cary: SAS Institute.

  • SAS Institute (1999b). SAS/STAT user’s guide, version 8 (Vol. 2). Cary: SAS Institute.

  • SAS Institute (1999–2001). SAS 8.2. Cary: SAS Institute.

  • Shoultz, G., & Givens, J. (in progress). Heart disease, change in economic deprivation over time, and Bayesian spatial modeling: A case study.

  • Singh, G. K., & Siahpush, M. (2002). Increasing inequalities in all-cause and cardiovascular mortality among US adults aged 25–64 years by area socioeconomic status, 1969–1998. International Journal of Epidemiology, 31, 600–613.


  • Singh, G. K. (2003). Area deprivation and widening inequalities in U.S. mortality, 1969–1998. American Journal of Public Health, 93(7), 1137–1143.


  • Spiegelhalter, D. J., Best, N., Carlin, B. P., & Van der Linde, A. (2002). Bayesian deviance, the effective number of parameters and the comparison of arbitrarily complex models. Journal of the Royal Statistical Society B, 64, 583–640.


  • Spiegelhalter, D. J., Thomas, A., Best, N., & Lunn, D. (2003). WinBUGS user manual, version 1.4, January 2003. Retrieved March 7, 2007 from the WinBUGS website: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/manual14.pdf

  • Surgeon General (1996). Physical activity and health: A report of the Surgeon General. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion. Retrieved June 24, 2005 from http://www.cdc.gov/nccdphp/sgr/pdf/sgrfull.pdf

  • Surgeon General (2001). The Surgeon General’s call to action to prevent and decrease overweight and obesity. Rockville: U.S. Department of Health and Human Services, Public Health Service Office of the Surgeon General. Retrieved June 24, 2005 from http://www.surgeongeneral.gov/topics/obesity/calltoaction/CalltoAction.pdf

  • Szklo, M., & Nieto, F. J. (2000). Epidemiology: Beyond the basics. Sudbury: Jones and Bartlett.


  • The BUGS Project (2004). WINBUGS 1.4. Updated version 1.4.3 retrieved September 14, 2007 at: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml

  • Theobald, D. M. (2001). Land-use dynamics beyond the American urban fringe. Geographical Review, 91(3), 544–564.


  • Tobler, W. R. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46, 234–240.


  • U.S. Census Bureau (1990). 1990 census of population and housing block statistics, CD90-1B-(1-10). Washington: U.S. Census Bureau.

  • U.S. Census Bureau. (2002a). Census 2000 summary file 1. Washington: U.S. Census Bureau. Available at http://www.census.gov

  • U.S. Census Bureau. (2002b). Census 2000 summary file 3, CD-ROM CS-D00-S3ST-08-US1. Washington: U.S. Census Bureau.

  • U.S. Census Bureau (2001–2005). Geographic changes for Census 2000 + Glossary. Retrieved March 7, 2007 from: http://www.census.gov/geo/www/tiger/glossary.html#states

  • U.S. Department of Agriculture [USDA] (2003). Measuring rurality: Rural-urban continuum codes. Retrieved April 7, 2007 from the Economic Research Service, United States Department of Agriculture, at: http://www.ers.usda.gov/Briefing/Rurality/RuralUrbCon/

  • Wakefield, J. C., Best, N. G., & Waller, L. (2001). Bayesian approaches to disease mapping. In P. Elliott, J. Wakefield, N. Best, & D. Briggs (Eds.), Spatial epidemiology, methods and applications (pp. 104–127). Oxford: Oxford University Press.


  • Whittle, P. (1954). On stationary process in the plane. Biometrika, 41, 434–449.


  • World Health Organization (1992). International statistical classification of diseases and related health problems, 10th revision. Geneva: World Health Organization.


Acknowledgments

Disclaimer: This publication was made possible through a fellowship sponsored by the Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS) and the Association of Schools of Public Health (ASPH). The findings and conclusions contained in this paper represent the views of the authors. No official support or endorsement by either the Grand Valley State University Department of Statistics or the Centers for Disease Control and Prevention, Department of Health and Human Services, is intended, nor should any be inferred.

Author information


Corresponding author

Correspondence to Gerald Shoultz.

Additional information

Jimmie Givens retired from his service.

Appendices

Appendix A

This study used 1990 Census data but 1999–2001 mortality data. Therefore, changes in county boundaries between 1990 and 2000 (U.S. Census Bureau 2001–2005) are accounted for as follows. In Montana, the portion of Yellowstone National Park inside Montana was divided between Gallatin and Park Counties; therefore, Gallatin and Park Counties are pooled for this study. In Virginia, South Boston town (a county equivalent) was merged into Halifax County; for this study South Boston and Halifax County are pooled. In Alaska, Denali borough was formed from parts of Yukon-Koyukuk and Southeast Fairbanks boroughs; for this study the areas for Yukon-Koyukuk, Southeast Fairbanks, and Denali boroughs are merged. Also in Alaska, Skagway-Yakutat-Angoon borough was divided into Skagway-Hoonah-Angoon borough and Yakutat borough; for this study these boroughs are merged. Annexations of portions of one county into another that did not dissolve a county were not accounted for.

Appendix B

For our model the priors are \(1/\sigma_u^2,\ 1/\sigma_v^2 \sim \Gamma(0.5,\ 1/0.0005)\) and all \(\beta\)'s \(\sim N(0,\ 1/0.00001)\), where \(\Gamma(\alpha, \varepsilon)\) is a Gamma distribution with shape parameter α and scale parameter ε (so that its mean is αε), and \(N(\mu, \sigma^2)\) is a Normal distribution with mean μ and variance \(\sigma^2\).
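As a quick check on how diffuse these priors are, the following minimal sketch computes their implied moments. It assumes the second Gamma argument is a scale parameter, which is consistent with the prior mean of 1,000 quoted in Appendix C; the function name is illustrative only.

```python
import math

def gamma_moments(shape, scale):
    """Mean and variance of a Gamma distribution in the shape/scale form."""
    return shape * scale, shape * scale ** 2

# Prior on the precisions 1/sigma_u^2 and 1/sigma_v^2: Gamma(0.5, 1/0.0005).
# Assumption: the second argument is a scale parameter.
prec_mean, prec_var = gamma_moments(0.5, 1 / 0.0005)

# Prior on each regression coefficient: Normal with mean 0 and variance 1/0.00001.
beta_var = 1 / 0.00001
beta_sd = math.sqrt(beta_var)

print(f"precision prior: mean = {prec_mean:.0f}, variance = {prec_var:.0f}")
print(f"coefficient prior: variance = {beta_var:.0f}, sd = {beta_sd:.1f}")
```

Both priors are very wide relative to plausible values of the parameters, which is the usual intent of such vague prior choices.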

Appendix C

For each sprawl index, WINBUGS 1.4 (The BUGS Project 2004; Spiegelhalter et al. 2003) was used to run three independent Markov chains. First, for each of our two urban form indices (density and road accessibility), model (1) without the spatial autocorrelation and unstructured variation components (i.e., without the terms \(u_i\) and \(v_k\)) was run in PROC GENMOD in SAS (SAS Institute 1999b, pp. 1365–1464) to obtain maximum likelihood estimates of the coefficients. Initial values of the coefficients for the three chains were those estimates plus 4, 0 and −4 standard deviations, respectively. Initial values for \(1/\sigma_u^2\) and \(1/\sigma_v^2\) in the three chains were taken as 0.001, 1,000 (the mean of the prior distribution) and 7,000, respectively. Initial values for \(u_i\) and \(v_k\) were all set to 0.

Time series Gelman–Rubin diagnostic graphs and trace graphs were used to check convergence of relative risk estimates and parameter estimates for all three chains (Spiegelhalter et al. 2003). After 10,000 iterations the traces of the parameter estimates showed good mixing around a common value, with varying degrees of white noise around those values. Gelman–Rubin graphs were convergent and stable. Chains were also examined for \(1/\sigma_u^2\), \(1/\sigma_v^2\), \(\sigma_u\) and \(\sigma_v\); these chains also mixed well. Hence, after 10,000 iterations convergence of all parameters was concluded. An additional 30,000 iterations were then taken for each chain, but to reduce autocorrelation only every third iteration was kept for later calculations. As a result, a total of 30,000 iterations (10,000 from each chain) were used to obtain parameter estimates.
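The burn-in and thinning arithmetic can be sketched as follows. This is only an illustration of the counts described above; which iteration of each group of three was retained is an assumption (here the third), and the names are hypothetical.

```python
N_CHAINS = 3
BURN_IN = 10_000     # iterations discarded as burn-in
POST_BURN = 30_000   # additional iterations run per chain
THIN = 3             # keep every third iteration

def retained_iterations(burn_in, post_burn, thin):
    """1-based iteration numbers kept from one chain after burn-in and thinning."""
    # Assumption: the last iteration of each block of `thin` is the one kept.
    return list(range(burn_in + thin, burn_in + post_burn + 1, thin))

kept = retained_iterations(BURN_IN, POST_BURN, THIN)
total_kept = N_CHAINS * len(kept)

print(f"kept per chain: {len(kept)} (iterations {kept[0]} to {kept[-1]})")
print(f"kept in total:  {total_kept}")
```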

Convergence of parameter estimates was checked according to the “Checking convergence” section of the WINBUGS 1.4 manual (Spiegelhalter et al. 2003). The chains clearly overlapped one another, and parameter estimates looked stable. Posterior density graphs had the desired bell shape.

The Brooks and Gelman (1998) version of the Gelman–Rubin convergence statistic (GR), as given in WINBUGS 1.4, was used for iterations 5,001–10,000. Let X be the width of the middle 80% of the parameter estimates of all three chains pooled together, and let Y be the average of the widths of the middle 80% of the parameter estimates for each of the three chains individually. Then GR = X/Y. We want GR to converge to a value close to 1 and both X and Y to stabilize. Numerical values for X, Y and GR for later iterations are found in Table C1. The X and Y columns are numerically close to each other and the GR values are nearly equal to one.
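The ratio GR = X/Y defined above can be sketched in a few lines. This is a simplified illustration on simulated draws, not the WINBUGS implementation (which tracks the statistic over successive bins of iterations); the function names and the simulated chains are hypothetical.

```python
import random
import statistics

def width_80(samples):
    """Width of the central 80% interval (10th to 90th percentile)."""
    deciles = statistics.quantiles(samples, n=10, method="inclusive")
    return deciles[-1] - deciles[0]

def gelman_rubin_interval(chains):
    """Return (X, Y, GR) for a list of chains of draws of one parameter."""
    pooled = [draw for chain in chains for draw in chain]
    x = width_80(pooled)                                  # pooled width
    y = sum(width_80(c) for c in chains) / len(chains)    # mean within-chain width
    return x, y, x / y

rng = random.Random(0)
# Hypothetical draws: three chains sampling the same N(0, 1) target.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(5000)] for _ in range(3)]
# Hypothetical draws: three chains stuck around different values.
stuck = [[rng.gauss(mu, 1.0) for _ in range(5000)] for mu in (-4.0, 0.0, 4.0)]

for label, chains in (("mixed", mixed), ("stuck", stuck)):
    x, y, gr = gelman_rubin_interval(chains)
    print(f"{label}: X = {x:.3f}, Y = {y:.3f}, GR = {gr:.3f}")
```

When the chains explore the same distribution, the pooled and within-chain widths agree and GR is close to 1; when the chains sit in different regions, the pooled width is much larger than the within-chain widths and GR is well above 1.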

Table C1 Summary of Gelman–Rubin statistics for convergence

Finally, we calculated parameter estimates for a variety of prior distributions to determine whether said estimates were sensitive to the choice of the prior. We found the parameter estimates to be very similar for all our choices.

Appendix D

This discussion of CAR and ICAR follows much of Wakefield et al. (2001, p. 110ff). We first define the CAR model. Define \(N_n\left(\mathbf{0}_n,\ \sigma^{2}\Sigma\right)\) as an n-dimensional normal distribution with n × n positive definite (and hence invertible) correlation matrix Σ and parameter \(\sigma^2\), and let \(\mathbf{Q} = \Sigma^{-1}\) have elements \(Q_{id},\ i,d = 1,\ldots,n.\) The general CAR model can be written as (Besag and Kooperberg 1995)

$$U_i |\left({U_d =u_d,\;d\neq i} \right)\sim N\left(\sum\limits_{d=1}^n {M_{id} u_d},\sigma_u^2 V_{ii} \right),$$
(5)

where

  • \(M_{id} =\left\{\begin{array}{ll} \left. -Q_{id} \right/ Q_{ii} & \hbox{if}\;i\neq d \\ 0 & \hbox{if}\;i=d \\ \end{array} \right.,\)

  • \(\sigma_u^2\) is a measure of overall variance of the \(u_i\)'s and

  • \(V_{ii} = 1/Q_{ii}\).

For a complete derivation of the above relationships see Wakefield et al. (2001, pp. 124–125). Since Q is symmetric, \(M_{id} V_{dd} = M_{di} V_{ii}.\) In matrix form the correlation matrix Σ is:

$$\Sigma = \mathbf{Q}^{-1} = (\mathbf{I} - \mathbf{M})^{-1}\mathbf{V},$$
(6)

where:

  • V is a diagonal matrix with diagonal elements \(V_{ii},\ i = 1,\ldots,n\), and zeros elsewhere, and

  • M is the matrix of spatial weights \(M_{id}\).

If the matrix Q has an inverse, then the vector \(\mathbf{U} = \left[u_1,\ u_2, \ldots,\ u_n\right]^{T}\) has distribution \(\mathbf{U} \sim N_n\left(\mathbf{0}_n,\ \sigma_u^2 (\mathbf{I} - \mathbf{M})^{-1}\mathbf{V}\right).\)
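The covariance form \((\mathbf{I} - \mathbf{M})^{-1}\mathbf{V}\) follows directly from the definitions of \(M_{id}\) and \(V_{ii}\). A short derivation, using only those definitions, is:

$$Q_{ii} = \frac{1}{V_{ii}}, \qquad Q_{id} = -M_{id}\,Q_{ii} = -\frac{M_{id}}{V_{ii}} \quad (i \neq d), \qquad \hbox{so} \qquad \mathbf{Q} = \mathbf{V}^{-1}(\mathbf{I} - \mathbf{M}) \quad \hbox{and} \quad \mathbf{Q}^{-1} = (\mathbf{I} - \mathbf{M})^{-1}\mathbf{V}.$$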

To obtain the ICAR model (4) from the general CAR model of (5), set \(V_{ii} = 1/a_i\) and \(M_{id} = w_{id}/a_i\). Here Q does not have an inverse. Proof: Each row i of the matrix \(\mathbf{I} - \mathbf{M}\) has a solitary 1 on the diagonal, \(a_i\) elements with value \(-1/a_i\), and the remaining elements equal to zero. Since the elements of each row sum to zero, the columns of \(\mathbf{I} - \mathbf{M}\) are linearly dependent, so \(\mathbf{Q} = \mathbf{V}^{-1}(\mathbf{I} - \mathbf{M})\) has rank at most n − 1 < n; it is therefore not of full rank and is not invertible.
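The singularity argument can be checked numerically on a small example. The sketch below uses a hypothetical four-region map and assumes, as the proof suggests, that \(w_{id} = 1\) when regions i and d are neighbors (0 otherwise) and that \(a_i\) is the number of neighbors of region i; exact fractions are used so that the row sums are exactly zero.

```python
from fractions import Fraction

# Hypothetical 4-region map: region i -> list of neighbouring regions.
neighbours = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
n = len(neighbours)

a = [len(neighbours[i]) for i in range(n)]   # a_i = number of neighbours (assumed)
w = [[1 if d in neighbours[i] else 0 for d in range(n)] for i in range(n)]

# ICAR choices: V_ii = 1/a_i and M_id = w_id / a_i.
V = [Fraction(1, a[i]) for i in range(n)]
M = [[Fraction(w[i][d], a[i]) for d in range(n)] for i in range(n)]

# Q = V^{-1}(I - M); with these choices Q_ii = a_i and Q_id = -w_id.
Q = [[((1 if i == d else 0) - M[i][d]) / V[i] for d in range(n)] for i in range(n)]

row_sums = [sum(row) for row in Q]
symmetric = all(Q[i][d] == Q[d][i] for i in range(n) for d in range(n))

print("Q =", [[int(q) for q in row] for row in Q])
print("row sums:", [int(s) for s in row_sums])
print("symmetric:", symmetric)
```

Every row of Q sums to zero, so Q maps the vector of ones to the zero vector and cannot be inverted, even though Q is symmetric as the CAR model requires.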

Appendix E

We first define the DIC and then derive the DIC for the model \(F_1: \ln(\mu_k) = \ln(e_k) + u_i + v_k\), where i and k are defined as in (1). Recall that after 10,000 “burn-in” iterations to obtain convergence, we obtained 30,000 more iterations for the parameter estimates. The log likelihood LL is

$$LL = LL\left(\boldsymbol{Y}, \left\{\hat{\boldsymbol{Y}} | \varpi\right\}\right) = \sum_{k} \left(Y_k \log \hat{Y}_k - \hat{Y}_k - \log(Y_k!)\right),$$
(7)

where \(\boldsymbol{Y} = \{Y_k\}\) are the observed counts and \(\left\{\hat{\boldsymbol{Y}} | \varpi\right\} = \{\hat{Y}_k\}\) is the set of predicted counts derived from the set of parameter estimates \(\varpi\) (Dobson 2002, p. 76). Using (7), the DIC (Spiegelhalter et al. 2002, 2003) is:

$$\hbox{DIC} = 2\,\overline{-2LL\left(\boldsymbol{Y}, \left\{\hat{\boldsymbol{Y}} | \varpi\right\}\right)} - \left(-2LL\left(\boldsymbol{Y}, \left\{\hat{\boldsymbol{Y}} | \bar{\varpi}\right\}\right)\right),$$
(8)

where:

  • \(\overline{-2LL\left(\boldsymbol{Y}, \left\{\hat{\boldsymbol{Y}} | \varpi\right\}\right)}\) is the mean of the 30,000 individual iterations of \(-2LL\left(\boldsymbol{Y}, \left\{\hat{\boldsymbol{Y}} | \varpi\right\}\right)\) and

  • \(\bar{\varpi}\) is the posterior mean of the parameter estimates from the 30,000 iterations.

To calculate the DIC for model \(F_1\), follow these three steps. First, calculate:

$$\overline{-2LL\left(\boldsymbol{Y}, \left\{\hat{\boldsymbol{Y}} | \varpi\right\}\right)} = \frac{-2}{30000}\sum\limits_{l=1}^{30000}\ \sum\limits_{k=1}^{18822} \left(Y_k \log \hat{Y}_k^{(l)} - \hat{Y}_k^{(l)} - \log(Y_k!)\right),$$
(9)

where

  • l = the iteration number (\(l = 1, 2, \ldots, 30{,}000\), taken after the 10,000 iterations for convergence),

  • \(\hat{Y}_k^{(l)}\) = the expected number of deaths predicted via the lth iteration from model \(F_1\): \(\hat{Y}_k^{(l)} = e_k \cdot \exp\left(\hat{u}_i^{(l)} + \hat{v}_k^{(l)}\right)\),

  • \(\hat{u}_i^{(l)}\) = the spatial autocorrelation component predicted via the lth iteration,

  • \(\hat{v}_k^{(l)}\) = the unstructured variation component predicted via the lth iteration,

  • \(\varpi = \left\{\hat{u}_i^{(l)}, \hat{v}_k^{(l)}\right\}\), the set of parameter estimates for iteration l, and

  • \(\left\{\hat{\boldsymbol{Y}} | \varpi\right\}\) = the set of individual \(\hat{Y}_k^{(l)}\)s.

Second, calculate

$$-2LL\left(\boldsymbol{Y}, \left\{\hat{\boldsymbol{Y}} | \bar{\varpi}\right\}\right) = -2\sum\limits_{k=1}^{18822} \left(Y_k \log \hat{Y}_k^{\bullet} - \hat{Y}_k^{\bullet} - \log(Y_k!)\right),$$
(10)

where

  • \(\hat{Y}_k^{\bullet}\) = the estimated number of deaths using \(\bar{\varpi}\): \(\hat{Y}_k^{\bullet} = e_k \cdot \exp\left(\bar{\hat{u}}_i + \bar{\hat{v}}_k\right)\),

  • \(\bar{\hat{u}}_i\) = the mean of the \(\hat{u}_i^{(l)}\)'s, \(\bar{\hat{u}}_i = \sum\limits_{l=1}^{30000} \hat{u}_i^{(l)} / 30000\),

  • \(\bar{\hat{v}}_k\) = the mean of the \(\hat{v}_k^{(l)}\)'s, \(\bar{\hat{v}}_k = \sum\limits_{l=1}^{30000} \hat{v}_k^{(l)} / 30000\),

  • \(\bar{\varpi} = \left\{\bar{\hat{u}}_i, \bar{\hat{v}}_k\right\}\), the set of means of the 30,000 parameter estimates, and

  • \(\left\{\hat{\boldsymbol{Y}} | \bar{\varpi}\right\}\) = the set of individual \(\hat{Y}_k^{\bullet}\)s.

Finally, substitute the results of Eqs. 9 and 10 into Eq. 8 to find the DIC.
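The three steps can be sketched in code as follows. This is a minimal illustration on made-up data, not the calculation used for the study: the toy counts, the simulated "posterior draws", the `county` lookup that maps observation k to its county i, and all function names are hypothetical, and the log likelihood is the standard Poisson form with a \(\log(Y_k!)\) term.

```python
import math
import random

def poisson_loglik(y, mu):
    """Poisson log-likelihood: sum of y*log(mu) - mu - log(y!)."""
    return sum(yk * math.log(mk) - mk - math.lgamma(yk + 1) for yk, mk in zip(y, mu))

def fitted(e, county, u, v):
    """Fitted counts e_k * exp(u_i + v_k), where county[k] gives i for observation k."""
    return [e[k] * math.exp(u[county[k]] + v[k]) for k in range(len(e))]

def dic(y, e, county, u_draws, v_draws):
    """Return (mean of -2LL, -2LL at the posterior means, DIC)."""
    n_draws = len(u_draws)
    # Step 1: average -2*LL over the retained iterations.
    mean_dev = sum(-2 * poisson_loglik(y, fitted(e, county, u, v))
                   for u, v in zip(u_draws, v_draws)) / n_draws
    # Step 2: -2*LL evaluated at the posterior means of u and v.
    u_bar = [sum(col) / n_draws for col in zip(*u_draws)]
    v_bar = [sum(col) / n_draws for col in zip(*v_draws)]
    dev_at_mean = -2 * poisson_loglik(y, fitted(e, county, u_bar, v_bar))
    # Step 3: DIC = 2 * (mean of -2LL) - (-2LL at the posterior means).
    return mean_dev, dev_at_mean, 2 * mean_dev - dev_at_mean

# Hypothetical toy data: 4 observations in 2 counties, 200 fake posterior draws.
rng = random.Random(1)
y = [3, 5, 2, 7]
e = [2.5, 4.0, 3.0, 6.0]
county = [0, 0, 1, 1]
u_draws = [[rng.gauss(0.0, 0.2) for _ in range(2)] for _ in range(200)]
v_draws = [[rng.gauss(0.0, 0.2) for _ in range(4)] for _ in range(200)]

mean_dev, dev_at_mean, dic_value = dic(y, e, county, u_draws, v_draws)
print(f"mean of -2LL          = {mean_dev:.3f}")
print(f"-2LL at posterior mean = {dev_at_mean:.3f}")
print(f"DIC                    = {dic_value:.3f}")
```

The difference between the first two printed quantities is the effective number of parameters in the sense of Spiegelhalter et al. (2002), so the DIC can equivalently be read as the mean of −2LL plus that difference.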

Appendix F

SAR models are similar to the AR(1) model in time series. Note that in our model (1) we set \(\varepsilon_k = u_i + v_k\) with structured variation \(u_i\) independent of unstructured variation \(v_k\). As this is not possible with the SAR model, we default to the error term \(\varepsilon_k\).

For our conditions (1) we can write a SAR model as:

$$\left\{\begin{aligned} \mathbf{Y} &\sim \hbox{Poisson}\left(\boldsymbol{\lambda}\right) \\ \ln(\boldsymbol{\lambda}) &= \boldsymbol{\alpha} + \ln(\mathbf{e}) + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \\ \boldsymbol{\varepsilon} &= \rho\,\mathbf{S}\boldsymbol{\varepsilon} + \boldsymbol{\eta} \end{aligned}\right.,$$
(11)

where

  • \(\mathbf{Y}\) is the vector of observed mortality counts as per (1),

  • \(\boldsymbol{\lambda} = [\lambda_1 \ldots \lambda_{18822}]^{T}\), with \(\lambda_k = e_k \cdot \exp\left(\alpha + \mathbf{x}_k^{T}\boldsymbol{\beta} + \varepsilon_k\right)\) for each k, where α is a constant and the remaining variables are defined as per (1),

  • \(\boldsymbol{\varepsilon} = [\varepsilon_1 \ldots \varepsilon_{18822}]^{T}\) is the vector of regression error terms,

  • ρ is a measure of spatial correlation (−1 ≤ ρ ≤ 1),

  • \(\mathbf{S}\) is an 18,822 × 18,822 neighborhood (spatial weighting) matrix with zeroes on the diagonal (not necessarily symmetric), standardized so that each row sums to one, and

  • \(\boldsymbol{\eta} = [\eta_1 \ldots \eta_{18822}]^{T}\), with \(\eta_k \sim N(0, \sigma^{2}).\)

It follows from (11) that \(\boldsymbol{\varepsilon} = (\mathbf{I} - \rho\mathbf{S})^{-1}\boldsymbol{\eta}\), and so the covariance matrix Σ for the SAR model is \(\Sigma = \sigma^{2}(\mathbf{I} - \rho\mathbf{S})^{-1}(\mathbf{I} - \rho\mathbf{S}^{T})^{-1}.\)

If desired, SAR can also use nearest neighbor weighting: set \(s_{id} = w_{id}/a_i\), with \(w_{id}\) and \(a_i\) defined as in (4).
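The simultaneous definition of the errors can be illustrated on a small example. The sketch below builds a row-standardized nearest-neighbor matrix for a hypothetical four-region map (assuming \(w_{id} = 1\) for neighbors and \(a_i\) equal to the number of neighbors) and solves \(\boldsymbol{\varepsilon} = \rho\mathbf{S}\boldsymbol{\varepsilon} + \boldsymbol{\eta}\) by fixed-point iteration rather than by forming \((\mathbf{I} - \rho\mathbf{S})^{-1}\) explicitly; the values of ρ and σ are illustrative only.

```python
import random

# Hypothetical 4-region map: region i -> list of neighbouring regions.
neighbours = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
n = len(neighbours)

# Row-standardised nearest-neighbour weights: s_id = w_id / a_i.
S = [[(1.0 / len(neighbours[i]) if d in neighbours[i] else 0.0) for d in range(n)]
     for i in range(n)]

def sar_errors(S, rho, eta, tol=1e-12, max_iter=10_000):
    """Solve eps = rho * S * eps + eta by fixed-point iteration (needs |rho| < 1)."""
    eps = list(eta)
    for _ in range(max_iter):
        new = [rho * sum(S[i][d] * eps[d] for d in range(len(eta))) + eta[i]
               for i in range(len(eta))]
        if max(abs(p - q) for p, q in zip(new, eps)) < tol:
            return new
        eps = new
    raise RuntimeError("fixed-point iteration did not converge")

rng = random.Random(2)
rho, sigma = 0.6, 1.0                      # illustrative values only
eta = [rng.gauss(0.0, sigma) for _ in range(n)]
eps = sar_errors(S, rho, eta)

# Residual of the defining equation; should be numerically zero.
residual = [eps[i] - rho * sum(S[i][d] * eps[d] for d in range(n)) - eta[i]
            for i in range(n)]
print("row sums of S:", [sum(row) for row in S])
print("max |residual|:", max(abs(r) for r in residual))
```

The iteration converges here because each row of S sums to one and |ρ| < 1, so repeated application of ρS shrinks any vector; each error term ends up depending on the independent shocks of all connected regions, which is what "simultaneous" refers to.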

Briefly comparing and contrasting the CAR and SAR models (Arab et al. 2007; Cressie 1993; Whittle 1954; Bao n.d.):

  • CAR sets specifications for the \(Y_k\)'s conditionally, while SAR does so simultaneously.

  • Spatial weighting matrices do not have to be symmetric in the SAR model but do in the CAR model.

  • A SAR model can always be restated as a CAR model, but not vice versa.

  • The SAR model and the CAR model are the same if and only if the covariance matrices are the same.

  • The CAR model is more computationally efficient than the SAR model because the matrix \(\mathbf{I} - \mathbf{M}\) in the CAR model is symmetric while the matrix \(\mathbf{I} - \rho\mathbf{S}\) in the SAR model is not.

  • In some cases the spatial weights in the SAR model may not be identifiable.

  • Parameter estimates for the SAR model are not statistically consistent. That is, for increasing sample sizes the parameter estimates may not converge with high probability to the actual parameter.

  • CAR gives the best (that is, the minimum mean squared prediction error) estimates of \(Y_k\) based on all the other \(Y_l\)'s, \(l \neq k.\)

  • If it makes more sense to specify the model conditionally, or if there is a symmetric structure in the correlation matrix, use a CAR model.


Cite this article

Shoultz, G., Givens, J. & Drane, J.W. Urban Form, Heart Disease, and Geography: A Case Study in Composite Index Formation and Bayesian Spatial Modeling. Popul Res Policy Rev 26, 661–685 (2007). https://doi.org/10.1007/s11113-007-9049-2
