An Open Source Spatiotemporal Model for Simulating Obesity Prevalence

Part of the Advances in Geographic Information Science book series (AGIS)


Obesity may be the single most challenging example for a condition with causes and consequences at multiple levels and with multiple feedback loops among influencing factors. New approaches to modeling obesity prevalence are needed to fully understand the complexities associated with the relationship between obesity and the demographic, socio-economic and environmental factors.

We describe in this paper a computer simulation project that focuses on the causes of obesity-related health disparities. In particular, our project adopts the susceptible, infected, and recovered (SIR) framework and the categorization of population into normal, overweight, obese, and extremely obese subpopulations. This project is important to public health because the fully developed computer application provides a new, more comprehensive, decision support tool for policy makers than most existing applications. The implementation of policies that effectively combat obesity would improve the health and well-being of a high percentage of the population, including both adults and children. It will also greatly reduce associated economic costs to society such as health care expenses and loss of productivity.

Because it is written in an open source language, our computer application is entirely cross-platform, which lowers the cost of sharing it in research and education. Free access to the source code allows a broader community to incorporate additional advances and to generate research questions for specific goals, thus facilitating collaboration across disciplines.


Keywords: Open source; Spatiotemporal model; Obesity


Obesity is an exceedingly complex public health problem with hypothesized causes at multiple interacting levels that are embedded in the very structure of society [1, 2]. This complexity appears to be the reason that most one-dimensional preventive or therapeutic interventions have not been very successful. For example, the Foresight causal map prepared by UK Government Office illustrates the inherent complexity of obesity as a public health problem [3]. The Foresight map was built around energy balance and mammalian physiology, but the model rapidly expanded to include individual and collective physical activity, the built environment, individual and collective psychology, industrial food production, and population food consumption. Even with the expanded list of variables, obesogenic policy determinants of the relevant environments were excluded which seems to limit the validity of that approach. Obesity, per se, is only a small part of a larger public health problem that includes obesogenic policy, environments, and population characteristics. These population characteristics include unhealthy dietary habits and sedentary behavior, a high prevalence of obesity, high obesity-related morbidity and mortality, and high rates of diabetes or cardiovascular diseases among historically disadvantaged groups. Thus the obesity problem includes long-standing area disparities in health. Addressing these disparities, their spatio-temporal components, and their determinants requires new approaches.

Obesity prevalence has been predicted by using statistical models and simple dynamic models. However, these models predicted only the size of the obese population as a whole, without further dividing the population into various levels of obesity [4]. Such models over-generalized the movements of subpopulations between different levels of obesity. In addition, the simple models in the current literature (e.g., [5, 6]) often model future trends of the obese population at a geographic scale that is too coarse to be useful in revealing area disparities. Finally, in order to accommodate the statistical and simple dynamic modeling structure, most models omit important factors, such as the death rates and birth rates of the population, and, more importantly, they lump the normal weight, overweight, obese, and extremely obese subpopulations into one.

As such, the results of statistical analysis and predictions have limited practical use in assisting policy-making process by public health districts when designing and implementing more geographically- and temporally-focused intervention programs. Auchincloss and Roux [7] pointed out the weaknesses of traditional epidemiologic approaches when dealing with complex multilevel data with spatio-temporal components. They noted that traditional regression-based approaches to analyzing multi-level exposures and health disparities are limited by a variety of assumptions. These assumptions include the requirements that realizations of each independent variable do not influence one another, and that there are no feedback loops to address the interactions among variables. These requirements do not fit well with the complex realities of obesogenic policy, environments, and population characteristics where dependencies and feedback loops are common.

Obesity may be the single most challenging example for a condition with causes and consequences at multiple levels and with multiple feedback loops among the causes. New approaches are obviously needed. The principal research question of our work is: can we develop a prototype for a comprehensive simulation mechanism for estimating obesity prevalence and obesity-related disease or disparities that (1) addresses obesogenic policy, environments, and population characteristics; and (2) is calibrated against obesity-related morbidity and mortality?

Obesity studies have been, and continue to be, challenged by the temporal trends of geographic patterns and the spatial dynamics of health development. There is an imperative need for effective and efficient methods to represent and examine the coupled space-time attributes of obesity phenomena in a comparative context. Because obesity is a multi-dimensional and multi-scale phenomenon, obesity studies highlight the role of geography and the growing emphasis on space among public health practitioners. As discussed above, a space-time perspective has become increasingly relevant to our understanding of public health dynamics. To this end, we argue that an open source solution is needed to systematically integrate space and time so as to share and promote any advances in this direction. Though rich conceptual frameworks have highlighted the complexity of obesity dynamics, the gap between empirical studies and theories has been widening. Hence, the most crucial step is to systematically understand obesity dynamics data in their theoretical and policy context. Thus, the availability of code and tools to support space-time data analysis is vital to the adoption of such a perspective in obesity studies.

An Open Source Approach to Obesity Simulations

The prevalence of obesity among adults and children in the United States has increased dramatically in recent decades [8]. This is a public health issue because obesity causes many other chronic health conditions, such as hypertension, cardiovascular disease, and type II diabetes. Increasing obesity prevalence in a region affects the life expectancy and quality of life of its residents. It also increases social costs in many ways.

The basic cause of obesity is the imbalance between the amount of energy taken in through eating and drinking and the amount of energy expended through metabolism and physical activity [9]. To offset excessive energy intake, increased physical activity is encouraged as a way to keep energy in balance. However, energy imbalances appear to be encouraged by features of the physical, social, and economic environments. Lee et al. [10] found that the density of fitness centers and non-fresh food outlets are related to the prevalence of obesity, and that an analysis of smaller geographic units provides more details regarding area disparities in health than analyses carried out with larger geographic units.

Most of the obesity studies that have looked at the food environment have concentrated on the hypothesized effect of non-fresh food (fast-food, packaged food, pre-processed food, etc.) consumption on people’s diet and public health. With today’s fast-paced life styles and intensive marketing of various types, non-fresh food outlets have become an important part in people’s daily diet because of convenience, price, distance and other cultural factors [11]. The literature in this area suggests a positive correlation between regularly consuming non-fresh food and the prevalence of obesity unless daily physical activities are performed on a regular basis [12]. Positive correlation means, the more frequently one eats from non-fresh food outlets over time, the higher are the chances of being obese [13].

A study on non-fresh food consumption and obesity among Michigan adults suggested that regular fast food consumption was higher among younger adults and men [8]. In that study, the prevalence of obesity increased consistently with frequenting non-fresh food outlets, from 24% of those going less than once a week to 33% of those going three or more times per week. The predominant reason for choosing fast food was convenience. Another study found that youths 11–18 years old ate at non-fresh food outlets an average of twice per week [14], which also points to the alarming possibility of increasing obesity rates among young people.

Non-fresh food consumption has been found to be highly correlated with the prevalence of obesity. Reasons that may affect the consumption of non-fresh food are the price of the food, the walking or driving distance, and various cultural, behavioral, or environmental factors [8, 15]. In addition, marketing campaigns of non-fresh food outlets could play a significant role in the consumption of unhealthy food [9]. If marketed well, non-fresh food outlets can attract a significant number of customers, which can later lead to increases of overweight and obese people. Most often non-fresh food outlets are unhealthy because of the way foods are cooked and the high calories per “serving”. The increased supply of non-fresh food outlets has a significant impact on obesity. Frequently eating at non-fresh food outlets is becoming an important issue in the public health literature because of the apparent health effects.

Physical activity and the distribution of fitness centers can have a significant impact on the prevalence of obesity if exercise is taken regularly [16]. Over the last few years, there have been studies focused on the relationship between the built environment and physical activity [16]. However, there were no other studies besides Lee et al. [10] that examine the relationship between distances from fitness centers and obesity rates by using small geographical units such as tracts or block group. The proximity of fitness centers could change the prevalence of overweight and obesity in some neighborhoods. A relevant study in New Zealand neighborhoods found evidence of a relationship between beach access and body mass index (BMI) and physical activities [17]. Several other studies reported a positive association between the recreational environment and physical activity for both adults and children [18, 19]. Going to recreational centers regularly increased physical activity; therefore, lower rates of obesity and overweight can be expected in neighborhoods with sufficient access to fitness centers. Mobley et al. [20] found there is a lower average BMI in areas with more fitness centers. In addition, Boehmer et al. [21] reported that having fewer fitness centers within close proximity was associated with higher likelihood of obesity among women but not men.

Furthermore, being obese was found to be significantly associated with perceived absence of sidewalks, unpleasant communities, lack of interesting sites, and presence of garbage [21]. Several studies show that people tend to visit fitness centers more frequently when the distance between home and facilities decreases [22]. For long-term health benefits, people should focus on improving fitness by increasing physical activity rather than relying only on diet for weight control [23]. It should be noted, however, that although going to fitness centers may be a critical behavior, multiple factors may discourage or encourage it, such as the price of membership, the geographical distance, and the time required to find a parking space.

Our review of the literature in obesity suggests that a comprehensive computation model of obesity-related disparities with extensive calibration is possible. Some basic components of the model have been developed, but key components of a comprehensive model have been omitted from prior work. Calibration is also insufficient. As far as we know, no one has developed a comprehensive model of obesity and related area disparities with extensive calibration against obesity-related morbidity and mortality. Our innovative project has scientific merit because of the breadth of the proposed model and the possible calibration of the simulation against hard outcomes including obesity-related morbidity and mortality. A strength of our approach is that it may be possible to use a multi-year sample of geocoded individual inpatient discharge data from all hospitals in a representative urban-suburban county (such as Summit County, Ohio) where the simulation will be anchored as well as a corresponding sample of geocoded death certificates, US Census data, and geocoded environmental data from Summit County Public Health, the Ohio Department of Health, and other sources. Use of real world geocoded individual health outcome data in this research project will provide more robust tests of a given modeling strategy in nearly all circumstances.

Various attempts at obesity simulation have been discussed in the obesity literature. In their review of obesity simulations, Levy et al. [24] list two agent-based models (ABM) and seven Markov models. Burke and Heiland’s ABM [25] looks at the obesity epidemic in terms of food prices and social norms, while the Hammond and Epstein [26] ABM looks at obesity in terms of the physiology of dieting and socially influenced weight changes. More recently, Auchincloss et al. [27] model residential segregation, income disparities, and diet quality, while Yang et al. [28] model disparities and walking behaviors in an urban setting. While these obesity simulations achieved the objective of estimating obesity prevalence in some ways, they all fell short of allowing a more detailed classification of the population (e.g., grouping populations into normal/overweight/obese/extremely obese) and of allowing movements between subpopulations. Furthermore, the geographic units of these simulations are mostly too large to be of practical use in assisting policy-making processes for intervention programs.

Overall, many of the analyses we reviewed showed that obesity ratios are indeed affected by educational attainment, income level, and unemployment level (see reviews in [10]). In addition, obesity ratios also show the expected relationships with densities of fitness centers and non-fresh food outlets. While such relationships are all statistically significant, it is important to explore in more detail where inside the county we can expect such relationships to be stronger or weaker. With this knowledge, area disparities in health can be incorporated when making policies on how to promote health and when allocating funding to different areas in the county, for more effective outcomes at the neighborhood level.

In terms of implementing a software tool for simulating obesity prevalence, we argue that both space and time are critical components in such simulations. Spatial turn in many socioeconomic theories has been noted in many disciplines, encompassing both social and physical phenomena [29, 30, 31]. This intellectual and technological change has yielded important insights on physical sciences, social sciences and the humanities, with an explosion of interest across disciplines [32]. During the past several decades, a number of efforts have been witnessed on the development and implementation of spatial statistical analysis packages, which continues to be an active area of research [33]. Meanwhile, spatial public health analysis is increasingly being supported by the emergence of advanced analytical methods in space-time data analysis and data visualization. The interactive spatial data analysis has motivated, if not directly provoked, new queries on spatial public health theories. Therefore, the current research implements the new methodological advances in an open source environment for exploring data that has both temporal and spatial dimensions, which lend support to the notion that space and time cannot be meaningfully separated.

The fast growth of spatial public health analysis is increasingly seen as attributable to the availability of spatio-temporal datasets. By contrast, most public health geographers have been slow to adopt and implement new spatially explicit methods of data analysis due to the lack of extensible software packages, which becomes a major impediment to promoting spatial thinking in public health studies.

ABM is not new to public health inequality studies, whereas an open source solution would give better support for the scientific investigation and management of data sets, including its description, representation, analysis, visualization, and simulation. Additionally, comparative space-time analysis enables access to a much wider thinking that addresses the role of space at different stages and thus identifies the research gaps and opportunities for more in-depth study.

Obesity Prevalence Simulator: A Case Study of Summit County, Ohio

Timely and rigorous analysis of obesity will open up a rich empirical context for the social sciences and policy interventions. The Obesity Prevalence Simulator (ObPSim) was developed in the Python programming language with funding provided by the Summit County Public Health District of Summit County, Ohio. Python is a versatile language that is free to acquire, install, and use. Python is also a cross-platform programming language, which means that a Python script written on one operating system can be run on other operating systems. In addition, many libraries that process GIS and other forms of data have been developed and are freely available to the public. This allows existing code to be improved and updated easily. The open source environment offers a straightforward way of benefiting the wider community.

While Lee et al. [10] used Summit County, Ohio as a case study because of the availability of key data and the project’s funding, their findings may be applicable to many other geographic locations since demographic and socio-economic profiles in this area are very close to the national average in the US.

The objective of the study reported here is to model known multiple parameters associated with changes in body mass index (BMI) classes and to establish conditions under which obesity prevalence will plateau. Following Thomas et al. [4], a differential equation system is adopted that predicts population-wide obesity prevalence trends. The equation system is complex but very logical and practical. Interested readers can find the equation set in Thomas et al. [4].

The model considers both social and non-social influences on weight gain, incorporates other known parameters affecting obesity trends, and allows for country specific population growth. With 2011 data from American Community Survey (Census Bureau, 2011) and the 2008–2013 BMI data from the Bureau of Motor Vehicles, Summit County has 452 census block groups with a wide spectrum of obesity ratios (ranging from 16 per 1000 population to 549 per 1000 population) and overweight ratios (ranging from 32 per 1000 population to 541 per 1000 population).

As can be seen in the two maps in Fig. 1, (1) the obese population, though still present in lower ratios than the overweight population, does seem to show a geographic clustering pattern in the county, (2) the overweight population prevails in most of the county with the exception of only a few census block groups, and (3) the use of census block groups as the unit for geographic analysis indeed reveals more detail of how obesity prevails in the county than using the entire county as an analytic unit.
Fig. 1

Obesity and overweight ratios in Summit County, Ohio based on driver licenses. Data sources: BMI data from Ohio Bureau of Motor Vehicles, 2008–2013; population data from American Community Survey of the US Census Bureau, 2011

We adopted the concept of the susceptible, infected, and recovered (SIR) framework to divide a population into subpopulations categorized as normal weight, overweight, obese, and extremely obese by BMI data. To estimate the population moving between these categories, we use a simulation approach that allows analysts to specify the ratios at which subpopulations move between categories. The relationships and potential movements between subpopulations are shown in the diagram in Fig. 2 below:
Fig. 2

The susceptible, infected, and recovered (SIR) framework for obesity prevalence simulation

In each neighborhood (i.e., census block group in this project), population is categorized into six (6) subpopulations:
  • Normal weight (S_T),

  • Overweight (1_T),

  • Obese (2_T),

  • Extremely Obese (3_T),

  • Exposed (E_T, or S_T ➔ 1_T), and

  • Recovered (R_T, or 1_T ➔ S_T).

The ratios that define how subpopulations move in between categories are
  • α1 (1_T ➔ 2_T),

  • α2 (2_T ➔ 3_T),

  • β1 (3_T ➔ 2_T),

  • β2 (2_T ➔ 1_T),

  • ϒ1 (S_T ➔ 1_T), and

  • ϒ2 (1_T ➔ S_T).
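The role of these six ratios can be illustrated with a deliberately simplified one-year update. The sketch below is not the differential equation system of Thomas et al. [4]: it assumes that each flow is simply a ratio multiplied by the size of the source subpopulation, and it ignores births, deaths, and social influence. The function name `step` and the dictionary keys are hypothetical (`gamma1` and `gamma2` stand for ϒ1 and ϒ2).

```python
def step(pop, ratios):
    """Advance four weight classes by one year using simple linear flows.

    pop:    dict with keys "S_T", "1_T", "2_T", "3_T" (people per class)
    ratios: dict with keys "alpha1", "alpha2", "beta1", "beta2",
            "gamma1", "gamma2" (share of the source class moving per year)
    """
    s, o1, o2, o3 = pop["S_T"], pop["1_T"], pop["2_T"], pop["3_T"]

    s_to_1 = ratios["gamma1"] * s     # normal weight -> overweight
    one_to_s = ratios["gamma2"] * o1  # overweight -> normal weight
    one_to_2 = ratios["alpha1"] * o1  # overweight -> obese
    two_to_3 = ratios["alpha2"] * o2  # obese -> extremely obese
    three_to_2 = ratios["beta1"] * o3 # extremely obese -> obese
    two_to_1 = ratios["beta2"] * o2   # obese -> overweight

    # Every outflow from one class is an inflow to a neighboring class,
    # so the total population is conserved in this simplified version.
    return {
        "S_T": s - s_to_1 + one_to_s,
        "1_T": o1 + s_to_1 - one_to_s - one_to_2 + two_to_1,
        "2_T": o2 + one_to_2 - two_to_1 - two_to_3 + three_to_2,
        "3_T": o3 + two_to_3 - three_to_2,
    }
```

Calling `step` repeatedly on its own output gives a year-by-year trajectory for one neighborhood under these assumptions.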

Following Thomas et al. [4]:
  • Total population at time0 (TotalPopulation) = S_T + 1_T + 2_T + 3_T + E_T + R_T

  • The exposed subpopulation (E_T) are individuals who are exposed to either social or non-social influences that lead to weight gain and these individuals will eventually become overweight.

  • The subpopulation (R_T) are individuals who have weight loss under social or non-social influences.

  • Social interactions between compartments are governed by the law of mass action and modeled by multiplying the population numbers in each class.

  • Estimated subpopulations at time1 can be derived as solutions for α1, α2, β1, β2, ϒ1, and ϒ2 from a set of differential equations as proved in Thomas et al. [4].

  • For the purpose of modeling and simulations, initial values for model parameters are estimated from publications in the obesity literature:

  • The probability of being born in obesogenic environment is set to be 0.55 of females of reproductive age who are overweight or obese, based on Balcan et al. [34].

  • Birth rate is set to be 0.0144, based on Jacobson et al. (2007).

  • Baseline prevalence rates are set to be 0.32 for overweight, 0.22 for obese, and 0.03 for extremely obese, based on Flegal et al. [35].

  • Social influence by overweight and obese are set to be 0.4 for overweight subpopulation and 0.2 for obese subpopulations, both are based on fitting to initial trends as discussed in Flegal et al. [35].

  • Spontaneous rate of weight gain to each class are set to be: exposed (0.05), overweight (0.14), obese (0.08), and extremely obese (0.014), also based on Flegal et al. [35].

  • Rate of weight loss to each class are set to be: extremely obese to obese (0.05), obese to overweight (0.03), and overweight to normal weight (0.033), also based on Flegal et al. [35].

  • Rate of weight regainers transitioning from normal weight to overweight is set to be 0.04, also based on Flegal et al. [35].

  • Death rate of obese and extremely obese populations is set to vary between 16.5 and 22 per 1000 population as suggested by Oizumi [36].
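The initial values listed above can be gathered in one place, for instance as a small data class. This is only a sketch: the class name, the field names, and the helper `mass_action_flow` are hypothetical and are not taken from the ObPSim source. The helper illustrates the mass-action idea as a product of class shares; whether counts or shares are multiplied in the actual model is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class ModelParameters:
    """Initial values quoted in the text; field names are hypothetical."""
    p_obesogenic_birth: float = 0.55
    birth_rate: float = 0.0144
    prevalence_overweight: float = 0.32
    prevalence_obese: float = 0.22
    prevalence_extremely_obese: float = 0.03
    social_influence_overweight: float = 0.4
    social_influence_obese: float = 0.2
    gain_exposed: float = 0.05
    gain_overweight: float = 0.14
    gain_obese: float = 0.08
    gain_extremely_obese: float = 0.014
    loss_extremely_obese_to_obese: float = 0.05
    loss_obese_to_overweight: float = 0.03
    loss_overweight_to_normal: float = 0.033
    regain_normal_to_overweight: float = 0.04
    # The text gives a range of 16.5 to 22 deaths per 1000 population.
    death_rate_min: float = 16.5 / 1000
    death_rate_max: float = 22 / 1000


def mass_action_flow(rate, susceptible_share, influencer_share):
    """Social-contact flow as the product of two class shares and a rate."""
    return rate * susceptible_share * influencer_share
```

Any field can be overridden for a given run, for example `ModelParameters(social_influence_overweight=0.5)`, which leaves all other values at their defaults.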

ObPSim comes with a sample data file in shapefile format (ESRI, Inc., Redlands, California). Users of the ObPSim can use it to work with any customized shapefile data. The only requirement for the shapefiles is to have the following columns in the attribute table:
  • S_T: the number of people in each neighborhood who are in normal weight range (BMI <= 25)

  • 1_T: the number of people in each neighborhood who are considered overweight (25 < BMI <= 30)

  • 2_T: the number of people in each neighborhood who are considered obese (30 < BMI <= 40)

  • 3_T: the number of people in each neighborhood who are considered extremely obese (BMI > 40)

  • E_T: the number of people in each neighborhood who are exposed to possibility of changing from normal weight to overweight

  • R_T: the number of people in each neighborhood who may have weight loss so to return from overweight to normal weight.
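The column requirement and the BMI ranges can be sketched in a few lines of plain Python. The function names are hypothetical, and the sketch assumes that the overweight range begins where the normal weight range ends (BMI above 25) and that BMI is computed in the usual way as weight in kilograms divided by the square of height in meters.

```python
REQUIRED_COLUMNS = ("S_T", "1_T", "2_T", "3_T", "E_T", "R_T")


def missing_columns(columns):
    """Return the required attribute columns that are absent, in order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def weight_class(bmi_value):
    """Map a BMI value to the column that counts that person."""
    if bmi_value <= 25:
        return "S_T"   # normal weight
    if bmi_value <= 30:
        return "1_T"   # overweight
    if bmi_value <= 40:
        return "2_T"   # obese
    return "3_T"       # extremely obese
```

A shapefile whose attribute table yields an empty list from `missing_columns` has the columns that the simulator expects; E_T and R_T are estimated separately rather than derived from BMI.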

In the obesity prevalence folder of the sample data set, a shapefile subfolder holds a set of shapefiles, entitled SummitBG. This can be used to test run the Obesity Prevalence Simulator. Please note that the boundary data for block group polygons were downloaded from Data for the S_T, 1_T, 2_T, and 3_T subpopulations were calculated using height/weight data derived from drivers’ license data from the Ohio Bureau of Motor Vehicles. E_T and R_T data were derived from geographically weighted regression of the following relationships:
$$ E\_T = \mathrm{function}\left(S\_T,\ \text{density of non-fresh food outlets}\right) $$
$$ R\_T = \mathrm{function}\left(1\_T,\ \text{distance to nearest fitness centers}\right) $$

It should be noted that estimations for E_T and R_T with the above regression are provided here purely for the purpose of demonstrating the usage of ObPSim. Additional studies and analysis may be needed in order to derive better or more precise estimates.

The estimates for E_T and R_T should be done so each neighborhood has its own estimates. The examples included in the sample shapefile were derived using the relationships
  • between S_T and the density of non-fresh food outlets in each neighborhood for estimating E_T and

  • between 1_T and the distance to the nearest fitness centers from the neighborhood center for estimating R_T.
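One way to read these two relationships, assuming a linear geographically weighted regression (the source does not state the functional form), is with coefficients that vary by neighborhood i. The symbols a, b, D, and d are introduced here for illustration only: D is the density of non-fresh food outlets and d is the distance from the neighborhood center to the nearest fitness center.

$$ E\_T_{i} = a_{0}(i) + a_{1}(i)\, S\_T_{i} + a_{2}(i)\, D_{i} + \varepsilon_{i} $$
$$ R\_T_{i} = b_{0}(i) + b_{1}(i)\, 1\_T_{i} + b_{2}(i)\, d_{i} + \eta_{i} $$

Because the coefficients are estimated locally, each neighborhood receives its own fitted E_T and R_T, which is what the sample shapefile requires.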

A simulation control panel, entitled Simulation, shows the simulated year, the model parameters, and the Update button (Fig. 3).

Please note that the parameters in Fig. 3 are set to their initial (default) values, which can be changed in simulation runs. Parameters such as birth rates and death rates are assumed to be the same across the entire county, because a county is a small geographic area and no such data were available for any geographic units inside the county. Other parameters may be formulated such that local conditions (i.e., unique parametric values for census block groups) can be reflected by different values describing each neighborhood’s unique characteristics.
Fig. 3

Control panel for simulation parameters in Obesity Prevalence Simulator

Needless to say, any of the parameter values in this model can be changed to reflect the conditions of the simulated area. Essentially, we implemented the model described by Thomas et al. [4] for each neighborhood (census block groups) in Summit County. We developed ObPSim by using years as the temporal unit of analysis. The modeling process as described in Thomas et al. [4] was repeated for each neighborhood. With this approach, ObPSim allows users to
  • Observe the spatial distribution of obesity prevalence at any given year.

  • Observe the changes in each neighborhood’s obesity prevalence over time.

  • Observe the spatio-temporal patterns by neighborhoods by changing one or more parameter values.

  • Generate an output file for each round of simulation.
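The per-neighborhood, per-year structure described above can be sketched as two nested loops. This is an illustration under stated assumptions, not the ObPSim implementation: `simulate` and `write_output` are hypothetical names, the yearly update is passed in as a function `step_fn`, and CSV is assumed as the output format because the text does not specify one.

```python
import csv
import io


def simulate(neighborhoods, step_fn, start_year, n_years):
    """Apply step_fn year by year to every neighborhood.

    neighborhoods: dict mapping a neighborhood id to its state dict
    step_fn:       function taking a state dict and returning the next one
    Returns one row per neighborhood and year, baseline year included.
    """
    rows = []
    for name, state in neighborhoods.items():
        current = dict(state)  # copy so the input data stay untouched
        rows.append({"neighborhood": name, "year": start_year, **current})
        for offset in range(1, n_years + 1):
            current = step_fn(current)
            rows.append({"neighborhood": name, "year": start_year + offset, **current})
    return rows


def write_output(rows, handle):
    """Write the simulated rows as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
```

Filtering the rows by year gives the spatial distribution for that year, and filtering by neighborhood gives one neighborhood's change over time.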

For example, Fig. 4 below shows the simulated obesity prevalence by neighborhoods from 2013 to 2019. As shown in this figure, obesity prevalence does seem to plateau in future years. As can be seen in this series of maps, Summit County was simulated to evolve from having many neighborhoods (census block groups) with fast growth of obesity ratios (shown in bluish colors) in 2013–2015 to having much slower growth of obesity ratios (shown in reddish colors) in 2016–2019. When obesity ratios are increasing (or growing) fast, the obesity prevalence is high. On the other hand, when obesity ratios are already high and change only a little, the obesity prevalence has plateaued.
Fig. 4

Example runs of obesity prevalence simulations. Note: Increase rates are calculated with reference to baseline figures in 2013
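The increase rates mapped in Fig. 4 and the notion of a plateau can be made concrete with a small calculation. The exact formula used by ObPSim is not given in the text, so the sketch below assumes a plain relative change against the 2013 baseline; the function names and the 1% plateau tolerance are hypothetical.

```python
def increase_rates(ratios_by_year, baseline_year=2013):
    """Relative change of each year's obesity ratio versus the baseline."""
    base = ratios_by_year[baseline_year]
    return {year: (value - base) / base
            for year, value in ratios_by_year.items()}


def has_plateaued(ratios_by_year, tolerance=0.01):
    """True when the latest year-over-year relative change is within tolerance."""
    years = sorted(ratios_by_year)
    previous = ratios_by_year[years[-2]]
    latest = ratios_by_year[years[-1]]
    return abs(latest - previous) / previous <= tolerance
```

Applied to one neighborhood's simulated series, `increase_rates` yields the value that would be mapped for each year, and `has_plateaued` flags the neighborhoods where growth has effectively stopped.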

The advantage of using ObPSim to estimate obesity prevalence is the ability to vary only one or a few model parameters in simulation runs while holding all others constant. In Fig. 5 below, obesity prevalence is simulated for year 2018 with the social influence value set to 0.2, 0.3, 0.4, and 0.5.
Fig. 5

Effects of social influence changes in obesity prevalence
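Varying one parameter while holding the others constant amounts to a simple sweep. In the sketch below, `sweep` is a hypothetical helper and `model` stands for any function that maps a parameter dictionary to a simulated result; neither name comes from ObPSim.

```python
def sweep(model, baseline, name, values):
    """Run the model once per value of one parameter, others held fixed.

    model:    function taking a parameter dict and returning a result
    baseline: dict of default parameter values
    name:     key of the parameter to vary
    values:   iterable of values to try for that parameter
    """
    results = {}
    for value in values:
        params = dict(baseline)  # copy so the baseline is never mutated
        params[name] = value
        results[value] = model(params)
    return results
```

A run such as the one in Fig. 5 would correspond to sweeping the social influence on the overweight subpopulation over 0.2, 0.3, 0.4, and 0.5 with every other parameter left at its default.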

Fig. 5 shows the progressive changes in obesity prevalence as the social influence on the overweight subpopulation is increased while the influence on the obese subpopulation is held constant; higher levels of social influence seem to be important in shaping simulated obesity prevalence. As a comparison, Fig. 6 below shows that simulated obesity prevalence is insensitive to the social influence on the obese subpopulation while the influence on the overweight subpopulation is held constant at 0.20. Figures 5 and 6 are included here to demonstrate the influence of model parameters on the simulated pace of obesity prevalence.
Fig. 6

Insensitivity of social influence on obese subpopulation while that influence on overweight is held constant at 0.20

The concept of exploratory space-time data analysis is strongly associated with visualization because graphical presentation enables the analyst to open-mindedly explore the structure of the data set and gain new insights. Shneiderman [37] argues that exploratory data analysis can be generalized as a three-step process: “overview first, zoom and filter and then details-on-demand”. More importantly, it is worth noticing that this process should be iterative, and the methods implemented in the current research address this challenge. To explain the observed patterns and trends, follow-up research is needed on collecting determinants of economic growth.

The last, but most important, step in an analysis that uses ObPSim to investigate spatio-temporal changes in obesity prevalence is the calibration of the model. If (and when) actual data are available for simulated years, it is possible to run the simulations retroactively for a target year and then calibrate the model parameters by incorporating actual data. For example, one can first simulate obesity prevalence in 2012 by using 2000 data and then calibrate the model with actual 2012 data. Such calibration would help to derive a set of parametric values that brings simulated results closest to actual trends in 2012. Understandably, the calibration processes can be tedious and repetitive; they are, however, necessary steps in ensuring that simulations are meaningful and applicable.
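The text does not prescribe a calibration procedure. As one possible illustration, the sketch below uses a plain grid search: it tries each candidate value of a single parameter and keeps the one whose simulated results have the smallest sum of squared errors against the observed data. The function name `calibrate` and its arguments are hypothetical.

```python
def calibrate(simulate, observed, candidates):
    """Return the candidate value that best reproduces the observed data.

    simulate:   function taking a parameter value and returning a dict of
                simulated obesity ratios keyed by neighborhood
    observed:   dict of actual obesity ratios keyed by neighborhood
    candidates: iterable of parameter values to try
    """
    def sum_squared_error(value):
        predicted = simulate(value)
        return sum((predicted[key] - observed[key]) ** 2 for key in observed)

    return min(candidates, key=sum_squared_error)
```

In the retroactive example, `simulate` would run the model from 2000 data to 2012 for a given parameter value and `observed` would hold the actual 2012 ratios. Calibrating several parameters at once would need a search over combinations, which is where the process becomes repetitive.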

Concluding Remarks

This paper explores the potential of a new open source tool for obesity studies. In other words, the current work is mainly exploratory, and it can motivate scholars to design a series of analysis questions and formulate new hypotheses from theoretical and policy perspectives. This space-time work provides an important contribution to the current literature, which lacks comparative space-time studies. Although this comparative study stems from the analysis of obesity dynamics, it broadly aims to analyze the role of geography and location in public health phenomena. In addition, the methods are built in open source environments and are thus easily extensible and customizable.

Obesity is an exceedingly complex public health problem with hypothesized causes at multiple interacting levels that are embedded in the very structure of society. This complexity appears to be the reason that one-dimensional preventive or therapeutic interventions have not been very successful. Traditional epidemiologic approaches fail to address complex, multilevel data with spatial components. Their simplifications do not fit well with the complex realities of obesogenic policy, environments, and population characteristics, where dependencies and feedback loops are common. Hence, the reported research extends traditional regression-based approaches to multi-level exposures through a system of differential equations. This project also integrates the following elements: spatial components, the influence among realizations of each independent variable, and feedback loops between outcomes and independent variables.

Given this, new approaches are needed to fully understand the complexities associated with obesity. ObPSim, developed in this project, is a new, more comprehensive decision support tool for policy makers. The implementation of policies that effectively combat obesity would improve the health and well-being of a high percentage of the population, including both adults and children, and would greatly reduce associated economic costs to society such as obesity-related health care expenses and loss of productivity. Based on the susceptible, infected, and recovered (SIR) framework, ObPSim categorizes the population into subpopulations of normal weight, overweight, obese, and extremely obese. Furthermore, ObPSim allows people to move between subpopulations. Such movements can be defined by any reasoning drawn from the physical, food, built, and socio-economic environments of the neighborhoods.
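The idea of four subpopulations with movement between them can be sketched as a discrete-time compartmental model for a single neighborhood. This is an illustration under stated assumptions rather than ObPSim's implementation: movement is restricted to adjacent categories, the yearly rates `up` and `down` are hypothetical, and the helper names `build_transition_matrix` and `simulate` are introduced here only for the example.

```python
import numpy as np

CATEGORIES = ["normal", "overweight", "obese", "extremely_obese"]

def build_transition_matrix(up, down):
    """Return a 4x4 row-stochastic matrix of yearly movements.

    up[i]   : fraction moving from category i to category i+1 each year
    down[i] : fraction moving from category i+1 back to category i
    """
    n = len(CATEGORIES)
    T = np.zeros((n, n))
    for i in range(n - 1):
        T[i, i + 1] = up[i]
        T[i + 1, i] = down[i]
    # People who do not move stay where they are, so each row sums to 1.
    T[np.diag_indices(n)] = 1.0 - T.sum(axis=1)
    return T

def simulate(counts, T, years):
    """Advance the subpopulation counts one year at a time."""
    counts = np.asarray(counts, dtype=float)
    history = [counts]
    for _ in range(years):
        counts = counts @ T
        history.append(counts)
    return np.vstack(history)

# Hypothetical block group of 1,000 residents and illustrative rates;
# in practice the rates would vary with neighborhood conditions.
block_group = [600, 250, 120, 30]
T = build_transition_matrix(up=[0.03, 0.02, 0.01],
                            down=[0.01, 0.005, 0.002])
trajectory = simulate(block_group, T, years=12)
print(dict(zip(CATEGORIES, trajectory[-1].round(1))))
```

Giving each neighborhood its own `up` and `down` rates is one simple way to let local environments drive different trajectories, which is the role the text assigns to the neighborhood-level factors.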

Beyond categorizing a population into subpopulations and allowing people to move between them, ObPSim also allows users to set a suite of model parameters for estimating future obesity prevalence. These parameters affect how the estimates are calculated; because they are defined by local conditions, they also allow the simulations to reflect spatial variation. Finally, ObPSim provides a means of studying obesity prevalence at a very fine geographic scale. By using census block groups as neighborhoods, ObPSim goes beyond the conventional approach of studying obesity prevalence at the scale of census tracts. The additional detail revealed by using smaller geographic units allows us to better understand spatial patterns and processes of obesity prevalence.

Beyond the scope of this project, studies that compare how simulated obesity prevalence levels react to different values of the model’s parameters would be valuable. By holding all parameters fixed except one, the resulting changes in estimated obesity prevalence patterns can be related to changes in that particular parameter. If desired, multiple parameters can be allowed to change simultaneously to observe how they jointly affect obesity prevalence. This paper thus demonstrates one way to interface public health analysis with the open source revolution, as part of the burgeoning efforts to cross-fertilize the two fast-growing communities.
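The one-parameter-at-a-time comparison suggested above can be sketched as follows. The toy model `obese_share`, the helper `one_at_a_time`, the parameter names `start`, `gain`, and `loss`, and the baseline values are all hypothetical; they stand in for whatever parameters a full simulation would expose.

```python
def obese_share(years, start, gain, loss):
    """Toy model (an assumption): obese share after `years` annual steps."""
    p = start
    for _ in range(years):
        p = p + gain * (1.0 - p) - loss * p
    return p

def one_at_a_time(baseline, sweeps, years=12):
    """Vary each parameter in `sweeps` alone, holding the rest at baseline."""
    results = {}
    for name, values in sweeps.items():
        outcomes = []
        for value in values:
            params = dict(baseline, **{name: value})  # override one entry
            outcomes.append((value, obese_share(years, **params)))
        results[name] = outcomes
    return results

# Hypothetical baseline and sweep values.
baseline = {"start": 0.25, "gain": 0.015, "loss": 0.005}
sweeps = {
    "gain": [0.010, 0.015, 0.020],
    "loss": [0.000, 0.005, 0.010],
}
results = one_at_a_time(baseline, sweeps)

for name, outcomes in results.items():
    for value, share in outcomes:
        print(f"{name}={value:.3f} -> obese share {share:.3f}")
```

Letting several parameters change together would amount to iterating over combinations of values instead of one list per parameter, at the cost of many more simulation runs.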

The ObPSim package is entirely open source, which can promote collaboration among researchers who want to improve current functions or add extensions to address specific research questions. Building on the strength of scientific visualization techniques, this paper stresses the need to study the space-time dimension underlying obesity data sets. Finally, a new interactive tool is suggested and demonstrated as providing an explanatory framework for space-time data. On this basis, the sincere hope is that this dialogue between public health scholars and geographers will embrace the real-world challenges of inequality.



This work is partially supported by the National Science Foundation under Grant No. 1416509, project titled “Spatiotemporal Modeling of Human Dynamics Across Social Media and Social Networks”. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.


  1. Feng J, Glass TA, Curriero FC, Stewart WF, Schwartz BS (2010) The built environment and obesity: a systematic review of the epidemiologic evidence. Health Place 16(2):175–190
  2. Morland KB, Evenson KR (2009) Obesity prevalence and the local food environment. Health Place 15(2):491–495
  3. Butland B, Jebb S, Kopelman P, McPherson K, Thomas S, Mardell J, Parry V (2012) Tackling obesities: future choices: project report, 2nd edn. Foresight, United Kingdom Government Office for Science, London
  4. Thomas DM, Weedermann M, Fuemmeler BF, Martin CK, Dhurandhar NV, Bredlau C, Heymsfield SB, Ravussin E, Bouchard C (2013) Dynamic model predicting overweight, obesity, and extreme obesity prevalence trends. Obesity. doi:10.1002/oby.20520
  5. Finkelstein EA, Khavjou OA, Thompson H, Trogdon JG, Pan L, Sherry B et al (2012) Obesity and severe obesity forecasts through 2030. Am J Prev Med 42(6):563–570
  6. Wang Y, Beydoun MA, Liang L, Caballero B, Kumanyika SK (2008) Will all Americans become overweight or obese? Estimating the progression and cost of the US obesity epidemic. Obesity 16(10):2323–2330
  7. Auchincloss AH, Roux AVD (2008) A new tool for epidemiology: the usefulness of dynamic-agent models in understanding place effects on health. Am J Epidemiol 168(1):1–8
  8. Anderson B, Rafferty AP, Lyon-Callo S, Fussman C, Imes G (2011) Fast-food consumption and obesity among Michigan adults. Prev Chronic Dis 8(4):A71
  9. Sonya AG, Mensinger G, Huang SH, Kumanyika SK, Stettler N (2007) Fast-food marketing and children’s fast-food consumption: exploring parents’ influences in an ethnically diverse sample. J Publ Policy Mark 26(2):221–235
  10. Lee RE, Mama SK, Medina AV, Ho A, Adamus HJ (2012) Neighborhood factors influence physical activity among African American and Hispanic or Latina women. Health Place 18(1):63–70
  11. Nielsen SJ, Siega-Riz AM, Popkin BM (2002) Trends in food locations and sources among adolescents and young adults. Prev Med 35:107–113
  12. Cummins S, Macintyre S (2005) Food environments and obesity-neighborhood or nation? Int J Epidemiol 35:100–104
  13. Prentice AM, Jebb SA (2003) Fast foods, energy density and obesity: a possible mechanistic link. Obes Rev 4(4):187–194
  14. Paeratakul S, Ferdinand DP, Champagne CM, Ryan DH, Bray GA (2003) Fast-food consumption among U.S. adults and children: dietary and nutrient intake profile. J Am Diet Assoc 103(10):1332–1388
  15. McEntee J, Aygeman J (2009) Towards the development of a GIS method for identifying rural food deserts: geographic access in Vermont, USA. Appl Geogr 30:165–176
  16. Gebel K, Bauman AE, Petticrew M (2007) The physical environment and physical activity: a critical appraisal of review articles. Am J Prev Med 32(5):361–369
  17. Witten K, Hiscock R, Pearce J, Blakely T (2008) Neighbourhood access to open spaces and the physical activity of residents: a national study. Prev Med 47:299–303
  18. Davison KK, Lawson CT (2006) Do attributes in the physical environment influence children’s physical activity? A review of the literature. Int J Behav Nutr Phys Act 3:19
  19. Owen N, Humpel N, Leslie E, Bauman A, Sallis J (2004) Understanding environmental influences on walking; review and research agenda. Am J Prev Med 27(1):67–76
  20. Mobley LR, Root ED, Finkelstein EA, Khavjou O, Farris RP, Will JC (2006) Environment, obesity, and cardiovascular disease in low-income women. Am J Prev Med 30(4):327–332
  21. Boehmer TK, Hoehner CM, Despande AD, Brennan Ramirez LK, Brownson RC (2007) Perceived and observed neighborhood indicators of obesity among urban adults. Int J Obes (Lond) 97(3):486–492
  22. Giles-Corti B, Timperio A, Bull F, Pikora T (2005) Understanding physical activity environmental correlates: increased specificity for ecological models. Exerc Sport Sci Rev 33(4):175–181
  23. Lee CD, Blair SN, Jackson AS (1999) Cardiorespiratory fitness, body composition, and all-cause and cardiovascular disease mortality in men. Am J Clin Nutr 69:373–380
  24. Levy D, Mabry P, Wang Y, Gortmaker S, Huang TK, Marsh T, Moodie M, Swinburn B (2011) Simulation models of obesity: a review of the literature and implications for research and policy. Obes Rev 12(5):378–394
  25. Burke MA, Heiland F (2007) Social dynamics of obesity. Econ Inq 45(3):571–591
  26. Hammond R, Epstein J (2007) Exploring price-independent mechanisms in the obesity epidemic. Center on Social and Economic Dynamics Working Paper
  27. Auchincloss AH, Riolo RL, Brown DG, Cook J, Diez Roux AV (2011) An agent-based model of income inequalities in diet in the context of residential segregation. Am J Prev Med 40(3):303–311
  28. Yang Y, Diez Roux AV, Auchincloss AH, Rodriguez DA, Brown DG (2011) A spatial agent-based model for the simulation of adults' daily walking within a city. Am J Prev Med 40(3):353–361
  29. Goodchild MF, Glennon A (2008) Representation and computation of geographic dynamics. In: Hornsby KS, Yuan M (eds) Understanding dynamics of geographic domains. CRC, Boca Raton, FL, pp 13–30
  30. Krugman P (1999) The role of geography in development. Int Reg Sci Rev 22(2):142–161
  31. Ye X, Wu L (2011) Analyzing the dynamics of homicide patterns in Chicago: ESDA and spatial panel approaches. Appl Geogr 31(2):800–807
  32. Ye X, Rey S (2013) A framework for exploratory space-time analysis of economic data. Ann Reg Sci 50(1):315–339
  33. Rey S, Ye X (2010) Comparative spatial dynamics of regional systems. In: Páez A et al (eds) Progress in spatial analysis. Springer, Berlin, pp 441–463
  34. Balcan D, Goncalves B, Hu H, Ramasco JJ, Colizza V, Vespignani A (2010) Modeling the spatial spread of infectious diseases: the GLobal Epidemic and Mobility computational model. J Comput Sci 1(3):132–145. doi:10.1016/j.jocs.2010.07.002. Epub 2011/03/19
  35. Flegal KM, Carroll MD, Ogden CL, Curtin LR (2010) Prevalence and trends in obesity among US adults, 1999–2008. JAMA 303(3):235–241. doi:10.1001/jama.2009.2014. Epub 2010/01/15
  36. Oizumi R, Takada T (2013) Optimal life schedule with stochastic growth in age-size structured models: theory and an application. J Theor Biol 323:76–89. doi:10.1016/j.jtbi.2013.01.020. Epub 2013/02/09
  37. Shneiderman B (1996) The eyes have it: a task by data type taxonomy for information visualizations. In: Visual languages, 1996. Proceedings, IEEE Symposium on. IEEE, pp 336–343

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Department of Geography, Kent State University, Kent, USA
  2. College of Environment and Planning, Henan University, Kaifeng Shi, Henan Sheng, China
