Introduction

Large concentrations of population in the United States live in areas at risk of natural disasters such as earthquakes, hurricanes, flooding, and tornados; or manmade disasters such as industrial contamination or terrorist attacks (United Nations Development Programme 2004). After catastrophic disasters, authorities and relief agencies have an urgent need to determine the impact on population distribution. As noted by the National Research Council (2007) in their report entitled “Tools and Methods for Estimating Populations at Risk from Natural Disasters and Complex Humanitarian Crises,” governmental and non-governmental organizations working in disaster response situations often complain of a data vacuum that hinders decision-making around short-term priorities, resource allocation, and long-term recovery planning.

Following a catastrophic disaster that causes major population movements, displaced populations are in great need of emergency assistance, and yet their locations may be unknown. Furthermore, they are likely to remain highly mobile for many months. Although demographic data describing age and gender are important for providing emergency services, the National Research Council asserts that frequently updated basic population estimates by location are the most essential. They advise, “It is better to collect basic population data frequently, rather than detailed information only on an occasional basis” (National Research Council 2007:94).

The Census Bureau cannot currently meet this need because it has funding to produce only one annual population estimate for every county in the country, and this estimate has a nine-month lag. In addition, Census Bureau estimates have high error rates for counties that are rapidly declining or growing. High error rates also occur in areas with large proportions of group quarters population or Hispanics, higher rates of international migration, or rapid growth in the Hispanic population (Baker 2001). These patterns suggest that the current methodology may be particularly ill-suited for tracking post-disaster population shifts. Private data firms produce estimates at varying intervals but with a lag time of several months (Hodges 1998). Following a catastrophic disaster, the need is not for perfectly accurate estimates, but for quick and frequently updated estimates that provide a reasonably sound measure of population distribution. In a rapidly changing environment, monthly estimates of total population by county are perhaps most needed because they inform a host of decisions about the volume of housing, health care, retail, public safety, and other services needed in a given county. Note that county-level estimates for damaged counties will include both returning and new residents, and will not answer the question of who has returned or how many remain displaced. Nonetheless, they provide important guidance for essential planning activities.

Population estimates for areas within the U.S. are typically developed based on available administrative data. A number of administrative data sets collected by states and the federal government have been used historically by demographers to generate population estimates. Researchers and authorities attempted to use several alternative data sets as the basis for population estimates following Hurricane Katrina. These include public school enrollment, utility accounts, United States Postal Service (USPS) counts of residences actively receiving mail and National Change of Address data, drivers’ licenses, registered passenger car data, voter registration and participation data, and traffic volume. In recent years, the federal government has begun making additional data available following disasters, such as housing damage estimates and Federal Emergency Management Agency (FEMA) data on the location of applicants for disaster assistance. This paper examines all these administrative data sets to identify the most appropriate data sets for generating total county-level population estimates for disaster-impacted counties based on each data set’s soundness, consistency, relevance, timeliness, and accessibility following a catastrophic disaster.

Background

Multiple methods exist for converting administrative data into population estimates. One of the simplest and quickest is the Censal Ratio Method, which assumes that the relationship between a symptomatic variable and the population count from the most recent decennial census is constant. It applies the historic ratio of the symptomatic variable and the census population count to the current value of the symptomatic variable to generate a current population estimate (Murdock and Ellis 1991; Raymondo 1992; Rives et al 1995). Population is estimated as:

$$ P_{t+1} = \left( \frac{\text{SV}_{t+1}}{\text{SV}_{t}} \right) \times P_{t} $$

The primary limitations of the Censal Ratio Method arise from any errors in the administrative data set used for the symptomatic variable and in the assumption that the variable changes in direct relation to population over time (Kitagawa 1980; Murdock and Ellis 1991; Rives et al 1995).
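The Censal Ratio Method formula above can be illustrated with a minimal Python sketch. The function name and the example figures (a county with 100,000 residents and 20,000 enrolled students at the census date) are hypothetical and chosen only to make the arithmetic easy to follow; they are not drawn from the data examined in this paper.

```python
def censal_ratio_estimate(base_population, sv_base, sv_current):
    """Return P_{t+1} = (SV_{t+1} / SV_t) * P_t.

    base_population: census population count at time t
    sv_base:         symptomatic variable at time t
    sv_current:      symptomatic variable at time t+1
    """
    if sv_base <= 0:
        raise ValueError("The base-period symptomatic variable must be positive.")
    return (sv_current / sv_base) * base_population


# Hypothetical example: the symptomatic variable falls from 20,000 to 15,000,
# so the estimate falls by the same proportion (25 percent).
estimate = censal_ratio_estimate(100_000, 20_000, 15_000)
print(estimate)
```

The sketch makes the method's central assumption explicit: any change in the symptomatic variable that is unrelated to population change passes directly into the estimate.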

Another relatively simple method is the Housing Unit Method. This method assumes that nearly everyone lives in some type of housing structure. Population is estimated as:

$$ P_{t} = \left( {H_{t} \times {\text{PPH}}_{t} } \right) + {\text{GQ}}_{t} $$

where P is population, H is households, PPH is the average number of persons per household, GQ is the group quarters population (Footnote 1), and t refers to a specific date. An alternative version estimates housing units first and applies an occupancy rate to this estimate to derive occupied housing units, or households. But estimating both housing units and occupancy rates can be problematic, particularly following a disaster, and researchers often prefer to estimate households directly (Smith 1986; Smith and Cody 2004). Persons per household and group quarters population are often drawn from the most recent decennial census because the literature suggests that average household size and group quarters population contribute less to error rates in these population estimates than the number of occupied housing units (Smith and Cody 1994; Smith and Lewis 1983; Starsinic and Zitter 1968).

Households can be estimated based on a variety of data including electric accounts, telephone customers, and postal service records. However, there is rarely a one-to-one relationship between these data and households. For this reason, these data are often used as a symptomatic variable of households. A ratio of the 2000 value of the symptomatic variable and the count of households from the 2000 Census is applied to the current value of the symptomatic variable to generate a current estimate of households as follows:

$$ H_{t} = \left( \frac{\text{SV}_{t}}{\text{SV}_{t-1}} \right) \times H_{t-1} $$

Again, the primary limitations of this method arise from any errors in the administrative data set used for the symptomatic variable and in the assumption that the variable changes in direct relation to households over time (Smith and Cody 2004).
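The two steps of the Housing Unit Method described above (estimating households from a symptomatic variable, then converting households to population) can be sketched as follows. This is an illustrative sketch only: the function names and all figures are hypothetical, and persons per household and group quarters population are simply held at assumed census values.

```python
def estimate_households(h_base, sv_base, sv_current):
    """Return H_t = (SV_t / SV_{t-1}) * H_{t-1}."""
    if sv_base <= 0:
        raise ValueError("The base-period symptomatic variable must be positive.")
    return (sv_current / sv_base) * h_base


def housing_unit_estimate(households, persons_per_household, group_quarters):
    """Return P_t = (H_t * PPH_t) + GQ_t."""
    return households * persons_per_household + group_quarters


# Hypothetical county: 40,000 census households and 44,000 electric accounts
# at the census date; 33,000 active accounts in the current month.
households = estimate_households(h_base=40_000, sv_base=44_000, sv_current=33_000)

# PPH (2.5) and GQ (1,000) are assumed to be unchanged since the census.
population = housing_unit_estimate(households, 2.5, 1_000)
print(households, population)
```

Keeping the two steps separate makes it straightforward to substitute an updated persons-per-household or group quarters figure when one becomes available.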

In a post-disaster setting, if either of these methods is to be used to generate frequently updated population estimates, it is important to assess the soundness of the assumption that the symptomatic variable changes in direct relation to population or households. When using the Housing Unit Method, it is important to assess the soundness of the assumption that persons per household and Group Quarters population are little changed since the last decennial census.

Methods

This paper provides a comprehensive assessment of the appropriateness of administrative data collected at the federal and state level for developing county-level population estimates to inform planning efforts following a catastrophic event in the U.S. We assess administrative data sets used historically by demographers to generate population estimates as well as several unusual data sets used by the media, local authorities and public health researchers as the basis for population estimates following Hurricanes Katrina and Rita. This paper did not focus on locally collected data such as certificates of occupancy and building permits because the quality of these data sets differs greatly across localities and it is impossible to anticipate each local government’s capacity to collect and report administrative data following a catastrophe.

Building on the definition the Census Bureau applies to its data products (U.S. Census Bureau 2006a) as well as the Census Bureau’s “Basic Underlying Principles” (Swanson 2006), the definition we apply for the most appropriate administrative data set to be used as a basis for population estimates post-disaster encompasses these criteria:

  • Soundness – The data set must hold up to assumptions needed for accurate population estimates. For example, if it is to be used in a method that assumes the data has a constant relationship to total population, then research should indicate it has a constant relationship to population.

  • Consistency – The data set should be collected in a stable and consistent manner across collection points and across time, both pre- and post-disaster.

  • Relevance – The data set should be frequently updated, ideally monthly, and sufficiently granular to support county-level population estimates across geographies.

  • Timeliness – The data should become available with a short time lag, ideally within a few weeks of data collection.

  • Accessibility – The data set should be easy to access and the barriers to use it, including privacy act restrictions, cost, and complexity, should be minimal.

These criteria may come into conflict, but soundness is always paramount when developing population estimates from available administrative data, and lack of consistency can impact soundness. We evaluated each data set for soundness relative to the method in which it might be used, and the effect of post-disaster consistency on the soundness of the data. If a data set was determined not to have soundness for a post-disaster setting, it was determined not to be appropriate. Among those data sets with possible soundness, appropriateness is based on relevance, timeliness, and accessibility.

To determine soundness, we identified studies in the field of demography about the use of each data set in generating population estimates. When such literature was lacking, we sought studies across diverse disciplines that addressed the relationship between the data set and total population. We reviewed literature and government records, and gathered expert opinion specific to the post-Katrina time period, to document the consistency, relevance, accessibility, and timeliness of each data set following a catastrophic population displacement.

We calculated monthly population estimates for Louisiana counties (parishes) based on public school enrollment data and USPS counts of residences actively receiving mail, by using the Census Bureau’s 2005 population estimate as a base and applying the monthly rate of change in the symptomatic variable. For a sampling of Louisiana parishes, we compared these monthly estimates with July 2006 experimental population estimates conducted by both the Census Bureau and the Louisiana Public Health Institute (LPHI). We examined these estimates, along with other knowledge about demographic and social conditions in the area, to identify the potential and limitations of each of these data sources for producing population estimates for counties that are catastrophically damaged, counties that are moderately damaged, and counties that receive an influx of evacuees.
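The monthly calculation described above can be sketched in Python. The sketch assumes that "applying the monthly rate of change" means scaling the base population by the ratio of each month's symptomatic value to its value in the base period; the function name, month labels, and figures are hypothetical and do not reproduce the parish data used in this paper.

```python
def monthly_estimates(base_population, sv_base, sv_by_month):
    """Scale a base population by each month's symptomatic variable.

    base_population: population estimate for the base period
    sv_base:         symptomatic variable in the base period
    sv_by_month:     mapping of month label -> symptomatic variable value
    """
    if sv_base <= 0:
        raise ValueError("The base-period symptomatic variable must be positive.")
    return {
        month: base_population * sv / sv_base
        for month, sv in sv_by_month.items()
    }


# Hypothetical parish: 200,000 residents and 80,000 residences receiving mail
# in the base period, followed by three post-disaster monthly counts.
active_residences = {"2006-01": 40_000, "2006-02": 44_000, "2006-03": 50_000}
estimates = monthly_estimates(200_000, 80_000, active_residences)
for month, value in estimates.items():
    print(month, round(value))
```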

To quantitatively compare the resulting population estimates, we calculated the standard deviation and the absolute percent difference among the public school enrollment-based, USPS-based, Census Bureau, and LPHI population estimates for the 11 parishes for which LPHI calculated estimates. The absolute percent difference was determined by calculating the mean of the four estimates for each parish and, using this as a reference point for that parish, calculating the mean absolute percent difference (MAPD; Footnote 2) of the four estimates from the July 2006 mean. We cannot know how accurate these estimates are, but we can measure their differences to suggest something about the level of uncertainty associated with population estimates after a massive population displacement.
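The comparison described above can be sketched as follows for a single parish. Two assumptions are made here and should be checked against the paper's own definitions: MAPD is taken to be the average of the absolute percent differences between each estimate and the mean of all estimates, and the standard deviation is computed as a sample standard deviation. The four estimates are hypothetical.

```python
import statistics


def mean_absolute_percent_difference(estimates):
    """Average absolute percent difference of each estimate from their mean."""
    reference = statistics.mean(estimates)
    if reference == 0:
        raise ValueError("The mean of the estimates must be nonzero.")
    percent_differences = [
        100 * abs(value - reference) / reference for value in estimates
    ]
    return statistics.mean(percent_differences)


# Hypothetical July 2006 estimates for one parish, in this order:
# school enrollment-based, USPS-based, Census Bureau, LPHI.
parish_estimates = [90_000, 100_000, 100_000, 110_000]

mapd = mean_absolute_percent_difference(parish_estimates)
spread = statistics.stdev(parish_estimates)  # sample standard deviation (assumed)
print(mapd, round(spread))
```

A larger MAPD or standard deviation for a parish indicates greater disagreement among the four estimates, not greater error in any one of them.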

We conclude with recommendations about the use of administrative data sources for estimating population location and size at different points in time relative to a sudden onset catastrophic event for application in any U.S. setting.

Results

Administrative data collected on an ongoing basis

Utility accounts

The literature in the field of demography indicates that electric account data is a sound symptomatic variable for households in population estimates using the Housing Unit Method. Demographers use these data to estimate households (as described above), then apply an average number of persons per household, and add an estimate of the population living in group quarters to generate a total population estimate (Smith and Lewis 1983; Smith and Mandell 1984; Smith and Cody 1994, 2004, 2007). Among utilities data, electric accounts have better household coverage than telephone, water or gas accounts (Smith 1986). Demographers in Florida have successfully acquired electric account data for the entire state and have documented useful lessons about the limitations of the data as the basis for sound population estimates. Master meters serving multiple housing units are often found in public housing and other apartment buildings (Smith and Lewis 1980; Smith 1986; Raymondo 1992). Following Hurricanes Katrina and Rita, master meters associated with multiple FEMA trailers were in use for many months in temporary group trailer sites around the Gulf Coast (personal communication: R. Rodi. February 2008). The number of housing units and/or trailers associated with these master meters should be determined and factored into estimates. For areas that receive displaced populations, the relationship between households and total population may be affected by families doubling up with existing households. Population estimates for these areas based on electric account data in a Housing Unit Method would require an updated measure of persons per household and group quarters population to be sound. The consistency of these data in catastrophically damaged areas is likely to be impacted for a few months after the event, as observed in the USPS counts of active residences (see below).

Electric account data that are readily accessible are neither relevant nor timely, while relevant and timely electric account data are not readily accessible, although they are very useful if they can be obtained. Consumer energy companies are required by the Federal Energy Regulatory Commission (FERC) to provide annual reports that include the average number of residential customers for the year, defined as the number of bills rendered during the year divided by the number of billing periods in the year (U.S. Environmental Protection Agency 2002; Federal Energy Regulatory Commission 2003). These data are publicly reported several months after the end of the year, and provide an annual indicator of residential customers in a utility service area. Many utility companies cover a geographic area larger than a single county (Giles 2008), making the publicly available FERC data unusable as a measure for counties. If FERC were to require monthly reports of the number of residential customers by county, relevant and timely county-level population estimates could be calculated.

To acquire relevant monthly, disaggregated electric data, users must negotiate directly with each electric company serving the area of interest (Raymondo 1992). Alternatively, users may find success by enlisting the state public utility (or service) commission to negotiate with individual electric companies to obtain monthly reports by county.

USPS counts of residences actively receiving mail

Private data firms rely on United States Postal Service (USPS) counts of residential addresses actively receiving mail as an indicator of population change for small areas, but only a small amount of research has been reported on their performance in population estimates. The available literature suggests that USPS data may serve as a sound symptomatic variable for households in county-level total population estimates using the Housing Unit Method except in rural areas or those with larger proportions of seasonal housing (Smith 1986; Lowe and Mohrman 2003). Following a catastrophic disaster, USPS counts may lose consistency if the USPS itself is unable to supply door-to-door service for some months after the event and, therefore, these data will lack soundness during that time frame (Congressional Research Service 2005). The data have known limitations that must be accounted for if they are to serve as a sound basis for population estimates. As with electric accounts served by master meters, a single address may be assigned to a large apartment building where mail is dropped and distributed to multiple housing units by a doorman or resident employee (personal communication: P. Faulstich, October 2007). However, unlike electric accounts, following Hurricanes Katrina and Rita, individual mailing addresses were assigned to each FEMA trailer in group sites (personal communication: R. Rodi. February 2008). In counties receiving an influx of evacuees, not all displaced households will establish new residences in their destination. Like electric accounts, USPS data may not fully reflect the increase in population when families are in shelters or doubled up with existing households. USPS counts of residential addresses actively receiving mail are publicly reported by county on a monthly basis and, thus, are relevant, timely, and accessible.

USPS national change of address data

The USPS maintains a National Change of Address (NCOA) database which contains voluntary change of address filings. These data do not have the soundness necessary to serve as a symptomatic variable of households in a population estimate using the Housing Unit Method because this database is known to substantially underrepresent moving households (LortonData). This database is continually maintained, and therefore, relevant and timely, but is not typically made publicly available, and thus, it is not accessible. After Hurricanes Katrina and Rita, the Census Bureau used individual USPS NCOA records to track the movements of individual households and develop January 2006 and July 2006 population estimates for affected counties (U.S. Census Bureau 2006b), but these estimates required a great deal of effort and expertise to develop, and thus, are not sufficiently accessible to support monthly post-disaster population estimates.

Drivers’ licenses and registered passenger cars

The consistency and, therefore, soundness of drivers’ license and registered passenger car data as a symptomatic variable of population in a ratio-based population estimate vary by state (Kitagawa 1980; Rives et al 1995). The soundness of these data likely suffers after a disaster. The relationship between counts of drivers’ licenses or registered passenger cars and total population is not constant but varies with demographic changes that may occur suddenly following a disaster. For example, immigrant construction workers often flock to damaged areas (Fussell 2008). This may increase the total population without increasing drivers’ licenses or registered passenger cars. Families with children often disproportionately leave a disaster-damaged area (Kirschenbaum 1996). This may cause a drop in total population without a proportional decrease in drivers’ licenses or registered passenger cars. In addition, following a disaster, the consistency of these data may degrade in areas receiving evacuees if laws regarding the speed with which individuals re-register automobiles and report changes of address for drivers’ licenses are not rigorously enforced. In those states where this data source is of high quality and has been used successfully to generate population estimates under normal conditions, users should be aware of these effects when using these data to generate post-disaster population estimates. The accessibility, relevance, and timeliness of drivers’ license and registered passenger car data vary by state (Rives et al 1995).

Traffic volume

The relationship between traffic volume and population is complex and we found no literature using traffic loads as a basis for total population estimates. Like drivers’ licenses and registered passenger cars, traffic loads are impacted by demographic rather than sheer population changes. Traffic loads are also impacted by commuting, shopping, and other errand-related trips, as well as business-related trips such as trucking, suggesting that traffic volume is not sound as a symptomatic variable of population in a ratio-based population estimate (Adams and VanDrasek 2006). Following a disaster, counties receiving a large influx of population may see an increase in traffic loads as evacuees are likely to be making trips to recovery centers, banks, and insurance offices. Traffic counts may also increase in damaged areas due to recovery workers trucking in relief supplies and building materials. Given these factors, the soundness of traffic volume as the basis for a population estimate likely degrades following a disaster. Traffic volume data are often timely and accessible but never relevant for population estimates. Hourly traffic count data are collected at 4,000 locations across the country and reported monthly to the U.S. Department of Transportation, where they are made publicly available with a short time lag (U.S. Department of Transportation 2008); however, these data are not available for all counties.

Voter registration data

The consistency and, therefore, soundness of voter registration data as a symptomatic variable in a ratio-based population estimate vary (Rives et al 1995). Federal laws regarding the maintenance of voter rolls are oriented toward reducing voter disenfranchisement, and tend to emphasize maintaining registrants on rolls rather than accuracy of residence. Registrants may be removed from voter rolls by request, or if the registrant has been convicted of a crime or deemed mentally incompetent. States are also allowed to compare voter rolls with changes of addresses filed with the USPS, but these filings should be confirmed with the registrant (U.S. Department of Justice).

Following a disaster, the consistency of these data may suffer if displaced populations want to reserve the right to continue voting in their precinct of origin. Following Hurricane Katrina, the state of Louisiana maintained displaced individuals’ names on rolls and encouraged displaced voters to vote in their precinct of origin (Louisiana State Senate 2006). Later, attempts to purge the rolls of individuals displaced for more than two years were met with political resistance (Rosenfeld 2007). Authorities should cautiously examine the consistency of voter registration data before using this data set as a symptomatic variable in a population estimate. Although voter registration data are accessible, that is, they are publicly available, the relevance and timeliness of these data vary by state (Rives et al 1995).

Voter participation data

Voter participation data are not sound as a basis for population estimates. Voting behavior is highly complex, and the variables associated with voter turnout are numerous (Blais 2006). Demographic variables such as age, educational attainment, and income are highly correlated with voter turnout. Total media expenditure also impacts voter turnout. Larger population size can depress voter turnout (Ashenfelter and Kelley 1975; Settle and Adams 1976; Kushner et al. 1997). Because the number of people who vote in any given election is not stable relative to the total population, authorities can expect large errors from population estimates based on voter participation data.

Official voter participation data are not readily available in all states. Instead the News Election Service, which is a consortium of the five largest news agencies, gathers and reports its own election data, and these data are often treated as de facto official voter participation data in many states, despite inaccuracies. These data are available by county and released with a short lag time, and, thus, are relevant, timely, and accessible, but may not be consistent (Garrison 1982).

School enrollment

The literature in the field of demography indicates that elementary and secondary school enrollments are sound as a symptomatic variable of population in ratio-based population estimates (Kitagawa 1980). However, the soundness of public school enrollment alone in a ratio-based population estimate should be examined because the ratio of public school enrollment to total population may not be constant over time. If more parents choose private schools over time, public school enrollment would decrease without a population decrease.

Following a catastrophic event, the ratio of school enrollment to total population is likely to change in damaged areas. Families may be less likely to return to moderately or heavily damaged counties (Kirschenbaum 1996; Frey et al. 2007). In the most heavily damaged counties, school capacity may be constrained, which may dampen school enrollment (Brookings Institution 2007). For these reasons, school enrollment is not sound as a basis for population estimates for counties damaged by a disaster, but may be sound as an indicator of population increase in counties that receive an influx of population.

The accessibility, timeliness, and relevance of private school enrollment data vary by state because record keeping and reporting requirements related to private school enrollment vary widely (U.S. Department of Education). Public school enrollment, in contrast, is accessible and has the potential to be relevant and timely after a disaster. It is reported by all states at varying intervals. As of 2007, 14 states and the District of Columbia routinely collect and report student enrollment by public school district on a monthly or more frequent basis. Thirty-two states collect enrollment at least two times per year, while four states collect data once per year (National Center for Educational Achievement with Data Quality Campaign 2007). Following a massive population displacement, reporting frequency and timeliness may increase if school districts receiving an influx of evacuees seek state or federal aid to cover the costs of these additional students. For example, prior to Hurricanes Katrina and Rita, Louisiana collected public school enrollment data annually and reported it 3 to 4 months after data collection (personal communication: R. Robbins. July 2, 2008). After the storms, the federal government passed the Hurricane Education Recovery Act, which provided reimbursements for public school districts based on an accounting of the number of evacuee children served each quarter (U.S. Department of Education 2005). The Louisiana Department of Education required that public school districts provide a monthly accounting of total enrollment as a basis for redistributing state education funding among the districts (personal communication: J. France. June 6, 2008). Louisiana school districts started reporting public school enrollment within 6 weeks of Hurricane Katrina, and it was made publicly available on a monthly basis within 1 to 2 weeks of data collection (personal communication: J. Chapman. May 27, 2008).

Public school enrollment data are available by county in Maryland, Florida, Louisiana, and West Virginia, where most or all public school district boundaries match county boundaries (Oosse 2004). In many states, the enrollments of several districts must be combined to generate county-level enrollment. In some areas, public school district boundaries cross county boundaries, resulting in what are called “school district pieces.” In 2005, there were approximately 3,100 counties, but more than 14,000 school districts and 20,000 school district pieces (Maples and Bell 2005). To determine how to combine school district data to match county boundaries, readily available information gathered by the Census Bureau can be used.

Every two years the Census Bureau collects school district boundaries from state liaisons through its School District Review Program (U.S. Census Bureau 2007b). The Census Bureau then combines school district and county boundaries to identify each portion of a school district in each county. These portions are called “school district pieces.” Because the Census Bureau maintains relatively current school district boundary information, this boundary information can be used to estimate the number of students living in each “school district piece” and then aggregate these counts into county-level enrollment numbers (Oosse 2004).
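One way to recompile district enrollment into county totals is sketched below. The sketch assumes that, for every school district piece, an allocation share is already available (for example, the share of the district's school-age population living in that piece); how such shares are derived is not specified here, and the district names, county names, and figures are hypothetical.

```python
import math
from collections import defaultdict


def county_enrollment(district_enrollment, pieces):
    """Allocate district enrollment to counties through school district pieces.

    district_enrollment: mapping of district id -> total enrollment
    pieces: list of (district id, county id, share) tuples, where the shares
            of each district are assumed to sum to 1
    """
    share_totals = defaultdict(float)
    for district, _county, share in pieces:
        share_totals[district] += share
    for district, total in share_totals.items():
        if not math.isclose(total, 1.0):
            raise ValueError(f"Shares for district {district} sum to {total}, not 1.")

    totals = defaultdict(float)
    for district, county, share in pieces:
        totals[county] += district_enrollment[district] * share
    return dict(totals)


# Hypothetical example: District A lies wholly in County X, while District B
# is split between County X (25 percent) and County Y (75 percent).
enrollment = {"District A": 1_000, "District B": 2_000}
pieces = [
    ("District A", "County X", 1.0),
    ("District B", "County X", 0.25),
    ("District B", "County Y", 0.75),
]
by_county = county_enrollment(enrollment, pieces)
print(by_county)
```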

Administrative data collected after a disaster

Housing damage estimates

Following a disaster, government and private entities generate damage estimates based on remote sensing data and house-to-house inspections. These data can be used to identify undamaged areas or households within a county, and, combined with an estimate of persons per household (perhaps from decennial census data), develop an initial rough estimate of population that may remain in each damaged county. It is difficult to document exactly what damage data will be available following a disaster because remote sensing and mapping technology is rapidly evolving and new data become available at greater speeds following each disaster. Below we document some of the damage data available at the time of this writing.

FEMA’s Mapping and Analysis Center (MAC) publishes a “Disaster Map” based on remote sensing. Typically, a “Disaster Map” outlines the damaged areas with a thematic polygon whose coloring indicates the level of damage (limited, moderate, extensive, and catastrophic) and supplies accompanying population and household counts. The geographic granularity of the reporting usually ranges from the county to a larger area. “Disaster Maps” may be updated at regular intervals, including daily or in some cases multiple times per day. When available, “Disaster Maps” would reduce the need to generate independent housing damage estimates based on remote sensing. In the future, FEMA could codify their disaster maps to be compiled at the county level.

If “Disaster Maps” are not published or the granularity is not sufficient for county-level analyses, remote sensing data may be obtained from a variety of sources (Friedland et al 2007), including from FEMA’s MAC. These files can be used in Geographic Information System (GIS) software to create an estimate of the affected population. In some cases, FEMA may include in the polygon file the levels of damage. Officials can overlay the damage polygon on census information to generate counts of affected and unaffected households or population in the damaged areas (Greater New Orleans Community Data Center 2005; Logan 2006). However, care must be taken as the damage polygon will not align with census geographies. There are several methods for allocating population and households when the damage polygon crosses a census boundary. The simplest of these is to include all population and households as part of the damaged area. For rough estimates, this may be sufficient, and using census blocks would help minimize the error. More accurate methods involve calculating the ratio of damaged to undamaged areas and using it to allocate population and households or using a street grid to estimate the location of housing units within the census unit (Richardson and Renner 2007). If officials wish to refine population and household estimates by damage level, an apportionment by area or street grid is necessary.
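The two simplest allocation approaches described above can be contrasted in a short sketch. It assumes that the share of each census unit's area falling inside the damage polygon has already been computed in GIS software, and that population is spread evenly within each unit; the field names and figures are hypothetical.

```python
def include_all_touched(units):
    """Simplest approach: count the whole unit if any part of it is damaged."""
    return sum(u["population"] for u in units if u["damaged_area_share"] > 0)


def apportion_by_area(units):
    """Area-ratio approach: allocate population by the damaged share of area.

    Returns a (damaged, undamaged) tuple of population counts.
    """
    damaged = sum(u["population"] * u["damaged_area_share"] for u in units)
    total = sum(u["population"] for u in units)
    return damaged, total - damaged


# Hypothetical census units with precomputed damaged-area shares.
census_units = [
    {"id": "unit 1", "population": 1_000, "damaged_area_share": 1.0},
    {"id": "unit 2", "population": 2_000, "damaged_area_share": 0.25},
    {"id": "unit 3", "population": 500, "damaged_area_share": 0.0},
]

print(include_all_touched(census_units))
print(apportion_by_area(census_units))
```

In this example the include-all approach assigns all of unit 2 to the damaged area, which is why smaller census units such as blocks reduce the difference between the two approaches.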

In flooded areas, variance in the elevation of land or residential buildings can impact the accuracy of remote sensing estimates based on a simple polygon overlay. Within a few days of Hurricane Katrina’s landfall, several researchers combined water gauge data and then satellite imagery, once available, with elevation data to generate estimates of the extent and depth of flooding (Gesch 2007; Smith and Rowland 2007). The quality of these estimates is heavily dependent on the elevation data of the area, with Light Detection and Ranging (LIDAR) technology generating the best elevation models (Sanyal and Lu 2004). LIDAR data are available for most coastal areas in the United States (National Oceanic and Atmospheric Administration). FEMA also provides Hazards U.S. Multi-Hazard (HAZUS-MH), which is available for states experiencing risks from hurricanes, floods or earthquakes. HAZUS-MH is a risk assessment tool that can be combined with Geographic Information Systems to estimate population at risk and damage to residential buildings. Building stock data classified by occupancy in the HAZUS-MH system can be combined with data on ground shaking, wind speed, depth of flooding, and other remote sensing data to generate estimates of residential damage and displaced population (FEMAa; FEMAb; FEMAc). The accuracy of these HAZUS-MH estimates relies on the quality of the input data. FEMA populates building stock data with national databases that most closely resemble average building stock in an area and not individual buildings. Local information will improve the quality of the estimates, but many localities do not have the ability to complete this resource-intensive task (Cutter 2003). In areas where flood insurance data include individual residential elevation certificates, FEMA could release these data to facilitate the task of improving damage estimates or, if privacy is a concern, generate refined estimates itself.

As soon as one month after the disaster, house-to-house inspections supplant these early, remote sensing-based damage estimates. FEMA develops housing damage counts or estimates based on inspections of housing units to determine disaster victims’ eligibility for housing assistance as required by current federal regulations (U.S. Department of Housing and Urban Development 2006). The Red Cross, in parallel, develops its own housing damage estimates in order to provide appropriate services to individuals based on the amount or type of damage to their home (Hallman 2004). Local authorities may also develop assessments of housing unit damages as a basis for developing and applying rebuilding ordinances (Meitrodt 2006).

Housing damage estimates can be used to estimate the population remaining in, and the population displaced from, each damaged county. But these data do not provide clues as to the locations to which populations were displaced. Moreover, these data quickly lose relevance as displaced populations return and begin rebuilding damaged homes. Although some demographers attempted, following Hurricane Katrina, to estimate the return of population over time based on the severity of housing damage, this method is not sound. Availability of public services such as schools and hospitals, and exposure to risk are likely to factor into decisions about returning or staying in a damaged area (McCarthy et al 2006). Conversely, employment opportunities in the affected area can be a significant motivator to return (McCarthy et al 2006) or migrate to a damaged area (Fussell 2008).

FEMA’s individuals and households program

FEMA collects the home address information of all applicants to its Individuals and Households Program to verify their status as disaster victims. If an applicant has evacuated, they may have to supply an alternate address to which assistance funds can be mailed. Two and a half months after Hurricane Katrina, FEMA began making available counts of these mailing addresses by metropolitan area but cited Privacy Act restrictions when asked for the release of more detailed data. If FEMA were to make these data available as a cross tabulation of households from county of origin by county of destination for each month following a catastrophic event, they could form the foundation for county-level estimates of remaining and displaced households. One could apply the average number of persons per household in the county of origin (from the previous decennial census) to the number of applicant households to derive an estimate of the household population that arrived in each county. Individuals in group quarters would have to be tracked separately.
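A minimal sketch of the calculation proposed above follows. No such cross tabulation has been released, so the county names, household counts, and persons-per-household values are hypothetical, as is the function name.

```python
from collections import defaultdict

# Hypothetical cross tabulation for one month:
# (county of origin, county of destination) -> applicant households
households = {
    ("Origin A", "Destination X"): 1000,
    ("Origin A", "Destination Y"): 500,
    ("Origin B", "Destination X"): 200,
}

# Hypothetical average persons per household in each county of origin,
# taken from the previous decennial census.
pph_by_origin = {"Origin A": 2.5, "Origin B": 3.0}


def household_population_by_destination(households, pph_by_origin):
    """Apply each origin county's persons per household to its applicant households."""
    arrived = defaultdict(float)
    for (origin, destination), count in households.items():
        arrived[destination] += count * pph_by_origin[origin]
    return dict(arrived)


result = household_population_by_destination(households, pph_by_origin)
print(result)  # {'Destination X': 3100.0, 'Destination Y': 1250.0}
```

Group quarters population is deliberately left out, since it would have to be tracked separately.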

FEMA applicant data have limitations of which users should be aware. It is generally believed that disaster applicant data represent an overcount of the number of displaced households due to fraudulent applications for disaster assistance (National Research Council 2007; U.S. Government Accountability Office 2006). These data represent mailing addresses of evacuee households and may not represent the actual address where an evacuee is staying. And this information loses currency as applicants stop being eligible for benefits and have no motivation to advise FEMA of address changes (Johnson et al. 2008).

Assessment of population estimates based on USPS-counts of active residences and public school enrollment

When compared with Census Bureau and Louisiana Public Health Institute (LPHI) population estimates, and examined along with other knowledge about demographic and social conditions, monthly estimates for post-Katrina/Rita Louisiana parishes based on USPS-counts of active residences and public school enrollment illustrate the potential and limitations of these two data sets for this purpose. Specifically, we examined the performance of these estimates in two catastrophically damaged parishes, two that sustained moderate damage, and two that experienced an influx of evacuees.

The administrative data sources that the Census Bureau relies on to generate population estimates under normal circumstances (IRS filings, Medicare records, and state counts of group quarters populations) were either incomplete or too time-lagged to reflect post-disaster population conditions in 2006. The Census Bureau generated 2006 estimates for Katrina/Rita affected parishes by supplementing their normal administrative data sets with USPS change-of-address data (see Footnote 3). LPHI estimated the household population of 11 Louisiana parishes one year after the disaster based on sample surveys that resulted in total household population estimates for each parish with a margin of error (Stone et al 2007). The researchers also gathered data on group quarters population for each parish in order to determine a total population estimate for each parish. Applying the margin of error to the household population estimate for each parish and adding the group quarters population, we created high and low LPHI population estimates. Although LPHI surveys were conducted primarily throughout the summer of 2006, for this purpose we treat them as July 2006 estimates. Because school enrollment data are not available for July, we generated school enrollment-based estimates for July 2006 by interpolating between the May 2006 and October 2006 public school enrollment-based estimates.
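The two adjustments described above can be sketched as follows. The text does not state the form of interpolation or the exact reference dates, so this sketch assumes linear interpolation between the first day of each month and a margin of error expressed as a number of persons; all figures and function names are hypothetical.

```python
from datetime import date


def lphi_bounds(household_estimate, margin_of_error, group_quarters):
    """Low and high total population: household estimate -/+ margin, plus group quarters."""
    low = household_estimate - margin_of_error + group_quarters
    high = household_estimate + margin_of_error + group_quarters
    return low, high


def interpolate_linear(start_date, start_value, end_date, end_value, target_date):
    """Linear interpolation by elapsed days (an assumption; the text does not name the method)."""
    fraction = (target_date - start_date).days / (end_date - start_date).days
    return start_value + fraction * (end_value - start_value)


# Illustrative values only.
low, high = lphi_bounds(household_estimate=200_000, margin_of_error=15_000,
                        group_quarters=5_000)
print(low, high)  # 190000 220000

july_estimate = interpolate_linear(
    date(2006, 5, 1), 100_000,   # hypothetical May enrollment-based estimate
    date(2006, 10, 1), 115_300,  # hypothetical October enrollment-based estimate
    date(2006, 7, 1),
)
print(round(july_estimate))
```

July 1 falls 61 days into the 153-day interval, so the interpolated value sits about 40 percent of the way from the May figure to the October figure.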

For Orleans Parish, which sustained severe or major damage to 56% of occupied housing units (U.S. Department of Housing and Urban Development 2006), we see that USPS-based estimates did not reflect housing loss and population displacement for many months, likely because the postal service was not able to supply door-to-door mail delivery for some time after the disaster (Congressional Research Service 2005). By one year after the disaster, these estimates reflected the impact and had begun to track recovery toward pre-hurricane numbers. Meanwhile, public school enrollment-based estimates track recovery but at a lower rate than the Census Bureau, LPHI, or USPS-based estimates (See Table 1). An analysis of 2006 American Community Survey demographic profiles of post-Katrina New Orleans revealed that households with children were significantly less likely to have returned to New Orleans than households without children, and that the 2006 population of New Orleans had a higher proportion of couples without children and people living alone than before the storm (Frey et al. 2007). This phenomenon may help to explain the lower rate at which the public school population rebounded compared with that for households as reflected in the USPS-based estimates.

Table 1 Population estimates based on public school enrollment, interpolated public school enrollment, and USPS-counts of active residences with Louisiana Public Health Institute (LPHI) estimates and Census Annual Population Estimates, parishes with catastrophic damage

In Cameron Parish, 72% of occupied housing units received severe or major damage (U.S. Department of Housing and Urban Development 2006), and, as in Orleans Parish, the USPS-based population estimates remained frozen for several months following the landfall of Hurricane Rita. However, Cameron Parish is rural and the total number of occupied housing units in the parish is relatively small at only 3,593 according to Census 2000. Thus the sheer number of damaged housing units was small compared to Orleans Parish. USPS-based estimates rose in May 2006 and exhibit somewhat erratic patterns from October 2006 to January 2007, possibly because of differences in the way USPS data are collected in rural areas, as noted by Lowe and Mohrman (2003). In contrast, the decline and recovery of school enrollment was more consistent. But the differences between USPS-based and public school enrollment-based estimates were not large, and both methodologies yielded population estimates that were consistent with the Census Bureau and LPHI estimates for mid-2006 (See Table 1).

Next we examined two relatively large parishes in the New Orleans metro area that received a moderate level of damage. Jefferson Parish sustained severe or major damage to 19.5% of occupied housing units, and St. Tammany Parish sustained this level of damage to 25.5% of occupied housing units (U.S. Department of Housing and Urban Development 2006). Sample surveys conducted in 2006 and analyzed by Hori and Bowman (2007) revealed that both of these parishes simultaneously lost population and gained evacuees from the more heavily damaged parishes of Orleans, Plaquemines, and St. Bernard.

In Jefferson Parish, although public school enrollment-based estimates dropped notably, USPS-based, Census Bureau, and LPHI estimates suggest that the total population remained relatively level (See Table 2). In St. Tammany Parish, school enrollment-based estimates dropped, while USPS-based and Census Bureau estimates indicate upward trending. Here the LPHI range of estimates encompasses all the other estimates (See Table 2). These trends suggest that evacuees who left these parishes and did not return were disproportionately those whose children had been enrolled in public school, and individuals moving into the parish were largely childless, sending their children to private school, or sending their children to school in a neighboring parish. Demographic profiles available from the American Community Survey for Jefferson Parish for both 2004 and 2006 confirm that the proportion of households with children under 18 dropped from 35% in 2004 to 32% in 2006.

Table 2 Population estimates based on public school enrollment, interpolated public school enrollment, and USPS-counts of active residences with Louisiana Public Health Institute (LPHI) estimates and Census Annual Population Estimates, parishes with moderate damage

Finally we examine estimates for parishes that received an influx of evacuees. In East Baton Rouge and Rapides parishes, the growth rate in the USPS-based estimates initially trailed the growth rates in public school enrollment-based estimates (See Table 3). These trends suggest that USPS counts of active addresses may not fully reflect the large and sudden population increases that occur in destination counties following large scale population displacements, but instead reflect more modest growth in households in each parish.

Table 3 Population estimates based on public school enrollment, interpolated public school enrollment, and USPS-counts of active residences with Census Annual Population Estimates, parishes with influx of evacuees

These analyses suggest that USPS data may be a good indicator of population recovery in damaged counties, but not for the first several months if the postal service is not supplying door-to-door service. In contrast, school enrollment data may better indicate the influx of evacuees in receiving counties. Furthermore, the data tables illustrate that Census Bureau and other annual estimates are inadequate for tracking important post-catastrophe population shifts.

Measures of difference between population estimates

For the 11 parishes where LPHI published population estimates, we calculated standard deviations and MAPDs of the differing estimates for each parish derived from the four methods (public school enrollment, USPS, Census, and LPHI). We provide these measures for each parish along with the Census Bureau’s 2005 (pre-disaster) population estimates for each parish, and the estimated percent of occupied housing units with major or severe damage from the U.S. Department of Housing and Urban Development (HUD). The Census Bureau estimate indicates how populous each parish was prior to the disasters and the HUD data indicate the extent of damage to populated areas of each parish. We would expect to see larger standard deviations for more populous parishes, and larger MAPDs for less populous parishes—because absolute percent differences, by their very nature, are larger for areas with small populations (Siegel and Swanson 2004). We provide HUD data to note that, in the case of Hurricane Katrina, areas with higher measures of estimation uncertainty were those with a high percent of damaged housing.
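The contrast between the two measures can be illustrated with a short sketch. The exact formulas are not restated in this section, so the sketch assumes a population standard deviation and takes MAPD to be the mean absolute percent difference of each estimate from the mean of the four; both choices are assumptions, and the parish figures are hypothetical.

```python
from statistics import mean, pstdev


def spread_measures(estimates):
    """Return (standard deviation, MAPD) for one parish's set of estimates.

    MAPD here is the mean absolute percent difference from the mean of the
    estimates, which is an assumed definition.
    """
    center = mean(estimates)
    sd = pstdev(estimates)
    mapd = mean(abs(e - center) / center * 100 for e in estimates)
    return sd, mapd


# Four hypothetical estimates per parish; in both cases the estimates
# sit 500 persons above or below their mean.
large_parish = [199_500, 200_500, 199_500, 200_500]
small_parish = [9_500, 10_500, 9_500, 10_500]

sd_large, mapd_large = spread_measures(large_parish)
sd_small, mapd_small = spread_measures(small_parish)
print(sd_large, round(mapd_large, 2))
print(sd_small, round(mapd_small, 2))
```

With identical absolute gaps, the two parishes share the same standard deviation, while the percent measure is twenty times larger for the small parish. This is the sense in which absolute percent differences tend to be larger where populations are small.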

Indeed, standard deviations were largest for Orleans and Jefferson parishes—areas that had large pre-Katrina populations. But MAPDs were largest for St. Bernard, Orleans, and Plaquemines parishes (See Table 4). St. Bernard and Plaquemines had relatively small populations and large MAPDs are not unexpected. But the MAPD for Orleans Parish is particularly striking. The Orleans Parish standard deviation is large because of its large population, but its MAPD is large despite its large population. These measures of difference suggest that uncertainty surrounding population estimates for large, heavily damaged parishes is high and these estimates should be used with caution.

Table 4 July 2005 population estimates (U.S. Census Bureau), HUD housing damage estimates (U.S. Department of Housing and Urban Development 2006), July 2006 population estimates based on interpolated public school enrollment and USPS-counts of active residences, Louisiana Public Health Institute (LPHI) estimates, and Census Bureau estimates, with standard deviation and MAPD of mid-year estimates for select Louisiana parishes

Discussion and implications

Estimating population based on available administrative data is difficult following a disaster. Although the literature is not definitive about what data sets are most appropriate to use after a catastrophe, it does indicate the limitations of each to serve in this role and as such provides important guidance.

In the first month following a catastrophic disaster, the only relevant data will likely come from disaster-specific data sources. Preliminary estimates of total displacement and population remaining in damaged areas can be developed within the first 30 days after the disaster by applying an average person per household to estimates of damaged and undamaged housing. If displacement is significant, it would be helpful if FEMA published county-level aggregate counts based on the mailing addresses of evacuated households applying for disaster assistance as an indicator of the location of displaced populations. A cross tabulation of county of origin by county of destination could be used with an estimated average person per household to generate rough estimates of displaced and remaining population by county for a few months following the disaster.
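As a rough illustration of the preliminary calculation, the sketch below applies a single persons-per-household figure to counts of damaged and undamaged occupied housing units. It assumes that every damaged unit has been vacated and every undamaged unit remains occupied, and it ignores group quarters; these simplifications, the function name, and the figures are illustrative only.

```python
def preliminary_estimates(undamaged_units, damaged_units, persons_per_household):
    """Return (population remaining, population displaced) for one county."""
    remaining = undamaged_units * persons_per_household
    displaced = damaged_units * persons_per_household
    return remaining, displaced


# Hypothetical county: 30,000 undamaged and 10,000 damaged occupied units.
remaining, displaced = preliminary_estimates(30_000, 10_000, 2.5)
print(remaining, displaced)  # 75000.0 25000.0
```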

But these data sets will lose relevancy within a few months of the disaster. FEMA data on displaced households receiving benefits, if available, would continue to be useful to authorities who need some estimate of the size and location of displaced households, but should not be used as a symptomatic indicator of all displaced households once some displaced families are no longer receiving FEMA benefits. Once disaster-specific data are no longer relevant, administrative data that are collected under normal circumstances may become useful, but one should be aware of the soundness of these data sets as a basis for population estimates, as well as the impacts of the disaster on these data sets. Table 5 provides a summary of findings regarding these data sets’ soundness, consistency, relevance, timeliness, and accessibility.

Table 5 Assessment of administrative data collected under normal circumstances against criteria for appropriateness as the basis for post-disaster population estimates

Traffic volume and voter participation data do not provide a sound basis for population estimates because the assumption that these variables change in direct relation to total population is rarely true. Drivers’ license and registered passenger car data are also problematic because the consistency of these data may be impacted by the disaster, and their relationship to total population may be altered due to demographic shifts associated with the disaster. USPS change-of-address data cover only those households that voluntarily submit new address information to the post office and, therefore, do not provide a sound basis for population estimates using a ratio method. Voter registration data may be more stable relative to total population, but only if this data source is consistently maintained, which is less likely in a post-catastrophe situation. Electric account data may change more directly with households, but this will be useful only in damaged areas. In receiving counties, an updated estimate of persons per household and group quarters population will be necessary to generate sound population estimates using the Housing Unit Method. But publicly available electric account data are not sufficiently relevant or timely. If electric account data are acquired directly from electric companies and used to develop estimates for damaged counties, users should be aware of the known limitations in these data when using them to estimate households. Master meters should be identified and the number of housing units associated with each determined, and the number of trailers in each FEMA trailer park should be individually tabulated.

USPS-counts of active residences and public school enrollment may be the best options for generating total population estimates for counties each month because they are freely available on a frequent basis, and they may be more stable than other data sets relative to total population. Still these data sets may be impacted by the disaster itself or by demographic changes in damaged counties.
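One simple form a ratio method can take is sketched below. It assumes that the pre-disaster ratio of total population to the indicator (active residences or enrolled students) still holds in later months; the function name and all figures are hypothetical, and the sketch is not a restatement of the calculations behind the tables.

```python
def ratio_estimate(base_population, base_indicator, current_indicator):
    """Scale a pre-disaster population by the change in an administrative indicator."""
    if base_indicator <= 0:
        raise ValueError("base_indicator must be positive")
    return base_population * current_indicator / base_indicator


# Hypothetical county: 450,000 residents and 200,000 active residences before the disaster.
monthly_active_residences = {"2006-01": 80_000, "2006-02": 90_000, "2006-03": 100_000}

monthly_population = {
    month: ratio_estimate(450_000, 200_000, count)
    for month, count in monthly_active_residences.items()
}
print(monthly_population)
# {'2006-01': 180000.0, '2006-02': 202500.0, '2006-03': 225000.0}
```

Because the ratio is held fixed, any change in persons per household, or in the share of the population with children in public school, passes straight through as error. This is the concern raised above about demographic change in damaged counties.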

Like electric account data, USPS data may not fully reflect population increases in counties that receive an influx of displaced population. In highly damaged counties, current counts of active residences may not be available while the post office itself recovers. Once postal service is resumed, these data may be appropriate for generating population estimates but users must be aware of the limitations of the data. Drop stops should be identified and the number of housing units associated with each tabulated and factored into counts of households if possible.

For moderately to highly damaged counties, school enrollment data are not sound because demographic changes and school capacity may impact the ratio of school enrollment to total population. School enrollment data may be sound as a basis for generating frequent population estimates for receiving counties. In order to ensure the relevance of these data, users may have to advocate for state authorities to increase reporting to monthly intervals.

Conclusions and recommendations

These findings suggest that the appropriate administrative data sets to be used as a basis for monthly population estimates may differ by point in time relative to the disaster and by the impact of the disaster on each county (See Table 6). This analysis leaves an obvious gap for counties with catastrophic damage when FEMA data have lost currency and the local post office has not fully recovered. It is during this time frame that it may be necessary and useful to conduct sample surveys to generate county-level estimates. Although sample surveys are not typically used for generating total population estimates in the U.S., a basis exists for the generation of such estimates in the international and post-disaster literature (Smith 1996; Grais et al 2006; Henderson 2006). In addition, these surveys could be leveraged to gather essential demographic and health data to further officials’ understanding of the returned population in the most damaged areas, and questions could be tailored for heavily damaged and receiving parishes to identify important migration patterns (Hori and Bowman 2007).

Table 6 Most appropriate administrative data for generating timely post-disaster, county-level population estimates, by time frame relative to disaster, and impact of disaster on county

Additional research is needed on the performance of monthly population estimates using basic ratio methods based on USPS-counts and school enrollment data after catastrophic events. The 2010 census will provide the next indication of the accuracy of post-disaster population estimates, but by then the most severe impacts of Hurricane Katrina will have dissipated. Population displacement from Hurricane Ike or future disasters will provide additional opportunities to compare population estimates based on these administrative data sets against decennial census counts and indicate how these data sets can be used as the basis for more accurate estimates of displacement and resettlement following a disaster. In the meantime, this research can serve as a reference for authorities and researchers using administrative data to estimate total county population size for damaged counties and for counties receiving displaced populations in the event of future disasters.

However, basing population estimates on differing data sets at different points in time and for different regions may introduce differing biases into the population estimates. Caution should be taken in analyzing trends in population estimates based on different administrative data sets and methods. Finally, measures of difference among population estimates suggest that users of these estimates should exercise caution with any estimate for heavily damaged, larger counties. Sensitivity analyses are advised when using population estimates for planning and research in heavily damaged counties.

The Census Bureau can build on this research by codifying recommendations to local authorities for developing a standard set of monthly post-disaster population estimates that are consistent with the “Basic Underlying Principles” (such as integrity, parsimony, and transparency) of their Intercensal Population Estimates and Projections Program (Swanson 2006). In addition, high level authorities established to oversee recovery, such as the Federal Coordinator for Gulf Coast Rebuilding, could work to break down accessibility barriers to key data such as FEMA’s Individuals and Households Program data or enlist FERC to negotiate with individual electric companies to obtain monthly reports by county.

Finally, it should be noted that these efforts will only enhance the ability to make sound estimates of total population based on administrative data. Following Hurricane Katrina, decision-makers at all levels of government expected accurate counts and demographic data on the population resettling in damaged counties. To provide this, the federal government would need to set aside appropriations for the Census Bureau to conduct special population counts for disaster impacted counties.