Addressing Unintentional Exclusion of Vulnerable and Mobile Households in Traditional Surveys in Kathmandu, Dhaka, and Hanoi: a Mixed-Methods Feasibility Study

The methods used in low- and middle-income countries’ (LMICs) household surveys have not changed in four decades; however, LMIC societies have changed substantially and now face unprecedented rates of urbanization and urbanization of poverty. This mismatch may result in unintentional exclusion of vulnerable and mobile urban populations. We compare three survey method innovations with standard survey methods in Kathmandu, Dhaka, and Hanoi and summarize feasibility of our innovative methods in terms of time, cost, skill requirements, and experiences. We used descriptive statistics and regression techniques to compare respondent characteristics in samples drawn with innovative versus standard survey designs and household definitions, adjusting for sample probability weights and clustering. Feasibility of innovative methods was evaluated using a thematic framework analysis of focus group discussions with survey field staff, and via survey planner budgets. We found that a common household definition excluded single adults (46.9%) and migrant-headed households (6.7%), as well as non-married (8.5%), unemployed (10.5%), disabled (9.3%), and studying adults (14.3%). Further, standard two-stage sampling resulted in fewer single adult and non-family households than an innovative area-microcensus design; however, two-stage sampling resulted in more tent and shack dwellers. Our survey innovations provided good value for money, and field staff experiences were neutral or positive. Staff recommended streamlining field tools and pairing technical and survey content experts during fieldwork. This evidence of exclusion of vulnerable and mobile urban populations in LMIC household surveys is deeply concerning and underscores the need to modernize survey methods and practices. Electronic supplementary material The online version of this article (10.1007/s11524-020-00485-z) contains supplementary material, which is available to authorized users.


Introduction
In low- and middle-income countries (LMICs), household survey methods have remained consistent, while population trends have changed substantially over 40 years. This mismatch has likely increased exclusion of vulnerable and mobile populations from survey data. LMIC survey best practices were established when LMICs were majority rural by agencies that have been critiqued for holding a "sedentary bias" in development initiatives [1,2]. Globally, human mobility has increased substantially over the last two decades, and today, most LMICs are in the midst of urban transitions, or will be soon [3]. An estimated 2.5 billion people will be added to the planet by 2050, with 90% of that population increase concentrated in Asian and African cities alone [4]. While rates of urban growth in LMIC cities are consistent with rates previously observed in high-income countries, the number of people added to LMIC cities today creates unprecedented scenarios of urbanization. For example, Lagos, Nigeria; Delhi, India; and Dhaka, Bangladesh are each expected to add more than 700,000 people per year through 2030 [4].
Rapid in-migration to LMIC cities is accompanied by increased socioeconomic inequalities, growth in slum populations, and housing crises, all of which contribute to increasingly complex living arrangements [5,6]. As urbanization changes the structure and nature of communities and households in LMICs [7], survey methods must evolve in response. To date, most surveys about slum communities are conducted as one-off exercises and focus on a selection of slums in a city [8,9]. A few national surveys have explicitly sampled and reported about slum dwellers in all urban areas (e.g., the 2013 Bangladesh Urban Health Survey [10]) or select cities (e.g., 2015-16 India National Family Health Survey [11] in eight cities).
The largest survey programs in LMICs include the Demographic and Health Surveys (DHS), Multiple Indicator Cluster Surveys (MICS), and Living Standard Measurement Surveys (LSMS), which essentially use the same methods and tools [12]. Collectively, these programs have performed nearly 700 national surveys in more than 130 countries since 1980. Across these surveys, census enumeration areas (EAs) are sampled with probability proportional to population size (PPS), households in selected EAs (i.e., clusters, primary sampling units) are mapped and listed, approximately 20 households are sampled in each cluster, and interviewers return later to administer questionnaires to selected households [13][14][15]. Among DHS surveys conducted since 2000, the average sample frame was 7 years old (up to 30 years old), and 94% of surveys used the previous census as a sample frame, while the remaining 6% used an official list of areas or households [16]. By relying on census sample frames, unregistered and special populations excluded from the standard census are intentionally omitted from surveys including the homeless, internally displaced people, refugees, informal slum dwellers, nomadic populations, and institutional populations [6,17].
Unintentional exclusion of vulnerable and mobile populations, particularly slum dwellers, can additionally occur in three ways. First, if structures built and occupied since the last census are over-represented in deprived areas, vulnerable and mobile populations are systematically under-represented in the first-stage sample frame. Second, two-stage sample designs result in a gap of several months between the mapping-listing and interview activities, resulting in systematic nonresponse from vulnerable and mobile populations not present at time of interview, and exclusion of recently occupied dwellings (living spaces). Third, disproportionate exclusion of vulnerable and mobile populations can result from poorly defined or difficult to operationalize mapping-listing protocols in the time allotted for fieldwork, for example, assuming that one household occupies each dwelling. In this case, systematic under-listing of vulnerable and mobile households who share a dwelling results in their exclusion during the second stage of sampling [18].
These three issues are labeled coverage error, nonresponse error, and sampling error, respectively, in the total survey error framework, and threaten to bias survey results [19]. Additional measures of survey data relevance are of concern. Given the use of survey results by decision-makers to make inferences about the general population, intentional omission of the homeless, displaced populations, informal settlers, and others due to use of census sample frames threatens relevance of survey results, particularly with respect to social and economic indicators [19]. Furthermore, without maps of deprived/non-deprived urban areas [20], the survey results of the urban poorest are masked, or hidden, in aggregated urban averages resulting in limited relevance of survey results for decision-making [19].
In recent years, national surveys that developed field-referenced slum/non-slum urban sample frames in Bangladesh [10] and India [11] found stark inequalities in health outcomes, access to health care, living conditions, and livelihood opportunities between slum and non-slum residents. A comparison of stratified slum/non-slum surveys with routine national surveys in Bangladesh, India, Kenya, and Egypt points to conditions of the urban poorest being masked in urban averages, under-sampling of slum populations in non-stratified urban samples, or both [21]. These analyses follow years of work to highlight the absence of data about the urban poorest in censuses and surveys [8,22]. While there are multiple other sources of slum population data in select communities, districts, or cities from single cross-sectional surveys [9][10][11], qualitative studies [23], community-based initiatives [24], and the INDEPTH longitudinal Demographic and Health Surveillance System [25], representative and routine measurement of populations in slums and other deprived areas via national surveys has yet to be achieved [20]. Crucially, national surveys are used to measure progress against one-fourth of the Sustainable Development Goal (SDG) indicators [26]. If current survey methods systematically under-represent and mask vulnerable and mobile urban populations, our understanding of progress toward the SDGs is fundamentally flawed.
To address problems of unintentional exclusion of vulnerable and mobile households in surveys, the Surveys for Urban Equity (SUE) project piloted and evaluated three survey innovations in Kathmandu, Dhaka, and Hanoi: (1) use of modeled gridded population data as a sample frame, which was assumed to be more current and have better coverage of the entire population than the census, (2) area-microcensus sample design to remove the time-lag between mapping-listing and interviewing, and (3) mapper-lister protocols including a script, OpenStreetMap and OpenDataKit tools, and a broadened household definition to identify atypical dwellings and households. We were not able to obtain maps of deprived/non-deprived areas to stratify the surveys to address problems of relevance. Here, we present results of the pilot including the extent to which populations were unintentionally excluded from a standard survey design. Further, we evaluate the feasibility, cost, and skills required to implement our novel methods in complex urban settings.

Methods
We evaluated whether three survey innovations resulted in samples of different types of households and individuals compared with standard surveys. To establish feasibility of the innovations, we recorded costs and team skills required, and conducted focus group discussions (FGDs) to explore enumerator experiences.

Setting
We selected Kathmandu, Nepal; Dhaka, Bangladesh; and Hanoi, Vietnam, as they typify different points on the urbanization trajectory. The pace of growth in South Asia has particularly strained urban housing markets, increasing the number of people living in atypical arrangements and locations [3]. While some poorer households live in informal settlements, others live in economically heterogeneous neighborhoods [3]. In Kathmandu and Dhaka, for example, it is common for the building owner to occupy the top floor, rent the middle floor to a middle-class family, and rent the bottom floor to multiple low-wage workers. In Vietnam, old, cramped buildings continue to house the economically and socially vulnerable, while migrant laborers live in multiple-occupancy inadequate structures near work [27]. We sampled the entire Kathmandu Valley and purposefully chose to survey a slum and an economically mixed ward in Dhaka and an economically mixed district with a large migrant population in Hanoi. The Hanoi survey occurred soon after a government campaign to evict illegal occupants.

Study Design and Protocol
In 2017 and 2018, we conducted three cross-sectional household surveys in Kathmandu, Dhaka, and Hanoi [28].
Coverage Area The survey in Kathmandu was of the general population, while the surveys in Dhaka and Hanoi focused on areas where vulnerable and mobile populations were likely located. Nepal's government is in transition to a new federal republic system, and administrative boundaries were recently updated. Old Kathmandu municipality boundaries only included the city center, while new municipality boundaries included rural communities beyond the peri-urban reach [29]. To ensure coverage of the functional city, we used the Global Human Settlement (GHS) layer of 1 × 1 km grid cells defining "high dense urban" areas (Fig. 1). In Dhaka, the survey covered one ward and one slum community, and in Hanoi, the survey covered one district (Fig. 1).
Sample Size A cluster sample of 20 households was chosen for ease of fieldwork, and to be consistent with other routine surveys such as the DHS, MICS, and LSMS. The survey in Kathmandu targeted 1200 households in 60 clusters to estimate depression and injury prevalence with a maximum 95% confidence interval of ± 3.52% (assuming the most conservative scenario where an indicator is estimated at 50%) [28]. This assumes a design effect of 1.41 (the mean design effect across all indicators for men and women in urban Nepal in the 2011 DHS) [30], a household and an individual response rate of 0.98 and 0.93, respectively, and one eligible individual per household. The Dhaka and Hanoi surveys targeted 400 households in 20 clusters each, with dual aims of evaluating transferability of SUE innovations while providing sufficient sample size to estimate key demographic and poverty indicators ± 5% with 95% confidence for indicators estimated at 50%.
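The precision targets above rest on the usual normal-approximation formula for a proportion, with the sample deflated for expected nonresponse and for the design effect. The following Python sketch illustrates that generic formula only; it is not the authors' own calculation, and the function and argument names are hypothetical.

```python
import math

def ci_half_width(households, design_effect, hh_response, ind_response,
                  p=0.5, z=1.96):
    """Half-width of a normal-approximation CI for a proportion.

    Assumes one eligible individual per household, as stated in the text.
    """
    completed = households * hh_response * ind_response  # expected interviews
    effective_n = completed / design_effect              # deflate for clustering
    return z * math.sqrt(p * (1 - p) / effective_n)

# Kathmandu inputs as listed in the text: 1200 households, design effect 1.41,
# household and individual response rates of 0.98 and 0.93, indicator at 50%.
kathmandu_half_width = ci_half_width(1200, 1.41, 0.98, 0.93)
print(f"Kathmandu 95% CI half-width: {100 * kathmandu_half_width:.2f} percentage points")
```

Setting p = 0.5 maximizes p(1 - p), which is why an indicator estimated at 50% is the most conservative scenario.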
Back-up Clusters Given the chance of selecting areas without residential buildings (e.g., airport or factory buildings) from gridded population data and the possibility of selecting cells with no buildings, we selected 30% back-up clusters for each sample. This meant that we sampled 78 clusters in Nepal, and 26 clusters in Dhaka and Hanoi, before randomly assigning 60 (or 20) clusters to the main sample. If a sampled cluster had no residential buildings, then it was replaced with a randomly selected back-up cluster. Four additional back-up clusters were sampled in Hanoi after masking already selected clusters, because more than 6 clusters were dropped.
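The back-up procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: the cluster IDs are stand-in integers, the helper names are hypothetical, and the residential check is a placeholder for the satellite-imagery and field review described in the text.

```python
import math
import random

def backup_count(n_main, backup_percent=30):
    """Number of back-up clusters for a given main sample size."""
    return math.ceil(n_main * backup_percent / 100)

def assign_main_and_backup(sampled_ids, n_main, rng):
    """Randomly split the sampled clusters into main and back-up lists."""
    shuffled = list(sampled_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_main], shuffled[n_main:]

def replace_non_residential(main, backup, is_residential, rng):
    """Swap each non-residential main cluster for a random residential back-up."""
    backup = list(backup)
    rng.shuffle(backup)
    final = []
    for cluster in main:
        if is_residential(cluster):
            final.append(cluster)
            continue
        while backup:  # draw back-ups until a residential one is found
            candidate = backup.pop()
            if is_residential(candidate):
                final.append(candidate)
                break
    return final, backup

rng = random.Random(0)
n_main = 60
sampled = list(range(n_main + backup_count(n_main)))  # stand-in for 78 sampled clusters
main, backup = assign_main_and_backup(sampled, n_main, rng)
empty = set(main[:9])  # hypothetical: nine main clusters have no residences
final, unused = replace_non_residential(main, backup, lambda c: c not in empty, rng)
print(len(final), "clusters in the final sample;", len(unused), "back-ups unused")
```

With 60 main clusters the helper yields 18 back-ups (78 sampled in total), and with 20 main clusters it yields 6 (26 in total), matching the counts reported above.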
Sample Design Area-microcensus sampling (akin to compact segment sampling [31,32]) means that all households in a cluster are sampled, allowing the household listing and interviews to occur on the same day. Area-microcensus sampling also allows inclusion of populations typically omitted from surveys by design. In concept, area-microcensuses can be performed in clusters of any size, though in practice, smaller clusters are preferred to reduce intra-cluster correlation [33]. Furthermore, area-microcensus sampling can be performed after multiple stages of sampling, which is a common practice in surveys that use a gridded population sample frame [33]. In this study, all area-microcensuses occurred after a single stage of sampling. In Kathmandu, we randomized half of the clusters to an area-microcensus arm and the other half to a two-stage arm to compare survey designs, and treated the arms as strata (Table 1). In Dhaka, we used an area-microcensus design, stratified by ward/community with proportional allocation. The Hanoi survey followed an area-microcensus design and was not stratified.

Sample Frame
We used WorldPop gridded population estimates as sample frames rather than older censuses. At the time of planning, the last censuses in Nepal (2011), Bangladesh (2011), and Vietnam (2009) were seven or more years old [34]. WorldPop is modeled with a machine learning approach that disaggregates UN-adjusted population counts from administrative areas to approximately 100 × 100 m grid cells based on dozens of recently collected spatial covariates derived from satellite imagery and GIS data [35]. This means that total population counts, and the spatial distribution of these populations, are likely more accurate than the last census. The small size of grid cells enables area-microcensus sampling. The Kathmandu sample was drawn from 2017 WorldPop estimates, while the Dhaka and Hanoi surveys were drawn from 2020 WorldPop estimates produced in 2017 and 2013, respectively (Table 1) [34].
Sample Selection At the time of survey, the GridSample R package was the only publicly available tool to perform PPS sampling from gridded population data [36].
The algorithm allows aggregation of population estimates to larger cells (e.g., 200 × 200 m) and selection with PPS. Users can optionally "grow" non-overlapping clusters to a minimum population by randomly adding neighboring cells to selected "seed" cells. This is not ideal, as sampling units should be formed before sampling; however, gridded population sampling tools with this capability were only recently developed [37]. We used the population in the "grown" sampling unit for sample weight calculations following the logic that a frame of "grown" sampling units is implied in the sample weights calculation (Appendix) [36]. Theoretically an adaptive sample weight could be calculated [38]; however, the number of terms required for all combinations of potential cells that could be covered by the "growth" algorithm approaches infinity. In the Kathmandu two-stage sample, households were systematically sampled in Excel following standard methods [13,14,39].
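The two steps described above, PPS selection of "seed" cells followed by "growing" each seed to a minimum population, can be illustrated with a short Python sketch. This is not the GridSample implementation: it assumes systematic PPS selection, a toy 10 × 10 grid with made-up populations, and growth through the four edge-adjacent neighbors only, and all function names are hypothetical.

```python
import random

def pps_systematic(populations, n, rng):
    """Systematic PPS: walk the cumulative population with a fixed interval."""
    total = sum(populations.values())
    interval = total / n
    start = rng.random() * interval
    selected, cumulative, k = [], 0, 0
    for cell, pop in populations.items():
        cumulative += pop
        # A cell is hit once for every selection point that falls inside it.
        while k < n and start + k * interval < cumulative:
            selected.append(cell)
            k += 1
    return selected

def grow_cluster(seed_cell, populations, min_pop, taken, rng):
    """Add random free neighboring cells until the cluster reaches min_pop."""
    cluster, pop = [seed_cell], populations[seed_cell]
    taken.add(seed_cell)
    while pop < min_pop:
        frontier = sorted({
            (r + dr, c + dc)
            for r, c in cluster
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if (r + dr, c + dc) in populations and (r + dr, c + dc) not in taken
        })
        if not frontier:  # no free neighbors left; stop below the minimum
            break
        nxt = rng.choice(frontier)
        cluster.append(nxt)
        taken.add(nxt)
        pop += populations[nxt]
    return cluster, pop

rng = random.Random(1)
# Toy frame: cell populations between 50 and 230 (illustrative values only).
grid = {(r, c): 50 + ((7 * r + 3 * c) % 10) * 20 for r in range(10) for c in range(10)}
seeds = pps_systematic(grid, 5, rng)
taken = set(seeds)  # reserve all seeds so clusters cannot absorb each other
clusters = [grow_cluster(s, grid, 820, taken, rng) for s in seeds]
for cells, pop in clusters:
    print(len(cells), "cells, population", pop)
```

Because the clusters are formed after the seeds are drawn, a cell's chance of ending up in a sampled cluster depends on its neighbors as well as on its own population, which is the weighting complication noted above.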
Cell Size In Kathmandu, all clusters were initially sampled from 100 × 100 m cells and "grown" to a minimum of 820 people (approximately 200 households) (Table 1). Among these 60 selected clusters, half were randomized to the area-microcensus arm and given the boundary of the original 100 × 100 m "seed" cell (Fig. 1). In Dhaka, the sample frame comprised 100 × 100 m cells, and in Hanoi, the sample frame comprised 200 × 200 m cells (Fig. 1). The optimum cell size for each survey was determined using satellite imagery (SUE training manual [39]).
Pre-field Review and Segmentation We visualized each cluster boundary over satellite imagery in ArcGIS before producing field maps and manually segmented clusters that clearly exceeded 200 (two-stage) or 20 (area-microcensus) households. Segment boundaries followed roads and property fences and had approximately equal populations; then, one segment was selected at random to represent the cluster (Fig. 1).

Mapping-Listing Protocols
The mapping-listing trainings each lasted one week and involved lectures, role-play, group discussion, and a field test. Before fieldwork, mappers-listers updated buildings, roads, and pathways in each cluster in OpenStreetMap using the iD editor tool [40]. In ArcGIS, the survey planning teams used the updated OpenStreetMap layer and cluster boundaries to create a geographically accurate map for each cluster (Fig. 1) [41]. In the field, mappers-listers noted changes on the paper map, followed a script to approach residents, and upon request, distributed a written description of the survey. The household listing was collected in GeoODK, an OpenDataKit-based application [42], for all buildings within the cluster or intersected by its boundary. Mappers-listers commuted from home to assigned nearby clusters using a provided stipend. Daily, they submitted listing records and an image of the field map, and periodically they visited the office to debrief and update OpenStreetMap with changes noted on paper maps.
Post-field Segmentation (Area-Microcensus) To ensure that interviewers would find approximately 20 households in each area-microcensus cluster, any such cluster with more than 25 dwellings was segmented manually in ArcGIS by a GIS specialist and the survey coordinator after mapping-listing fieldwork, ensuring equal numbers of dwellings in each segment [39].

Household Definitions
The DHS and MICS define household members as (i) usual residents or people who slept in the dwelling the previous night and who (ii) share living arrangements and (iii) share food [13,14]. The LSMS defines household members as (i) people who slept in the dwelling for three or more of the last 12 months and (ii) share food [15]. By all DHS, MICS, and LSMS definitions, households in both residential and commercial buildings should be included [13][14][15], guards and servants are subsumed into the household of their employment [13][14][15], and seasonal and migrant populations are usually excluded by design [43]. The SUE household definition was broader and simply included all self-reported usual residents. The SUE definition additionally included hostel-dwellers and long-term occupants of guesthouses (defined as last 7+ consecutive days and working, looking for work, or in the city for another purpose such as supporting someone in hospital), and street-sleepers who slept in the cluster the previous night. Servants (and their families) who lived at the employer's residence were counted as a separate household [39].
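To make the contrast between definitions concrete, the Python sketch below encodes the core membership rules as simple predicates. It is a simplified illustration, not the classification code used in the study: the record fields are hypothetical, and the SUE rule here covers only self-reported usual residents, omitting the hostel, guesthouse, and street-sleeper provisions.

```python
from dataclasses import dataclass

@dataclass
class Resident:
    usual_resident: bool      # self-reported usual resident of the dwelling
    slept_last_night: bool    # slept in the dwelling the previous night
    shares_living: bool       # shares living arrangements with the household
    shares_food: bool         # shares food with the household
    months_in_dwelling: int   # months slept in the dwelling in the last 12

def meets_dhs_mics(p: Resident) -> bool:
    """(Usual resident or slept there last night) and shares living and food."""
    return (p.usual_resident or p.slept_last_night) and p.shares_living and p.shares_food

def meets_lsms(p: Resident) -> bool:
    """Slept in the dwelling for 3+ of the last 12 months and shares food."""
    return p.months_in_dwelling >= 3 and p.shares_food

def meets_sue(p: Resident) -> bool:
    """Simplified SUE rule: any self-reported usual resident."""
    return p.usual_resident

# Hypothetical lodger who rents a room and cooks separately.
lodger = Resident(True, True, False, False, 8)
# Hypothetical recent migrant who joined relatives one month ago.
newcomer = Resident(True, True, True, True, 1)

for name, person in (("lodger", lodger), ("newcomer", newcomer)):
    print(name, meets_dhs_mics(person), meets_lsms(person), meets_sue(person))
```

In this sketch the lodger is counted only under the SUE rule, and the newcomer is counted under the DHS/MICS and SUE rules but not under the LSMS rule.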
Interview Protocols In the Kathmandu two-stage arm, geospatial specialists mapped and listed households, while public health specialists conducted interviews with sampled households later (Table 1). In Kathmandu and Dhaka's area-microcensus samples, geospatial experts mapped and listed dwellings, and the household listing was performed by interviewers on the day of interview. Due to time constraints in Hanoi, mapping, listing, and interviews were wrapped into one activity and conducted by public health specialists. This meant that maps used by interviewers in Kathmandu and Dhaka were field-verified, while in Hanoi, maps had only been updated during pre-field enumeration using satellite imagery. In all three surveys, the SUE household definition was used to determine eligibility, and respondents provided written informed consent and were 18+ years of age and usually a senior household member. The interviewers read questions and recorded responses on a tablet in GeoODK. The household questionnaire collected demographics, assets, income/savings/expenditures, social capital, migration, and injury information.
We also collected information about living arrangements, meals, and length of time at the dwelling to classify individuals and households that met DHS/MICS and LSMS definitions during analysis. One adult in each household was randomly selected using the Kish method to complete an individual questionnaire with mental health and migration questions [44].

Public Involvement
Members of the public, including survey respondents, were not involved in setting the research questions, outcome measures, design, or implementation of the study, nor the dissemination of study results.

Statistical Evaluation
Sample weights were calculated separately according to the SUE and DHS/MICS household definitions. We analyzed survey results in Stata 14.0 with svy commands, adjusting for sample weights and estimating Taylor-linearized variances to account for clustering of observations within clusters (and household definition in select analyses-see below). The analyses in Kathmandu were stratified by arm (area-microcensus/two-stage), and the analysis in Dhaka was stratified by community (ward/slum).
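For readers unfamiliar with linearized variance estimation, the Python sketch below shows the textbook calculation for a weighted proportion in a single stratum with clusters treated as sampled with replacement, together with the root design effect (DEFT). It is an illustration only and does not reproduce Stata's svy output, which also handles strata and other details; the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict

def weighted_proportion(y, w):
    """Weighted mean of a 0/1 outcome."""
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)

def linearized_variance(y, w, cluster):
    """Taylor-linearized variance of the weighted proportion (one stratum)."""
    p = weighted_proportion(y, w)
    total_w = sum(w)
    z = defaultdict(float)  # cluster totals of the linearized variable
    for yi, wi, ci in zip(y, w, cluster):
        z[ci] += wi * (yi - p) / total_w
    n_clusters = len(z)
    # The cluster totals sum to zero, so no centering term is needed.
    return n_clusters / (n_clusters - 1) * sum(v * v for v in z.values())

def deft(y, w, cluster):
    """Root design effect relative to a simple random sample of the same size."""
    p = weighted_proportion(y, w)
    var_srs = p * (1 - p) / len(y)
    return math.sqrt(linearized_variance(y, w, cluster) / var_srs)

# Toy data: four clusters of five people, perfectly homogeneous within cluster.
y = [1] * 10 + [0] * 10
w = [1.0] * 20
cluster = [i // 5 for i in range(20)]
toy_deft = deft(y, w, cluster)
print(f"proportion = {weighted_proportion(y, w):.2f}, DEFT = {toy_deft:.2f}")
```

The toy clusters are deliberately extreme: when everyone in a cluster gives the same answer, each cluster contributes roughly one observation's worth of information, so the DEFT is far above 1.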
In the area-microcensus samples in all cities, we evaluated whether use of the DHS/MICS household definition resulted in different estimates of individual and household characteristics compared with use of the SUE household definition using percentages and logit regression at 5% alpha level with "exclusion from DHS/MICS" as the dependent variable and one characteristic as the independent variable. In these comparisons, the DHS/MICS households are a subset of the SUE households and thus treated in regressions as a matched pair by including "SUE vs. DHS/MICS ID" in the svyset statement as a second-stage cluster to correctly estimate variances and differences (p values). This approach with dichotomous variables is the survey analysis equivalent of the McNemar test for paired data [45]. In the Kathmandu sample, we also used percentages and logit regression to compare whether characteristics differed in the area-microcensus versus two-stage sample: first, holding the DHS/MICS household definition constant, and second, comparing two-stage-DHS/MICS with area-microcensus-SUE households. Because the households are from independent samples in this comparison, variance estimates (p values) adjusted only for the clustering of households within cluster. For every 20 comparisons, we would expect one comparison to be statistically significant by chance (type I error). With this in mind, our interpretation focuses on characteristics which were statistically significant, and for which a large percentage and number of people were excluded.
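The "one in 20" expectation follows directly from the 5% alpha level. The short Python check below makes the arithmetic explicit; the probability of at least one false positive additionally assumes that the comparisons are independent and that no true differences exist, which is a simplification and not a claim about these data.

```python
alpha = 0.05        # significance level used for each comparison
comparisons = 20    # number of comparisons considered

# Expected number of comparisons significant by chance alone.
expected_false_positives = alpha * comparisons

# Chance of at least one false positive, assuming independent comparisons.
p_at_least_one = 1 - (1 - alpha) ** comparisons

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one): {p_at_least_one:.2f}")
```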
Household characteristics included building type, member configuration, migration status of household head, slum household, and urban poverty index (UPI) [46]. Individual characteristics included age-gender groups, employment status, marital status, and highest level of education. A reference group was selected for each variable to make statistical comparisons, and observations were dropped if they lacked data to determine household definition eligibility.
Days worked by each staff member and costs were recorded by the survey coordinator in each city. Time spent by survey coordinators to develop and learn the novel methods was excluded from cost calculations. However, time spent training mappers-listers and interviewers was included. In Kathmandu, we estimated costs for the area-microcensus and two-stage arms separately by holding constant costs of administration, training, and durable goods, and varying days of fieldwork.

Qualitative Evaluation
An FGD was held with each of the mapping-listing teams using the same guide covering topics of OpenStreetMap enumeration, mapping-listing, and workflow. Additional questions exploring differences in area-microcensus and two-stage clusters were included in the Kathmandu FGD. FGDs were facilitated and audio-recorded by two trained qualitative researchers and conducted in the local language. The recordings were transcribed into the local language and then translated into English. We performed a thematic Framework Analysis in NVivo 11, coding every line by theme and summarizing positive/neutral experiences, challenges, and recommendations [47].

Results
In Kathmandu, 15% of clusters were dropped and replaced. No clusters were dropped in the targeted areas of Dhaka, and 45% were dropped and replaced in the Hanoi district (Table 1). Due to high density in Dhaka, and larger clusters in Hanoi, nearly all clusters in those cities required segmentation to achieve 20 households per cluster (Table 1). Household response rates were 96.8% in the Kathmandu two-stage arm, 88.3% in the Kathmandu area-microcensus arm, 98.7% in Dhaka, and 82.7% in Hanoi (Table 1). The treatment of survey arms as strata in the Kathmandu sample meant that weights were larger in the two-stage arm because clusters comprised larger populations (mean: 1.673, range: 0.298-5.524) than in the area-microcensus arm (mean: 0.347, range: 0.157-0.985) (Table 1). The root design effects (DEFTs) for key demographic and socioeconomic outcomes were larger in area-microcensus units for demographic indicators, but smaller in area-microcensus units for slum household, UPI, migrant status, and education indicators (Table 1).

Unintentional Exclusion due to Sample Design
Applying the DHS/MICS household definition, we compared area-microcensus and two-stage samples in Kathmandu to understand how sample design might influence types of respondents (Table 3). We found average household size was smaller in the area-microcensus sample, but dwellings had more occupants (household: 3.5 vs. 3.9, dwelling: 5.0 vs. 3.9) (Table 3). Further, the area-microcensus design had more non-family households (6.0% vs. 1.9%), but the two-stage design included more shack and tent dwellers (0.7% vs. 3.8%) (Table 3).

Unintentional Exclusion due to Sample Design and Household Definition
Building off the previous analysis, we compared the area-microcensus sample with SUE definition and the two-stage sample with DHS/MICS definition in Kathmandu to understand the combined effects of survey design and household definition. In the area-microcensus-SUE sample, there were more single adult (10.4% vs. 4.5%) and non-family households (6.0% vs. 1.9%), plus inclusion of hostel-dwellers (3.8%), street-

Time and Cost
In Kathmandu, the area-microcensus gridded population survey arm with a target of 600 households in 30 clusters cost approximately US$26,769, or US$45 per household, while a comparable two-stage survey cost approximately US$35,284, or US$59 per household. Area-microcensus survey costs per household in Dhaka (US$34) and Hanoi (US$76) differed due to cost of living and limited economy of scale in those smaller samples. The main cost difference between Kathmandu's survey arms was the mapping-listing activity; costs were 2.5 times greater in the two-stage arm due to larger clusters (Table 4).
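The per-household figures for Kathmandu follow from dividing each arm's total cost by the 600 targeted households. A minimal Python check, using only the totals reported above (the variable names are illustrative):

```python
# Total costs (US$) for the two Kathmandu arms, as reported in the text.
arm_costs = {"area-microcensus": 26769, "two-stage": 35284}
target_households = 600  # targeted households per arm

cost_per_household = {
    arm: total / target_households for arm, total in arm_costs.items()
}
for arm, cost in cost_per_household.items():
    print(f"{arm}: US${cost:.0f} per household")

# Relative cost of the two-stage arm compared with the area-microcensus arm.
cost_ratio = arm_costs["two-stage"] / arm_costs["area-microcensus"]
print(f"two-stage / area-microcensus total cost: {cost_ratio:.2f}")
```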

Skill Mix
The skills required to plan and implement SUE surveys were similar to those for standard household surveys. The main difference was the skillset of the mapping-listing team. In a standard survey, mapping-listing staff are required to have a secondary education [48]. To use SUE tools and methods, the mapping-listing staff should additionally have training in geography, GIS, or related fieldwork and be comfortable using mobile technologies for data collection and navigation. The skillsets of other staff including survey planners, trainers, and interviewers were identical to those for a standard household survey. The GridSample R package required intermediate R programming and GIS skills; however, a free point-and-click tool called gridsample.org is now available, allowing non-technical design and implementation of gridded population surveys.

Experiences
Feedback from the mapper-lister FGDs was generally neutral or positive, and staff resoundingly said they would prefer SUE tools and protocols to a conventional paper-based protocol. The SUE survey fieldwork, however, was not without limitations.
Key Challenges In Kathmandu, the mapping-listing staff were university geospatial students. Several described approaching residents as their greatest challenge, as well as their greatest reward. One mapper-lister explained, "It was fun to work at the social level and interacting with the local people. We always used to be limited to using the computers before." Mappers-listers added that role-play and practical activities prepared them for fieldwork, though additional training on the survey aims would have helped them explain the survey's purpose to residents. In Kathmandu, mapping-listing staff initially enumerated 20-30 households daily, and this increased to 40-50 households daily after a week. The challenges in Dhaka and Hanoi were different. In these cities, the survey planners were trained in SUE tools and protocols but did not have field experience before training mappers-listers and interviewers. As a result, mapping-listing staff, including the geospatial students in Dhaka, described challenges using the tablet applications during the first days of fieldwork. In Hanoi, where public health experts performed mapping, listing, and interviews, staff additionally struggled with navigation. Due to community skepticism following recent government evictions in Hanoi, teams enlisted local guides to help approach residents and introduce the survey.
Across cities, mappers-listers described working in pairs as essential because it provided them with "mutual support" to adapt to the moods and reactions of residents, interact in more languages, and work faster with more accuracy. Overwhelmingly, mappers-listers recommended that teams comprise one geospatial and one public health specialist.
Response Rates In all three cities, mapping-listing staff reported that residents seemed to omit mention of neighbors who did not have official mortgages or rental contracts, presumably for fear of evictions or fines. This was a particular challenge in Hanoi where "people tended to answer our question following their household record book," an official registry of households administered by the government. One mapper-lister-interviewer explained, "for residents who were living in evacuated houses, they felt worry and scare as if something wrong could happen." In Hanoi, teams returned to each cluster multiple times to build trust with residents and identify households not reported during previous visits. While the presence of guides likely improved response rates, it also meant that survey teams were limited by guides' schedules. Most teams performed the listing and interviews in the evenings when guides were home, though this meant that residents were eating dinner and rushed, or refused. Mappers-listers and interviewers in Kathmandu and Dhaka performed their work during the day.
Residential building access was a problem across cities. The Hanoi teams faced secured apartment buildings without a guard. In these situations, the planning team contacted the building management boards and were usually able to gain access to these buildings; however once inside, mappers-listers-interviewers often found that residents knew little about their absent neighbors. Kathmandu had wealthy "VIP" neighborhoods and mapping-listing staff reported substantial skepticism and non-response in these neighborhoods.
Travel
Mapping-listing staff commuted to clusters via bus, rickshaw, motorbike, and foot. In Kathmandu, most staff never traveled more than 1 h to a cluster; however, a team working in peri-urban Kathmandu spent 3 h commuting one way to one cluster due to the absence of buses or taxis. In Dhaka, where traffic is notoriously bad, commute times to clusters ranged from 1.5 to 3 h. Across the three cities, mapping-listing staff recommended hired vehicles to save time.
Area-Microcensus versus Two-Stage Clusters
Mappers-listers in Kathmandu reported different experiences in area-microcensus and two-stage clusters. The two-stage clusters were, by definition, ten times the size of area-microcensus clusters, resulting in extra days of work and more physical barriers to navigate, such as hills and rivers. In addition, the two-stage clusters required more information than area-microcensus clusters, resulting in longer interactions and higher levels of skepticism among residents.
Residents in Kathmandu were generally willing to report the number of apartments/dwellings per building; however, they were reluctant to specify the number of households per dwelling and to give household head names. In many two-stage clusters, teams approached a business owner on the ground level who gave the number of dwellings on the floors above but refused to give household-level information, and instead directed the mapping-listing staff to the building owner. One way that mappers-listers addressed this challenge was to approach people at a local grocery store and start a conversation away from their building. In this context, residents were less likely to feel they were speaking on behalf of the landlord.
Technology
Across sites, mapping-listing staff faced challenges with the tablet applications. While some challenges could have been averted with more, or better, training, other challenges were inherent to the tools and protocols used. First, although OpenStreetMap was updated by mappers-listers before visiting clusters, the various applications pulled those updates on different schedules, resulting in different versions of the same map in the field. Specifically, ArcGIS (from which field maps were printed), GeoODK (used to collect building GPS points during the listing), and OSMAnd and MAPS.ME (used for navigation) were updated 1-30 days after a change was made to OpenStreetMap.
A second problem was the number of unintegrated applications that the mapping-listing staff were expected to use, resulting in lost time and confusion. Despite having multiple navigation applications and a paper map, mappers-listers in all cities reported delays and difficulty navigating to clusters. Once in a cluster, however, mappers-listers did not report challenges identifying cluster boundaries, despite their blocky shapes. Mappers-listers also found that recording the listing data in GeoODK was arduous, and they often took notes on paper when speaking to residents and then entered the information into the tablet immediately afterward.
Third, the location precision within OSMAnd and GeoODK was poor, often showing a circle up to 36 m in which the tablet could be located. Location precision was a particular problem in high-density areas (presumably with tall buildings blocking or refracting signals) and resulted in a few instances of a mapping-listing team starting their work and then realizing that they were recording data one or two streets away from the cluster.

Discussion
By comparing DHS/MICS and SUE household definitions, and area-microcensus and two-stage sampling, we found evidence that standard household survey methods unintentionally omit single adults and non-family households, both of which are more likely to represent disjoined households or be mobile compared with stable nuclear family households [17,43,49]. This is among the first studies in an LMIC context to evaluate undercoverage due to survey design and methods in face-to-face surveys; such studies tend to be conducted in high-income countries [18,50].
Although the same protocols and household definitions were used to identify households in Kathmandu's area-microcensus and two-stage arms, the household listing appeared to be more thorough in area-microcensus clusters, where interviewers (rather than mappers-listers) listed households. Interviewers were more skilled at interacting with the public and had substantially more time at each building while administering questionnaires (2.5-3 h per household as opposed to 15 min per household) to build rapport with residents and learn about atypical and informal housing arrangements. Indicator design effects point to another possible benefit of the area-microcensus design. Although one might expect larger design effects in area-microcensus clusters because near neighbors are assumed to be more similar than far neighbors [31], the DEFTs for slum, migration, and education indicators in area-microcensus clusters were smaller than in two-stage clusters. This might indicate better coverage of the heterogeneous mix of urban residents and better identification of atypical and "hidden" households. Smaller design effects for similar indicators (less than primary education, willingness to take risks, and mental health status) were consistent with a similar study comparing area-microcensus with standard probability sampling in a South African city [32]. Others argue that standard household definitions are no longer suitable in complex LMIC cities; rather, individuals and communities are more appropriate units of measurement [5,49]. Further research is needed to evaluate potential trade-offs and benefits of moving the household listing responsibility to interviewers using area-microcensus survey designs, but our findings suggest multiple benefits.
Without urban strata, the two-stage sample in Kathmandu was better able to measure tent and shack dwellers than the area-microcensus sample, likely due to the larger area of two-stage clusters. The only way to ensure representative surveys of shack/tent dwellers and other vulnerable populations concentrated in slums is to treat deprived/non-deprived areas as strata, in both area-microcensus and two-stage designs. Others have suggested that censuses classify EAs as slum/non-slum to support stratified urban surveys and numerous initiatives to improve the well-being of slum dwellers and the health of cities [20]. Given the resource constraints facing LMICs, adapting methodologies to leverage slum-classified census EA units within existing global programs for household surveys, such as the DHS, would provide greater value for money, though this approach would only work for censuses that enumerate residents of slums and informal settlements [9]. While stratifying urban populations by slum and non-slum areas would not diminish the need for high-quality informal settlement-specific data such as those generated through the Nairobi Urban Demographic and Health Surveillance System [25], it would fill the gap in the current evidence base for datasets that measure intra- and inter-urban inequities, and allow valid comparison of rural, urban slum, and urban non-slum populations.
We found that response rates in area-microcensus clusters were lower than in two-stage clusters. This may have been due to the greater proportion of vulnerable and mobile households identified in area-microcensus clusters if they were less willing to participate, more likely absent, or felt disempowered to respond. Readers who are interested in area-microcensus survey designs should take account of lower response rates and potentially higher design effects for certain indicators when calculating sample size. The surveys conducted in Dhaka and Hanoi focused on vulnerable and mobile communities, so rates of exclusion identified in this study may have been higher than in the general population.
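To illustrate how response rates and design effects enter a sample size calculation, the following minimal sketch inflates a simple-random-sample size for a proportion by the squared DEFT and divides by the expected response rate. This is a generic textbook adjustment, not the procedure used in this study; the function name `required_sample_size` and all input values are hypothetical and chosen only for illustration.

```python
import math


def required_sample_size(p, margin, deft=1.0, response_rate=1.0, z=1.96):
    """Households to approach in order to estimate a proportion p.

    p             -- anticipated indicator prevalence (assumed value)
    margin        -- desired half-width of the confidence interval
    deft          -- design effect expressed as a ratio of standard errors
    response_rate -- expected share of sampled households that respond
    z             -- normal quantile (1.96 for a 95% interval)
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if margin <= 0 or deft <= 0 or not 0 < response_rate <= 1:
        raise ValueError("margin and deft must be positive; response_rate in (0, 1]")

    # Sample size under simple random sampling.
    n_srs = (z ** 2) * p * (1 - p) / margin ** 2
    # Variance inflation from clustering is DEFT squared (i.e., DEFF).
    n_complex = n_srs * deft ** 2
    # Approach more households to offset expected non-response.
    return math.ceil(n_complex / response_rate)


# Illustrative inputs only; none of these values come from the study.
baseline = required_sample_size(p=0.5, margin=0.05)
adjusted = required_sample_size(p=0.5, margin=0.05, deft=1.5, response_rate=0.8)
print(baseline, adjusted)
```

With these illustrative inputs, the required sample grows from 385 to 1081 households, which shows how quickly a modest design effect combined with a lower response rate increases fieldwork requirements.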
Societal changes, particularly rapid urbanization in LMICs, have likely caused decay in survey data accuracy due to increased complexity in living arrangements, urban disparity, and population mobility. Not only are vulnerable and mobile populations more likely to be intentionally excluded from surveys, but also they are at increased risk of unintentional, unmeasured exclusion, and their data are masked in urban averages when they are sampled. Given the importance of household survey data to policy-making, planning, and monitoring progress toward development goals, it is time to evaluate new survey tools and protocols that ensure inclusion of all households.