The value of social and spatial data associated with health is recognised around the world (Banerjee, 2016; Warren-Gash, 2017). Making the best of such data and applying insight in a meaningful way relies on interdisciplinarity, co-production and effective collaboration within a suitable environment and infrastructure. Such collaborations, collaborative models or alliances operate at many scales, from international initiatives down to local ones, with varying degrees of success. Here we discuss an example of a successful national initiative in NZ that has effectively translated research outcomes into policy impact. We then reflect upon partnerships in a UK context, which include research cohort studies and the use of novel (found/re-purposed) data, and consider how we can learn from each other in a health geography research and policy setting.
Collaborative Partnerships in New Zealand
Established in 2005, the GeoHealth Laboratory (GHL) is an effective and innovative collaborative partnership at the University of Canterbury in NZ. The GHL is directly funded by the NZ Ministry of Health (MoH), with the University of Canterbury indirectly funding the lab through staff time (e.g. directorship and collaboration with other staff members) and resources. The reciprocal partnership has a focus on practical health research and is designed to benefit both the health sector and academia. It is at the nexus of ground-breaking and policy-relevant geospatial health research. Historically, projects have included indices of access to health-promoting neighbourhood factors, access to undesirable neighbourhood destinations, and access to and utilisation of health services (Bowie et al., 2013). More recent outputs include identifying risk factors for women of childbearing age with obesity (Hobbs et al., 2019a), and examining accessibility to food retailers and socio-economic deprivation (Wiki et al., 2019). Other examples include relating the visibility of nature, in the form of green and blue space, to psychological distress (Nutsford et al., 2016). These projects are co-designed by policymakers and researchers to tackle exigent health issues in NZ in specific policy areas. Effective communication between the GHL and the MoH is therefore critical.
To maximise effective reporting to the MoH, the GHL produces short reports designed to communicate complex analyses to a range of audiences, including policymakers who are not experts in the subject. This approach acknowledges that the length and technicality of many academic publications are a barrier to their use by policymakers, who may lack both time and expertise (Davis & Howden-Chapman, 1996). The ability to simplify research and address the ‘so what?’ is an increasingly valuable skill for academics engaged within applied research settings; however, this often conflicts with the publication-focused paradigm of academic career progression. By summarising research outcomes into short reports alongside academic publications, the GHL has developed an effective communication strategy that helps to actualise the real-world impact of research.
The GHL uses data provided by the MoH for health-related and policy-relevant projects (Bowie et al., 2013). Among these data resources is the Integrated Data Infrastructure (IDI), a world-leading innovative research database that is maintained and operated by Statistics New Zealand (Stats NZ) (Social Investment Agency, 2017). The IDI is a longitudinal dataset which holds individual and household level microdata from a range of Government agencies (e.g. housing, health, policing), Stats NZ surveys, and non-governmental organisations (Statistics New Zealand, 2013). The IDI is unique in many ways: not only does it hold information for 9 million individuals who reside or have resided in NZ, it also hosts data for tens of millions of visitors to NZ (Social Investment Agency, 2017; Statistics New Zealand, 2018a, b). The data are linked using deterministic and probabilistic linkage and completely de-identified before being made available to researchers (Statistics New Zealand, 2013). This is possible due to strict adherence to the five ‘safes’: safe people, safe projects, safe settings, safe data and safe outputs (Social Investment Agency, 2017). Researchers are trained and vetted before being granted access to the data; furthermore, the IDI must be used in a safe setting such as the secure data laboratory located at the University of Canterbury. Only projects in the public interest, such as those co-designed by the MoH and the GHL, are approved to use IDI data, and all outputs are checked by Stats NZ to ensure they are ‘safe outputs’ before publication (Statistics New Zealand, 2017).
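The distinction between deterministic and probabilistic linkage can be illustrated with a minimal sketch. It does not reproduce the Stats NZ linkage process: the field names (person_id, name, dob, suburb), the m- and u-probabilities, the name-similarity cut-off and the weight threshold are all assumptions chosen for illustration. Records are first linked on exact agreement of a shared identifier (deterministic); remaining records are scored with log-likelihood agreement weights in the style of the Fellegi-Sunter model and linked only if the best candidate exceeds a threshold (probabilistic).

```python
import math
from difflib import SequenceMatcher

# Assumed (m, u) probabilities per field, illustrative values only:
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PARAMS = {
    "name": (0.95, 0.01),
    "dob": (0.98, 0.003),
    "suburb": (0.80, 0.05),
}


def field_agrees(field, a, b):
    """Names agree if they are similar enough; other fields must be identical."""
    if field == "name":
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= 0.85
    return a == b


def match_weight(rec_a, rec_b):
    """Sum of log2 agreement/disagreement weights across the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if field_agrees(field, rec_a[field], rec_b[field]):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def link_records(source_a, source_b, threshold=8.0):
    """Map each index in source_a to (index in source_b, method), where linked."""
    id_index = {r["person_id"]: j for j, r in enumerate(source_b) if r.get("person_id")}
    links = {}
    for i, rec in enumerate(source_a):
        # Deterministic step: exact agreement on a shared identifier.
        j = id_index.get(rec.get("person_id"))
        if j is not None:
            links[i] = (j, "deterministic")
            continue
        # Probabilistic step: keep the best-scoring candidate above the threshold.
        scored = [(match_weight(rec, cand), j) for j, cand in enumerate(source_b)]
        if scored:
            weight, j = max(scored)
            if weight >= threshold:
                links[i] = (j, "probabilistic")
    return links


# Hypothetical records from two agencies (not real data).
health = [
    {"person_id": "A1", "name": "Mere Tane", "dob": "1980-02-01", "suburb": "Riccarton"},
    {"person_id": None, "name": "Jon Smith", "dob": "1975-07-12", "suburb": "Ilam"},
]
housing = [
    {"person_id": None, "name": "John Smith", "dob": "1975-07-12", "suburb": "Ilam"},
    {"person_id": "A1", "name": "M. Tane", "dob": "1980-02-01", "suburb": "Addington"},
]
print(link_records(health, housing))
```

In practice the m- and u-probabilities would be estimated from the data rather than fixed by hand, and the identifying fields would be removed after linkage so that only de-identified records reach researchers.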
The GHL has reported findings to the MoH on a wide range of topics investigating how neighbourhood and national contexts shape health outcomes and inequality (Bowie et al., 2013). Using linked health-data resources, including the IDI, has allowed researchers at the GHL to answer complex research questions, gaining unique insights for the benefit of broader society. Whereas conventional data sources used to derive evidence supporting health policies often lack demographic and socioeconomic information, new linked microdata allow better integration of the recorded information and generate in-depth insight that was previously difficult or impossible to obtain. For example, the GHL utilised the IDI to identify population transience (using a methodology based on earlier research (Jiang et al., 2018)) and to examine the utilisation of health services. This study identified that up to 5.6% of the NZ population, or 250,000 people, are classified as either ‘vulnerable transient’ or ‘transient’. This research was carried out in collaboration with the MoH and Lakes District Health Board (DHB), one of 20 DHBs in NZ, and aimed to determine how home address and frequency of address change (transience) can affect long-term health outcomes and health service utilisation. In the Lakes DHB area, population transience was found to be higher (8.3%) than the NZ national average. A higher proportion of Māori people, fewer people in the 20–39 age group compared to the national average, and a relatively high proportion of people living in the most deprived areas (Sheridan et al., 2011) were also found within these transient groups. These findings enabled Lakes DHB to better understand the characteristics of the affected population, their use of primary health services, and the accessibility of those services for vulnerable populations (Ministry of Health | Manatū Haoura, 2019). It is still too early to examine the direct impact of the utilisation of linked population microdata on health policies. Yet even now, the findings have raised new, more targeted questions from national and regional health sector leaders.
The ongoing longitudinal investigation of immunisation rates in NZ is another example of research focused on health service utilisation undertaken at the GHL. This work identifies socioeconomic and demographic determinants of immunisation using the National Immunisation Register, established in 2005. Researchers accessed general information about vaccinated children including area of residence, gender, ethnicity, socioeconomic status of the residence area, and accessibility of health-related services, as well as detailed information on the immunisation trajectory. The study identified significant differences in immunisation coverage. Specifically, spatial variation remains even when socioeconomic deprivation, demographic variables, health service accessibility, and urban/rural classifications are controlled for. Higher immunisation rates were associated with less socioeconomically deprived areas, and the rates of Ambulatory Sensitive Hospital admissions of children (0–4 years) are lower in areas with higher immunisation rates (Marek et al., 2020). Moreover, recent evidence from NZ suggests there are structural, economic and cultural barriers to immunisation (Walker et al., 2019) and healthcare access in general (Hobbs et al., 2019b). These findings facilitate an improved, area-specific understanding of socioeconomic and demographic determinants of immunisation trajectories throughout childhood. This empirical evidence has informed progress on the MoH priorities that focus on child wellbeing and better population health outcomes supported by a strong and equitable public health and disability system (Ministry of Health | Manatū Haoura, 2019).
Multiple data sources can be accessed by researchers with no steps beyond downloading the data. In the New Zealand context, Statistics New Zealand manages multiple online services that allow easy data download (NZ.Stat and Datafinder). The GeoHealth Laboratory is also starting to make its own data available. For instance, its website hosts a road network layer and other data it has processed. This includes nationwide data on what the GHL defines as environmental “goods” (e.g. greenspaces) and “bads” (e.g. alcohol outlets or gambling venues) (Marek et al., Under Review). The location of such environmental “goods” and “bads” has recently been associated with adverse outcomes for mental health and psychological distress in a nationally representative population of New Zealand adults after controlling for key covariates (Hobbs et al., In Press). For health-specific datasets, the Ministry of Health publishes data and reports on its website; however, the spatial (and temporal) domain of the data is not always optimal. In some GeoHealth Laboratory projects, the data used are not ordinarily available, but access was facilitated through the collaborative GHL-MoH partnership. The IDI is different in that it is available to approved New Zealand researchers. It is accessible to any such researcher who submits a project that serves the public good. To become an approved researcher, one must undergo training and vetting by Stats NZ, because the work involves accessing and handling potentially confidential data.
In all the projects carried out by the GHL through the partnership, the research questions are co-designed with the Ministry of Health and are therefore of real value to the health sector. It also means that the path between research evidence and policy change is short and direct. For example, for every project a short plain-English summary is produced, which has to be signed off at a high level within the Ministry of Health and shown to the Minister of Health. It would be ideal to show how the research has directly impacted policy, but the link is rarely that simple. One example is the investigation of the link between the location of alcohol outlets and adverse health outcomes. The GHL published research on this in 2012 (Day et al., 2012) and 2020 (Hobbs et al., 2020), the latter responding to a request for an update to the earlier research. In addition, requests for the research come from the public and other parts of government (e.g. a recent request from the Police), and alcohol outlets are increasingly failing to get permission to locate in residential neighbourhoods (see Footnote 1). In another example, the “Transience” study served as evidence in the development of new policies on housing and displaced populations. It also supported actions shaping transformations in healthcare provision in Lakes DHB.
Collaborative Partnerships in the United Kingdom
The collaborative partnership between the MoH and GHL is one of a handful of initiatives worldwide that have successfully used linked population-based health data (Vogel et al., 2019; Warren-Gash, 2017) to mobilise change. In the UK, there have been attempts to move towards a solution for linked population health and administrative data, akin to the IDI in NZ. To date, no equivalent resource exists. One such initiative was the ‘Big Data Network’ funded by the Economic and Social Research Council (ESRC) in 2013, phase one of which included the Administrative Data Research Network (ADRN) (Economic Social Research Council, 2019). The main aim of this network was to link de-identified data collected routinely by government departments, for example, health records with education, employment and/or crime data, at an individual level. The ADRN had four nodes, representing the four countries within the UK: England, Wales, Scotland and Northern Ireland. Operating for the devolved nations presented different scales of data linkage and different organisational challenges. Population size varies widely between the countries: England, with 53 million people, is roughly 10 times larger than the UK’s next largest country, Scotland, with 5.5 million (Office for National Statistics, 2020). NZ’s population of 5.0 million (Statistics New Zealand, 2020) is akin to that of Scotland. Computational power exists to handle significant data volumes, suggesting that data volume alone should not be a problem. However, as individual data sources are governed by a larger number of administrative units in bigger countries, data linkage becomes increasingly complex. Moreover, the time and resources required to extract, clean and anonymise the data, without added benefit or funding support, meant there was a lack of incentive for data owners to share their data (UK Statistics Authority, 2016). As a result, the ADRN was slow to acquire datasets, in particular from government departments such as the Department for Work and Pensions, which did not have the resources for these processes (UK Statistics Authority, 2017). To make matters worse, the ADRN worked on a ‘create and destroy’ policy whereby data could not be reused across projects, which was neither a sustainable nor a cost-effective way of sharing data for research.
In light of these challenges, in the second round of funding the ADRN has been re-invented as Administrative Data Research UK (ADR UK). ADR UK is a partnership between the three established ADRN nodes in Scotland, Wales and Northern Ireland, alongside the Office for National Statistics (ONS), which represents England. Learning from the original ADRN, ADR UK has recognised the importance of investing in the data preparation process and realised that the onus for this should not fall on the data owner. The ‘create and destroy’ policy has been eliminated in favour of reusable themed datasets of de-identified data, which trained researchers may apply to access via a secure data centre. ADR UK also models its handling of administrative data on the ‘five safes’ (Social Investment Agency, 2017) and emphasises the importance of partnerships between government departments and academia. However, unlike the NZ strategic partnership, in which the research agenda is devised in collaboration between policy makers and academics, research under the ADR UK model is led by academic interest.
In addition to administrative data resource linkage, there have been successes with incorporating routinely collected data into longitudinal research cohorts at the individual level. For example, the ONS Longitudinal Study (LS) contains linked individual-level census and administrative data across five successive censuses, for a 1% sample of the population of England and Wales (Shelton et al., 2019). Similar to the LS, the Scottish Longitudinal Study and Northern Ireland Longitudinal Study capture the populations of the rest of the UK (Boyle et al., 2009). Over the past 40 years, the LS has collected data for over 1.1 million individuals as new members enter the study through birth and immigration. Information on life events including births, deaths and health outcomes is linked to census records, for example cancer registrations via the National Health Service Central Register. The large sample size in the LS, afforded by utilising census records, enables analysis of small areas or subsets of the population, such as particular ethnic groups or occupational groups, which is not possible using other longitudinal datasets due to insufficient numbers. At a national level, the LS has provided evidence to support major reports for the government on health and mortality (Marmot, 2010) and has been used in academic research on health inequalities over space and time (Blackburn et al., 2013; Johnson, 2011; Murray et al., 2019). The main limitation of the LS for health research is the lack of behavioural and lifestyle data. Regional cohort studies are better able to capture a greater breadth of such health indicators.
One advantage of regional birth cohort studies, such as the Avon Longitudinal Study of Parents and Children (ALSPAC) and Born in Bradford, is their ability to collect primary data on study participants via questionnaires, clinical assessments and biological samples, alongside participant consent to link routine data moving forward. This is enabled by relatively small sample sizes. Each study follows the lives of approximately 14,000 children, born in 1991–1992 and 2007–2010 respectively, as well as the lives of their parents and, in the case of ALSPAC, their offspring. In addition, information about health, wellbeing and educational outcomes is collected by data linkage to routinely collected health data from hospitals, GP practices and local government systems recording educational progress. A sense of place is important in such studies, and research investigating the relationship between the environment and health is facilitated by geospatial data linkages (Boyd et al., 2019). Study data can be linked with data on the physical and social environment using geocoded records of participants’ residential location across the life course. Local cohorts have the flexibility to capture data that address local issues, for example the long-term impact of air pollution during pregnancy (Schembari et al., 2015) and the association between exposure to green space and mental health in children (McEachan et al., 2018). This model has proven successful in forging links between communities, health services and local government, maximising local impact.
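As a minimal sketch of how geocoded residential histories might be linked to environmental data, the example below computes, for each period in a participant's address history, the great-circle distance from the home to the nearest green space and flags whether it falls within a buffer. It is not the method used by ALSPAC or Born in Bradford: the coordinates, the 300 m buffer, the data layout and the function names are all hypothetical, and real studies typically use polygon boundaries and network distances in dedicated GIS software rather than point-to-point distances.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_distance_m(home, features):
    """Distance from a home (lat, lon) to the closest feature point."""
    return min(haversine_m(home[0], home[1], f[0], f[1]) for f in features)


def exposure_by_period(address_history, features, buffer_m=300):
    """Attach a nearest-feature distance and buffer flag to each residence period.

    address_history: list of (start_year, end_year, (lat, lon)) tuples.
    features: list of (lat, lon) points, e.g. green space centroids.
    buffer_m: assumed exposure buffer in metres (illustrative default).
    """
    result = []
    for start, end, home in address_history:
        distance = nearest_distance_m(home, features)
        result.append(
            {"start": start, "end": end, "nearest_m": distance, "within_buffer": distance <= buffer_m}
        )
    return result


# Hypothetical participant with two residences and two green spaces.
history = [(2008, 2012, (53.800, -1.750)), (2012, 2018, (53.900, -1.750))]
green_spaces = [(53.801, -1.750), (53.700, -1.600)]
for period in exposure_by_period(history, green_spaces):
    print(period)
```

Keeping the exposure measure tied to each residence period, rather than to a single address, is what allows exposure to be tracked across the life course as participants move.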
The UK Biobank has built on the success of a longitudinal model that collects behavioural, lifestyle and biological data, but at a national level, recruiting a sample of 500,000 participants aged 40–69 years. With the aim of improving the prevention, diagnosis and treatment of chronic illnesses, primary participant information is linked to a range of electronic health records. Despite the richness of individual-level data, there are limitations to the use of UK Biobank for spatial analysis. Potential geographical bias and recruitment bias exist due to the location of test centres, which may affect generalisation to the wider population (Batty et al., 2019; Fry et al., 2017). Whilst these valuable local, regional and national models represent isolated examples of good practice in the UK, their coverage is not comparable to the IDI in NZ, and the process of replicating this data infrastructure at a UK-wide scale would be extremely costly.