Health benefits of reducing aircraft pollution: evidence from changes in flight paths

This paper investigates externalities generated by air transportation pollution on health. As a source of exogenous variation, we use an unannounced 5-month trial that reallocated early morning aircraft landings at London Heathrow Airport. Our measure of health is prescribed medications spending on conditions known to be aggravated by pollution, especially sleep disturbances. We observe a significant and substantial decrease in prescribed drugs for respiratory and central nervous system disorders in the areas subjected to reduced air travel between 4:30 am and 6.00 am compared with the control regions. Our findings suggest a causal influence of aviation on health conditions.


Introduction
Pollution has well-known economic consequences, affecting the health status of workers and their productivity and well-being. This paper contributes to the limited, but growing, field of studies that use exogenous variation to investigate the causal effect of transportation services on health (see Cesur et al., 2017; Deryugina et al., 2019 for recent examples and Graff Zivin and Neidell, 2013 for a review).
We present new evidence on the health impact of airports as major sources of pollution (Wolfe et al., 2017; Schlenker and Walker, 2016). We consider regions exposed to a change in patterns of plane landings around a global aviation hub located within a large metropolitan area, London Heathrow Airport. We make use of a trial implemented over 5 months (between November 2012 and March 2013) that redirected landing approach flight paths to reduce early morning traffic in designated areas. Reduced aviation traffic impacts both air and noise pollution. The short period used for our analysis is, however, more likely to detect impacts of noise rather than air pollution in our treated areas. As a control group, we use areas north and south of the approach paths that are not overflown and are therefore unaffected by this trial. We use spending on drugs prescribed by medical doctors as health indicators. We focus on three broad types of treatments for disorders that, as suggested by the medical literature, are aggravated by noise pollution: central nervous system depressants, respiratory, and cardiovascular agents.[1]
Our main contribution is establishing new and concrete results linking air transportation pollution to medical spending in a causal framework. We do so by exploiting a unique context and data. First, the nature of the trial surmounts the avoidance behaviours (people may rationally avoid places exposed to increased pollution) that plague earlier literature. This trial had the critical and unique feature of occurring at daybreak, between 4:30 am and 6:00 am, when targeted residents are most likely to be at home and therefore exposed to the full impact of the changed flight paths. Second, by using data on medicines prescribed by doctors to their patients, we can assess diagnosed health conditions, rather than relying on self-reported health conditions. Finally, by quantifying the health impact of air traffic pollution caused by airports, we add to recent literature trying to credibly estimate the impacts of transport congestion locations on health outcomes using natural experiments. This literature considered air pollution generated by airports (Schlenker and Walker, 2016; Boes et al., 2013), ports (Moretti and Neidell, 2011), and traffic congestion (Currie and Walker, 2011; He et al., 2016).
Our main results are that during the trial, we observe a decrease in monthly prescription expenditure on central nervous system and respiratory medication by around 6% and 3% respectively.[2] We test the main results by running a battery of specification and robustness checks. As a placebo check, we test whether similar prescription changes happened for other diseases known to be unrelated to pollution (infections and musculoskeletal conditions); we do not detect any significant changes over the same period. The results are also robust to variations in the time periods and control groups.
Our results therefore suggest a potential link between noise pollution generated by aviation and health, which has financial implications for health spending. We estimate that a permanent reduction in early morning air traffic would save just under £5 million per year from prescribed medicines for respiratory and central nervous conditions, in the areas most affected.
This paper is structured as follows. The following section gives background information on airports, noise pollution, and health. Section 2 describes the literature, and Section 3 discusses the identification strategy. Section 4 presents and discusses the results, followed by our robustness checks and implied back-of-the-envelope costings. Section 5 concludes.

Airport traffic and health
Major airports such as Heathrow generate increased atmospheric pollution and noise levels, both of which have negative impacts on health. The clinical and medical literature suggests air pollutants affect the human body by activating oxidative stress that causes cell damage and death. There is strong evidence of symptom exacerbations of cardiovascular diseases such as arrhythmia and myocardial infarction, as well as asthma and other respiratory diseases, transient worsening in lung function, and increased respiratory infections, which all result in more visits to general medical practitioners and hospitals (Brunekreef and Holdgate, 2002; Li et al., 2012; Gutierrez, 2015; Wang et al., 2022; Cardoso De Mendoça et al., 2006).
There is equally strong evidence that noise pollution, defined as undesirable sound, impinges on human health. Among its adverse effects, we focus on the non-auditory ones, i.e., those health effects other than tinnitus and hearing loss, triggered by environmental noise. In their recent review, Basner et al. (2014) identified four main outcomes from excessive noise: sleep disturbance, annoyance, cognitive impairment, and cardiovascular disease. People react to various levels of noise when it interferes with sleep or daily activities. They experience a range of effects of varying severity, from exhaustion and stress-related symptoms to anger and displeasure.
The nature of our trial, based on changing the areas overflown by aircraft, is most likely capturing the impact of noise pollution. The period of 5 months is rather short, and the number of flights (24 per morning) is unlikely to generate measurable health impacts of air pollution. Noise, in contrast, will directly affect those living below the flight paths within a few days of the start of the trial. Therefore, in this paper, we concentrate on a discussion of the negative effects of noise pollution. Air pollution generated by flights adversely affects health, primarily cardiovascular disorders, but we cannot separately identify those impacts precisely with this trial. We therefore believe the results are most likely driven by noise, although we do not exclude that air pollution contributed too. Indeed, we do find some support for this impact in the significant impacts on respiratory diseases.
In the UK, the Civil Aviation Authority (CAA), on behalf of the Department for Transport, produces noise contour maps to estimate the size of the areas subject to different noise levels (Lee et al., 2014). As a standard, noise contours are plotted at levels from 57 to 72 dB,[3] in 3 dB steps. Additional steps from 48 to 57 dB are added for night contours due to the higher sensitivity of people during their sleeping hours.[4] A large number of residents are affected by night noise due to flights into Heathrow. In fact, the airport lies within the boundaries of Greater London (an unusual location for a major international hub).[5]
The human body can respond through direct and indirect pathways to acute exposure to noise. The latter refers to the path from perceived nuisance to emotional stress reactions. The direct pathway consists of the autonomic physiological stress triggered by the interaction between the central auditory system and the central nervous system. Even at low noise levels, this is considered to be the prevalent mechanism in sleeping individuals (Basner et al., 2014). Observations on chronically exposed populations show an effect on the metabolism and the deterioration of the cardiovascular system (Basner et al., 2014). Sleep disturbance is regarded as the most harmful effect of environmental noise exposure. Occasional incidents as low as 33 dB $L_{Amax}$ at night can induce various physiological reactions during sleep, such as tachycardia, body movements, and awakenings (Basner et al., 2014). There is conflicting evidence on the size of these effects, which vary according to whether the study considers the elderly, children, or people with existing conditions (van Kamp and Davies, 2013).
Noise source is a fundamental contributor to the reaction to noise. Different sources hold different acoustic characteristics: frequency, sound level, duration, intensity, and psychoacoustic measures. For instance, at the same average night noise level, aircraft noise is found to trigger a higher level of annoyance than other transportation noise (Schmidt et al., 2013; European Commission, 2004).
Studies on noise effects date back to the 1970s (Ando et al., 1975). Initially, laboratory settings were promoted, followed by field experiments with a focus on airports (Cohen et al., 1981; Chen and Chen, 1993; Evans et al., 1995). These found harmful effects of noise on cognitive ability and on blood pressure. There are many epidemiological studies drawing on large administrative sources of health outcomes to investigate the effects of noise on health. Examples include Tzivian et al. (2015), who reviewed studies on the mental health effects of exposure to noise pollution and reported a positive association with anxiety, depression, and impaired activities of daily living, among other outcomes. Hansell et al. (2013) focused on the Heathrow airport region specifically. They found that exposure to higher noise levels increased mortality and the prevalence of strokes, coronary heart disease, and cardiovascular disease for both hospital admissions and mortality.

[3] Noise exposure is measured in decibels (dB), a logarithmic scale that ranks noise pressure levels. When noise varies over time, the $L_{Aeq,T}$ is the equivalent average continuous sound which would contain the same sound energy as the time-varying noise for a given period T. When noise has instantaneous effects, such as sleep disturbance due to aircraft, it is better measured as a maximum value during the time period ($L_{Amax}$).
[4] Traditionally, the 57 dB level represents the starting point of significant community annoyance. For Heathrow Airport, Lee et al. (2014) calculated that in 2013, about 266,000 and 421,000 people were exposed to 57 dB $L_{Aeq,16h}$ during the day and 48 dB $L_{Aeq,8h}$ during the night respectively. We note too that a recent WHO report recommends setting the limit for negative health effects of nighttime noise at 40 dB, which would include areas situated in East London (WHO, 2019).
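The difference between the two noise metrics described in footnote 3 can be illustrated with a minimal sketch. It assumes equally spaced A-weighted readings over the window T, so that the equivalent level is the decibel value of the mean sound energy; the readings themselves are hypothetical and chosen only to mimic a quiet night with a single aircraft event.

```python
import math


def equivalent_level(levels_db):
    """Energy-average equally spaced dB readings (an L_Aeq,T-style measure)."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def maximum_level(levels_db):
    """Highest single reading in the window (an L_Amax-style measure)."""
    return max(levels_db)


# Hypothetical readings: five quiet intervals and one overflight.
night_readings = [35, 35, 35, 70, 35, 35]

leq = equivalent_level(night_readings)
lmax = maximum_level(night_readings)
print(f"equivalent level: {leq:.1f} dB, maximum level: {lmax} dB")
```

Because decibels are logarithmic, a single loud event dominates the energy average, yet the equivalent level still sits well below the maximum. This is one way to see why a maximum-based measure is better suited to instantaneous effects such as awakenings.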
Although these cross-sectional studies control for some of the confounding factors that could be associated with the relevant outcomes, such as socio-economic status and individual overall health conditions, they do not unequivocally determine causation between environmental factors and health. For example, they assume that exposure to noise happens mainly at the individual's home address. However, a large proportion of the population spends most of their day outside their home, thus raising problems of exposure bias. In response, economists have adopted quasi-experimental techniques to tackle some of these issues (Graff Zivin and Neidell, 2013).
Recent papers exploit exogenous shocks to emissions to estimate the related health effects. However, these typically focus on air pollution levels (Chen et al., 2018; Currie and Walker, 2011; Beatty and Shimshak, 2011; Schlenker and Walker, 2016; Zhang et al., 2017 among others) rather than noise. Exceptions include Wang et al. (2022) and Boes et al. (2013). The latter found that daytime exposure to an increase in aircraft noise significantly affects self-reported health problems. The former uses a difference-in-differences approach on changed flight patterns near La Guardia airport and finds significant health impacts over the 2009-2016 period. Our paper extends the analysis in several important respects. We use monthly medicines prescribed by GPs for conditions aggravated by noise during sensitive sleeping hours rather than self-reported measures. We also use a quasi-experimental analysis conducive to a causal interpretation by comparing exposures to changed flight patterns between treated and control groups and comparisons before, during, and after a 5-month trial. This framework enables us to precisely estimate short-term health impacts of changes in early morning noise (4:30 am to 6:00 am) generated by aircraft.

Identification strategy
In order to address its noise externalities, Heathrow Airport explores ways to reduce these through a number of adjustments and measures. For instance, it encourages the use of quieter planes especially during sensitive hours, promotes quieter operating procedures, and, working with local communities, provides individual home insulation (Heathrow Airport Limited, 2013). The early morning arrivals trial (EMAT) in 2012 and 2013 was introduced to provide noise respite to specific communities affected by landings at Heathrow Airport.
Our analysis focuses on this intervention. For 5 months, from 5th November 2012 to 31st March 2013, Heathrow Airport ran the trial in collaboration with the noise pressure group HACAN (Heathrow Association for the Control of Aircraft Noise), British Airways, and NATS (formerly National Air Traffic Services). The main feature of the trial was the identification of four pairs of exclusion zones (two to the east and two to the west of Heathrow), which were designed to be free of aircraft movements during the night and early morning in alternate weeks for the duration of the trial, redirecting the night flights to other areas. The trial implemented a weekly switch between these two sets of exclusion zones, which we term 'odd' and 'even' weeks. A commissioned report (Tucker et al., 2013) evaluated the outcome of the trial but did not provide the exact flight paths for affected areas. We therefore rely on graphics produced by the report to illustrate the distribution of flights across the affected zones, shown in the online Appendix.
Night quota restrictions reduce landings at Heathrow between 11:30 pm and 6:00 am. However, airlines, responding to travelers' preferences for early morning landings, allocate nearly all those landing slots between 4:30 am and 6:00 am. This pattern translates into one aircraft landing every 4 to 10 min during those crucial 90 min when sleep is likely to be disrupted. In addition, these early morning landings are typically transcontinental large-bodied jets which are noisier than the average aircraft landing at other times of the day.
Our data on prescriptions are available on a monthly basis only; therefore, we use total drugs prescribed per month. This has the advantage of picking up most of the prescription changes in any one month of the trial, as patients often consult their doctors with a delay. The nature of the trial means that residents will have experienced reduced or no noise in 2 weeks of a month but may have experienced increased noise in alternate weeks. There are a number of reasons why this exposure does not cancel out in aggregate, allowing us to identify the impact of the trial on prescriptions.
The first is climate related. Aircraft have to land into the wind when the wind speed exceeds 5 knots, which happens very often; in South East England, the wind direction is west to east 70% of the year. This little-known pattern implies that, as opposed to a more regular alternation between landing from the west and the east, more than 70% of planes typically land flying over central London (from the east).[6] Besides, when landing, planes have to join a direct line or corridor from the runway, which at Heathrow runs horizontally (east-west). During the trial, in order to avoid the exclusion zones, planes had to join the corridors further away from Heathrow to the east and west. So relative to the pre-trial flight patterns, areas closer to the airport experienced a reduction overall. Areas further away to the east and to the west, in contrast, experienced an increase in air traffic for each month during the trial (Tucker et al., 2013).
The second factor relates to population density. The areas showing a reduction in air traffic overall are densely populated, largely residential areas. Indeed, they cover large parts of metropolitan London stretching to the east of the city (a distance of about 20 miles to the east of Heathrow). This is obvious from Fig. 1 (discussed later), which shows the density of general medical practices for the areas covered by the trial. Likewise, areas that experienced increased exposure to landings are less densely populated. Tucker et al. (2013) estimated that over 1 million people to the east of Heathrow experienced a respite during the trial, whereas only 138,000 people to the west were similarly affected.
Finally, it is likely that a complete respite from night noise has stronger impacts on health than an increase in noise from an already noisy environment. This draws on the idea that people may become habituated to noise levels. Although such an effect is not always precisely estimated in the literature, there is a consensus that it is an important consideration and is very likely to be picked up by our data. Therefore, the combination of the wind direction bias, differential population densities, and any asymmetric reaction to noise enables us to identify the trial impacts on monthly prescribed medicines.
A visual inspection of the flight tracks comparing our baseline time span (November 2011 to March 2012) to the trial period (November 2012 to March 2013), shown in Figs. A1 and A2 in the Appendix, suggests five geographical zones in the Greater London (GL) area experiencing varied exposure as a result of the trial. These are our 'treated' regions (drawn as trapeziums on Fig. 1). We labelled them as follows: GLW1 and GLW2 to the west of Heathrow and GLE1, GLE2, and GLE3 to the east of the airport. The average height of the areas is 10 miles, and the average width is 5 miles. The figures suggest considerable variation in the exposure to early morning aircraft noise for affected sub-populations. The regions called GLE1 and GLE2 were almost free of flights during their exclusion weeks; they were overflown only slightly more than normal in the other weeks, so overall they experienced a reduction in early morning aircraft noise. Conversely, there were areas overflown more, which were mostly located to the west of Heathrow (GLW1 and GLW2).
As a control group, we chose all medical practices located in two rectangles situated north and south of Heathrow, lying outside of the approach path corridor. The residents in those areas remained unaffected by changes in air traffic throughout the trial period. Residential sorting did not seem to be an issue within this setting thanks to two inherent attributes of the trial. The first is suggested by the name of the trial: the early morning arrivals trial. We assume that most people are at home between 4:30 am and 6:00 am and are in light sleep hours where deep sleep is infrequent.[7] Secondly, no advance notification about the start of the trial was given to residents (Tucker et al., 2013). The organisations involved decided to communicate the implementation of the change only after the first week of the ongoing trial and then to collect feedback from residents through media and meetings. Therefore, it is unlikely that people relocated due to this unexpected temporary change.

Empirical specification
Our estimation relies on difference-in-differences (DiD) regressions, with the 5 months (Nov-Mar 2012/3) during which the trial operated and the same 5 months a year earlier (2011/2) used as the 'pre-period'. We designate four regions (GLE1-GLE3 and GLW1) as treated and the two remaining regions as controls. This framework is neither staggered nor time dependent since each region is treated once and at the same time (de Chaisemartin and D'Haultfoeuille, 2020). The control regions are never treated. There is a recent and growing literature on two-way fixed effects (2WFE) which bears relevance to our context: given that some regions experience increases and others decreases in early morning flights, there is a heterogeneous treatment effect as defined by de Chaisemartin and D'Haultfoeuille (2020). This heterogeneity is more likely to bias our results when estimating a global effect aggregating all four regions into one, as we do initially in Table 3 and in the robustness checks. We therefore introduce robust estimates proposed by Gardner (2021) as well as diagnostics by Słoczyński (2022). We also introduce an 'event-study' approach to discuss dynamic effects more explicitly and to check the parallel trends assumption. We discuss those extensions in Section 4.
The epidemiological literature on the detrimental impact of noise pollution on health suggests focusing on medical conditions related to the central nervous, respiratory, and cardiovascular systems. The adverse health consequences are measured by monthly spending on prescriptions for three therapeutic classes. These comprise medications to aid circulation and breathing and, for the central nervous system, include antidepressants and drugs to treat insomnia.

$$ \ln(SPENDING^{j}_{it}) = \tau D_{it} + \gamma_{k} + \lambda_{t} + X_{it}\beta + \varepsilon^{j}_{it} \qquad (1) $$
where $SPENDING^{j}_{it}$ is the total spending on prescription medicines for one of the three conditions of interest (j) per thousand patients in each practice (i) and month (t). The causal effect of the trial on medication spending is captured by the coefficient $\tau$ of the treatment, which takes the value 1 for treated practices during the 5 trial months (November 2012 to March 2013) and 0 for the 5 baseline months (November 2011 to March 2012) and for control practices during the whole period of 10 months. The model includes region effects ($\gamma_{k}$), where the region k which contains practice i can be a broad or a more narrowly defined geographical area as explained below, and monthly time effects ($\lambda_{t}$). $X_{it}$ represents a series of controls including index of multiple deprivation (IMD) scores to account for local socio-economic levels; practice proportions of patients by gender and age; practice proportions of GPs by age and of GPs who qualified in countries other than the UK; and the number of GPs per thousand patients (see Table 1 for a description and sources for those variables). The last term, $\varepsilon^{j}_{it}$, represents an idiosyncratic disturbance term. We estimated the model in Eq. 1 for different macro-regions: first, all areas grouped together, then regions GLE1, GLE2, GLE3, and GLW1 individually.[8] In the first case, we estimated the overall effect of the trial. The remaining estimates show the effect by smaller geographical areas that, from a visual inspection, seemed to experience consistently distinct air traffic changes. The analysis of these variations is discussed in Section 4.
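The logic of the treatment coefficient in Eq. 1 can be illustrated with a minimal sketch of the simplest two-group, two-period case, where the DiD estimate reduces to a difference of changes in mean log spending. The spending figures are hypothetical, and the sketch deliberately omits the controls, region effects, and monthly effects of the full specification.

```python
import math
from statistics import mean

# Hypothetical spending per thousand patients, by group and period.
spending = {
    ("treated", "pre"): [1000.0, 1100.0, 900.0],
    ("treated", "post"): [950.0, 1040.0, 860.0],
    ("control", "pre"): [1200.0, 1000.0, 1100.0],
    ("control", "post"): [1210.0, 1005.0, 1110.0],
}


def mean_log(values):
    """Average of log spending within one group-period cell."""
    return mean(math.log(v) for v in values)


def did_estimate(cells):
    """Change in the treated group minus change in the control group."""
    treated_change = mean_log(cells[("treated", "post")]) - mean_log(cells[("treated", "pre")])
    control_change = mean_log(cells[("control", "post")]) - mean_log(cells[("control", "pre")])
    return treated_change - control_change


tau_hat = did_estimate(spending)
print(f"DiD estimate in log points: {tau_hat:.4f}")
```

With a log outcome, a small negative estimate is read approximately as a proportional fall in spending in treated practices relative to controls.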

Landing patterns
As discussed in Section 3, the trial implemented a weekly switch between two sets of air traffic exclusion zones, which we term 'odd' and 'even' weeks below. The aim was to provide early morning noise respite to the population affected by landings at Heathrow airport. A very detailed report on the flight patterns during the trial is available (Tucker et al., 2013); here, we visually summarise the main findings. In Appendix Figs. A1 and A2, the top panel of both figures represents the map of all landing tracks during the 5-month period in the year before the trial. The second and the third panels show the aircraft tracks of planes landing at Heathrow on odd and even weeks during the trial. Since data on medication spending is available in the form of monthly data, we aggregated the second and third panels and interpreted the trial as a monthly event comprising a combination of alternated weekly changes. Below, we describe how these monthly events are different for each region of interest.
The control regions (outlined above and below the airport on the maps in Figs. A1 and A2 in the Appendix) included those regions that were not affected by changes implemented during the trial. The GLE1 area (see Fig. A1) appears to experience an overall notable reduction in air traffic on odd weeks and a slight increase on even weeks of the trial, with a reduction overall in each month of the trial. Similarly, GLE2 (see Fig. A1), an area generally subject to heavy early morning air traffic, may have experienced an increase in traffic on the odd weeks and an important drop on the even weeks. These are the two regions most affected by the trial.
The last region to the east of Heathrow is GLE3 (see Appendix Fig. A1). If we distinguish the northern from the southern region, the latter appears to experience an overall increase in air traffic and specifically a sharp increase in traffic on even weeks. This could point to an overall increase in traffic over the 5-month period. From the second and third panels of Appendix Fig. A2 we can see that the GLW1 area was characterised by a serious increase in air traffic on the odd weeks and a decrease on the other weeks, implying an overall increase in early morning air traffic. The GLW2 area (see Fig. A2) appears to see a drastic reduction of air traffic on odd weeks and almost no change on even weeks. However interesting this area might be, it contains only six GP practices in a mainly rural region and is not used in region-level investigations. As a caveat, we add here that those overall expectations drawn from pictures must be tempered by the fact that the flights drawn are for the first 12 weeks of the trial only (the only ones provided by the evaluation report, Tucker et al., 2013). While we do not expect large changes in flying patterns in the last third of the trial, we are aware that visual inspection can be deceptive of expected effects.[9] We will therefore rely more on our empirical investigations and take those pictures as indicative at best of potential expected impacts.
These are the broad regions identified by the trial's final report. However, we assume that the level of variation occurred at a lower regional dimension. Our observations are at the practice level but the environmental quality is probably shared by groups of practices located in local areas. This is supported by the fact that noise and air pollution levels vary at a refined level. Maps of noise contours provided by the Civil Aviation Authority draw a picture of how much variation there is from one street to a few streets apart. This suggests using a geographical unit smaller than the broad regions but larger than practice level. We use the Middle Layer Super Output Areas, MSOAs (shown on Fig. 2), in which environmental quality is likely to be more homogeneous[10] (Lee et al., 2014). Our unit of observation (practices) is smaller than the MSOAs, which could bias our standard errors, as documented by Moulton (1986). Failure to take account of this clustering dimension could lead to a downward bias of the standard errors. The main specification, reported below, controlled for these potential common group variations by adopting cluster-robust standard errors; the number of clusters (MSOAs) is large (between 227 and 444; see Table 2).
We checked for possible standard error bias due to within-cluster correlation by calculating the intraclass correlation coefficients (ICC) of errors and covariates (i.e., D, the main regressor of interest).[11] In fact, the standard error can be biased by a quantity that depends on the magnitude of those coefficients, on the number of clusters, and on the size of the clusters.[12] We obtained a very small ICC of the covariate (0.073) and a zero ICC of errors. This suggests standard error bias may not be a major concern. However, we decided to maintain the more conservative cluster-adjusted standard errors, rather than the commonly used robust adjustment. These are the main results reported in the paper, but later, we discuss in detail a series of alternative specifications and corrections to standard errors. Besides the regional variations due to the trial, we need to keep in mind that wind speed and wind direction affect the landing provenance regardless of the planned schedule. In other words, ideally, during the trial, there should have been a regular weekly switch between planes landing from the east (i.e., over London) and planes landing from the west (i.e., over Reading). The reality, however, departs from the forecast due to changing atmospheric conditions. When wind speed is above 5 knots, planes always land into the wind. As we have already mentioned, in South East England, on average, the wind is westerly 70% of the year. We therefore expect more robust results for the three areas to the east of Heathrow (GLE1, GLE2, and GLE3), as for these regions, there was a significant change (decreases for GLE1 and GLE2 and increases for GLE3) during the trial (see Fig. 4). The westerly preference of planes landing over London was observed during the first 4 months of the trial.
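The role of the two intraclass correlations in the standard error check can be illustrated with a minimal sketch. It assumes the standard textbook form of the Moulton factor for unbalanced groups, in which the variance inflation is 1 + [V(n_g)/n̄ + n̄ − 1]·ρ_x·ρ_ε and the factor for standard errors is its square root; the cluster sizes are hypothetical, while the ICC values of 0.073 and zero are those reported in the text.

```python
from statistics import mean, pvariance


def moulton_factor(group_sizes, rho_x, rho_e):
    """Ratio of cluster-corrected to unadjusted standard errors.

    Assumes the textbook formula for unbalanced groups, using the
    population variance of group sizes.
    """
    n_bar = mean(group_sizes)
    var_n = pvariance(group_sizes)
    variance_ratio = 1 + (var_n / n_bar + n_bar - 1) * rho_x * rho_e
    return variance_ratio ** 0.5


# Hypothetical numbers of practices per MSOA.
practices_per_msoa = [2, 3, 1, 4]

# ICC values reported in the text: 0.073 for the covariate, zero for errors.
print(moulton_factor(practices_per_msoa, rho_x=0.073, rho_e=0.0))

# Hypothetical contrast: a non-zero error ICC inflates standard errors.
print(moulton_factor(practices_per_msoa, rho_x=0.073, rho_e=0.05))
```

Under this formula, a zero ICC of errors makes the factor exactly one whatever the cluster sizes, which is consistent with the conclusion that standard error bias is unlikely to be a major concern here.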

Effect of the trial by health conditions
Our analyses focus on the effects of the trial on central nervous, respiratory, and cardiovascular system ailments. The previous literature showed that these conditions are associated with noise pollution exposure. An initial investigation of the parallel paths assumption is given by Figs. A3 to A6, where trends of monthly spending per thousand patients are adjusted by percent of female patients, percent of old patients (85+ years old), and IMD scores of the small socio-geographical areas. They show the patterns of medication spending for the control and several treatment groups and generally suggest no differences in trends. Therefore, we take this as initial supporting evidence that the parallel paths assumption may hold. But we also produce robust DiD estimates following Gardner (2021). This author proposes a two-stage approach (2SDD) where, in the first stage, the following equation is estimated:

$$ y^{j}_{it} = \gamma^{j}_{k} + \lambda^{j}_{t} + \sum_{s} X_{sit}\beta^{j}_{s} + \varepsilon^{j}_{it} $$

for observations where D = 0, i.e., the sub-sample of untreated and not yet treated observations. The region effects ($\gamma^{j}_{k}$), the monthly time effects ($\lambda^{j}_{t}$), and the control variables $X_{sit}$ are the same as in Eq. 1. Then, in the second stage, the residuals from this estimation, $\tilde{y}^{j}_{it} = y^{j}_{it} - \gamma^{j}_{k} - \lambda^{j}_{t} - \sum_{s} X_{sit}\beta^{j}_{s}$ (where $y^{j}_{it}$ is our dependent variable), are regressed on the treatment dummy $D_{it}$.

[12] The Moulton factor tells how much larger the corrected standard error would be compared to an unadjusted standard error. With unbalanced group sizes, this is given by $\sqrt{1 + \left[ V(n_g)/\bar{n} + \bar{n} - 1 \right] \rho_{x} \rho_{\varepsilon}}$, where $n_g$ is the size of group g; $V(n_g)$ is the variance of group sizes, $\bar{n}$ is the average group size, and $\rho_{\varepsilon}$ and $\rho_{x}$ are the ICC of errors $\varepsilon$ and covariate x, respectively.
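The two-stage logic can be illustrated with a minimal sketch for the simplest case of two groups and two periods without controls. In that case the first-stage regression on untreated observations is just-identified, so the estimated group and period effects can be read off the untreated cell means. The observations are hypothetical, and this is an illustration of the idea rather than the estimator used for the reported results.

```python
from statistics import mean

# Hypothetical practice-month rows: (group, period, D, log spending).
rows = [
    ("control", "pre", 0, 7.00), ("control", "pre", 0, 7.10),
    ("control", "post", 0, 7.02), ("control", "post", 0, 7.12),
    ("treated", "pre", 0, 6.90), ("treated", "pre", 0, 7.00),
    ("treated", "post", 1, 6.86), ("treated", "post", 1, 6.96),
]


def two_stage_did(data):
    """Two-stage DiD for a 2x2 design with no covariates."""
    untreated = [r for r in data if r[2] == 0]

    def cell_mean(group, period):
        return mean(r[3] for r in untreated if r[0] == group and r[1] == period)

    # Stage 1: group and period effects from untreated observations only.
    gamma = {g: cell_mean(g, "pre") for g in ("control", "treated")}
    lam = {"pre": 0.0,
           "post": cell_mean("control", "post") - cell_mean("control", "pre")}

    # Residualise every observation, treated ones included.
    residuals = [(r[2], r[3] - gamma[r[0]] - lam[r[1]]) for r in data]

    # Stage 2: regress residuals on D without an intercept.
    numerator = sum(d * e for d, e in residuals)
    denominator = sum(d * d for d, _ in residuals)
    return numerator / denominator


print(f"two-stage estimate: {two_stage_did(rows):.4f}")
```

In this balanced 2x2 setting the two-stage estimate coincides with the usual difference of changes in cell means; the approaches differ once treatment effects are heterogeneous across groups or periods.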
In another recent important contribution, Słoczyński (2022) shows that traditional 2WFE will estimate an impact which is a weighted average of the average treatment effect on the treated (ATT) and the average treatment effect on the untreated (ATU). He shows the interesting result that the weights are inversely related to the proportion of each group. We introduce the diagnostics proposed by Słoczyński (2022) in Table 3 and discuss their implications for the interpretation of our results.
In an extension of our simple static estimates and to test further the parallel trends assumption on which the results rely, we also present an 'event study' approach where we show the effects in a dynamic model with leads and lags.This allows us to directly address the parallel trend assumption and we use for this three recently developed approaches by Gardner (2021), Borusyak et al. (2022) and Sun and Abraham (2021) which can be applied to our framework where the treatment time is unique and not staggered 13 .
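The idea behind an event-study check with leads and lags can be illustrated with a minimal sketch. It computes, for each period, the treated-minus-control gap in mean log spending relative to the gap in a reference pre-trial period. The period indexing and the group means are hypothetical, and this simple contrast is not an implementation of the Gardner, Borusyak et al., or Sun and Abraham estimators.

```python
def event_study(treated_means, control_means, reference):
    """Period-by-period gaps between groups, relative to a reference period."""
    gap = {m: treated_means[m] - control_means[m] for m in treated_means}
    return {m: gap[m] - gap[reference] for m in sorted(gap)}


# Hypothetical mean log spending by event time (negative = before the trial).
treated = {-3: 6.95, -2: 6.96, -1: 6.95, 0: 6.91, 1: 6.90, 2: 6.89}
control = {-3: 7.05, -2: 7.06, -1: 7.05, 0: 7.06, 1: 7.05, 2: 7.05}

coefficients = event_study(treated, control, reference=-1)
for period, value in coefficients.items():
    label = "lead" if period < 0 else "lag"
    print(f"{label} {period:+d}: {value:+.3f}")
```

Lead coefficients close to zero are what one would expect if the two groups followed parallel trends before the trial, while the lag coefficients trace how the effect evolves over the trial months.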
Table 3 summarises regression estimates using Eq. 1 by health condition for the whole sample for the main variable of interest, D in that equation 14.
The first column of Table 3 shows the results for the central nervous system, a therapeutic class related to the treatment of sleep loss, concentration deficits, and other stress-related diseases. The 2WFE estimate is significantly negative overall for the regions involved in the trial. This condition showed the greatest reduction in spending, 5.8%, during the trial. Column 2 of Table 3 reports the results for respiratory system conditions. The 5-month trial reduced spending on respiratory medication by 3.3%. This effect is probably measuring the impacts of air pollution, although it is difficult to draw very strong conclusions from these estimates. Indeed, we do not control for the impact of wind, a strong predictor of exposure to air pollution (Anderson, 2020), although there is evidence showing the negative impact of noise on respiratory conditions (Aurelio et al., 2014; Eze et al., 2018).
The final column of Table 3 shows the estimates for cardiovascular diseases (CVD) medication spending. These indicate that the trial had no significant overall effect across all regions involved in the flight-path variations. As shown in Appendix Table B3, the coefficient estimates are significantly positive, at around 7.2%, for GLE3 and only significant at 1 level for GLE1 and GLE2. The weaker correlation may reflect the long-term exposure to night-time noise required for these conditions to develop. Our very short time frame of 5 months may not be sufficient to identify such impacts. An alternative and likely explanation is that night-time noise affects CVD mortality rather than morbidity (see Münzel et al., 2021; Saucy et al., 2021). We do not capture mortality when using prescribed medications as the outcome.
The next two sub-panels of Table 3 present robustly estimated impacts using the two-stage difference-in-differences approach developed recently by Gardner (2021). The estimates for the second stage show similar impacts from the trial, if only slightly more pronounced. We also include in this table the diagnostics recommended by Słoczyński (2022). Those results all point to different impacts for the ATT, ATE, and ATU. The effects for columns 1 and 2 remain negative and significant. This, perhaps, should not be too surprising given that the sizes of the control and treatment groups are similar in this table. Finally, we introduce a graph displaying an 'event-study' approach to show dynamic effects over time. Figure 3 appears to show that the parallel trend assumption is not violated in three of the four approaches and that the effects vary over the 5 months of the trial. Clearly, the two-stage approach recommended by Gardner (2021) generates the smallest standard errors.
Fig. 3 Event study regressions. The 10 months correspond to 5 in the baseline period (11/2011-03/2012) and 5 in the trial period (11/2012-03/2013); see Butts and Gardner (2022)

On average, a negative effect on central nervous and respiratory system conditions seems to dominate. The explicit purpose of the systematic flight-path variations set up by Heathrow Airport was to reduce the population exposed to high noise pollution levels during sensitive hours. The results from Table 3 and Fig. 3 for all regions appear to suggest an overall significant decrease in medication spending caused by the 5-month trial.

Impacts differentiated by regions
The trial final report documented the comments received from local communities after the trial was conducted (Tucker et al., 2013). The response was mixed; residents outside the areas of predictable respite expressed vocal complaints of increased air traffic and annoyance. However, other communities perceived a decrease in early morning noise and positively assessed the trial. Therefore, it is worthwhile focusing on the regional results in more detail. These are given in Table 4, where we concentrate on the more robust estimates for the central nervous system and respiratory conditions.
The GLE1 area reported significant effects mainly for the nervous system class: GP spending on central nervous system conditions fell by 7.7%. The GLE2 region was characterised by a marked decrease in air traffic during its respite weeks, and it produced the clearest picture. The almost complete reduction in landing aircraft prevailed over the increase in flights in alternate weeks. During the trial, monthly GP spending decreased significantly by 10.5% for central nervous conditions and by 6.8% for respiratory conditions (this region also shows a decrease in GP spending on cardiovascular conditions, as shown in the Appendix).
Evidently, the results for the GLE2 area indicate that residents benefited from the weekly respite during early morning hours. It appears that 2 weeks per month of air traffic suspension were enough to reduce monthly prescription spending on all conditions. For GLE3 as a whole, we found a significant 4.7% increase for medicines related to central nervous conditions. From the maps in Figs. A1 and A2, we can see that the change affected the northern and the southern parts of GLE3 differently. To investigate the effect of the trial on the two parts of GLE3, we estimated the model separately for the two areas. The results (not reported here) showed that prescribing practices in the northern part drove the change, in contrast to our expectation that the southern part would experience the largest increase in medication spending. The main concern here is the reduction in the number of observations in each area. Having smaller regions raises the issue of patient sorting: residents of one side of the region could easily be registered with a GP on the other side, given a maximum distance from the southern to the northern part of 10 miles. This division also resulted in small numbers of practices, sixteen for GLE3 north and just five for GLE3 south.
For the GLW1 region, the coefficient estimates are positive, as expected given an overall increase in air traffic. However, they are not statistically significant. As previously discussed and as shown in Fig. 4, the wind is predominantly westerly, which implies that the majority of flights landed over the three other areas. This, combined with the sparse population density and low number of practices in this region, could explain the lack of significant results.
To conclude, our estimates suggest that the decreases in air traffic noise were responsible for the health effects. The identification of these effects was aided by the fact that the groups with the higher number of practices (hence the more densely populated) and the higher percentage of landing aircraft happened to be the two regions that experienced an important reduction in air traffic during the trial.

Robustness tests
We introduced a number of robustness tests to further investigate our main results; these are summarised in Table 5. The top panel reports the baseline coefficient estimates from Table 3. Panel 1 reports the estimates of the coefficient δ of Eq. 1 with heteroskedasticity-robust standard errors. As expected, the standard errors are lower, raising the significance relative to the variant with MSOA clusters. In panel 2 of Table 5, we changed the cluster dimension to a more aggregate level, the four trial zones: GLE1, GLE2, GLE3, and GLW1. The significance levels are comparable to the previous panel, with larger standard errors. Therefore, the results are robust to alternative error term variance corrections.
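The Moulton factor described in footnote 12 gives a rough sense of why the choice of cluster level changes the standard errors. The sketch below evaluates the textbook form of the factor for unbalanced groups; the function name is hypothetical, the population variance of group sizes is an assumption, and the inputs in the usage note are illustrative values rather than estimates from our data.

```python
import numpy as np

def moulton_se_inflation(group_sizes, rho_x, rho_eps):
    """Ratio of cluster-corrected to unadjusted standard errors (textbook Moulton factor)."""
    n = np.asarray(group_sizes, dtype=float)
    n_bar = n.mean()   # average group size
    var_n = n.var()    # variance of group sizes (population form, an assumption)
    variance_ratio = 1.0 + (var_n / n_bar + n_bar - 1.0) * rho_x * rho_eps
    # The square root converts the variance ratio into a standard-error ratio.
    return float(np.sqrt(variance_ratio))
```

For example, three groups of ten observations with `rho_x = 1.0` and `rho_eps = 0.1` give a variance ratio of 1.9, so standard errors would be understated by a factor of about 1.38 if clustering were ignored.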
For each outcome group, we repeated the analysis including all 24 months of available data, from November 2011 to October 2013. This has the advantage of including months when flight paths returned to normal operation but has the drawback of including seasonal variation unrelated to the trial. We found smaller coefficients with similar levels of significance (panel 3). The second panel of Fig. 4 shows the well-known seasonal pattern of flights, with the majority of landings in the summer months. As the trial took place during the off season, it seems preferable to compare landings with the same period 1 year earlier.
The structure of DD panel data raises concerns over serial correlation. The literature does not give unequivocal guidance on how to resolve this potential problem. One reference paper, Bertrand et al. (2004), highlighted that within the DD setting, the combination of long time series and a treatment-period indicator that varies very little over time can lead to serious issues of serial correlation. A common solution is to aggregate the observations across time periods. Therefore, we average across all 5 months for the year before the trial and all 5 months during the trial period, which is equivalent to using two cross sections.
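The aggregation step can be illustrated as follows. The sketch averages each unit's outcome over the pre-trial months and over the trial months, and then compares the average change between treated and control units. It ignores the control variables and is not the regression used for the reported estimates; the function and argument names are hypothetical.

```python
import numpy as np

def collapsed_did(y, unit, post, ever_treated):
    """DiD on data collapsed to one pre-period and one post-period mean per unit."""
    y = np.asarray(y, dtype=float)
    unit = np.asarray(unit)
    post = np.asarray(post, dtype=bool)
    g = np.asarray(ever_treated, dtype=bool)

    changes_treated, changes_control = [], []
    for u in np.unique(unit):
        rows = unit == u
        # Change in the unit's mean outcome between the two collapsed periods.
        change = y[rows & post].mean() - y[rows & ~post].mean()
        (changes_treated if g[rows][0] else changes_control).append(change)

    # Difference between the average changes of treated and control units.
    return float(np.mean(changes_treated) - np.mean(changes_control))
```

Collapsing leaves two observations per unit, which removes the within-unit serial correlation at the cost of a much smaller sample.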
We estimated Eq. 1 with this new two-period setup, and we obtained the coefficient estimates for the regressor of interest (D) reported in panel 4. The size and the direction of the effects did not change; however, the significance was affected. With such a large reduction in observations, it is difficult to obtain very precise estimates. The less restrictive alternative of adding a time trend to Eq. 1 does not substantially affect the nervous coefficient, although it does affect the significance of the respiratory coefficient (panel 5). An intermediate approach is to include area time trends, as these allow for region-specific shocks. In this case, the nervous coefficient is larger and highly significant, but the value of the respiratory coefficient drops.
As we mentioned earlier, March 2013 showed an unusual wind direction pattern; see Fig. 4. To overcome possible issues caused by the easterly wind prevalence in that specific month, we decided to exclude observations for March 2013 and consequently for March 2012. The results in panel 7 of Table 5 suggest that this deviation from the usual wind direction pattern did not significantly affect our original estimates.
We also experimented with alternative regional groupings, given that the regions are differentially affected by the landing patterns. The results are shown in panels 8 to 11. In panel 8, we include only observations for GLE1 and GLE2, which, as previously discussed and clearly shown in Table 4, reported the most significant results. We estimated the trial coefficients with these two regions grouped together, keeping the same control region and omitting the GLE3 and GLW1 areas. We therefore assess the impact of the trial on regions that experienced a visible decrease in air traffic. As expected, the estimates increased in absolute value. GP spending decreased most for central nervous system medication, from 5.8% in the original pooled estimate to 7.6%. For respiratory medication, the overall decrease in GP spending went from 3.3% in the original estimation to 3.9%.
We next report results for all regions to the east of Heathrow, adding GLE3 to the previous specification (panel 9). This confirms the same estimates reported in row 7. The magnitude reduces, as we would anticipate considering that the GLE3 area saw some increase in air traffic during the trial. Keeping only observations for GLE3 and GLW1 groups together all areas that had an overall increase in air traffic during the trial (panel 10). For these regions, we see a significant increase in cardiovascular medication spending of around 6%, as well as a positive change of 4.7% for nervous spending. This important result shows that the gains in some areas were, to some extent, counterbalanced by increased spending in regions overflown more heavily during the trial. This lends additional support to our identification strategy, which relies on early morning changes in landing patterns. Not only do we observe reduced prescriptions in areas less overflown, but we also observe increased spending in those overflown more. Finally, we report the results grouping all regions to the west of Heathrow (panel 11). For this specification, we retrieve data for the GLW2 area, which was excluded from the main analysis because it had too few practices. The signs remain positive for the three therapeutic classes, but the coefficients are not significant.
We detected a substantial decrease in spending from June 2012 onwards for cardiovascular system diseases. We discovered that in May 2012, the patent of a medicine widely used to control cholesterol levels (atorvastatin) expired, inducing a 93% reduction in its price. Consequently, the NHS advised GPs to switch to atorvastatin 15. This change in spending is likely to have been driven by the drop in the medicine price rather than by a decrease in the quantity prescribed. To account for the possibility that the switch to the generic medicine was differentially adopted in the treated and control groups, we added a further division: cardiovascular diseases spending excluding atorvastatin medicines. Panel 12 of Table 5 shows results for all cardiovascular medicines other than atorvastatin, to rule out a possible confounding effect caused by this drug. The coefficient estimate changed in size but remained statistically insignificant, as in the main specification.
In row 13, we report one of a series of sensitivity tests of our results to the choice of control groups. We can see in Fig. 1 that the two rectangles to the north and the south of the runways have slightly different sub-population sizes (the north includes more medical practices than the south). We therefore ran additional regressions using only the north rectangle as control, excluding practices to the south. The coefficients retain the same significance, with both the coefficients and standard errors only marginally increased. The difference in coefficients is not statistically significant at any conventional level. We also investigated removing GP practices in the control area situated in the near vicinity of the flight paths and found that our results were unaffected.
Finally, we ran a series of placebo regressions using health conditions that were deemed unlikely to be affected by variations in ambient noise. We selected infections and musculoskeletal and joint diseases as 'placebos'. Panel 14 of Table 5 shows the results of this analysis. The estimates for both therapeutic classes were found to be statistically insignificant, providing further support for our identification strategy.

Impacts on health spending
We next investigate the economic significance of our results. Table 6 shows back-of-the-envelope calculations of changes in monthly prescribing costs due to the implementation of the trial by region, which generated an overall decrease in spending by GP practices.
For the overall impact, we found a 6% reduction in monthly spending on nervous system conditions per thousand patients (see Table 5). On average, a practice has 6800 patients and recorded about £1786 monthly spending per thousand patients (derived from Table 2). From these figures, we calculated the monthly change in spending per practice, and we multiplied it by 403, the total number of practices in all treated regions (see Table 2). The result of this calculation is shown in Table 6 and adds up to about £294,000 saved in monthly spending for the whole region for the nervous system therapeutic class alone.
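The arithmetic behind this figure can be reproduced from the rounded inputs quoted above. The sketch below is only a check of the back-of-the-envelope calculation; the function name is hypothetical, and because the inputs are rounded the result matches the reported total only approximately.

```python
def monthly_saving(pct_reduction, spend_per_thousand, patients_per_practice, n_practices):
    """Monthly saving in GBP implied by a proportional fall in spending."""
    # Spending is quoted per thousand patients, so scale by the list size in thousands.
    saving_per_practice = pct_reduction * spend_per_thousand * patients_per_practice / 1000
    return saving_per_practice * n_practices

# Inputs quoted in the text: 6% reduction, GBP 1786 per thousand patients,
# 6800 patients per practice, 403 treated practices.
nervous_saving = monthly_saving(0.06, 1786.0, 6800, 403)
print(f"Nervous system class: about GBP {nervous_saving:,.0f} per month")
```

The per-practice saving is roughly £729 per month, which scales to about £294,000 across the 403 treated practices.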
To put this number in context, we calculated the monthly saving in these regions arising from the substitution to atorvastatin following the expiration of the patent in May 2012, as described above. This suggested about £110,000 in savings per month from this one drug alone. Therefore, our estimate of the savings from the trial for the entire nervous system class of drugs, £294,000, seems realistic.
We similarly calculated the cost savings for respiratory conditions, which were generally significant but less robust, and added these to the nervous system savings. Looking at all the regions involved in the trial, we calculated an overall net monthly saving of about £388,000. Had the flight reduction been adopted permanently 16, the NHS would have saved just under £5 million per year in respiratory and nervous system prescribing costs. To put this figure into perspective, we can calculate the total annual prescribing spend in the trial area. In 2013, in England, the prescribing spend was £142 per person 17. Multiplying this by the 403 practices times the average number of patients per practice, we obtain about £410 million, which is an estimate of the annual total spend in the trial regions. Therefore, the estimated savings account for 1.2% of the total prescribing spending. We should also note that these are likely to be conservative figures, since we excluded all practices that did not have a patient list (e.g., specialist clinics, out-of-hours services, and hospitals, which accounted for about 24% of all practices).
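The annual figures can be checked in the same way. The sketch below uses only numbers quoted in the text; the variable names are hypothetical. Note that the rounded inputs (£142 per person, 403 practices, 6800 patients per practice) imply a total spend somewhat below the reported £410 million, presumably because the average list size is rounded, so the share is shown against both totals.

```python
MONTHLY_SAVING = 388_000        # net monthly saving quoted in the text (GBP)
REPORTED_TOTAL_SPEND = 410e6    # annual prescribing spend quoted in the text (GBP)
SPEND_PER_PERSON = 142          # annual prescribing spend per person, England, 2013 (GBP)
N_PRACTICES = 403
PATIENTS_PER_PRACTICE = 6800    # rounded average list size quoted in the text

# Annualise the monthly saving.
annual_saving = 12 * MONTHLY_SAVING

# Total annual spend implied by the rounded inputs.
implied_total_spend = SPEND_PER_PERSON * N_PRACTICES * PATIENTS_PER_PRACTICE

# Savings as a share of total prescribing spend, under each total.
share_reported = annual_saving / REPORTED_TOTAL_SPEND
share_implied = annual_saving / implied_total_spend

print(f"Annual saving: GBP {annual_saving:,}")
print(f"Share of reported total: {share_reported:.2%}")
print(f"Share of implied total:  {share_implied:.2%}")
```

Either way, the saving is a little over 1% of total prescribing spend in the trial regions.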
To complete the picture of the induced monetary saving, we should add the reduced costs of GP time due to the likely lower number of visits by patients to request prescriptions, although we do not have sufficient data to estimate this additional cost. There are also potential indirect benefits, such as reduced absenteeism and related gains in productivity. Combining these with the direct reduction in medical spending is likely to lead to larger savings to the National Health Service (NHS) in London.

Conclusion
In this paper, we estimate the health externalities generated by noise pollution from aircraft, exploiting a 5-month trial that took place around London Heathrow Airport from November 2012 to March 2013. The trial involved changes in patterns of aircraft landings during early morning hours (4.30 am to 6.00 am). Health effects are measured through changes in medication prescribing by GP practice. We find a statistically significant response of monthly medication spending on central nervous and respiratory system conditions to these changes and some (less significant) effects for cardiovascular conditions. We detect stronger significant reductions in prescription spending on central nervous and respiratory conditions in the regions that experienced a drop in air traffic during the trial. Residents in regions more overflown during the trial increased their medicine intake, but the effects are weaker. These results are also consistent with the idea that complete respite has a stronger effect than an increase in an already noisy environment.
This quasi-experimental approach suggests a causal impact of aircraft noise pollution on human health. By relying on a quasi-experimental research design from a short trial of 5 months, we add to epidemiology-based studies that find negative impacts of aircraft noise on health around major airports (Clark et al., 2012; Wang et al., 2022).
This study also illustrates the benefits of using publicly available data to estimate some of the direct costs that adverse environmental exposure imposes on society, costs that are often borne by the public health system. Our calculations suggest a sizeable direct impact on GP spending in the areas affected. These estimates do not include the reduced costs of avoided GP visits, the gain in patients' well-being, and impacts on individual worker productivity through absenteeism or less effective effort in the workplace. Our findings suggest that small variations in air traffic exposure during critical hours affect health, and this could inform environmental policy.

Fig. 1 Treated and control areas. Location of Heathrow Airport, GP practices (dots), and trial areas: two control rectangles, north and south; five treated trapeziums, two west (GLW1 and GLW2) and three east of Heathrow (GLE1, GLE2, and GLE3)

Fig. 2 Middle super output areas (MSOAs) of GP practices included in our regressions

Fig. 4 Average monthly number of days and flights per landing direction

-robust standard errors in parentheses if not otherwise specified. † Heteroskedasticity-robust standard errors.

Table 2 Sample descriptive statistics, monthly averages, Nov 2012-Mar 2013 (during the trial) and Nov 2011-Mar 2012 (before the trial)