This section details the construction of the data set used in the analysis. The steps include (i) the selection and aggregation procedure of the loan-level data, (ii) the definition of energy-efficient building ratings, and (iii) the methodology of merging the two data sets. In addition, we present the selection of variables for the analysis and the respective summary statistics.
Data and Sample Selection
In the following analysis, we employ Dutch mortgage data obtained from the European DataWarehouse (ED).Footnote 8 ED provides a comprehensive data set with periodically updated dynamic and static individual loan-level information on securitized European mortgages.Footnote 9 We restrict the data sample according to the following criteria. The sample period covers January 2014 to May 2018 and the country of assets is limited to the Netherlands. The type of borrower is “individual” and the primary income is between EUR 20,000 and 1,000,000. The property type is “residential detached/semi-detached house”, “apartment”, or “terraced house”. The building’s occupancy type is restricted to “owner-occupied” and the construction year of the buildings is between 1900 and 2016. In addition, we focus only on fixed-interest rate mortgages and exclude repurchased ones. Finally, we require that each borrower be associated with exactly one building and vice versa. Appendix A provides an overview of the variables selected for the analysis.
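The record-level selection criteria above can be sketched as a simple predicate. This is a minimal illustration, not the actual extraction code: the field names and category labels are hypothetical and do not correspond to the ED template codes, the bounds are assumed to be inclusive, and the one-to-one borrower-building requirement (which needs the full data set) is not covered.

```python
from datetime import date

# Hypothetical labels; the actual ED loan-level template uses its own codes.
ALLOWED_PROPERTY_TYPES = {
    "residential detached/semi-detached house",
    "apartment",
    "terraced house",
}

def passes_selection(rec: dict) -> bool:
    """Return True if a loan-level record meets the criteria listed in the text.

    Bounds are treated as inclusive, which is an assumption.
    """
    return (
        date(2014, 1, 1) <= rec["report_date"] <= date(2018, 5, 31)
        and rec["country"] == "NL"
        and rec["borrower_type"] == "individual"
        and 20_000 <= rec["primary_income"] <= 1_000_000
        and rec["property_type"] in ALLOWED_PROPERTY_TYPES
        and rec["occupancy"] == "owner-occupied"
        and 1900 <= rec["construction_year"] <= 2016
        and rec["rate_type"] == "fixed"
        and not rec["repurchased"]
    )

# Made-up record for illustration only.
example = {
    "report_date": date(2016, 6, 30),
    "country": "NL",
    "borrower_type": "individual",
    "primary_income": 45_000,
    "property_type": "apartment",
    "occupancy": "owner-occupied",
    "construction_year": 1995,
    "rate_type": "fixed",
    "repurchased": False,
}
print(passes_selection(example))
```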
After applying the above selection criteria, our final data set totals 273,024 individual mortgage components that are associated with 127,309 individual buildings. The discrepancy between the number of mortgage components and the number of underlying buildings results from the Dutch-specific tax treatment of mortgages. A typical Dutch mortgage loan consists of multiple loan parts, e.g., a bank savings loan part that is combined with an interest-only loan part. This is more common for mortgages originated prior to 2013 when there was a specific tax preference for interest-only mortgages. Besides the tax reasons, the number of mortgage components may go beyond two if a borrower takes out an additional mortgage on the same building at a later date.
For the analysis, we aggregate loan component information at the building level. For certain variables, this is already done by the data provider. For example, variables such as LTV or debt-to-income (DTI) are available at the borrower level (i.e., the same value is reported for each loan component). Where necessary, we compute for each building the average variable value across loan components weighted by the loan component’s original balance. “Choice of Variables and Summary Statistics” provides further details on this aggregation procedure.
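The balance-weighted aggregation described above can be illustrated as follows. The sketch assumes hypothetical field names (`building_id`, `original_balance`) and uses made-up numbers; it is meant only to make the weighting explicit.

```python
from collections import defaultdict

def weighted_average_by_building(components, value_key):
    """Average `value_key` across loan components of each building,
    weighted by each component's original balance."""
    sums = defaultdict(lambda: [0.0, 0.0])  # building_id -> [sum(w * x), sum(w)]
    for c in components:
        w = c["original_balance"]
        sums[c["building_id"]][0] += w * c[value_key]
        sums[c["building_id"]][1] += w
    return {b: wx / w for b, (wx, w) in sums.items()}

# Made-up example: one building financed by two loan parts.
components = [
    {"building_id": "B1", "original_balance": 100_000, "loan_term": 360},
    {"building_id": "B1", "original_balance": 50_000, "loan_term": 240},
]
avg_term = weighted_average_by_building(components, "loan_term")
print(avg_term)  # (100000*360 + 50000*240) / 150000 = 320 months for B1
```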
Defining Energy Efficiency
For the classification of buildings into different energy efficiency categories, we rely on the Dutch energy performance reference table prepared by the RVO.Footnote 10 The intention behind this table is to determine a provisional energy label for all existing Dutch residential buildings. The provisional EPC indicates the energy performance of a reference building that was developed using cadastral data (i.e., area, date of construction, building type, quality of insulation of floors, roof and walls, and systems for heating, hot water, and renewable energy) of the Dutch residential building stock. The dwelling owners are encouraged to modify or add additional information about energy improvement measures, which must be approved by a qualified expert before being posted on the website. In this respect, owners must also provide evidence of the measures carried out, such as invoices and photographs. The qualified expert reviews the uploaded changes and documents before approving the definite EPC. Based on this approval, the provisional EPC is finally replaced by the actual, new EPC, which is registered at RVO. This final EPC is based on a national calculation method that takes into account the retrofit measures carried out by the owner of the dwelling.
In the provisional rating table, the label classes are calculated as described in RVO (2014). This document relies on studies conducted on the Dutch residential market, including Boumeester et al. (2008), Agentschap NL (2011), and Agentschap NL (2013). To determine the provisional energy label, the RVO has drawn up 60 reference buildings.Footnote 11 The ordinal rating scale ranges from G (lowest energy efficiency) to A (highest energy efficiency). For each type of dwelling and year of construction, the most common characteristics of the house were studied in terms of flooring, roof, heating and ventilation system, presence of solar panels, etc., and each characteristic was assigned a code. Then, based on these properties, the approximate energy consumption of the reference dwelling was calculated and a corresponding energy label between G and A was assigned.
It should be noted that the provisional label assignment procedure is less accurate than the assignment of actual EPC ratings. Unfortunately, the actual EPC ratings are not available, so a measurement error may be present. This error can be either random or systematic. If the error is random, then it should average out given the size of the data set.Footnote 12 If the error is systematic, then it must stem from one of two possible sources: either (i) the energy efficiency of buildings systematically improved after construction (e.g., due to homeowner participation in large-scale retrofit programmes) or (ii) it systematically deteriorated. We can rule out systematic rating deterioration because it would be irrational for a large base of homeowners to deliberately reduce their buildings’ energy efficiency or to neglect building maintenance. On the other hand, a systematic improvement in energy efficiency is a legitimate concern.Footnote 13 However, if actual energy ratings improved after construction, then our results should be placed at the lower end of the estimation spectrum, meaning that our findings are likely to underestimate the true relationship between EE and PD. As in other studies that have only EPC ratings available, uncertainty remains regarding actual energy consumption: buildings with the same energy rating do not necessarily have the same gas or energy consumption, which depends on the individual consumption patterns of borrowers. Consequently, consumption behaviour is usually treated as idiosyncratic.
Table 1 provides an overview of the final energy classification for the analysis. It is evident that energy efficiency has improved over time, with the most efficient buildings built after 2006. Furthermore, it should be noted that the construction year periods are not of equal length. This means that energy efficiency improvement is not a linear function of the year of construction. It is technological progress and legislation (and not simply time) that drive the rapid energy efficiency improvement during certain periods. In particular, the Building Decree (Bouwbesluit), which came into force in 1992, stands out as an important piece of legislation requiring better roof and floor insulation for newly built dwellings. Its impact can be observed in the rating improvements from C to B between the 1988-1991 and 1992-1999 periods. As explained in the next section, we take advantage of the panel structure of the data and define a variable for building age that allows us to capture the EE effect arising from the rapid improvement in energy efficiency (i.e., technological progress) between the years 1991 and 1992. In addition, we can observe that some ratings do not change simultaneously across property types and construction years. This feature allows us to decouple the energy efficiency component from the construction year and type effect when examining the degree of energy efficiency in Appendix B.
Table 2 presents the energy rating distribution of all buildings in the sample, and Table 3 reports the building distribution across Dutch provinces. In both tables, a mortgage on a building is marked as defaulted if at least one of the mortgage components is in arrears for at least 90 days. It can be observed that C-rated (E-rated) buildings form the largest (smallest) bucket in the sample, while the remaining ratings are distributed reasonably evenly. Column three in Table 2 reports the percentage of defaulted mortgages within each rating category. In this context, the rising share of defaults associated with a lower energy efficiency rating is worth highlighting. Overall, the share of defaulted mortgages is rather low at 0.55%. From Table 3, we can see that the mortgages are not evenly distributed across the Dutch provinces, with the largest share stemming from Holland. Within each province, between one-fifth and one-half of buildings are classified as energy-efficient (i.e., having an A or B rating). In every province, energy-efficient mortgages account for a smaller share of defaulted loans than non-efficient mortgages.
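The default flag and the within-rating default share reported in Table 2 follow a simple rule that can be sketched as below. The field names are hypothetical and the records are made up for illustration; they are not drawn from the sample.

```python
from collections import defaultdict

DEFAULT_THRESHOLD_DAYS = 90  # arrears threshold stated in the text

def is_defaulted(days_in_arrears):
    """A mortgage on a building is defaulted if at least one of its
    loan components is in arrears for at least 90 days."""
    return any(d >= DEFAULT_THRESHOLD_DAYS for d in days_in_arrears)

def default_share_by_rating(buildings):
    """Percentage of defaulted mortgages within each energy rating."""
    counts = defaultdict(lambda: [0, 0])  # rating -> [defaulted, total]
    for b in buildings:
        counts[b["rating"]][0] += is_defaulted(b["days_in_arrears"])
        counts[b["rating"]][1] += 1
    return {r: 100.0 * d / n for r, (d, n) in counts.items()}

# Made-up records: one list entry per loan component of the building.
buildings = [
    {"rating": "A", "days_in_arrears": [0, 0]},
    {"rating": "A", "days_in_arrears": [0]},
    {"rating": "G", "days_in_arrears": [0, 120]},
    {"rating": "G", "days_in_arrears": [30, 60]},
]
shares = default_share_by_rating(buildings)
print(shares)
```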
Choice of Variables and Summary Statistics
The control variables for the analyses are those identified in the existing literature as having a significant impact on mortgage default probability (see An and Pivo 2015). In particular, the variables are chosen in order to account for potential risk mitigating channels that might arise due to energy efficiency. The variables can be categorized into four different types: mortgage, building, borrower, and macroeconomic/financial variables.
Among mortgage variables, we employ contemporaneous LTV, contemporaneous DSCR, contemporaneous DTI, and loan term. LTV and DTI are reported by ED.Footnote 14 The DSCR for each building is defined as the ratio of total monthly income to total monthly periodic constant payments, where the latter is the sum of monthly periodic constant payments across all loan components associated with the building. The periodic constant payments were calculated using the contemporaneous loan balance, the interest rate, and the number of periods remaining until maturity. The loan term at the building level is defined as the difference between the issue date and the maturity date (measured in months) and aggregated as the original balance-weighted average across all loan components associated with that building. Using contemporaneous LTV allows us to control for the potential value channel that might arise due to energy efficiency. As pointed out by An and Pivo (2020), energy efficiency is likely to improve the dwelling’s market value, which in turn should lower the contemporaneous LTV. Chegut et al. (2020) document that Dutch energy efficient dwellings were valued significantly higher in 2015 compared to the baseline year of 2010. This suggests that the reported contemporaneous LTV ratios in our panel data set, which covers the period January 2014 to May 2018, should reflect the market value of buildings’ energy efficiency. Thus, LTV appears to be an appropriate control for the value channel and our findings should therefore be placed on top of the value-induced risk reduction effect. Regarding the income channel, i.e., the improvement in the borrower’s disposable income due to energy efficiency, neither DSCR nor DTI is an appropriate control variable because both capture only general household income, not savings from energy efficiency. Unfortunately, we do not have access to information on borrower energy use that could address this issue.
However, in “Economic Mechanism” we examine the income channel by differentiating between high-, middle-, and low-income households on top of the classical DSCR or DTI ratio used in the default probability models to capture the additional cash flow due to energy savings.
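The DSCR definition given in this subsection can be sketched as follows. The text does not state the exact payment formula, so the standard level-payment annuity formula with monthly compounding (annual rate divided by 12) is used here as an assumption; field names are hypothetical and the numbers are made up.

```python
def constant_payment(balance, annual_rate, months_remaining):
    """Monthly level payment that amortizes `balance` over `months_remaining`.

    Assumes the standard annuity formula and a monthly rate of annual_rate / 12;
    the payment convention actually used in the analysis may differ.
    """
    r = annual_rate / 12.0
    if r == 0:
        return balance / months_remaining
    return balance * r / (1.0 - (1.0 + r) ** -months_remaining)

def dscr(total_monthly_income, components):
    """Total monthly income divided by the sum of constant payments
    across all loan components of the building."""
    total_payment = sum(
        constant_payment(c["balance"], c["annual_rate"], c["months_remaining"])
        for c in components
    )
    return total_monthly_income / total_payment

# Made-up building with two loan parts.
parts = [
    {"balance": 100_000, "annual_rate": 0.06, "months_remaining": 360},
    {"balance": 120_000, "annual_rate": 0.0, "months_remaining": 120},
]
print(round(constant_payment(100_000, 0.06, 360), 2))
print(dscr(4_000, parts))
```

A higher DSCR indicates more income per unit of scheduled debt service; as noted in the text, it does not capture savings from lower energy bills.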
Building variables include property type, geographic location at the NUTS 3 level, and building age category.Footnote 15 Building age is defined as the difference between the current loan year and the year the building was built. We categorize building age into 3-year categories because, according to Underwood and Alshawi (2000), this is the shortest maintenance cycle of a building. Due to the panel structure of our data, this variable definition allows us to disentangle the building age component from the EE component in the regressions, because EE variation remains within certain age categories.Footnote 16
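The building age category can be illustrated with a short sketch. The bin edges (ages 0-2, 3-5, 6-8, and so on) are an assumption, since the text specifies only the 3-year width.

```python
def building_age_category(loan_year, construction_year, width=3):
    """Return the (lower, upper) bounds in years of the age bin of a building.

    Bins are assumed to start at age 0: (0, 2), (3, 5), (6, 8), ...
    """
    age = loan_year - construction_year
    if age < 0:
        raise ValueError("loan year precedes construction year")
    lower = (age // width) * width
    return (lower, lower + width - 1)

# A building from 1992 moves across age bins during the sample period,
# while its construction-year-based energy rating stays fixed.
for year in (2014, 2016, 2018):
    print(year, building_age_category(year, 1992))
```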
Borrower-level information includes total income, defined as the sum of primary and secondary income, and the borrower’s age at origination of the earliest loan component. We categorize the total income across high, medium and low tercile groups. These variables are designed to account for household characteristics that may influence both the decision to purchase an energy efficient home and mortgage default risk. For instance, older, wealthier, and more financially literate households might be more likely to acquire a home with better energy performance while having a low-risk credit profile.
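The income grouping can be sketched as below. Total income is the sum of primary and secondary income, as defined above; the tercile cut points are computed over the pooled example data with Python's default quantile method, both of which are assumptions about details the text leaves open.

```python
import statistics

def total_income(primary, secondary):
    """Total income as the sum of primary and secondary income."""
    return primary + secondary

def income_tercile_labels(incomes):
    """Assign 'low', 'medium', or 'high' using the two tercile cut points."""
    q1, q2 = statistics.quantiles(incomes, n=3)
    labels = []
    for x in incomes:
        if x <= q1:
            labels.append("low")
        elif x <= q2:
            labels.append("medium")
        else:
            labels.append("high")
    return labels

# Made-up total incomes in EUR.
incomes = [30_000, 40_000, 50_000, 60_000, 70_000, 80_000]
labels = income_tercile_labels(incomes)
print(labels)
```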
To control for general macroeconomic conditions, we include the quarterly Dutch unemployment rate, the end-of-month 10-year German government bond yields, the monthly standard deviation of 10-year German bond yields, and the yield curve slope, defined as the difference between 10- and 1-year EUR swap rates. The variables are obtained from Bloomberg.
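Two of these controls are simple transformations of market data and can be sketched as follows. The slope follows the definition in the text; for the volatility measure, the use of daily yield observations within each month and of the sample standard deviation are assumptions, as the text does not specify them.

```python
import statistics

def yield_curve_slope(swap_10y, swap_1y):
    """Difference between the 10-year and 1-year EUR swap rates."""
    return swap_10y - swap_1y

def monthly_yield_volatility(daily_yields):
    """Sample standard deviation of the yields observed within one month
    (daily frequency assumed)."""
    return statistics.stdev(daily_yields)

# Made-up values in percent.
print(yield_curve_slope(1.2, 0.3))
print(monthly_yield_volatility([0.4, 0.5, 0.6]))
```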
Summary statistics at the property level are presented below. Table 4 provides summary statistics on key borrower, property, and mortgage characteristics as a one-time cross-sectional snapshot using the most recently reported values. The table differentiates between non-defaulted (Panel A) and defaulted (Panel B) mortgages. Within both panels, we also differentiate between energy-efficient (EE = 1) and energy-inefficient (EE = 0) buildings. A dwelling is considered EE if it has an A or B rating. Beginning with borrower characteristics, age at mortgage origination does not appear to differ substantially between EE and non-EE mortgages. However, younger borrowers are significantly more likely to experience default than older borrowers. In terms of income, EE-building borrowers have higher overall total household income for both defaulted and non-defaulted loans, while defaulted borrowers generally have relatively lower annual income. The construction year of buildings varies between EE and non-EE by definition. More recently constructed buildings are EE. About 68% of buildings are detached houses, while 17% are apartments and 15% are terraced houses in the sample (results are not reported in the table). Average interest rates and original LTV are higher for defaulted and non-EE mortgages. In line with the findings of An and Pivo (2020), we observe that on average LTV is lower for energy efficient dwellings, suggesting that EE is associated with higher property value.
Figure 1 shows the distribution of mortgages according to the buildings’ year of construction (Panel A), the total original balance (Panel B), and the earliest origination year (Panel C). It is worth noting that our data set is well diversified according to buildings’ construction year. In addition, we have a considerable number of mortgages older than ten years. This is an important feature since defaults typically do not occur in the first few years after origination.
Unreported statistics on market and economic variables indicate that the average quarterly Dutch unemployment rate for the period January 2014 to May 2018 is about 6.47%. For the same period, the average yield on 10-year German government bonds is 0.46%, their average monthly standard deviation is 0.095%, and the average difference between 10- and 1-year Euro swap rates amounts to 0.963%.