1 Introduction

Demand for mortgage-backed securities (MBS) climbed in the years leading to the 2007–2009 Great Financial Crisis (GFC) as these bonds offered higher yields and required lower capital charges. Strong demand created an incentive for issuing banks to create more of these highly rated securities from low-quality loans and to relax lending standards (Demyanyk and Van Hemert 2011; Keys et al. 2010; Nadauld and Sherlund 2013). Furthermore, misaligned incentives and imperfect information in the securitization chain reduced lending banks’ incentives to collect soft information, and to perform their screening and monitoring functions efficiently (Rajan et al. 2015). Credit ratings were also systematically biased. Rating agencies granted relative rating favours to larger issuers and issuers that offered them significant bilateral securitization business (Efing and Hau 2015; He et al. 2016).

Consequently, investors suffered significant losses during the 2007–2009 financial crisis (Watson 2008). They were criticised for being overly reliant on credit ratings when evaluating risks embedded in MBS (Mählmann 2012). Evidently, credit ratings did not reliably cover the risk profile of debt tranches; hence, tranche pricing based exclusively on credit ratings created perverse incentives for issuers to exploit this unpriced risk (Mählmann 2016). [Footnote 1] However, empirical evidence also shows that MBS prices accounted for incentive problems and other critical factors in addition to credit ratings. For example, Fabozzi and Vink (2015) find that initial yield spreads of European MBS reflected rating risk and, even after conditioning on assigned credit ratings at issuance, yields of European MBS issues accounted for factors such as tranche seniority, nature of collateral and external credit enhancement (Fabozzi and Vink 2012a, b). In the US, investors priced the probability of rating shopping, in which pessimistic ratings are suppressed: yield spreads predicted losses on single-rated tranches while ratings could not (He et al. 2016). Spreads on equally rated US MBS were higher for larger issuers, signalling that investors perceived inflated ratings to be correlated with issuer size. Deku et al. (2019) find that engaging reputable trustees in securitization transactions led to lower spreads, and trustees’ reputation became more important when risk assessment was more challenging due to several layers of misaligned incentives in the securitization chain.

Overall, this strand of the literature shows that investors attempted to incorporate the potential costs of misaligned interests beyond the informative content of ratings into initial MBS prices. Empirical evidence shows that yield spreads of US MBS at issuance were reliable predictors of both future downgrades and defaults. The ability of yield spreads to predict future performance in terms of future defaults and rating downgrades is stronger for lower rated, information-sensitive MBS (Adelino 2009; He et al. 2016).

In this paper, we investigate whether the predictive power of initial MBS yield spreads varies with the financial cycle, [Footnote 2] a question that has not been addressed in the existing literature. There is empirical evidence of significant herding tendencies among institutional investors during asset bubbles. Furthermore, institutional trading patterns can generate short-term price pressures which worsen asset bubbles (Singh 2013). Nonetheless, theoretical models predict that institutional investors are likely to identify asset bubbles (Abreu and Brunnermeier 2003; De Long et al. 1990; Sato 2016). [Footnote 3] Thus, some institutional investors become aware of developing bubbles due to mispricing; however, the inability to coordinate arbitrage allows the bubble to expand and encourages them to “ride” the bubble instead. Empirical studies find evidence supporting this argument. For example, Brunnermeier and Nagel (2004) find that during the technology bubble of the late 1990s, hedge funds reduced their holdings of technology shares before prices collapsed. Hedge fund managers understood that prices would eventually decline but exploited this opportunity by fuelling the bubble. [Footnote 4] Griffin et al. (2011) find that the most sophisticated investors actively purchased technology stocks during the run-up to the technology bubble, and quickly reversed course in March 2000 before the bubble burst. Our arguments rely loosely on this literature. We posit that institutional MBS investors are capable of detecting asset bubbles and revising their valuations accordingly.

We hypothesize that the information content of initial yield spreads regarding MBS quality should be more apparent during bubble periods, because investors would expect a decline in the quality of mortgages originated in these periods. Asset bubble phases are characterised by high credit growth rates, while contractions are associated with negative credit growth and higher loan losses. Consequently, credit booms tend to precede periods of severe deterioration in credit quality (Caporale et al. 2014). We use two measures for asset bubbles: (1) the credit bubble period commonly observed between 2005 and the first half of 2007 in Europe, and (2) house price bubbles observed in individual countries between 1999 and 2007.

We examine whether the information content of initial yield spreads varies between normal and asset bubble periods using a sample of 4203 MBS issued in twelve Western European countries. We find that MBS prices at issuance predict future downgrades due to deterioration in quality, after conditioning on initial credit ratings. The predictive power of spreads is higher during asset bubble periods and for AAA-rated MBS. Furthermore, within the AAA category, spreads also predict the magnitude of credit rating downgrades. Our findings show that initial yield spreads of MBS incorporated additional information in excess of credit ratings. The results are robust to accounting for both types of asset bubbles (i.e. credit and housing), the severity of information asymmetries in MBS (such as rating disagreements) and possible endogeneity bias that may arise due to our modelling approach. Furthermore, we utilise machine learning (ML) techniques (classification trees, naïve Bayes, support vector machines and random forests), a novel approach in this literature, to check the robustness of our results. Overall, the results from these methods confirm the predictive property of initial yield spreads.

Our contribution to the literature is fourfold. Firstly, we examine whether the predictive power of initial yield spreads differs depending on the financial cycle. The existing literature finds that yield spreads are reliable predictors of future performance, and that credit ratings do not seem to capture all the risks embedded in MBS (Adelino 2009; He et al. 2016). He et al. (2016) also report that the predictive power of yields for future losses was stronger during 2004–2006 (i.e. the pre-crisis period), when the US market was at its peak. However, we do not know whether this predictive power differs in certain economic periods, especially during housing bubbles. Rather than relying only on the period of credit growth prior to the GFC, as previous studies do, we contribute to the literature by identifying housing bubbles in each country to test the predictive power of spreads.

Secondly, we utilise a comprehensive dataset to capture the cross-country variation in MBS pricing. A key limitation of earlier studies on whether MBS yield spreads predict future performance is their focus on a single market, i.e. the US (Adelino 2009; He et al. 2016). We contribute to the literature by providing the first evidence outside the US using an international sample. Hence, our results cannot be ascribed to institutional or regulatory features idiosyncratic to any single country. We also contribute by capturing housing bubbles across multiple countries. This is important as our data show that housing bubbles in neighbouring countries do not necessarily coincide with each other.

Thirdly, we provide the first evidence on the predictive value of the initial yield spreads of European MBS. Despite being the second largest market after the US, the European market has received considerably less research attention. This is important because the evolution and the institutional framework of the European securitization market differ significantly from the US equivalent. First, unlike the US, where the Government Sponsored Enterprises (such as Freddie Mac and Fannie Mae) are dominant market players, there is limited government involvement in the European market, which is driven purely by private institutions. Second, even though the US securitization market has been active since the late 1960s, the development of the European market has been relatively recent and has been attributed to the introduction of the Euro in the late 1990s (Altunbas et al. 2009; Kara et al. 2016). Against this backdrop, investors may have been exposed to higher information asymmetries. As a result, they may have been more cautious when assessing and pricing European MBS. Therefore, we also contribute to the literature by testing the predictive power of initial MBS yield spreads, and its possible variation with the financial cycle, in the European securitization market.

Finally, we contribute to the literature by introducing ML techniques for the first time in the securitization literature. Specifically, we utilise classification trees, the naïve Bayes algorithm, support vector machines (SVM), and random forest (RF) algorithms. Different combinations of these methods alongside logistic regression analysis have recently gained wide attention in the banking and finance literature. For example, Chen (2011) combines regression trees with logistic regression to analyse and predict corporate financial distress. Tanaka et al. (2016) develop an early warning RF framework for bank failures and Tanaka et al. (2019) also use RF to predict industry level financial bankruptcy. Mercadier and Lardy (2019) employ RF to approximate credit default swaps to assess the default risk of companies when these financial derivatives are not available. We hope that our study will encourage researchers in the bond pricing and performance literature to use ML-based approaches, as generally advocated by Varian (2014).

The organisation of the paper is as follows. In the next section, we describe the data and our empirical strategy. Results are presented in Sect. 3 in four sub-sections: credit bubble, housing bubble, the magnitude of downgrade and robustness checks. In Sect. 4 we present results from the ML analysis. Section 5 concludes.

2 Data and methodology

2.1 Data

We collect deal and tranche level data from Dealogic and Bloomberg on 4203 MBS issued in twelve European countries between 1999 and June 2007. These countries are Belgium, France, Germany, Greece, Ireland, Italy, Netherlands, Portugal, Spain, Sweden, Switzerland and the United Kingdom. The cut-off date is chosen because investor appetite for securitizations was minimal after June 2007, with market volume declining significantly as the financial crisis unfolded. Furthermore, originators have largely retained post-2007 European issuances rather than placing them with private investors. According to data published by the Securities Industry and Financial Markets Association (SIFMA), issuing banks were only able to place 36% of all issuances between July and December of 2007, 13% in 2008 and only 6% in 2009.

2.2 Empirical model

Although recent evidence (Fabozzi and Vink 2012a, b, 2015; He et al. 2016) indicates that investors incorporated a variety of factors into pricing MBS, credit ratings are the single most important determinant of bond prices at origination. Structured finance credit ratings are forward-looking credit opinions that account for the credit risk of the underlying assets, structural risk and counterparty risk. We also assume that ratings account for delinquency rates. However, structural features can be engineered to stave off rating downgrades. For instance, high levels of credit support can result in the maintenance or upgrade of an existing credit rating. Therefore, credit ratings measure the expected performance of the underlying assets as well as structural features. Given that there is no organised secondary market for MBS, pricing data are very scant. Therefore, we rely on credit rating downgrades as a measure of deterioration in at least one of these dimensions.

Following the specification of Adelino (2009), our baseline logistic model for the probability of a downgrade of bond i is as follows:

$$ \log \frac{P\left( \text{Downgrade} = 1 \mid \text{x} \right)}{1 - P\left( \text{Downgrade} = 1 \mid \text{x} \right)} = \beta_{0} + \beta_{1}\,\text{Spread}_{i} + \beta_{2}\,\text{Bubble}_{c,t,i} + \beta_{3}\,\text{Bubble}_{c,t,i} \times \text{Spread}_{i} + \beta_{4}\,\text{Weighted Average Life}_{i} + \sum_{k = 1}^{K - 1} \beta_{k}\,\text{Credit Rating}_{k,i} + \sum_{y = 1}^{Y - 1} \beta_{y}\,\text{Year}_{y,i} $$

where Downgrade equals 1 if the credit rating of the tranche is adjusted downwards (with the latest observation point of 31 December 2014) relative to the rating awarded at issuance by at least one of the three largest credit rating agencies (Standard & Poor’s, Moody’s and Fitch), and 0 if the rating is maintained or upgraded. We collect credit ratings at issuance and rating changes of all bonds and convert the ratings to a numerical point scale, where AAA/Aaa = 1, AA+/Aa1 = 2 and so on. Thus, a downgrade is defined as a negative migration to a lower rating, for instance from AAA to AA+. Downgrades are typically triggered by adverse changes in credit risk, counterparty risk or structural risk associated with how the deal was engineered. Spread is the log of the spread at issuance over the pricing benchmark, i.e. 3-month Euribor. We use yield spreads as a predictor that subsumes the effects of deal, tranche, issuer and other macro-economic characteristics. We restrict the sample to floating rate tranches to circumvent the difficulties associated with estimating a consistent benchmark yield curve for each fixed rate tranche, and to tranches issued at par to preclude distortions of discounts or premiums on the actual yield spreads. [Footnote 5]
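The rating conversion and the Downgrade indicator described above can be illustrated with a minimal Python sketch. The notch scale follows the text (AAA/Aaa = 1, AA+/Aa1 = 2 and so on), but the scale is cut off at B−/B3 for brevity, and the names `RATING_TO_NUMBER` and `is_downgraded` as well as the sample ratings are hypothetical, not taken from the paper.

```python
# Illustrative sketch only: helper names and sample data are hypothetical.
# Scale truncated at B-/B3 for brevity (an assumption, not the paper's choice).
SP_FITCH_SCALE = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
                  "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-",
                  "B+", "B", "B-"]
MOODYS_SCALE = ["Aaa", "Aa1", "Aa2", "Aa3", "A1", "A2", "A3",
                "Baa1", "Baa2", "Baa3", "Ba1", "Ba2", "Ba3",
                "B1", "B2", "B3"]

# Map every symbol to its position on the numerical point scale (AAA/Aaa = 1).
RATING_TO_NUMBER = {}
for scale in (SP_FITCH_SCALE, MOODYS_SCALE):
    for position, symbol in enumerate(scale, start=1):
        RATING_TO_NUMBER[symbol] = position


def is_downgraded(initial, latest):
    """Return 1 if at least one agency's latest rating is worse than at issuance.

    `initial` and `latest` map an agency name to a rating symbol; agencies
    without a latest rating are skipped.
    """
    for agency, first_rating in initial.items():
        if agency not in latest:
            continue
        if RATING_TO_NUMBER[latest[agency]] > RATING_TO_NUMBER[first_rating]:
            return 1
    return 0


# One agency moves the tranche from AAA to AA+, so Downgrade = 1.
example = is_downgraded({"S&P": "AAA", "Moody's": "Aaa"},
                        {"S&P": "AA+", "Moody's": "Aaa"})
```

A larger number means a lower rating, so a downgrade is any increase in the numerical score.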

We use two alternative variables for Bubble. Credit Bubble captures the credit growth period prior to the GFC. This variable equals 1 if the deal is issued during the boom period between 2005 and the first half of 2007, and 0 otherwise. We utilise the interaction Credit Bubble × Spread to capture the predictive ability of yield spreads during the credit bubble period.
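As a rough illustration of how the Credit Bubble dummy and its interaction with Spread enter the log-odds, the sketch below builds the dummy from an issue date and converts the linear predictor into a probability. The window boundaries (1 January 2005 to 30 June 2007) are one reading of the text, the coefficient values are invented purely for illustration, and rating and year dummies are folded into the intercept to keep the example short.

```python
import math
from datetime import date

# Assumed window: "2005 and the first half of 2007" read as
# 1 January 2005 through 30 June 2007 inclusive.
CREDIT_BUBBLE_START = date(2005, 1, 1)
CREDIT_BUBBLE_END = date(2007, 6, 30)


def credit_bubble(issue_date):
    """Return 1 if the deal was issued inside the assumed boom window."""
    return int(CREDIT_BUBBLE_START <= issue_date <= CREDIT_BUBBLE_END)


def downgrade_probability(spread_bp, bubble, wal_years, coef):
    """Convert the linear predictor (log-odds) into a downgrade probability."""
    log_spread = math.log(spread_bp)  # Spread enters in logs, as in the text
    log_odds = (coef["intercept"]
                + coef["spread"] * log_spread
                + coef["bubble"] * bubble
                + coef["bubble_x_spread"] * bubble * log_spread
                + coef["wal"] * wal_years)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients, NOT estimates from the paper.
coef = {"intercept": -3.0, "spread": 0.5, "bubble": 0.4,
        "bubble_x_spread": 0.3, "wal": 0.05}

p_normal = downgrade_probability(60.0, credit_bubble(date(2003, 5, 1)), 5.5, coef)
p_bubble = downgrade_probability(60.0, credit_bubble(date(2006, 5, 1)), 5.5, coef)
```

With a positive interaction coefficient, the slope on log spread is `coef["spread"]` in normal periods and `coef["spread"] + coef["bubble_x_spread"]` in bubble periods, which is the sense in which spreads can be more informative during a bubble.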

Housing Bubble is a dummy variable that equals 1 if there was a house price bubble in the country (c) during the year (t) when the MBS is issued, and 0 otherwise. Housing Bubble is based on price-rent ratios. Bourassa et al. (2019), comparing seven alternative methods to identify house price bubble periods, suggest that the price-rent ratio is a reliable measure both ex post and in real time. A house price bubble is identified when the price-rent ratio for a certain year exceeds its long-term average by 20%. We collect housing price data from the OECD’s housing database and calculate the long-term average of the price-rent ratio for each country for the period between 1970 and 2016. [Footnote 6] In Fig. 1 we present the identified housing bubble periods for each country. We use the interaction of spreads and bubble, Housing Bubble × Spread, to examine the predictive power of yield spreads during housing bubble periods.

Fig. 1 Housing bubble periods in the sample countries
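The price-rent rule behind the Housing Bubble dummy can be sketched as follows. This is a minimal illustration for a single country: the function name `housing_bubble_years` and the index values are made up, and treating the 20% threshold as a strict inequality is an assumption, since the text does not say how exact ties are handled.

```python
def housing_bubble_years(price_rent_by_year, threshold=0.20):
    """Flag years whose price-rent ratio exceeds the long-term average by `threshold`.

    `price_rent_by_year` maps year -> price-rent ratio for ONE country. The
    long-term average is taken over all years supplied (the paper uses
    1970-2016). The strict comparison is an assumption.
    """
    ratios = list(price_rent_by_year.values())
    long_term_average = sum(ratios) / len(ratios)
    cutoff = long_term_average * (1.0 + threshold)
    return {year: int(ratio > cutoff)
            for year, ratio in price_rent_by_year.items()}


# Made-up index values for a hypothetical country, for illustration only.
sample = {2000: 90.0, 2001: 95.0, 2002: 100.0,
          2003: 110.0, 2004: 130.0, 2005: 135.0}
flags = housing_bubble_years(sample)
```

In this toy series the average ratio is 110, so the cutoff is 132 and only 2005 is flagged. Running the function country by country yields a dummy that varies over both c and t.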

Weighted average life is computed as the weighted average time until each monetary unit of principal of the relevant tranche is repaid. We use Weighted Average Life to control for interest rate risk exposure. This variable also accounts for prepayment risk and therefore will always be shorter than the nominal maturity of the underlying mortgages.
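The definition of weighted average life above amounts to a principal-weighted mean of repayment times. The sketch below illustrates it with an invented repayment schedule; the function name and the numbers are hypothetical.

```python
def weighted_average_life(principal_schedule):
    """Compute WAL from (time_in_years, principal_repaid) pairs."""
    total_principal = sum(amount for _, amount in principal_schedule)
    if total_principal <= 0:
        raise ValueError("total principal repaid must be positive")
    weighted_time = sum(time * amount for time, amount in principal_schedule)
    return weighted_time / total_principal


# Hypothetical tranche: 20 repaid after 1 year, 30 after 3 years, 50 after 6 years.
schedule = [(1.0, 20.0), (3.0, 30.0), (6.0, 50.0)]
wal = weighted_average_life(schedule)  # (20 + 90 + 300) / 100 = 4.1 years
```

Faster prepayment shifts principal towards earlier dates and therefore shortens the weighted average life.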

Credit ratings are a set of dummy variables indicating the credit rating of the tranche at issuance. Following the literature, we use composite credit ratings, reported by Dealogic, that combine the credit ratings from different rating agencies for each tranche (Campbell and Taksler 2003; Cuchra 2004; Fabozzi and Vink 2015). We assume that the composite credit ratings assigned at issuance capture the expected default frequency of tranches.

We also include year fixed effects in all specifications to capture prevailing macroeconomic conditions. Our model exploits cross-sectional and within-entity time variation. It is unlikely that tranches within a specific deal are independent of each other; for instance, the ratings on multiple tranches tend to be modified around the same time (Adelino 2009). Therefore, the reported standard errors are clustered at the deal level to mitigate the correlation of errors within cross-sectional clusters (Cuchra 2004).

2.3 ML methods: classification trees, Naïve Bayes, support vector machines and random forests

We also employ several alternative ML methods to examine how strongly initial spreads predict MBS downgrade outcomes. These are classification trees (from the rpart R package) (Therneau et al. 2015), the naïve Bayes algorithm (from the naivebayes R package) (Majka 2019), support vector machines (SVM) classification (from the e1071 R package) (David et al. 2019), random forest (RF) (from the randomForest R package) (Liaw and Wiener 2002) and gradient boosting (GB) algorithms (from the XGBoost and EIX R packages) (Karbowiak and Biecek 2020). Different combinations of these methods alongside logistic regression analysis have recently gained wide attention in the finance literature (see for example: Chen 2011; Mercadier and Lardy 2019; Tanaka et al. 2016, 2019). Our aim here is to apply these machine-driven techniques to assess the reliability of our findings.

In our MBS setting, classification trees create a sequence of rules to reach a decision (either to downgrade or not to downgrade an MBS) by performing recursive tests on the MBS outcome variable (Downgrade). This splits the downgrade and no-downgrade space of MBS into areas with clear decision boundaries of different shapes. If the classification problem is complex, trees are not always stable in the out-of-sample (testing) phase and suffer from high variance in estimation. This can be resolved by building a forest (a large number) of trees over bootstrapped samples from a sub-set of the training data and considering a random sub-set of variables at each split. This ensures diversity across trees, reduces variance and is commonly known as the RF procedure.
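To make the idea of a recursive splitting rule concrete, the sketch below finds a single best split on one variable using Gini impurity, a common splitting criterion for classification trees. It is a simplified stand-in written in Python, not the rpart or randomForest implementation used in the paper, and the spreads and outcomes are invented. A full tree would apply the same search recursively to each resulting group, and a random forest would repeat it on bootstrapped samples with a random sub-set of variables at each split.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    share_downgraded = sum(labels) / len(labels)
    return 2.0 * share_downgraded * (1.0 - share_downgraded)


def best_split(values, labels):
    """Find the threshold on one variable that minimises weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_impurity = None, gini(labels)
    for index in range(1, len(pairs)):
        lower, upper = pairs[index - 1][0], pairs[index][0]
        if lower == upper:
            continue  # no threshold can separate identical values
        threshold = (lower + upper) / 2.0
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Made-up spreads (bp) and downgrade outcomes, for illustration only.
spreads_bp = [20.0, 25.0, 30.0, 80.0, 120.0, 200.0]
downgraded = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(spreads_bp, downgraded)
```

In this toy data the rule "spread above 55 bp" separates the two outcomes perfectly, so the weighted impurity falls to zero.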

Naïve Bayes is a comparatively more straightforward approach. It constructs the probability of each MBS outcome with Bayes’ rule given the data, and is based on the naïve assumption that the variables are independent. By contrast, SVM is a more technically involved procedure. It constructs a hyperplane to separate MBS outcomes into groups of downgraded and not-downgraded securities. The SVM algorithm maximizes the distance between the closest observations in the MBS groups and offers linear and nonlinear hyperplane separations. Holopainen and Sarlin (2017), Beutel et al. (2019), and Colombo and Pelagatti (2020) describe the ML algorithms that we employ in detail, and the most comprehensive technical guide is provided by Hastie et al. (2009). Overall, ML algorithms are nonparametric: they do not make strong assumptions about the data and are flexible enough to identify variable interactions as well as linear and nonlinear patterns in the sample. However, large data sets are required to exploit the modelling power of ML approaches. Moreover, Beutel et al. (2019) argue that ML tools do not necessarily void the relevance, competitiveness and out-of-sample superiority of logistic regression and other standard tools. On the other hand, in the context of exchange rate forecasting, Colombo and Pelagatti (2020) point out that, due to its extreme flexibility, ML can serve as a sophisticated version of technical analysis and unpack patterns of non-fundamental determinants, non-rational market behaviour and market interventions by regulators. Beutel et al. (2019) also argue that ML provides modelling benefits over the common binary logistic regression. To improve our understanding of the processes behind MBS credit rating downgrades, and to enhance the interpretability of our results, we present intuitive visualisations of the ML output.
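The naïve Bayes rule described above can be sketched in a few lines. The example assumes Gaussian class-conditional densities for each variable, which this section does not specify, and it is a Python stand-in rather than the naivebayes R package used in the paper; the function names and the observations are invented.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Estimate class priors and per-variable means and variances for each class."""
    model = {}
    for cls in set(labels):
        members = [row for row, label in zip(rows, labels) if label == cls]
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            variance = sum((x - mean) ** 2 for x in column) / len(column)
            stats.append((mean, max(variance, 1e-9)))  # floor avoids division by zero
        model[cls] = (len(members) / len(rows), stats)
    return model


def posterior(model, row):
    """Apply Bayes' rule, treating variables as independent within each class."""
    log_scores = {}
    for cls, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, variance) in zip(row, stats):
            score += (-0.5 * math.log(2.0 * math.pi * variance)
                      - (x - mean) ** 2 / (2.0 * variance))
        log_scores[cls] = score
    largest = max(log_scores.values())  # subtract the maximum for numerical stability
    weights = {cls: math.exp(score - largest) for cls, score in log_scores.items()}
    total = sum(weights.values())
    return {cls: weight / total for cls, weight in weights.items()}


# Made-up observations: (log spread, weighted average life in years).
rows = [(3.0, 4.0), (3.2, 5.0), (3.4, 6.0), (4.6, 5.0), (4.8, 6.0), (5.0, 7.0)]
labels = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(rows, labels)
probs = posterior(model, (4.9, 6.0))
```

The independence assumption is what allows the joint likelihood to be written as a product of one-variable densities, which is why the log scores are simple sums.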

2.4 Descriptive statistics

Descriptive statistics of the sample are presented in Table 1 Panel A. The mean spread is 65.9 basis points (bp). The average tranche size is €228 million with a weighted average life of 5.5 years. In Panel B we present the descriptive statistics for bubble and normal periods separately. We observe that in asset bubble periods (either credit or housing) spreads are lower, maturities are shorter and issuance volumes are larger for MBS. These findings reflect the typical signs observed during asset bubble periods. In Panel C we present descriptive statistics for the downgrade ratio and spread per rating category. AAA-rated bonds and non-AAA investment grade (between AA+ and BBB−) bonds constitute 37.5% and 55.7% of the sample, respectively. Only 6.7% of bonds are classified as non-investment grade. As one would expect, spreads increase sharply, from 104.9 to 271.1 bp, between the lowest investment grade rating (BBB−) and the highest non-investment grade rating (BB+). Accordingly, we estimate the models first for the full sample and subsequently for the sub-categories of AAA, non-AAA investment grade and non-investment grade (< BBB−) bonds separately. This should enable us to observe the predictive power of initial yield spreads along the credit quality spectrum as the severity of information asymmetries increases from high-rated to lower-rated bonds.

Table 1 Descriptive statistics

3 Results of regression analysis

We present results below for the credit bubble and subsequently for the housing bubble estimations. We estimate the models first for the full sample and then for the AAA, non-AAA investment grade and non-investment grade (< BBB−) subsamples to examine whether the predictive power of initial yield spreads varies with the risk level of the MBS bonds.

3.1 Credit bubble

Results for Credit Bubble are reported in Table 2. In column 1, we estimate the regression for the full sample. Controlling for the assigned credit rating at issuance, we find that Spread has a positive and statistically significant coefficient, showing that bonds with higher spreads at origination are more likely to be downgraded in the future. This finding is in line with Adelino (2009) and He et al. (2016). The coefficient of Credit Bubble is also positive and significant, indicating that bonds issued during the credit growth period between 2005 and June 2007 were more likely to be downgraded subsequently, perhaps due to originating banks’ lax lending standards during this period (Du 2019; Rajan et al. 2015) or their reduced incentives for monitoring (Kara et al. 2019). Similar results are reported by He et al. (2016) for the US.

Table 2 Predictive power of initial yield spreads during credit bubbles

In column 2, we introduce the Credit Bubble × Spread interaction, which captures whether the predictive power of initial spreads differs between the credit bubble and normal periods. We find a positive and significant coefficient for Credit Bubble × Spread, which shows that spreads at origination have more predictive power concerning future downgrades for MBS issued during a credit bubble period. For the full sample, on average, a 1 bp increase in spreads raises the odds of a downgrade by 1.31% and 0.6% during the credit bubble and normal periods, respectively. In other words, the predictive power of spreads roughly doubles during credit bubble periods.
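The comparison of the two percentages can be checked with simple arithmetic. The sketch below uses the generic logit identity that a coefficient b on a regressor changes the odds by (exp(b) − 1) × 100% per unit; how the paper maps its coefficients on log spreads into a per-basis-point effect is not stated in this section, so the helper functions are illustrative assumptions and only the two quoted percentages come from the text.

```python
import math


def odds_change_pct(coefficient):
    """Percentage change in the odds implied by a one-unit rise in a regressor."""
    return (math.exp(coefficient) - 1.0) * 100.0


def implied_coefficient(pct_change):
    """Inverse mapping: the log-odds coefficient behind a percentage change in odds."""
    return math.log(1.0 + pct_change / 100.0)


# Percentages quoted in the text for the full sample.
bubble_pct, normal_pct = 1.31, 0.6
ratio = bubble_pct / normal_pct  # about 2.18
```

The ratio of about 2.18 is what underlies the statement that predictive power roughly doubles in the credit bubble period.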

In column 3 we present the results for AAA-rated MBS only. We find similar results in terms of the direction of the coefficients but with larger magnitudes. Spreads at origination have greater predictive power for future downgrades of AAA bonds than for the full sample. This finding is in stark contrast with the literature. Using US data, He et al. (2016) find that initial spreads have weaker predictive power for AAA-rated MBS and Adelino (2009) reports no predictive power. We find that spreads on investment-grade securities are highly informative and have significant predictive power. This is especially evident during periods associated with credit bubbles and high issuance levels. Thus, the informativeness of these spreads is conditional on market activity. Hence, the predictive power of initial yield spreads is greater during credit booms, when credit standards are perceived to be falling. This is consistent with the perception that, compared to the US, underwriting standards in Europe are relatively robust. European structured finance suffered a default rate of 0.95% between 2007 and 2010, compared to 7.7% for US issuances and 6.34% for global corporate bonds (Blommestein et al. 2011). Furthermore, although spreads on non-investment grade tranches can also predict downgrades, we find no evidence of cyclical adjustment, indicating that these spreads have significant predictive power regardless of issuance levels or the credit cycle.

In column 4 we introduce the Credit Bubble × Spread interaction, which is also significant with a much larger impact. Investors seem to rely more on credit ratings during normal economic periods, adjusting spreads less. During bubble periods, however, they build their own risk assessment into initial yield spreads at the pricing stage, over and above the risk implied by the credit rating. Our results show that a 1 bp higher origination spread increases the odds of a downgrade by 12.2% for AAA-rated MBS during a bubble period. In normal periods, this predictive power declines to only 0.5%.

In column 5 we estimate the model for non-AAA investment grade MBS (between AA+ and BBB−). Although the magnitude of the coefficients is lower than in the AAA sample, our earlier findings concerning predictability still stand. In column 6 we find that Credit Bubble × Spread is also significant. However, unlike the AAA estimations, the coefficient of Spread is statistically significant. This shows that when valuing non-AAA investment grade bonds, investors are less likely to rely on credit ratings, even during normal economic periods. In other words, they are more cautious when valuing riskier bonds.

Results for the riskiest, non-investment grade MBS are shown in columns 7 and 8. Spread and Credit Bubble remain significant with slightly lower explanatory power. However, we do not find the interaction of the two variables to be significant (column 8). This shows that for the lower quality MBS, spreads are likely to predict future performance, but this relationship does not differ between bubble and normal periods.

3.2 Housing bubble

Results for Housing Bubble are presented in Table 3 for the full sample in columns 1 and 2. Similar to the results reported above, Spread is statistically significant and has a positive sign in predicting the likelihood of downgrade. The coefficient of Housing Bubble is insignificant, indicating that the likelihood of downgrade does not differ between MBS issued during housing bubbles and normal periods. Subsequently, we look at the result for the interaction variable Housing Bubble × Spread, in column 2, to examine whether the predictive power of initial spreads differs between housing bubbles and normal periods. We find that Housing Bubble × Spread is significant, indicating that initial spreads have more predictive power during housing bubbles. Thus, investors seem to be more cautious in relying on credit ratings during housing bubble periods.

Table 3 Predictive power of initial yield spreads during housing bubbles

In columns 3 and 4 we present the results for the AAA sample. We report similar results; however, the effect of Housing Bubble × Spread is much larger for this sample. For example, a 1 bp higher origination spread increases the odds of an AAA-rated MBS downgrade by 13.0% in a housing bubble period, compared to a 0.4% increase in these odds during a normal period. The results for the non-AAA investment grade sample, presented in columns 5 and 6, are also very similar to those for the full sample. However, the result differs for the non-investment grade, or lowest quality, MBS (column 7). Here we find that Housing Bubble significantly predicts future downgrades. This shows that non-investment grade MBS are of worse quality if they are originated in a housing bubble period. We also do not find Housing Bubble × Spread to be significant. This result shows that the predictive power of spreads on the lowest quality MBS does not differ during housing bubble periods.

3.3 Predicting the magnitude of the downgrade

The results presented so far show that the predictive power of initial yield spreads is higher for AAA-rated MBS bonds. We take our analysis further to examine whether this observed relationship varies with the magnitude of the downgrade within the AAA category. This may enable us to observe whether the predictive power of initial yield spreads for bonds that carry the same rating differed during the bubble periods.

To do so, we create a new categorical dependent variable Downgrade Mag, which captures the magnitude of the downgrade. Downgrade Mag is equal to the rounded average downward adjustment of the three credit rating agencies. For example, if an AAA tranche is downgraded two notches by Moody’s (to AA), two notches by Standard & Poor’s (AA) and three notches by Fitch (AA−), then we record the rounded average Downgrade Mag as two for this tranche.
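The construction of Downgrade Mag in the example above can be reproduced with a short sketch. The function name is hypothetical, and rounding exact halves upwards is an assumption, since the text does not state how ties are treated.

```python
import math


def downgrade_magnitude(notches_by_agency):
    """Rounded average number of notches downgraded across agencies.

    Exact halves are rounded up; the source does not state its tie rule,
    so this is an assumption.
    """
    notches = list(notches_by_agency.values())
    average = sum(notches) / len(notches)
    return math.floor(average + 0.5)


# The example from the text: two, two and three notches.
example = {"Moody's": 2, "Standard & Poor's": 2, "Fitch": 3}
magnitude = downgrade_magnitude(example)  # average 7/3, rounded to 2
```

Using `math.floor(average + 0.5)` rather than Python's built-in `round` avoids round-half-to-even behaviour, which would otherwise change the result for averages such as 2.5.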

Following Lugo (2014) we estimate an ordered logit regression and modify the baseline model as follows:

$$ \text{Downgrade Mag}_{i} = \beta_{0} + \beta_{1}\,\text{Spread}_{i} + \beta_{2}\,\text{Bubble}_{c,t,i} + \beta_{3}\,\text{Bubble}_{c,t,i} \times \text{Spread}_{i} + \beta_{4}\,\text{Weighted Average Life}_{i} $$
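For reference, a textbook ordered logit links the linear index above to the observed categories through a set of cutpoints. The formulation below is the standard one rather than something stated in this section; the cutpoints \kappa_{j} and the number of categories J are notation introduced here for illustration. In this parameterisation the intercept is absorbed into the cutpoints.

$$ P\left( \text{Downgrade Mag}_{i} \le j \mid \text{x}_{i} \right) = \frac{1}{1 + \exp\left[ -\left( \kappa_{j} - \text{x}_{i}^{\prime}\beta \right) \right]}, \quad j = 0, 1, \ldots, J - 1 $$

A positive coefficient on Spread therefore shifts probability mass towards larger downgrade magnitudes.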

We present the estimation results in Table 4 for Credit Bubble in columns 1 and 2 and for Housing Bubble in columns 3 and 4. In columns 1 and 3, where interaction terms are not included, we find Spread to be positive and statistically significant. Thus, AAA bonds with higher spreads at origination are more likely to be downgraded by a larger magnitude. This result shows that initial yield spreads predict not only the likelihood of downgrade but also the downgrade magnitude of the least risky MBS tranches. In other words, investors seem to have the capability to identify riskier AAA bonds and adjust the price accordingly. We find that Credit Bubble is also positive and significant, showing that AAA bonds issued during the credit growth period between 2005 and June 2007 suffered more severe downgrades. Hence, these AAA bonds were riskier than AAA bonds issued before 2005.

Table 4 Predictive power of initial yield spreads and downgrade magnitude

In column 2 of Table 4, we introduce Credit Bubble × Spread and find a positive and significant coefficient. This finding indicates that origination spreads predict future downgrades more strongly if a bond is issued during a credit bubble period. Similar to the results above, Housing Bubble is not significant (column 3) for the AAA sample. However, we find that the interaction variable Housing Bubble × Spread is significant, showing that initial spreads have more power in predicting the magnitude of future downgrades during housing bubbles.

3.4 Robustness tests

In this section, we check the robustness of our reported results. First, we estimate the models with a more uniform sample, using Residential MBS (RMBS) issues only, which constitute 80.1% of our data. Using a homogeneous sample may give us more consistent results. The corresponding results are presented in Tables 5 and 6 for the credit and housing bubbles, respectively. The direction and statistical significance of the results are almost identical to those reported in Tables 3 and 4 for the whole sample. The only difference for the RMBS sample is that the coefficient of Spread also becomes statistically significant for the non-investment grade sample. We find that initial yield spreads are also powerful predictors of future performance for the lowest quality RMBS, especially when the estimations are run with the housing bubble variable (column 8 of Table 6). We present estimation results for Downgrade Mag for the RMBS sample in Table 7. The results for the restricted sample are similar to those reported for the full sample. The magnitude of downgrades is larger for AAA RMBS bonds that have higher spreads at origination and for bonds that are issued during the credit bubble (column 1) and housing bubble (column 3) periods. Spreads at origination predict the future downgrades of RMBS more powerfully if a bond is issued during a credit bubble (column 2), but not if it is issued during a housing bubble (column 4). Overall, our results presented above for the full sample are confirmed by the results obtained from the RMBS sample.

Table 5 Predictive power of initial RMBS yield spreads during credit bubbles
Table 6 Predictive power of initial RMBS yield spreads during housing bubbles
Table 7 Predictive power of initial RMBS yield spreads and downgrade magnitude

A second issue is that the analysis presented so far splits the sample into credit or housing bubble subsamples. Although all countries in our sample experienced the credit bubble, housing bubble periods vary by country. We also observe that the credit and housing bubbles overlap to some extent in some countries (see Fig. 1). Hence, to test the validity of our results using a more stringent criterion for asset bubbles, we estimate the models using an alternative indicator which incorporates both credit and housing bubbles. Accordingly, Bubble equals 1 if a deal is issued both during the boom period between 2005 and the first half of 2007 and during a housing bubble period in a given country, and 0 otherwise. The results (presented in Table 8) are more in line with our initial findings from the housing bubble baseline models. We find that Spread positively predicts the likelihood of downgrade. Bubble is insignificant, indicating that the probability of downgrade does not differ between the combined bubble and normal periods. We still find the coefficient of Bubble × Spread to be positive and statistically significant, showing the predictive power of initial spreads during bubbles. Similarly, for the lowest quality MBS (column 7), we confirm our results that Bubble significantly predicts the future downgrades of non-investment grade tranches and that the informative power of the corresponding spreads does not differ during bubble periods.

Table 8 Predictive power of initial yield spreads during asset bubbles

Third, we control for signals that may indicate higher information asymmetries for a particular MBS. In particular, we account for the number of assigned ratings per tranche (each tranche is rated by at least one of the top 3 global credit rating agencies: Standard & Poor’s, Moody’s and Fitch). Issuers are not required to report all ratings; however, ratings from all three agencies suggest more transparency, while ratings from only one or two may indicate suppression of negative ratings. The information suppression tendency increases tail risk or the likelihood of extreme returns (Jirasakuldech et al. 2011). There may also be rating disagreements, where bonds are assigned dissimilar ratings by different agencies. Such bonds may carry a higher degree of asymmetric information and be more opaque for investors to assess. To control for the severity of information asymmetries, we re-run the estimations including a set of dummy variables capturing the impact of these factors on spread. Accordingly, we utilise CRA, which is the number of credit ratings assigned by different rating agencies to a tranche (CRA = 1 is the base). We also use Rating Gap, the numeric difference between the highest and lowest rating assigned by different rating agencies to the same tranche (Rating Gap = 0 is the base). The results for the credit bubble are presented in Table 9. In general, the results are consistent with our initial findings. We observe that MBS rated by two rating agencies, especially investment grade tranches, were more likely to be downgraded. We also find that, for non-investment grade MBS, a larger rating difference (i.e. three notches) between the rating agencies predicts future downgrades. In Table 10, we present results for the housing bubble; these are consistent with the results of the baseline estimations.
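
The two information-asymmetry signals can be computed per tranche along the following lines. This is a minimal sketch under stated assumptions: the numeric rating scale, the function name and the use of None for a missing rating are hypothetical, and agency-specific notations (such as Moody's Aaa) are assumed to have been mapped to a common scale beforehand.

```python
# Hypothetical common numeric scale (assumption): 1 = AAA, one step per notch.
RATING_SCALE = {"AAA": 1, "AA+": 2, "AA": 3, "AA-": 4, "A+": 5, "A": 6, "A-": 7}

def rating_signals(ratings):
    """Return (CRA, Rating Gap) for one tranche.

    ratings maps agency name -> rating string, or None when that agency
    assigned (or reported) no rating for the tranche.
    """
    assigned = [RATING_SCALE[r] for r in ratings.values() if r is not None]
    if not assigned:
        raise ValueError("each tranche needs at least one rating")
    cra = len(assigned)                          # number of assigned ratings
    rating_gap = max(assigned) - min(assigned)   # notches between highest and lowest rating
    return cra, rating_gap

tranche = {"S&P": "AAA", "Moody's": "AA", "Fitch": None}
print(rating_signals(tranche))  # (2, 2)
```

In the regressions these values enter as dummy variables, with CRA = 1 and Rating Gap = 0 as the base categories.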

Table 9 Predictive power of initial yield spreads during credit bubbles: controlling for rating disagreements and rating shopping
Table 10 Predictive power of initial yield spreads during housing bubbles: controlling for rating disagreements and rating shopping

We also address the potential endogeneity bias that may arise due to the credit ratings’ impact on MBS spread. In our setting, we use composite credit ratings as an indicator that captures the expected default frequency of tranches. However, it could also be argued that the credit rating of individual tranches is a determinant of the spread of the bonds at issuance. To check the robustness of our results we utilise a two-stage estimation methodology that addresses the potential endogeneity bias between the credit ratings and initial spread. Following the MBS pricing literature (Deku et al. 2019; Fabozzi and Vink 2012a) we estimate spread by using a set of instruments and other control variables via the following model at the first stage:

$$ Spread_{i} = \beta_{0} + \beta_{1} Weighted Average Life_{i} + \beta_{2} Retained_{i} + \beta_{3} Subordination_{i} + \beta_{4} Tranche Size_{i} + \beta_{5} Ratings/Tranches_{i} + \beta_{6} Collateral_{i} + \sum_{c = 1}^{C - 1} \beta_{c} \times Collateral Country_{c,i} + \sum_{k = 1}^{K - 1} \beta_{k} \times Credit Rating_{k,i} + \sum_{y = 1}^{Y - 1} \beta_{y} \times Year_{y,i} $$

where Spread is the log of quoted spread at issuance in excess of the pricing benchmark. Weighted Average Life is the log of weighted average time until each monetary unit of principal is repaid. Due to moral hazard concerns, an incentive-compatible contract would have the issuer retain riskier tranches while selling safer ones (Palia and Sopranzetti 2004). Therefore, we hypothesise that spreads account for tranche retention (Retained) as a moderating factor regarding the misalignment of interests between investors and issuers. Retained equals 1 if at least one tranche is retained by the issuer, and 0 otherwise. Subordination is computed as the value of tranches in the same deal that have an equal or higher rating than the given tranche as a fraction of the total deal value. Tranche Size is the natural logarithm of the principal value of the relevant tranche. Ratings/Tranches is the ratio of the number of unique ratings in a deal to the total number of tranches in a deal (indicating the level of information asymmetries for a particular deal). Collateral equals 1 for residential mortgages, and 0 otherwise. Collateral Country is a set of dummy variables indicating the country where the underlying mortgages are issued. Credit Rating is a set of dummy variables indicating the credit rating of the tranche at issuance. Year is a set of dummy variables capturing the effects of the macro environment. In the first stage model we use Retained, Subordination, Tranche Size, Collateral, Collateral Country and Ratings/Tranches as instruments. Subsequently, we estimate the main model in the second stage by using the predicted spread derived from the first stage estimates. We present the results of the second stage estimations in Tables 11 and 12 for credit and housing bubbles, respectively. For both types of bubbles, we find that Predicted Spread is statistically significant and positive for the full, AAA and non-investment grade samples. We also find that the coefficients of the interaction variables Credit Bubble × Predicted Spread and Housing Bubble × Predicted Spread are still significant and positive. Overall, these results show that our initial findings remain unchanged and the potential impact of endogeneity bias on our results is minimal.
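
To make the deal-level first-stage variables concrete, the sketch below computes Subordination, Tranche Size and Ratings/Tranches for a toy deal. All numbers, the tuple layout and the function name are made up for illustration. The numeric rating convention (1 = AAA, lower is better) and the choice to count the tranche itself among the "equal or higher rated" tranches are assumptions read off the verbal definitions above.

```python
import math

# Toy deal (made-up values): (tranche name, numeric rating with 1 = AAA, principal).
deal = [("A1", 1, 600.0), ("A2", 1, 200.0), ("B", 3, 150.0), ("C", 6, 50.0)]

def tranche_features(deal, name):
    """First-stage variables for one tranche, following the verbal definitions."""
    total = sum(p for _, _, p in deal)
    rating = next(r for n, r, _ in deal if n == name)
    principal = next(p for n, _, p in deal if n == name)
    # Value of tranches rated equal to or better than this one (lower number = better).
    equal_or_higher = sum(p for _, r, p in deal if r <= rating)
    return {
        "Subordination": equal_or_higher / total,
        "Tranche Size": math.log(principal),
        "Ratings/Tranches": len({r for _, r, _ in deal}) / len(deal),
    }

features = tranche_features(deal, "B")
print(features["Subordination"], features["Ratings/Tranches"])
```

For tranche B, 950 of the 1000 units of deal value carry an equal or higher rating, and the deal has three unique ratings across four tranches.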

Table 11 Predictive power of initial yield spreads during credit bubbles: Regressions with Predicted Spread
Table 12 Predictive power of initial yield spreads during housing bubbles: Regressions with Predicted Spread

4 Results from ML methods

In this section we present the results of the ML methods we have employed. Following the literature, to identify parameters for estimations and validate our findings, we split our MBS data sample into two parts, a training sub-sample and a testing sub-sampleFootnote 7 (e.g. Chen 2011; Beutel et al. 2019).Footnote 8 We focus on the full, AAA and non-AAA investment grade samples (similar to the results presented in Tables 2 and 3 above), with the corresponding testing sub-samples containing 838, 315 and 521 observations, respectively.Footnote 9
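
A training/testing split of this kind can be sketched as below. The split share, the seed and the function name are assumptions made for illustration, and the identifiers are placeholders rather than the MBS data.

```python
import random

def split_train_test(rows, test_share=0.2, seed=42):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]

tranche_ids = list(range(1000))  # placeholder identifiers, not the paper's sample
train, test = split_train_test(tranche_ids)
print(len(train), len(test))  # 800 200
```

Model parameters are then tuned on the training part, and the reported accuracy is checked against the held-out testing part.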

4.1 Classification trees

We present results for the classification trees analysis in Fig. 2 for the full (a), AAA (b) and non-AAA investment grade (c) samples, from left to right. For the full sample in Fig. 2a, the first critical criterion is whether an MBS was issued before 2005. We observe that only 15% of the MBS issued prior to 2005 were downgraded, whereas the overall probability of downgrade is higher for MBS issued from 2005 onwards. This result points to the Credit Bubble in the pre-crisis period of 2005–2007. For MBS issued during the bubble period, there is also a portion of securities with a relatively low downgrade likelihood. These are 13% of the sample (issued after 2005) with lower Weighted Average Life, signalling that relatively lower interest rate risk is the next important determinant of an MBS not being downgraded. On the other hand, securities issued from 2005 onwards with interest rate risk at or above the ML-selected threshold in Node 7 of Fig. 2a have a 65% observed likelihood of being downgraded. It is worth noting that there is a small interest rate risk “cluster” of MBS issued in 2005 emerging from Node 14 in Fig. 2a.Footnote 10 Finally, there are two terminal nodes in the full sample tree indicating that MBS issued in 2006 and 2007 (constituting 36% of our sample, as shown by Node 15 in Fig. 2a) have the highest downgrade likelihood. A sizable portion of these MBS have higher initial yield spreads, with the highest observed downgrade likelihood of 76%. Overall, the full sample tree offers a clear decision algorithm that traces the growing pattern of the well-documented speculation in MBS during the Credit Bubble and shows that initial yield spreads as well as Weighted Average Life contain valuable information for predicting downgrade outcomes.

Fig. 2

MBS downgrade decision trees for full sample (a), AAA (b) and excluding AAA (c). Notes: Each ellipse-shaped body in the tree hierarchy is a Node. Every Node is numbered, and the closer it is to the top of the tree (its root), the more classification value it typically contains. Trees are pruned with the penalty parameter obtained for the corresponding testing sub-sample. Therefore, gaps in the Node numbers are due to the tree pruning process. Nodes at the very bottom of the tree are often referred to as terminal Nodes. N-Down denotes not downgraded MBS outcomes, while Y-Down denotes downgraded MBS outcomes

The full sample tree in Fig. 2a does not identify credit ratings as a valuable determinant of the MBS downgrade. However, for the AAA sample in Fig. 2b, the accuracy of the overall classification tree is 79.81% (79.75%) in comparison to the accuracy for the full sample tree of 76.29% (75.95%).Footnote 11 We also observe a higher ranked role of initial yield spreads in predicting the MBS downgrade outcomes. Moreover, ML suggests that the MBS downgrade trend begins in 2004, prior to the period covered by the Credit Bubble dummy variable employed in the logistic regressions. This is an indication of the Housing Bubble, which is also found to be more evident for the AAA sample in Table 3 and Sect. 3.2. Results show that 29% of the AAA sample issued from 2004 have an 81% observed likelihood of not being downgraded if they have an initial yield spread below the level determined in node 3 of Fig. 2b. Following the AAA tree in Fig. 2b further down to its terminal nodes, we can again observe the previously identified cluster of MBS with higher interest rate risk in node 29 and the pattern of the developing Credit Bubble as shown by node 7. However, both are conditional on the higher Spread in node 3 of Fig. 2b. Therefore, this is a strong indication of the classification value of Spread and its interaction with Credit Bubble for the AAA sample. The last tree in Fig. 2c presents the results for the non-AAA investment grade sample. We find a decline in overall downgrade classification accuracy to 73.65% (72.33%). We observe similar predictions, where MBS are more likely to be downgraded during the Credit Bubble period and also if they have a higher Weighted Average Life (or carry a higher interest rate risk) than the ML-identified threshold in Node 3 of Fig. 2c. In sum, the exploratory classification tree analysis provides an empirical confirmation of the predictive strength of initial yield spread and suggests that Spread is the dominant predictor of the downgrade of AAA-rated MBS.
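
How a tree arrives at a threshold such as "issued before 2005" can be illustrated with a single split. The sketch below assumes a CART-style criterion (weighted Gini impurity); the splitting rule actually used for Fig. 2 is not stated in this section, and the years and downgrade flags are made up.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_threshold(values, labels):
    """Threshold on one feature that minimises the weighted Gini impurity of the two children."""
    best = (None, float("inf"))
    for t in sorted(set(values))[1:]:      # candidate split: value < t versus value >= t
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Made-up issuance years and downgrade flags (1 = downgraded).
years = [2003, 2003, 2004, 2004, 2005, 2005, 2006, 2006, 2007, 2007]
downgraded = [0, 0, 0, 0, 1, 1, 1, 0, 1, 1]
threshold, impurity = best_threshold(years, downgraded)
print(threshold)  # 2005
```

A full tree repeats this search over every candidate variable within each child node and is then pruned back, which is why node numbers in the figure have gaps.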

5 Naïve Bayes

In Fig. 3 we present results for the naïve Bayes analysis. Here we restrict our analysis to the Credit Bubble period since all exploratory classification trees presented above highlight this period consistently. Results clearly show that all nonparametric probability density functions (PDFs) for the downgraded MBS are located further to the right on the initial yield Spread axes than those for the not-downgraded MBS. This is observed for all MBS samples (full, AAA and non-AAA investment grade), although with different degrees of shift towards higher initial yield spreads. In Fig. 3, we also illustrate a similar classification for Weighted Average Life and find less evident patterns. Hence, initial yield spread provides visually more distinguishable positions of PDFs for downgrade (and no-downgrade) outcomes than Weighted Average Life.

Fig. 3

Credit Bubble naïve Bayes MBS downgrade classification for full sample, AAA and excluding AAA. Notes: Silverman (1986) rule-of-thumb bandwidths and Gaussian kernels were employed for producing nonparametric PDF estimates
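
The class-conditional densities described above can be sketched with a Gaussian kernel and a Silverman-type rule-of-thumb bandwidth. This is a simplified, one-variable illustration: the function names are hypothetical, the spreads are made up, and the bandwidth uses one common variant of the rule of thumb, 0.9 · min(sd, IQR/1.34) · n^(−1/5), which may differ from the exact variant behind Fig. 3.

```python
import math
import statistics

def silverman_bandwidth(xs):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    sd = statistics.stdev(xs)
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * len(xs) ** (-0.2)

def kde(xs):
    """Return a Gaussian-kernel density estimate of xs as a function of x."""
    h = silverman_bandwidth(xs)
    norm = len(xs) * h * math.sqrt(2.0 * math.pi)
    return lambda x: sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs) / norm

def posterior_downgrade(x, spreads_down, spreads_not):
    """P(downgrade | spread = x) via Bayes' rule with class priors and class KDEs."""
    n = len(spreads_down) + len(spreads_not)
    num = len(spreads_down) / n * kde(spreads_down)(x)
    den = num + len(spreads_not) / n * kde(spreads_not)(x)
    return num / den

# Made-up log spreads: the downgraded group sits further to the right.
down = [3.6, 3.9, 4.1, 4.4, 4.8, 5.0]
not_down = [2.9, 3.1, 3.3, 3.4, 3.7, 4.0]
print(posterior_downgrade(4.6, down, not_down) > 0.5)  # True
```

With a second variable such as Weighted Average Life, the naïve independence assumption would multiply the per-variable densities within each class before applying Bayes' rule.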

5.1 Support vector machines (SVM)

We present results for SVM in Fig. 4. Here we position Credit Bubble downgrade and no-downgrade outcomes conditional on initial yield spread and weighted average life, and let the SVM algorithm determine a decision boundary and areas of the respective downgrade outcomes. We find clear nonlinear patterns for all illustrated sub-samples and observe that the higher the initial yield spread and the interest rate risk, the higher the downgrade likelihood of a given tranche. For the AAA sample, SVM achieves the highest accuracy, at 72.6% (73.59%) for the linear boundary and 76.62% (76.44%) for the nonlinear boundary, with the most explicit patterns and visually differentiable positions of downgraded and not-downgraded tranches. Moreover, by allowing for a nonlinear boundary, we are able to visualise a separate cluster of downgraded tranches with higher interest rate risk, previously identified with classification trees and highlighted for the AAA sample in Fig. 2b. Full sample accuracy results are lower, at 66.71% (66.44%) for linear and 67.96% (67.11%) for nonlinear. Removing the AAA sub-sample from the full sample lowers the SVM accuracy further to 63.38% (56.83%) and 65.86% (59.01%) for the linear and nonlinear plots in Fig. 4, respectively. Overall, results based on SVM provide additional and explicit evidence that the predictive potency of initial yield spreads is highest for the AAA sample during asset bubble periods.

Fig. 4

Credit Bubble support vector machines MBS downgrade area, patterns and decision boundaries for full sample, AAA and excluding AAA. Notes: *For the excluding AAA sample, the linear kernel was not able to identify a downgrade decision boundary and, therefore, a polynomial kernel was employed instead. For all nonlinear pattern recognition, a radial kernel was employed with tuning parameters obtained for the testing sub-sample

5.2 Random forest and gradient boosting

Results of the RF algorithm are presented in Fig. 5. RF achieves the highest accuracy among our models, with 81.72% on the training sub-sample and 80.58% on the testing sub-sample. We observe that initial yield spread is the second most important variable, after issuance year, for the accuracy of our RF model. Reflecting on the previous classification tree results in Fig. 2, we argue that asset bubble periods contribute the most to modelling downgrades. We also note that credit ratings are the least important variable in our RF framework (similar to the classification trees in Fig. 2). However, following the analyses conducted in Sects. 3.1 and 3.2, it is also worthwhile exploring gains due to variable interactions and unpacking the value of Spread during the asset bubble periods. This can be implemented with the GB algorithm, another tree-based classification method. GB attempts to build trees so that every new tree corrects the error of the previous one (i.e. step by step learning), while RF builds each tree independently with a random sample of the participating data (generalisation at the end of the learning). This suggests that GB can be more delicate to handle, and it is more sensitive to the choice of parameters than RF; however, its step by step tree building nature can supply an interactions matrix of the independent variables.Footnote 12 From the results in Fig. 5, we observe the highest accuracy gains from interactions of the issuance year and Spread variables with the Weighted Average Life variable. Initial yield spread and credit rating interactions as well as issuance year and Spread interactions provide the second-best accuracy gains. This finding confirms our baseline regression results in Tables 2 and 3 that the predictive potency of initial yield spreads during the asset bubble periods varies with credit ratings at origination (Table 13).
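
The idea behind a "mean decrease in accuracy" importance measure can be sketched as follows: permute one variable at a time and record how much accuracy is lost. This is a simplified, model-agnostic version under stated assumptions. A random forest implementation typically computes it tree by tree on out-of-bag observations, and the rule-based classifier, the data and the function names here are hypothetical stand-ins rather than a fitted forest.

```python
import random

def accuracy(predict, rows, labels):
    """Share of rows for which the prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

def mean_decrease_accuracy(predict, rows, labels, feature, repeats=20, seed=0):
    """Average accuracy lost when one feature's column is randomly permuted."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        permuted = [{**r, feature: v} for r, v in zip(rows, column)]
        drops.append(base - accuracy(predict, permuted, labels))
    return sum(drops) / repeats

# Hypothetical stand-in classifier: flag tranches issued from 2005 with a high spread.
def predict(row):
    return int(row["year"] >= 2005 and row["spread"] > 3.5)

rows = [
    {"year": 2003, "spread": 3.0, "rating": 1},
    {"year": 2004, "spread": 4.0, "rating": 2},
    {"year": 2005, "spread": 3.2, "rating": 1},
    {"year": 2005, "spread": 4.1, "rating": 3},
    {"year": 2006, "spread": 3.9, "rating": 1},
    {"year": 2006, "spread": 3.1, "rating": 2},
    {"year": 2007, "spread": 4.4, "rating": 1},
    {"year": 2007, "spread": 3.8, "rating": 3},
]
labels = [predict(r) for r in rows]  # labels follow the rule, so baseline accuracy is 1.0
for name in ("year", "spread", "rating"):
    print(name, mean_decrease_accuracy(predict, rows, labels, name))
```

Because the stand-in rule never looks at the rating, permuting it costs no accuracy, which mirrors the way an unimportant variable would show up at the bottom of an importance ranking.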

Fig. 5

Random forest variable importance and gradient boosting variables’ interactions. Notes: Mean decrease in accuracy is used as the importance measure

Table 13 Predictive power of initial yield spreads during credit bubbles for selected countries

6 Conclusion

In this paper, we investigate whether the predictive power of initial MBS yield spreads varies with the financial cycle using a cross-country sample of 4203 MBS tranches. We find that initial yield spreads of MBS incorporate more information than credit ratings and predict future downgrades even after conditioning on initial credit ratings. The predictive power of spreads is higher during credit and housing bubbles. It is also stronger for the least risky AAA-rated MBS. However, for non-investment grade MBS, the predictive power of initial yield spreads shows no evidence of variation with issuance levels or the credit cycle. We also find that initial yield spreads capture the magnitude of the downgrade, especially during asset bubbles.

Overall, when valuing MBS bonds, investors seem to rely on credit ratings more during normal economic periods by not adjusting spreads. In contrast, during bubble periods, they reflect risk sentiments in the initial yield spreads in excess of the risk indicated by assigned credit ratings. Investors also seem to be more cautious when pricing non-AAA investment grade bonds and are less likely to rely on credit ratings even during normal economic periods. For the riskiest non-investment grade bonds, the predictive power of spreads does not differ between bubble and normal periods. We also show that initial yield spreads of MBS incorporated more information than credit ratings and, therefore, investors seem not to rely solely on credit ratings when assessing MBS quality. Our results are robust after accounting for both types of asset bubbles, the severity of information asymmetries in MBS and possible endogeneity bias that may arise due to the modelling approach. Furthermore, results obtained from ML techniques, which are utilised for the first time in this strand of the literature, confirm our original findings.