Google Search Queries, Foreclosures, and House Prices

We study whether Google search behavior for “mortgage assistance” and “foreclosure help” aggregated in the mortgage default risk indicator (MDRI) of Chauvet et al. (2016) helps predict future house prices and foreclosures in local residential markets. Using a long-run equilibrium model, we disaggregate house prices into their fundamental and bubble components, and we find that MDRI dampens both components of house prices. This negative relationship is robust to various model specifications and time horizons. A higher intensity of search online, however, is associated with lower future foreclosure rates. We also find that foreclosure rates increase after a decline in the fundamental component of home values, but are not sensitive to their transitory (bubble) component. Foreclosure rates are higher in metropolitan areas located in non-recourse states. We interpret these findings as evidence for strategic household behavior. Our paper sheds new light on the predictive power of household sentiment derived from Google searches on prices and foreclosure rates in local housing markets.


Introduction
The subprime mortgage crisis serves as a powerful reminder of the seismic impact that the financial behavior of homeowners can exert on the U.S. financial system and economy. In the aftermath of the financial crisis, a voluminous literature has developed that aims to shed light on a key relationship in the run-up to the crisis: the interdependence between downward spiraling house prices and rising mortgage default rates. A better grasp of this issue was a matter of urgency during the housing market downturn as policymakers evaluated initiatives to curb the wave of foreclosures and help 'underwater' homeowners stay in their homes (Calomiris et al. 2013; Foote et al. 2008). The topic remains high on the public policy agenda as it lays bare the tension between housing affordability and financial stability, and carries implications for mortgage market design and macro-prudential regulation.
In the post-crisis period, there has also been substantial interest in the development of mortgage default risk indicators that can serve as a "warning signal" for future turmoil in housing and mortgage markets. The construction of such forward-looking sentiment indices from household survey data (such as the consumer sentiment survey of the University of Michigan), however, has proven elusive. Household surveys are constrained with respect to geographical coverage and number of participants. Their reliability is further complicated by the reluctance of respondents to truthfully answer sensitive questions, particularly those related to their financial affairs (Singer and Ye 2013), and hence they are of limited use as a predictive tool, particularly in the context of housing and mortgage markets.
A viable alternative that has increasingly been pursued in recent research is the creation of sentiment indices from internet search queries. Da et al. (2011, 2015) develop an investor sentiment indicator for the stock market, while Beracha and Wintoki (2013) and Van Dijk and Francke (2018) create a proxy for housing demand and show that online behavior has predictive power for house prices and liquidity in local residential markets. More recently, Chauvet et al. (2016) construct a mortgage default risk index (MDRI) based on the intensity of online searches for keywords such as "mortgage help" and "foreclosure assistance" captured by Google Trends. They show that this broad-based index predicts house price returns, returns on subprime credit default swaps and other relevant mortgage indicators, and conclude that MDRI "acts as a leading indicator of the most up-to-date, real-time measures of housing market performance." Despite the advantages of MDRI as a predictive tool relative to survey-based alternatives, little is known about the identity, reasons, or intentions of the households whose online searches are aggregated in the index. As Chauvet et al. (2016) point out, "searches are derived from all households, a universe that includes both owners and renters," yet it may be assumed that "the bulk of such searches likely emanate from property owners as they are most likely to be concerned with mortgage default." While this assumption seems plausible, it is unknown how households process the information they gather in their online searches. One possibility, suggested by Chauvet et al. (2016), is that MDRI captures "household concerns about mortgage failure or foreclosure." Another plausible alternative is that households learn by searching for relevant terms online and condition their behavior on the information they gathered.
That is, as a result of the information they collect online, households may adapt their behavior when dealing with financial distress, learning how to take advantage of government programs, or interacting with their mortgage lenders. Tetlock (2007) for example, hypothesizes a similar bi-directional relationship when studying the effect of negative media coverage on investor sentiment: While news printed in the Wall Street Journal might convey investor attitudes toward stocks not yet impounded in asset prices, they might also directly shape investors' perception of stocks. 1 Similarly, online searches might divulge information and at the same time convey information to economic agents who then act on this information. Indeed, top results from online searches include information on government programs to avert foreclosure as well as legal information. The mechanism of information acquisition by online searchers, however, is likely different from the one discussed by Tetlock (2007). While investor reaction to media content described by Tetlock is consistent with noise trader theories implying irrational behavior, the information gathering by households via online searches could be rational. Online searches can help households chart an optimal plan of action given the legal and institutional framework available in the state in which they reside as well as provide guidance on how to take advantage of government assistance programs.
A third possible scenario is that some searches are originating from prospective home buyers or home sellers who are trying to time their transactions or from investors trying to form expectations about the future performance of mortgage-related assets. Online searches thus might reflect the expectations about future market trends of this group of agents.
From a theoretical perspective, these three hypotheses are consistent with different causal relationships. The first hypothesis would predict an increase in foreclosures while the second hypothesis would predict a decrease in foreclosures as a result of a surge in the MDRI. The third hypothesis would imply no relationship between MDRI and foreclosures but a negative relationship between MDRI and future house prices as agents reveal their negative expectations about future market trends when searching online. Currently little is known about which of these hypotheses applies to local housing markets as most of the analysis by Chauvet et al. (2016) is conducted at the national level (local level analysis is restricted in terms of geographical coverage and does not account for metropolitan-area specific demographic and economic conditions).
The main objective of this paper is to explore the relationship between MDRI and future house prices and foreclosure rates in local housing markets. We advance previous research by expanding the set of metropolitan areas and accounting for the differences in appreciation rates between house price segments within the same geographical area. Furthermore, we take into account local economic conditions as well as relevant aspects related to mortgage lending at the regional level. Using a large set of metropolitan-area specific fundamental factors, we estimate a long-run equilibrium model and disaggregate house prices into their fundamental (equilibrium) component and bubble (deviation from equilibrium) component. We then study the relationship between MDRI and future house prices as well as their fundamental and bubble components. Further, we use this house price disaggregation to provide a more detailed analysis of the impact of the fundamental and bubble components on foreclosures.

Related Literature
This paper contributes to two distinct strands of literature. The first strand examines the predictive power of online search intensity on real economic activity. Almost a decade ago, Hal Varian (Google's Chief Economist) suggested that Google Trends data on the search volume for specific keywords helps predict information contained in future government data releases. 2 Since then, academics have explored the predictive power of Google's Search Volume Index (SVI) in other domains such as business activity and financial markets. Da et al. (2011, 2015) show that SVI captures investor attention and predicts stock prices at 2-week horizons. Beracha and Wintoki (2013) show that search intensity for terms such as "real estate" and "rent" helps predict home prices. Chauvet et al. (2016) construct a mortgage default risk index from the SVI for terms such as "mortgage assistance" and "foreclosure help," and show that this index helps predict housing returns, mortgage delinquency indicators, and subprime credit default swaps. In this paper, we examine the predictive power of this index for city-level housing appreciation rates in different market segments while taking into account local fundamental factors, mortgage market conditions, and mortgage market legislation in the states in which metropolitan areas are located (i.e. whether mortgage contracts are recourse or non-recourse). In recourse states, lenders can pursue borrowers for the mortgage balance remaining after foreclosed properties are sold, while in non-recourse states they cannot.
The second strand of literature, which developed rapidly in the aftermath of the subprime mortgage crisis, explores the impact of house prices on foreclosure rates. Studies on the contributing role of price declines to mortgage defaults examine the extent to which household behavior conforms to the "option-theoretical" model of mortgage default. A key prediction of this theory is that households find it optimal to walk away from their investment as soon as their equity falls below a certain (negative) threshold (Foster and Van Order 1984; Kau et al. 1994). Closely related research on the 'double trigger' hypothesis aims to disentangle the contributing role of the strategic motive from that of affordability issues and cash flow problems of households (e.g. income shocks related to job loss, divorce, or unforeseen healthcare expenses). Empirical studies conducted before the financial crisis find that negative equity is indeed a significant determinant of default (see e.g., Deng et al. 2000; Bajari et al. 2008; and Foote et al. 2008). Using data from the financial crisis, Elul et al. (2010) present estimates for the contributions of negative equity, illiquidity (measured by credit card utilization), unemployment shocks, and the existence of a second mortgage to the probability of default. More recently, Kelly and McCann (2016) find that short-term arrears are primarily driven by unemployment, negative income shocks or divorce, while long-term arrears are much more likely to be due to negative equity. Using post-crisis data, Mocetti and Viviano (2017) identify job losses as a primary reason for mortgage delinquencies. Ghent and Kudlyak (2011) find that borrowers are 30% more likely to default in non-recourse states, and that this effect is much stronger for owners of high-value homes. Moreover, Guiso et al. (2013) use survey data to demonstrate that the willingness to default increases in the home-equity shortfall. Further, exposure to people who recently defaulted for strategic reasons increases default probabilities because it shows that lenders are unlikely to pursue a deficiency judgment against borrowers.
The interdependence between foreclosures and house prices has also received considerable attention in the literature. Lin et al. (2009) explore how the distance to a foreclosed property in space and time impacts home values by focusing on the role of comparables in the price formation process. LaCour-Little et al. (2020) argue that the value of comparables needs to be adjusted for loan assumability, particularly in areas with a higher concentration of Federal Housing Administration insured loans. Calomiris et al. (2013) provide evidence that foreclosures dampen house prices, yet the negative impact of prices on foreclosures is much greater, in line with the theory of strategic borrower behavior. In contrast, Bhutta et al. (2017) find that emotional and behavioral factors are more important in the decision-making process of households than option-theoretic considerations. Gerardi et al. (2018) use data from the Panel Study of Income Dynamics (PSID) to assess the relative importance of negative equity versus ability to pay. While they find that strategic effects are important, changes in the ability to pay (e.g., job losses) also have large estimated effects.
In this paper, we add to these studies by disaggregating house prices into their fundamental and bubble components and differentiating between recourse and non-recourse states. Consistent with strategic motives for default, we find that homes are foreclosed at higher rates in Metropolitan Statistical Areas (MSAs) located in non-recourse states. Furthermore, foreclosures increase when fundamental home values decline but are not sensitive to transitory deviations from equilibrium (the bubble component of house prices).
The remainder of this paper is organized as follows. In the "Methodology" section, we present the methodology, and in the "Data, Variable Construction, and Summary Statistics" section we describe the data. The empirical results are presented in the "Results" section, and the concluding remarks in the "Conclusion" section.

Methodology
We begin our analysis by estimating a fundamental house price model. We assume that house prices converge toward their equilibrium values in the long run, yet may exhibit deviations from equilibrium in the short run. Furthermore, as different segments of the housing market (i.e. starter homes and trade-up homes) might react differently to changes in fundamentals, we allow for different functional relationships between fundamentals and Top tier and Bottom tier house prices. That is, the relationships between Top and Bottom house price tiers and fundamentals are given by the functions

$$P^{j*}_{i,t} = f^{j}(X_{i,t}), \tag{1}$$

where $P^{j*}_{i,t}$ is the logarithm of the fundamental value of the house in tier $j \in \{T, B\}$ (Top and Bottom) in MSA $i$ in month $t$, and $X_{i,t}$ is the vector of fundamental variables. Following Abraham and Hendershott (1996) and Capozza et al. (2004), we consider population, income, employment rate, user cost, and construction cost of housing in the MSA as fundamental factors. Further, because house prices are also affected by regional geographical and regulatory constraints, we add the land supply elasticity estimates derived by Saiz (2010) as a fundamental factor. 3 These supply elasticity indices vary across MSAs but not across time.
The objective of the fundamental model is to estimate the relationships $f^{j}(\cdot)$, yet a key concern with the estimation is that the levels of the house price indices and (some of) the fundamental factors might be non-stationary. A standard approach to address this issue is the estimation of an error correction framework, and the literature has proposed various specifications for the long-run relationship between house prices and fundamentals as well as the short-run dynamics of house prices (see, e.g. Drake 1993, Ashworth and Parker 1997, Kasparova and White 2001, and Stevenson 2008). In this paper, we estimate versions of the error correction mechanism proposed by Abraham and Hendershott (1996). This estimation method accounts for the serial correlation and the mean reversion in the time series of US housing returns that are widely documented in the literature (see, e.g. Shiller 1989, 1990).
We denote the actual appreciation rates of house prices (i.e. continuously compounded returns) of the two house tiers by $\Delta P^{j}_{i,t} = P^{j}_{i,t} - P^{j}_{i,t-1}$, and the appreciation rates of the fundamental house prices to be estimated by $\Delta P^{j*}_{i,t}$. Further, we assume that the way prices respond to fundamental factors is given by a linear relationship

$$\Delta P^{j}_{i,t} = \alpha^{j}_{0} + \alpha^{j}_{1}\,\Delta X_{i,t} + \theta^{j}_{i,t}. \tag{2}$$

Hereby $\alpha^{j}_{0} + \alpha^{j}_{1}\,\Delta X_{i,t}$ is the change in the fundamental value, which we denote by $\Delta P^{j*}_{i,t}$, and $\theta^{j}_{i,t}$ denotes the "error term" which accounts both for momentum and mean reversion effects and is given by the equation

$$\theta^{j}_{i,t} = \lambda^{j}_{1}\,\Delta P^{j}_{i,t-1} + \lambda^{j}_{2}\,\bigl(P^{j*}_{i,t-1} - P^{j}_{i,t-1}\bigr) + \varepsilon^{j}_{i,t}. \tag{3}$$

In this equation, the coefficient $\lambda^{j}_{1}$ measures the momentum (serial correlation) while the coefficient $\lambda^{j}_{2}$ measures the speed of adjustment to the long-run equilibrium. Combining eqs. (2) and (3) we obtain:

$$\Delta P^{j}_{i,t} = \alpha^{j}_{0} + \alpha^{j}_{1}\,\Delta X_{i,t} + \lambda^{j}_{1}\,\Delta P^{j}_{i,t-1} + \lambda^{j}_{2}\,\bigl(P^{j*}_{i,t-1} - P^{j}_{i,t-1}\bigr) + \varepsilon^{j}_{i,t}. \tag{4}$$

In addition to an OLS specification, we estimate fixed-effects models that allow for heterogeneity among MSAs and/or time. 4 One difficulty with this estimation is that the fundamental values $P^{j*}_{i,t-1}$ depend on the estimates of the different versions of eq. (4) while at the same time they are part of the error correction term which is used as an explanatory variable in these equations. We resolve this issue using the iterative procedure proposed by Abraham and Hendershott (1996). We assume that the observed house price in December 1999 corresponds to its fundamental value (i.e. $P^{j*}_{i,t} = P^{j}_{i,t}$ for $t$ = December 1999) and recover the fundamental value time series from the relationship $P^{j*}_{i,t} = P^{j*}_{i,t-1} + \Delta P^{j*}_{i,t}$. We then re-estimate eq. (4) and recalculate fundamental prices repeatedly until the estimates stabilize (typically we need to perform up to five iterations). 5 We then analyze how the current (and past) values of the mortgage default risk index, $MDRI_{i,t}$, impact the future values of the fundamental component $P^{j*}_{i,t}$ and the bubble component $B^{j}_{i,t} = P^{j}_{i,t} - P^{j*}_{i,t}$ of local house prices as well as the foreclosure rates $HF_{i,t}$. Furthermore, we use the house price decomposition to explore how changes in the fundamental component $P^{j*}_{i,t}$ and the bubble component $B^{j}_{i,t}$ of home values affect foreclosure rates $HF_{i,t}$.
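The iterative procedure can be illustrated with a minimal sketch for a single price series. Everything specific in it is an assumption made for illustration: the function name `estimate_fundamental`, the use of one fundamental regressor, the simulated data, the fixed number of iterations, and anchoring the fundamental value at the first observation (which stands in for the December 1999 anchor). It is not the estimation code behind the reported results.

```python
import numpy as np

def estimate_fundamental(p, dx, anchor=0, n_iter=5):
    """Iterate between estimating the error correction regression and
    rebuilding the fundamental log price series.

    p      : log house price index, shape (T,)
    dx     : change in one fundamental factor, shape (T,)
    anchor : index at which the fundamental value equals the observed price
    """
    T = len(p)
    dp = np.diff(p, prepend=p[0])      # dp[t] = p[t] - p[t-1], with dp[0] = 0
    p_star = p.copy()                  # first pass: no deviation from equilibrium
    coef = np.zeros(4)
    t = np.arange(2, T)
    for _ in range(n_iter):
        gap_lag = p_star[t - 1] - p[t - 1]          # error correction term
        Z = np.column_stack([np.ones(T - 2), dx[t], dp[t - 1], gap_lag])
        coef, *_ = np.linalg.lstsq(Z, dp[t], rcond=None)
        dp_star = coef[0] + coef[1] * dx            # change in fundamental value
        # Rebuild the fundamental series forward and backward from the anchor
        p_star = np.empty(T)
        p_star[anchor] = p[anchor]
        for s in range(anchor + 1, T):
            p_star[s] = p_star[s - 1] + dp_star[s]
        for s in range(anchor - 1, -1, -1):
            p_star[s] = p_star[s + 1] - dp_star[s + 1]
    return coef, p_star

# Simulated (hypothetical) data with momentum and mean reversion
rng = np.random.default_rng(0)
T = 240
dx = 0.002 + 0.001 * rng.standard_normal(T)
true_star = np.cumsum(0.001 + 0.5 * dx)
p = np.empty(T)
p[0], p[1] = true_star[0], true_star[1]
for s in range(2, T):
    p[s] = (p[s - 1] + 0.001 + 0.5 * dx[s]
            + 0.5 * (p[s - 1] - p[s - 2])
            + 0.1 * (true_star[s - 1] - p[s - 1])
            + 0.001 * rng.standard_normal())

coef, p_star = estimate_fundamental(p, dx)
bubble = p - p_star                    # deviation from the fundamental value
```

In the first pass the error correction column is identically zero, so the least-squares solver returns a zero coefficient for it; later passes use the fundamental series rebuilt from the previous estimates. A panel version would stack MSAs and add fixed effects, which this sketch omits.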

Data, Variable Construction, and Summary Statistics
The estimation of the fundamental house price model is based on a panel of 107 MSAs located in 29 U.S. states. A map with the location of these MSAs is presented in Fig. 1. For each MSA we observe the monthly growth rate of house prices and local fundamental factors. Further, in our analysis of the effect of the mortgage default risk index on house prices and foreclosures, we include additional variables that account for the mortgage market conditions in each MSA.
All MSAs in the dataset are listed in Table 7 in the Appendix along with the state in which they are located. The table also contains the classification of states into the recourse and non-recourse categories depending on whether states allow lenders to pursue a deficiency judgment against foreclosed borrowers (we use the classification of Ghent and Kudlyak 2011). In Fig. 1, the recourse states are depicted in dark blue and the non-recourse states in light blue.

Local House Prices and Fundamental Factors
In this study, we use the monthly Zillow home value indices 6 for the period from April 1996 to December 2016. These indices are constructed from deed records using a hedonic methodology which accounts for individual attributes such as size and number of bedrooms and bathrooms. A major challenge in the construction of home value indices is the changing composition of the properties sold in different periods. Indices based on a repeat-sales methodology, such as the S&P Case-Shiller index or the index of the Federal Housing Finance Agency, account for this issue by using only properties that are sold more than once. This methodology has limitations for smaller regions or smaller market segments where the number of repeat sales is limited. 7 Zillow, on the other hand, aggregates all transactions to create valuations for all properties (Zestimates) based on their characteristics 8 and uses the Zestimates to construct its regional price indices (see, e.g. Dorsey et al. 2010 for a discussion of this approach).
As we are interested in the dynamics of different market segments, we use the top and the bottom house price tiers in our analysis. The top tier index captures the median value of homes within the 65th to 95th percentile range while the bottom tier index captures the median value of homes within the 5th to 35th percentile range for each MSA. The dynamics of the top and the bottom price tiers for three of the MSAs in the dataset (San Diego, Minneapolis, and Phoenix) are presented in Fig. 2 (Panels A and B). These three MSAs represent a cyclical, a steady, and a bubble market, respectively (Mayer 2011).
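The tier definitions can be made concrete with a small sketch. It assumes that a tier value is simply the median of all home values whose rank falls inside the stated percentile band; the function name `tier_median`, the rank-based cutoffs, and the evenly spaced home values are hypothetical and do not reproduce Zillow's actual index construction.

```python
from statistics import median

def tier_median(values, lo_pct, hi_pct):
    """Median of the values ranked between the lo_pct and hi_pct percentiles."""
    ordered = sorted(values)
    n = len(ordered)
    lo = int(n * lo_pct / 100)   # first rank inside the band
    hi = int(n * hi_pct / 100)   # first rank above the band
    return median(ordered[lo:hi])

# 100 hypothetical home values, evenly spaced from $100,000 to $595,000
home_values = list(range(100_000, 600_000, 5_000))

bottom = tier_median(home_values, 5, 35)    # 5th to 35th percentile band
top = tier_median(home_values, 65, 95)      # 65th to 95th percentile band
print(bottom, top)  # 197500.0 497500.0
```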
Although there is variation across regional housing market segments, on average the indices peak in late 2006, and then decline and reach their lowest values between 2009 and 2012. They recover thereafter, almost reaching their pre-crisis period values around 2016. In our analysis, we use the log differences of the price indices (i.e., the continuously compounded returns) for the two market segments.

6 These data are obtained from https://www.zillow.com/research/data
7 Indeed, the S&P Case-Shiller index covers only 20 cities.
8 For more information on the Zillow methodology see https://www.zillow.com/research/zhvi-methodology/

Fig. 2 House price tiers and their components for San Diego, Minneapolis, and Phoenix (panels a to d); the market classification follows Mayer (2011). The bubble component is calculated as the deviation from the fundamental house price, i.e. the difference between the logs of the house price index and its fundamental component: $B^{j}_{i,t} = P^{j}_{i,t} - P^{j*}_{i,t}$.

The fundamental variables used include the population, personal income per capita, total non-farm employment, construction cost, a derived user cost of homeownership, and the land supply elasticity index in the MSA. Descriptive statistics of these variables, except for land supply elasticity which is time-invariant, along with unit root tests are presented in Table 1.
Notes to Table 1: The table reports descriptive statistics and stationarity tests. The Top and Bottom house price tiers are measured in US Dollars. The MDRI is constructed from 11 Google Trends search items such as "foreclosure," "mortgage help," or "government assistance" (see Chauvet et al. 2016, Table 1, for the full list). The Homes foreclosed are the number of foreclosures per 10,000 homes. The monthly observations of Population and Personal income are derived from annual data via cubic spline interpolation (De Boor 1978). The User cost is constructed from equation (5). The tests for stationarity are Fisher-type Augmented Dickey-Fuller (ADF) tests for unbalanced panels that have as the null hypothesis that all panels contain a unit root (Choi 2001). The ADF tests are performed on the logarithms of the variables. One, two, and three asterisks represent significance at the 10%, 5%, and 1% levels, respectively.

The population and personal income data are collected from the Bureau of Economic Analysis. We use cubic spline interpolation (De Boor 1978) to derive monthly values from the original annual observations. The total non-farm employment, available at the state level, is collected from Datastream and used for all metropolitan areas located in the same state. The construction cost is measured by the price index of new single-family houses under construction, which is available from the U.S. Census Bureau. As only the national index is available in monthly frequency, the change in construction costs varies over time but not across MSAs. As a measure of land supply elasticity of MSAs, we use the land supply estimates derived by Saiz (2010). 9 To facilitate comparison to previous research, we construct the user cost by the method of Capozza et al. (2004), which accounts for mortgage rates, taxes, expected appreciation as well as annual maintenance and depreciation of properties.
That is, the user cost is constructed by the formula in eq. (5). Here the "Mortgage Rate" is the 30-Year fixed-rate mortgage average in the United States, collected from the Federal Reserve Bank of St. Louis. The "Property Tax Rate," collected from Wallethub, 10 is the effective state real-estate tax rate. The "Income tax rate" is the sum of the average federal income tax rate and average state income tax rate for the middle quintile of households. The federal income tax rate is collected from the Urban-Brookings Tax Policy Center, 11 while the state income tax rate is collected from the National Bureau of Economic Research. 12 For inflation, we use the CPI provided by the Federal Reserve Bank of St. Louis. The annual maintenance and obsolescence of properties are set at 3% as indicated in eq. (5).
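As a rough illustration of how these inputs might combine, the sketch below assumes a common after-tax form of the user cost: the sum of the mortgage rate and the property tax rate, reduced by the income tax rate, minus inflation as a proxy for expected appreciation, plus the 3% allowance for maintenance and obsolescence. This functional form, the function name `user_cost`, and the input values are assumptions made for illustration and may differ from eq. (5).

```python
def user_cost(mortgage_rate, property_tax_rate, income_tax_rate,
              inflation, maintenance=0.03):
    """Assumed after-tax user cost of homeownership (all rates as fractions)."""
    # Mortgage interest and property taxes are treated as deductible here
    after_tax_cost = (mortgage_rate + property_tax_rate) * (1.0 - income_tax_rate)
    # Inflation proxies expected appreciation; maintenance is a fixed allowance
    return after_tax_cost - inflation + maintenance

# Hypothetical inputs: 6% mortgage rate, 1% property tax, 20% income tax, 2% inflation
uc = user_cost(0.06, 0.01, 0.20, 0.02)
print(round(uc, 4))  # 0.066
```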

Mortgage Lending
To account for local mortgage market conditions we construct two variables from the Home Mortgage Disclosure Act (HMDA) data 13 : the total amount of mortgage loans in a given year in each MSA (Loan supply) and the percentage of loans that are subprime, or higher-priced mortgage loans, in each MSA (Subprime). Loans are categorized as subprime following the classification of Mayer and Pence (2008), according to which a mortgage is a subprime mortgage if its rate spread exceeds 3% for first-lien mortgages and 5% for junior lien mortgages. 14

9 As this elasticity measure has only limited coverage, we are left with only 93 MSAs in our sample. Another way to account for differences across MSAs is to estimate a model with MSA-level fixed effects while leaving out the supply elasticity as a regressor. In Table 2 we report results for both the OLS and the fixed effect model, but use the estimates of the fixed effect model in the subsequent analysis because this model allows us to use all 107 MSAs in our sample.
10 Property tax rates are collected from: https://wallethub.com/edu/states-with-the-highest-and-lowest-property-taxes/11585/
11 The average federal income tax rate is downloaded from: https://www.taxpolicycenter.org/statistics/historical-average-federal-tax-rates-all-households
12 The state income tax rate is downloaded from: http://users.nber.org/~taxsim/state-tax-rates/. We apply the rates for a family income of $50,000.
13 The HMDA data contains over 80% of home loans and is the most comprehensive source of data on mortgage loans (Avery et al. 2007).
14 The rate spread is the difference between the Annual Percentage Rate (APR) and a survey-based estimate of APRs offered on prime mortgage loans of a comparable type utilizing the "Average Prime Offer Rates" fixed table or adjustable table, action taken, amortization type, lock-in date, APR, fixed term (loan maturity) or variable term (initial fixed-rate period), and reverse mortgage.
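The subprime classification rule and the resulting Subprime share can be sketched as follows. The thresholds are the ones stated above; the function names, the tuple layout of a loan record, and the example loans are hypothetical and are not the HMDA file format.

```python
def is_subprime(rate_spread, lien):
    """Classify a loan by its rate spread (in percentage points) and lien status."""
    threshold = 3.0 if lien == "first" else 5.0   # junior liens use the 5% cutoff
    return rate_spread > threshold

def subprime_share(loans):
    """Percentage of loans classified as subprime."""
    flagged = sum(is_subprime(spread, lien) for spread, lien in loans)
    return 100.0 * flagged / len(loans)

# Hypothetical loans for one MSA and year: (rate spread, lien status)
loans = [(3.5, "first"), (2.0, "first"), (5.5, "junior"), (4.0, "junior")]
share = subprime_share(loans)
print(share)  # 50.0
```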

Mortgage Default Risk
The Mortgage Default Risk Index (MDRI hereafter) of Chauvet et al. (2016) is constructed from the Search Volume Index (SVI) data for terms such as "foreclosure help" and "government mortgage help" in US states published by Google Trends. 15 The MDRI is obtained from the UCLA ZIMAN Center for Real Estate. 16 Zillow also publishes a Homes Foreclosed index (HF hereafter) which gives the number of homes foreclosed per 10,000 homes in metropolitan areas each month. As an illustration, in  Table 1.

Results
We first explore how real house prices respond to changes in local fundamental factors by estimating the models given in eq. (4). In particular, we consider population, personal income, employment, as well as the variable we created for the user cost, construction cost, and the land supply elasticity of the MSA (cf. Capozza et al. 2004; Stevenson 2008). As a preliminary step, we perform unit root tests on the tiered house price indices as well as the fundamental variables (see the last two columns in Table 1). These variables are nonstationary in levels and stationary in first differences. Using the levels of these variables directly in the statistical analysis would therefore be problematic, which justifies our focus on growth rates and the use of an error correction modeling approach.
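A Fisher-type panel unit root test combines the p-values of unit root tests run separately on each panel member. The sketch below shows one such combination, the inverse chi-square statistic, under the assumption that the individual ADF p-values are already available. Choi (2001) discusses several combination statistics, so the choice of this one, the function name, and the example p-values are assumptions made for illustration.

```python
import math

def fisher_panel_pvalue(p_values):
    """Combine N individual p-values into a single panel test.

    Under the null that every panel contains a unit root, the statistic
    -2 * sum(log p_i) follows a chi-square distribution with 2N degrees of freedom.
    """
    n = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    # Closed-form chi-square survival function for an even number (2N) of df
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return stat, math.exp(-half) * total

# Hypothetical ADF p-values for three MSAs
stat, p_combined = fisher_panel_pvalue([0.04, 0.20, 0.01])
```

A small combined p-value rejects the null that all panels contain a unit root, which is the pattern one would expect for the first-differenced series.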

Long-Run Equilibrium Relationship
The regression results of the error correction models specified in the four versions of eq. (4) are presented in Table 2. They include OLS estimates as well as estimates of fixed-effect models in which we control for MSA and time fixed effects.
The coefficient estimates for all fundamental variables have the anticipated sign and are statistically significant at the 1% or 5% level. As expected, growth in population, personal income, and employment has a positive impact on house prices. An increase in user cost, a significant component of which is the mortgage interest rate, is associated with lower house price growth. An increase in construction cost, in turn, leads to an increase in home values. Further, the relationship between the land supply index and house prices is negative, as has been found in previous literature. The error correction term is significant, indicating that both the top and the bottom house price tiers adjust to their long-run equilibrium values. Similarly, the autoregressive coefficient is positive and statistically significant, indicating the presence of momentum in housing returns for both house price tiers.

15 For the construction of their monthly MDRI, Chauvet et al. (2016) used "foreclosure assistance+foreclosure help+ government assistance mortgage+home mortgage assistance+home mortgage help+housing assistance+mortgage assistance program+mortgage assistance+mortgage foreclosure help+mortgage foreclosure+mortgage help" to obtain the joint SVI.
16 As the city-level MDRI data is only available for 20 cities, we use the state-level MDRI data for all the MSAs in our sample. The data on the MDRI indices are available at: https://github.com/ChandlerLutz/MDRI_Data
Fig. 3 Default risk indices (panels a and b).

Table 2 Estimates of the fundamental house price model (ΔBottom and ΔTop, respectively). For the population, employment, personal income, and construction cost variables, the continuously compounded growth rates are used as regressors. Following Abraham and Hendershott (1996), the change of the user cost is used as a regressor. One, two, and three asterisks represent significance at the 10%, 5%, and 1% levels, respectively.

The magnitude of the coefficients suggests that bottom tier homes are more sensitive to changes in population, employment, user cost, and construction cost, and exhibit stronger momentum. We formally test whether the coefficients for the top tier and the bottom tier are significantly different from each other using the OLS model specification (Model 1 in Table 2). In particular, we construct a dummy variable "Toptier," which takes on the value of one for the top tier and zero for the bottom tier index. We include it as a regressor along with the interactions of this variable with the fundamental variables. We estimate this regression using Abraham and Hendershott's (1996) iterative method described in the "Methodology" section by pooling the top tier and bottom tier observations together. We find that only the coefficients for the interaction variables (Toptier * ΔHouse Price t−1) and (Toptier * ΔConstruction cost) are significant. They have a negative sign, indicating that starter homes exhibit a stronger momentum effect and respond more strongly to construction cost than trade-up homes.
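The logic of the pooled test with a tier dummy can be seen in a stripped-down sketch with one regressor and noise-free, made-up data. The variable names and coefficient values are hypothetical, and the sketch omits the iterative procedure and the standard errors needed for an actual significance test.

```python
import numpy as np

# Hypothetical noise-free data: one regressor x observed for each tier
x = np.linspace(-1.0, 1.0, 50)
y_bottom = 1.0 + 2.0 * x      # bottom tier: intercept 1.0, slope 2.0
y_top = 1.5 + 1.2 * x         # top tier: intercept 1.5, slope 1.2

# Pool the tiers and add the Toptier dummy (1 = top tier, 0 = bottom tier)
x_pooled = np.concatenate([x, x])
y_pooled = np.concatenate([y_bottom, y_top])
toptier = np.concatenate([np.zeros_like(x), np.ones_like(x)])

# Design matrix: constant, regressor, dummy, dummy-by-regressor interaction
Z = np.column_stack([np.ones_like(x_pooled), x_pooled, toptier, toptier * x_pooled])
beta, *_ = np.linalg.lstsq(Z, y_pooled, rcond=None)
print(np.round(beta, 6))  # [ 1.   2.   0.5 -0.8]
```

The interaction coefficient equals the top-tier slope minus the bottom-tier slope, so a negative value means the top tier responds less strongly to the regressor than the bottom tier.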
The fundamental house price model allows us to disaggregate house price indices into their fundamental and bubble components. Using the estimates of Model 2 (Panel Fixed Effects) in Table 2, we calculate these two components of house price and represent their dynamics for three of the MSAs, (San Diego, Minneapolis, and Phoenix) in Fig. 2. In the following subsections, we analyze whether the MDRI helps predict these components of house prices and whether these components affect future foreclosures.

Effect of MDRI on House Prices
As a next step, we explore how household sentiment revealed by the mortgage default risk index (MDRI) impacts house prices. To account for mortgage market conditions, we add as regressors two variables that we constructed from HMDA data: the total amount of mortgage lending in the previous year (Total Loans), and the percentage of mortgage loans that are classified as subprime (Subprime). The results are reported in Table 3.
We find that an increase in the MDRI index lowers house price growth in the following three to six months. In the regression in which all lags are included (see model 8), the coefficients for the lags between 3 and 6 months are statistically significant and range between −0.00017 and −0.0012. Further, as anticipated, we find that the amount of mortgage credit that flows into the area serves to increase home values, while subprime lending in the previous year dampens home values in the current year.
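A regression with several monthly lags of the search index, as in the specification that includes all lags, can be sketched as follows. This is a minimal illustration on synthetic data, not a replication: the helper `lag_matrix`, the series names, and the assumed effect at lag 3 are hypothetical, and the control variables (Total Loans, Subprime, Recourse) are omitted for brevity.

```python
import numpy as np


def lag_matrix(x, lags):
    """Stack x_{t-k} for each k in lags; the first max(lags) rows are dropped."""
    max_lag = max(lags)
    cols = [x[max_lag - k: len(x) - k] for k in lags]
    return np.column_stack(cols)


rng = np.random.default_rng(1)
T = 400

# Synthetic monthly changes in the log search index (illustrative only).
d_mdri = rng.normal(0.0, 1.0, T)

# Assumed data-generating process: house price growth responds only to the
# change in the index three months earlier, with a coefficient of -0.001.
d_hp = np.zeros(T)
d_hp[3:] = -0.001 * d_mdri[:-3]
d_hp += rng.normal(0.0, 0.0002, T)

lags = [1, 2, 3, 4, 5, 6]
y = d_hp[max(lags):]                       # align the dependent variable
X = np.column_stack([np.ones(len(y)), lag_matrix(d_mdri, lags)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
lag_coefs = dict(zip(lags, beta[1:]))      # one coefficient per lag
```

Including all lags jointly, rather than one at a time, lets each coefficient be read as the effect of that lag holding the other lags fixed.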
In the Appendix, we present regression results for the 2007-2012 and the 2013-2016 subsamples (see Table 8, Panel A and Panel B, respectively), and we find that the predictive power of MDRI holds mostly in the first subsample, which includes the subprime mortgage crisis. In addition to considering actual growth rates of the house price tiers, we also analyze the decomposition of house prices into their fundamental and bubble components. We find that the MDRI dampens both the fundamental component (see Table 9) and the bubble component (see Table 10) of house prices.
The regression results also indicate that foreclosure legislation has a significant effect on home value appreciation rates. In particular, house price growth is on average lower in the metropolitan areas located in recourse states where lenders can pursue a deficiency judgment against borrowers. The coefficient for the recourse dummy variable in Table 3 is statistically significant at conventional levels and equals −0.0002 across all specifications. One possible explanation is that buying a home with a mortgage is less attractive to borrowers in recourse states.
Table 3 Predictive power of MDRI for the house price appreciation rates (ΔHP). Dependent variable: house price appreciation rate (ΔHP).

Effect of MDRI on Foreclosure Rates
Ghent and Kudlyak (2011, Table 1) provide an overview of foreclosure legislation across US states and present statistics on the timeline of the different stages of the foreclosure process in each state. If there are no delays, a non-contested non-judicial foreclosure can take as little as 60 days, yet the process often takes longer. Furthermore, foreclosures are followed by a redemption period with a duration of another 6 months. One can only speculate when delinquent borrowers start searching online for help and how the intensity of their searches varies over time. To allow for different timing of online search, we explore alternative specifications. In Table 4 we present results for search behavior with lags between 1 and 6 months. We find that an increase in the MDRI lowers foreclosures for horizons between 2 and 6 months. These coefficients are statistically significant and range between −0.0425 and −0.1645 (see model 8). In Table 5 we aggregate the Google searches over periods of 3 months and consider regressions with lags of up to a year.
In this setting as well, we find that Google searches reduce foreclosures (the coefficients for the lags between 1 and 3 months and the lags between 9 and 12 months are statistically significant). As a robustness check, in Table 11 in the Appendix we consider the 2007-2012 and the 2013-2016 subsamples and largely find an inverse relationship between the MDRI index and future foreclosures.
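The aggregation of the search index over three-month periods can be sketched as follows. The exact timing convention is an assumption here: we take the regressor for lags a to b at month t to be the log change of the index between months t−b−1 and t−a, i.e. the sum of the monthly log changes dated t−b through t−a. The function name and the index values are hypothetical.

```python
import numpy as np


def window_log_change(index, t, first_lag, last_lag):
    """Log change of `index` over months t-last_lag .. t-first_lag.

    Equals the sum of the monthly log changes dated t-last_lag, ..., t-first_lag,
    i.e. log(index[t - first_lag]) - log(index[t - last_lag - 1]).
    """
    return np.log(index[t - first_lag]) - np.log(index[t - last_lag - 1])


# Hypothetical monthly index values (illustrative only, 14 months).
mdri = np.array([100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 107.0,
                 111.0, 115.0, 112.0, 109.0, 113.0, 118.0, 120.0])

t = 13  # current month (0-based position in the array)
windows = [(1, 3), (4, 6), (7, 9), (10, 12)]
agg = {w: window_log_change(mdri, t, *w) for w in windows}
```

Under this convention the four regressors are non-overlapping and together span the twelve months before t, so their sum equals the twelve-month log change of the index ending at month t−1.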
These findings are consistent with the hypothesis that the MDRI index captures learning effects. That is, by searching online some households may access information that helps them avert foreclosure. Further, consistent with the theory of strategic default, we find that foreclosure rates are lower in recourse states. Similar findings are reported in the recent empirical literature on the effect of recourse on default. For example, Ghent and Kudlyak (2011, Table 3) report that the probability of default of loans made in recourse states is on average 6.2% smaller (although their coefficient estimate is not significant).

Effect of House Prices on Foreclosure Rates
In Table 6 we report results for several alternative specifications. We disaggregate house prices into their fundamental and bubble components and study the contributing effect of these two components on foreclosures. The finding that is robust across all specifications is that foreclosures respond to changes in fundamental values. A 1% drop in fundamental home values increases the log of the homes foreclosed by 1.2, i.e., leads to about 3.3 extra foreclosures in the following month for every 10,000 homes. The bubble component, on the other hand, does not appear to have an impact on the proportion of defaulting homeowners. We interpret this as further evidence for strategic sophistication by homeowners. A shock to the fundamental component of home values would have a long-term effect on future house prices while a shock to the bubble component would disappear over time as home values revert to their long-run equilibrium. Indeed, note that the speed of adjustment coefficient in the fundamental equation reported in Table 2 is significant and has the anticipated sign. We further examine strategic default behavior by constructing a dummy variable for house price declines of more than 5% in the past twelve months (PriceDecline>5%).
Table 4 notes: The table presents regression estimates of the effect of the MDRI index on the log of homes foreclosed. The MDRI is included with monthly lags. Loan supply_{t−12} is the total supply of mortgage loans in the previous year in the MSA (derived from HMDA data). Subprime_{t−12} is the percentage of the total amount of mortgage loans in the MSA that are subprime. The Recourse variable takes on the value of one if the MSA is in a recourse state, and zero otherwise. One, two, and three asterisks represent significance at the 10, 5, and 1% levels, respectively.
Table 5 notes: The table presents regression estimates of the effect of the MDRI index on the log of Homes Foreclosed. The MDRI is included with three-month lags (e.g., ΔMDRI_{1−3} is the change in the log of the MDRI index for the previous three months, etc.). Loan supply_{t−12} is the total supply of mortgage loans in the previous year in the MSA (derived from HMDA data). Subprime_{t−12} is the percentage of the total amount of mortgage loans in the MSA that are subprime. The Recourse variable takes on the value of one if the MSA is in a recourse state, and zero otherwise. One, two, and three asterisks represent significance at the 10, 5, and 1% levels, respectively.
We do not find evidence that foreclosures are driven primarily by option-theoretic defaults: the coefficients for both the Recourse dummy and the interaction term of the Recourse dummy with the (PriceDecline>5%) dummy are not significant. These findings correspond to results reported in the recent empirical literature documenting the relative rarity of defaults due solely to strategic motives, and the relative importance of affordability constraints. Bhutta et al. (2017) find that homeowners do not walk away from their investments unless they are substantially more 'underwater' than option-theoretical models would predict, while Gerardi et al. (2018) find that default is primarily driven by income shocks rather than strategic motives. We further explore how foreclosures respond to house price declines of more than 5% in the previous year and find that foreclosures increase more in the MSAs sustaining such declines. There is no evidence of a differential impact of such declines in recourse and non-recourse states. Furthermore, the effect on foreclosures is about the same regardless of whether the decline has originated in the top tier or the bottom tier of the local housing market.
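The construction of the (PriceDecline>5%) dummy and its interaction with the Recourse dummy can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the treatment of the first twelve months as missing, and the price path are hypothetical and are not taken from the paper.

```python
import numpy as np


def price_decline_dummy(hp_index, threshold=0.05, window=12):
    """1.0 where the index fell by more than `threshold` over `window` months.

    The first `window` entries are NaN because no full-window change exists.
    """
    hp = np.asarray(hp_index, dtype=float)
    out = np.full(hp.shape, np.nan)
    change = hp[window:] / hp[:-window] - 1.0
    out[window:] = (change < -threshold).astype(float)
    return out


# Hypothetical monthly house price index for one MSA: flat for a year,
# then a steady decline from 99 to 90 over the following year.
hp = np.concatenate([np.full(12, 100.0), np.linspace(99.0, 90.0, 12)])
decline = price_decline_dummy(hp)

# Interaction with a time-invariant Recourse dummy (1.0 = recourse state).
recourse = 1.0
recourse_x_decline = recourse * decline
```

In a regression that includes both the dummy and the interaction, the interaction coefficient measures whether foreclosures respond differently to large price declines in recourse states than in non-recourse states.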

Conclusion
As of 2020, Google commands more than 92% of the search engine market share worldwide with an estimated number of approximately 2 trillion global searches per year (www.hubspot.com). Extant research has established that Google searches provide timely indicators for social and economic activity in a variety of domains ranging from automotive sales to the spread of infections, to asset returns in financial and housing markets. Da et al. (2011) find that search volume data predict stock returns and conclude that "search data has the potential to objectively and directly reveal to empiricists the underlying belief of an entire population of households." Chauvet et al. (2016) constructed a mortgage default risk index from data on Google search volumes for keywords such as "mortgage help" and "foreclosures assistance" and demonstrated that this index has predictive power for the returns on housing and mortgage-related assets.
Table 6 notes: The table presents regression estimates of the effect of house prices on foreclosures. Price Decline>5% takes on the value of one if house prices in the MSA declined more than 5% in the last 12 months and zero otherwise. The dummy variable Toptier equals one for the top-tier house prices, and zero for the bottom-tier house prices. One, two, and three asterisks represent significance at the 10, 5, and 1% levels, respectively.
In this paper, we analyze how this mortgage default risk index is related to house prices and foreclosure rates in local housing markets. Using a long-run equilibrium model, we disaggregate local house prices into their fundamental and bubble components. We then explore how the mortgage default risk index relates to future housing market outcomes such as house prices and foreclosures. In line with previous literature, we find that an increase in the mortgage default risk index leads to lower house price appreciation rates. Perhaps somewhat surprisingly, we also find that an increase in the mortgage default risk index reduces the percentage of foreclosures for various time horizons. One interpretation of these findings is that economic agents not only reveal their sentiments through their search behavior but also collect and process the information they access and as a consequence adapt their behavior. That is, through online searches for "foreclosure help" and "mortgage assistance" households can access relevant information that helps them avert foreclosures.
We also report new results on the interaction between housing and mortgage markets which suggest some degree of household strategic behavior. In particular, we find that declines in the fundamental component of house prices lead to an increase in foreclosure rates while declines in the transitory component of house prices have no statistically significant effect. Furthermore, foreclosures tend to be higher in metropolitan areas situated in non-recourse states where lenders cannot pursue a deficiency judgment against borrowers.
In addition to exploring its predictive properties, one can also use online search data to empirically test economic models that incorporate the learning of economic agents and view equilibria as the outcome of adaptive behavior. The empirical assessment of such models is left for future research.
Notes: Loan supply_{t−12} is the total supply of mortgage loans in the previous year in the MSA (derived from HMDA data). Subprime_{t−12} is the percentage of the total amount of mortgage loans in the MSA that are subprime. The Recourse variable takes on the value of one if the MSA is in a recourse state, and zero otherwise. One, two, and three asterisks represent significance at the 10, 5, and 1% levels, respectively.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.