1 Introduction

The online advertising market has grown rapidly since the appearance of the first online advertisement in 1994 (Hollis, 2005). According to eMarketer’s forecast, total global digital advertising spending will reach $602.25 billion in 2022 (eMarketer 2022b). In addition, an astounding 81.5% of China’s total advertising spending will go to digital channels in 2022, making it the top country in terms of the extent and scale of digital advertising spending around the world (eMarketer 2022a).

Advertisements have long been recognized as a convenient and quick way for consumers to find information they need during their product consideration and purchase process (Nelson, 1974). Prior research shows that advertising can lead to outcomes such as increased product awareness, site visits, purchase intention, and actual purchases and sales (Assmus et al., 1984; Bass et al., 2007; Blattberg & Jeuland, 1981; Drèze & Hussherr, 2003; Ilfeld & Winer, 2002; Sherman & Deighton, 2001). Although the effects of advertising are widely documented in the literature, the advertised products are mainly consumer goods (e.g., food, clothes, and other daily necessities). By comparison, limited research in the information systems (IS) and marketing literature examines the effect of online advertising on other types of goods, such as real estate assets.

Unlike consumer goods, which are intended for consumption by the average consumer and are hence typically cheaper, real estate assets (e.g., residential and commercial properties/houses) have consumption returns and involve long-term investments and high transaction costs (Varian, 2014). They cannot be purchased repeatedly in the short term and are geographically fixed and heterogeneous (Krainer, 2001). Hence, real estate investing requires consumers to undergo more procedures, spend additional time and energy, and acquire more information (Georgiev et al., 2003). This longer and more complex procedure of buying houses raises several questions. Does online advertising affect buyers’ purchases? Why and how are the eventual sales of real estate assets so intricate? Specifically, buyers’ considerations and purchase decisions for consumer goods are mainly driven by their needs. However, the purchase of houses is driven less by consumption needs and more by an investment decision that carries high financial risks and can have a lifelong impact on buyers. Unfortunately, the role of online advertising in the sales of real estate assets, such as new housing units, remains an important but under-studied issue, and it is the research gap that this study attempts to address.

The objective of the paper is two-fold. First, we aim to explore the impact of online advertising on real estate asset sales. Second, we examine and validate the underlying mechanism and identify factors that may alter the relationship between online advertising and real estate asset sales. In essence, our research question is: how and to what extent does online advertising influence the housing market? To examine this, we collect data from a rich database about online advertising and local housing market information for cities in China. China provides an ideal context for our research because it is now the world’s largest residential property market, with the total sales of both new and second-hand houses forecast to be $3.5 trillion in 2020 (Forbes, 2020). Based on rigorous identification strategies and estimation methods, we find evidence that online advertising increases the sales of new houses. This effect is stronger with lower housing prices, higher resident incomes, and lower-tier cities. Furthermore, we find that this effect could arise because online advertising lowers housing prices, attracts more immigrants, and reduces emigration, forming a potentially larger group of buyers. Lastly, we report that online advertising in the new house market does not appear to have a spillover effect on the related second-hand housing market. These findings provide theoretical contributions and offer practical implications to practitioners and policy makers.

2 Research background

Advertising is the dissemination of information used to promote products to consumers. Based on dissemination channels, advertising is classified into two categories: offline and online advertising. Offline advertising is non-Internet-based advertising, including television, radio, and print ads (Bayer et al., 2020), while online advertising uses the Internet to disseminate product information, such as online display advertising (Goldstein et al., 2014), search advertising (Lu & Yang, 2017), and e-mail advertising (Wattal et al., 2012). Advertising has many economic benefits, including increasing product sales, enhancing firm performance and value, and improving market efficiency, and it has therefore received considerable research attention. Our research is closely related to two streams of literature: (1) the usage and (2) the mechanism of online advertising’s impacts on economic outcomes.

First, past studies explored the influence of online advertising usage on economic outcomes. For instance, Xu et al. (2014) captured dynamic interactions among advertisement clicks to study the effects of various online advertisements on purchase conversion for consumer electronics. Ahn et al. (2018) explored whether being exposed to an advertisement during a search could change consumers’ e-commerce search behaviors by conducting eye-tracking field experiments on clothing products. Sun et al. (2020) studied the effectiveness of two popular advertising tools, sponsored search and social media endorsement, in increasing traffic and sales for online sellers at a retail e-commerce platform. They also examined the differential effects for sellers with low and high reputations. Moreover, Sahni (2016) focused on spillover benefits and explored the impact of online ads on the advertiser’s competitors, reporting that ads increased the sales of advertised products but also reminded consumers of similar (non-advertised) goods. Additionally, online advertising can lead to ad repetition, which can annoy consumers, as studied by Goldstein et al. (2014) and Todri et al. (2020). Goldstein et al. (2014) discussed the economic and cognitive costs of these annoying display advertisements, while Todri et al. (2020) further proposed a hidden Markov model that investigated both the impact of display advertising on consumers’ purchase decisions and the potential of persistent display advertising to stimulate annoyance for consumers. Other researchers compared the impacts of online and offline advertising (Bayer et al., 2020; Goldfarb & Tucker, 2011b). Despite the abundant research on the impact of online advertising for different kinds of products, most of these IS and marketing studies explored the economic effects of online advertising only for consumer goods and overlooked other types of goods, such as houses. Given the significant difference between the two types of goods, the overlooked impact of online advertising on real estate assets remains a research gap that this study attempts to address.

Another stream of literature explored the mechanism through which online advertising influences economic outcomes. Some focused on the pricing design of online advertising (Feng & Xie, 2012; Hu et al., 2016). For instance, Feng & Xie (2012) proposed a performance-based advertising model and discovered that adopting this advertising scheme profoundly impacted one fundamental function of advertising (i.e., signaling product quality). Hu et al. (2016) compared the cost per click (CPC) with cost per action (CPA) pricing schema to study the economic impacts, such as purchase rates and profits. Some also explored the design of keyword auction mechanisms (Du et al., 2017; Liu et al., 2010). Researchers studied the impact of weighting schemes and minimum-bid policies in keyword bidding on economic revenues (Liu et al., 2010). Du et al. (2017) further considered multiple keyword auctions and empirically examined the role of keyword categories and match types in advertising and financial performance, such as the conversion rate and product sales. Additionally, some studies focused on online advertising auctions (Decarolis et al., 2020), considering the moderating role of keyword competition (Ayanso & Karimi, 2015). Others investigated the relationship between organic result mechanisms and sponsored search advertising performance (Agarwal et al., 2015; Yang & Ghose, 2010). Moreover, some studies examine marketing mechanisms, such as targeting (Chen & Stallaert, 2014), retargeting (Sahni et al., 2019), and advertising competition (Agarwal & Mukhopadhyay, 2016). Although the above studies uncovered various mechanisms of online advertising’s influence in the consumer goods market, these findings cannot be directly applied to our research context because of the significant differences between the consumer goods market and the real estate market. Specifically, in contrast to the commodity market or manufactured consumer goods that are partially differentiated, real estate markets are considered completely product-differentiated because of the local nature of houses and are closely related to urban economics (DiPasquale & Wheaton, 1996). Therefore, it is important to factor in regional characteristics and movements across metropolitan areas when investigating questions related to real estate economics. Our study focuses on a basket of city characteristics and macroeconomic factors to unravel the contingency factors and underlying mechanism of online advertising’s impact on the real estate market.

3 Data

3.1 The database: CREIS

Our main objective is to analyze the relationship between online advertising and sales of new houses. This information can be extracted from the China Real Estate Index System (CREIS) database, which is operated by China Index Academy. China Index Academy collaborates with Fang.com, one of the largest online real estate platforms in China, and conducts regular field investigations in various regions and cities across China to collect data on real estate projects and houses from enterprises, brokerage agencies, intermediaries, and governments. To ensure the accuracy and objectivity of the collected data, it is reviewed and verified through telephone reviews by professionals in different regions, spot checked by analysts from headquarters, and collated by headquarter staff. To examine our research question, we worked with China Index Academy to collect and construct a rich data set that includes information on our focal and control variables for empirical estimations. Our data set consists of three sources: (1) online advertising information (e.g., number of real estate projects involved, count of advertising impressions) for the new house market, (2) housing market information (e.g., prices, sales), and (3) factors, such as macroeconomics (e.g., GDP, CPI) and demographics of cities across China.

Our final data set covers 25 cities in China over 120 months from January 2010 to December 2019. These cities range from low-tier (i.e., less developed) to high-tier (more developed) cities in 17 municipalities and/or provinces across China, providing a representative sample to examine our research question. However, due to the existence of missing data, our final data set is an unbalanced panel with 1,133 observations.

4 Model and analysis

4.1 Empirical model

Based on the above discussion surrounding our motivations, we examine how online advertising affects the sales of new, local houses. We conduct our analysis at the city-month level to construct all our model variables. Let subscript i denote each city, and subscript t denote each month. To examine the impact of online advertising on house sales, our dependent variable, house sales NU_it, indicates the total units sold of new houses of city i in month t. Our independent variable, online advertising AD_it, represents the total impressions of online advertising (i.e., the number of times that advertising is displayed on websites or search engines) on new houses of city i in month t. Lastly, we assemble a comprehensive set of factors as our control variables, including: (1) average price (CNY/m²) of new houses of city i in month t (NP_it), (2) total units sold of second-hand houses of city i in month t (SU_it), (3) average price (CNY/m²) of second-hand houses of city i in month t (SP_it), (4) GDP growth rate of city i in month t (GDP_it), (5) per capita disposable income (CNY) of residents in city i in month t (INC_it), (6) consumer price index of city i in month t (CPI_it), (7) total population of city i in month t (POP_it), (8) a linear time trend variable (TND_t), and (9) a set of time dummies at the monthly level (T_ts). We include control variables (1) to (3) because prices and sales of the focal and related (second-hand) housing markets could be correlated. (4) to (7) are macroeconomic and demographic factors, which are typical drivers of housing price (Leung, 2004; Mulder, 2006). (8) is included to control for the natural growth of the housing market and changes in house sales over time. Finally, (9) accounts for other unobserved time-variant factors that are correlated with sales.
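
The variable definitions above can be collected into a single conditional-mean equation. The following is only a sketch: it assumes the log-link form implied by the fixed effects Poisson estimator used in Section 4.2, and the coefficient symbols (α_i for the city effect, β, γ_1 to γ_7, δ, τ_s) and the dummy index s are labels introduced here for illustration, not notation taken from the paper.

```latex
\[
\mathrm{E}\left[ NU_{it} \mid AD_{it}, X_{it}, \alpha_i \right]
= \exp\Big( \alpha_i + \beta\, AD_{it}
  + \gamma_1 NP_{it} + \gamma_2 SU_{it} + \gamma_3 SP_{it}
  + \gamma_4 GDP_{it} + \gamma_5 INC_{it} + \gamma_6 CPI_{it} + \gamma_7 POP_{it}
  + \delta\, TND_{t} + \sum_{s} \tau_s T_{ts} \Big)
\]
```

Under this form, β is the change in the log of expected new house sales associated with a one-unit change in AD, holding the controls fixed.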

Table 1 presents the descriptive statistics of our model variables. Before the subsequent formal model estimation, we provide some model-free evidence by plotting the relationship between online advertising and new house sales using Beijing as an example. Fig. 1 shows that the peaks in house sales often coincide with increases in advertising. This suggests a positive relationship between online advertising and new house sales. Next, we conduct a formal estimation of this relationship.

Table 1 Descriptive statistics
Fig. 1 Online advertising and new house sales

4.2 Result

As our dependent variable of new house sales units (NU) is a discrete count variable, we employ a count data model for estimation. We first estimate a fixed effects (FE) Poisson model of NU on all the control variables. As reported in Table 2, Column (1), various control variables have significant relationships with NU. These significant relationships imply that our control variables have good explanatory power and can control for many factors that confound our investigation of the impact of AD on NU. Beyond these control variables, we estimate a full Poisson FE model by further including the independent variable of online advertising impressions (AD). We summarize the results in Table 2, Column (2). As indicated, the estimated coefficient of AD, 0.052 (± 0.002), is positive and statistically significant, showing that AD has a positive relationship with NU. Specifically, the expected increase in the log count of NU for a one-unit increase in AD is 0.052. Therefore, the percentage change in the incidence rate of new house units sold is an increase of 0.005% for every additional impression in online advertising. We also estimate a random effects (RE) Poisson model of NU on all the explanatory variables and present the results in Table 2, Column (3). The estimate of AD is consistent with the FE estimate. RE models are intended for a large number of groups and relatively few time periods per group, and the asymptotic assumption holds only when there are a large number of groups (Wooldridge, 2010). Since the number of groups in our data set (i.e., the number of cities) is relatively small, the FE model is preferred over the RE model. Thus, we consider the results of the full FE model from Column (2) as our baseline results and employ FE as the main estimation for subsequent analysis. In addition, we report the Bayesian information criterion (BIC) as the model fit index in Table 2. BIC is one of the most commonly used model selection criteria for non-linear models. A lower BIC value indicates a better model fit.
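
To make the step from a Poisson coefficient to a percentage change explicit, the short sketch below applies the standard log-link conversion, exp(β·Δ) − 1, to the reported coefficient of 0.052. The text does not state the unit in which AD is recorded, so the second call assumes, purely for illustration, that one unit of AD equals one thousand impressions; the function name and that scaling are hypothetical.

```python
import math


def pct_change_in_expected_count(beta, delta=1.0):
    """Percentage change in the expected count when the regressor rises by `delta`.

    With a log link, E[y | x] is proportional to exp(beta * x), so the ratio of
    expected counts after and before the change is exp(beta * delta).
    """
    return (math.exp(beta * delta) - 1.0) * 100.0


beta_ad = 0.052  # coefficient of AD reported in Table 2, Column (2)

# Effect of a one-unit increase in AD, in whatever unit AD is recorded.
per_unit = pct_change_in_expected_count(beta_ad)

# ASSUMPTION (not stated in the text): AD is recorded in thousands of
# impressions, so a single impression corresponds to delta = 0.001.
per_impression = pct_change_in_expected_count(beta_ad, delta=0.001)

print(f"per unit of AD: {per_unit:.2f}%")
print(f"per impression (assumed scaling): {per_impression:.4f}%")
```

A one-unit change gives roughly a 5.3% increase in expected sales, while the assumed per-thousand scaling gives roughly 0.005% per impression. The unit of AD therefore matters when reading the per-impression figures.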

Table 2 Baseline results

4.3 Identification

The baseline result in Table 2, Column (2), shows that online advertising has a positive relationship with new house sales. However, the above estimate may be subject to a potential endogeneity issue, as the independent variable of online advertising can be endogenous due to the omission of relevant factors. For instance, larger cities are more economically developed and have better access to the Internet, which could be related to higher advertising investment. At the same time, these cities may have a more active housing market and higher house sales. Although we included a large set of relevant control variables to reduce the risk of omitted variables, we also employ identification strategies to address the potential endogeneity concern. For ease of reference, Table 3, Column (1), presents the baseline FE Poisson results from Table 2, Column (2).

Table 3 Identification

Specifically, we make use of the instrumental variable (IV) and control function (CF) methods (Wooldridge, 2010, 2015). An ideal IV should be correlated with the endogenous variable (AD) but uncorrelated with the error term, so that it affects the dependent variable NU only through AD. Past average values of the endogenous variables have been widely used as IVs in prior research (Brynjolfsson & Hitt, 1996; Ghose, 2009; Hitt & Brynjolfsson, 1996; Li et al., 2019; Luo et al., 2012). Therefore, we first use AvgPastAD_it, the average value of AD over the four months before month t (i.e., the average of the lagged terms from t-4 to t-1), as one IV for the current AD_it. The rationale is that past values of AD should be highly correlated with the potentially endogenous variable at time t (AD_it); however, they are predetermined and, thus, not correlated with the error term at time t. Hence, past values are often accepted as an IV in panel data analysis (Groves et al., 1994; Hitt, 1999). Our second IV builds on the observation that cities in the same tier have similar populations, GDP, and advertiser appeal. Therefore, companies’ investments in online advertising in the focal city are largely related to the scale, pattern, and strategy of online advertising investments in other cities in the same tier as the focal city. Nevertheless, the online advertising in other cities should not influence the housing market of the focal city. Hence, we construct OtherCitiesAD_it, the average value of AD at time t across the other cities in the same tier as the focal city, as another IV for the current AD_it.
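
The construction of the two instruments described above can be sketched in a few lines. This is an illustration only, not the authors' code: the record layout (`city`, `tier`, `month` as an integer month index, `ad`) and the function name are hypothetical, and the sketch assumes that an instrument is left missing whenever any of the four lags, or every same-tier peer, is unobserved, which the text does not specify for the unbalanced panel.

```python
from collections import defaultdict


def build_instruments(rows, n_lags=4):
    """Compute both instruments for each (city, month) observation.

    rows: list of dicts with keys 'city', 'tier', 'month' (int index), 'ad'.
    Returns a dict keyed by (city, month); a value is None when it cannot
    be computed from the available observations.
    """
    ad = {(r["city"], r["month"]): r["ad"] for r in rows}
    by_tier_month = defaultdict(list)
    for r in rows:
        by_tier_month[(r["tier"], r["month"])].append(r)

    out = {}
    for r in rows:
        city, month = r["city"], r["month"]
        # Mean of AD over months t-n_lags .. t-1; every lag must be observed.
        lags = [ad.get((city, month - k)) for k in range(1, n_lags + 1)]
        avg_past = sum(lags) / n_lags if all(v is not None for v in lags) else None
        # Mean of AD over the other same-tier cities in month t (focal city excluded).
        others = [o["ad"] for o in by_tier_month[(r["tier"], month)] if o["city"] != city]
        other_mean = sum(others) / len(others) if others else None
        out[(city, month)] = {"avg_past_ad": avg_past, "other_cities_ad": other_mean}
    return out


# Toy data (made up): three cities in one tier, six months each.
rows = [
    {"city": c, "tier": 1, "month": m, "ad": base * m}
    for c, base in (("A", 1.0), ("B", 10.0), ("C", 100.0))
    for m in range(1, 7)
]
iv = build_instruments(rows)
print(iv[("A", 5)])
```

Excluding the focal city from the tier average matters: including it would make the second instrument mechanically correlated with the city's own advertising shocks.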

First, after choosing the IVs, we use a control function approach to address the endogeneity problem in the fixed effects Poisson model. Following the two-step procedure of control function methods (Mallipeddi et al., 2021; Wu et al., 2021), we generate the control function using our IVs for the endogenous explanatory variable AD and then incorporate it into our nonlinear count model on NU to correct for the endogeneity of AD. As shown in Table 3, Column (2), the coefficient of AD (0.121) remains positive and significant, which suggests that a one-unit increase in AD leads to a 0.121 increase in the expected log count of NU. Equivalently, this means that one more impression in online advertising can lead to a 0.013% increase in the incidence rate of new house units sold. Second, apart from the control function method, we estimate a two-stage least squares model and summarize the results of the two stages in Table 3, Columns (3) and (4). As indicated, the estimates of AD remain positive and significant. We further check the validity of the two IVs jointly. Specifically, the result of the Sargan test (p = 0.2536) for over-identifying restrictions does not reject the validity of the set of two IVs. Moreover, we conduct the weak identification test following the approach of Stock & Yogo (2005). The obtained Cragg-Donald Wald F statistic (557.645) surpasses the Stock-Yogo critical values for weak instruments, suggesting that the IVs are strong. Together, these results suggest that, after accounting for the potential endogeneity issue of AD, online advertising still has a positive impact on new house sales. In particular, one more impression in an online advertising channel leads to an increase of 1.152 new units sold.
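
The two-step control function procedure referred to above can be written schematically as follows. This is a sketch under assumptions: the section does not report the first-stage specification, so a linear first stage with city effects c_i is assumed here, X_it stands for the control variables of Section 4.1, and the symbols π, ρ, and the residual v̂_it are labels introduced for illustration.

```latex
% Step 1 (assumed linear first stage with city effects c_i):
\[
AD_{it} = \pi_1\, AvgPastAD_{it} + \pi_2\, OtherCitiesAD_{it} + X_{it}'\pi_3 + c_i + v_{it},
\qquad \hat{v}_{it} = AD_{it} - \widehat{AD}_{it}
\]
% Step 2 (FE Poisson with the first-stage residual as an extra regressor):
\[
\mathrm{E}\left[ NU_{it} \mid \cdot \right]
= \exp\left( \alpha_i + \beta\, AD_{it} + X_{it}'\gamma + \rho\, \hat{v}_{it} \right)
\]
```

In this formulation, a significant estimate of ρ signals that AD is endogenous, and β is the endogeneity-corrected effect. Because the residual is a generated regressor, second-step standard errors generally need adjustment, for example by bootstrapping both steps (Wooldridge, 2015).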

4.4 Robustness

To show the robustness and consistency of our findings, we further conduct multiple checks in different ways. For ease of reference, Table 4, Column (1), and Table 5, Column (1), present the baseline results from Table 2, Column (2).

Table 4 Robustness (1)
Table 5 Robustness (2)

First, one could be concerned about potential heteroskedasticity that biases our estimated standard errors. To address this concern, we re-estimate our model and report robust standard errors. The results in Table 4, Column (2), show that the impact of online advertising on new house sales remains consistent.

Second, our dependent variable of NU is a count variable. The use of a Poisson model relies on an assumption of equi-dispersion in NU (i.e., an equal mean and variance), whereas real-world data are typically over- or under-dispersed (Wooldridge, 2003). To address the bias due to a potential violation of this assumption, we employ an FE negative binomial (NB) model and summarize the results in Table 4, Column (3). As indicated, the results are still consistent.
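
A quick way to see whether equi-dispersion is plausible is to compare the sample variance of the counts with their sample mean. The sketch below is a rough, unconditional diagnostic only: the Poisson assumption concerns the variance conditional on the covariates, the function name is hypothetical, and the sales figures are made up for illustration.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 under equi-dispersion."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return variance(counts) / m


toy_sales = [120, 340, 95, 410, 260, 180]  # made-up monthly unit counts
ratio = dispersion_ratio(toy_sales)
print("over-dispersed" if ratio > 1 else "not over-dispersed")
```

A ratio well above 1 points to over-dispersion, which is the situation in which a negative binomial specification is commonly considered as a robustness check.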

Third, we check the robustness of our findings across different functional forms and model estimations. Our main analysis employs an FE Poisson model for estimation. Now, we estimate a linear double-log model with the dependent and independent variables log-transformed, a population-averaged (PA) model that allows for an exchangeable correlation structure of a generalized linear model, and a random effects model estimated via maximum likelihood (MLE). The corresponding results for the double-log, PA, and MLE models are shown in Table 4, Columns (4), (5), and (6), respectively. The model parameter estimates remain consistent with those in Column (1).

Fourth, some may be concerned about the unobserved heterogeneities of cities. To alleviate this concern, we employ a random coefficient (RC) modeling approach (Boudreau & Jeppesen, 2015). RC can account for the possibility that house sales vary due to any latent, unobserved city-specific heterogeneities. Our estimation results in Table 4, Column (7), show that the impact of AD remains positive and significant.

One final possible concern is that advertising for houses may take time to affect buyers’ purchasing decisions and generate an impact on house sales. In other words, we should examine the lagged effect of advertising. Hence, we use the lagged, instead of the current, term of AD at various lag lengths (from 1 to 6 months) as the new independent variable. All the results in Table 5 remain consistent. More importantly, the significant lagged effects of AD also help alleviate the potential concern of simultaneity bias.

In sum, all the various tests above establish the robustness and consistency of our findings regarding the positive impact of online advertising on new house sales.

5 Additional analyses

The above results document robust evidence that online advertising may increase new house sales. Next, we conduct additional analyses to offer more research findings and insights.

5.1 Sub-group analysis

To explore how the sales impact of online advertising varies across different conditions and sub-groups, we decompose our data sample into sub-samples based on new house price, resident income, and city tier. Table 6, Column (1), presents the baseline results from Table 2, Column (2).

Table 6 Sub-group analysis

First, we decompose our data sample into two sub-samples based on the mean value of new house price (NP): high-price group (new house price > mean) vs. low-price group (new house price < mean). We re-estimate our model based on these two sub-samples separately and summarize the results in Table 6, Columns (2) and (3). As indicated, the estimate in the high-price group, 0.060 (± 0.002), is lower than that in the low-price group, 0.127 (± 0.006). This suggests that the sales impact of online advertising is stronger when the new house price is lower. Our conjecture is that, when housing prices are lower, more people can afford to buy, so online advertising may reach a larger group of potential buyers and lead to more house sales. This can explain the stronger impact of advertising when housing prices are lower.

Second, we separate our data sample into two sub-groups based on the mean value of resident income (INC): high-income group (income > mean) vs. low-income group (income < mean). We re-estimate our model based on these two sub-samples separately and summarize the results in Table 6, Columns (4) and (5). As shown, the estimate in the high-income group, 0.152 (± 0.003), is positive, whereas the estimate in the low-income group, -0.013 (± 0.002), is slightly negative. We posit that when buyers have higher income, they are more likely to include new houses in their purchase consideration set. Consequently, this allows online advertising to influence more potential buyers, which results in a stronger impact.

Lastly, we conduct a sub-group analysis based on the city tier. Specifically, the high-tier sub-group includes China’s first-tier cities (e.g., Beijing, Shanghai), whereas the low-tier sub-group covers the second- and lower-tier cities. We re-estimate based on these two sub-samples, and the results in Table 6, Columns (6) and (7), show that the online advertising impact is weaker in high-tier cities, 0.058 (± 0.003), than in low-tier cities, 0.151 (± 0.006). We argue that more-developed cities (i.e., cities with higher tiers) are typically more appealing due to their better job resources and quality of living, and, thus, have more immigrants. Finding a house in the local city to secure a stable job and life becomes more essential for those immigrants. In other words, this demand is likely to be rigid in the first place. Thus, online advertising is less likely to affect this demand in such cities. Moreover, housing prices in more developed cities are much higher, and people tend to physically visit those houses before purchasing. However, for lower-tier cities with much lower housing prices, the Internet and online advertising become more useful to inform people (especially external investors) of new houses and attract their purchases. These could be plausible explanations for the weaker impact of online advertising in top-tier cities.
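
The mean-based splits used for the price and income sub-groups can be sketched as below. The sketch assumes the mean is taken over all pooled city-month observations, which the text does not state explicitly; the function name and the toy values are hypothetical. Following the strict inequalities in the text, an observation exactly equal to the mean falls into neither group.

```python
def split_by_mean(rows, key):
    """Return (high, low, mean) for `key`, using strict > and < comparisons."""
    avg = sum(r[key] for r in rows) / len(rows)
    high = [r for r in rows if r[key] > avg]
    low = [r for r in rows if r[key] < avg]
    return high, low, avg


# Toy city-month observations with made-up new house prices (NP).
obs = [
    {"city": "A", "np": 10},
    {"city": "B", "np": 20},
    {"city": "C", "np": 30},
    {"city": "D", "np": 60},
]
high, low, avg = split_by_mean(obs, "np")
print(avg, [r["city"] for r in high], [r["city"] for r in low])
```

Each sub-sample would then be passed to the same estimation routine as the baseline model, so that the AD coefficients can be compared across groups.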

5.2 Analysis of other dependent variables

In addition to the dependent variable of new house sales, we next explore how online advertising affects other outcomes to provide a more complete understanding of the impact of online advertising.

First, we seek to understand the mechanism of the impact of online advertising on new house sales. We estimate the impact of online advertising on new house prices. The results in Table 7, Column (1), show that advertising lowers new house prices. We believe this occurs because advertising conveys rich product information to buyers and reduces buyers’ search costs (Ekelund et al., 1995). The reduced cost results in the lower price, which further boosts sales. Moreover, we estimate the impact of online advertising on the number of immigrants into and emigrants from the city. As shown in Table 7, Columns (2) and (3), online advertising attracts more immigrants into the city and lowers the number of emigrants, that is, it keeps more existing residents from leaving the city. We suspect these results are due to the awareness effect of advertising (Clark et al., 2009). Specifically, online advertising, which has no space or time restrictions, informs potential buyers who live within or outside the city of housing information. This helps the buyers find their ideal houses and, thus, attracts more immigrants and retains more residents. As a result, this larger group of buyers may generate more sales.

Table 7 Analysis of other dependent variables

Second, it has been widely documented that advertising may have a spillover effect (Erdem & Sun, 2002; Rutz & Bucklin, 2011; Sahni, 2016), such that the advertising on a product may also affect other related products. Thus, we explore whether the advertising on new houses affects the price and sales of related second-hand houses. The results in Table 7, Columns (4) and (5), show that the estimates of advertising are insignificant. This suggests that the advertising spillover effect may not exist in the housing market.

6 Discussion and contribution

By collecting and analyzing a rich data set on online advertising and housing markets, we identify notable findings. First, we find that online advertising on new houses increases the sales of these houses. This sales impact is stronger with lower new house prices, higher resident income, and lower-tier cities. Additionally, we show that this effect could arise because online advertising lowers new house prices, attracts more immigrants, and reduces emigration, forming a potentially larger group of buyers. Finally, we find that online advertising in the new housing market is unlikely to have spillover effects in the related second-hand housing market in terms of price or sales.

Our research findings make the following contributions. First, our research is one of the few in the information systems and marketing literature to explore the relationship between online advertising and the real estate market. Prior studies mainly document the impacts of online advertising in consumer goods markets (Goldfarb & Tucker, 2011a; Gopinath et al., 2013; Manchanda et al., 2006; Taylor et al., 2013). However, they do not focus on real estate assets or houses. Our research addresses this gap and extends the information systems and marketing literature.

Second, we contribute to the information systems and marketing literature by examining and reporting several contingent factors for the sales impact of online advertising. We document that the advertising impact is not uniform but depends on important factors, such as housing price, resident income, and city tier. These contribute to the literature by adding more important insights.

Third, our study is the first to explore the underlying mechanism of the impact of online advertising on house sales, via aspects such as housing price, immigrants, and emigrants. More interestingly, unlike other studies validating the spillover effects of advertising in other markets (Erdem & Sun, 2002; Rutz & Bucklin, 2011; Sahni, 2016), our research suggests that the spillover effect may not exist in the housing market. Thus, this contributes to relevant literature by adding different research findings and conclusions.

Finally, our research findings offer important implications to practitioners and policy makers. For instance, our findings reveal that online advertising is still impactful in driving sales, even in the real estate market, which involves a long and complex purchasing process. Thus, real estate developers who seek to promote house sales, or government agencies that intend to regulate house sales, can perhaps rely on online advertising as a tool, especially in combination with factors such as housing price, resident income, and city tier. Moreover, the reported underlying mechanism of the sales impact of online advertising could help practitioners better understand the nature of online advertising in the housing market. The opened “black box” of online advertising also allows practitioners to better track and leverage the impact of online advertising.

7 Conclusion and limitations

In this research, we build upon the stream of literature on the usage and mechanisms of online advertising to investigate how online advertising affects the sales of new houses and explore potential contingent factors and underlying mechanisms in the real estate market, which differs from the widely studied consumer goods market in the literature.

Despite our notable findings and important contributions, we acknowledge some limitations. First, we only have data on advertising impressions and were not able to collect advertising expenditure data. Thus, we could not provide a more direct estimation of the relationship between advertising expenditure and sales to offer “dollar-value” insights. Second, we did not have access to more granular data (e.g., clickstreams and behavioral data) that capture whether buyers observe or click on advertisements and their subsequent decision making; such data would have allowed us to conduct more in-depth investigations. Overall, these limitations may serve as important angles for future research.