1 Introduction

As Stockholm’s city evolves with new construction projects, so do its residents. Our study focuses on the impact of new housing investments on the city’s inhabitants and investigates whether these changes lead to shifts in demographics, socioeconomic status, and housing affordability. With an eye on the city’s ongoing transformation, this research sheds light on the relationship between urban development and the lives of those who call Stockholm home.

The research question has been analysed in many studies over time (see, for example, Rabiega et al., 1984; Simons et al., 1998; Ding et al., 2000; Ellen et al., 2001; Schwartz et al., 2006; Ooi & Le, 2013; Zahirovich-Herbert & Gibler, 2014; Kurvinen & Vihola, 2016; Brunes et al., 2020; Fernandez et al., 2021). Most of these studies have analysed the effect of new housing investments on housing prices and, in a few isolated cases, on rental prices in nearby areas. For example, Been et al. (2019) and Mast (2021) recently analysed affordability after new housing units were built in the neighbourhood. The question remains whether new housing investments lead to the gentrification of entire residential areas in the long term when new residential buildings are built. Lower affordability is a step toward gentrification that can eventually lead to the displacement of the people who originally inhabited the area. However, few studies have analysed the issue of gentrification by measuring the change in population in terms of educational background and age distribution. One exception is Dong (2017), who analysed various indicators of gentrification in connection with rapid train development. There is a research gap on new housing investments and their effects on gentrification.

The primary objective is to explore whether residential projects have affected the socioeconomic background and affordability of housing. The main research questions are whether new housing construction affects gentrification and affordability in Stockholm.

We address these questions by analysing more than 200 new residential construction projects in Stockholm, Sweden, from 2009 to 2014. We explore the influence of these projects on residents’ socioeconomic backgrounds and on housing affordability. We use an unbalanced panel data set from 2000 to 2020 for the following variables: education, demographics, income, and housing prices in 2,168 geographical units, where units are 250 × 250 m squares. We apply the difference-in-difference method with propensity score matching and inverse probability weighting.

This article contributes to the literature on gentrification and affordability in three ways. The main contribution of our paper is an empirical analysis to determine the spillover impacts of new residential constructions in the Stockholm municipality and to document the variability of these impacts. We implement a set of simulation estimates to explore the influence of new residential constructions on residents’ socioeconomic characteristics and housing affordability. Although difference-in-difference has been used in many studies (see, for example, Ellen et al., 2001; Schwartz et al., 2006; Ooi & Le, 2013; Lee et al., 2017; Diamond & McQuade, 2019; Brunes et al., 2020; Fernandez et al., 2021), we contribute to developing the methodology by combining the method with propensity score weighting, by including spatial covariates and by estimating models where new housing investments are excluded from the treatment effect. Most previous studies have focused on the price effects of new housing investments in nearby housing areas, although some also analyse the effect on rental housing market affordability. To our knowledge, no one has previously analysed gentrification in combination with affordability resulting from new housing investments.

The remainder of the paper is divided into six major sections. Section 2 reviews previous studies related to city change and residential construction, and we explain our research methodology and the different modelling approaches in Sect. 3. Section 4 describes the Stockholm case study, and Sect. 5 presents the data sets and descriptive statistics. Section 6 discusses the results of the difference-in-difference estimates. Finally, the paper discusses the study’s implications, limitations, and possible future studies in Sect. 7.

2 Literature review

There is a growing debate about whether new residential constructions affect the surrounding area in terms of gentrification and affordability. Some argue that new constructions contribute to a supply effect, which should relieve the demand for existing housing and reduce rents and prices. Others argue that new construction will lead to gentrification by attracting high-income households and new amenities, increasing housing prices and rents, and impacting the affordability of surrounding housing (Li, 2019). Been et al. (2019) and Mast (2021) conclude that adding new housing units lowers price increases and makes housing more affordable for low- and middle-income households. Li (2019) studied the effect of new high-rises on nearby residential rents and sales in New York City. The results show that for every 10% increase in housing stock within a 500-foot buffer, rents decrease by 1% and sale prices also decrease, as newly built housing units reduce demand for existing housing units. Conversely, the inability of the supply of new homes to meet the demand for housing results in an increase in the price of existing homes (Glaeser & Gyourko, 2018).

However, there is growing opposition to new developments in the face of rising prices, and there is scepticism about whether increasing the supply of market-rate housing will enhance housing affordability. One of the arguments is that the increase in land prices in many cities and the failure to allocate land to affordable housing will contribute to the increase in prices. New housing at market price causes other housing to be designated for low-income families. More stringent local land use regulations restrict supply, resulting in fewer new constructions, higher prices, and lower affordability (Been et al., 2019).

Many studies also focus on the consequences of residential construction and the negative impact of gentrification. Higher housing costs (in terms of increased rents or house prices) for low-income households worsen affordability and result in the displacement of existing residents (Atkinson, 2004; Chum, 2015; Walks et al., 2021). Hankinson (2018) indicates that homeowners are sensitive to housing proximity, unlike renters who typically do not express NIMBYism (not in my backyard). Instead, renters show high support for new residential constructions in the city. However, in cities where housing prices and rents are high, renters show the same degree of NIMBYism as homeowners, despite their support for significant increases in the city’s housing supply, because renters view new nearby developments as economically threatening in terms of increasing local prices. Therefore, renters in such cities support new residential developments, but not developments in their neighbourhoods. Brunes et al. (2020) analysed how nearby property prices are affected by new construction projects in Stockholm. Their study sample included more than 90,000 observations from 2005 to 2013, in which they used the difference-in-difference method in a hedonic model. The results indicated an increase in housing prices in nearby areas after completion of infill development, resulting in a positive spillover effect on nearby property prices only in lower-income areas. A recent study by Wilhelmsson (2023) investigates how housing construction affects the value of single-family homes in Stockholm, Sweden, and its implications for urban planning. Using the difference-in-difference methodology, the study analysed data from 480 housing projects and 17,000 home transactions between 2005 and 2018. Findings reveal that multifamily constructions do not impact the value of nearby single-family homes, while single-family constructions negatively affect them. The study suggests that urban planners should carefully consider the location of new single-family homes to maintain property values and create equitable and sustainable urban environments.

There is much debate about the consequences of gentrification (Atkinson, 2000; Smets & van Weesep, 1995; van Eijk, 2016). Wilhelmsson et al. (2021) measured gentrification and related gentrification to property values in a case study in Stockholm, Sweden. They used Getis-Ord statistics to identify and quantify gentrification in different residential areas. The results indicate that proximity to the gentrified area increases property values by 6 to 8%.

Zahirovich-Herbert and Gibler (2014) examined the effect of new residential construction on surrounding property values in Baton Rouge, Louisiana, over 18 years. The results show a positive correlation between newly constructed houses and surrounding residential properties. The results of Ooi and Le (2013) indicated a positive impact of new constructions on local housing prices in Singapore. Dube et al. (2023) examined whether the proximity to redevelopment projects influences the prices of single-family houses in Quebec City, Canada. The study showed that residential reconversions lead to a mean net price premium of about 2.48%. In the same vein, Kurvinen and Vihola (2016) studied the impact of multi-storey apartment constructions on surrounding apartment values in the Helsinki metropolitan area in Finland. The results showed significant evidence of a positive impact on the values within a radius of 300 m. The empirical results of the Liang et al. (2020) study indicated that urban renewal caused a continuous response in neighborhood housing prices even before the completion of reconstruction.

Many empirical studies, such as Zahirovich-Herbert and Gibler (2014), focused on the effect of subsidised housing on surrounding property values. Diamond and McQuade (2019) analysed the effect of affordable housing constructions on residents of the surrounding neighbourhood by estimating spillovers of residential construction financed by the Low-Income Housing Tax Credit (LIHTC) using the difference-in-difference method. LIHTC is a program founded in 1986, which financed 21% of all multifamily constructions from 1987 to 2008 to ensure low-income households had access to affordable housing. The spillover effect of affordable construction on residents of surrounding neighbourhoods varies between demographically different neighbourhoods. LIHTC development in low-income areas increases house prices by 6.5%, revitalises neighbourhoods, and attracts households with diverse incomes and ethnic backgrounds, while in high-income neighbourhoods, new developments decrease house prices by 2.5% and attract lower-income households. Consistent with these results, Baum-Snow and Marion (2009) and Schwartz et al. (2006) examined the external effects of subsidised housing in New York City, using hedonic regression with a difference-in-difference model. The results indicated significant and positive spillover effects of subsidised housing investment. The spillover effects increase with the size of the project, whereas the external effects decrease with the distance from the project sites. These results correspond to the results of Rabiega et al. (1984), Santiago et al. (2001), and Ellen et al. (2001). DeSalvo (1974) studied a sample of 50 New York City neighbourhoods. The results showed a positive effect of subsidised housing construction on property values. The assessed values increased by 9.89% annually, while the increase in the control areas was 4.64% annually. Olsen (2003) emphasised the role and importance of affordable housing in addressing failures in the housing market and providing access to housing for low-income households.

Regarding the size of residential construction and its effect on nearby property values, Ding et al. (2000) analysed the influence of new and rehabilitation residential investments on nearby property values in Cleveland, Ohio. The results indicated a positive effect of new construction on property values, where houses within 150 feet of new construction sold for $4,500 more. There is no effect on house values further than 300 feet away, and the effect is more significant in lower-income neighbourhoods. Their research suggests that small-scale construction investments do not impact nearby property values and that the influence varies with the number of units in the new residential construction (size). Therefore, investment policy must promote and encourage significant investments to improve neighbourhoods. The Ding et al. (2000) study is an extension of Simons et al. (1998).

Finally, there is a growing literature that focusses on the role of amenities in attracting higher-income individuals and potentially increasing rents and sale prices (Diamond & McQuade, 2019; Kennedy & Leonard, 2001; Trojanek & Gluszak, 2018). Local amenities are capitalised into real estate values (Banzhaf & Farooque, 2013; Bayer et al., 2007).

3 Methodology

We analysed the effect of more than 200 building projects from 2009 to 2014 in Stockholm, Sweden. Of course, the issue of causality and control for confounders is central. Using only observational data, we set up a quasi-experimental design similar to a treatment effect study. We used the difference-in-difference methodology (Angrist & Pischke, 2009; Heckman et al., 1997) with propensity score matching (Dong, 2017; Rosenbaum & Rubin, 1983) and inverse probability weighting (Abadie, 2005). Ellen et al. (2001), Schwartz et al. (2006), Ooi and Le (2013), Diamond and McQuade (2019), and Fernandez et al. (2021) have all used the difference-in-difference approach to empirically analyse the impact of new residential construction on property values. Dong (2017) is an example of the difference-in-difference approach with propensity score weighting.

The construction of multifamily apartment buildings is the event/treatment that may affect the surrounding area (treated area) compared to a control area (we will use the term untreated area). We collect data on the outcome and control variables before and after the new construction. Not all residential areas in the study area are equally subject to housing construction, and residential development areas are not randomly selected, which can result in selection bias. Therefore, we use the propensity score methodology, where we weight observations (weighted least squares) with respect to their probability of being a construction area. In this way, we compare areas that were similar in terms of income and price level, as well as distance to public transportation and the central business district (CBD), before the implementation of the project. ‘Projects’ refers to housing projects with rental and owner-occupied apartments. Some projects are large and part of a larger residential development area, while others are more isolated development projects within the city limits.

The outcome variables that we analyse are as follows: household income, affordability, educational background, and age. Similar outcome variables have been used by Dong (2017). Differences in outcome variables are analysed before and after construction, as is the difference between the immediate area where the project can be expected to have an effect (treatment area) and a comparison area where it cannot be expected to have an effect (untreated area). Assumptions about the size of the treatment and untreated areas are essential, and we have handled this mainly by testing different assumptions about the Euclidean distance to the projects.

The observation units consist of 250 × 250 m squares, and the total number of squares amounts to 2,168. We have estimated models in which the square where the building project is located has been both included in and excluded from the estimates. Our model is a so-called ‘two-way fixed effect model’ commonly used in the difference-in-difference methodology based on panel data. We include fixed effects for planning areas (so-called “DeSo” areas) and time to control for omitted variable bias. The areas under study are similar to but smaller than the ZIP code areas used by Ellen et al. (2001) and the census tracts used for fixed effects by Czurylo et al. (2023). The difference-in-difference approach, combined with fixed effects, decreases the risk of bias from time-invariant confounding factors. However, the interpretation of estimated treatment effects can be problematic because fixed area effects are assumed to be constant over time and fixed time effects are assumed to be constant across space (Imai and Kim, 2021). Some caution is therefore required when interpreting the results. The results of the difference-in-difference model will be presented with and without fixed effects.

We have also included accessibility measures among our variables, including the distance to the CBD and the closest metro station. We visually tested the parallel trend assumption by graphing the trends before and after the event. The difference-in-difference equation can be stated as follows:

$${Y}_{i,t}={\alpha }_{k,t}+{\lambda }_{1}{Treat}_{i,j}+{\lambda }_{2}{Post}_{i,t}+{\lambda }_{3}{(Treat*Post)}_{i,j,t}+\beta {X}_{i,t}+{\varepsilon }_{i,t}$$
(1)

where the subscript i equals the lowest geographical unit of the square, j equals the treatment area, k equals the planning area, and t equals the year. The outcome variable is Y, and “Treat” equals the treatment area. The variable “Post” equals the years after the construction, and (Treat*Post) is the interaction variable between the treatment area and the period after construction. X is a vector of other covariates, such as the distance to public transportation and CBD. The models have been estimated with ordinary least squares (OLS). In the weighted least-squares model, we use W as the probability weight. The weights are based on propensity scores and are equal to [1/propensity score] for areas with residential construction and [1/(1–propensity score)] elsewhere, following Freedman and Berk (2008) and Cole and Hernán (2008).
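The weighting rule described above can be illustrated with a minimal sketch. The function names `ipw_weight` and `weighted_mean` and the example propensity scores are hypothetical and chosen only for illustration; the propensity score model itself is not shown, and the scores are assumed to lie strictly between 0 and 1.

```python
def ipw_weight(treated, propensity):
    """Inverse probability weight for one observation.

    treated    -- True for areas with residential construction
    propensity -- estimated probability of being a construction area
    """
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    # 1/p for construction areas, 1/(1 - p) elsewhere
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)


def weighted_mean(values, weights):
    """Weighted average, the building block of a weighted least-squares fit."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical propensity scores (not estimated from the paper's data)
w_treated = ipw_weight(True, 0.25)     # unlikely construction area that was built on
w_untreated = ipw_weight(False, 0.25)  # unlikely construction area left untreated
```

Under this rule, a treated square that looked unlikely to receive construction gets a large weight, which is how the weighting rebalances the treated and untreated groups on observed characteristics.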

All Greek letters are the parameters that will be estimated, and the parameter of concern is parameter λ3. If the parameter is positive, it is interpreted as a positive causal relationship between residential constructions and the outcome variable in the treatment area compared to the untreated area. Parameter λ1, the coefficient on Treat, indicates whether the outcome variable was statistically different in the treatment area before construction, and parameter λ2, the coefficient on Post, indicates if the outcome variable has changed during the study period (before and after construction).
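To make the role of each parameter in Eq. (1) concrete, the following minimal sketch covers the simplest case only: a constant intercept and no covariates (β = 0). In that saturated two-by-two case, the OLS coefficients coincide with differences of group means, so no regression library is needed. The function name `did_estimate` and the data values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def did_estimate(rows):
    """Return (lambda1, lambda2, lambda3) from the 2x2 table of group means.

    Each row is (treat, post, y) with treat and post coded 0/1.
    """
    def cell(treat, post):
        return mean(y for t, p, y in rows if t == treat and p == post)

    base = cell(0, 0)                # untreated area, before construction
    lambda1 = cell(1, 0) - base      # coefficient on Treat: pre-period gap
    lambda2 = cell(0, 1) - base      # coefficient on Post: change over time
    # coefficient on Treat*Post: change in treated minus change in untreated
    lambda3 = (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - base)
    return lambda1, lambda2, lambda3


# Hypothetical outcome values (for illustration only)
rows = [
    (0, 0, 100.0), (0, 0, 102.0),
    (0, 1, 110.0), (0, 1, 112.0),
    (1, 0, 95.0), (1, 0, 97.0),
    (1, 1, 109.0), (1, 1, 111.0),
]
lambda1, lambda2, lambda3 = did_estimate(rows)
```

In this made-up example the treated area starts 5 units below the untreated area, both areas grow over time, and the treated area grows by 4 units more, which is the quantity λ3 captures. Adding fixed effects, covariates, or weights requires a full regression rather than this shortcut.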

In model 1, we assume that αk,t is constant and that β is equal to zero, while in model 2, we relax the assumption that αk,t is constant. In model 3, we relax the assumption that β equals zero; in model 4, we exclude square i where actual construction occurs. These models have been estimated with OLS, while the final model, model 5, is estimated with weighted least squares (WLS).

As stated previously, we use propensity score weighting to reduce selection bias. However, there are indications that weighting using the propensity score will increase random error, and estimated standard error will have a downward bias (Freedman & Berk, 2008). Imai and Ratkovic (2014) showed that the risk of misspecification of the propensity score model could significantly affect treatment effects. Therefore, caution is necessary when interpreting the causal relationships estimated by the weighted least squares using propensity scores as weights, and we also present the OLS results for comparison.

4 The case study of Stockholm

Our case study is the city of Stockholm, Sweden’s capital, with approximately 1 million inhabitants. From an international perspective, it is a relatively small city; including the surrounding municipalities (Stockholm County), the population is approximately 2.3 million. The county’s population has grown by 0.7 million since 1990, from approximately 1.6 million inhabitants.

The housing construction plans in Stockholm County for 2019–2030 include approximately 280,000 apartments. Most of these plans refer to apartment buildings (85%). The expansion plans mean that the total number of homes will increase by approximately 25% in just 10 years. Compared to the previous 10-year period, expansion plans are almost doubled. Thus, these plans are very ambitious and will place significant demands on where and how to build. Approximately a third of the planned housing construction will occur within the Stockholm municipality. The municipality is relatively densely populated and has buildable land for new projects in existing areas.

The construction aims to meet the needs of increased immigration to metropolitan regions. Housing supply is the biggest challenge for future economic growth in the region. At the same time, there is a need to reduce the threshold for entering the housing market for both young adults and newcomers. Residential construction aims to create more attractive, high-quality living environments that meet today’s environmental requirements. In addition to quantifiable objectives regarding the number of new homes in the municipality, Stockholm’s goal is for housing stock to counteract segregation and promote attractive living environments throughout the city.

Economic, environmental, and social sustainability are essential for housing construction. Therefore, the question of the effects of housing construction on segregation, gentrification, and affordability is central and more research is needed.

There may be an inherent contradiction between creating attractive living environments and gentrification. Gentrification implies a social process in which the composition of the neighbourhood population changes. Households with higher income move in and displace households with lower incomes and social status. New construction projects can contribute to gentrification by making the living environment within the district more attractive, resulting in higher housing prices. Over time, only households with higher incomes can move into the neighbourhood, and the neighbourhood becomes gentrified over the long term. The process is relatively slow, as there is a certain sluggishness to the housing market, making it challenging to analyse gentrification and its causes. As a case study, Stockholm is interesting to analyse, as it has a goal of substantial housing expansion, but at the same time, it has a clear social ambition regarding future housing expansion and does not want to create increased segregation and gentrification in existing neighbourhoods.

5 Data and descriptive statistics

We used an unbalanced panel data set for 2000–2020 in 2,168 geographical units. Units are 250 × 250 m squares and data on education, demographics, income, and housing prices are available for each square. In total, our panel data consist of 45,352 observations. Concerning income, we only have data for 2000, 2007, and 2013–2020, and housing prices are available for 2005–2018. Data on educational background and demographics of the population are available for all years. All observational units are geocoded. In addition to the outcome variables, we also have information on access to the nearest subway station and proximity to the city centre at all locations in the study area.

Data on income and population come from Statistics Sweden. Information on housing prices comes from Swedish Brokerage Statistics (Svensk Mäklarstatistik), and statistics regarding new multifamily residential houses come from the City of Stockholm. The number of constructions was 216 between 2009 and 2014, and most new apartments were built between 2012 and 2014 (148 new multifamily houses). The constructions are depicted in Fig. 1.

Fig. 1 Residential constructions in Stockholm (2009–2014)

Three things can be noted regarding the geographical spread of new residential properties. There has been a greater concentration of construction in the southern parts of Stockholm, and the size of the projects has varied considerably. The project areas in the north are concentrated in only a few areas, while several in the south are spread over a larger geographical area. What may be important in estimating the difference-in-difference models is that there may be an inevitable interdependence between new buildings, and project size may have a particular significance for the outcome. Higher concentrations of projects and larger projects are expected to have more significant potential impacts on affordability and gentrification.

The descriptive statistics on the outcome variables and some covariates (distance to the subway station and distance to CBD) are presented in Table 1. Statistics refer to the mean and standard deviation in the treated group (observations within 1 km of new construction) and the untreated group (observations further than 1 km from, but within 10 km of, new construction). The ring around the new housing investments is larger than what has been used in previous studies; for example, Ooi and Le (2013) used 500 m, and Schwartz et al. (2006) and Ellen et al. (2001) used 2000 feet (around 610 m). As an alternative to a 1000 m radius around the new housing, we tested the variables using a 500 m radius.
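The assignment of squares to the treated and untreated groups can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: each square is represented by a single point (for example its centroid), coordinates are planar and measured in metres, and boundary cases are treated as inclusive. The function name `classify_square` and all coordinates are hypothetical.

```python
import math


def classify_square(square_xy, project_xys,
                    treat_radius_m=1000.0, outer_radius_m=10000.0):
    """Classify one square by Euclidean distance to its nearest project."""
    sx, sy = square_xy
    nearest = min(math.hypot(sx - px, sy - py) for px, py in project_xys)
    if nearest <= treat_radius_m:
        return "treated"      # within 1 km of new construction
    if nearest <= outer_radius_m:
        return "untreated"    # further than 1 km but within 10 km
    return "excluded"         # beyond the 10 km comparison ring


# Hypothetical project locations and squares (metres, planar coordinates)
projects = [(0.0, 0.0), (5000.0, 0.0)]
near = classify_square((300.0, 400.0), projects)      # 500 m from a project
mid = classify_square((5000.0, 3000.0), projects)     # 3 km from a project
far = classify_square((0.0, 20000.0), projects)       # 20 km from a project
```

Setting `treat_radius_m=500.0` gives the alternative 500 m specification mentioned above.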

Table 1 Descriptive statistics

As we analyse more than 200 construction projects spread throughout the city, a treatment area of 1 km will cover a large part of the city and thus a large part of the analysis units. There are approximately 45,000 observations in the panel; approximately 75% of these are treated, while only 25% are untreated. This, of course, requires that the characteristics of the treated squares be comparable to those of the untreated squares. As mentioned above, the panel is unbalanced, so we lack information on income and affordability for earlier years. In essence, we have only two or three observations before the new residential constructions took place.

Regarding household income, we can say that we have 14,430 treated units and 4,632 untreated units. Average income is comparable between the two groups, but the variation around average income is significantly greater in the untreated group. ‘Affordability’ is defined as average income divided by average housing prices in the geographical unit (the inverse of income affordability used in, e.g., Gan & Hill, 2009). Affordability follows the same geographical pattern, i.e., the mean value is equivalent between the two groups, but the standard deviation is slightly higher in the untreated group.
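The affordability measure and the treated share implied by the counts above reduce to simple arithmetic, sketched here. The function name `affordability` and the income and price levels are hypothetical; only the two observation counts come from the text.

```python
def affordability(mean_income, mean_price):
    """Affordability of a square: average income divided by average housing price."""
    if mean_price <= 0:
        raise ValueError("mean_price must be positive")
    return mean_income / mean_price


# Treated share among the income observations reported in the text
treated_obs, untreated_obs = 14_430, 4_632
treated_share = treated_obs / (treated_obs + untreated_obs)  # roughly three quarters

# Hypothetical income and price levels (not taken from Table 1)
example = affordability(mean_income=400_000.0, mean_price=4_000_000.0)
```

Because the ratio has income in the numerator, a higher value means more affordable housing, which is the reverse of the usual price-to-income measure.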

Higher education is the share of people in the area with a college education, and ‘younger’ is the share of the age group 25 to 44 in the area, following Dong (2017). With respect to higher education and younger people, the share is higher in the treated group, while the spread is equal. In the treated group, 54% have a college education, as do 50% of the untreated group. The proportion of individuals aged 25–44 is 32% in the treated group but only 27% in the untreated group.

We can observe a larger difference when analysing the covariates, i.e., the distance to the nearest metro station and the distance to the city. Areas with new construction are closer to metro stations than untreated areas, although the variation in the untreated group is substantial. In the treated group, the minimum distance to the nearest metro station is approximately 700 m, while the distance in the untreated group is 1,300 m. Surprisingly, we can observe that new construction projects are closer to the city than areas that have not yet been built. The average distance to the city is 6.5 km in the treatment group and 9.7 km in the untreated group. This justifies the use of covariates in the difference-in-difference estimates. In Fig. 2, we illustrate the geographical spread regarding affordability and income (left), higher education, and younger (right).

Fig. 2 Income, affordability, education, and younger in Stockholm (2015)

The map on the left shows the geographical spread of affordability. The redder an area is, the more affordable it is, while the yellower an area, the less affordable it is. There is no clear tendency suggesting that the closer we get to the city, the less affordable housing is, but there are areas in the city centre where prices are lower relative to income, and in the suburbs, prices are relatively higher than income.

The map on the right illustrates the geographical distribution of the proportion with higher education. The bluer an area becomes, the more people with higher education live there. Here, the pattern is much more evident with more people with higher education in the central and northwest parts of the city. While the second map on the right shows that younger people are concentrated in the suburbs of Stockholm, older people are concentrated in the central parts of the city. Both affordability and higher education are outcome variables. In the difference-in-difference models, we will include fixed geographical effects that will effectively capture the geographical spread that exists in the city for all outcome variables.

6 Difference-in-difference estimates

In this section, we present our results from the difference-in-difference models. Four different outcome variables will be used: income, affordability, the proportion with higher education in the population, and the proportion in the age group 25–44 years. Income and affordability refer to the effects that can trigger a gentrification process, and higher education and age structure are more long-term effects due to a gentrification process. All outcome variables have been analysed using the five model specifications presented previously. In addition to these models, we will present two models regarding the outcome variable income where we (1) relax the assumption about the untreated area and (2) analyse a larger impact area (treatment area).

6.1 Income and affordability effects

Table 2 shows the difference-in-difference estimates of the effect on income from the construction of apartment buildings. Table 3 shows the results in terms of affordability.

Table 2 Difference-in-difference estimates (outcome variable = income)
Table 3 Difference-in-difference estimates (outcome variable = affordability)

Regardless of the specification (except for the first model, which was without fixed effects and additional covariates), housing projects generally increase incomes in the immediate area. The degree of explanation is about 44% in models where fixed area and year effects are included, which can be considered relatively good. In the model that includes other covariates, the degree of explanation increases to approximately 56%. We can also note that the parallel trends assumption is met (see Fig. A1 in the appendix), even though the number of observations before construction is limited.

We can state that incomes have increased over time, before and after treatment, and that new construction projects are found in areas with slightly (but statistically significantly) higher income levels. In general, we can notice that the models without fixed effects produce a slightly higher estimate than the models where we have a lower omitted variable bias. The results are also relatively robust across the different models.

Housing projects have had a positive and statistically significant effect on income of approximately 3 to 5%, a relatively strong impact. Interestingly, the new production occurred primarily in areas with lower income than the untreated areas. On average, income is 5% lower in areas where the municipality later built homes. It is worth noting that the difference between Models 3 and 4 is that squares where there had been residential construction are not included in Model 4; i.e., household income in the newly constructed dwellings is not included in the treatment effect.

How can the result of the affordability model be interpreted? As with the income model, the degree of explanation is below 50%: between 2 and 49%. A higher degree of explanation would have been desirable, as there is a risk of omitted variable bias or an endogeneity problem. The estimate shows that housing affordability decreased between the periods before and after 2009–2014. This means that housing prices have increased more than incomes, which is unsustainable in the long run. It is also clear that the housing projects carried out during this period did not occur in areas that were more or less affordable than the rest of Stockholm. We can also state that there has been no causal effect on affordability in the areas where housing construction occurred. This can be due to many things, but it is clear that the increase in income level that we observe has been accompanied by housing prices rising at the same rate, thus leaving affordability unaffected. The interpretation is the same regardless of the model specification.
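The arithmetic behind this interpretation can be sketched as follows, assuming for illustration that affordability is measured as a price-to-income ratio (this section does not define the measure, so that is an assumption, as are the function names and all figures below). When prices and incomes grow at the same rate, the ratio is unchanged; when prices grow faster, the ratio rises and affordability worsens.

```python
def price_to_income(price, income):
    """Price-to-income ratio; a higher value means lower affordability."""
    return price / income

def ratio_after_growth(price, income, price_growth, income_growth):
    """Ratio after applying proportional growth rates to price and income."""
    return price_to_income(price * (1 + price_growth), income * (1 + income_growth))

# Hypothetical values, not taken from the study's data.
base = price_to_income(3_000_000, 400_000)
same_rate = ratio_after_growth(3_000_000, 400_000, 0.04, 0.04)    # equal growth
prices_faster = ratio_after_growth(3_000_000, 400_000, 0.10, 0.04)  # prices outpace income
```

Here `same_rate` equals `base` up to floating-point rounding, while `prices_faster` exceeds it.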

6.2 Educational background and demographic effects

As we have discussed before, changes in the composition of households in different residential areas are slow-moving. We analyse the effect of housing investments made between 2009 and 2014 and measure the effect from 2015 to 2020. Of course, this is a short period if it is the long-term effects we want to analyse. In this section, we analyse the effects of housing investments on gentrification, i.e., whether there has been a social process in which residents in the area have moved out after the investment. We analyse the population’s educational background and age structure in the area of influence. The results of the difference-in-difference models are presented in Tables 4 and 5.

Table 4 Difference-in-difference estimates (outcome variable = education)
Table 5 Difference-in-difference estimates (outcome variable = young)

The effect on higher education is not as straightforward as that on household income. Some models show a causal effect of housing construction on higher education, while others show no effect. All model estimates show that the proportion with higher education has generally increased between the pre- and post-periods. Except for the fixed-effects model, the results indicate that areas with new housing construction do not differ with respect to the proportion with higher education. The effect of treatment is not conclusive. In the default and fixed-effects models, we estimate that housing construction has increased the proportion of people with higher education in the treatment areas compared to the untreated areas. Taken on their own, these models would indicate a form of gentrification in the immediate area surrounding the new homes. However, the effect is not statistically significant in the models where we control for covariates such as proximity to the metro station and city centre and the distance to other housing projects. This also applies to the model that excludes the construction area and to the WLS model.

The estimate of the causal effect of housing construction on population age is perhaps the clearest, at least in the short term: we cannot observe any effect at all. The outcome variable is the percentage of the population aged 25 to 44 years. The expectation was that newly produced homes in a residential area would primarily attract younger households and thereby affect other homes in the neighbourhood, but that was not the case. Housing construction has not affected the population structure, at least not in terms of age. On the other hand, the new production has mainly taken place in areas where the population is already younger, which can be explained by the fact that many projects have taken place in areas further away from the city’s central parts.

6.3 Robustness tests

Of course, our assumptions about the treatment and untreated areas are central to estimating the difference-in-difference model. To ensure that these assumptions do not drive the result, we have performed two robustness tests in which we vary the size of the treatment and untreated areas. In the default model, we have assumed that the untreated area (untreated observations) is 1–10 km from residential construction projects. In the first test, we expand the untreated area to cover the entire city of Stockholm. In the second test, we change both the treatment and the untreated areas: we estimate a model where the treatment area is closer to the housing project (within 500 m) and the untreated area consists of the interval from 500 m to 5 km. The results are presented in Tables 6 and 7.
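The way the two tests redraw the areas can be illustrated with a small sketch that classifies a grid square by its distance to the nearest housing project. The function name `assign_group`, its parameters, and the choice to place a square lying exactly on the boundary in the treatment area are assumptions made for illustration; they are not taken from the estimation code behind Tables 6 and 7.

```python
def assign_group(distance_m, treat_max_m, control_min_m, control_max_m=None):
    """Classify a square by distance (in metres) to the nearest project.

    Returns "treated", "untreated", or None when the square falls
    outside both areas and is excluded from the sample.
    control_max_m=None means the untreated area has no outer limit.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m <= treat_max_m:
        return "treated"
    within_outer = control_max_m is None or distance_m <= control_max_m
    if distance_m >= control_min_m and within_outer:
        return "untreated"
    return None

# Test 2 as described above: treated within 500 m, untreated 500 m to 5 km.
test2 = [assign_group(d, 500, 500, 5000) for d in (300, 2000, 7000)]

# Test 1 style: no outer limit on the untreated area, with the 500 m
# treatment radius reused here purely as a placeholder.
test1 = [assign_group(d, 500, 500) for d in (300, 2000, 7000)]
```

In the second test, a square 7 km away drops out of the sample, whereas with no outer limit it is counted as untreated.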

Table 6 Test 1: Difference-in-difference estimates (outcome variable = income)
Table 7 Test 2: Difference-in-difference estimates (outcome variable = income)

When we expand the untreated area to cover the entire city of Stockholm, neither the statistical significance nor the economic interpretation of the results changes. In the basic model (1), without fixed effects or covariates, we can see that incomes have generally increased significantly between the periods before and after the housing construction projects examined. The results indicate that households’ incomes are approximately 30% higher after 2014 than before 2009. We can also note that new construction has mainly occurred in areas with lower income levels. The estimate shows a 7.5% lower income in areas built during 2009–2014. The effect on the development area (treatment area) is statistically significant and positive: household income has increased by almost 4% due to new construction projects. If we control for fixed effects (2), the effect on income decreases slightly, and it decreases further when we include covariates (3). However, excluding the immediate area where the construction has occurred (4) does not change the estimate. This means that it is not the project itself that drives the change in household income but indirect spillover effects. As in the previous models, the estimate drops further in the WLS model (5). Including probability weights has proved problematic, which may have been caused by a specification error in the logistic regression model, but it can also mean that the effect on household income is negligible.
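As a rough illustration of how probability weights from a logistic regression can enter a weighted estimate, the sketch below uses inverse probability weights. This is one common scheme and an assumption here; the weighting actually used in the WLS model is not specified in this section. The function names and the example propensities are hypothetical.

```python
def ipw_weight(treated, propensity):
    """Inverse probability weight for one observation.

    propensity is the estimated probability of being treated,
    e.g. the fitted value from a logistic regression.
    """
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity must lie strictly between 0 and 1")
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)

def weighted_mean(values, weights):
    """Weighted average of values, as used inside a weighted estimator."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

# Hypothetical propensities: a treated unit with a very low estimated
# probability of treatment receives a very large weight.
typical = ipw_weight(True, 0.5)
extreme = ipw_weight(True, 0.02)
example_mean = weighted_mean([10.0, 20.0], [1.0, 3.0])
```

Under this scheme, a misspecified logistic model that produces propensities close to 0 or 1 gives a handful of observations very large weights, which is one way a weighted estimate can become unstable.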

When we reduce the presumed impact area to 500 m, the effect remains in most models. On the other hand, the model where we excluded the construction area shows statistically insignificant parameter estimates. This can occur when large parts of the treatment area disappear and relatively few units remain that can actually be affected. In the model with the highest degree of explanation, we can note a causal treatment effect of housing construction on income. As before, it is clear that incomes have generally risen over time and that new residential buildings have been built in areas with lower incomes than in untreated areas.

7 Conclusion

City growth can involve new residential areas or increased density in existing ones, each with trade-offs between population growth and attractiveness. While new developments can enrich a city, they can also reduce green spaces and drive up housing prices, leading to gentrification.

Previous studies (Brunes et al., 2020; Diamond & McQuade, 2019; Ellen et al., 2001; Fernandez et al., 2021; Ooi & Le, 2013; Schwartz et al., 2006) have focused on how new housing affects nearby property prices, but few (Dong, 2017) have explored its impact on gentrification and affordability. Our article aims to fill this gap by examining how new housing investments influence income levels, education, youth population, and affordability.

Our findings suggest minimal effects of new housing on education levels and youth population, that is, no gentrification effect, but indicate a positive impact on income and affordability, potentially leading to increased gentrification over time. Our study highlights the importance of considering broader socioeconomic factors to gauge the true, multifaceted impact of construction projects on residents and on a city’s social fabric. These outcomes carry significant implications for policy formulation, particularly in the context of urban development. Policymakers should recognize the nuanced nature of urban change and work to balance economic growth with social inclusivity without exacerbating inequalities. Designing policies that address the potential challenges associated with development and that foster sustainable growth can help ensure that the benefits of urban development are equitably distributed.

Future research should consider the type and location of new housing, as well as the urban context and centrality of the area. It would also be valuable to employ nonparametric difference-in-difference models to examine how these effects vary with distance.