Introduction

There is a direct relation between the use of gasoline for driving and emissions of \(\hbox {CO}_{{2}}\) and local air pollutants. Electric vehicles (EVs) can lower gasoline consumption, but they will not reduce car-related traffic crashes or the use of scarce public space for roads and parking. Nor will they eliminate issues with resource dependence, because they require minerals like lithium, which are produced and processed in only a few countries (International Energy Agency 2022).

Furthermore, EVs still require a lot of electricity despite being more efficient than gasoline vehicles. Their energy use mostly depends on their mass (Galvin 2022; Weiss et al. 2020). Reasons include heavier vehicles’ higher rolling resistance, higher air resistance, and brake-energy recovery inefficiencies. Full electrification of passenger vehicles would therefore increase total US electricity consumption by 36% (Galvin 2022), while decarbonizing current electricity use is already a daunting challenge. Moreover, EV owners prefer to charge their vehicles in the evening, when the shortage of renewable energy is greatest (Bahamonde-Birke 2020). It is thus important not to rely solely on an electrification strategy, but also to reduce car dependence and to minimize car weights and the associated energy consumption.

A large body of earlier research has shown that an urban environment with high densities and limited distances to city centers can promote sustainable travel behavior (Ewing and Cervero 2010; Næss 2012; Newman and Kenworthy 1989; Silva et al. 2017; Banister 2011), though the size of the effect can be debated (Stevens 2017). Highly urban environments harbor more travelers to support advanced public transport infrastructure. In addition, these cities feature more destinations within a short walking or cycling distance.

Urban environments influence households’ energy-relevant car type choices as well, even if a direct relation between the built environment and car efficiency may be absent. EVs could, for instance, be less attractive for city residents without a private parking spot. On the other hand, they lose less energy when stopping for traffic lights, and urban households may find their limited driving range less constraining. Moreover, a lack of space for driving and parking makes smaller (and thus lighter and more energy-efficient) vehicles attractive.

Previous studies often focused on the influence of the built environment on vehicle kilometers traveled without taking vehicle energy efficiency into account. Other research focused on the ownership of standard cars versus trucks, vans, and SUVs. Yet, this classification has no clear relation with vehicle energy use. The limited number of studies that explicitly analyzed the influence of the built environment on vehicle energy often included few built environment variables, often used biased official vehicle energy data, and did not classify cars by weight.

The present paper aims to help fill this gap by combining energy science insights with transport modeling. It first looks into the real-world specific energy consumption (SEC) in megajoules per vehicle kilometer (MJ/vkm) of cars in the Netherlands. This SEC is converted to Well-To-Wheel (WTW) \(\hbox {CO}_{{2}}\) emissions per vkm. Next, the extent to which features of residential environments and their residents are related to energy-relevant car type choices is modeled.

As linear models are incapable of effectively capturing the effect of the built environment on car SEC (see Footnote 1), this paper relies on a multilevel discrete choice modeling framework instead. In a first stage, the model considered the number of cars owned. Conditional on that decision, the model then considered choices between different car fuel types and weight categories. The use of separate fuel type and weight category coefficients improved the model performance and increased the validity of the results in future samples with more electric vehicles. Model results can be used to predict vehicle SEC and \(\hbox {CO}_{{2}}\) emissions based on households’ sociodemographic characteristics and residential environment. Predictions of total travel energy use can be made by combining this car type model with a model of travel distances and modal choice that will be described in a future publication.

This paper starts with a condensed overview of the results of previous studies into car type choice. The data are then described in Sect. “Data”. The obtained real-world SEC and \(\hbox {CO}_{{2}}\) emissions are explored in Sect. “Car energy and emissions exploration”. Sect. “The discrete choice model” describes the integrated discrete choice model, presents the results, and explores these results using predictions. Finally, a discussion and conclusion are given in Sects. “Discussion” and “Conclusion”.

Literature on car type choice

Many studies have determined that the ownership of inefficient vehicle types - like vans, trucks, and SUVs - is negatively correlated with urban household locations (Prieto and Caemmerer 2013; Bhat et al. 2009; Li et al. 2015; Chen et al. 2021; McCarthy and Tay 1998), density (Liu and Shen 2011; Brownstone and Fang 2014; Eluru et al. 2010; Bhat et al. 2009; Song et al. 2016), proximity to the central business district and location of employment (Chen et al. 2021; Cao et al. 2006; Song et al. 2016), land-use diversity and job-household balance (Bhat et al. 2009; Eluru et al. 2010; Potoglou 2008; Chen et al. 2021; Song et al. 2016), access to non-motorized and public transport infrastructure (Eluru et al. 2010; Brownstone and Fang 2014; Chen et al. 2021; Song et al. 2016), and street connectivity (Bhat et al. 2009; Li et al. 2015; Song et al. 2016). The likely reason is a lack of room for maneuvering and parking in densely built urban areas (Bhat et al. 2009; McCarthy and Tay 1998; Chen et al. 2021; Cao et al. 2006).

Residents of urban areas may, however, be less inclined to buy electric vehicles: EV ownership is higher near the central business district in Philadelphia (Chen et al. 2015), but EV-owners tend to live outside the city in France and Sweden (Fernández-Antolín et al. 2018; Westin et al. 2018). One possible explanation is that people are more likely to buy EVs if they can charge them at private parking spots (Westin et al. 2018).

Sociodemographic variables may have a stronger effect on energy-relevant car type choice. Higher incomes are correlated with the ownership of large, and thus inefficient, vehicles (Timmons and Perumal 2016; Garikapati et al. 2014; McCarthy and Tay 1998; Fernández-Antolín et al. 2018; Eluru et al. 2010; Chen et al. 2021; Cao et al. 2006; Song et al. 2016). High-income households do, however, tend to own new (Bhat et al. 2009; Prieto and Caemmerer 2013; Chiou et al. 2009), luxury (Chen et al. 2021; Fernández-Antolín et al. 2018; Prieto and Caemmerer 2013), and electric vehicles (Chen et al. 2015). New vehicles use less energy, which explains why income is negatively correlated with vehicle SEC in some studies (ten Dam et al. 2022; Li et al. 2015).

Higher education is correlated with owning a standard-sized (Potoglou 2008; Cao et al. 2006), new (Chiou et al. 2009; Prieto and Caemmerer 2013), efficient (Timmons and Perumal 2016; Li et al. 2015; ten Dam et al. 2022), and/or electric vehicle (Westin et al. 2018; Fernández-Antolín et al. 2018). Large families with many children naturally tend to own large, inefficient vehicles (Liu and Shen 2011; Brownstone and Fang 2014; Eluru et al. 2010; Timmons and Perumal 2016; Garikapati et al. 2014; Bhat et al. 2009; Li et al. 2015; Fernández-Antolín et al. 2018; Prieto and Caemmerer 2013; Potoglou 2008; Chen et al. 2021; Cao et al. 2006; Song et al. 2016). Males are likely to have large and inefficient cars too (Bhat et al. 2009; Liu and Shen 2011; Eluru et al. 2010; Timmons and Perumal 2016; McCarthy and Tay 1998; Prieto and Caemmerer 2013; Chiou et al. 2009; Cao et al. 2006). Furthermore, people with multiple cars tend to own and use inefficient ones (Bhat et al. 2009; Garikapati et al. 2014; Timmons and Perumal 2016; Cao et al. 2006; Potoglou 2008; Chen et al. 2021). The employed acquire newer cars (Bhat et al. 2009; Prieto and Caemmerer 2013; ten Dam et al. 2022) of a more efficient fuel type (ten Dam et al. 2022), but the correlation with SEC may depend on the need to carry equipment (Timmons and Perumal 2016) versus the need to efficiently travel long distances (Cao et al. 2006; Chen et al. 2021). Older consumers finally seem to own less efficient cars (Timmons and Perumal 2016; McCarthy and Tay 1998), though Swedish research found them to own electric cars (Westin et al. 2018).

Research gaps

Many previous studies into the effect of the built environment on travel environmental impacts have analyzed vehicle kilometers driven without taking vehicle efficiency into account. Other researchers took energy efficiency into account while investigating total household energy consumption or emissions (Kim and Brownstone 2013; Zahabi et al. 2015; Lee and Lee 2020, 2014). Yet, it is necessary to model vehicle efficiency or SEC explicitly in order to establish energy transition pathways. This literature overview reviewed studies that did explicitly consider vehicle choice. Most of these studies ignored the ownership of compact cars. Furthermore, the often used classification based on vehicle function has no clear relation with energy use: an SUV does not necessarily consume more energy than a van or sedan (Li et al. 2015; Timmons and Perumal 2016).

We could find six previous studies that explicitly analyzed built environment effects on vehicle SEC and \(\hbox {CO}_{{2}}\) emissions. This limited body of research can still be improved. Two of the six previous studies included only one built environment variable - namely density (Timmons and Perumal 2016) versus a metropolitan area population dummy (McCarthy and Tay 1998) - while another focused on city-level travel infrastructure (Chiou et al. 2009). Furthermore, vehicle energy use could be operationalized better. Three of the six previous studies relied on official data - typically based on tests conducted by car manufacturers - rather than real-world fuel use figures (McCarthy and Tay 1998; Li et al. 2015; Song et al. 2016). These official data can be misleading, as shown in Sect. Car energy and emissions exploration of this article. Two others took the mean real-world energy use based on a combination of vehicle age and engine size or fuel type (Chiou et al. 2009; ten Dam et al. 2022). No articles could be found that classified car types by weight, which is the main determinant of (EV) efficiency. A clear link between the built environment and the SEC and \(\hbox {CO}_{{2}}\) emissions of cars has thus not been established.

Data

Travel, sociodemographic, and built environment data

This study aimed to establish such a clear link between the built environment and the SEC and \(\hbox {CO}_{{2}}\) emissions of cars by analyzing data from the Netherlands Mobility Panel (MPN) from KiM Netherlands Institute for Transport Policy Analysis (Hoogendoorn-Lanser 2013; Hoogendoorn-Lanser et al. 2015). The MPN consists of multiple surveys and a three-day travel diary. Household members complete these at a predetermined moment within the 8-week data collection period from September to November. The MPN is enriched with sociodemographic data using administrative registers. This study used the MPN data of 2019. To prevent overfitting, the sample was supplemented with households who participated in 2018 or 2017. Only the latest year available for each household was included, such that no panel effect was introduced. The response rate in 2019 is likely similar to the 64% in 2013 (Hoogendoorn-Lanser et al. 2015).

Households were sampled from an existing access panel without professional panelists (Hoogendoorn-Lanser et al. 2015). KiM excluded households who lived abroad or could not read or write Dutch, a highly diverse group (Hoogendoorn-Lanser et al. 2015). We additionally excluded households that were incomplete or for which built environment data were unavailable. In the end, 4316 households were included. Of these, 2810 households together owned 3498 cars whose SEC could be accurately determined. The sample weight was used to avoid overrepresentation of households that own multiple cars (see subsection “Data processing, sample weights, and categorization”).

The variables used to analyze these data are summarized in Table 1. Table 2 provides the descriptive statistics and Fig. 2 shows how the data sources were coupled. The highest Variance Inflation Factor among the independent variables was 3.5 (Seabold and Perktold 2010). No variables were deleted due to multicollinearity issues. The rest of this subsection elaborates on the included variables.

KiM provided us with the respondents’ postcode-6 addresses (see Fig. 1). The local address density was obtained from Statistics Netherlands (Statistics Netherlands 2018, 2019a). The Statistics Netherlands data also included the average travel distance of the postcode areas’ inhabitants to stations. These distances were log-transformed to improve the model fit. The number of destinations within a 1 km travel distance of the inhabitants was coupled as well. These destinations included places to buy food or drinks, entertainment services, nurseries, primary schools, out-of-school care facilities, and general practitioners (healthcare).

The most destination-rich postcode-6 areas were then used as city center proxies. If multiple postcode-6 areas within one municipality had the same number of local destinations, the area with the highest land-use mix entropy was chosen. The resulting accuracy was high: the computed center of Amsterdam was for instance located next to Dam square. Two city center sets were included: the four “huge centers” (\(\ge\)500 local destinations) and twenty-three “large centers” (\(\ge\)200 local destinations) including both the “huge centers” and provincial focal points. The straight-line distance from the MPN respondents’ postcode-6 areas to these city center proxies was computed with Geopy.
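The center selection and distance computation described above can be sketched as follows. This is a minimal illustration, not the study's code: the paper used Geopy, whereas this sketch swaps in a plain haversine formula to stay dependency-free, so results may differ slightly from Geopy's. The field names (`destinations`, `entropy`) and all coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumption of a spherical Earth)


def pick_center(areas):
    """Return the area with the most local destinations.

    Ties are broken by the highest land-use mix entropy, as described in the text.
    """
    return max(areas, key=lambda a: (a["destinations"], a["entropy"]))


def straight_line_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical postcode-6 areas within one municipality
areas = [
    {"pc6": "1111AA", "destinations": 510, "entropy": 0.62, "lat": 52.3731, "lon": 4.8926},
    {"pc6": "1111AB", "destinations": 510, "entropy": 0.48, "lat": 52.3700, "lon": 4.9000},
    {"pc6": "1111AC", "destinations": 320, "entropy": 0.80, "lat": 52.3600, "lon": 4.8800},
]
center = pick_center(areas)

# Distance from a hypothetical respondent location to the chosen center proxy
respondent_km = straight_line_km(52.0907, 5.1214, center["lat"], center["lon"])
```

The tuple key in `pick_center` applies the tie-break only when the destination counts are equal, which matches the order of criteria given in the text.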

Fig. 1 The Dutch city of Den Bosch with no borders superimposed, with postcode-4 area borders superimposed (1234), with postcode-5 area borders superimposed (1234A), and with postcode-6 area borders superimposed (1234AB)

The Normalized Difference Vegetation Index (NDVI) and land-use mix entropy around the postcode-6 areas were included from the Vitality Data Center project (Ren et al. 2019; Wang 2020; Wang et al. 2021). The NDVI represents green space and was computed using NASA satellite imagery. The index varies from 0 to 1, whereby 0 typically indicates urban areas and +1 dense vegetation cover. The land-use mix entropy index S represents diversity. It is based on the fraction of the total area \(A_{tot}\) used for 5 land-use categories i: residential, recreational (including all green and blue areas), commercial, industrial, and other. The equation is as follows:

$$\begin{aligned} S = -\sum _{i=1}^{5} \frac{A_i}{A_{tot}} \ln \left( \frac{A_i}{A_{tot}}\right) / \ln (5) \end{aligned}$$
(1)

The Vitality Data Center used 2015 land-use data from Statistics Netherlands. The index S takes the value 0 if only one land-use category is present and 1 if the area is divided equally among all five categories.
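As a minimal sketch of Eq. 1, the function below computes the normalized entropy from a list of category areas. The function name and the example areas are hypothetical, and categories with zero area are assumed to contribute nothing to the sum (the usual convention that 0 ln 0 = 0).

```python
import math


def land_use_mix_entropy(areas):
    """Normalized land-use mix entropy S (Eq. 1) for a list of category areas."""
    n = len(areas)  # number of land-use categories (5 in the paper)
    total = sum(areas)
    s = 0.0
    for area in areas:
        if area > 0:  # convention: a category with zero area contributes 0
            share = area / total
            s -= share * math.log(share)
    return s / math.log(n)


# Hypothetical areas (any unit) for residential, recreational, commercial,
# industrial, and other land use
single_use = land_use_mix_entropy([10.0, 0.0, 0.0, 0.0, 0.0])
equal_mix = land_use_mix_entropy([2.0, 2.0, 2.0, 2.0, 2.0])
```

The two example calls reproduce the limiting cases described in the text: a single land-use category and a perfectly even split.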

Other built environment and sociodemographic variables were included from the MPN-data directly: the distance to the nearest bus stop, (stated) existence of a private parking spot, number of cars owned, household income, number of household members aged under 18 (the legal driving age), 18-39, 40-59, and 60 or older, and fraction of adults who are male, highly educated (bachelor level or higher), and employed for at least three days a week. A number of variables were excluded because they were never significant at the 10% level: the fraction of four-way intersections, travel distances to supermarkets, and distances to small (\(\ge\)50 local destinations) and medium centers (\(\ge\)100 local destinations).

The fraction of four-way intersections was likely insignificant because many Dutch cities are historical with a non-gridded street network and a high density. They are often surrounded by agricultural land with a high NDVI and low land-use mix index. The four main cities (“huge centers”) are located in the urbanized Western part of the country: Amsterdam, Rotterdam, The Hague, and Utrecht. Within the cities, short trips are typically undertaken by bike. Yet, the car remains the dominant mode in all built environment types: 67.4% of kilometers were traveled by personal vehicle in 2019 versus 11.0% by train and 8.1% by bike (Statistics Netherlands 2023). These car trips are supported by the quality of the road network, which is second only to that of Singapore (Schwab 2019).

Table 1 The explanatory variables included in the data analysis
Table 2 The descriptive statistics of the explanatory variables for all 4316 households
Fig. 2 The solid lines show our coupling of data sources. The dashed lines indicate the use of Travelcard data by TNO researchers. Variables used to analyze car energy use have been made italic

Car energy data

Data on the cars were coupled from the Netherlands Vehicle Authority using the license plates provided by KiM (Team Open Data RDW 2021a; Team Open Data RDW 2021b). The fuel use data are based on the standardized New European Driving Cycle (NEDC), which has until recently been used by the European Union to enforce emission standards for manufacturers. However, the Netherlands Organization for Applied Scientific Research (TNO) has shown this official data to be biased: the SEC according to the NEDC is consistently lower than the SEC measured in real-world conditions, which has reduced the effectiveness of Dutch and European climate policy (van Gijlswijk et al. 2020). One reason is that manufacturers increasingly adapted their vehicles to the test procedure and utilized its flexibility (Ligterink et al. 2016; de Ruiter et al. 2021). Another reason is less-than-ideal driver behavior. The gap between official and real-world SEC depends on vehicle age and fuel type.

It was therefore decided to use real-world data from Travelcard Nederland BV available on praktijkverbruik.nl. In 2021, this dataset contained the real-world fuel use of 226,000 gasoline, 273,000 diesel, 26,000 gasoline hybrid (HEVs), 9,000 plug-in hybrid (PHEVs), and over 3,000 battery electric vehicles (BEVs) (de Ruiter et al. 2021). MPN vehicles were given the median fuel use of Travelcard entries of the same fuel type, building year, and model (e.g. “Toyota Aygo”). See also Fig. 2. Travelcard vehicles are often used by business drivers, but TNO considers the bias small for three reasons: many Travelcard vehicles are driven for limited distances, there are no systematic differences in fuel use between business- and family-oriented car models, and 45% of all new Dutch cars are lease cars (van Gijlswijk et al. 2020).
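The matching step described above (median fuel use per fuel type, building year, and model) could look roughly like the sketch below. The record layout, field names, and fuel-use values are hypothetical; the actual Travelcard data structure is not described in this article.

```python
from collections import defaultdict
from statistics import median


def median_fuel_use(travelcard_rows):
    """Median fuel use per (fuel type, building year, model) combination."""
    groups = defaultdict(list)
    for row in travelcard_rows:
        key = (row["fuel"], row["year"], row["model"])
        groups[key].append(row["l_per_100km"])
    return {key: median(values) for key, values in groups.items()}


def assign_fuel_use(car, medians):
    """Look up the median for a survey vehicle; None if no Travelcard match exists."""
    return medians.get((car["fuel"], car["year"], car["model"]))


# Hypothetical Travelcard entries
travelcard = [
    {"fuel": "gasoline", "year": 2015, "model": "Toyota Aygo", "l_per_100km": 5.0},
    {"fuel": "gasoline", "year": 2015, "model": "Toyota Aygo", "l_per_100km": 5.4},
    {"fuel": "gasoline", "year": 2015, "model": "Toyota Aygo", "l_per_100km": 5.8},
]
medians = median_fuel_use(travelcard)

matched = assign_fuel_use({"fuel": "gasoline", "year": 2015, "model": "Toyota Aygo"}, medians)
unmatched = assign_fuel_use({"fuel": "diesel", "year": 2009, "model": "Toyota Aygo"}, medians)
```

Returning `None` for unmatched vehicles leaves room for a fallback step for those vehicles.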

The real-world SEC could be coupled from the Travelcard data for 60% of the included MPN vehicles. Sometimes, a vehicle model was not used by any Travelcard customer. Moreover, some Travelcard entries were manually removed: vehicles from 2003 and 2004 emitting exactly 68 g\(\hbox {CO}_{{2}}\)/vkm and/or consuming 0.57\(-\)1.27 liters of fuel/vkm.

The SEC of gasoline, diesel, and gasoline hybrid vehicles for which Travelcard data were unavailable has been quantified using TNO models calibrated with Travelcard data (de Ruiter et al. 2021). These models estimate \(\hbox {CO}_{{2}}\) emissions - based on the car weight, building year, and engine power - which could be reconverted to the original fuel use data with the emission factors provided. The TNO’s building year factors were extrapolated for the oldest gasoline and diesel vehicles, while a mean building year factor was used for old HEVs. As a test, the models were also applied to the 60% MPN vehicles for which Travelcard data was available.

The SEC of the 54 Liquefied Petroleum Gas (LPG) vehicles was determined based on the average weight and fuel use of vehicle models in the Travelcard data. The (Travelcard) energy consumption of PHEV and BEV car models was listed in TNO publications (van Gijlswijk et al. 2020; de Ruiter et al. 2021). Imputation was used when data were unavailable. The share of the distance driven electrically per PHEV model (4.7\(-\)39.2%) was taken into account (van Gijlswijk et al. 2020).

Cars with missing fuel type, building year, or weight data were excluded. Three cars with a registered weight of \(\le\)500 kg were excluded as well. All (hybrid) electric cars were kept due to their importance for the energy transition.

Tank-To-Wheel, Well-To-Wheel, and other \(\hbox {CO}_{{2}}\) emissions

This article focuses on direct energy consumption because it is relatively independent of the country and year under consideration. This SEC can be easily converted to Tank-To-Wheel (TTW, tailpipe) and Well-To-Wheel (WTW) \(\hbox {CO}_{{2}}\) emissions. Table 3 provides conversion factors for the Netherlands in 2020 from CE Delft (Leestemaker et al. 2023). The WTW emissions take into account the average power mix; biofuel use; energy losses in power production, fuel production, and fuel transport; and emissions from the required oil refineries, wind turbines, and power cables. The emissions can be corrected for different driving environments using CE Delft’s webtool (https://tools.ce.nl/stream). BEV WTW emissions are expected to decline by 70% by 2030 (Leestemaker and van den Berg 2023).

In addition to energy-related WTW emissions, the CE Delft report also estimates gasoline and electric vehicle emissions from other sources: vehicle production, vehicle maintenance, vehicle disposal, and road infrastructure (asphalt and lighting). A rough estimate for diesel vehicles can be made by correcting for their higher average use relative to gasoline vehicles (Statistics Netherlands 2019b; Personenauto’s steeds ouder (n.d.) 2023).

Table 3 The conversion factors in The Netherlands in 2020 (Leestemaker et al. 2023). One MJ/vkm is thus equivalent to 3.2 liters of gasoline per 100 vkm
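The equivalence stated in the Table 3 caption can be turned into a small helper for readers who think in liters per 100 km. This is a sketch for gasoline only; the function names are hypothetical, and the energy density is merely the value implied by the rounded 3.2 figure, not a factor taken from Table 3.

```python
# Stated in the Table 3 caption: 1 MJ/vkm corresponds to 3.2 liters of gasoline per 100 vkm
LITERS_PER_100KM_PER_MJ_VKM = 3.2


def sec_to_liters_per_100km(sec_mj_per_vkm):
    """Convert a gasoline SEC in MJ/vkm to liters per 100 vkm."""
    return sec_mj_per_vkm * LITERS_PER_100KM_PER_MJ_VKM


def implied_energy_density_mj_per_liter():
    """Energy density implied by the caption: 100 MJ per 100 vkm spread over 3.2 liters."""
    return 100.0 / LITERS_PER_100KM_PER_MJ_VKM


# Example: an SEC of 2.16 MJ/vkm, treated here as if the car ran on gasoline
example_l_per_100km = sec_to_liters_per_100km(2.16)
```

Because the 3.2 figure is rounded, the implied energy density is approximate as well.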

Data processing, sample weights, and categorization

The above-explained data cleaning and merging were executed using Pandas (The pandas development team 2020; McKinney 2010). All built environment and sociodemographic data were standardized by subtracting the mean and dividing by the standard deviation of the full sample of 4316 households using Sklearn (Varoquaux et al. 2015). All data were used for training in accordance with standard discrete choice modeling methodology. To reduce the chance of overfitting, variables were added one by one, and variables that were insignificant at the 20% level were excluded. The final model does not include any variable interactions or any variables that were never significant at the 10% level.
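The standardization step can be illustrated without any dependencies. The sketch below assumes the population standard deviation, which is what scikit-learn's `StandardScaler` uses; the article does not state which Sklearn class was applied, so this is an assumption, and the income values are hypothetical.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Subtract the sample mean and divide by the (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values, mu)
    return [(v - mu) / sigma for v in values]


# Hypothetical household incomes (EUR/month) for four households
incomes = [1000.0, 2000.0, 3000.0, 4000.0]
z_scores = standardize(incomes)
```

After this transformation each variable has mean 0 and unit variance, so coefficient magnitudes become comparable across variables measured in different units.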

The model results were made representative for the Dutch population using the household weights, computed as the mean of the household members’ sample weights provided by KiM. The weights were scaled based on the number of car-owning households to avoid inflated or deflated P-values. Moreover, the sample weights were adjusted to avoid overrepresentation of the 770 households owning multiple cars. To be precise, the sample weights of their cars were multiplied by the probability that the car was used most during the survey days. This probability was estimated with a simple multilinear model based on age and indicated use (in six categories of vkm/year; see Footnote 2). For example, if a household owned two cars built in 2010 and reported driving both 7,500-15,000 vkm/year, the probability of using each vehicle most was estimated at 50% and the sample weight of both data entries was halved.
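The weight adjustment for multi-car households can be sketched as below. The probabilities are taken as given inputs here; the multilinear model that produces them is not reproduced, and the function name and the example weight of 1.2 are hypothetical.

```python
def adjust_car_weights(household_weight, use_probabilities):
    """Split a household's sample weight over its cars.

    Each car's weight is the household weight multiplied by the probability
    that this car was the one used most during the survey days.
    """
    if abs(sum(use_probabilities) - 1.0) > 1e-9:
        raise ValueError("use probabilities must sum to 1 within a household")
    return [household_weight * p for p in use_probabilities]


# Example from the text: two similar cars, so each is used most with probability 50%
two_equal_cars = adjust_car_weights(1.2, [0.5, 0.5])

# A single-car household keeps its full weight
one_car = adjust_car_weights(1.2, [1.0])
```

Because the probabilities sum to 1 within a household, the total weight per household is preserved, so multi-car households do not count more heavily than single-car households.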

The cars were categorized into 8 types \(fw\) based on fuel type f and weight category w: the main predictors of SEC. See also Table 4. Diesel vehicles were given their own fuel type class f since these efficient vehicles constitute a major fraction of the sample. (Hybrid) electric and hydrogen vehicles were given their own fuel type category too because of their future importance. In 2019, the number of EVs was growing rapidly in the Netherlands (Palmer et al. 2020). Ownership of BEVs in particular was stimulated by a registration tax exemption, a road tax exemption, fiscal benefits for electric company cars, tax deduction options, additional municipal measures, and a well-developed charging network (Palmer et al. 2020).

The standard (mostly gasoline) and diesel cars were divided into the following weight categories w: light (<1000 kg), midlight (1000-1250 kg), midheavy (1250-1500 kg), and heavy (\(\ge\)1500 kg). There were not enough (hybrid) electric vehicles to split this fuel type into weight categories, but those who bought heavy gasoline-fueled vehicles in the past can be logically expected to buy heavy electric vehicles in the future. Similarly, there were not enough heavy vehicles to allow further weight categories. The few lightest diesel vehicles were grouped into the midlight category.
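The categorization into eight fuel- and weight-based car types can be summarized in a few lines. This is a sketch under stated assumptions: the fuel labels are hypothetical, and the treatment of cars exactly at 1250 kg (assigned to the heavier class here) is a guess, since the text does not specify which side of a boundary is inclusive apart from the 1000 kg and 1500 kg limits.

```python
# Hypothetical fuel labels; (hybrid) electric and hydrogen vehicles share one class
ELECTRIC_FUELS = {"HEV", "PHEV", "BEV", "hydrogen"}


def weight_category(mass_kg):
    """Weight class: light (<1000), midlight, midheavy, heavy (>=1500 kg)."""
    if mass_kg < 1000:
        return "light"
    if mass_kg < 1250:
        return "midlight"
    if mass_kg < 1500:
        return "midheavy"
    return "heavy"


def car_type(fuel, mass_kg):
    """Return the (fuel class, weight class) pair of a car."""
    if fuel in ELECTRIC_FUELS:
        return ("electric", None)  # not subdivided by weight
    w = weight_category(mass_kg)
    if fuel == "diesel":
        if w == "light":
            w = "midlight"  # the few lightest diesel cars join the midlight class
        return ("diesel", w)
    return ("standard", w)  # mostly gasoline


# Enumerate all reachable types for a grid of fuels and masses
all_types = {
    car_type(fuel, mass)
    for fuel in ("gasoline", "diesel", "BEV")
    for mass in (900, 1100, 1300, 1600)
}
```

Enumerating the grid yields four standard classes, three diesel classes, and one electric class, which matches the eight types used in the model.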

Table 4 Typical car models and sample sizes for each fuel- and weight-based car type \(fw\)

Car energy and emissions exploration

The real-world SEC of the eight car type categories \(fw\) is shown in the boxplots of Fig. 3. The three quartiles for the entire sample are 1.94, 2.16 (the median), and 2.37 MJ/vkm. EVs are the most efficient, with BEVs consuming less than 1 MJ/vkm. Diesel vehicles are also efficient, with heavy diesel vehicles as a notable exception. As expected, heavy standard-fueled cars (mostly gasoline) also use considerably more energy than their lighter counterparts. The least efficient standard vehicles are old, little-used second cars with a low sample weight.

The SEC quantiles according to the NEDC - when available - are shown in gray for reference purposes. As expected, the official data consistently underestimate real-world energy consumption. The official SEC of gasoline cars is, for instance, often lower than the real-world SEC of HEVs.

Fig. 3 The real-world SEC in MJ/vkm of the fuel- and weight-based car types \(fw\)

Figure 4 shows the corresponding real-world Well-To-Wheel \(\hbox {CO}_{{2}}\) emissions. These are proportional to the SEC given the fuel type and context (the Netherlands in 2020). The 26 PHEVs are treated as average HEVs because their emissions are comparable. The emissions of the efficient BEVs remain low when including other - e.g. vehicle production - sources of \(\hbox {CO}_{{2}}\) (see Table 3), but may surpass those of gasoline vehicles when charged at peak hours (Bahamonde-Birke 2020).

Fig. 4 The real-world WTW \(\hbox {CO}_{{2}}\) emissions of the car types as computed using Table 3

The discrete choice model

Model description

It was decided to analyze the real-world SEC and corresponding \(\hbox {CO}_{{2}}\) emissions using a multilevel discrete choice modeling framework (see Footnote 3). Given that decisions on car ownership cannot be disentangled from household preferences for fuel- and weight-based car types \(fw\), both choices were modeled jointly, taking the multilevel characteristics of the decision-making process into account: the number of vehicles available to households influences the types of vehicles being purchased. Both decisions are fundamental to understanding households’ travel energy use. Moreover, both decisions depend on the households’ sociodemographic characteristics and residential environment. Therefore, to fully assess the impact of the built environment on vehicle SEC, it is imperative to consider both its direct impact on the vehicles being purchased and its indirect impact via the number of vehicles owned by different households.

In a first stage, car ownership classes were considered using a discrete choice model. Then, a car type model was specified, which considered the number of cars owned as discrete latent attributes (see Footnote 4). This car type model estimated household choices for car fuel types f and weight categories w explicitly by defining the utility of each of the eight fuel- and weight-based car types \(fw\) (see Table 4) as the utility of the fuel type plus the utility of the weight category. The utility of the fuel types and weight categories was adjusted by a fixed amount in households that were estimated to own two or more vehicles. The car ownership model and the fuel- and weight-based car type model were estimated simultaneously. Figure 5 represents the specification of the joint choice model. Subsections “Car ownership classes” and “Fuel- and weight-based car types” explain the model structure step by step and provide the utility functions.

Fig. 5 The structure of the integrated discrete choice model. The latent variable on the left is discrete and represents the car ownership class the household belongs to (0 cars, 1 car, or 2+ cars). This latent variable, in conjunction with the sociodemographic and built environment variables, explains the utility functions associated with the car type choices. The dashed lines show the modeling of energy-relevant travel behavior that will be described in a future publication

The built environment variables were included exogenously, assuming that they have an effect on households’ energy-relevant car type choices. It can be argued that residential self-selection plays a role as well since households with a preference for inefficient cars may also prefer to live in less urban settings. However, in the context of the relationship between the built environment and travel behavior, this residential self-selection effect appears smaller than the direct effect of the built environment (Cao et al. 2009; van de Coevering et al. 2021). In addition, residential location choice in the Netherlands is often driven by non-transport related considerations (Ettema and Nieuwenhuis 2017). Finally, the severely constrained Dutch housing market has been impeding locals from moving to their preferred location or from moving at all (Stuart-Fox and Blijie 2018; Ministry of the Interior and Kingdom Relations 2022). The United Nations recently investigated this “acute and massive housing crisis” (Rajagopal 2023).

Car ownership classes

First, a model was built that considers to which car ownership class c each household h belongs. The three classes were carless households, one-car households, and households owning two or more vehicles. Households’ preferences were considered to be a function of their sociodemographic characteristics and the characteristics of the built environment (\(x_{ih}\); see Table 1). These preferences were modeled as utilities \(U_{ch}\) under the assumption that each household belongs to the class that promises it the highest utility. These utilities in turn depended on the v exogenous variables \(x_{ih}\) (whose impact is given by the estimated coefficients \(\beta _{ic}\)) and error terms \(\epsilon _{ch}\) representing all attributes of the decision ignored by the modeler. Class-specific constants \(ASC_{c}\) were included to ensure that the model replicated the true market shares.

$$\begin{aligned} U_{ch} = ASC_{c} + \sum _{i=1}^{v} x_{ih}\beta _{ic} + \epsilon _{ch} \end{aligned}$$
(2)

It was assumed that the error terms followed an Extreme Value Type 1 (EV1) distribution and that they were independent and identically distributed (iid). This led to the well-known Multinomial Logit kernel (Domencich and McFadden 1975). The associated probabilities \(P_{ch}\) of belonging to a car ownership class are given by the following equation, in which the utilities enter through their deterministic part only:

$$\begin{aligned} P_{ch}= \dfrac{e^{U_{ch}}}{\sum _{c} e^{U_{ch}}} \end{aligned}$$
(3)
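Equations 2 and 3 can be illustrated with a short sketch. It evaluates the deterministic part of the utility (the error term is integrated out by the logit formula) and converts utilities into class probabilities. All coefficient values, class labels, and household attributes below are hypothetical and are not estimates from this study.

```python
import math


def class_utility(asc, x, beta):
    """Deterministic part of Eq. 2: ASC_c plus the sum of x_ih * beta_ic."""
    return asc + sum(xi * bi for xi, bi in zip(x, beta))


def mnl_probabilities(utilities):
    """Multinomial Logit probabilities of Eq. 3 for a dict of class utilities."""
    u_max = max(utilities.values())  # subtracting the maximum avoids overflow
    exp_u = {c: math.exp(u - u_max) for c, u in utilities.items()}
    denom = sum(exp_u.values())
    return {c: e / denom for c, e in exp_u.items()}


# Hypothetical standardized attributes of one household: [address density, income]
x_h = [1.5, -0.2]

# Hypothetical coefficients; the carless class serves as the reference (utility 0)
utilities = {
    "0 cars": 0.0,
    "1 car": class_utility(1.0, x_h, [-0.4, 0.3]),
    "2+ cars": class_utility(0.2, x_h, [-0.9, 0.8]),
}
probabilities = mnl_probabilities(utilities)
```

Fixing one class at zero utility reflects that only utility differences are identified in a logit model; which class serves as the reference is an assumption of this sketch.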

Fuel- and weight-based car types

Next, a second model estimated the probability that a vehicle of a household h was of a certain type conditional on the car ownership class probabilities. In this model, the utility \(U_{fwhc}\) associated with each of the 8 fuel- and weight-based car types \(fw\) (the choice set, see Table 4) was the sum of an Alternative Specific Constant \(ASC_{fw}\), the utility associated with the fuel type \(U_{fhc}\), the utility associated with the weight category \(U_{whc}\), and an error term \(\epsilon _{fwh}\):

$$\begin{aligned} U_{fwhc} = ASC_{fw} + U_{fhc} + U_{whc} + \epsilon _{fwh} \end{aligned}$$
(4)

There was thus one utility function for each fuel type and each weight category (rather than per fuel-weight combination). The weight-based utility \(U_{whc}\) associated with the (hybrid) electric car type was assumed zero as the sample was not big enough to subdivide this fuel type into weight categories.

The fuel type and weight category coefficients \(\beta _{if}\) and \(\beta _{iw}\) were the same for the onecar and twocar classes to reduce overfitting. Instead, an additional coefficient \(\beta _{2car}\) was estimated to represent that the household owns two or more vehicles. A household owning multiple cars was thus given a fixed amount of extra utility for each fuel type and each weight category regardless of its sociodemographic characteristics and the built environment.

$$\begin{aligned} U_{fhc}= & {} \beta _{2car,f} + \sum _{i=1}^{v} x_{ih}\beta _{if} \end{aligned}$$
(5)
$$\begin{aligned} U_{whc}= & {} \beta _{2car,w} + \sum _{i=1}^{v} x_{ih}\beta _{iw} \end{aligned}$$
(6)

Nested Logit specifications (Daly and Zachary 1978; Williams et al. 1977) were considered. However, these collapsed into the Multinomial Logit. Mixed Logit specifications were attempted too (Boyd and Mellman 1980) using random disturbances to correlate preferences for car fuel types and weights. However, none of the random disturbances were statistically significant at the 10% level, showing that the degree of correlation not captured by the sociodemographic and built environment variables was limited.

It was thus decided to analyze car types using a Multinomial Logit specification. This allowed us to use the independence of irrelevant alternatives property whereby only the difference in utility associated with two alternatives influences the probability of a household choosing one over the other. The separation of the utility functions in a fuel- and a weight-based component described above then ensured that a household’s probability of choosing a light gasoline over a heavy gasoline vehicle was only determined by the weight category coefficients \(\beta _{iw}\). These weight category coefficients should therefore remain valid in future samples with more electric vehicles.
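The cancellation of the fuel-based component can be checked numerically. The sketch below builds utilities according to the additive structure of Eq. (4) for a reduced, hypothetical choice set of three car types and shows that the log-odds of a light versus a heavy gasoline vehicle do not change when the fuel-type utilities change. All numbers and names are illustrative assumptions, not estimates from Table 5.

```python
import math

def logit_probabilities(utilities):
    """Multinomial Logit probabilities for a dict {alternative: systematic utility}."""
    u_max = max(utilities.values())
    exp_u = {alt: math.exp(u - u_max) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: e / total for alt, e in exp_u.items()}

def log_odds_light_vs_heavy_gasoline(asc, u_fuel, u_weight):
    """Log-odds of choosing a light over a heavy gasoline vehicle."""
    # Additive utility per car type: constant + fuel component + weight component.
    utilities = {
        (fuel, weight): asc[(fuel, weight)] + u_fuel[fuel] + u_weight[weight]
        for (fuel, weight) in asc
    }
    probs = logit_probabilities(utilities)
    return math.log(probs[("gasoline", "light")] / probs[("gasoline", "heavy")])

# Hypothetical constants and weight utilities (electric has zero weight utility).
asc = {("gasoline", "light"): 0.3, ("gasoline", "heavy"): -0.2, ("electric", "none"): -1.0}
u_weight = {"light": 0.5, "heavy": -0.4, "none": 0.0}

# Two hypothetical sets of fuel-type utilities, e.g. before and after EVs become more attractive.
base = log_odds_light_vs_heavy_gasoline(asc, {"gasoline": 0.0, "electric": -0.8}, u_weight)
shifted = log_odds_light_vs_heavy_gasoline(asc, {"gasoline": 1.5, "electric": 2.0}, u_weight)
```

In both cases the log-odds equal the difference in constants plus the difference in weight utilities (here 1.4), because the gasoline fuel utility appears in both alternatives and drops out of the ratio.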

The probability \(P_{fwh|c}\) that a vehicle of a household h belongs to one of the 8 types \(fw\) given the car ownership class c was finally given by the following equation:

$$\begin{aligned} P_{fwh|c} = \dfrac{e^{U_{fwhc}}}{\sum _{fw} e^{U_{fwhc}}} \end{aligned}$$
(7)

The joint model was then estimated by maximizing the following loglikelihood (LL) function:

$$\begin{aligned} LL(\beta _{ic},\beta _{if},\beta _{iw}) = \ln \left( \prod _{h} \prod _{c} P_{ch} (c|x_i;\beta _{ic},\epsilon ) \prod _{h} \prod _{n} \prod _{fw} \left[ \sum _{c} P_{fwh|c} (fw|x_i,c;\beta _{if},\beta _{iw},\epsilon )*P_{ch} (c|x_i;\beta _{ic},\epsilon ) \right] ^q \right) \end{aligned}$$
(8)

The number of vehicles per household is hereby denoted n, since all household vehicles were included (see subsection Data processing, sample weights, and categorization). The dummy q was 1 if the household owned a car and provided sufficient information. Otherwise it was zero (see Footnote 5). The models were constructed using Biogeme (Bierlaire 2020).

When using the model to make predictions, the probability of a household owning a car of a certain type if the household owns at least one vehicle is given by:

$$\begin{aligned} P_{fwh} =\frac{P_{c=1,h} P_{fwh|c = 1} + P_{c=2,h} P_{fwh|c = 2}}{P_{c=1,h} + P_{c=2,h}} \end{aligned}$$
(9)
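Eq. (9) can be sketched in a few lines. The function below mixes the class-conditional car type probabilities over the two car-owning classes and renormalizes by the probability of owning at least one car. The class labels, the two-type choice set, and all probabilities are hypothetical placeholders.

```python
def type_probability_given_ownership(p_class, p_type_given_class):
    """Eq. (9): car type probabilities conditional on owning at least one car."""
    owning_classes = ("one_car", "two_plus_cars")
    # Probability that the household owns at least one car.
    p_own = sum(p_class[c] for c in owning_classes)
    car_types = list(p_type_given_class[owning_classes[0]])
    return {
        t: sum(p_class[c] * p_type_given_class[c][t] for c in owning_classes) / p_own
        for t in car_types
    }

# Hypothetical inputs for one household.
p_class = {"no_car": 0.5, "one_car": 0.4, "two_plus_cars": 0.1}
p_type_given_class = {
    "one_car": {"light": 0.6, "heavy": 0.4},
    "two_plus_cars": {"light": 0.3, "heavy": 0.7},
}
p_type = type_probability_given_ownership(p_class, p_type_given_class)
```

With these inputs, the light type gets (0.4 × 0.6 + 0.1 × 0.3) / 0.5 = 0.54 and the heavy type gets 0.46.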

Model results

The coefficients and P-values are presented in Table 5 and described below. The final loglikelihood was -8584 versus a market share model loglikelihood of -10717 and a Null Model loglikelihood of -12209. The joint model thus had a likelihood ratio test statistic of 7250, a McFadden \(\rho ^2\) statistic of 0.30, and a Tardiff \(\rho ^2\) statistic of 0.20. The results of the car ownership model, for which the no-car class is the reference category, are described first. This is followed by a description of the results of the integrated model of car types, with standard-fueled midheavy vehicles as the reference category.
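The reported goodness-of-fit statistics can be reproduced from the three loglikelihood values. The sketch below assumes the conventional definitions: the likelihood ratio statistic compares the final model with the Null Model, the McFadden \(\rho ^2\) is measured relative to the Null Model, and the Tardiff \(\rho ^2\) is measured relative to the market share (constants-only) model. The variable names are illustrative.

```python
# Loglikelihood values as reported in the text.
ll_final = -8584.0
ll_market_share = -10717.0
ll_null = -12209.0

# Likelihood ratio test statistic against the Null Model.
lr_statistic = -2.0 * (ll_null - ll_final)

# McFadden rho-squared: improvement relative to the Null Model.
rho2_mcfadden = 1.0 - ll_final / ll_null

# Tardiff rho-squared: improvement relative to the market share model.
rho2_tardiff = 1.0 - ll_final / ll_market_share
```

Under these definitions the values round to 7250, 0.30, and 0.20, in line with the figures given above.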

A first observation is that the probability of owning one car is, as expected, higher in large households with many employed members and a middle-high household income. The number of elderly household members matters especially. These car owning households tend to live in an environment with a low address density, a private parking spot, and a large distance to bus stops, train stations, and the “huge centers”.

The same factors are associated with the ownership of two or more cars, but some minor differences can be seen. Households with more male members are more likely to own multiple vehicles, but they are not more likely to own a vehicle at all. Moreover, households with many adult and working members seem especially likely to own two or more cars. Household income and the local address density are also strongly correlated with the twocar class.

These twocar households have a preference for heavy vehicles, as shown by the positive 2car coefficient. This is in accordance with the literature. In contrast, midlight, midheavy, and diesel vehicles are not often owned and used in these households.

The other fuel type coefficients \(\beta _{f}\) show that households with a low income and few male members tend to own standard-fueled (gasoline) rather than diesel or electric vehicles. These households tend to live close to the four main cities (huge centers) in the urbanized Western part of the country, but far from other large centers like provincial cities.

The preference for efficient diesel vehicles is highest in households with many middle-aged, employed members. These households tend to live far away from any large center. They also seem likely to have a private parking spot and to be located in non-diverse areas. A possible explanation is that some working and remote households aim to save fuel costs without having the option of going car-free altogether.

Ownership of efficient (hybrid) electric vehicles is higher for households with private parking as well. A likely reason is that owners prefer to charge their vehicles at home (Westin et al. 2018). In addition, EVs tend to be owned by households with many university-educated members. Elderly households seem less likely to own these innovative vehicles.

The weight category coefficients \(\beta _w\) show that households with many members generally do not own (mid)light cars, which is logical. Especially households with many elderly members do not possess (mid)light cars whereas households with many young adults do not own heavy cars. As expected, households with a higher income do not possess (mid)light vehicles either. Households with many male members prefer heavy over light cars as well. Finally, households with many working members tend to own vehicles with an average weight.

The NDVI built environment variable shows that households in non-green (urban) areas tend to own light rather than midheavy vehicles. Private parking spots are associated with heavy vehicle ownership. These results are logical as compact cars are easier to park and maneuver in crowded urban areas. The built environment effect is strengthened by the preference for both non-urban environments and heavy vehicles of households owning two or more cars. Interestingly, light vehicle ownership is higher at a larger distance from the main cities (huge centers). As with the diesel vehicle preferences, this can be due to economic reasons since these households may be forced to drive many kilometers.

Finally, a number of variables are logical or thought-provoking despite being insignificant at the 10% level. A high household income is for instance insignificantly correlated with EV ownership. Moreover, higher education is insignificantly correlated with heavy vehicle ownership, which is interesting from a socioeconomic perspective. The question is whether these effects would be significant in a sample that contains more electric or heavy vehicles.

Table 5 The estimated coefficients for each of the utility functions. Variables with a P-value of 5% or less have been made bold. The data have been standardized such that coefficients give the change in utility when increasing the variable by 1 standard deviation (std) as shown in Table 2. The 2car coefficients represent the preference of households that own two or more cars for certain fuel types and weight categories

Model predictions

The obtained results can be used to make car SEC predictions with a sampling approach. The sampled values are the specific energy consumption medians of the 8 car types \(fw\). The sampling weights are the probabilities \(P_{fwh}\) that households own these car types (if they own at least one car). For example, if a household is modeled to have a 10% probability of owning a (hybrid) electric vehicle, there is a 10% probability that the household vehicle is predicted to have an SEC of 1.7 MJ/vkm: the median SEC of the (hybrid) electric vehicles in the dataset. Many draws are made, after which the average is taken as the final prediction \(\mathrm {\overline{SEC}_{car}}\). The advantage of this method is that it retains the deterministic variability (see Footnote 6) in the data: the standard deviation of the predictions is 0.24 MJ/vkm for the 4316 households versus 0.35 MJ/vkm in reality (see Footnote 7).
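The sampling approach can be sketched as follows for a single household. Only the 1.7 MJ/vkm median of the (hybrid) electric vehicles and the 10% ownership probability come from the example above; the single "other" car type, its median SEC of 2.8 MJ/vkm, and the function name `predict_sec` are hypothetical simplifications of the 8-type choice set.

```python
import random
import statistics

def predict_sec(type_probs, median_sec, n_draws=100_000, seed=0):
    """Draw car types with the given probabilities and average their median SEC."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    car_types = list(type_probs)
    weights = [type_probs[t] for t in car_types]
    # Each draw selects one car type; its median SEC is the sampled value.
    draws = rng.choices(car_types, weights=weights, k=n_draws)
    sec_draws = [median_sec[t] for t in draws]
    return statistics.fmean(sec_draws), statistics.pstdev(sec_draws)

# 10% probability of a (hybrid) electric vehicle with a median SEC of 1.7 MJ/vkm (from the text);
# the remaining 90% is lumped into one hypothetical type with an assumed median of 2.8 MJ/vkm.
type_probs = {"electric": 0.10, "other": 0.90}
median_sec = {"electric": 1.7, "other": 2.8}
sec_mean, sec_std = predict_sec(type_probs, median_sec)
```

With many draws, the average approaches the probability-weighted mean of the medians (here 0.1 × 1.7 + 0.9 × 2.8 = 2.69 MJ/vkm), while the standard deviation of the draws reflects the uncertainty in the car type choice.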

In addition to showing the model performance, the predictions can also be used to illustrate the model results. Specifically, the following three sociodemographic profiles were set in the residential environments described in Table 6:

  • A low-income (<20k) female graduate student (age 23) with a university degree and no employment.

  • A middle-income (40 to 60k) family of a father, mother (age 35), and two children with no university degrees and full-time jobs.

  • A high-income (\(\ge\)120k) family of a father, mother (age 50), and two children with no university degrees and full-time jobs.

Table 6 The residential environments used for the predictions. Amsterdam city center (1012NH) is the most urban residential area in the dataset. Wassenaar (2244BD) is the most “suburban”. In contrast, Heesch village (5384AK) is the most average (i.e. its built environment characteristics were closest to the mean) while the Friesland farm area (8741KB) is the least urban. The parking spot availability was determined manually

The car ownership probabilities \(P_{ch}\) and car SEC predictions for each profile-environment combination (using 100 thousand draws) are given in Table 7. For extra context, we also predicted the approximate WTW \(\hbox {CO}_{{2}}\) emissions on city streets, country roads, and highways using the CE Delft webtool (https://tools.ce.nl/stream) (Leestemaker et al. 2023). Do note that those living in Amsterdam are likely to use their cars for trips outside the city.

To further clarify our results, four predictions \(\mathrm {\overline{SEC}_{car}}\) were added to the boxplots of the car types’ official and real-world SEC (Fig. 6): the student and high-income family in Amsterdam and Wassenaar. The student is predicted to own an efficient car regardless of her living location. Depending on this living location, she does have a major probability of not owning a car at all. This would result in almost zero travel energy use if she relies on walking, cycling, and trains instead. For the middle- and high-income family, the built environment has a stronger influence on the predicted vehicle SEC. Yet, these differences remain smaller than the standard deviations, indicating that the uncertainty in the vehicle type choice prediction is greater than the built environment effect. Moreover, the built environment effect is also much smaller than the gap between the official and real-world data (the gap between the gray and colored boxplots).

Table 7 The probabilities of the car ownership classes, the predicted \(\mathrm {\overline{SEC}_{car}}\) in MJ/vkm, and the approximate WTW predictions per driving environment in g\(\hbox {CO}_{{2}}\)eq/vkm. The predictions are uncertain due to sampling uncertainty (as shown by the standard deviations \(\mathrm {\overline{Std}_{SEC}}\) in MJ/vkm) and variation in car energy use due to the building year, car model, and driving style. The \(\mathrm {\overline{SEC}_{car}}\) of the student and high-income family in Amsterdam and Wassenaar have been made bold (see also Fig. 6)
Fig. 6

The SEC of the different car type categories. The vertical lines show the predicted vehicle specific energy consumption \(\mathrm {\overline{SEC}_{car}}\) of the student (solid green) and the high-income family (dashed purple) in Amsterdam and Wassenaar

Discussion

Electrifying the car fleet can reduce \(\hbox {CO}_{{2}}\) emissions, but requires rare minerals and may cause a prohibitive increase in power demand. Especially heavy EVs use a lot of electricity. It is thus important to also limit car dependence as well as car weights and associated specific energy consumption (SEC). This study looked into the real-world SEC and Well-To-Wheel \(\hbox {CO}_{{2}}\) emissions of cars, which were much higher than recorded in the official data. Next, it investigated the extent to which features of residential environments and their residents are related to energy-relevant car type choices. For this purpose, a multilevel discrete choice modeling framework was constructed. This framework was fed with built environment data from multiple sources on the fine-grained postcode-6 level and considered choices between different car fuel types and weight categories conditional on the number of cars owned.

Earlier research found that those living in centrally located, dense areas with good access to public transit have a lower probability of owning vans, trucks, and SUVs. Our multilevel framework revealed the effect of these variables on vehicle weight to be indirect: they lowered the probability of owning a second car, with those households owning multiple cars preferring heavy vehicles. Direct predictors of vehicle weight according to our results were the degree of open (green) space in the residential area and the availability of a private parking spot.

One possible way to increase vehicle efficiency is therefore to lower parking norms. Reducing these norms would also reduce construction costs and free up additional space for housing, public parks, and other functions: up to 50% of the public space in Dutch cities is used for car infrastructure (Zijlstra et al. 2022). Households with private parking were, however, more likely to own electric vehicles according to our model results. It may be possible to avoid negative effects on EV ownership by providing sufficient public charging points.

Our vehicle predictions showed a noticeable combined influence of built environment variables like parking availability on car SEC. This should reinforce the effect of these variables on travel energy use through city residents’ lower car preferences and shorter travel distances. The influence of the built environment on car efficiency does seem limited relative to its influence on daily travel behavior, as indicated by the much stronger effect of the built environment variables on car ownership than on car SEC. The bias in the official vehicle energy data was also much larger than the total influence of the built environment on car SEC and associated \(\hbox {CO}_{{2}}\) emissions.

Car efficiency may therefore be more effectively increased by continuing to improve the official car fuel use data to better inform consumers and to support government regulations. The European Union has for instance been setting caps on car specific \(\hbox {CO}_{{2}}\) emissions (The International Council on Clean Transportation 2023). Ideally, these emissions would be tested in real-world conditions. Furthermore, our results indicate that energy-relevant car type choices are influenced by economic motives: households that have to drive long distances seem to prefer efficient cars. Remote households were for instance more likely to own light and diesel vehicles. Households with many employed members owned diesel vehicles too. Increased fiscal benefits for energy efficient cars may thus be considered.

Future research could use the rising share of electric cars to clarify the precise effect of the built environment on EV ownership. This research could then also determine if ownership of heavy EVs is indeed influenced by the same factors as ownership of heavy gasoline vehicles. These heavy vehicles could be subdivided into more weight categories since both car ownership and weights are expected to keep rising in the Netherlands (Zijlstra et al. 2022). Furthermore, it would be interesting to gather data on vehicle preferences to analyze if residential self-selection confounds the influence of the built environment on energy-relevant car type choices.

A last important future research direction is the inclusion of kilometers traveled (mileage) and modal choice, which seem more important than car type choice in determining the impact of the built environment on energy consumption. A second model is therefore being constructed to include these daily travel behavior characteristics. The complete model structure should reveal the pathways through which the built environment influences travel energy and \(\hbox {CO}_{{2}}\) emissions. It should also reveal the combined effect of the residential environment on travel energy, which is likely more than the sum of its parts (Brownstone and Fang 2014).

Conclusion

Small, lower-income households with few male or older members in non-green environments were more likely to own light, efficient vehicles. Remote households had a preference for light and diesel vehicles. In contrast, households with private parking tended to own both heavy and electric vehicles. Finally, small households with few working members, a lower income, and an urban living location were less likely to own one or multiple cars, and the ownership of multiple cars was correlated with a preference for heavier vehicles.

The combined effect was a mild preference for efficient vehicles in urban environments. Earlier studies that omitted vehicle efficiency thus slightly underestimated the environmental impact of urban planning interventions. The bias in the official vehicle energy data was much larger than this built environment effect. The easiest way to reduce vehicles’ specific energy consumption and \(\hbox {CO}_{{2}}\) emissions therefore seems to be to keep improving the testing procedures in order to better inform consumers and tighten car \(\hbox {CO}_{{2}}\) emission standards. In addition, increased financial stimulation of light, efficient cars may be considered.