Introduction

Curb spaces have been an essential part of modern urban infrastructure, as they carry many types of end-stream traffic, such as parking for residences, bus transit, temporary pickups and drop-offs, and loading and unloading of goods for delivery. In recent years, they have also been used as pickup/drop-off zones for ridesharing services, bike share or scooter racks, and delivery zones for commercial deliveries (Jaller and Pahwa 2020; Jaller et al. 2021; Yu and Bayram 2021; Liu, Qian and Ma 2022). The growing demand for passenger pickups/drop-offs and commercial deliveries for curb spaces may crowd out existing resources. As on-street parking prices are usually much lower than off-street parking (e.g., parking garages and lots), drivers tend to cruise to find an available parking curb space (Shoup 2006). When many drivers search for parking in a particular area (e.g., several blocks, a neighborhood), this becomes a collective behavior—an aggregation of individual behaviors that may accumulate unexpected negative externalities. The collective parking search behavior reflects the failure of curb management and the imbalance between the supply and demand of curb space, whether for residence parking, ride-hailing, or commercial deliveries. Parking issues cause drivers’ mental frustration and longer cruising times and contribute to congestion and additional emissions as vehicles travel at lower speeds when drivers look for parking spaces and signs (Ng 2016). On average, drivers spend about 17 h a year searching for parking, and the estimated cost of wasted time, fuel, and emissions is $345 per driver (McCoy 2017). Other estimates indicate drivers typically spend 3.5 to 14 min searching for parking (Shoup 2006). A study on freight parking demand in Portugal shows that having optimal freight curb bay locations and enforcement of usage are good practices leading to improved traffic flow and better mobility (Alho et al. 2018). 
A related cellular automata simulation found that freight bays increased traffic fluidity by 6% on average (Iwan et al. 2018).

Quantitatively, studying parking searches requires aggregate proxy data that indicate this collective behavior. Because city-wide vehicular data are lacking, it is rarely feasible to capture the real-time dynamics of each vehicle (location, speed, cruise time, etc.). Therefore, surveys and other data collection methods are needed. For example, parking meter transaction data, curbside parking data, and arterial traffic data can help estimate the proportion of traffic searching for parking along high-occupancy arterials (Dowling et al. 2017). In behavioral research, surveys and videos are often used to study how people perceive parking searches and to determine how many vehicles in a specific area are searching for parking (Hampshire et al. 2016; Qin et al. 2020). GPS data can help investigate the temporal and spatial aspects of parking searches (van der Waerden, Timmermans and Van Hove 2015). These data enable either aggregate or disaggregate analyses considering the spatio-temporal variability across periods and the location of search instances. Still, data challenges remain: because the coverage and penetration of tracking devices are limited, it is difficult to understand different search patterns in different areas using GPS data. At the aggregate level, the average cruising time (ACT) for a given area can serve as a good indicator (Shoup 2006; van Ommeren, Wentink and Rietveld 2012; Inci, van Ommeren and Kobus 2017; Lee, Agdas and Baker 2017; Millard-Ball, Hampshire and Weinberger 2020).

This paper has two primary objectives that contribute to the study of parking search patterns and their impacts. First, we explore the relationship between aggregate census and socioeconomic information at the ZIP code level and the cruising time of parking searches to determine the factors influencing ACT. Second, we propose a novel framework for predicting ACT and related emissions to identify parking hotspots with high cruising times and emission blackspots in dense urban areas. The results can help planners and policymakers identify problematic areas in the city and address them by adjusting zone-specific policies regarding parking price, enforcement, and curb allocation. We also expect the ACT prediction framework to work with data inputs covering different time periods, both short-term (e.g., daily and weekly) and long-term (e.g., six-month and annual data). In practice, with short-term data as input, the framework could provide short-term (e.g., next-hour) predictions of potential parking hotspots in the city, which can help drivers reconsider their parking destinations and encourage a mode shift away from driving.

This paper focuses on a specific problem: the search for parking, often termed “cruising”. Parking is a unique curb space demand because on-street parking often costs less than off-street options. This disparity incentivizes drivers to search for available on-street spots, contributing to congestion and emissions. Given this context and the significant societal cost of cruising regarding time, fuel, and emissions, the paper adopts a two-pronged approach: descriptive and predictive analyses. Initially, spatial lag models help understand the Average Cruising Time (ACT) and its influential factors across different geographic scales. The output of these spatial lag models is essential as it serves as the foundation for our subsequent analyses. Following this, k-means clustering helps segment the dataset based on empirical vehicle mixes derived from the spatial lag models. This clustering allows us to conduct comparative analyses that further enrich our understanding of ACT variations across different vehicle scenarios. Finally, various machine learning techniques—including Artificial Neural Networks (ANN), Random Forest (RF), and others—are utilized to predict ACT and Average Emission Metrics (AEM) based on city-wide cellular grids. In addition to the primary analyses, this study also conducts a sensitivity analysis on the grid size used for the spatial prediction models. We explore grid sizes ranging from 200 to 400 m and assess the impacts on prediction accuracy and spatial autocorrelation. This predictive framework operates independently of the findings from the spatial lag models and comparative analyses.

The remainder of the paper is organized as follows. Section “Literature review” reviews the literature on various issues related to parking searches. Sections “Data” and “Methods” present the datasets and methods used in this paper, along with the empirical results of the descriptive and comparative analyses. Section “Results” discusses the results of the grid-based ACT and AEM prediction. Finally, the paper discusses the key findings and policy recommendations in Sect. “Key findings and policy recommendations” and conclusions in Sect. “Conclusions”.

Literature review

Parking search, also denoted as parking cruise, is one of the most studied topics in the economics of parking (Inci 2015). Researchers studied the topic from various perspectives and approaches before the rise of new mobility services and e-commerce. The first and most challenging aspect is that parking search behavior is inherently difficult to identify. Shoup described it as “invisible” (Shoup 2006), meaning it is difficult to tell whether a vehicle is searching for parking. Therefore, previous studies have taken different approaches to obtain data metrics, including (1) cruising time/distance, (2) traffic share, and (3) factors related to parking search. Shoup summarized earlier studies and found that the ACT ranged from 3.5 to 14 min in congested urban areas, with 8 to 74% of traffic cruising for parking (Shoup 2006). Subsequent empirical studies have used survey methods. Van Ommeren et al. used a Dutch nationwide sample of car trips collected by survey and found the ACT to be 36 s per car trip (van Ommeren et al. 2012). A project survey in New York City showed that over 50% of trucks parked in legal parking spots had spent over 15 min searching for parking (Holguin-Veras et al. 2016). In 2017, another survey was conducted in Brisbane, Australia, to collect the ACT and understand potential factors influencing drivers’ cruising behavior. The study showed that the ACT for on-street parkers is 13.38 min and that errors in drivers’ perception of parking costs are among the leading factors encouraging drivers to cruise for on-street parking (Lee, Agdas and Baker 2017).

In addition to surveys, Inci et al. (2017) used a traffic observation method to collect data. They found that, on average, 3.6 vehicles per hour search for parking with a marginal parking spot (Inci, van Ommeren and Kobus 2017). More recently, GPS data have been preferred for their accuracy and coverage, provided that enough samples are available. For instance, Millard-Ball et al. (2020) used a GPS dataset to compute the difference between the actual distances driven and the most direct routes of parking-search vehicles in San Francisco and found that the average cruising distance (ACD) is 32.1 m with a standard deviation of 177.5 m (Millard-Ball, Hampshire and Weinberger 2020). Gu et al. (2020) developed a macroscopic parking dynamics model that considers both on-street and off-street parking. The model employs real-time pricing strategies to minimize cruising delays, providing a comprehensive framework for managing parking in dense urban areas with limited parking supply. With the advancement of technology and the increase in tracking device penetration, more and more vehicles of various types will provide GPS data in the future, whether aggregate or disaggregate.

Although many studies have successfully applied different models to predict outcomes in various fields and processes, very few have applied them to predict ACT. A similar prediction objective was to predict short-term wait times at a U.S.-Mexico border crossing using Gradient Boosting Regression and RF methods (Sharma et al. 2021). That work encourages combining more sophisticated predictive algorithms and prediction methods, although high data variability remains a challenge and can lead to unreliable predictions. Given the high variability in data, a single method may not provide satisfactory prediction accuracy. This led us to compare different classification and regression methods on ACT and its factors, aiming for a more robust and reliable predictive model.

Regarding factors affecting ACT, studies have focused on parking demand, which is strongly associated with population and commercial establishment density. For the most part, studies have analyzed parking demand for passenger and freight vehicles. Evidence shows a mismatched parking supply in denser urban areas will exacerbate congestion (Barter 2011). For example, the concentration of freight activities in urban areas can evidence locations where the demand for commercial vehicles parking far exceeds local availability (Jaller, Holguín-Veras and Hodge 2013). There are also views that the demand for curbside parking is related to the type of land use (Simons 2020; Jaller et al. 2021). Land use types with many off-street parking spaces (e.g., retail, storage, etc.) will have lower curbside parking demand. A study found that the influence of household properties, such as income, on automobile travel demand is remarkable (Schimek 1996). Another empirical study by Assemi et al. (2020) in Brisbane, Australia, focused on the factors influencing cruising time for on-street parking. The study found that arrival time and trip purpose significantly impact cruising behavior. Interestingly, the study also revealed a negative association between relative traffic volume and cruising time, emphasizing the need for real-time and reliable parking information systems.

In addition to the above factors, vehicle mix, population, POI density, the number of establishments, and the number of employees of all industries could also impact ACT. Due to differences in vehicle types and vehicle mixes, different emissions may occur in areas with the same ACT, and due to the size limit of parking spaces, it may take longer for long vehicles to find parking. However, studies that consider all these factors to quantify ACT and the associated impacts (e.g., emissions) for different land uses and traffic mix are lacking.

The gaps in the studies of parking search include: (1) Using GPS data is a more convenient way than traditional surveys and traffic observations; (2) Few studies exist that predict ACT, and the proposed novel prediction framework will be a worthwhile addition to a broader view. With the framework, public and private sectors can generate long-term and short-term predictions of ACT and AEM for different uses, depending on the periods (length of time window) and spatial resolutions (level of aggregation) required; and (3) Most previous studies are focused either on passenger vehicles or commercial vehicles exclusively. It is essential to consider the impacts of different vehicle types more broadly.

Data

This paper uses datasets from different sources and merges them through spatial data processing. The datasets used include (1) the searching-for-parking GPS aggregate dataset (GeoTab); (2) ZIP-level census data and socioeconomic indicator data; (3) POI location dataset; and (4) Primary Land Use Tax Lot Output (PLUTO) land use data from New York City urban planning departments. Table 1 describes the sources of those datasets.

Table 1 Description of datasets

Searching-for-parking aggregated GPS data

GeoTab’s searching-for-parking dataset is the primary dataset for this paper. GeoTab collects various types of data, including but not limited to vehicle location via GPS, speed, fuel consumption, and additional driving behavior metrics. GeoTab provides aggregated dynamic data of vehicles with GPS tracking devices installed in specific small geographic areas. The dataset contains the empirical ACT variable. Specifically, the GPS data is spatially aggregated at the geohash geographic unit, a geocoding system represented by a short string (Niemeyer 2008). The study used level 7 geohash coding, with a single geohash size of 153 m × 153 m (see GeoTab n.d.). The aggregated dataset represents a range of 6 months for publicly available data on the GeoTab data platform and is updated monthly. The study used data from April 1, 2020, to October 1, 2020.
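To make the spatial unit concrete, the following is a minimal sketch of the standard public geohash encoding, written with the Python standard library only. It is not GeoTab's implementation; the function names `geohash_encode` and `cell_size_degrees` are introduced here purely for illustration.

```python
# Standard geohash base-32 alphabet (digits plus letters, omitting a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 7) -> str:
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, n_bits = 0, 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        n_bits += 1
        if n_bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, n_bits = 0, 0
    return "".join(chars)


def cell_size_degrees(precision: int = 7):
    """Return (longitude width, latitude height) of one cell in degrees."""
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2  # longitude receives the extra bit if odd
    lat_bits = total_bits // 2
    return 360.0 / 2 ** lon_bits, 180.0 / 2 ** lat_bits
```

At level 7 a cell spans roughly 0.00137 degrees in each direction, which corresponds to about 153 m near the equator and agrees with the cell size quoted above; the east-west extent shrinks with latitude.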

A key concern is how GeoTab identifies, from a vehicle’s cruise dynamics, that the vehicle is searching for parking. According to the description (GeoTab Dataset), GeoTab considers the geohash, speed, time, and parking location of each vehicle traveling through. The platform uses specific machine learning methods to determine whether the vehicle is searching for parking. The ACT and other variables of all searching vehicles are aggregated to each geohash. We assume that the accuracy of parking search recognition has been validated. It is relatively challenging to derive the penetration rate of GeoTab devices, i.e., it is not clear what percentage of all traffic traveling through a given geohash carries tracking devices. However, ACT measures the cruising time of vehicles and is not affected by the sample size or the penetration of tracking devices. Therefore, GeoTab’s data can provide a proxy measurement of ACT, and the study treats the data as floating vehicle data to some extent.

Census and socioeconomic data

In addition, we use census and socioeconomic information at the ZIP code level. The “ZIP Business” and “ZIP Code” datasets include data with the (1) establishments and employees by NAICS industry categories and (2) population and median household income, respectively. We gathered aggregate POI data for multiple types of establishments and locations from Caliper Corporation, representing a POI density. To analyze the land-use impacts on ACT in detail, we also use the PLUTO data for the descriptive and comparative analyses.

Descriptive statistics

The study estimates descriptive statistics in Table 2 to offer a comprehensive understanding of the dataset variables, excluding the vehicle mix, which results from a clustering analysis.

Table 2 Descriptive statistics of key variables used in the study

After carefully considering the preliminary statistics and addressing the multicollinearity concerns identified in the Variance Inflation Factor (VIF) analysis, we removed specific variables (e.g., NAICS 2 and NAICS 5 employment) to improve the model’s interpretability and reliability.
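As an illustration of the VIF screening step, the sketch below computes VIFs as the diagonal of the inverse correlation matrix of the predictors, which is equivalent to \(1/(1-R_j^2)\) from regressing predictor \(j\) on the others. It is a generic standard-library sketch, not the paper's code; the function names and the toy columns in the usage note are hypothetical.

```python
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long numeric columns."""
    centered = []
    for col in columns:
        m = mean(col)
        centered.append([v - m for v in col])
    norms = [sum(v * v for v in col) ** 0.5 for col in centered]
    p = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor of each predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]
```

For example, `vif([[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]])` gives about 2.78 for both columns, since their correlation is 0.8. A common rule of thumb treats values above 5 or 10 as a sign of problematic collinearity; the threshold used in this study is not stated.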

Methods

This section is structured into two primary segments to address the complex nature of predicting Average Cruising Time (ACT) and Average Emission Metrics (AEM). The first segment focuses on a descriptive and comparative analysis to understand the factors influencing ACT at a geohash level. This analysis is essential for identifying key variables and is the foundation for the subsequent predictive modeling analysis. In this part, spatial lag models help understand ACT and its influential factors across different geographic scales in New York City. The second segment revolves around the prediction framework for estimating ACT and AEM based on city-wide grids. Figure 1 provides a schematic representation of this multi-faceted methodology, detailing the input–output relationships among various datasets and models.

Fig. 1 The technical scheme of the paper

Part I. Descriptive and comparative analysis of influencing factors on ACT

The descriptive and comparative analysis focuses on New York City because of the need to examine the influence of land use on ACT and the availability of detailed PLUTO data. Notably, spatial lag models are employed to better capture the spatial dependencies inherent in parking search behavior (Plümper and Neumayer 2010). This methodological choice stems from the observation that searching for parking is characterized by spatial spillover effects and spatial autocorrelation, which traditional models such as Ordinary Least Squares (OLS) fail to capture adequately.

Initially, an overall spatial lag model is specified, with the results compared to those derived from an OLS model highlighting the improved explanatory power of including spatial effects. Subsequently, a K-Means method helps cluster the percentage of vehicle types in each geohash, deriving empirical vehicle mixes. Spatial regime lag models are then fit based on these empirical vehicle mixes to investigate the influence of specific vehicle types on ACT within particular geohashes. Additionally, Gaussian Mixture Models (GMM) are employed to link ACT with Average Total Geohashes Traveled (ATG), setting the stage for the subsequent estimation of AEM in the predictive framework.

Spatial lag model

Spatial lag models account for spatial autocorrelation in the dependent variable and thereby capture spatial spillover effects (Su and Song 2010). With a neighbor structure defined by the non-zero elements of the spatial weights matrix \(W\), a spatially lagged variable \(y\) is a weighted sum or a weighted average of the neighboring values for that variable (Fotheringham, Brunsdon and Charlton 2003). In the most common notation, the spatial lag of the dependent variable \(y\) is expressed as \(Wy\). Building on the traditional OLS model, the spatial lag model in this paper introduces a spatially lagged dependent variable vector, which assumes that spatial dependencies exist directly among the observations of a dependent variable:

$$y = X\beta + \rho Wy + \varepsilon$$
(1)

where \(X\) is the observation matrix of independent variables, \(\beta\) and \(\rho\) are estimated coefficients, and \(\varepsilon\) is the error term. Under the assumption of spatial dependency, a spatial lag model is suitable when the dependent variable at one location is affected by the same variable at nearby locations. Spatial lag models are helpful when dealing with spatial spillover effects, as spatial dependencies exist in many cases. If the ACT of each grid were independent, the traditional OLS model would be a good choice; in reality, however, the ACT values of adjacent grids affect each other.
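The spatial lag term \(Wy\) in Eq. (1) can be illustrated with a short standard-library sketch that builds a row-standardized k-nearest-neighbor weights matrix and averages the neighboring values. The function names are hypothetical, planar coordinates are assumed, and estimating \(\rho\) and \(\beta\) (e.g., by maximum likelihood) is not shown.

```python
import math


def knn_weights(points, k):
    """Row-standardized KNN weights: each of the k nearest neighbors gets 1/k.

    points is a list of (x, y) centroids in a planar coordinate system.
    Ties in distance are broken by the lower point index.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(points):
        distances = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        for _, j in distances[:k]:
            weights[i][j] = 1.0 / k
    return weights


def spatial_lag(weights, y):
    """Compute Wy, the weighted average of neighboring values for each unit."""
    return [sum(w * v for w, v in zip(row, y)) for row in weights]
```

With four centroids at x = 0, 1, 2, 3 on a line, `k = 2`, and values `[2, 4, 6, 8]`, the lag is `[5, 4, 6, 5]`: each entry is the mean of its two nearest neighbors.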

K-Means clustering and GMM model

Using the K-Means method, we cluster the percentages of the five vehicle types in the aggregate dataset for each geohash. The five vehicle types are: (1) cars and multi-passenger vehicles (MPV), (2) light-duty trucks (LDT), (3) medium-duty trucks (MDT), (4) heavy-duty trucks (HDT), and (5) other vehicles. GMM models are also used to cluster the grid-based ACT and ATG and to analyze their relationship across clusters. In the GeoTab dataset, the empirical ACT and ATG appear to follow a mixture of two or more approximately Gaussian distributions.
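A minimal sketch of the vehicle-mix clustering step follows, using Lloyd's K-Means algorithm in plain Python. The deterministic farthest-point initialization is a simplification chosen for reproducibility and is not necessarily what the study used; the function names and the share vectors in the usage note are hypothetical. The GMM step is not covered here.

```python
def _sqdist(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, n_iter=100):
    """Cluster share vectors with Lloyd's algorithm; returns (labels, centers)."""
    # Farthest-point initialization: start from the first row, then repeatedly
    # add the row that is farthest from all centers chosen so far.
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(_sqdist(r, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: _sqdist(row, centers[c])) for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def dominant_type(center):
    """Index of the vehicle type with the highest average share in a cluster."""
    return max(range(len(center)), key=lambda j: center[j])
```

Given rows ordered as (cars and MPV, LDT, MDT, HDT, other), a set of geohashes with about 80% cars and another with about 55% HDT would be split into two clusters whose dominant types are index 0 and index 3, mirroring how each empirical vehicle mix is named after its dominant vehicle type.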

Overall spatial lag model results

We use a combined dataset incorporating GeoTab’s searching-for-parking, census, and socioeconomic data solely from New York City. The combined dataset consists of 4,596 observations, covering the five boroughs. After preliminary OLS model analyses, we selected the model specification based on variable significance (reflected by p values) and the goodness-of-fit of the overall model. One issue to resolve before fitting the spatial lag model is the construction of the spatial weight matrix. We use the k-nearest neighbors (KNN) method to build the spatial weight matrix from the centroids of the geohashes in the combined dataset, and we tune the number of neighbors (denoted by \(k\)). The tuning criterion seeks the lowest possible Akaike information criterion (AIC) value for the spatial lag model while keeping the model variables constant. Figure 2 shows the AIC values of spatial lag models with different \(k\) under the same set of independent variables. The AIC is lowest for \(k\) between 10 and 50, where it stabilizes between 14,440 and 14,460. Figure 3 shows the p values of the variables of the spatial lag model for different \(k\). To balance the AIC value of the model against the number of significant variables, we chose 11 neighbors.

Fig. 2 AIC values of spatial lag vs. number of neighbors of the spatial weight matrix

Fig. 3 P values of all variables of the spatial lag model

Table 3 compares the estimates of the overall spatial lag model with estimates of the corresponding OLS model. The spatial autocorrelation test uses Moran’s I statistic. The result of Moran’s I test (approximately 0.25) indicates a modest positive spatial autocorrelation, suggesting that areas with similar ACT are slightly clustered. This could be an essential factor to include in the subsequent analyses and could justify the use of a spatial lag model. The independent variables include total establishments, number of employees across different industry categories, population, median household (HH) income, POI density, land use area across different types (residential, office, retail, etc.), and the lagged variable of ACT. The variables are selected when specifying the overall spatial lag model, meaning some insignificant variables will be dropped.
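For reference, global Moran’s I can be computed directly from its definition, \(I = \frac{n}{S_0}\frac{\sum_i\sum_j w_{ij} z_i z_j}{\sum_i z_i^2}\), where \(z_i\) are deviations from the mean and \(S_0\) is the sum of all weights. The sketch below is a generic standard-library implementation with a hypothetical function name; it does not include the significance test.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n-by-n weights matrix."""
    n = len(values)
    mean_value = sum(values) / n
    z = [v - mean_value for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    variance_term = sum(d * d for d in z)
    return (n / s0) * (cross / variance_term)
```

On a chain of four units where only adjacent units are neighbors, the clustered values `[1, 1, 5, 5]` give I = 1/3, while the alternating values `[1, 5, 1, 5]` give I = -1. A value near 0.25, as reported here, therefore lies on the positive side but well short of strong clustering.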

Table 3 Estimates of the overall spatial lag model

Among the variables, the analysis uses the first level of NAICS code representing different categories of industries (NAICS 2017). The general description of NAICS categories is listed as follows:

  • 1: agriculture, forestry, and fishing and hunting;

  • 2: mining, utilities, and construction;

  • 3: manufacturing;

  • 4: wholesale trade, retail trade, and transportation and warehousing;

  • 5: professional services;

  • 6: educational services, healthcare, and social assistance;

  • 7: entertainment, accommodation, and food services; and

  • 8–9: other services, including public administration.

From Table 3, the log-likelihood and AIC of the spatial lag model (−7207.44 and 14,442.88, respectively) are better than those of the OLS model. Also, the coefficient of lagged ACT in the model is 0.4882 and is significant, which means the ACT of neighboring geohashes has a positive and significant effect on the ACT of the geohash of interest, indicating a positive spatial correlation. Once the spatial lag is included, the ability of the other variables to explain ACT weakens, and the significance of their estimated coefficients decreases.
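The reported log-likelihood and AIC can be cross-checked with the standard definition \(AIC = 2p - 2\ln L\). The parameter count \(p = 14\) used below is not stated in the text; it is back-solved from the two reported values and should be read as an inference, not a reported figure.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2p - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Reported log-likelihood of the spatial lag model; n_params = 14 is inferred.
spatial_lag_aic = aic(-7207.44, 14)
```

Under that assumption, `spatial_lag_aic` reproduces the reported 14,442.88, which also falls inside the 14,440 to 14,460 band observed when tuning the number of neighbors.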

Empirical vehicle mix clustering results

Figure 4 shows the boxplots for the five vehicle mixes and their corresponding percentages of vehicle types. Each empirical vehicle mix (cluster) has a dominant vehicle type, i.e., the vehicle type with the highest average percentage. Thus, we name the empirical vehicle mixes obtained by clustering as (1) cars- and MPV-dominated, (2) LDT-dominated, (3) MDT-dominated, (4) HDT-dominated, and (5) other-dominated. Comparing the clustering results of the two cities, we found that Los Angeles has more geohashes in which trucks are dominant, accounting for 58.5% in total. Higher truck dominance may lead to higher AEM.

Fig. 4 Boxplots of percentages of vehicle types across different empirical vehicle mixes

Subgroup comparative models across empirical vehicle mixes

For the comparative analysis, we fit subgroup spatial lag models for the different vehicle mixes in New York City. Table 4 compares the selected variables and estimated coefficients of these models. The results show that the coefficients are similar to those of the overall spatial lag model in Table 3, but the subgroup model of each vehicle mix has its own distinct coefficients. The coefficient of the lagged ACT variable is significantly positive (at the 95% confidence level) only in the subgroup models of empirical vehicle mixes 1 and 5.

Table 4 Estimates of spatial lag subgroup models across vehicle mixes

The significant factors also differ across the subgroup spatial lag models. The significant influences for the vehicle mix 1 subgroup model are NAICS 4 employees, NAICS 7 employees, residential area, retail area, median household income, and total units. For the truck-dominated geohashes (vehicle mixes 2, 3, and 4), the significant influences include total establishments, NAICS 2, 3, 4, and 7 employment, residential area, office area, retail area, garage area, storage area, healthcare area, and POI density. The residential area is the most significant positive factor among all subgroup models across the five vehicle mixes. In the truck-dominated subgroup models, the coefficients of this variable on ACT are larger than in the vehicle mix 1 model. We also performed the Anselin-Kelejian spatial dependence test for each subgroup model, and the results showed significant spatial dependence for the subgroup models of vehicle mixes 1, 2, and 4.

Part II: grid-based vehicle mix and ACT prediction framework

In this part, we divide the case study cities (New York City and Los Angeles) into grids with a baseline size of 400 m × 400 m and weight the inputs (both the independent variables and the empirical ACT) to each grid. The grid-based ACT prediction framework applies two steps to two grid-based variables, vehicle mix and ACT (i.e., the framework yields estimated grid-based data from the current empirical data). The two steps are (1) model training (with the training dataset) and (2) prediction and validation (with the testing dataset).
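The gridding step can be sketched as follows. The weighting scheme is not specified in the text, so this example assumes that each observation is assigned to the cell containing its centroid and carries a generic weight (for instance, an overlap area); coordinates are assumed to be in a projected system in meters. The function names are hypothetical.

```python
import math
from collections import defaultdict


def grid_cell(x_m, y_m, size_m=400.0):
    """Return the (column, row) index of the square cell containing a point."""
    return (math.floor(x_m / size_m), math.floor(y_m / size_m))


def aggregate_act(observations, size_m=400.0):
    """Weighted mean ACT per grid cell.

    observations is an iterable of (x_m, y_m, act_min, weight) tuples.
    """
    weighted_sums = defaultdict(float)
    weight_totals = defaultdict(float)
    for x_m, y_m, act_min, weight in observations:
        cell = grid_cell(x_m, y_m, size_m)
        weighted_sums[cell] += act_min * weight
        weight_totals[cell] += weight
    return {cell: weighted_sums[cell] / weight_totals[cell] for cell in weighted_sums}
```

Changing `size_m` to 200, 250, 300, or 350 reproduces the alternative grid resolutions without touching the rest of the pipeline.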

The model training process uses machine learning models to predict the grid-based vehicle mix and ACT. To train the grid-based vehicle mix model, we use a voting classifier, an ensemble method that assigns specific weights to the ANN, k-nearest neighbors (KNN), RF, and decision tree (DT) models to balance out their weaknesses, since the individual models may perform equally well. For instance, while ANN models are adept at capturing complex, non-linear relationships, they are often computationally expensive and prone to overfitting. On the other hand, simpler models (e.g., DT and KNN) offer interpretability and efficiency but may lack the robustness needed for complex datasets. Although powerful and versatile, RF can sometimes be a computational black box. Given the complexity of the factors involved and the need for a model that generalizes well across diverse data, this ensemble approach is beneficial for predicting grid-based vehicle mix.
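The idea of weighted voting can be shown with a small standard-library sketch. It assumes hard voting (each model casts one label) and uses made-up weights; the text does not state whether hard or soft voting was used, nor the weights, so both are assumptions, and the function name is hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across models.

    predictions maps model name -> predicted vehicle-mix label.
    weights maps model name -> voting weight.
    Ties are resolved in favor of the smallest label.
    """
    scores = defaultdict(float)
    for model, label in predictions.items():
        scores[label] += weights[model]
    # sorted() fixes the iteration order so that ties are deterministic
    return max(sorted(scores), key=scores.get)
```

For one grid, predictions of `{"ANN": 1, "KNN": 2, "RF": 1, "DT": 2}` with weights `{"ANN": 2, "KNN": 1, "RF": 2, "DT": 1}` yield vehicle mix 1, because the two more heavily weighted models agree on it.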

We performed a sensitivity analysis to evaluate the potential impact of grid size on the robustness of our prediction results. This entailed comparing performance metrics across a range of grid sizes, specifically 200 m, 250 m, 300 m, 350 m, and 400 m. The analysis was designed to evaluate whether varying the granularity of the grid would affect key performance indicators, including Root Mean Square Error (RMSE), Moran’s I statistic for predicted Average Cruising Time (ACT), and the distribution of prediction residuals as assessed by the Kolmogorov–Smirnov (KS) and Anderson–Darling (AD) tests. These metrics collectively inform us about the accuracy, spatial autocorrelation, and statistical distributional alignment of our model predictions with empirical observations.

For the model training of grid-based ACT, we select either the RF or the ARD model based on prediction accuracy and the ability to handle sparse data. For prediction and validation, we use tenfold cross-validation (CV), with average root mean squared error (RMSE), mean absolute error (MAE), and R-squared as evaluation metrics.
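The three evaluation metrics and the fold construction can be written out explicitly. The sketch below uses only the standard library; the contiguous, unshuffled folds are a simplification, and the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k=10):
    """Yield (train_idx, test_idx) for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size
```

In a tenfold run, the model is refit on each `train_idx`, scored on the matching `test_idx`, and the ten RMSE, MAE, and R-squared values are averaged.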

The grid-based AEM calculation process incorporates average cruising distance (ACD) and weighted emission factors for different vehicle mixes within a particular area. The AEM for a single grid is calculated as follows:

$$AEM_{i,p} = ACD_{i} \cdot F_{i,p}$$
(2)

where \(AE{M}_{i,p}\) is the average emissions metrics of pollutant type \(p\) for the specific grid \(i\), \(AC{D}_{i}\) is the average cruising distance for grid \(i\), and \({F}_{i,p}\) is the weighted emission factor of pollutant type \(p\) for grid \(i\) with its vehicle mix type. All variables are grid-based.

Cruising distance estimation requires imputation due to a lack of direct measurement. The cruising distance uses the Average Total Geohashes Traveled (ATG) times the simulated average Manhattan distance between two adjacent geohashes. We estimated weighted emission factors using the EMFAC 2021 models (EMFAC2021 Volume III 2021). AEM was calculated separately for different estimated vehicle mixes. We mainly focus on the five types of pollutants in vehicle emissions generated by searching for parking, namely nitrogen oxides (NOx), PM2.5, PM10, carbon dioxide (CO2), and sulfur oxides (SOx). The weighted emission factors for each pollutant type and vehicle mix are calculated using the following formula:

$$F_{i,p} = \sum_{k} f_{p,k} \cdot VP_{i,k}$$
(3)

where \({f}_{p,k}\) is the EMFAC emission factor of pollutant type \(p\) for single vehicle type \(k\), and \(V{P}_{i,k}\) is the percentage of the vehicle type \(k\) in grid \(i\). The process estimates each weighted emission factor for each vehicle mix before calculating the grid-based AEM.
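The emission calculation in Eqs. (2) and (3) can be traced with a short sketch. It reads the weighted factor as a share-weighted sum over vehicle types, which is an interpretation of Eq. (3), and it uses made-up emission factors and shares; these are not EMFAC values. Units are assumed to be kilometers for distance and grams per kilometer for factors, and the function names are hypothetical.

```python
def cruising_distance(atg, cell_spacing_km):
    """ACD: average total geohashes traveled times the distance between
    two adjacent geohashes."""
    return atg * cell_spacing_km


def weighted_emission_factor(factor_by_type, share_by_type):
    """F_{i,p}: sum over vehicle types k of f_{p,k} * VP_{i,k}."""
    return sum(factor_by_type[k] * share_by_type[k] for k in share_by_type)


def average_emission_metric(acd_km, weighted_factor):
    """AEM_{i,p} = ACD_i * F_{i,p} (Eq. 2)."""
    return acd_km * weighted_factor


# Hypothetical inputs for one grid and one pollutant (not EMFAC values).
factors = {"car": 0.1, "HDT": 2.0}    # grams per km, by vehicle type
shares = {"car": 0.75, "HDT": 0.25}   # vehicle-type shares in the grid
acd = cruising_distance(atg=4, cell_spacing_km=0.153)
aem = average_emission_metric(acd, weighted_emission_factor(factors, shares))
```

With these illustrative numbers the weighted factor is 0.575 g/km, the cruising distance is 0.612 km, and the resulting AEM is about 0.35 g per search. The example also shows why a truck-heavy mix raises AEM even when the cruising distance is unchanged.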

Results

ACT prediction results

Before the prediction and validation of grid-based ACT, we use a weighted voting classifier with the four machine learning methods (ANN, RF, DT, KNN) to predict and validate the vehicle mix for each of the grids. The vehicle mix prediction results are an intermediate process for ACT prediction. Then, the grid-based ACT is predicted and validated. Table 5 shows the description of the training dataset and the prediction accuracies.

Table 5 Description of training datasets and prediction accuracies on ACT

The New York City training dataset uses 4,569 geohashes as empirical data (ground truths) and around 9,600 grids for prediction. Meanwhile, Los Angeles has 3,190 geohashes as empirical data and around 15,000 grids for prediction. We compared three different regression models to predict ACT, tuning their hyperparameters via tenfold CV and grid search to obtain the smallest RMSE and MAE and the largest R-squared. Based on this criterion, we chose the RF model for ACT prediction in New York City, where the empirical data cover 90% of the city with 400 m buffers. However, the data for Los Angeles are sparser, and we found that the RF model could not handle the multicollinearity problem in the sparse data resulting from too few training samples. Hence, we fell back on the ARD model.

Figure 5 shows the spatial distribution of ACT in the training datasets (ground truth) and the grid-based ACT predictions. For New York City, there are 9,602 grids with predicted ACT, with an average of 6.0 min and a standard deviation of 0.6 min. For Los Angeles, there are 15,523 grids with predicted ACT, with an average of 5.1 min and a standard deviation of 0.3 min. Spatially, the parking search hotspots (high-ACT grids) in New York City are mainly in the Upper East Side (east of Central Park), Midtown, and Downtown Manhattan, with smaller hotspots in the Bronx, Brooklyn, and Queens. The hotspots in Los Angeles are in the downtown area, the City of Industry, LAX airport, and the Port of Long Beach. Both the predicted and the observed ACT values are moderate in magnitude and consistent with common perceptions of where parking is hard to find.

Fig. 5

Ground data and prediction results of grid-based ACT. (a) New York City, (b) Los Angeles

As detailed in Table 6, our sensitivity analysis systematically explores the influence of grid size variation on the precision of ACT predictions and their spatial autocorrelation. The analysis traversed grid sizes from 200 to 400 m for both New York City and Los Angeles. We observed a consistent trend where the RMSE of predicted ACT remained remarkably stable across all grid sizes, suggesting that our model’s accuracy is not significantly affected by the grid size used.

Table 6 Sensitivity analysis of grid size on prediction accuracy and spatial autocorrelation of ACT

Intriguingly, we noted that larger grid sizes, specifically 350 m and 400 m, exhibited higher values of Moran’s I, indicating a stronger spatial autocorrelation in the predicted ACT. This suggests that larger grids may better capture the spatial dependencies inherent in urban cruising patterns, potentially reflecting more accurately the collective behavior of drivers when searching for on-street parking.
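For readers unfamiliar with the statistic, the sketch below computes global Moran’s I on a small regular grid. It assumes binary rook-contiguity weights (cells sharing an edge are neighbors); the weighting scheme actually used for Table 6 is not stated here, and the values are toy data.

```python
def rook_weights(n_rows, n_cols):
    """Binary spatial weights: 1.0 for grid cells that share an edge, else 0.0."""
    n = n_rows * n_cols
    weights = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    weights[i][rr * n_cols + cc] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                      # deviations from the mean
    s0 = sum(sum(row) for row in weights)               # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)


w = rook_weights(1, 4)                         # four cells in a row
i_gradient = morans_i([1.0, 2.0, 3.0, 4.0], w)     # similar values sit together
i_alternating = morans_i([1.0, 0.0, 1.0, 0.0], w)  # neighbors always differ
```

A smooth gradient gives a positive value and an alternating pattern gives a negative one, which is the sense in which a higher Moran’s I indicates stronger spatial clustering of ACT.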

The Kolmogorov–Smirnov (KS) and Anderson–Darling (AD) test statistics for the residuals decreased as grid size increased. This trend indicates that the distribution of prediction errors aligns more closely with the observed data at larger grid sizes, implying that larger grids may provide a more reliable representation of the actual spatial variability in cruising for parking.
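To make the KS comparison concrete, the sketch below computes a two-sample KS statistic, the largest gap between two empirical cumulative distribution functions. Which samples were compared in the sensitivity analysis is not specified here, so pairing observed and predicted ACT is an assumption, and the values are hypothetical.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Return D = max over x of |ECDF_a(x) - ECDF_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The ECDFs only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical observed and predicted ACT values (minutes).
observed_act = [5.0, 6.0, 7.0, 6.0]
predicted_act = [5.5, 6.0, 6.5, 6.0]
ks_stat = ks_two_sample_statistic(observed_act, predicted_act)
```

A smaller statistic means the two distributions are closer, which matches the interpretation of decreasing values at larger grid sizes.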

These findings underscore the importance of considering the spatial scale in predictive modeling for urban transportation phenomena. The selection of an appropriate grid size is a balance between capturing detailed spatial behaviors and maintaining model stability and generalizability across different urban forms. Our study indicates that while finer grid resolutions offer detailed insights, coarser grids can provide a holistic view with comparable predictive performance, suggesting a nuanced approach to grid size selection depending on the specific goals of the urban analysis. Therefore, based on the balanced consideration of both the predictive accuracy and the spatial autocorrelation captured at this scale, we have chosen to present the predictive maps using a 400 m grid size, which we believe provides a reasonable compromise between detail and model robustness for the scope of our urban study. Policymakers can adjust the grid size to meet specific requirements in practical applications. Still, they should be mindful of the potential impact that overly fine-grained grids can have on the accuracy of predictions, as this could introduce noise and variability that may not be relevant for strategic decision-making.

Cruising distance versus cruising time

Figure 6 shows the average Manhattan distance between two adjacent geohashes. The simulation shows that the expected Manhattan distance between two random points in two adjacent geohashes stabilizes at approximately 204 m, which lets us convert the number of geohashes traveled into cruising distance. Figure 7 shows the kernel density plot and GMM clustering results for average total geohashes traveled (ATG) and ACT. The density plot confirms that ACT approximately follows a single Gaussian distribution, whereas ATG follows a mixture of Gaussian distributions. Through the GMM, we found two separate relations between ATG and ACT. Over 95% of geohashes in both cities belong to label 0; thus, the ANN model used the label 0 data. The ACD is then estimated from the relation between ATG and ACT and the average Manhattan distance.
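A simulation of this kind can be sketched as follows. The sketch assumes square cells of side 153 m (roughly a precision-7 geohash) that share an edge, with points drawn uniformly in each cell; the cell size, the adjacency direction, and the function names are illustrative assumptions rather than the study's actual setup. Under these assumptions the expected distance has the closed form 4s/3 for side length s, which is 204 m for s = 153 m.

```python
import random


def simulate_adjacent_cell_distance(cell_side_m, n_trials, seed=0):
    """Monte Carlo mean Manhattan distance between random points in two
    edge-adjacent square cells (left cell and right cell)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Point A is uniform in the left cell, point B in the right cell.
        ax, ay = rng.uniform(0, cell_side_m), rng.uniform(0, cell_side_m)
        bx = rng.uniform(cell_side_m, 2 * cell_side_m)
        by = rng.uniform(0, cell_side_m)
        total += abs(ax - bx) + abs(ay - by)
    return total / n_trials


def cruising_distance_m(avg_geohashes_traveled, mean_step_m):
    """Cruising distance as ATG times the mean distance per geohash step."""
    return avg_geohashes_traveled * mean_step_m


CELL_SIDE_M = 153.0  # assumption: approximate side of a precision-7 geohash
mean_step = simulate_adjacent_cell_distance(CELL_SIDE_M, 100_000)
analytic_step = 4.0 * CELL_SIDE_M / 3.0  # closed form under these assumptions
```

Real geohash cells are not exactly square away from the equator and neighbors can also be diagonal, so a simulation on the actual cells may settle at a slightly different value.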

Fig. 6

The average Manhattan distance between two adjacent geohashes

Fig. 7

The Kernel Density plot and GMM clustering results for ATG and ACT. (a) Kernel Density plot, (b) GMM clustering results

AEM estimation results

We then estimate AEM using the emission factors in Table 7, and the results show that under the same cruising distance, the HDT-dominated vehicle composition produces the highest emissions of the five types of pollutants. In contrast, the LDT-dominated ones are the lowest, even lower than the cars- and MPV-dominated types. The results also show that for NOx and PM2.5 pollutants, the emissions generated by HDT-dominated vehicles are approximately four times those of the LDT-dominated mix. Figure 8 shows the estimates of AEM (PM2.5).

Table 7 Weighted emission factors (unit: g/vehicle/m)
Fig. 8

AEM (PM2.5 only) estimates. (a) New York City, PM2.5 estimation results, (b) Los Angeles, PM2.5 estimation results

For simplicity, we only display the results for PM2.5 since the other pollutants have similar spatial distribution patterns. Spatially, New York City’s emissions are concentrated near JFK Airport, Queens, and Lower Manhattan. In contrast, emissions in Los Angeles are concentrated near LAX airport, Long Beach Port, and the City of Industry. In addition, a lack of data limited our ability to validate the high AEM estimates in rural areas.

Key findings and policy recommendations

The spatial spillover effect of parking search

The spatial lag model reveals significant spatial autocorrelation of empirical ACT in New York City, which means that the ACT of a grid is significantly and positively affected by the ACT of neighboring grids. From the driver’s perspective, when curbside parking spaces are limited near a destination, drivers are more likely to travel to adjacent neighborhoods in search of free or low-cost spaces.

As in Millard-Ball et al.’s behavioral study of parking search, when parking supply is perceived to be scarce, drivers are more willing to take a conveniently available space, even if it is some distance from their destinations (Millard-Ball, Hampshire and Weinberger 2020). This suggests that drivers prefer on-street parking over closer off-street parking with higher fees. We speculate that a significant portion of this parking search is driven by parking price, especially for car trips rather than temporary pickups or commercial deliveries. Pricing may therefore be an essential factor in parking search for car trips, and further analysis is needed to evaluate parking pricing policy.

Recommendation 1: Dynamic adjustment of on-street parking prices by zones and periods with outputs of the prediction framework

Although the model indicates that spatial autocorrelation of ACT is a significant factor, it does not directly incorporate pricing. This is mainly because on-street parking, where it is priced at all, is usually low-cost or free, which makes drivers more likely to search for on-street options and may in turn affect ACT. Nevertheless, we hypothesize that pricing may serve as a latent factor influencing parking search behavior, and therefore, its implications should not be entirely disregarded.

Near parking search hotspots, there may be a need to adjust the price of on-street parking to align with that of off-street parking. Doing so could effectively reduce the number of drivers searching for parking. Policymakers can use the proposed ACT prediction framework to obtain corresponding ACT prediction maps by inputting short-term GPS data from different zones, neighborhoods, and periods (such as peak and off-peak data aggregated over weekdays). This facilitates the dynamic implementation of parking price adjustment policies by zones and periods. Additionally, the outputs of this prediction framework could help drivers make more informed decisions about their parking destinations and potentially encourage a mode shift away from driving.

Different significant factors of parking search reflect different demand sources

As land use can indirectly affect parking search by varying induced demand and supply, we found that residential area, retail area, and accommodation and food services employees (e.g., hotels, restaurants, and bars; NAICS 7) are the most significant influencing factors for geohashes of all vehicle mixes. Also, most of the growth in NYC’s centripetal ride-hailing trips is related to restaurants (Gorback 2020). Therefore, the aggregation of accommodation and food services may exacerbate parking searches due to the growing demand for ride-hailing services (temporary pickups/drop-offs). Spatially, the effects of these three factors are particularly significant in NYC Midtown, an area with primarily residential and retail land uses. In addition, we found that total establishments and employees in the manufacturing, transportation, and logistics industries (NAICS 3 and 4) positively impact ACT for truck-dominated (vehicle mixes 2–4) geohashes. This result shows that parking search also results from commercial deliveries and truck activities, which attract and generate intensive end-stream traffic in their last-mile deliveries in areas with dense concentrations of wholesale, storage, and logistics establishments.

Recommendation 2: More policy focus on the flexible allocation of off-street space to meet growing residential freight demand while strengthening enforcement on commercial vehicles

As commercial delivery demand for off-street or on-street loading spaces grows in residential areas, managing residential freight demand becomes more important in mixed-use and residential areas (Chen, Conway and Cheng 2017). While the first key finding suggests that higher costs deter drivers from off-street parking, this is particularly true for personal vehicles. On the other hand, commercial and freight vehicles face distinct challenges, such as the need for larger parking spaces and longer parking durations for loading/unloading, making off-street options more viable. For instance, Jaller, Holguín-Veras and Hodge (2013) examined parking demand and availability for commercial vehicles. Their findings underscored that commercial vehicles, such as delivery trucks, often spend extended periods searching for parking near customer locations. The prevalent inefficiencies in on-street parking, marked by recurrent parking violations, stress the need for dedicated off-street solutions tailored to these vehicles. Furthermore, beyond off-street options, we see value in dedicated loading bays or specific pickup/drop-off locations (Jaller et al. 2021; Yu and Bayram 2021). Such provisions can be strategically located to facilitate quick and efficient commercial operations, reducing unnecessary congestion and further optimizing the urban transport ecosystem.

In areas where delivery demand is large, commercial vehicles may occupy excessive curb spaces and even have a higher chance of parking violations, such as double parking or blocking entrances, compared to other land uses that usually have reserved off-street parking spaces or loading/unloading zones. Double parking violations for commercial vehicles are likely to occur more frequently and repetitively than those for passenger cars (Kim and Wang 2021). It is, therefore, worthwhile to consider whether the allocation of curb space in the neighborhoods around parking hotspots is reasonable and meets the demand of commercial deliveries and to adjust the curb space allocation accordingly by setting up more loading/unloading zones. Strategies could include establishing dedicated off-street loading/unloading zones, potentially offering reduced rates during off-peak hours, or fostering collaborations with existing off-street parking facilities to reserve specific spaces for commercial deliveries. This approach would mitigate the strain on on-street parking, curbing overall parking search durations. In the future, agent-based parking simulation can also be used to test and assess curb allocation and enforcement policies.

The different distribution patterns of parking hotspots and emission blackspots in the two case cities

Regarding the grid-based ACT and AEM prediction results, we found different distribution patterns of parking hotspots (grids with high ACT) and emission blackspots (grids with high AEM) in the two cities. In New York City, parking hotspots are located mainly in the Upper East Side, Lower and Midtown Manhattan, and several downtowns in the surrounding off-island areas, while emission blackspots are located in the vicinity of JFK Airport and along the railroad lines within Queens. The two do not overlap; that is, areas with high ACT are not necessarily areas with high AEM. In Los Angeles, parking hotspots are mainly located near the downtown area, the City of Industry, and the Port of Long Beach, and emission blackspots are also concentrated in these areas. This relatively high degree of overlap is consistent with a strong relation to truck activity.

Recommendation 3: More policy focus on reducing parking search time in emission blackspot areas as a strategy to mitigate redundant emissions

In New York City and Los Angeles, the study identified areas with longer parking search times (indicated by ACT) and higher emissions (measured as AEM). While the spatial relationship between these parking hotspots and emission blackspots varies between the two cities, a common thread is that these areas represent a significant opportunity for policy intervention to mitigate redundant emissions.

While the current dataset does not allow us to break down these metrics by vehicle type, the high levels of ACT and AEM in these areas suggest that any parking search time reduction would likely positively impact emissions. Therefore, we recommend that policy efforts focus more intensively on these emission blackspot areas. Strategies could include deploying intelligent parking solutions, improving public transit connectivity, or even implementing low-emission zones where only vehicles meeting specific emission standards are allowed.

Limitations and future work

This study has several limitations arising from both the dataset and the modeling assumptions. Firstly, we rely on aggregated GPS data lacking individual attributes (e.g., route, speed, travel purpose), which limits the granularity of our analysis and the study’s ability to consider real-time or short-term fluctuations in parking demand, supply, and emissions. The assumptions include a presumed 100% accuracy in identifying parking search behavior, which may not hold in practice. Additionally, the small size of certain groups, particularly heavy-truck-dominated geohashes, potentially affects the model specification and may cause relevant variables to be deemed insignificant due to limited observations.

Another limitation is the assumption of global spatial spillover in the Spatial Lag Model. While it does account for spatial dependencies, it implicitly assumes that these spillover effects are global, affecting all areas uniformly. This may not capture localized spatial relationships adequately, and we recognize this as an area for future research, possibly exploring models that allow for localized spillover effects.
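The global nature of the spillover can be seen from the reduced form of a standard spatial lag specification. The notation below is generic and assumed for illustration: \(y\) is the vector of ACT values, \(W\) a row-standardized spatial weights matrix, \(X\) the covariates, \(\rho\) the spatial lag coefficient with \(|\rho| < 1\), and \(\varepsilon\) the error term; it may differ from the notation used elsewhere in this paper.

$$y = \rho W y + X\beta + \varepsilon \;\Longrightarrow\; y = (I - \rho W)^{-1}(X\beta + \varepsilon) = \sum_{m=0}^{\infty} \rho^{m} W^{m} (X\beta + \varepsilon)$$

Because \(W^{m}\) links each grid to its \(m\)-th order neighbors, a change in any one grid propagates, with geometrically decaying weight \(\rho^{m}\), to every grid connected to it through the weights matrix, and a single \(\rho\) governs this decay everywhere.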

Despite these limitations, our framework offers valuable insights into average trends and could be significantly enhanced if more disaggregated data were available. Such data would allow for a more nuanced understanding of individual attributes affecting ACT and make it feasible to apply the model to diverse predictive scenarios, including short-term peak hour forecasts, the impact of sudden large-scale events, and road construction projects. Future research could address these issues through agent-based parking search simulations and by incorporating dynamic variables such as pricing and road conditions.

Conclusion

This paper contributes to the study of parking search patterns and their impacts in three ways. First, it explores the relationship between aggregate census and socioeconomic information at the ZIP level and the cruising time of parking search, and identifies the factors influencing ACT. Second, it proposes a novel prediction framework to identify parking hotspots with longer cruising times as well as emission blackspots in dense urban areas. The results can help planners and policymakers analyze problematic areas in the city and address them by adjusting zone-specific policies regarding parking price, enforcement, and curb allocation. Third, we expect that the ACT prediction framework can work with input data from different time periods, both short-term (such as daily and weekly) and long-term (such as 6-month and annual data). Practically, with short-term input data, the framework could provide short-term (e.g., next-hour) predictions of potential parking hotspots in the city, which can help drivers reconsider their parking destinations and encourage a mode shift away from driving.

To achieve those objectives, this paper (1) quantified the effects of different factors such as population, income level, employment, number of establishments, points-of-interest (POI) density, and dominant vehicle type, among others, when explaining the variability in parking search; (2) proposed a method to implement and compare different machine learning techniques to predict grid-based ACT by using a sample of aggregated GPS data (specifically aggregated over a 6-month period); and (3) quantified the emissions associated with parking search and estimated AEM. We use the five boroughs of New York City and Los Angeles County as case studies. In terms of methods, we identify the factors affecting ACT (sample GPS data) with spatial lag models and perform descriptive and comparative analyses under different empirical vehicle mixes clustered using a K-Means method. The sensitivity analysis further enhances our understanding by exploring the effect of grid size variations, ranging from 200 to 400 m, on the prediction accuracy and spatial autocorrelation of ACT, thereby guiding the selection of an appropriate grid size for the predictive models. This approach highlights the implications of spatial resolution for model performance and aids policymakers in choosing a grid size for practical applications. Several machine learning models are employed to predict grid-based (i) vehicle mixes and (ii) ACT separately and to provide model validations. The machine learning methods include ANN, KNN, RF, DT, and ARD regression models. We use the estimated grid-based vehicle mixes, census and socioeconomic characteristics, and road-network-related attributes in the ACT prediction process. Finally, AEM is estimated using emission factors from the 2021 version of the California EMission FActor (EMFAC) emission model.

The significant findings include: (1) Introducing a spatially lagged variable reveals the significant spatial autocorrelation of ACT, which represents the spatial spillover effects of parking search. This finding suggests that pricing policy can be an essential factor in parking search, which necessitates a further evaluation of parking pricing; (2) Residential area, retail area, and accommodation and food services employees are the most significant factors influencing ACT. We suggest three primary sources of parking search in dense urban areas and then provide policy implications to address the growing needs for pickups/drop-offs and commercial deliveries (residential freight demand); (3) The grid-based ACT and AEM prediction results show that parking hotspots and emission blackspots do not overlap in New York City, whereas they largely overlap in Los Angeles, where the emission blackspots are more related to truck activity. Thus, we believe reducing the chances of parking search for heavy trucks with high emissions is essential, as most emission blackspots in Los Angeles are in land uses with manufacturing, transportation, and logistics industries.