A Probabilistic Simulation Framework to Assess the Impacts of Ridesharing and Congestion Charging in New York City

Understanding the holistic city-wide impact of planned transportation solutions and interventions is critical for decision making, but is challenged by the complexity of urban systems as well as by the quality of the available urban data. The cornerstone of such impact assessments is estimating the transportation mode-shift resulting from the intervention. Although transportation planning has well-established models for mode-choice assessment, such as the nested multinomial logit model, an individual choice simulation could be better suited for addressing the mode-shift, allowing us to consistently account for individual preferences. Moreover, the available ground-truth data on actual transportation choices is often incomplete or inconsistent. The present paper addresses those challenges by offering an individual mode-choice and mode-shift simulation model together with a Bayesian inference framework, and demonstrates how impact assessments can be performed in the event of incomplete mobility data. The framework accounts for uncertainties in the data as well as in the model estimates and translates them into uncertainties of the resulting mode-shift and impacts. It is evaluated on two intervention cases: the introduction of ride-sharing for-hire vehicles in NYC and the recent introduction of the Manhattan Congestion surcharge. It can be used to assess mode-shift and quantify the resulting economic, social and environmental implications for any urban transportation solutions and policies considered by decision-makers or transportation companies.


Introduction
The vast scale of NYC can magnify even a slight improvement in the efficiency of transportation solutions, translating it into significant cumulative economic, environmental and societal impacts. The rapidly growing for-hire vehicle (FHV) service is one area that can realize such optimization by drastically improving the efficiency of car and taxi transportation, with the intention of cutting traffic, congestion and energy consumption (Santi et al. 2014). Companies like Uber, Lyft, Via and many others provide their services in most U.S. cities as well as around the world, with customers able to book an FHV or a shared FHV (ride-sharing with other customers) through mobile applications. The surge in ride-sharing trips in recent years has demonstrated that the FHV service is playing an increasingly important role in the city's overall transportation (an increase of over 2.5 times from mid-2017 to the end of 2018, with over 25 million miles traveled monthly on shared rides by the end of 2018, according to New York City Taxi & Limousine Commission (NYC TLC) open data) (Atkinson-Palombo et al. 2019). Such potential has been further unleashed with an in-depth understanding of the basic urban quantities/parameters (such as city size and driving speed) that affect the fraction of individual trips that can be shared (Tachet et al. 2017). Unfortunately, modal shifts resulting from the increased affordability of the FHV service can easily offset those positive impacts, contributing a substantial proportion of the already overwhelming road traffic and emissions. Meanwhile, an evaluation of the impacts against different transport alternatives and for different population groups with distinct demographics is essential (Kodransky and Lewenstein 2014). Another issue resulting from the growing number of vehicles is the increased traffic congestion in the city.
Both citywide bus speeds and the average travel speed within Manhattan's central business district (the area south of 60th Street) are the lowest they have been in decades. Buses average 7.58 miles per hour, down from 8 miles per hour in 1990, while the travel speed in Manhattan is now just over 7 miles per hour, down from 9 miles per hour in 1990 (NYC 2019). Meanwhile, close to 45% of New Yorkers get a delivery at home once per week, which affects not only the number of trucks on city streets, but also how vehicles can get around. The city is putting congestion pricing into place as one of the measures that may combat these problems. Urban stakeholders and municipal managers need to make informed decisions when considering policies and adopting solutions, based on travel behavior simulations driven by such knowledge, ideally in a social petri dish that minimizes the impact of irrelevant external factors.
A behavioral framework is required for the complete set of inter-related choices undertaken by travelers and potential travelers in the travel market. Both aggregate and disaggregate approaches have been developed to estimate travel demand and modal split (Koppelman and Bhat 2006). Popular and widely used ones include the gravitational models (Anas 1983), the Probit models (Alemi et al. 2019), the Logit models (Wen and Koppelman 2001) and many others. The explanatory variables included in the models often involve demographic and socioeconomic characteristics, trip characteristics and mode attributes (Wen and Koppelman 2001; Scheiner and Holz-Rau 2007). In addition, traveler and trip-related data, including the actual mode choice of the traveler, are often required for the estimation and evaluation of a practical mode choice model; these should be obtained by surveying a sample of travelers from the population of interest. For decades, transportation researchers have largely used survey data from active solicitation (Chen et al. 2016), which are detailed but limited by a relatively small sample size (small data). The rapid rise and prevalence of mobile technologies have enabled the collection of a massive amount of passive data (big data), which are very different from the actively solicited data (small data) familiar to most transportation researchers and require different methods and techniques for processing and modeling (González et al. 2008; Liu et al. 2015; Yue et al. 2014; Hasan and Ukkusuri 2014). In recent years, data on human mobility and interactions in the city space have seen an increasing number of applications. Data sources being leveraged as proxies for human mobility include anonymized cell phone connections (Girardin et al. 2008; González et al. 2008; Amini et al. 2014; Kung et al. 2014; Grauwin et al. 2017), credit card transactions, GPS readings (Santi et al. 2014; Nyhan et al. 2016; Qian et al. 2019), geo-tagged social media (Paldino et al. 2015; Belyi et al. 2017) as well as various sensor data (Kontokosta and Johnson 2016).
A critical drawback is that the available data either do not include any user demographic information for individual trips, or provide travel statistics with demographic information at the aggregate level only, in order to alleviate privacy and surveillance concerns (Douriez et al. 2016). A synergy of disclosed (small and big) travel data from different data providers and departments is often required (Huang et al. 2018; Li et al. 2019; Beiró et al. 2016) to represent the resultant complete travel information (such as the number of trips, travel time, and monetary cost) at a certain aggregate level, which is consequently not as accurate and detailed as the incomplete data. Such a compromise imposes uncertainties on both the data reliability and the modeling process (Manzo et al. 2015; Trajcevski 2011; Rasouli and Timmermans 2012), suggesting that the point estimates of modeled modal choices represent only one of the possible outputs generated by the models; anticipated modal choices are instead better expressed as a central estimate and an overall range of uncertainty margins, articulated in terms of output values and their likelihood of occurrence (Boyce 1999).
The key focus of our work lies in developing a data-driven approach applicable to scenarios where ground-truth mobility information is insufficient and fragmentary. The lack of individual point-level data creates roadblocks in modeling the city-wide assessment of any transportation-related policy intervention. Furthermore, evaluating intervention-related changes becomes even more complex for small urban geographies such as zip codes and census tracts, which can be crucial for localized decisions by policymakers in a big city. Thus, one of the primary focuses of our work is demonstrating how the transportation impact assessment could be performed in the somewhat typical situation of having incomplete data on urban mobility. We put forward a probabilistic framework to explore modal choice behaviors in NYC using a data-driven method based on partial ground-truth data and with consideration of both data and model uncertainties. The proposed individual choice-based simulation model utilizes the synthesized data (from the NYU C2SMART center) along with NYC TLC ridership data, allowing the simulation of mode-choice probabilities across all the transportation modes in question. We demonstrate that the model can learn from multiple data sources which may have different scales and different information on transport modes. The applicability of the model can thus be extended to any urban area where mobility information is incomplete or fragmented. This is typical for many cities, where mobility across all transport modes is difficult to measure and may be collected by independent agencies.
By evaluating the synthesized transportation choices under scoping scenarios as well as the actual up-to-date taxi and FHV ridership, we train a mode-choice simulation model capable of simulating further mode-shift on the individual level under the intervention scenarios of interest: the introduction of ridesharing FHV in NYC and the Manhattan Congestion surcharge. Once quantified, the mode-shift impacts can be translated into the economic, environmental and societal impacts of the considered scenarios, aiming to quantitatively inform stakeholders and policymakers of the implications of shared mobility and congestion pricing for the entire city as well as for specific populations and neighborhoods.

Data Overview
The Origin-Destination flows are retrieved from two sources: the C2SMART simulation test bed and NYC TLC open data. The C2SMART test bed represents synthesized travel flows across multiple transport modes, of which the flows for Taxi, Transit, Walking, and Driving are aggregated at the Taxi Zone level for our work. The NYC TLC data gives ground truth data on flows for Taxis, FHVs, and shared FHVs, which are originally aggregated at the Taxi Zone level. The trip distribution and spatial coverage across NYC for the four travel modes from the C2SMART simulation test bed are shown in Fig. 1. We supplement these data with travel cost and travel time (retrieved from API services) for each O-D pair in question. Furthermore, the income wage brackets for commuters are accessed from the American Community Survey (ACS) data and the Longitudinal Employer-Household Dynamics (LEHD) data (both U.S. Census Bureau programs). The LEHD also provides the population breakdown across the income brackets for each Taxi Zone.
A detailed discussion of the data and comparison metrics is presented in Appendix A: "The Data."

Methods
The objective of this study is to prototype a simulation modeling framework suitable for understanding mode-choice behavior and assessing the city-scale impacts of transportation innovations and policies on urban transportation systems, along with the associated environmental, economic and social implications. The assessment will be evaluated on two pilot use cases: the introduction of ride-sharing in New York City (offered through UberPOOL, Lyft Shared and other FHV companies) and the Manhattan Congestion surcharge. The impacts in question include travel time and cost for passengers, traffic and congestion, and gas consumption/vehicular emissions. Particular focus will be placed on the equitable impacts across populations, comparing how overall changes in travel time and mileage translate among different income groups.
Traditional counterfactual impact assessment is challenged by the facts that (1) a spatial counterfactual does not seem feasible (interventions are implemented city-wide and there is no comparable territory without deployment that could serve as a control area), (2) the utility of a temporal counterfactual (comparing the same urban system before and after the deployment) is limited by multiple major trends and transformations happening within a complex urban system simultaneously with the deployment in question, and (3) many target quantities of interest, such as overall urban traffic, gas consumption and emissions, are hardly measurable with the available data and are again affected by multiple urban transformations happening simultaneously. It is also important to mention that a body of studies applied the Geographically Weighted Regression (GWR) model to transport mode choice analysis, but they generally compared the GWR model with the ordinary least squares (OLS) model to highlight the spatial variations in the relationship between transport accessibility, land uses, etc. (Andersson 2017; Paez and Currie 2010; Torun et al. 2020; Chow et al. 2006; Chiou et al. 2015). The factors considered in those related works are based on single spatial points, whereas in our research context we consider the utility of transport between a pair of origin-destination (O-D) points. While the data from the C2SMART test bed and TLC present an opportunity to model travel flows with respect to the variables of interest, the scale of these two datasets is quite different, with TLC representing a much larger number of trips (Appendix A: "Data Metrics"). We therefore do not quite have a ground truth of the mobility for all the travel modes we consider, and a GWR model would not be appropriate in this regard.
As an alternative, the present paper proposes a methodology based on a data-driven integrated transportation simulation modeling framework, assessing the mode choice between the six major transportation modes in question: walking, private and public transportation, taxi and for-hire vehicles, including ride-share modes. For an estimated transportation demand, an agent-based choice model will be simulated, estimating the unknown parameters of the individual utility of the considered transportation modes as well as the agent characteristics (distribution parameters for individual preferences) through a multi-step Bayesian inference framework that sequentially gains information from the available partial observations of actual mobility choices. The Bayesian inference framework for mode-choice model inference was earlier applied in our work on assessing the impact of bike-sharing (Sobolevsky et al. 2018), although the model used there was the more traditional multinomial logit discussed below.
The simulated individual choices within the model enable a direct assessment of the mode-shift consistent with individual preferences, which can be further translated into the impacts of interest. Uncertainty about the data and parameter estimates will be incorporated into the simulations and the resulting impact assessment.

Multinomial Logit
We first consider the broadly used Multinomial Logit Model as the baseline approach for estimating the mode-choice for the regular commute. The model as well as its nested version (which we can use in case of related modes such as taxi and FHV) offers an advantage of estimating the mode-choice probabilities using closed-form formulas representing the aggregate-level choices of a simulation model. However, the parameters of the nested model lack a direct connection with the underlying simulation parameters (which are based on individual choices of commuters with respect to travel time/cost) and this way limits the utility of the model for individual-level mode-shift assessment. Nevertheless, it can still serve as a baseline to assess the efficiency of the proposed simulation model, so we include it in that capacity.
A Multinomial Logit (MNL) discrete choice model (Fig. 2) and its nested version, with a nest for taxi+FHV and a sub-nest for shared and non-shared FHV (different nesting structures are discussed in "Appendix B"), are trained on the two available datasets: (1) the number of trips between each O-D pair by wage group and 4 transport modes (Taxi, Transit, Walk, Driving) from C2SMART; and (2) the number of trips between each O-D pair by 3 transport modes (Taxi, FHV, shared FHV) from TLC. The models depend on a set of parameters: $\lambda$, which controls the impact of the mode utility differences on the mode choice probability, and $\beta$, which adjusts the objective value of time (time multiplied by the individual wage rate) to the anticipated monetary cost, incorporating possible irrationality of individual decisions, while combining it with the direct monetary cost to assess the overall utility. The nested model further includes $\tau_{taxi+FHV}$ and $\tau_{FHV}$, controlling the choices between nests and within each nest (Koppelman and Bhat 2006).
Mathematically, the utility score $U_j$ for alternative $j$ depends on the time $T_j$ taken between the O-D pair under consideration, the monetary cost $P_j$ of choosing the alternative, the hourly income $W$ of the commuter, and a random error component $\varepsilon_j$, yielding a base utility function (Eq. 1) and the individual utility $U_j + \varepsilon_j$, where $\varepsilon_j$ follows a Gumbel distribution. The probability of each of the four major transportation modes being chosen as the one with the highest utility is then given by the multinomial logit formula (Eq. 2).
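The two formulas referred to above did not survive in this text, so the following is only a sketch of the standard multinomial logit form that agrees with the surrounding definitions. It assumes that the base utility is the negative generalized cost and that the scale parameter (written $\lambda$ here, an assumed symbol and an assumed placement) multiplies utilities inside the choice probability; the sum runs over the four modes.

```latex
\begin{align}
U_j &= -\left(\beta W T_j + P_j\right), \\
P(j) &= \frac{e^{\lambda U_j}}{\sum_{i} e^{\lambda U_i}} .
\end{align}
```

Under this reading, the closed form follows from the random terms $\varepsilon_j$ being independent Gumbel variables with scale $1/\lambda$, so a larger $\lambda$ makes choices more sensitive to differences in generalized cost.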

Fig. 2 Multinomial Logit model framework
We further consider another version of the MNL with log-utilities (logMNL), corresponding to a multiplicative random factor applied to the original utilities. Specifically, we adjust (1) by considering log-utilities and assuming the individual log-utility to be $U_j + \varepsilon_j$, with the random term $\varepsilon_j$ again following a Gumbel distribution. This corresponds to choosing the mode with the minimal inverse negative utility $e^{-U_j} = (\beta W T_j + P_j)e^{-\varepsilon_j}$ rather than the minimal negative utility $-U_j = \beta W T_j + P_j - \varepsilon_j$ of the classical setup, i.e. having a multiplicative exp-Gumbel individual random factor instead of an additive Gumbel random term.
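The equivalence between the multiplicative exp-Gumbel factor and a closed-form probability can be checked numerically. The sketch below is illustrative only: the function names, the three example modes, their times and fares, and the scale argument `lam` (set to 1 so that it matches the expression in the text) are all assumptions, not values from the paper.

```python
import numpy as np

def generalized_cost(beta, wage, hours, fare):
    """Perceived cost beta*W*T_j + P_j for every mode (arrays over modes)."""
    return beta * wage * np.asarray(hours, dtype=float) + np.asarray(fare, dtype=float)

def log_mnl_closed_form(cost, lam=1.0):
    """Choice probabilities for log-utilities -ln(cost): P_j is proportional to cost_j**(-lam)."""
    log_u = -lam * np.log(cost)
    z = np.exp(log_u - log_u.max())  # subtract the max for numerical stability
    return z / z.sum()

def log_mnl_simulated(cost, lam=1.0, n=200_000, seed=0):
    """Monte Carlo shares: each traveler picks the smallest cost**lam * exp(-eps)."""
    rng = np.random.default_rng(seed)
    eps = rng.gumbel(size=(n, len(cost)))    # standard Gumbel term per traveler and mode
    perceived = cost ** lam * np.exp(-eps)   # multiplicative exp-Gumbel factor
    choice = perceived.argmin(axis=1)
    return np.bincount(choice, minlength=len(cost)) / n

# Hypothetical example: three modes, travel times in hours and fares in dollars.
cost = generalized_cost(beta=0.7, wage=30.0, hours=[0.5, 1.0, 0.25], fare=[2.75, 0.0, 15.0])
closed = log_mnl_closed_form(cost)
simulated = log_mnl_simulated(cost)
```

With enough simulated travelers the two sets of shares should agree up to sampling noise, which is what makes the closed form usable as an aggregate-level shortcut.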
When considering the FHV and shared FHV modes, one needs to acknowledge their relation to the taxi mode and the corresponding correlations between individual preferences. This can be accounted for by introducing into the model a nest of taxi and these modes, along with a sub-nest of the FHV and shared FHV modes. For the nested model, the marginal probability of outcome $j$ is calculated based on the deterministic part $V_j$ of the utility (i.e., $V_j = -(\beta W T_j + P_j)$) and the inclusive value $IV_k$, which signifies how inclusive each nest $N_k$ is based on its dissimilarity parameter $\tau_k$ (i.e., $IV_k = \ln \sum_{l \in N_k} e^{\frac{1}{\tau_k} V_l}$), yielding the probability of the chosen mode. The parameter $\tau_k$ cancels itself out for nests containing a single transport mode. Eventually, the dissimilarity parameters $\tau_1$, $\tau_2$ for the taxi, non-shared FHV and shared FHV nests/sub-nests, together with the utility parameters $\lambda$, $\beta$, determine the shift between the alternatives, while $\tau_1$, $\tau_2$ largely control the balance within the taxi + FHV nest and the FHV sub-nest. The baseline model parameters were obtained by estimating $\lambda$, $\beta$ of the utility function on C2SMART data, minimizing the Weighted Root Mean Squared Error (WRMSE) between the number of trips predicted by the model and the real data for Taxi, Public Transit, Walk and Driving. The tested models measure the goodness of fit between the model prediction and the C2SMART simulation test bed data based on several metrics and search a wide range of parameters for the optimal fit in a reasonable time. The final nested model splits people's regular mobility between origins and destinations across the city (from C2SMART or LEHD data) and predicts aggregated transportation mode choices. The model also provides the wage distribution for each transport mode, to be used when assessing the preferred transport mode choice for commuters from a given wage group. In the next evolution of the model, it will enable further direct simulation of their future choices under changing conditions according to the scenarios of interest.
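As a minimal sketch of how inclusive values combine in a nest with a sub-nest, the following uses the textbook multi-level nested logit recursion rather than the authors' implementation. The tree layout, the mode and nest names, the example utilities and the dissimilarity values are hypothetical, and the symbol `tau` for the dissimilarity parameters is an assumption.

```python
import math

def _node(node, V, tau):
    """Return (utility seen by the parent, {mode: probability conditional on this node})."""
    if isinstance(node, str):                  # a leaf is a single transport mode
        return V[node], {node: 1.0}
    name, children = node
    t = tau.get(name, 1.0)                     # dissimilarity parameter of this nest
    scaled, conditional = [], []
    for child in children:
        value, probs = _node(child, V, tau)
        scaled.append(value / t)
        conditional.append(probs)
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    probs = {}
    for w, child_probs in zip(weights, conditional):
        for mode, p in child_probs.items():
            probs[mode] = (w / total) * p      # share within the nest times share below it
    inclusive_value = top + math.log(total)    # IV = ln sum_l exp(V_l / tau)
    return t * inclusive_value, probs

def nested_logit_probs(tree, V, tau):
    """Marginal mode probabilities for a nest tree; nests missing from tau use tau = 1."""
    return _node(tree, V, tau)[1]

# Hypothetical layout: taxi + FHV nest containing an FHV sub-nest.
TREE = ("root", ["transit", "walk", "drive",
                 ("taxi_fhv", ["taxi", ("fhv", ["fhv_solo", "fhv_shared"])])])
V = {"transit": -1.0, "walk": -2.0, "drive": -1.5,
     "taxi": -2.2, "fhv_solo": -2.0, "fhv_shared": -1.8}
probs = nested_logit_probs(TREE, V, {"taxi_fhv": 0.7, "fhv": 0.4})
```

Setting every dissimilarity parameter to 1 collapses the tree to the plain MNL, and a nest holding a single mode returns that mode's utility unchanged, which mirrors the cancellation noted above.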

Individual Choice-Based Simulation Model
This approach is based on agent-based simulations of individual choices. In fact, the multinomial logit model is also of this kind, representing the particular case in which an additive random term following a Gumbel distribution represents individual preferences. That assumption enables a closed-form representation of the resulting probabilities; not relying on it, however, allows further flexibility in choosing the modeling framework. In addition, direct control of the original simulation parameters enables a direct individual-level assessment of the mode-shift consistent with individual preferences.
To avoid the reliance on a closed-form representation of the mode probabilities, we simulate mode choices for each individual origin-destination pair and each specific passenger of a given income category, and use a Neural Network architecture for parameter estimation, so that no explicit analytic formulas are needed.

Model parameters
We define the utility for each given O-D pair, passenger wage $w$ and transportation mode based on travel time and cost estimates as well as a random factor representing individual preferences towards each mode. The utility can be interpreted as a perceived "cost of travel" to an individual, weighting travel time and monetary cost by suitable parameters that can be learned by the model. The utility in this model setting is thus defined as $U = \beta t w + c$, where $\beta$ is the rationality adjustment for the cost-of-time estimate as before, $t$ is the travel time estimate, $c$ is the travel fare/cost estimate and $w$ is the wage of the commuter. We also introduce a random multiplicative factor $e^{\varepsilon}$, with $\varepsilon \sim N(0, \sigma^2)$, representing the individual preference for the given mode. Thus the log-utility is defined as $\ln U = \ln(\beta t w + c) + \varepsilon$.
We assume the terms $\varepsilon$ to be generally independent across transport modes, except for taxi, FHV and shared FHV, which are of course related: if one has an increased preference towards taxi, it is likely that FHV will also be preferred, and even more so between FHV and shared FHV, which can be seen as even more closely related since, while offering a slightly different type of service, they are facilitated by the same provider/app. Thus two new parameters are introduced: corTFS, the correlation coefficient between the random factors of the taxi and FHV or shared FHV (SFHV) modes, and corFS, the correlation coefficient between the random factors of FHV and SFHV (a detailed discussion of different model nesting variations and a performance comparison is given in "Appendix B"). This way, the model parameters to be estimated are $\beta$, $\sigma$, corTFS, and corFS. The Neural Network (NN) model thus outputs mode-choice probabilities corresponding to each O-D pair, which are then used for likelihood estimation (Fig. 3).
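The correlated random factors described above can be sketched as a direct Monte Carlo simulation for one O-D pair and wage. Everything concrete here is an assumption for illustration: the mode ordering, the example times and fares, and the reading that a traveler picks the mode with the lowest perceived cost $\ln U$. Only the parameter values $\beta = 0.71$, $\sigma = 0.38$, corTFS = 0.31 and corFS = 0.58 are taken from the text.

```python
import numpy as np

MODES = ["transit", "walk", "drive", "taxi", "fhv", "sfhv"]

def preference_covariance(sigma, cor_tfs, cor_fs):
    """Covariance of the random terms: independent except within taxi/FHV/SFHV."""
    corr = np.eye(len(MODES))
    t, f, s = MODES.index("taxi"), MODES.index("fhv"), MODES.index("sfhv")
    corr[t, f] = corr[f, t] = cor_tfs   # taxi vs FHV
    corr[t, s] = corr[s, t] = cor_tfs   # taxi vs shared FHV
    corr[f, s] = corr[s, f] = cor_fs    # FHV vs shared FHV
    return sigma ** 2 * corr

def simulate_mode_shares(time_h, fare, wage, beta, sigma, cor_tfs, cor_fs,
                         n=100_000, seed=0):
    """Simulate n travelers and return (mode shares, sampled random terms)."""
    rng = np.random.default_rng(seed)
    cov = preference_covariance(sigma, cor_tfs, cor_fs)
    eps = rng.multivariate_normal(np.zeros(len(MODES)), cov, size=n)
    # ln U = ln(beta * t * w + c) + eps, with U read as a perceived cost of travel
    log_u = np.log(beta * np.asarray(time_h) * wage + np.asarray(fare)) + eps
    choice = log_u.argmin(axis=1)       # assumption: the lowest perceived cost wins
    return np.bincount(choice, minlength=len(MODES)) / n, eps

# Hypothetical O-D pair: hours and dollars per mode, in MODES order.
shares, eps = simulate_mode_shares(
    time_h=[0.75, 1.5, 0.5, 0.4, 0.4, 0.55],
    fare=[2.75, 0.0, 12.0, 18.0, 16.0, 11.0],
    wage=25.0, beta=0.71, sigma=0.38, cor_tfs=0.31, cor_fs=0.58)
```

Keeping the sampled `eps` alongside the shares is a deliberate choice: the same draws can be reused when a scenario changes, so that each simulated traveler keeps the same preferences.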

Model Training and Likelihood Estimation
The NN model is used in a two-phase Bayesian inference framework (Fig. 4) based on the data of individual simulated trips generated by the C2SMART simulation test bed as well as the taxi+FHV data available from TLC. The parameters $\beta$, $\sigma$ are estimated in the first step with the C2SMART test bed, while the correlation parameters corTFS and corFS are estimated in the next step by training on TLC data, keeping $\beta$, $\sigma$ fixed. The NN fits the mode-choice probabilities $P_m$ between the six transportation modes $m$ as a function of their log-utilities and the model parameters. Notice that $\sigma$ can be treated as the scaling factor for the log-utilities to simplify the model. To fit the model we simulate $P_m$ for various values of $U_m/\sigma$ sampled from the random (normal) distribution and for corTFS, corFS (provided corFS > corTFS) sampled uniformly (50,000 random samplings), and use the results to train the neural network. The model architecture consists of three hidden layers with 8, 12 and 8 neurons respectively, with a rectified linear unit ("relu") activation for the hidden layers and a sigmoid activation for the output layer, trained with the 'binary cross-entropy' objective function.
For the mode-choice probabilities $P_m(o, d, w, \beta, \sigma)$ of each set of origin ($o$), destination ($d$) and wage ($w$), we calculate the log-likelihood $L(\beta, \sigma)$ of the observed C2SMART data for the four modes and, for $P_m(o, d, w, \beta, \sigma, corTFS, corFS)$, the log-likelihood $L_{FHV}(corTFS, corFS)$ of the TLC data given the model (Eqs. 4 and 5), where $P_{TFHV} = \sum_{m \in \{taxi, FHV, SFHV\}} P_m$. Based on the above framework, we obtained the best parameter set of $\beta = 0.71$, $\sigma = 0.38$ and corTFS = 0.31, corFS = 0.58 based on likelihood values. The $\beta$, $\sigma$ parameter values are sampled from log-normal prior distributions with $\ln \beta \sim N(\ln \mu_{beta}, \sigma_{beta}^2)$, $\ln \sigma \sim N(\ln \mu_{sigma}, \sigma_{sigma}^2)$. The prior for $\beta$ assumes that time is underestimated by up to 3 times in the majority of cases, with $P(0.33 < \beta < 1) = 68\%$ confidence, i.e. $P(-\ln 3 < \ln \beta < 0) = 68\%$, which is achieved when $\ln \mu_{beta} = -(\ln 3)/2$, $\sigma_{beta} = (\ln 3)/2$. Similarly, for $\sigma$, the prior distribution assumes the individual correction factor $e^{\varepsilon}$ to be within [1/2, 2] (correction up to twice) with 68% confidence. This is achieved if we take $\ln \mu_{sigma} = \ln(\ln 2)$ and $\sigma_{sigma} = |\ln(\ln 2)|$; if one simulates multiple $\ln \sigma \sim N(\ln(\ln 2), (\ln(\ln 2))^2)$, then for the resulting $e^{\varepsilon}$ the probability $P(0.5 < e^{\varepsilon} < 2)$ is again going to be 68%. The correlation parameters corTFS and corFS are sampled from the uniform distribution on [0, 1], provided that corFS > corTFS. The sampling then simply takes evenly distributed percentiles of each distribution with equal weights.
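The prior settings above can be checked with a few lines of standard-library code. This is a sketch under stated assumptions: the helper names are hypothetical, the percentile grid uses bin midpoints (the text only says "evenly distributed percentiles"), and the check for the correction factor is done at the prior median $\sigma = \ln 2$ only.

```python
import math
from statistics import NormalDist

LN3, LN2 = math.log(3.0), math.log(2.0)

# Prior on ln(beta): mean -(ln 3)/2, standard deviation (ln 3)/2.
ln_beta_prior = NormalDist(mu=-LN3 / 2, sigma=LN3 / 2)
# Prior on ln(sigma): mean ln(ln 2), standard deviation |ln(ln 2)|.
ln_sigma_prior = NormalDist(mu=math.log(LN2), sigma=abs(math.log(LN2)))

def interval_mass(dist, lo, hi):
    """Probability that a draw from dist falls between lo and hi."""
    return dist.cdf(hi) - dist.cdf(lo)

def percentile_grid(dist, n):
    """exp() of n evenly spaced percentiles (bin midpoints), each carrying weight 1/n."""
    return [math.exp(dist.inv_cdf((i + 0.5) / n)) for i in range(n)]

beta_mass = interval_mass(ln_beta_prior, -LN3, 0.0)            # P(1/3 < beta < 1)
factor_mass = interval_mass(NormalDist(0.0, LN2), -LN2, LN2)   # P(1/2 < e^eps < 2) at sigma = ln 2
beta_grid = percentile_grid(ln_beta_prior, 9)
sigma_grid = percentile_grid(ln_sigma_prior, 9)
```

Both interval masses are the probability of falling within one standard deviation of the mean of a normal distribution, which is where the 68% figure comes from.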
Once the parameters are sampled and the model fit likelihoods are assessed, the mode-choices can be simulated for a variety of sampled parameters, with the results weighted by the joint likelihood $e^{L(\beta, \sigma) + L_{FHV}(corTFS, corFS)}$ (as the prior sampling ensures even probability intervals). For an express assessment, one can simulate the results just for the max-likelihood parameters; comprehensive parameter sampling, however, provides an assessment with respect to the model uncertainty.
Based on the estimated parameter likelihoods, we simulate the final mode choices between origins and destinations for each individual commuter or group of commuters of a given wage group under two different scenarios of interest: (A) the intervention scenario (having shared FHV unavailable or after imposing the Manhattan Congestion surcharge) and (B) the baseline scenario with all the transportation modes available with their original utilities. Individual correction factors are kept the same between scenarios (A) and (B). For each individual simulation and set of model parameters, the mode-shift can be directly assessed and aggregated into a percentage mode-shift over the entire city or over the origin, destination and/or wage group of interest. Being assessed for multiple sampled parameters, it also provides probability distributions with respect to the parameter likelihood weighting. The percentage mode-shift can be further translated into the impacts of interest with respect to the differences in travel time, cost and mileage driven between the transport modes.
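A minimal sketch of the likelihood weighting follows. It assumes a multinomial form for the log-likelihood (observed trip counts times log predicted probabilities), which is a common choice but is not confirmed by the text; the function names and example numbers are hypothetical. Log-likelihoods are shifted by their maximum before exponentiating, since values of this magnitude would otherwise underflow to zero.

```python
import numpy as np

def multinomial_loglik(counts, probs, floor=1e-12):
    """Sum of observed trip counts times log predicted mode probabilities (assumed form)."""
    counts = np.asarray(counts, dtype=float)
    probs = np.clip(np.asarray(probs, dtype=float), floor, 1.0)  # avoid log(0)
    return float(np.sum(counts * np.log(probs)))

def likelihood_weights(logliks):
    """Normalized weights proportional to exp(loglik), computed stably."""
    logliks = np.asarray(logliks, dtype=float)
    w = np.exp(logliks - logliks.max())
    return w / w.sum()

def weighted_summary(values, weights):
    """Likelihood-weighted mean and standard deviation of a simulated quantity."""
    values = np.asarray(values, dtype=float)
    mean = float(np.sum(weights * values))
    std = float(np.sqrt(np.sum(weights * (values - mean) ** 2)))
    return mean, std

# Hypothetical joint log-likelihoods for three sampled parameter sets, and the
# share of trips shifting to transit that each set produced in simulation.
weights = likelihood_weights([-1200.5, -1198.0, -1203.2])
mean_shift, std_shift = weighted_summary([0.41, 0.44, 0.39], weights)
```

The weighted standard deviation gives a simple summary of model-based uncertainty, while the max-likelihood parameter set alone corresponds to the express assessment mentioned above.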
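The second-choice logic behind the mode-shift can be sketched as follows. The mode ordering, the example perceived costs and the use of independent random terms (correlations omitted for brevity) are assumptions for illustration; the key point is that the same random terms are reused in both scenarios.

```python
import numpy as np

MODES = ["transit", "walk", "drive", "taxi", "fhv", "sfhv"]

def second_choice_shift(log_cost, eps, removed):
    """Among travelers whose baseline choice is `removed`, the share moving to each other mode."""
    perceived = np.asarray(log_cost, dtype=float) + eps   # same random terms in both scenarios
    baseline = perceived.argmin(axis=1)                   # scenario (B): all modes available
    without = perceived.copy()
    without[:, removed] = np.inf                          # scenario (A): mode made unavailable
    fallback = without.argmin(axis=1)
    users = baseline == removed                           # only these travelers change mode
    counts = np.bincount(fallback[users], minlength=perceived.shape[1])
    return counts / max(counts.sum(), 1)

# Hypothetical perceived costs (dollars) per mode for one O-D pair and wage group.
rng = np.random.default_rng(1)
log_cost = np.log([20.0, 30.0, 24.0, 27.0, 26.0, 22.0])
eps = rng.normal(0.0, 0.38, size=(50_000, len(MODES)))
shift = second_choice_shift(log_cost, eps, removed=MODES.index("sfhv"))
```

A surcharge scenario could be sketched in the same way by adding the charge to the affected modes' costs before taking logarithms, instead of removing a mode.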

Model Comparisons
We first evaluate the above simulation model against the classic MNL and logMNL (a version with multiplicative random factors for further consistency with the simulation framework above) according to their capability of fitting the reported choices of four major modes (walking, driving, public transit and taxi) during the pre-FHV era.
All of the discussed approaches estimate mode-choice probabilities $P_t$ for each origin-destination-wage combination based on the defined utility involving the income of a commuter, travel time and costs. The MNL framework gives probabilities based on Eq. (2), whereas for the individual choice simulation model we developed an approach to estimate choice probabilities and the resulting likelihoods for each parameter set through a NN model. The simulations corresponding to each parameter set are weighted by the likelihoods, whose logarithms are estimated by Eqs. (4) and (5). Table 1 reports the likelihood-weighted averages of the mode-choices provided by each model as well as the R-squared values based on the net 4-mode prediction values for the models discussed.
We observe that both multiplicative model specifications provide estimates much closer overall to the C2SMART test bed according to the R2 score compared to the additive MNL specification, while the individual choice simulation model performs slightly better than logMNL. It also gets a much closer prediction of the taxi ridership, which is the most important quantity for the considered use cases, as they primarily concern taxi and FHV trips. Specifically, MNL underestimates the ground-truth taxi ridership by over 1.5 times and logMNL overestimates it by approximately 1.6 times, whereas the individual choice simulation model shows just a 9% deviation. It also gives much closer estimates for walking and driving, while underperforming on public transit. Furthermore, the choice simulation model provides a more adequate estimate of the travel time rationality parameter $\beta$ (the max-likelihood value of $\beta = 0.71$ corresponds to a quite realistic 29% undervaluing of time, while the optimal $\beta$ for MNL and logMNL is above 1, corresponding to time overestimation, which contradicts the common intuition of people generally valuing direct monetary benefits more than indirect benefits of the same estimated value). This further asserts that the main advantage of the simulation model lies not in being vastly better than MNL when evaluated on the whole data, but in being more interpretable, providing a better understanding of the underlying parameters and a better fit as far as the taxi ridership is concerned. Finally, as discussed, the simulation model framework provides better intuition and flexibility when simulating individual trips and evaluating alternative choices for the mode-shift part of the analysis. Based on this initial evaluation, we use the simulation model going forward.

Uncertainty Analysis
Accounting for uncertainties is critically important for the impact assessment to assess statistical significance of the reported city-wide quantities as well as their difference per wage group or areas across the city. We address uncertainties from two sources: uncertainty in the data and uncertainty in the model.
Uncertainty in the data is accounted for by incorporating the random distributions of travel times and fares into the model and running the simulations multiple times. The variation in the trips across the data-based uncertainty simulations was observed to be too low to have any significant impact on the mode shift and the resulting impacts of interest (Appendix B: "Uncertainty Analysis").
Model-based uncertainty is analyzed using the approach described above, weighting the results from different model simulations by the model fit likelihood. The resulting uncertainty in the mode-choice assessment turns out to be much more significant (Appendix B: "Uncertainty Analysis"); hence, going forward we primarily focus on this type of uncertainty in the mode-shift and related impact assessment.

Impact Assessment
To evaluate the applicability of the proposed framework to assessing the impacts of transportation interventions and policies, the paper considers two use cases: the introduction of shared FHV in NYC after 2014 and the imposition of the Manhattan Congestion surcharge in early 2019.
The data for the six major transportation modes in question (transit, walking, driving as well as taxi, FHV, shared FHV) is leveraged from three major sources: (1) the C2SMART simulation test bed, (2) the NYC Taxi and Limousine Commission (TLC) and (3) web-scraped data from public API interfaces (Appendix A: "The Data"). The C2SMART simulation test bed (He et al. 2020) includes approximately 27.3 million trips for four travel modes (taxi, transit, walking and driving) across 16 income groups, following the travel agendas from the historic Regional Household Travel Survey with a synthetic population. The data provides a representative estimation of the city-wide travel choices during the pre-FHV era across people from different income groups across NYC. For the estimation of taxi and for-hire vehicle choices, we use the up-to-date open data from TLC, which is further used for estimating the time and cost of taxi and driving trips. The travel costs and times for the other travel modes are retrieved from publicly available API services (Google Maps and HereMaps). Accounting for uncertainty is one of the key goals of our analysis; to that end, we retrieved this information multiple times for each origin-destination pair to capture the variations in costs and times. More details regarding each dataset are given in Appendix A: "The Data".

Impact of Ridesharing in NYC
As shared FHV became an integral part of NYC transportation, understanding their actual impact is challenged by the lack of an appropriate control area where shared FHV were not available. Historic pre-2014 mobility cannot serve as an adequate baseline, as a rapidly evolving transportation system was likely affected by multiple trends, not only the spread of shared FHV. For example, the increased adoption of FHV service as such (not necessarily shared) could have had a larger impact.
However, the proposed mode-choice model allows simulating a hypothetical scenario with the same transportation demand in which shared FHV were not available. As described before, we first train the model on the historic mobility represented by the C2SMART simulation testbed and then further estimate FHV-related parameters based on the actual taxi, FHV and shared FHV ridership reported by TLC. It is important to mention that the model is used to simulate the relative distribution of the ridership per mode for each origin-destination pair and passenger wage group, while to estimate the actual scale of the impact we rely on the actual number of shared FHV trips reported by TLC (those are the trips that would not have happened without ridesharing, and the alternative modes that would have been used instead are to be determined for them). This way, the dependence of the model on the historic simulation testbed data is limited to estimating the likelihood of the parameters.
We analyzed the mode-shift (if shared FHV trips were to be facilitated by the second-choice mode in each scenario) simulated by the model with different parameters weighted by the model fit likelihood to determine the anticipated effect of shared FHV on the NYC transportation system. The mode-shift (i.e. percentage of the observed shared FHV trips that would have been facilitated by public transportation, walking, taxi, FHV, and private vehicles) is reported in Fig. 5. The model-based uncertainties seem relatively small, highlighting the robustness of the pattern.
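The second-choice reassignment behind this mode-shift estimate can be sketched as follows. This is a minimal illustration under stated assumptions: the per-traveler utility dictionaries and the function name are hypothetical, and the likelihood weighting across parameter sets is omitted for brevity.

```python
from collections import Counter

def second_choice_shift(utilities_per_traveler, removed="shared FHV"):
    """Share of `removed`-mode travelers reassigned to each remaining mode.

    `utilities_per_traveler` is a list of {mode: utility} dictionaries
    (hypothetical inputs); higher utility means a more attractive mode.
    """
    counts = Counter()
    for utilities in utilities_per_traveler:
        best = max(utilities, key=utilities.get)
        if best != removed:
            continue  # this traveler is unaffected by the counterfactual
        # Drop the removed mode and pick the best of what is left.
        remaining = {m: u for m, u in utilities.items() if m != removed}
        counts[max(remaining, key=remaining.get)] += 1
    total = sum(counts.values())
    return {mode: n / total for mode, n in counts.items()} if total else {}

# Hypothetical utilities for four simulated travelers.
travelers = [
    {"transit": -1.0, "taxi": -2.0, "shared FHV": -0.5},
    {"transit": -3.0, "taxi": -1.5, "shared FHV": -1.0},
    {"transit": -1.0, "taxi": -2.0, "shared FHV": -4.0},
    {"transit": -1.0, "taxi": -2.0, "shared FHV": -0.2},
]
shift = second_choice_shift(travelers)
```

Only travelers whose first choice is the removed mode enter the result, which mirrors the idea that the observed shared FHV trips are the ones to be redistributed.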
As one would expect, a majority of the shared FHV trips would have been facilitated by FHV and taxi as the closest alternatives. Together with driving, this adds up to nearly 70%. However, around 30% of the trips have actually replaced transit and walking. So while the majority of the shared FHV rides potentially (in case ridesharing actually occurred) cut the traffic by combining trips that would otherwise involve individual driving, around 30% of those trips replace non-driving mobility, thereby increasing the traffic.
On the aggregate citywide scale, we observe a net travel time decrease of 1.77% (95% confidence interval: 1.71%-1.83%) and a net mileage increase of 1.14% (95% confidence interval: 1.06%-1.22%), even if we assume that each shared FHV trip has actually combined two trips. Unfortunately, we do not have ground-truth data on that, so this likely represents an optimistic scenario in terms of the traffic impact, as some shared FHV might still serve individual passengers, while sharing more than two trips at once seems to be rather rare. Scaled to the yearly citywide ridership of 2019, this corresponds to more than 495,000 h saved for NYC commuters at the price of 940,000 extra miles driven citywide over the year. So on average, every hour saved comes at the price of a traffic increase of 1.9 miles. The net decrease in travel times comes mainly from the reduction of about 14 M transit trips. The extra miles driven translate to close to 47,000 extra gallons of fuel, emitting around 420 tons of carbon dioxide, assuming 9 kg of CO2 emitted per gallon of gas (data from the U.S. Environmental Protection Agency (United 2022)). In terms of economic impacts, the mode shift accounts for a citywide time-cost reduction of $4.72 M.
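The conversions in this paragraph can be checked with a few lines of arithmetic. The inputs are the figures reported above; the fuel economy is not stated in the text and is only derived here from the reported miles and gallons.

```python
# Figures reported in the text.
HOURS_SAVED = 495_000        # yearly hours saved by commuters
EXTRA_MILES = 940_000        # yearly extra miles driven
EXTRA_GALLONS = 47_000       # yearly extra fuel consumed
KG_CO2_PER_GALLON = 9        # emission factor quoted from the EPA

# Extra miles driven per hour of travel time saved.
miles_per_hour_saved = EXTRA_MILES / HOURS_SAVED

# Extra emissions in metric tons (1 ton = 1000 kg).
co2_tons = EXTRA_GALLONS * KG_CO2_PER_GALLON / 1000

# Fuel economy implied by the reported miles and gallons (derived, not stated).
implied_miles_per_gallon = EXTRA_MILES / EXTRA_GALLONS

print(round(miles_per_hour_saved, 1), co2_tons, implied_miles_per_gallon)
```

The results are about 1.9 miles per hour saved and 423 tons of CO2, in line with the rounded values quoted in the text, with an implied fuel economy of 20 miles per gallon.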
While shared FHV cause an overall travel time decrease and traffic increase across the city, those impacts are highly uneven. On the level of individual taxi zones, the largest travel time decrease, of up to 8%, occurred in inner areas of Brooklyn, Queens and Staten Island, which seem to benefit the most (Fig. 6), as the new, relatively affordable commute option has likely bridged local gaps in transportation accessibility. Some areas, such as the airports, saw the opposite effect of up to an 8% increase in travel time, which can be related to using shared FHV as a replacement for the more expensive taxi and FHV services heavily used in such locations, where trips are generally lengthy and expensive.
By providing individual simulations with respect to commuter wealth, the model allows us to analyze the equitability of the impacts across urban populations. We observe the most significant changes in the percent difference in mileage for the low-income groups (Fig. 7), while the highest changes in travel times are observed for higher-income groups (>$100k annual income). For the high-income groups, the majority of shared FHV trips come from transit and driving, so there is an increase in mileage from the switch from transit to shared FHV and, at the same time, a decrease from the switch from driving to shared FHV. In the case of low-income groups (<$60k annual income), the mileage increase comes from shared FHV trips replacing walking and transit. In short, the shared FHV service appears to be the most efficient for the wealthier in terms of the tradeoff between improved travel time and the traffic footprint, while its use by low-income passengers causes a much heavier traffic footprint with a smaller travel time improvement. Additionally, we observe that the mode-shift differences across income groups are significant with respect to the model-based uncertainties.

Manhattan Congestion Pricing Impact
Another use case is the impact of a new pricing policy, the Manhattan Congestion surcharge, which adds a fixed cost to taxis ($2.50), FHV ($2.75) and shared FHV ($0.75/passenger) for all trips originating in Manhattan. For shared FHV, we assumed an average of 2 passengers per ride at a time, so the total cost added was $1.50. According to the model simulation, on a city-wide scale, we observe an increase of 1.09% in travel times and a 0.87% decrease in mileage, which can be attributed to lower usage of taxis and FHV and the mode-shift to alternative non-driving modes. Looking at the number of reduced trips across modes, we observe an almost equal drop across taxis and FHVs, although the highest reduction is seen for shared FHV. Almost 60% of the reduced trips are accommodated by transit, which translates into a $16 M projected increase in revenue for the MTA. Driving and walking accommodate 28% and 12% of the reduced trips, respectively (Fig. 8).
Scaled to the total 2019 taxi+FHV ridership, the decrease in the number of taxi and FHV trips accounts for around 681,000 fewer miles driven, which comes at a net travel time increase of 329,000 h. The decrease in driving mileage causes a revenue loss of $19 M for taxis and $11 M for FHV (shared+non-shared), due to the drop in the number of taxi+FHV trips originating in Manhattan, although the net revenue increase for taxi+FHV services is $119 M from the increased prices per trip. This further translates into a citywide economic impact of a $2.7 M time-cost value increase after the Manhattan congestion charge is added.
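The surcharge rule described above can be written as a small fare adjustment. The surcharge amounts and the two-passenger assumption come from the text; the function name and signature are hypothetical.

```python
# Fixed per-trip surcharges for trips originating in Manhattan (from the text).
SURCHARGE_PER_TRIP = {"taxi": 2.50, "FHV": 2.75}
SHARED_FHV_PER_PASSENGER = 0.75
ASSUMED_SHARED_PASSENGERS = 2  # average occupancy assumed in the text

def fare_with_surcharge(mode, base_fare, origin_in_manhattan):
    """Return the trip cost after applying the congestion surcharge."""
    if not origin_in_manhattan:
        return base_fare  # the surcharge applies only to Manhattan origins
    if mode == "shared FHV":
        # $0.75 per passenger times 2 passengers adds $1.50 per ride.
        return base_fare + SHARED_FHV_PER_PASSENGER * ASSUMED_SHARED_PASSENGERS
    # Modes without a surcharge (transit, walking, driving) are unchanged.
    return base_fare + SURCHARGE_PER_TRIP.get(mode, 0.0)

print(fare_with_surcharge("taxi", 10.0, True))        # 12.5
print(fare_with_surcharge("shared FHV", 8.0, True))   # 9.5
print(fare_with_surcharge("transit", 2.75, True))     # 2.75
```

Re-running the mode-choice simulation with these adjusted costs, and comparing against the unadjusted run, gives the kind of mode-shift reported here.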
From an equitability perspective, we observe the most dramatic changes in the percent difference in travel times and mileage for the high-income groups (Fig. 9), meaning that the commute choices of the richest are affected the most. Compared to the low-income population, we see an increase of about 1 percentage point in travel times for high-income groups. The same is observed for total mileage driven, where the decrease is about 0.8% lower for low-income populations than for high-income groups. This makes sense, as taxi and FHV ridership is seen across the high-income population. The highest mileage cut comes from the top mode switching from FHVs and taxi to transit. This change is seen the most for the $100k-$150k income group, whereas for the >$150k income groups the mileage cut decreases, as the top mode choice after the congestion surcharge is private car instead of transit. The mileage cut is significantly smaller for lower-income groups, as their number of taxi and FHV trips is low to begin with.
With the congestion surcharge, the top mode choice becomes transit/walking, but the net changes in the number of trips are low compared to the higher-income groups. The highest change among low-income groups is observed for the $60k-$100k group, where the top mode choice switches from FHV/shared FHV to transit after the congestion charge is introduced. The same trend is seen for travel times, where the rich see the highest time increase owing to their switch from taxis/FHVs to transit. In terms of spatial impact, the biggest impact is seen in the high-income neighborhoods of Manhattan, specifically the Lower East Side, Upper East Side and Upper West Side. Compared to upper Manhattan neighborhoods like East Harlem and Washington Heights, the impacts on both travel times and mileage are relatively higher in Midtown and Lower Manhattan.
The relatively low changes in total travel times and mileage for the whole city can be explained by the low total proportion of taxi trips present in the data. Together, taxis and FHVs make up around 7% of the total mobility in the C2SMART simulations. Thus any monetary changes in fares in taxi+FHVs for one borough (Manhattan) translate into a low change in net times and mileages.
So in general the policy seems to be efficient in causing a statistically significant decrease in the overall traffic, while the vulnerable populations seem to be the least affected overall.

Consensus Across C2SMART Test Bed and LEHD Mobility
With the intervention scenario of the introduction of shared FHV, we can expect an increase in citywide net travel mileage and a decrease in travel times. But as the C2SMART simulation testbed data might not be a perfect representation of true mobility within the city, it is important to test our model across different mobility data sets to see how much the results might differ and whether the model gives a reasonable estimate across different representations of mobility. Thus we decided to also test it on one other data set for NYC, the LEHD mobility data. The LEHD data has mobility information for 47,000 O-D pairs, compared to 21,000 pairs from C2SMART. Running the simulation model with the best-likelihood parameters on the LEHD pairs, we observed a net citywide travel time decrease of 1.91% and a net mileage increase of 1.29% upon introducing the scenario where shared FHV was available as a transport mode.
So the resulting impacts from the simulation model mildly depend on the data source of mobility demand (and compared to other quantifiable sources of the assessment uncertainty, the data source remains the most significant one), but generally remain consistent and close to the range of percent changes originally obtained from the C2SMART simulation testbed data.

Conclusions
This research work constructed the simulation modeling and probabilistic inference framework suitable for the assessment of city-scale impacts of transportation innovations and policies on the transportation system along with the associated environmental and economic implications with respect to the uncertainty of such impacts. The key aspect of this work lies in the framework's ability to learn from diverse and possibly inconsistent datasets (such as historic transportation surveys and actual taxi and FHV ridership) providing partial information on urban mobility, stepwise gaining information from either source. This provides an important way to model and measure impacts in the events of incomplete mobility data, which is the case in many urban locations. The framework's applicability is illustrated in two use cases: the introduction of shared FHV in NYC and Manhattan Congestion surcharge.
Broadly, our results indicate that shared mobility helped to decrease travel times by between 1% and 2% for all categories of passengers. However, it does so while increasing the traffic by 0.5-1.5%: decreases from trip sharing seem to be offset by a growing number of riders due to the increased affordability of the service. It works more efficiently for high-income categories of passengers, providing a higher travel time decrease with a lower mileage increase. On the other hand, the Manhattan congestion surcharge noticeably decreases the FHV traffic, by up to 1%; however, it does so at the price of increased travel time, in particular for high-income travelers, who are perhaps the most frequent users of the taxis and FHVs at which the surcharge is targeted. The uncertainty analysis confirms the statistical significance of the impacts as well as their heterogeneity across populations. The impacts above are further translated into the total traffic, gas consumption, emissions, monetary savings and public transit earnings implications.
While we hope that this study can be a proof of concept for other cities considering shared mobility, congestion pricing, or other similar interventions, it should be noted that New York City's transportation system is unique in many ways, and makes a switch to public transportation more practical than in many other cities. In addition, while the impact assessments in the paper provide proof-of-concept use cases for the proposed framework, further work may be needed to develop a comprehensive and accurate picture of the mode choices and mode shift. The outdated survey-based ground truth refined by C2SMART simulation testbed by itself might not be fully representative of actual urban mobility. The current landscape of urban mobility might differ significantly from the RHTS and similar available transportation surveys conducted in the pre-FHV era. And although the historic data is only used as part of the parameter estimation for the model, while the scale of the impacts is based on the up-to-date TLC data, the mode-choice proportions and reliability of the impact assessment might still get affected.
Another limitation of the study is the simplicity of the utility function as presently considered. Accounting only for travel time and cost, it may not reflect all the critical factors in how a person makes a transportation choice. The present utility function would work very well in an ideal world where everyone worked out the economics of their commute daily, but in reality transportation choices are influenced by habits, comfort preferences, and other human factors, as well as environmental conditions. We focused our study on commuters because it allowed us to infer demographic and transportation demand information, but the morning commute makes up a small part of New York City's complex transportation system. The collection of more comprehensive ground truth data and accounting for more aspects of individual choices could further improve the reliability of the impact assessment. Finally, the validity of the mode-choice and impact assessments is conditional on the validity of the model: although an uncertainty assessment related to inaccuracies in the data as well as the model fit allows us to assess the degree of confidence in such an assessment, the specific model behind mode choices needs to be assumed.
With that, the main contribution of the paper is a proof-of-concept demonstration that a robust data-driven probabilistic modeling framework incorporating incomplete and inconsistent available mobility data is capable of assessing the holistic picture of the urban commute and the impact of transportation interventions with a reasonable degree of certainty.

The Data
Data from the following sources was used to determine the transportation demand between O-D pairs and the wage distribution of commuters: the C2SMART simulation test bed, the Regional Household Travel Survey (RHTS) and the NYC Taxi and Limousine Commission (TLC) trip records. Initial exploration of the RHTS/TLC data supported our choice to focus on the commute hours (i.e., 7 a.m.-10 a.m. and 5 p.m.-8 p.m.).

C2SMART Simulation Test Bed
This data (He et al. 2020) provides travel demand for 4 modes: taxi, transit, walking and driving. The 27.3 million trips are aggregated at the level of Traffic Analysis Zones (TAZ), which are further aggregated to the taxi zone level for our models. In total, the data covers trips from 20,834 unique origin-destination pairs, covering 250 of 263 taxi zones in New York City. The data also contains trips by modes such as bike, carpool and shared bikes, which we do not include in our analyses. Nearly half of the trips are by transit, followed by driving, walking and taxi.

Regional Household Travel Survey
The data was used to reflect reported choices of transportation modes by commuters, serving as partial ground truth for fitting the model. Census tract level estimate data was pulled in order to generate probabilities of a commuter within each taxi zone having wages within each Census income bracket. Commute information from the RHTS was reported by respondents as the form of transportation they used "most days" for commuting to work, which allows us to estimate the percentage of people in each taxi zone that regularly choose each form of transportation for trips to work. Collectively, the RHTS data allows us to estimate the probability of any given resident of each origin zone choosing each distinct mode of transportation, given the commuter's income.

Taxi Trips
New York City's Taxi and Limousine Commission provides free access to their database of taxi trip data with ride-level granularity. These data give a sense of the high volume traffic areas in the city, as well as the distribution of trips by the time of day. They are crucial to estimate current modal distribution and, when used in conjunction with demographic RHTS data, to predict mode shift under various scenarios and within varying demographics. We found that the actual TLC data was most correlated with the LEHD Origin-Destination Employment Statistics (LODES) demand for the commute hours, and this correlation allows us to assume that the extracted trips are largely originating from the rider's home taxi zone, enabling us to infer the commute trips by taxi from TLC data between taxi zone pairs (O-D pairs).

For-Hire-Vehicles Trips
This data, also provided by TLC, consists of individual trip records for different FHV services (e.g. Uber, Lyft, etc.). For analysis, we have separated the FHV trips into FHV and shared FHV, and aggregated both at the level of pick-up and drop-off zone, date and hour within the commute hours. The aggregated data contains the attributes date, pick-up location id, drop-off location id, average trip duration (sec), trip count, and surcharge flag (FHV and shared FHV). A typical month of data includes 15-20 million rides, of which around 20% are shared FHV and 80% non-shared FHV. We found that the zones with high trip counts are mostly concentrated in lower/middle Manhattan and downtown Brooklyn for both pick-up and drop-off locations. This finding suggests that a large portion of our model simulations will reflect the trips in these areas, so we need to consider the regional demographic information of these areas, as well as their functions (e.g. shopping, parks, etc.), more carefully to avoid any false assumptions when profiling people's choices.
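The aggregation step can be sketched with the standard library. This is a minimal illustration, not the pipeline used in the paper: the tuple layout of a trip record is an assumption, and grouping is done by date rather than by individual hour for brevity.

```python
from collections import defaultdict
from datetime import datetime

# Commute hours used in the paper: 7-10 a.m. and 5-8 p.m.
COMMUTE_HOURS = {7, 8, 9, 17, 18, 19}

def aggregate_trips(trips):
    """Aggregate trips by (date, pick-up zone, drop-off zone, shared flag).

    Each trip is a hypothetical tuple:
    (pickup_time, pu_zone, do_zone, duration_sec, is_shared).
    """
    durations = defaultdict(list)
    for pickup_time, pu_zone, do_zone, duration_sec, is_shared in trips:
        if pickup_time.hour not in COMMUTE_HOURS:
            continue  # keep commute-hour trips only
        key = (pickup_time.date(), pu_zone, do_zone, is_shared)
        durations[key].append(duration_sec)
    return {
        key: {"trip_count": len(d), "avg_duration_sec": sum(d) / len(d)}
        for key, d in durations.items()
    }

# Hypothetical records with made-up zone ids.
trips = [
    (datetime(2019, 3, 4, 8, 15), 236, 161, 600, False),
    (datetime(2019, 3, 4, 9, 5), 236, 161, 900, False),
    (datetime(2019, 3, 4, 13, 0), 236, 161, 500, False),  # outside commute hours
    (datetime(2019, 3, 4, 18, 30), 161, 236, 700, True),
]
aggregated = aggregate_trips(trips)
```

Keeping the shared flag in the key keeps FHV and shared FHV separate, matching the split described above.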

Travel Times and Costs
The key elements of our model required us to estimate the time and cost associated with trips between each taxi zone pair for each of the six transportation modes (taxi, FHV, shared FHV, public transportation, walking, private vehicle).
Additionally, in order to evaluate the utility of each transportation mode, we obtained the rider's wages from RHTS. We have aggregated all data sources to the TLC taxi zone level and the final version of the data which is used in the model includes pickup and drop-off locations, commute duration, price, and the wage distribution for that origin-destination pair. We considered mean fare amount, trip duration, and their standard deviations to inform data uncertainty.

API Services
HERE Technologies and Google are companies that provide mapping and location data. In order to assess the travel time, cost and overall utility of each transportation mode considered in the model for a given O-D pair, we use the HERE and Google REST APIs, which provide information such as maps, routing, geocoding, places, positioning, traffic, transit, and weather. The public transit data can be acquired via a dedicated Public Transit API (HERE Maps). Using HTTP GET requests, route information such as the trip duration, the number of transfers, and the mode for each transfer is retrieved given the departure and arrival locations, the departure time, and the specific mode. To limit the impact of any special circumstances on these estimates, we retrieved the data on several occasions and took an average. For pairs with no route information available for a given mode, we consider the corresponding time and distance values to be infinite. This situation can arise where there is no way of commuting from one zone to another (e.g. islands).
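The averaging of repeated retrievals, including the treatment of pairs without a route, can be sketched as below. No real API is called here: the input is simply a list of previously retrieved travel times, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev

def summarize_retrievals(samples):
    """Return (average, standard deviation) of repeated retrievals.

    `samples` holds the travel times (or costs) returned for one O-D pair
    and mode on different occasions; an empty list means no route exists.
    """
    if not samples:
        return math.inf, 0.0  # unreachable pair: infinite time or distance
    spread = stdev(samples) if len(samples) > 1 else 0.0
    return mean(samples), spread

# Hypothetical transit times in minutes from three retrievals.
avg_minutes, sd_minutes = summarize_retrievals([30, 34, 32])
no_route = summarize_retrievals([])
```

An infinite travel time makes a mode effectively unavailable for that pair, and the standard deviation can feed the data-uncertainty analysis.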

LEHD/ACS Data
Another data source is LEHD (Longitudinal Employer-Household Dynamics). The LEHD program is part of the Center for Economic Studies at the U.S. Census Bureau and produces cost-effective, public-use information combining federal, state and Census Bureau data. It has commuter information for 11 wage groups at the taxi zone level. Also present are the population's choices of transportation (taxi, walking, transit, driving, biking, carpool etc.) from the American Community Survey (ACS) data. Information regarding for-hire vehicles is missing, so only the four transport modes of interest are present. Finally, since only origin-based commuter information is present, we cannot explicitly obtain true choices for an origin-destination pair.

Data Metrics
The travel flow information from both the C2SMART test bed and the TLC data covers almost all of NYC's taxi zones (250 of 263, 95%). Overall, we are able to access mobility information for 20,575 unique O-D pairs across the city that are common to both data sources. While C2SMART provides estimates for 4 travel modes (transit, taxi, walking and driving), the information for FHV/shared FHV and taxis is sourced from TLC. After merging the flows with the travel time/cost information and the population wage metrics from LEHD, the typical dataset has the variables needed to estimate the utilities used as the model inputs (Fig. 10).
Here, "pulocation" and "dolocation" identify the O-D pair, "tmode" corresponds to the travel mode, "duration" is the travel time (in minutes), "price" is the monetary cost (in USD), and "w10000", "w15000", … are the wage brackets representing the population distribution for each wage group.
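The merge that produces this dataset can be sketched as follows. The column names "pulocation", "dolocation", "tmode", "duration", "price" and the wage brackets follow the text; the "trips" field, the input layout and the zone ids are hypothetical.

```python
def merge_records(flows, travel_costs, wage_shares):
    """Join flows, travel time/cost and wage distributions per O-D pair.

    flows:        {(pulocation, dolocation, tmode): trip count}
    travel_costs: {(pulocation, dolocation, tmode): (duration_min, price_usd)}
    wage_shares:  {(pulocation, dolocation): {"w10000": share, ...}}
    Rows missing from any source are dropped, so only O-D pairs common to
    all sources remain.
    """
    rows = []
    for (pu, do, tmode), trips in flows.items():
        cost = travel_costs.get((pu, do, tmode))
        wages = wage_shares.get((pu, do))
        if cost is None or wages is None:
            continue
        duration, price = cost
        rows.append({
            "pulocation": pu, "dolocation": do, "tmode": tmode,
            "trips": trips, "duration": duration, "price": price,
            **wages,
        })
    return rows

# Hypothetical inputs.
flows = {(4, 79, "transit"): 120, (4, 79, "taxi"): 15, (4, 200, "transit"): 8}
travel_costs = {(4, 79, "transit"): (22.0, 2.75), (4, 79, "taxi"): (14.0, 12.5)}
wage_shares = {(4, 79): {"w10000": 0.1, "w15000": 0.9}}
rows = merge_records(flows, travel_costs, wage_shares)
```

Dropping unmatched rows is the same inner-join behavior that leaves only the O-D pairs shared by the data sources.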

Model Performance Comparison With Different Nesting Versions
In order to estimate the best nesting structure among the 6 travel modes in question, we compared the simulation model's performance with different nesting structures in its configuration. The model specification is the same as described in the paper, and it depends on the parameters of the utility score U and the nesting correlation parameters corr_nest, the number of which depends on the nesting configuration of the model. "Appendix A": Fig. 2 shows the model's performance in terms of the best R² score (computed for the best nesting of the given complexity over the total vs. estimated number of trips for each mode) against the number of nesting parameters used in the model (which is synonymous with the nesting complexity, or the number of nests among the 6 modes). We observe that the best nesting configurations give significantly better performance than the model without nesting as we increase the number of nests. However, as we increase the number of nests in the model configuration, the number of model hyper-parameters corr_nest also increases, so it is important to limit the number of parameters to a reasonable value in conjunction with the overall performance improvement we get. We also observe that the performance gets significantly better as we move from 1 nest to 2 nests in the model. However, going from 2 to 3 nests does not provide a significant improvement, indicating that further improvement in the model performance might not justify the increase in the model complexity. We note that the best two-nest configuration gives an improvement of 0.05 in R² score compared to the best one-nest model version. The performance of different nesting approaches is given in Supplementary Table 1. Among the model configurations with two nests, we note that two nesting approaches give optimal performance: (a) FHV+shared FHV and taxi+driving nests and (b) FHV+shared FHV and taxi+(FHV/SFHV) nests.
The former nest version performs slightly better than the latter in overall R² score (0.877 vs. 0.872), but the latter performs better in the estimation of the taxi trips in our data. The taxi trips are relatively few compared to other modes in our data but crucial for judging the model's performance in the context of our problem. We observe an error of 9% in taxi trip prediction for nesting structure (b), compared to 14% for (a). Overall, it is fair to conclude that a configuration with two nests provides the optimal balance between model performance and complexity. Furthermore, we find that the FHV+SFHV and taxi+(FHV/SFHV) nesting version gives optimal results among the two-nest configurations and is also more intuitive than the other versions.
Additionally, as the nesting structures (a) and (b) demonstrate very close modeling performance, we assessed the citywide impacts in travel time and mileage resulting from the alternative FHV+shared FHV and taxi + driving nest version of the model. We observed a net travel time decrease of 1.81% and mileage increase of 1.16% after introducing shared FHV as a travel option. These are very close to the impacts assessed from our version of the model which calculated 1.77% travel time decrease and 1.14% increase in mileage. So one may conclude that the final impact assessment does not depend much on choosing either of the two optimal nesting configurations (Fig. 11).
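For reference, the two metrics used in this comparison can be computed as sketched below. The coefficient-of-determination formula is the standard one and is assumed to match the R² reported here; the trip counts are made up for illustration.

```python
def r2_score(observed, estimated):
    """Coefficient of determination between observed and estimated values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def relative_error(observed, estimated):
    """Absolute error of an estimate as a fraction of the observed value."""
    return abs(estimated - observed) / observed

# Hypothetical trip totals for four modes (observed vs. model estimate).
observed = [100, 50, 30, 20]
estimated = [90, 55, 35, 20]
r2 = r2_score(observed, estimated)
# Error for a single mode, analogous to the taxi trip error quoted above.
first_mode_error = relative_error(observed[0], estimated[0])
```

A configuration can score well on overall R² and still estimate a low-volume mode such as taxi poorly, which is why both numbers are examined.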

Uncertainty Analysis
Uncertainty in the data is accounted for by incorporating the travel time and fares random distributions into the model and running the simulations multiple times. Then we calculate the mean and variance of trips for each of the four  Table 2.
Model-based uncertainty is analyzed using the probabilistic approach described in the paper, in which the results of different model simulations are weighted by their model fit likelihoods.
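The data-uncertainty procedure (random travel times and fares, repeated simulation runs, then the mean and variance of trips per mode) can be sketched as follows. This is only an illustration under strong assumptions: normal distributions for times and fares, a generalized cost of price plus wage times travel time, and a deterministic cheapest-mode choice, none of which is confirmed as the paper's specification.

```python
import random
from statistics import mean, pvariance

def simulate_trip_counts(modes, n_travelers, n_runs, wage_per_min, seed=0):
    """Return {mode: (mean trips, variance of trips)} over repeated runs.

    modes: {mode: (time_mean, time_sd, price_mean, price_sd)}, hypothetical
    per-mode travel time (minutes) and fare (USD) statistics.
    """
    rng = random.Random(seed)
    counts_per_run = {m: [] for m in modes}
    for _ in range(n_runs):
        counts = dict.fromkeys(modes, 0)
        for _ in range(n_travelers):
            costs = {}
            for m, (t_mu, t_sd, p_mu, p_sd) in modes.items():
                # Draw a travel time and fare, truncated at zero.
                minutes = max(0.0, rng.gauss(t_mu, t_sd))
                price = max(0.0, rng.gauss(p_mu, p_sd))
                # Assumed generalized cost: fare plus monetized travel time.
                costs[m] = price + wage_per_min * minutes
            counts[min(costs, key=costs.get)] += 1
        for m in modes:
            counts_per_run[m].append(counts[m])
    return {m: (mean(c), pvariance(c)) for m, c in counts_per_run.items()}

# Hypothetical inputs for two modes on one O-D pair.
modes = {"transit": (40.0, 5.0, 2.75, 0.0), "taxi": (25.0, 5.0, 30.0, 4.0)}
stats = simulate_trip_counts(modes, n_travelers=200, n_runs=20, wage_per_min=0.5)
```

The variance across runs indicates how much of the spread in trip counts is attributable to noise in the time and fare inputs alone.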