1 Introduction

The emergence and widespread use of Mobility-on-Demand systems in recent years have had a profound impact on urban transportation in a variety of ways. Amongst other advantages, these systems have the potential to mitigate congestion costs (such as commute times, fuel usage, accident propensity, etc.), enable marketplace optimization for both passengers and drivers, and provide significant environmental benefits. A prominent example is ridesharing (Footnote 1). Ridesharing, however, also causes some passenger disruption, due to compromised flexibility, increased travel time, and loss of privacy and convenience. Thus, at the core of any ridesharing platform lies the need for an efficient balance between the incentives of the passengers and those of the platform (Footnote 2).

Optimizing the usage of transportation resources is not an easy task, especially for cities like New York, with more than 13,000 taxis and 270 ride requests per minute. For example, (Buchholz 2018) estimates that 45,000 customer requests remain unmet each day in New York, despite the fact that approximately 5,000 taxis are vacant at any given time. In fact, on aggregate, drivers spend about \(47\%\) of their time not serving any passengers (Buchholz 2018). Moreover, up to \(80\%\) of the taxi rides in Manhattan could be shared by two riders, with only a few minutes' increase in travel time (Alonso-Mora et al. 2017a). A more sophisticated matching policy could mitigate these costs by better allocating the available supply to demand. As a second example, coordinated vehicle relocation could be employed to bridge the spatial supply/demand imbalance and improve passenger satisfaction and Quality of Service (QoS) metrics. Drivers often relocate to find passengers: \(61.3\%\) of trips begin in a different neighborhood than the drop-off location of the last passenger (Buchholz 2018), yet drivers currently move without any coordinated search behavior, resulting in spatial search frictions.

Given the importance of the problem for transportation and the economy, it is not surprising that the related literature is populated with a plethora of papers, proposing different solutions along different axes, such as efficiency (Santi et al. 2014; Alonso-Mora et al. 2017a; Agatz et al. 2011; Ashlagi et al. 2017; Huang et al. 2019; Bienkowski et al. 2018; Dickerson et al. 2018; Fagnant and Kockelman 2018; Lokhandwala and Cai 2018), platform revenue (Banerjee et al. 2017; Chen et al. 2019), driver incentives (Ma et al. 2019; Yuen et al. 2019; Garg and Nazerzadeh 2020), fairness (Lesmana et al. 2019; Sühr et al. 2019; Xu and Xu 2020; Nanda et al. 2020), reliability (Fielbaum and Alonso-Mora 2020; Alonso-González et al. 2020), or analyzing the effects on sharing economies (Kooti et al. 2017; Jiang et al. 2018; Ghili and Kumar 2020; Asadpour et al. 2020).

It is well documented (e.g., (Lesmana et al. 2019)) that these different desiderata are often in conflict (e.g., fairness vs. revenue), and therefore we should not expect a single ridesharing algorithm to be superior for all of them; rather, the design of such algorithms should be contingent on the goals of the designer, and on which of those properties they consider more important for the application at hand. Thus, we want a flexible and adaptable design, able to work well with respect to any such set of objectives with ‘a few tweaks’.

To enable this, we propose a modular approach to algorithm design in ridesharing, in which an algorithm consists of three different components, namely (a) matching passengers with other passengers, (b) assigning rides to vehicles, and (c) vehicle relocation, in which taxis, when not serving passengers, move close to positions where requests are expected to appear in the near future. Each component can then be seen as a different (sub)-algorithm, and those algorithms can be chosen to be geared towards the specific objectives of the designer. In fact, our approach draws inspiration from several successful algorithms in the ridesharing literature, such as the well-known High Capacity algorithm of (Alonso-Mora et al. 2017a), or the recent algorithm of (Riley et al. 2020), both of which can be cast as examples of algorithms in this modular design setting.

Fig. 1 The three components of a CAR

1.1 Our contributions

1.1.1 CARs

We initiate the systematic study of Component Algorithms for Ridesharing (CARs). A CAR is an algorithm consisting of three sub-algorithms, each solving one of the following components of the ridesharing problem (Fig. 1).

  • Matching passengers to other passengers. For this component, the underlying algorithmic problem is that of Online Maximum Weight Matching, where the “online” part stems from the fact that passenger requests appear at different points in time, and we have to account for the future when deciding which passengers to match. As such, we have a lot of classic as well as modern matching algorithms at our disposal.

  • Assigning rides to vehicles. For this component, the underlying algorithmic problem can either be seen as an Online Maximum Weight Bipartite Matching, or as an instance of the k-Taxi problem and, by extension, of the famous k-Server problem from the literature on online algorithms. Similarly to above, there is a large set of classic and modern solutions that one can plug in as components for this part.

  • Vehicle Relocation. For this component, the objective is to use historical data to predict the locations of future requests and move idle taxis closer to those locations. From an algorithmic standpoint, this problem can be cast either as a k-Facility Location problem, concerned with the optimal placement of facilities (taxis) to minimize transportation costs, or as an Online Maximum Weight Matching problem on the history of requests.

1.1.2 Evaluation platform

While several papers in the literature provide evaluations on realistic datasets (e.g., see (Riley et al. 2020; Santi et al. 2014; Alonso-Mora et al. 2017a; Agatz et al. 2011; Santos and Xavier 2013; Danassis et al. 2019)), they either (a) only consider parts of the ridesharing problem and therefore do not propose end-to-end solutions, (b) only evaluate a few newly proposed algorithms against some basic baselines, (c) only consider a limited number of performance metrics, predominantly related to overall efficiency and often without regard to QoS metrics, or (d) perform evaluations on a much smaller scale, thus not capturing the real-life complexity of the problem. In contrast, our work provides a comprehensive evaluation of a large number of proposed algorithms, over multiple different metrics, and for real-world-scale, end-to-end problems. Specifically:

  • We meticulously design an experimental setting to resemble reality as closely as possible in every aspect of the problem. To the best of our knowledge, this is the first end-to-end experimental evaluation of this magnitude, and it could serve as common ground for evaluating future work in a setting designed to capture real-world challenges.

  • We evaluate our CARs for a host of different objectives (10 metrics) related to global efficiency, complexity, passenger, and platform incentives (see Table 2).

We focus on (shared) rides of at most two requests (i.e., vehicles of capacity two) for two reasons: complexity and passenger satisfaction, as we explain in detail in Sect. 3.2.4.

1.1.3 Results

Applying the modular approach advocated above, we design a large set of CARs, based on different classic and modern algorithms for the different components (14 in total, see Table 1). The main take-aways are the following:

  • CARs based on offline, in-batches maximum-weight matching approaches perform well on global efficiency and passenger-related metrics.

  • CARs based on k-server algorithms (e.g., the Balance algorithm (Manasse et al. 1990)) perform well on platform-related metrics.

  • Lightweight CARs perform better in real-world, large-scale settings since real-time constraints dictate short planning windows which can diminish the benefit of cumbersome optimization techniques compared to myopic approaches.

  • Simple, lightweight relocation schemes can significantly improve Quality of Service metrics by up to \(50\%\).

  • We identify a scalable, on-device CAR based on ALMA (Danassis et al. 2019) that performs well across the board.

Our findings provide convincing evidence to a ridesharing platform as to which combination of components would be most suitable for a given set of objectives.

2 Discussion and related work

The literature on ridesharing is rather extensive; here we only highlight the key algorithmic principles in our design of CARs.

The dynamic ridesharing – and the closely related dynamic dial-a-ride (see (Agatz et al. 2012)) – problem has drawn the attention of diverse disciplines over the past few years, from operations research to transportation engineering, and computer science. Solution approaches include constrained optimization (Qian et al. 2017; Simonetto et al. 2019; Agatz et al. 2011; Alonso-Mora et al. 2017a; Riley et al. 2020), weighted matching (Ashlagi et al. 2017; Bei and Zhang 2018; Dickerson et al. 2018; Zhao et al. 2019; Danassis et al. 2019), other heuristics (Qian et al. 2017; Santos and Xavier 2015; Bathla et al. 2018; Lowalekar et al. 2019; Santos and Xavier 2013; Pelzer et al. 2015; Gao et al. 2017; Shah et al. 2020), reinforcement learning (Guériau and Dusparic 2018; Li et al. 2019; He and Shin 2019), or model predictive control (Chen and Cassandras 2019; Riley et al. 2020; Tsao et al. 2019), among others. We refer the interested reader to the following surveys (Agatz et al. 2012; Silwal et al. 2019; Furuhata et al. 2013; Ho et al. 2018; Mourad et al. 2019; Cordeau and Laporte 2007) for a review on the optimization challenges, various algorithmic designs adopted over the years, a classification of existing ridesharing systems, models and algorithms for shared mobility, and finally models and solution methodologies for the dial-a-ride problem, respectively.

As mentioned in the introduction, the key algorithmic components of ridesharing are the following. First, it is an online problem, as the decisions made at some point in time clearly affect the possible decisions in the future; therefore, the literature on online algorithms and competitive analysis (Borodin and El-Yaniv 2005; Manasse et al. 1988) offers clear-cut candidates for CARs. Second, all of the components can be seen as some type of matching, both for bipartite graphs (for matching passengers with taxis, or idle taxis with ‘future’ requests) and for general graphs (for matching passengers to shared rides). In fact, several of the algorithms that have been proposed in the literature for the problem address different variants of online matching.

Finally, ridesharing displays an inherent connection to the k-taxi problem (Coester and Koutsoupias 2019; Buchbinder et al. 2020; Fiat et al. 1994; Kosoresow 1996), which, in turn, is a generalization of the well-known k-server problem (Koutsoupias and Papadimitriou 1995; Koutsoupias 2009) (Footnote 3). In the k-taxi problem, once a request appears (with a source and a destination), one of the k taxis at the platform's disposal must serve it. Viewing shared rides (multiple passengers that have already been matched in a previous step) as requests, one can clearly apply k-taxi (and k-server) algorithms to the ridesharing setting. Granted, k-server algorithms have been designed to operate in a more challenging setting in which (a) requests have to be served immediately, whereas in ridesharing there is normally some leeway in that regard, often at the expense of customer satisfaction, and (b) the positions of requests are typically adversarially chosen, rather than following some distribution, as is the case in reality. Despite this, the fundamental idea behind these algorithms is a pivotal part of ridesharing, as it aims to serve existing requests efficiently while placing the vehicles as well as possible to serve future requests. This is also the main principle behind the relocation strategies for idle taxis.

The algorithms that we consider are appropriate modifications of the most significant ones that have been proposed for the aforementioned key algorithmic primitives of the ridesharing problem, as well as heuristic approaches which are based on the same principles, but were specifically designed with the ridesharing application in mind. We emphasize that such modifications are needed, primarily because many of these algorithms were tailored for sub-problems of the ridesharing setting, and end-to-end solutions in the literature are rather scarce.

Much of the related work in the literature focuses on approaches that are inherently centralized and require knowledge of the full ridesharing network, which makes them rather computationally intensive. As an additional goal of our investigation, we would like to identify solutions that are lightweight, decentralized, and which ideally run on-device. Of course, some hybrid and decentralized approaches for the ridesharing problem have been proposed (e.g., (Simonetto et al. 2019; Guériau and Dusparic 2018)), and several of the algorithms that we include in our experimental evaluation can be implemented in a decentralized manner (e.g., (Giordani et al. 2010; Ismail and Sun 2017; Zavlanos et al. 2008; Bürger et al. 2012)), but that would typically require a larger amount of communication between the agents; in this case, the vehicles. As it turns out though, the ALMA algorithm of (Danassis et al. 2019), which has been designed with precisely these objectives in mind (low computational complexity, scalability, and low communication cost), performs very well across the board with respect to our objectives.

The third component of our CARs is the relocation of idle taxis. Relocation is an important component of a successful ridesharing application. Many studies on shared mobility systems have shown that adopting a relocation strategy can improve system performance in their specific context (Guériau and Dusparic 2018; Vosooghi et al. 2019; Martínez et al. 2017; Bélanger et al. 2016; Ruch et al. 2018; Alonso-Mora et al. 2017a; Buchholz 2018; Lioris et al. 2016; Spieser et al. 2014; Tsao et al. 2019; van Engelen et al. 2018; Wen et al. 2017; Wallar et al. 2018). Strategies include using a short window of known active requests (Alonso-Mora et al. 2017a), historical demand (Guériau and Dusparic 2018; Alonso-Mora et al. 2017b; Fielbaum et al. 2021b; Zhou et al. 2013; Xue et al. 2015; van Engelen et al. 2018), or techniques to predict future demand (Spieser et al. 2016). Yet relocation by nature increases vehicle travel distance, leading to undesirable consequences (economical, environmental, maintenance, management of human resources, etc.), so a balance needs to be struck. Most of the employed relocation approaches are coarse-grained: the network is generally divided into several zones, blocks, etc. (Guériau and Dusparic 2018; Vosooghi et al. 2019; Martínez et al. 2017), and the entities (e.g., the vehicles) move between the zones. However, compared to other shared mobility systems, dynamic ridesharing poses unique challenges that make such coarse-grained approaches inappropriate: most of them are centralized – thus computationally intensive and not scalable –, they might not take into account the actions of other vehicles, potentially leading to over-saturation of high-demand areas, and, most importantly, they are slow to adapt to the highly dynamic nature of the problem (e.g., responding to high demand generated by a concert, or the fact that vehicles remain free for only a few minutes at a time). The problem clearly calls for fine-grained solutions, yet such approaches in the literature are still rather scarce. In this paper, we employ such a fine-grained relocation scheme (similarly to (Alonso-Mora et al. 2017a)), based on matching between the idle taxis and the potential requests, which is better suited for the problem at hand.

Relocation can be viewed either as a k-center or k-Facility Location problem (Guha and Khuller 1999), or as an Online Maximum Weight Matching problem on the history of requests. Given the high complexity of the former problems (both are NP-hard, in fact APX-hard (Hsu and Nemhauser 1979; Feder and Greene 1988)), we have opted for the latter interpretation.

3 Problem statement & modeling

In this section we formally present the Ridesharing problem. To avoid introducing unnecessary notation, we only present the description of the model here; precise notation and details are provided in the respective sections where they are used.

In the Ridesharing problem there is a (potentially infinite) metric space \({\mathcal {X}}\) representing the topology of the environment, equipped with a distance function \(\delta : {\mathcal {X}} \times {\mathcal {X}} \rightarrow \mathbb {R}_{\ge 0}\). Both are known in advance. At any moment, there is a (dynamic) set of available taxi vehicles \({\mathcal {V}}_t\), ready to serve customer requests (i.e., drive to the pick-up and subsequently to the destination location). Between serving requests, vehicles can relocate to locations of potentially higher demand, to mitigate spatial search frictions between drivers. Customer requests appear in an online manner at their respective pick-up locations, wait to potentially be matched into a shared ride, and are finally served by a taxi to their respective destinations. For two requests to be able to share a ride, they must satisfy spatial and temporal constraints. The former dictates that requests should be matched only if there is good spatial overlap between their routes; due to the latter, requests cannot be matched, even with perfect spatial overlap, unless they are both ‘active’ at the same time. Finally, ridesharing is an inherently online problem, as we are unaware of the requests that will appear in the future, and need to make decisions before the requests expire, while taking into account the dynamics of the fleet of taxis.

3.1 Performance metrics

The goal is to minimize the cumulative distance driven by the fleet of taxis while maintaining high Quality of Service (QoS), given that we serve all requests (service guarantee). Serving all requests improves passenger satisfaction and, most importantly, allows us to ground our evaluation in a common scenario, ensuring a fair comparison.

3.1.1 Global metrics


Distance Driven: Minimize the cumulative distance driven by all vehicles to serve all the requests. We chose this objective as it directly correlates with passenger, company, and environmental objectives (minimizing operational cost, delay, and CO\(_2\) emissions, maximizing the number of shared rides, improving QoS, etc.). All of the evaluated algorithms have to serve all the requests, either as shared or single rides.


Complexity: Real-world time constraints dictate that the employed solution produces results in a reasonable time-frame (Footnote 4).

3.1.2 Passenger specific metrics—Quality of Service (QoS)


Time to Pair: Expected time to be paired in a shared ride, i.e., \(\mathbb {E}[t_{\text {paired}} - t_{\text {open}}]\), where \(t_{\text {open}}, t_{\text {paired}}\) denote the time the request appeared, and was paired as a shared ride, respectively. If the request is served as a single ride, then \(t_{\text {paired}}\) refers to the time the algorithm chose to serve it as such.


Time to Pair with Taxi: Expected time to be paired with a taxi, i.e., \(\mathbb {E}[t_{\text {taxi}} - t_{\text {paired}}]\), where \(t_{\text {taxi}}\) denotes the time the (shared) ride was paired with a taxi.


Time to Pick-up: Expected time to passenger pickup, i.e., \(\mathbb {E}[t_{\text {pickup}} - t_{\text {taxi}}]\), where \(t_{\text {pickup}}\) denotes the time the request was picked-up.


Delay: Additional travel time compared to the expected direct travel time (i.e., if the request were served as a single ride instead of a shared ride): \(\mathbb {E}[(t_{\text {dest}} - t_{\text {pickup}}) - (t'_{\text {dest}} - t_{\text {pickup}})]\), where \(t_{\text {dest}}\) and \(t'_{\text {dest}}\) denote the time at which the request reaches its destination, and the time at which it would have reached it as a single ride, respectively.
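
For concreteness, the following sketch (in Python) computes the four metrics above from per-request timestamps; the dictionary keys are illustrative names, not part of our notation.

from statistics import mean

def qos_metrics(requests):
    # Each request is assumed to be a dict of timestamps (in minutes); note that
    # the pick-up time cancels out in the Delay metric.
    return {
        "time_to_pair": mean(r["t_paired"] - r["t_open"] for r in requests),
        "time_to_pair_with_taxi": mean(r["t_taxi"] - r["t_paired"] for r in requests),
        "time_to_pickup": mean(r["t_pickup"] - r["t_taxi"] for r in requests),
        "delay": mean(r["t_dest"] - r["t_dest_single"] for r in requests),
    }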

Research conducted by ridesharing companies shows that passengers’ satisfaction level remains sufficiently high as long as the pick-up time is less than a certain threshold. The latter is corroborated by data on booking cancellation rate against pick-up time (Tang et al. 2017). In other words, passengers would rather have a short pick-up time and long detour, than vice-versa (Brown 2016b). This also suggests that an effective relocation scheme can considerably improve passenger satisfaction by reducing the average pick-up time (see Sect. 7.2.7).

Given the importance of short pick-up times for passenger satisfaction, we opted to distinguish and study each segment of the waiting process separately (‘Time to Pair’, ‘Time to Pair with Taxi’, and ‘Time to Pick-up’). To the best of our knowledge, we are the first to do so. Such an analysis can give a ridesharing platform a clear picture of the sources of inefficiency and help improve overall satisfaction, which in turn correlates with the growth of the company.

3.1.3 Platform specific metrics


Quality of Service (QoS): Refer to the aforementioned passenger-specific metrics (Footnote 5). Improving the QoS offered to customers correlates with the growth of the company.


Number of Shared Rides: Related to profit. By carrying more than one passenger at a time, vehicles can serve more requests in a day, which, consequently, increases income (Widdows et al. 2017). The matching rate is especially important in the nascent stage of a ridesharing platform (Dutta and Sholley 2018).


Frictions: Waiting time experienced by drivers between serving requests (i.e., the time between dropping off a ride and getting matched with another). Search frictions occur when drivers are unable to locate rides due to spatial supply and demand imbalance. Even though in our scenario matchings are performed automatically, without any searching by the drivers, lower frictions indicate a better distribution of the platform’s supply.

3.2 Modeling

Our evaluation setting is meticulously designed to resemble reality as closely as possible, in every aspect of the problem. We achieve this by using actual data from NYC’s yellow taxi trip records (Footnote 6) – both for modeling customer requests and taxis – and by running our simulations at the scale of the actual problem faced by ridesharing platforms (we run simulations with more than 390,000 requests and 12,000 taxis). Moreover, we have exhaustively designed every detail of the problem, such as the speed of the vehicles, initial positions, distance function, etc. In what follows, we describe each design aspect in detail.

3.2.1 Dataset

We used the yellow taxi trip records of 2016, provided by the NYC Taxi and Limousine Commission (Footnote 6). The dataset was cleaned to remove requests with travel time shorter than 1 minute or invalid geo-locations (e.g., outside Manhattan, the Bronx, Staten Island, Brooklyn, or Queens). For every request, the dataset provides, amongst others, the pick-up and drop-off times and geo-location coordinates. Time is discrete, with a granularity of 1 minute (same as the dataset). On average, there are 272 new requests per minute, totaling 391479 requests in the broader NYC area (352455 in Manhattan) on the evaluated day (Jan. 15). Figure 2 depicts the number of requests per minute on that day.

Fig. 2 Requests per minute on Jan. 15, 2016 (blue line). Mean = 272 requests (yellow line)

3.2.2 Taxi vehicles

A unique feature of NYC yellow taxis is that they may only be hailed from the street and are not authorized to conduct pre-arranged pick-ups. This provides an ideal setting for a counterfactual analysis for several reasons: (1) We can assume a realistic position for each taxi at the beginning of the simulation (its last drop-off location). (2) Door-to-door service can be inefficient (Fielbaum et al. 2021a; Stiglic et al. 2015), thus users may be asked to walk to/from a nearby larger street. Given that users have presumably hailed the taxis from larger streets, this results in a more accurate modeling of the origins of supply and demand. Finally, (3) all observed rides are obtained through search, thus – assuming reasonable prices and delays – customers neither have nor are willing to take an alternative means of transportation. The latter validates our choice that all of the algorithms considered must eventually serve all the requests.

By law, there are 13,587 taxis in NYC (Footnote 12). The majority of the results presented in this paper use a much smaller number of vehicles (what we call the base number) for three reasons: (1) to reduce the complexity of the problem, given that most of the employed algorithms cannot handle such a large number of vehicles, (2) to evaluate under resource scarcity – making the problem harder – so as to better differentiate between the results, and (3) to investigate the possibility of a more efficient utilization of resources, with minimal cost to the consumers. Nevertheless, we also present simulations for a wide range of fleet sizes, up to close to the total number.

The number, initial location, and speed of the taxi vehicles were calculated as follows:

  • We calculated the base number of taxis as the minimum number of taxis required to serve all requests as single rides (no ridesharing): if a request appears and all taxis are occupied serving other requests, we increase the required number of taxis by one (see the sketch after this list). This resulted in around \(4000 - 5000\) vehicles (depending on the size of the simulation, see Sect. 7.2). Simulations were conducted for \(\{\times 0.5, \times 0.75, \times 1.0, \times 2.0, \times 3.0\}\) the base number.

  • Given a number of taxis, V, the initial positions of the taxis are the drop-off locations of the last V requests prior to the starting time of the simulation. To avoid a cold start, we compute the drop-off time of each such request and assume the corresponding vehicle is occupied until then.

  • The vehicles’ average speed is estimated at 6.2 m/s (22.3 km/h), based on the trip distance and time per trip as reported in the dataset, and corroborated by the related literature (in (Santi et al. 2014) the speed was estimated at \(5.5 - 8.5\) m/s, depending on the time of day).
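
The base-number computation in the first bullet amounts to a simple greedy simulation; the sketch below illustrates it under the simplifying assumptions that any free taxi can serve any request and that pick-up travel time is ignored.

import heapq

def base_fleet_size(requests):
    # requests: list of (pickup_time, trip_duration) pairs, sorted by pickup_time.
    busy_until = []  # min-heap of times at which currently busy taxis free up
    fleet = 0
    for pickup_time, trip_duration in requests:
        if busy_until and busy_until[0] <= pickup_time:
            heapq.heappop(busy_until)  # reuse a taxi that has already freed up
        else:
            fleet += 1  # all taxis are occupied: add one more to the fleet
        heapq.heappush(busy_until, pickup_time + trip_duration)
    return fleet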

3.2.3 Customer requests

A request, r, is a tuple \(\langle t_r, s_r, d_r, k_r \rangle\). Request r appears (becomes open) at its respective pick-up time (\(t_r\)) and geo-location (\(s_r\)). Let \(d_r\) denote its destination. Each request admits a willingness to wait (\(k_r\)) to find a match (rideshare), i.e., we assume dynamic waiting periods per request. The rationale behind \(k_r\) is that requests with longer trips are more willing to wait to find a match than requests with nearby destinations. After \(k_r\) time-steps we call request r critical. If a critical request is not matched, it has to be served as a single ride. Recall that in our setting all of the requests must be served. Let \({\mathcal {R}}_t^{\text {open}}, {\mathcal {R}}_t^{\text {critical}}\) denote the sets of open and critical requests respectively, and let \({\mathcal {R}}_t = {\mathcal {R}}_t^{\text {open}} \cup {\mathcal {R}}_t^{\text {critical}}\).

We calculate \(k_r\) as in related literature (Danassis et al. 2019). Let \(w_{\text {min}}\) and \(w_{\text {max}}\) be the minimum and maximum possible waiting times, i.e., \(w_{\text {min}} \le k_r \le w_{\text {max}}, \forall r\). Knowing \(s_r, d_r\), we can compute the expected trip time (\(\mathbb {E}[t_{\text {trip}}]\)). Assuming people are willing to wait in proportion to their trip time, let \(k_r = q \times \mathbb {E}[t_{\text {trip}}]\), where \(q \in [0, 1]\). \(w_{\text {min}}, w_{\text {max}}\), and q can be set by the ridesharing company based on customer satisfaction (following (Danassis et al. 2019), we set \(w_{\text {min}} = 1\), \(w_{\text {max}} = 3\), and \(q = 0.1\)).
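
For illustration, the resulting waiting time can be computed as follows (the constants mirror the values above; the function name is ours):

W_MIN, W_MAX, Q = 1, 3, 0.1  # minutes, minutes, fraction of the expected trip time

def willingness_to_wait(expected_trip_minutes):
    # k_r = q * E[t_trip], clamped to [w_min, w_max];
    # e.g., a 25-minute trip yields k_r = 2.5, while a 5-minute trip is clamped to 1.
    return min(W_MAX, max(W_MIN, Q * expected_trip_minutes))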

3.2.4 Rides

A (shared) ride, \(\rho\), is a pair \(\langle r_1, r_2 \rangle\) composed of two requests. If a request r is served as a single ride, then \(r_1 = r_2 = r\). Let \({\mathcal {P}}_t\) denote the set of rides waiting to be matched to a taxi at time t. Contrary to some recent literature on high-capacity ridesharing (e.g., (Alonso-Mora et al. 2017a; Lowalekar et al. 2019)), we purposefully restrict ourselves to rides of at most two requests for two reasons: complexity and passenger satisfaction. The complexity of the problem grows rapidly as the number of potential matches increases, while most of the proposed/evaluated approaches already struggle to tackle matchings of size two at the scale of a real-world application. Moreover, even though a fully utilized vehicle would ultimately be a more efficient use of resources, it diminishes passenger satisfaction (a frequent worry being that the ride will become interminable, according to internal research by ridesharing companies) (Widdows et al. 2017; Brown 2016a). Given that serving all requests is a hard constraint, we do not assume a time limit on matching rides with taxis; instead, we treat this time as a QoS metric.

3.2.5 Distance function

The optimal choice of distance function would be the actual driving distance. Yet, our simulations require trillions of distance calculations, which makes this computationally unattainable. Given that locations are provided as latitude and longitude coordinates, it is tempting to use the Haversine formula (Footnote 7) to estimate the straight-line (Euclidean) distance, as in related literature (Santos and Xavier 2013; Brown 2016a). We have instead opted for the Manhattan distance, given that the simulation takes place mostly in Manhattan. To evaluate this choice, we collected more than 12 million actual driving distances using the Open Source Routing Machine (project-osrm.org), which computes shortest paths in road networks. The Manhattan distance's standard and mean squared error, compared to the actual driving distance, were \(-0.5 \pm 2.9\) km and \(1.7 \pm 2.4\) km respectively, while the Euclidean distance's were \(-3.2 \pm 3.8\) km and \(3.2 \pm 3.8\) km respectively.
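
For illustration, the two candidate distance functions can be computed from latitude/longitude coordinates as sketched below; the Manhattan variant simply sums a north-south and an east-west great-circle leg, which is an approximation that ignores the rotation of Manhattan’s street grid.

from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine(lat1, lon1, lat2, lon2):
    # Great-circle ("straight-line") distance in km between two points.
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi, dlam = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def manhattan(lat1, lon1, lat2, lon2):
    # L1 distance: north-south leg plus east-west leg.
    return haversine(lat1, lon1, lat2, lon1) + haversine(lat2, lon1, lat2, lon2)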

3.2.6 Embedding into HSTs

A starting point of many of the employed k-server algorithms is embedding the input metric space \({\mathcal {X}}\) into a distribution \(\mu\) over \(\sigma\)-hierarchically well-separated trees (HSTs), with separation \(\sigma = \Theta (\log |{\mathcal {X}}| \log (k \log |{\mathcal {X}}|))\), where \(|{\mathcal {X}}|\) denotes the number of points. It has been shown that solving the problem on HSTs suffices, as any finite metric space can be embedded into a probability distribution over HSTs with low distortion (Fakcharoenphol et al. 2003). The distortion is of order \({\mathcal {O}}(\sigma \log _\sigma |{\mathcal {X}}|)\), and the resulting HSTs have depth \({\mathcal {O}}(\log _\sigma \Delta )\), where \(\Delta\) is the diameter of \({\mathcal {X}}\) (Bansal et al. 2015).

Given the popularity of the aforementioned method, it is worth examining the size of the resulting trees. Given that the geo-coordinate system is a discrete metric space, we could directly embed it into HSTs. Yet, the size of the space is huge, thus for better discretization we have opted to generate the graph of the street network of NYC. To do so, we used data from openstreetmap.org. Similarly to (Santi et al. 2014), we filtered the streets selecting only primary, secondary, tertiary, residential, unclassified, road, and living street classes, using those as undirected edges and street intersections as nodes. The resulting graph for NYC contains 66543 nodes, and 95675 edges (5018, and 8086 for Manhattan). Given that graph, we generate the HSTs (Santi et al. 2014).

Table 1 Evaluated CARs

4 Component algorithms for ridesharing

In this section, we describe our design choices for developing Component Algorithms for Ridesharing (CARs). Each CAR is composed of three parts (Fig. 1): (a) request–request matching to create a (shared) ride, (b) ride-to-taxi matching, and (c) relocation of the idle fleet. Each of these components is a significant problem in its own right, and complexity issues make the simultaneous consideration of all three impractical. Instead, a more realistic approach is to tackle each component individually, with minimal consideration of the remaining two (Footnote 8). The algorithms that we consider are appropriate modifications of the most significant ones that have been proposed for the key algorithmic primitives of the ridesharing problem (see Sects. 1.1 and 2), i.e., online and offline matching algorithms, with or without delays, for steps (a), (b), and (c), k-taxi/server algorithms for step (b), as well as heuristic approaches that were specifically designed with the ridesharing application in mind.

A list of all the CARs that we designed and evaluated (14 in total) can be found in Table 1, while in the following sections we provide a detailed description of each CAR component.

4.1 CAR components

We evaluated a variety of approaches, ranging from offline maximum weight matching (MWM) and greedy solutions to online MWM, k-Taxi/Server algorithms, and linear programming. Offline algorithms (e.g., MWM, ALMA, Greedy) can be run either in a just-in-time (JiT) manner – i.e., when a request becomes critical – or in batches, i.e., every x minutes (given that our dataset has a granularity of 1 minute, we run in batches of 1 and 2 minutes).

Matching Graphs: At time t, let \({\mathcal {G}}_a = ({\mathcal {R}}_t, {\mathcal {E}}^a_t)\), where \({\mathcal {E}}^a_t\) denotes the weighted edges between requests. With a slight abuse of notation, let \(\delta (s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2})\) denote the minimum distance required to serve both \(r_1\) and \(r_2\) (as a shared ride, i.e., excluding the case of first serving one of them and then the other) with a single taxi located either at \(s_{r_1}\) or \(s_{r_2}\). The weight \(w_{r_1, r_2}\) of an edge \((r_1, r_2) \in {\mathcal {E}}^a_t\) is defined as \(w_{r_1, r_2} = \delta (s_{r_1}, d_{r_1}) + \delta (s_{r_2}, d_{r_2}) - \delta (s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2})\) (similarly to (Danassis et al. 2019; Alonso-Mora et al. 2017a)). If \(r_1 = r_2\), let \(w_{r_1, r_2} = 0\) (single-passenger ride). Intuitively, this number is an approximation (given that it is impossible to know in advance the location of the taxi that will serve the ride) of the travel distance saved by matching requests \(r_1\) and \(r_2\) (Footnote 9).

Similarly, at time t, let \({\mathcal {G}}_b = ({\mathcal {V}}_t \cup {\mathcal {P}}_t, {\mathcal {E}}^b_t)\), where \({\mathcal {E}}^b_t\) denotes the weighted edges between rides and taxis. With a slight abuse of notation, let \(\delta (s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2})\) denote the minimum distance required (out of all the possible pick-up and drop-off combinations) to serve both requests \(r_1\) and \(r_2\) (which compose the (shared) ride \(\rho\)) with a single taxi located at \(s_v\). The weight \(w_{v, \rho }\) of an edge \((v, \rho ) \in {\mathcal {E}}^b_t\) is defined as \(w_{v, \rho } = 1 / \delta (s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2})\). If \(r_1 = r_2\) (single-passenger ride), let \(\delta (s_v, s_{r_1}, s_{r_2}, d_{r_1}, d_{r_2}) = \delta (s_v, s_{r_1}, d_{r_1})\). For step (b) of the Ridesharing problem, we run the offline algorithms every time the set of rides (\({\mathcal {P}}_t\)) is non-empty.
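
The following sketch illustrates how these weights can be computed, assuming a distance oracle delta(p, q) (the Manhattan distance in our case); shared_route_length enumerates the four feasible shared-ride stop orders (both pick-ups precede both drop-offs) and is a simplification of the actual computation.

def shared_route_length(delta, start, s1, d1, s2, d2):
    # Shortest route from `start` serving both requests as a shared ride,
    # i.e., excluding the back-to-back orders (s1, d1, s2, d2) and (s2, d2, s1, d1).
    orders = [(s1, s2, d1, d2), (s1, s2, d2, d1), (s2, s1, d1, d2), (s2, s1, d2, d1)]
    best = float("inf")
    for stops in orders:
        length, pos = 0.0, start
        for stop in stops:
            length += delta(pos, stop)
            pos = stop
        best = min(best, length)
    return best

def weight_request_pair(delta, s1, d1, s2, d2):
    # w_{r1,r2}: approximate distance saved by sharing, with the serving taxi
    # assumed to start at either of the two pick-up locations.
    shared = min(shared_route_length(delta, s1, s1, d1, s2, d2),
                 shared_route_length(delta, s2, s1, d1, s2, d2))
    return delta(s1, d1) + delta(s2, d2) - shared

def weight_ride_taxi(delta, s_v, s1, d1, s2, d2):
    # w_{v,rho}: inverse of the shortest distance for a taxi at s_v to serve the ride.
    return 1.0 / max(shared_route_length(delta, s_v, s1, d1, s2, d2), 1e-9)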

4.1.1 Maximum weight matching (MWM)

The maximum weight matching algorithm finds a matching with maximum total edge weight in a graph. We use a maximum weight matching algorithm to

  • match requests into shared rides (step (a) of the Ridesharing problem), i.e., find a matching on \({\mathcal {G}}_a\) that maximizes the quantity \(\sum _{(r_1,r_2) \in {\mathcal {E}}^a_t} w_{r_1,r_2}\).

  • match rides with taxis (step (b) of the Ridesharing problem), i.e., find a matching on \({\mathcal {G}}_b\) that maximizes the quantity \(\sum _{(v, \rho ) \in {\mathcal {E}}^b_t} w_{v, \rho }\).

In both cases, we use the well-known blossom algorithm of Edmonds (1965). Not surprisingly, MWM results in high-quality allocations, but this comes with an overhead in running time compared to simpler, ‘local’ solutions (see Sect. 7.2). This is because the blossom algorithm’s worst-case time complexity – on a graph \(G = (V, E)\) – is \({\mathcal {O}}(|E| |V|^2)\), and we have to run it three times, once for each step of the Ridesharing problem. Additionally, the MWM algorithm inherently requires a global view of the whole request set in a time window, and is therefore not a good candidate for the fast, decentralized solutions that are more appealing for real-life applications.
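
As an illustration, step (a) with an off-the-shelf blossom implementation (networkx.max_weight_matching) could look as follows; pairwise_weight stands for the weight \(w_{r_1, r_2}\) defined above, and requests are assumed to be hashable identifiers.

import networkx as nx

def match_requests(requests, pairwise_weight):
    G = nx.Graph()
    G.add_nodes_from(requests)
    for i, r1 in enumerate(requests):
        for r2 in requests[i + 1:]:
            w = pairwise_weight(r1, r2)
            if w > 0:  # only pairs whose shared ride actually saves distance
                G.add_edge(r1, r2, weight=w)
    matching = nx.max_weight_matching(G)  # set of matched (r1, r2) pairs
    matched = {r for pair in matching for r in pair}
    singles = [r for r in requests if r not in matched]  # served as solo rides
    return matching, singles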

4.1.2 ALtruistic MAtching Heuristic (ALMA), (Danassis et al. 2019, 2022, 2021; Danassis 2022; Danassis and Faltings 2020)

ALMA is a recently proposed lightweight heuristic for weighted matching. A distinctive characteristic of ALMA is that agents (in our context: requests/rides) make decisions locally, based solely on their own utilities. In particular, while contesting for a resource (in our context: a request/taxi), each agent backs off with a probability that depends on its own utility loss from switching to its next most preferred resource. For example, for step (b) of the Ridesharing problem, suppose that for the agent representing ride \(\rho\), the next most preferred taxi after v is \(v'\); then \(loss = w_{v, \rho } - w_{v', \rho }\). The back-off probability (\(P(\cdot )\)) is computed individually and locally, based on Eq. (1) (Footnote 10).

$$\begin{aligned} P(loss) = {\left\{ \begin{array}{ll} 1 - \epsilon , &{} \text { if } loss \le \epsilon \\ \epsilon , &{} \text { if } 1 - loss \le \epsilon \\ 1 - loss, &{} \text { otherwise} \end{array}\right. } \end{aligned}$$
(1)

Intuitively, agents that do not have good alternatives are less likely to back off, and vice versa. The algorithm is inherently decentralized, requires only 1-bit partial feedback from the resource (indicating whether the resource is free or not), and has running time that is constant in the total problem size, under reasonable assumptions on the preference domain of the agents. Thus, it is an ideal candidate for an on-device solution. Moreover, in (Danassis et al. 2019) it was shown to achieve high-quality results on a simpler version of step (a) of the Ridesharing problem, and in (Danassis et al. 2022) it was shown that it can be adapted to protect the privacy of the agents.
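
A minimal sketch of the back-off rule of Eq. (1), assuming losses are normalized to [0, 1] and \(\epsilon\) is a small constant:

import random

EPSILON = 0.01  # illustrative value; a small constant in (0, 1)

def backoff_probability(loss, eps=EPSILON):
    if loss <= eps:
        return 1.0 - eps  # good alternatives exist: back off almost surely
    if loss >= 1.0 - eps:
        return eps        # no good alternative: almost never back off
    return 1.0 - loss

def backs_off(loss):
    return random.random() < backoff_probability(loss)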

4.1.3 Greedy

Greedy is a very simple algorithm: it selects a node \(i \in V\) of a graph \(G=(V,E)\) uniformly at random, considers all the edges (i, j) with endpoint i, and matches i with a node \(j^{*}\) that is the endpoint of the largest-weight such edge, i.e., \(j^{*} \in \arg \max _j (w_{i, j})\). Greedy approaches are appealing (Footnote 11), not only due to their low complexity, but also because real-time constraints dictate short planning windows, which diminish the benefit of batch optimization solutions compared to myopic approaches (Widdows et al. 2017).
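
A minimal sketch of this component follows; weight(i, j) is assumed to return the edge weight, or None if the edge does not exist, and nodes left unmatched are later served as single rides.

import random

def greedy_matching(nodes, weight):
    unmatched, matching = set(nodes), []
    while len(unmatched) > 1:
        i = random.choice(list(unmatched))  # pick an unmatched node at random
        unmatched.remove(i)
        candidates = [(weight(i, j), j) for j in unmatched if weight(i, j) is not None]
        if not candidates:
            continue  # i remains unmatched
        _, j_star = max(candidates, key=lambda c: c[0])  # heaviest incident edge
        unmatched.remove(j_star)
        matching.append((i, j_star))
    return matching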

4.1.4 Approximation (Appr), (Bei and Zhang 2018)

Approximation (Appr) is a recently proposed offline algorithm due to Bei and Zhang (2018), which can be used to solve steps (a) and (b) of the Ridesharing problem. The algorithm takes a two-phase approach that is also based on maximum weight matchings (or, more accurately, the equivalent notion of minimum cost matchings), but with a different set of weights from the ones we defined for the MWM algorithm. In particular (see the sketch after the list):

  • First, it matches requests into shared rides using a minimum cost matching, where the cost of a request pair is the shortest distance required to serve it under the worse of the two possible pick-up orders. Formally, the algorithm defines the quantities:

    $$\begin{aligned} w_{ij}= & {} \min \left\{ \delta (s_1,s_2)+\delta (s_2,d_1)+\delta (d_1,d_2), \delta (s_1,s_2)+\delta (s_2,d_2)+\delta (d_2,d_1)\right\} \\ w_{ji}= & {} \min \left\{ \delta (s_2,s_1)+\delta (s_1,d_1)+\delta (d_1,d_2), \delta (s_2,s_1)+\delta (s_1,d_2)+\delta (d_2,d_1)\right\} \end{aligned}$$

    and then chooses \(w^1(i,j) = \max \{w_{ij},w_{ji}\}\). Intuitively, \(w_{ij}\) is the distance of the shortest path that picks up request \(r_1\) first (at its source location \(s_1\)), and similarly, \(w_{ji}\) is the distance of the shortest path that picks up request \(r_2\) first.

  • Then it matches rides to taxis, again using a minimum cost matching, where the weight is the distance to the closer of the two pick-up locations. Formally, let \(w^2(v,\langle r_i, r_j \rangle ) = \min \{\delta (s_v, s_i), \delta (s_v,s_j)\}\), where \(s_v\) is the position of taxi v, and compute a minimum cost matching in the bipartite graph defined by the pairs \(\langle r_i, r_j \rangle\) matched in the previous step and the taxis, with weights defined by \(w^2\).

    Bei and Zhang (2018) prove a worst-case approximation guarantee of 2.5 for the algorithm.
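
The sketch below spells out the two weight functions (the helper names are ours; delta(p, q) is the distance oracle and each request is a (source, destination) pair):

def w1(delta, r1, r2):
    # Worse of the two pick-up orders, each with its best drop-off order.
    (s1, d1), (s2, d2) = r1, r2
    w_ij = delta(s1, s2) + min(delta(s2, d1) + delta(d1, d2),
                               delta(s2, d2) + delta(d2, d1))
    w_ji = delta(s2, s1) + min(delta(s1, d1) + delta(d1, d2),
                               delta(s1, d2) + delta(d2, d1))
    return max(w_ij, w_ji)

def w2(delta, s_v, r1, r2):
    # Distance from the taxi at s_v to the closer of the two pick-up locations.
    return min(delta(s_v, r1[0]), delta(s_v, r2[0]))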

4.1.5 Postponed greedy (PG), (Ashlagi et al. 2019)

Postponed Greedy (PG) is another very recently proposed algorithm for the maximum weight online matching problem with deadlines (step (a) of the Ridesharing problem). The algorithm is online, meaning that it accounts for the potential requests that might appear in the future when making decisions about the present; its competitive ratio was proven to be 1/4 by Ashlagi et al. (2019). Contrary to our setting, the algorithm was designed for fixed deadlines, i.e., \(k_r = c, \forall r \in {\mathcal {R}}\).

The algorithm is best described in terms of an auction environment (Ashlagi et al. 2019) as follows. Let \(S_t\) and \(B_t\) be the sets of virtual sellers and virtual buyers at time t respectively. When a request r appears at time t, the algorithm creates a virtual seller \(s_r\) and a virtual buyer \(b_r\) for that request, and adds them to the aforementioned sets, i.e., \(S_t \leftarrow S_{t-1} \cup \{s_r\}\) and \(B_t \leftarrow B_{t-1} \cup \{b_r\}\). In other words, every request has two copies: a buyer and a seller. These are then placed in a virtual weighted bipartite graph \(G=(S_t,B_t,E_t)\), where the edge weights are defined in the same manner as the weights of \({\mathcal {G}}_a\) (see ‘Matching Graphs’ in Sect. 4.1). The algorithm proceeds to match the newly added buyer \(b_r\) with a seller \(s_{r^*}\) in a greedy manner, i.e., \((b_r,s_{r^*}) \in \underset{r' \in S_{t-1}}{\arg \max }(w_{r, r'})\). This choice remains fixed for subsequent time steps. When the request r becomes critical (i.e., the deadline is about to be met), the ‘role’ of the request as either a seller or a buyer is conclusively chosen (uniformly at random). If r is a seller, and a subsequent buyer was matched with r, the match is finalized and is included in the output matching.

The major difference between the setting considered by Ashlagi et al. (2019) and ours is that in our case requests become critical out of order, and a critical request cannot be matched later. Thus, we apply the following modification: when a request becomes critical and is determined to be a seller, the match is finalized (if one has been found); otherwise, the request is treated as a single ride.

4.1.6 Greedy dual (GD), (Bienkowski et al. 2018)

Greedy Dual is an online algorithm for minimum cost (bipartite) perfect matching with delays, i.e., for both steps (a) and (b) of the Ridesharing problem, based on the popular primal-dual technique (Goemans and Williamson 1997). The weight (cost) of an edge in this setting also incorporates arrival times; specifically:

$$\begin{aligned} w_{r_1, r_2} = \frac{\left( \delta \left( s_1, s_2\right) + \delta \left( d_1, d_2\right) \right) }{u_{\text {average}}} + |t_1 - t_2|, \end{aligned}$$

where \(u_{\text {average}}\) is the average speed (see Sect. 3.2.2). The algorithm partitions all the requests into active sets, starting with the singleton \(\{r\}\) for a newly arrived request r. As is typical in the primal-dual approach, at every time-step t these active sets ‘grow’, until the weights of the edges between different active sets make the corresponding dual constraints of the problem tight (i.e., satisfied with equality). At this point the active sets merge, and the algorithm matches as many pairs of free requests in these sets as possible.

The algorithm has a competitive ratio of \({\mathcal {O}}(|{\mathcal {R}}|)\) and works on infinite metric spaces, which potentially makes it better suited for applications like the Ridesharing problem. Yet, in our setup, it does not take into account the willingness to wait (\(k_r\)), thus missing matches involving requests that have become critical. Although it is also designed for bipartite matchings, we opted against using it for step (b), since that would require creating a new node every time a taxi vehicle drops off a ride and becomes available.

4.1.7 Balance (Bal), (Manasse et al. 1990)

Balance is a simple and classic algorithm for the k-server problem from the competitive analysis literature. The rationale behind the algorithm is that it tries to balance out the distance traveled by the taxis over the course of their operation, keeping their workloads as equal as possible. In particular, a ride is served by the taxi that has the minimum sum of the distance traveled so far plus its distance to the source of the ride (chosen uniformly at random between the sources of the two requests composing the ride). Specifically, ride \(\rho\) will be matched to taxi v:

$$\begin{aligned} (v, \rho ) = \underset{v \in {\mathcal {V}}_t}{\arg \min }\left( \text {driven}(v) + \delta \left( s_v, s_{\rho }\right) \right) \end{aligned}$$
(2)

where \(\text {driven}(v)\) denotes the distance driven by taxi v so far, and \(s_{\rho }\) is selected equiprobably among \(s_1\) and \(s_2\). The algorithm is min-max fair, i.e., it greedily minimizes the maximum accumulated distance among the taxis. The competitive ratio of the algorithm is \(|{\mathcal {X}}|-1\) in arbitrary metric spaces with \(|{\mathcal {X}}|\) points (Manasse et al. 1990).
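
A minimal sketch of this rule, assuming each taxi is stored with its current position and accumulated distance (data structures and names are ours):

import random

def balance_assign(taxis, ride_sources, delta):
    # taxis: dict mapping taxi id -> (position, distance_driven_so_far);
    # ride_sources: (s1, s2), the pick-up locations of the two requests.
    source = random.choice(list(ride_sources))  # source chosen equiprobably
    return min(taxis, key=lambda v: taxis[v][1] + delta(taxis[v][0], source))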

4.1.8 Harmonic (Har), (Raghavan and Snir 1989)

The Harmonic algorithm (Har) is another classic randomized algorithm from the k-server literature; it is simple and memoryless (i.e., it does not need to ‘remember’ the decisions it took in previous steps). The algorithm matches a taxi with a ride with probability inversely proportional to the taxi’s distance from the ride’s source (chosen uniformly at random between the sources of the two requests composing the ride). Specifically, ride \(\rho\) will be matched to taxi v with probability:

$$\begin{aligned} P(v, \rho ) = \frac{\frac{1}{\delta \left( s_v, s_{\rho }\right) }}{\underset{\rho ' \in {\mathcal {P}}_t}{\sum }{\frac{1}{\delta \left( s_v, s_{\rho '}\right) }}} \end{aligned}$$
(3)

where \(s_{\rho }\) and \(s_{\rho '}\) are both selected equiprobably among \(s_1\), \(s_2\) and \(s_{1'}\), \(s_{2'}\), respectively. The trade-off for its simplicity is the high competitive ratio, which is \({\mathcal {O}}(2^{|{\mathcal {V}}|} \log |{\mathcal {V}}|)\) (Bartal and Grove 2000).
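
A minimal sketch of one reading of Eq. (3), in which an available taxi samples a waiting ride with probability inversely proportional to the distance to the ride’s (randomly chosen) source; names and data structures are ours.

import random

def harmonic_pick_ride(taxi_pos, rides, delta):
    # rides: dict mapping ride id -> (s1, s2), the two pick-up locations.
    inv = {}
    for rho, (s1, s2) in rides.items():
        source = random.choice([s1, s2])  # source chosen equiprobably
        inv[rho] = 1.0 / max(delta(taxi_pos, source), 1e-9)
    total = sum(inv.values())
    return random.choices(list(inv), weights=[w / total for w in inv.values()])[0]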

4.1.9 Double coverage (DC), (Chrobak et al. 1990)

Double Coverage (DC) is one of the two most famous k-server algorithms in the literature. The algorithm is designed to run on a specific type of metric space called an HST (Hierarchically Separated Tree, see Sect. 3.2.6). For a general metric space \({\mathcal {X}}\), the algorithm can be applied by first embedding \({\mathcal {X}}\) into an HST (a process referred to as an ‘HST embedding’). This process ‘simulates’ the general space \({\mathcal {X}}\) by an HST, in the sense that the HST approximately captures the properties of the original space \({\mathcal {X}}\). The points of \({\mathcal {X}}\) are the leaves of the HST.

Given an HST, the algorithm works as follows. To determine which taxi will serve a ride, all unobstructed taxis move towards its source, i.e., a leaf of the HST (chosen randomly between the sources of the two requests sharing the ride) with equal speed. Initially, all taxis are unobstructed. During this movement process, a taxi becomes obstructed when its path from its current location to the leaf corresponding to the ride is ‘blocked’ by another taxi, meaning that it would have to move through the same position in the tree that another taxi has already been at, to reach the leaf. In this case, the taxi stops (as the ‘blocking’ taxi is closer to serving the ride), while the remaining taxis keep moving as before. When some taxi reaches the leaf corresponding to the ride, the process stops, and each taxi maintains its current position on the HST.

To implement the algorithm, we first appropriately discretize our metric space and then perform the HST embedding as described in (Bartal 1996; Fakcharoenphol et al. 2004) (see Sect. 3.2.6 for more details). Given that only the leaves correspond to locations on \({\mathcal {X}}\), we chose to implement the lazy version of the algorithm (which is worst-case equivalent to the original definition, see, e.g., (Koutsoupias 2009)), i.e., only the taxi serving the ride actually moves on \({\mathcal {X}}\); one can envision a process in which the taxis ‘virtually’ move as described above, but, once the ride has been served, all taxis are restored to their original positions. This is also in line with the main goal of minimizing the distance driven. The algorithm is k-competitive on all tree metrics (Chrobak and Larmore 1991a).

4.1.10 Work function (WFA), (Chrobak and Larmore 1991b; Koutsoupias and Papadimitriou 1995)

The Work Function algorithm (WFA) is perhaps the most important k-server algorithm, as it provides the best competitive ratio to date, due to the celebrated result of (Koutsoupias and Papadimitriou 1995). Intuitively, to decide which taxi (or server) will be the one to serve a ride that just appeared at time t, and, more generally, the movement of the other taxis, the algorithm:

  • computes the (offline) optimal solution until time \(t - 1\), meaning the best possible allocation of rides to taxis using the information from the beginning of the algorithm until the appearance of the ride at time t,

  • computes a greedy cost for switching between configurations,

  • chooses the new taxi positions that minimize the sum of the two aforementioned costs.

More formally, let \(L^t = (l^t_1, l^t_2, \dots , l^t_{|{\mathcal {V}}|})\) denote the configuration of the fleet of taxis \({\mathcal {V}}\) at time-step t, i.e., a vector of taxi locations, where \(l^t_v\) specifies the location of taxi v. Let \(\text {OPT}_t(L)\) be the optimal (total distance-minimizing) way of serving rides that appear at times 1 through t, such that the taxis end up at configuration L. To choose configuration \(L^{t}\), it uses the following rule:

$$\begin{aligned} L^{t} = \arg \min _{L}\left\{ \text {OPT}_t(L)+ d\left( L^{t-1},L\right) \right\} \end{aligned}$$

The WFA serves ride \(\rho _t\) at time-step t by switching from the current taxi configuration \(L^{t-1}\), to a new configuration \(L^{t}\). Specifically, it selects \(L^{t}\) which minimizes (a) the minimum total cost of starting from \(L^{0}\), serving in turn \(\rho _0, \rho _1, \dots , \rho _{t-1}\), and ending up in \(L^{t}\), plus (b) the distance traveled by a taxi to move from its position in \(L^{t-1}\) to that in \(L^{t}\).

An obvious obstacle that makes the algorithm intractable in practice is that its complexity increases from step to step, resulting in computation and/or memory issues. To circumvent this obstacle, we implemented an efficient variant using network flows, as described in (Rudec et al. 2013). Yet, as the authors of (Rudec et al. 2013) also state, the only practical way of using the WFA is to switch to its window version, w-WFA, where we only optimize over the last w rides. Even though the complexity of w-WFA does not change between time-steps, it does change with the number of taxis. The resulting network has \(2|{\mathcal {P}}| + 2|{\mathcal {V}}| + 2\) nodes, and we have to run the Bellman-Ford algorithm (Bellman 1958) at least once to compute the node potentials and make the costs positive (Bellman-Ford runs in \({\mathcal {O}}(|{\mathcal {P}}||{\mathcal {V}}|)\)). We refer the reader to (Bertsekas 1998) for more details on network optimization. As before, the source of the ride is chosen randomly between the sources of the two requests composing the ride.

4.1.11 k-Taxi, (Coester and Koutsoupias 2019)

This is a very recent algorithm for the k-taxi problem, which provides the best possible competitive ratio. The algorithm operates on HSTs, where the rides and taxis at any time are placed at its leaves. First, it generates a Steiner tree that spans the leaves that have taxis or rides, and then uses this tree to schedule rides, by simulating an electrical circuit. In particular, whenever a ride appears at a leaf, the algorithm interprets the edges of the tree with length R as resistors with resistance R, which determine the fraction of the current flow that will be routed from the node corresponding to the taxi towards the ride. These fractions are then interpreted as probabilities which determine which taxi will be chosen to pick up the ride.

4.1.12 High capacity (HC), (Alonso-Mora et al. 2017a)

This algorithm comes from a highly cited paper, and is the only one among our evaluated approaches that addresses vehicle relocation (step (c)). Contrary to our approach, it tackles steps (a) and (b) simultaneously, leaving step (c) as a separate sub-problem. The algorithm consists of five steps:

  (i) Computing a pairwise request-vehicle shareability graph (RV-graph) (Santi et al. 2014). The RV-graph represents which requests and vehicles might be pairwise shared, with edges connecting all possible requests to pair and all possible vehicles to serve a request.

  (ii) Computing a graph consisting of the feasible (candidate) trips and the sets of vehicles that can execute them (RTV-graph). This is a tripartite graph with edges connecting requests to trips (a request is connected to a trip if it is part of it), and edges connecting trips to vehicles (an edge between a vehicle and a trip exists if the vehicle is able to serve the trip).

  (iii) Computing a greedy solution on the RTV-graph. In this step, rides are assigned to vehicles iteratively in decreasing size of the trip (in our case, we first assign shared rides (two requests) and then single rides) and increasing cost (e.g., delay).

  (iv) Solving an ILP to compute the best assignment of vehicles to trips, using the previously computed greedy solution as the initial solution.

  (v) (optional) Rebalancing of free vehicles. If any unassigned requests remain, an ILP is solved to optimally assign them to idle vehicles based on travel times.

We use CPLEX (Bliek et al. 2014) to solve the ILPs.

4.1.13 Baseline: single ride

Uses MWM to schedule the serving of single rides to taxis (there is no ridesharing, i.e., we omit step (a) of the Ridesharing problem).

4.1.14 Baseline: random

Makes random matches, provided that the edge weight is non-negative.

While our evaluation contains many recently proposed algorithms for matching, the observant reader might notice that, with the exception of k-taxi, our k-server algorithms are from the classical literature. We did consider more recent k-server algorithms (e.g., (Dehghani et al. 2017; Lee 2018; Bansal et al. 2015)), but their complexity turns out to be prohibitive. This is mainly because they proceed via an ‘online rounding’ of an LP-relaxation of the problem, which maintains a variable for every (time-step, point in the metric space) pair. Even for one hour (3600 time-steps) and our discretization of Manhattan (5018 nodes), we need more than 18 million variables (230 million for NYC).

5 Scalability challenges

To highlight the challenges in the design of CARs, we refer to our evaluation setting (see Sect. 3.2), which accurately models a real-world application in terms of both scale and detail. Let \({\mathcal {V}}\), \({\mathcal {R}}\) denote the sets of vehicles and requests, respectively. Recall that in our setting, which involves real data from NYC taxi records, there are 272 new requests per minute on average, totaling 391479 requests in the broader NYC area (352455 in Manhattan) on the evaluated day (Jan. 15, 2016). By law, there are 13,587 taxis in NYC (Footnote 12).

5.1 ILP approaches

A natural approach would be to use Integer Linear Programs (ILPs) for matching passengers to other passengers or rides, under spatial and temporal constraints, similarly to the High Capacity algorithm of (Alonso-Mora et al. 2017a) (which can be seen as a CAR with steps (a) and (b) intertwined). As is commonly the case with ILPs, the problem is scalability: the number of variables can be as large as \({\mathcal {O}}(|{\mathcal {V}}| |{\mathcal {R}}|^2)\) – which results in 27 - 216 million variables, given that at every time-step we have approximately 300 - 600 requests and as many taxis – and the number of constraints is \(|{\mathcal {V}}| + |{\mathcal {R}}|\). This makes ILP approaches prohibitive as components in CARs; at this scale it is hard to even compute the initial greedy solution in real time. Alonso-Mora et al. circumvent this issue by enforcing delay constraints: specifically, they ignore requests that are not matched to any vehicle within a maximum waiting time. This is not possible in our model, since we have to serve all requests (service guarantee).Footnote 13
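For concreteness, with \(|{\mathcal {V}}| = |{\mathcal {R}}| = 300\) and \(600\) respectively, the variable count spans

\[
300 \times 300^{2} = 2.7 \times 10^{7}
\quad \text{to} \quad
600 \times 600^{2} = 2.16 \times 10^{8},
\]

which is the 27 - 216 million range quoted above.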

5.2 MWM approaches

Given that all three parts of the ridesharing problem can be viewed as matching problems, a natural approach would be to run maximum-weight matching (MWM) in batches (e.g., (Bei and Zhang 2018)), meaning that we serve the requests that have accumulated over a pre-specified time window. The MWM problem can be solved via the classic blossom algorithm (Edmonds 1965) with run time – on a graph \((V, E)\) – of \({\mathcal {O}}(|E| |V|^2)\).
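A minimal sketch of batched MWM (not the paper's exact pipeline) is shown below: the requests accumulated over one batch window are paired with networkx' blossom-based matcher. The request objects and the shareability_weight function are hypothetical placeholders.

```python
# Batched maximum-weight matching over the requests of one time window.
import networkx as nx

def match_batch(requests, shareability_weight):
    """Pair up the requests of one batch window.

    `requests` is a list of hashable request ids; `shareability_weight(r1, r2)`
    returns the gain of pairing the two requests, or None if pairing is infeasible.
    """
    G = nx.Graph()
    G.add_nodes_from(requests)
    for i, r1 in enumerate(requests):
        for r2 in requests[i + 1:]:
            w = shareability_weight(r1, r2)
            if w is not None and w >= 0:          # keep only non-negative gains
                G.add_edge(r1, r2, weight=w)
    # Blossom-based maximum-weight matching, O(|E||V|^2) as noted above.
    pairs = nx.max_weight_matching(G, maxcardinality=False)
    matched = {r for pair in pairs for r in pair}
    singles = [r for r in requests if r not in matched]
    return list(pairs), singles

# Toy usage with a hypothetical weight function.
reqs = ["r1", "r2", "r3", "r4"]
w = lambda a, b: 5.0 if {a, b} in ({"r1", "r2"}, {"r3", "r4"}) else None
print(match_batch(reqs, w))
```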

5.3 k-server/taxi algorithms

Many of these algorithms operate by embedding the input metric space \({\mathcal {X}}\) into a distribution \(\mu\) over Hierarchical Separated Trees (HSTs) (e.g., the classic double-coverage (Chrobak et al. 1990)), and thus, to apply them in practice, it is necessary to examine the size of these trees. Given that the geo-coordinate system is a discrete metric space, we could directly embed it into HSTs. Yet, the size of that space is huge, and hence, for a more compact discretization, we opted to generate the graph of the street network of NYC (see Sect. 3.2.6). The resulting graph for NYC contains 66543 nodes and 95675 edges (5018 and 8086 for Manhattan). There is a clear trade-off here between the accuracy of the embedding and the algorithm's complexity.
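As one illustration (this is not the exact pipeline of Sect. 3.2.6, and node/edge counts will vary with the OpenStreetMap snapshot and simplification settings), such a street-network graph can be obtained with osmnx:

```python
# Fetch the drivable street network of Manhattan; the size of this graph is
# what drives the cost of the HST embedding discussed above.
import osmnx as ox

G = ox.graph_from_place("Manhattan, New York City, New York, USA",
                        network_type="drive")
print(G.number_of_nodes(), G.number_of_edges())
```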

More recent k-server algorithms (e.g., (Dehghani et al. 2017; Lee 2018; Bansal et al. 2015)) use sophisticated ‘online rounding’ techniques; these, however, require maintaining variables for every (time-step, point in the metric space) pair, which makes them prohibitive for any large-scale real-world application: even for one hour (3600 time-steps) and our discretization of Manhattan (5018 nodes), we would need more than 18 million variables (230 million for NYC).

5.4 Observability

Most approaches are centralized and require a global view of the entire window, which is hard to scale. As autonomous agents proliferate, a practical and applicable CAR must be distributed and, ideally, able to run on-device.

6 Vehicle relocation challenges

There are two ways to enforce relocation: passive and active. Ridesharing platforms like Uber and Lyft have implemented market-driven pricing as a passive form of relocation. Counterfactual analysis in (Buchholz 2018) shows that implementing pricing rules can result in daily net surplus gains of up to 232000, and 93000 additional daily taxi-passenger matches. While the gains are substantial, the market might be slow to adapt, and drivers and passengers do not always follow equilibrium policies. In contrast, our approach is active, in the sense that we directly enforce relocation. Moreover, we adopt a more anthropocentric approach: in our setting the demand is fixed, so the goal is not to increase revenue by serving more rides, but rather to improve the QoSFootnote 14.

There are many ways to approach dynamic relocation. Most of the employed relocation approaches are coarse-grained: the network is divided into several zones, blocks, etc. (Guériau and Dusparic 2018; Vosooghi et al. 2019; Martínez et al. 2017), and the entities (e.g., the vehicles) move between the zones. However, compared to other shared mobility systems, dynamic ridesharing poses unique challenges, which make such coarse-grained approaches inappropriateFootnote 15: most of them are centralized – thus computationally intensive and not scalable –, they might not take into account the actions of other vehicles, potentially leading to over-saturation of high-demand areas, and, most importantly, they are slow to adapt to the highly dynamic nature of the problem (e.g., responding to high demand generated by a concert, or to the fact that vehicles remain free for only a few minutes at a time). The problem clearly calls for fine-grained solutions, yet such approaches in the literature remain rather scarce. High Capacity (HC) employs fine-grained relocation: it solves an ILP, which can reach high-quality results, but is neither scalable nor practical. Ideally, we would like a solution that can run on-device. The k-server algorithms perform an implicit relocation, yet they are primarily developed for adversarial scenarios and do not utilize the wealth of historical dataFootnote 16. In reality, requests follow patterns that emerge from habitual human behavior (e.g., during the first half of the day in Manhattan, there are many more drop-offs in Midtown than pickups (Buchholz 2018)).

6.1 Patterns in customer requests

To confirm the existence of transportation patterns, we performed the following analysis: for each request r on January 15Footnote 17, we searched the past three days for requests \(r'\) such that \(|t_{r} - t_{r'}| \le 10\), \(\delta (s_{r}, s_{r'}) \le 250\), and \(\delta (d_{r}, d_{r'}) \le 250\). The results are depicted in Fig. 3. On average, \(13.3\%\) of the trips are repeated across all three previous days, peaking at \(43.7\%\) during rush hours (e.g., 6-8 in the morning). Note that predicting transport demand from historical data is not an easy task; still, \(13.3\%\) corresponds to about 47000 trips, which is significant in raw numbers.

Fig. 3

Percentage of similar trips per hour in Manhattan, January 15, 2016 (blue line). Mean value = \(13.3\%\) (yellow line)
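A simplified sketch of this repeated-trip analysis is given below; the request representation, the time unit (minutes), the distance function (metres), and the pooling of the three previous days into a single history set are assumptions made for illustration.

```python
# Fraction of today's requests with a 'similar' trip in the pooled history:
# time-of-day within dt minutes, source and destination within dd metres.
from dataclasses import dataclass

@dataclass
class Request:
    t: float        # pick-up time (minutes since midnight)
    src: tuple      # (lat, lon) of the source
    dst: tuple      # (lat, lon) of the destination

def repeated_fraction(today, history, dist, dt=10.0, dd=250.0):
    """`dist` is any point-to-point distance function returning metres."""
    def similar(r, p):
        return (abs(r.t - p.t) <= dt
                and dist(r.src, p.src) <= dd
                and dist(r.dst, p.dst) <= dd)
    return sum(any(similar(r, p) for p in history) for r in today) / len(today)
```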

6.2 Relocation matching graph

Given the high density of the requests and the low frictions of the taxis (i.e., taxis remain free for relocation only for a short time window), we opted for a simple, fine-grained matching approach. We use the history to predict a set of expected future requests. Specifically, let D and T be the sampling windows, in days and minutes respectively (we used \(D = 3\) and \(T = 2\)). Let t denote the current time-step. The set of past requests in our sampling window is \({\mathcal {R}}_{\text {past}} = \{r: t_{r} - t \le T\}\), where r appeared at most D days prior to t. The set of expected future requests \({\mathcal {R}}_{\text {future}}\) is generated by sampling from \({\mathcal {R}}_{\text {past}}\). Relocation is performed in a just-in-time manner, every time the set of idle vehicles is non-empty. We generate matching graphs similar to those in Sect. 4.1, and then proceed to match requests to shared rides, and rides to idle taxis; the difference is that the node set of \({\mathcal {G}}_a\) is now \({\mathcal {R}}_{\text {future}} \cup {\mathcal {R}}_{t}\). Finally, each idle taxi starts moving towards the source of its match (since these are expected rides, the source is picked at random from the sources of the two requests composing the ride).
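A minimal sketch of how \({\mathcal {R}}_{\text {future}}\) can be generated is shown below; the request attributes, the day index, and the sample size are hypothetical, and the paper's exact sampling procedure may differ in details.

```python
# Sample expected future requests from the same time-of-day window on the past D days.
import random

D, T = 3, 2   # sampling windows: days and minutes (values used in the paper)

def expected_future_requests(history, current_day, t, sample_size):
    past = [r for r in history
            if 1 <= current_day - r.day <= D      # appeared at most D days earlier
            and 0 <= r.t - t <= T]                # about to appear within T minutes
    return random.sample(past, min(sample_size, len(past)))
```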

7 Evaluation

7.1 Employed CARs

Evaluating all possible combinations of CAR components is infeasible. To make the evaluation tractable, we first consider only the first two steps of the ridesharing problem (i.e., no relocation). When possible, we use the same component for both steps (a) and (b). k-taxi/server algorithms, though, cannot solve step (a); for those CARs we use the best-performing component for step (a) (namely, offline maximum-weight matching (MWM) run in batches). Then, we move on to evaluate step (c), testing only the most promising components (namely MWM and ALMA, plus Greedy as a baseline). We begin by isolating step (c): we fix the component for (a) and (b) to MWM, to have a common ground for evaluating relocation. Finally, we present results on end-to-end solutions. A list of all the evaluated CARs can be found in Table 1, while Table 2 contains a summary of all the evaluated metrics.

Table 2 Evaluated performance metrics (global, passenger (Quality of Service), and platform specific)

7.2 Simulation results

In this section we present the results of our evaluation. For every metric, we report the average value over 8 runs. In what follows, we briefly detail only the most relevant results. Please refer to Appendix A for the complete results, including larger test-cases on the broader NYC area, omitted metrics, standard deviation values, and additional algorithms (e.g., WFA and HC had to be evaluated in smaller test-cases).

Figures 4, 7, 5, and 6 present the results without relocation. We first present results for one hour (Figs. 4 and 7) and the base number of taxis (see Sect. 3.2.2). Then, we show that the results are robust at a larger time-scaleFootnote 18 (Fig. 5) and for a varying number of vehiclesFootnote 19 (2138 - 12828) (Fig. 6). Finally, we present results on step (c) of the Ridesharing problem: dynamic relocation (Table 3, Fig. 8).

Fig. 4

08:00 - 09:00, #Taxis = 4276 (base number). Manhattan, January 15, 2016

Fig. 5

00:00 - 23:59 (full day), #Taxis = 5081 (base number). Manhattan, January 15, 2016

Fig. 6

08:00 - 09:00, #Taxis = \(\{2138, 3207, 4276, 8552, 12828\}\). Manhattan, January 15, 2016

Fig. 7

08:00 - 09:00, #Taxis = 4276 (base number). Manhattan, January 15, 2016

7.2.1 Distance driven

In the small test-case (Fig. 4a), MWM performs best, followed by Bal (\(+7\%\)). ALMA comes third (\(+19\%\)), and then Greedy (\(+21\%\)). Bal's high performance in this metric stems from its use of MWM for step (a), which has the more significant impact on the distance driven. Similar results are observed for the whole day (Fig. 5a), with Bal, ALMA, and Greedy achieving \(+4\%\), \(+18\%\), and \(+22\%\) compared to MWM, respectively. Figure 6a shows that as we decrease the number of taxis, Bal loses its advantage, Greedy falls further behind ALMA (\(9\%\) worse), while ALMA closes the gap to MWM (\(+17\%\)).

7.2.2 Complexity

To estimate the complexity, we measured the elapsed time of each algorithm. Greedy is the fastest (Fig. 4b), closely followed by Har, Bal, and ALMA. ALMA is inherently decentralized: the red overlay denotes ALMA's parallel running time, which is 2.5 orders of magnitude lower than Greedy's.

7.2.3 Time to pick-up

MWM exhibits an exceptionally low time to pick-up (Fig. 4c), lower even than the single-ride baseline. ALMA, Greedy, and Bal are at \(+69\%\), \(+76\%\), and \(+33\%\) relative to MWM, respectively. As before, Fig. 6b shows that as we decrease the number of taxis, Bal loses its advantage and Greedy falls further behind ALMA. Note that, to improve visualization, we removed DC's pick-up time, as it was one order of magnitude larger than Appr.'s.

7.2.4 Delay

PG exhibits the lowest delay (Fig. 4d), but only because it makes \(26\%\) fewer shared rides than the rest of the high-performing algorithms. Among the rest, ALMA has the smallest delay (\(-13\%\) compared to MWM), followed by Greedy (\(-1\%\)), while Bal has \(+63\%\) (all compared to MWM). As the number of taxis decreases (Fig. 6c), ALMA's gains increase further (\(-22\%\) compared to MWM).

Figures 5d and 6d depict the cumulative delay, which is the sum of all delays described in Sect. 3.1.2, namely the time to pair, time to pair with taxi, time to pick-up, and delay. An interesting observation is that reducing the fleet size from 12828 vehicles (\(\times 3.0\) the base number) to just 3207 (\(\times 0.75\) the base number) – a \(75\%\) reduction – results in only approximately 2 minutes of additional delay (Fig. 6d). This demonstrates the great potential for efficiency gains that such technologies have to offer.

Finally, we wanted to investigate the distribution of the achieved QoS metrics and, consequently, the reliability/fairness of each CAR. As such, we plotted in Fig. 7a the sequence of percentilesFootnote 20 of the cumulative delay. As shown, the vast majority of users (\(75\%\)) experience a cumulative delay close to the average value (only 46, 85, 92, and 69 additional seconds over the average for MWM, ALMA, Greedy, and Bal, respectively). Of course, some users experience a high cumulative delay, but they constitute a small percentage: less than \(5\%\) of requests experience a delay of more than 8.5, 13, 13, and 9.5 minutes for MWM, ALMA, Greedy, and Bal, respectively. Given the size of Manhattan and the average speed of taxi vehicles there, such delays are to be expected and, thus, acceptable; ultimately, it is up to the ridesharing platform to impose hard constraints and reject requests with potentially high delay.
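For reference, such a percentile (reliability) curve can be produced directly from the per-request cumulative delays; the snippet below is a generic sketch using synthetic data, not the paper's plotting code.

```python
# Percentile curve of cumulative delays, as plotted in Fig. 7a.
import numpy as np

def delay_percentiles(cumulative_delays, step=1):
    """Return (percentile, delay) pairs for a reliability curve."""
    qs = np.arange(0, 100 + step, step)
    return qs, np.percentile(cumulative_delays, qs)

delays = np.random.exponential(scale=120.0, size=10_000)  # synthetic delays (s)
qs, vals = delay_percentiles(delays)
print(vals[75], vals[95])   # e.g., the 75th and 95th percentiles discussed above
```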

7.2.5 Frictions

Figure 5b shows the driver frictions. In this metric, k-server algorithms seem to outperform matching algorithms by far. Compared to MWM, Bal and Har achieve a \(63\%\) and \(73\%\) decrease, respectively, while ALMA and Greedy achieve a \(26\%\) and \(21\%\) decrease, respectively. Given that we have a fixed supply, lower frictions indicate a more even distribution of rides amongst taxis.

It is important to note that while the results for all other metrics are consistent when moving from the one-hour test-case to the full-day test-case, this is not true for the frictions (see Figs. 9i and 12i and Tables 6 and 10 in the Appendix). This is because taxis that serve zero or one rides are assumed to have zero friction by definition. Algorithms like Bal – which attempts to balance the distance driven by each taxi – utilize each vehicle multiple times, even within the short window of one hour. This results in a deceptively high friction value in the one-hour test-case. As a matter of fact, the number of taxis that served fewer than two rides (and thus had zero friction) in the one-hour test-case was 483 for Bal. For MWM this number is 1368 (almost 3 times larger), for ALMA it is 1181, and for Greedy 1120. This is why we opted to present the frictions for the full-day test-case in Fig. 5b.

7.2.6 Time to pair with taxi & number of shared rides

Excluding the test-case with the smallest taxi fleet (\(\times 0.5\) the base number), the time to pair with a taxi was zero, or close to zero, for all the evaluated algorithms. This demonstrates the potential for efficiency gains and better utilization of resources using smart technologies. The reason for the low time to pair with a taxi is that, for step (b) of the ridesharing problem (matching (shared) rides to taxis), we run the offline algorithms in a just-in-time (JiT) manner, i.e., every time the set of rides (\({\mathcal {P}}_t\)) is non-empty (see Sect. 4.1). We opted to do so for simplicity – the alternative would require running all combinations of batch sizes for both steps (a) and (b). Results from step (a), though, suggest that running in batches is more beneficial (a batch size of two minutes consistently outperformed the JiT version; see Appendix A). There is a clear trade-off: match with a taxi as soon as possible (JiT), so that a vehicle starts moving towards the pick-up earlier, or wait (match in batches every x minutes), potentially allowing for better matches? Answering this question remains open for future work.

The number of shared rides is approximately the same for all the employed algorithms, with the notable exception of PG, which makes \(26\%\) fewer shared rides.

7.2.7 Relocation

The aim of any relocation strategy is to improve the spatial allocation of supply, since serving requests redistributes the taxis, resulting in an inefficient allocation. One could adopt a ‘lazy’ approach, relocating vehicles only to serve requests. While this minimizes the cost of serving a request (e.g., distance driven, fuel, etc.), it results in sub-optimal QoS. Improving the QoS (especially the time to pick-up, since it correlates strongly with passenger satisfaction; see Sect. 3.1.2) plays a vital role in the growth of a company. Thus, the crucial trade-off of any relocation scheme is to improve the QoS metrics while minimizing the excess distance driven.

Table 3 Relocation gains
Fig. 8

Time to Pick-up (s) – End-To-End Solution, January 15, 2016 – 00:00 - 23:59 – Manhattan – #Taxis = 5081

CARs with relocation successfully balance this trade-off (Table 3). In particular, ALMA – the best performing overall – radically improves the QoS metrics by more than \(50\%\) (e.g., it decreases the pick-up time by \(55\%\), and its standard deviation (SD) by \(58\%\)), while increasing the driving distance by only \(6\%\). The cumulative delay is decreased by \(43\%\).

As a final step, we evaluate end-to-end solutions, using MWM, ALMA, and Greedy to solve all three steps of the ridesharing problem. Figure 8 depicts the time to pick-up (error bars denote one SD of uncertainty), a metric highly correlated with passenger satisfaction (Tang et al. 2017; Brown 2016b). We compare against the single-ride baseline (no delay due to sharing a ride; see Sect. 4.1.13). Once more, the proposed relocation scheme results in radical improvements: the time to pick-up (relative to the single-ride baseline) drops from \(+14.09\%\) to \(-41.76\%\) for MWM, from \(+74.14\%\) to \(-9.33\%\) for ALMA, and from \(+86.10\%\) to \(-7.97\%\) for Greedy. This demonstrates that simple relocation schemes can eliminate the negative effects of ridesharing on QoS.

7.2.8 ALMA as an end-to-end CAR

While MWM seems to perform best in total distance driven and most QoS metrics – which is reasonable, since it makes optimal matches amongst passengers – it is hard to scale and requires a centralized solution. In contrast, greedy approaches are appealing\(^{11}\) not only due to their low complexity, but also because real-time constraints dictate short planning windows, which can diminish the benefit of batch optimization solutions compared to myopic approaches (Widdows et al. 2017).

In fact, ALMA is of a greedy nature as well, albeit with a more intelligent back-off scheme; thus, there are scenarios where ALMA significantly outperforms Greedy, as shown by the simulation results. For example, in more challenging scenarios (a smaller taxi fleet, or potentially different types of taxis), the smarter back-off mechanism results in a more pronounced difference.

Most importantly, ALMA was inherently developed for multi-agent applications. Agents make decisions locally, using completely uncoupled learning rules, and require only 1-bit partial feedback (Danassis et al. 2019), making ALMA an ideal candidate for an on-device implementation. This is fundamentally different from a decentralized implementation of, say, the Greedy algorithm: even in decentralized algorithms, the number of communication rounds required grows with the size of the problem, while in practice the real-time constraints impose a limit on the number of rounds, and thus on the size of the problem that can be solved within them.
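To convey the on-device flavour, the following is a deliberately simplified, hypothetical back-off loop in the spirit of ALMA; the precise back-off probabilities, feedback model, and guarantees of (Danassis et al. 2019) differ, and the utilities below are made up.

```python
# Decentralised back-off matching sketch: each free taxi bids on its best
# remaining ride; on collision, taxis with little to lose back off more often.
import random
from collections import defaultdict

def alma_like_matching(utilities, rounds=100):
    """utilities[taxi][ride] -> gain; returns a (partial) taxi -> ride matching."""
    free = set(utilities)
    taken, assignment = set(), {}

    def loss(taxi):
        # gap between the taxi's best and second-best remaining ride
        vals = sorted(u for r, u in utilities[taxi].items() if r not in taken)
        return vals[-1] - (vals[-2] if len(vals) > 1 else 0.0)

    for _ in range(rounds):
        bids = defaultdict(list)
        for taxi in free:
            avail = [(u, r) for r, u in utilities[taxi].items() if r not in taken]
            if avail:
                bids[max(avail)[1]].append(taxi)     # bid on the best remaining ride
        for ride, bidders in bids.items():
            if len(bidders) > 1:                     # collision: low-loss taxis back off
                bidders = [t for t in bidders
                           if random.random() >= 1.0 / (1.0 + loss(t))]
            if len(bidders) == 1:                    # uncontested after back-off
                assignment[bidders[0]] = ride
                taken.add(ride)
                free.discard(bidders[0])
    return assignment

utils = {"taxi1": {"rideA": 5.0, "rideB": 1.0},
         "taxi2": {"rideA": 4.0, "rideB": 3.5}}
print(alma_like_matching(utils))
```

In this sketch, each taxi only needs to know whether its own bid succeeded, mirroring the 1-bit feedback mentioned above.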

Table 4 High level (qualitative) ranking of the evaluated CARs

7.3 High-level analysis

Applying the modular approach we advocate allowed us to thoroughly test a wide variety of state-of-the-art algorithms for ridesharing. When dealing with a multi-objective optimization problem, it is unreasonable to expect a single approach to outperform the competition across the board. Nevertheless, our findings provide convincing evidence to a ridesharing platform as to which CARs would be most suitable for a given set of objectives. Specifically: (i) CARs that rely on offline (in-batches) maximum-weight matching solutions perform well on global efficiency and passenger-related metrics, (ii) CARs based on k-server algorithms perform well on platform-related metrics (e.g., Bal), (iii) lightweight CARs perform better in real-world, large-scale settings due to the short planning windows imposed by the requirement to run in real-time, (iv) a simple, fine-grained relocation scheme based on the history of requests can significantly improve Quality of Service metrics by up to \(50\%\), and finally, (v) we identify a scalable, on-device CAR based on ALMA that performs well across the board. A summary of the results can be found in Table 4.

8 Conclusion

Managing transportation resources on a large scale remains a critical open problem. We initiate the systematic study of Component Algorithms for Ridesharing (CARs), a modular design methodology for ridesharing. To gain insight into the intricate dynamics of the problem, it is essential to evaluate a diverse set of candidate solutions in settings designed to closely resemble reality. We evaluate 14 candidate CARs – focused on the key algorithmic components of ridesharing – over 10 metrics, in such realistic settings covering every aspect of the problem. To the best of our knowledge, this is the first end-to-end evaluation of this magnitude. We show the capacity of simple relocation schemes to radically improve QoS metrics, eliminating the negative effects of ridesharing, and we identify an ALMA-based CAR that offers an efficient (across all metrics), scalable, on-device, end-to-end solution.