1 Introduction

Since Uber launched in 2009, ride-hailing has evolved into a global service that has become an indelible aspect of our lives [1, 2]. While such services have given consumers (the riders) more choices and drivers the ability to earn money on a flexible schedule, some aspects of these services might have adverse social effects [3,4,5,6]. For example, in Chicago, the only U.S. city to require ride-hailing companies to release pricing information, research has shown that their pricing algorithms can lead to disparate impact: neighborhoods with a higher proportion of people of color, higher levels of poverty, and younger residents are significantly associated with higher fare prices [3]. This disparate impact is especially troubling because it further harms communities that have historically been systemically marginalized in a myriad of ways and tend to be more dependent on these services in the first place due to poor public transportation infrastructure [7,8,9,10,11,12].

The dearth of viable public transportation options in disadvantaged parts of town results in strikingly different mobility patterns for disadvantaged and non-disadvantaged residents (Fig. 1). The difference in destinations for weekday rush-hour trips beginning in disadvantaged neighborhoods (Fig. 1a) vs. those beginning in non-disadvantaged neighborhoods (Fig. 1b) is remarkable. Trips beginning in disadvantaged neighborhoods are overwhelmingly to neighborhoods not well served by public transportation, while trips beginning in non-disadvantaged neighborhoods are to areas well connected by public transportation, such as downtown Chicago (Fig. 1c).

Systemic disparity in the transportation options available to disadvantaged residents is an issue multiple city governments already recognize and wish to fix, often by offering discount pricing for public transportation to those in need [13,14,15,16]. Reducing mobility inequities is one of Chicago’s five essential elements for its development plan focused on inclusive growth, On To 2050 [17]. Unfortunately, these policies do not help alleviate the root cause of the problem: public transportation in affected areas is underfunded with inefficient coverage [7, 9, 18]. Ride-hailing apps worsen this inequity by charging affected residents higher prices [3]. Indeed, researchers analyzing Uber data for six large cities in the U.S.A. and Europe found that 20 to 40% of ride-hailing trips had no viable public transportation alternatives [19]. This work provides a technique to help address this worsened inequity, with discounts proposed by one of our pricing mechanisms covered by the government as subsidies to reduce the disparate impact on disadvantaged neighborhoods, in the same vein as other subsidy initiatives for low-income residents for necessary services [13,14,15,16].

Fig. 1 Transportation patterns in Chicago

While we advocate for systematically extending public transportation services to provide equitable and affordable coverage for all residents as the overarching goal for government policy, improving public transportation is a complex and extremely resource-intensive endeavor. It takes considerable time to implement, often decades, and requires funding on the order of billions of dollars. Although city governments all over the U.S. have been aware of the inequitable reach of public transportation for quite some time [7, 9, 17], achieving noticeable improvements in coverage has proven challenging. The extremely high costs are a significant barrier, with estimates from the Eno Center for Transportation indicating that building at-grade or above-ground public transportation in the U.S. costs approximately $73 million per kilometer on average [21]. Transit that is predominantly below ground costs even more, at $354 million per kilometer on average [21]. Meanwhile, disadvantaged residents need relief urgently. Our proposed solution can bridge the gap until public transportation can be adequately improved. Mayor Lightfoot of Chicago announced $12.5 million in transportation cost relief, out of which $5 million has been earmarked for prepaid cards eligible for use on public transit by disadvantaged residents [22]. While these funding amounts may not be sufficient for constructing new public transportation infrastructure, they can offer immediate relief to concerned residents through approaches like ours.

Although important and necessary, unveiling and mitigating the disparate impact of ride-hailing apps’ pricing mechanisms to provide a foundation for the government to draft fair ride-hailing policies presents unique challenges. First, there is a difference in affordability. Different neighborhoods may have different average income levels, and thus, what may be affordable for a resident of one neighborhood may not be affordable for a resident of another. A fair and equitable solution should price rides according to affordability so that nobody gets priced out. Second, metrics to measure bias in the context of disparate impact due to pricing are unavailable. Although multiple metrics for measuring bias have been proposed in the literature [23, 24], they cannot be applied here directly since they are not designed to capture the spatial and economic aspects of the problem. Finally, there are data-related challenges, as not all the data ideally needed to model the problem holistically is available [25]. For example, price elasticity [26], which in our case measures the change in the number of trips with a change in the price of trips, is unknown, as is the surge level of trips. These quantities must therefore be estimated, which at times requires making assumptions.

To address the aforementioned challenges, this paper introduces novel definitions of fair pricing and corresponding fair-pricing strategies to help reduce the disparate impact towards marginalized groups heavily reliant on ride-hailing services due to systemic inequities in public transportation. To the best of our knowledge, this is the first work that not only investigates bias within ride-hailing but also presents pragmatic solutions to mitigate the identified disparate impacts. More specifically, the main contributions of this paper are:

  • We develop two metrics to detect and quantify the existing bias in ride-hailing pricing affecting marginalized communities. The first, Relative Rideability (RR), compares the average number of rides taken by disadvantaged and non-disadvantaged groups. The second explores the difference in affordability of trips for different groups and the relationship between trip pricing and the number of rides that take place, formally defining bias in ride-hailing with the concepts of price elasticity and consumer surplus from the economics literature.

  • We then propose two versatile fair-pricing strategies that effectively reduce disparate impact on marginalized groups and can flexibly accommodate different requirements for varied scenarios.

  • Extensive experimental evaluation on real-world ride-hailing and census data from Chicago demonstrates the capability and effectiveness of our pricing mechanisms in the real world.

The remainder of the paper is organized as follows: We survey related work in Sect. 2. Section 3 explores bias within ride-hailing; Sect. 4 presents our metrics to measure this bias. Our pricing mechanisms are introduced in Sect. 5, and experimental results on real-world data in Sect. 6. Finally, we conclude with Sect. 7.

2 Related work

Although research into pricing mechanisms for ride-hailing services is a burgeoning area, most work looks at improving the efficiency of matching drivers and riders [27], pricing rides at times of low supply [28], or optimizing pricing to regulate demand on the platform while ensuring a reasonable wait time for all riders [29]. Other work explores using deep learning to price trajectories [30] and studies the effect of surge pricing [31]. There is also extensive work on matching drivers and riders efficiently [32,33,34]. There is little work examining fair pricing to reduce disparate impact among riders. One relevant study found that fares for Uber rides in the U.S.A. from major airports to hotels were significantly correlated with the prices of hotel rooms [5], although it does not attempt to address this price discrimination.

Fair benefits for drivers on the platform have also been investigated, with the idea that spread out over time, all drivers active on the platform should receive benefits in proportion to the amount of time they have spent active on the platform [35], but fair prices for riders are not considered. The focal point of this work is to approach ride-hailing from the opposite end: fair and equitable pricing for riders.

Lu et al. [36] study the hypothetical scenario where ride-hailing services use personal data about customers (e.g., purchase habits) to determine customized prices for each customer. The work introduces the concept of a rider buying “fair price insurance,” which is a price auditing contract. The blockchain-based smart price auditing system (called ‘Spas’) determines when a customer was charged a price based on their profile (instead of the ‘correct/expected’ price calculated according to a pricing policy they have access to), and in such instances, issues a refund [36]. While this can help address a hypothetical source of discrimination, it does not explicitly address the disparate impact caused by current pricing techniques toward marginalized groups. This work, in contrast, focuses on mitigating the observed disparate impact on marginalized groups.

A growing body of work examines discrimination on ride-hailing platforms unrelated to pricing. Male passengers using such services with typically African-American-sounding names are almost 3 times as likely to have drivers cancel on them compared to passengers with typically white-sounding names [4, 37]. In Seattle, African-American passengers faced wait times as much as 35% longer and on average about 10% longer than white passengers [4, 37]. Such trends, combined with higher fares associated with disadvantaged neighborhoods, create significant disparate impact.

Such phenomena are not restricted to ride-hailing platforms alone. Much of the peer-to-peer economy has similar problems. Researchers have found that on AirBnB, an online platform for temporary lodging, non-black hosts in New York City are able to charge approximately 12% more than black hosts for an equivalent rental [38]. Guests looking for accommodation on the platform with typically African-American names are 16% less likely to be accepted by hosts than identical guest profiles with white-sounding names [39]. Another work looked at peer-to-peer lending and found that requests for loans by African-Americans were significantly less likely (25 to 35% less likely) to be accepted than requests from a white person with a similar credit profile [40].

3 Bias in ride-hailing

Biased pricing in ride-hailing results in higher prices for people likely to be struggling financially. The lack of accessible public transportation options in disadvantaged neighborhoods leads to markedly different ride-hailing patterns for people from disadvantaged and non-disadvantaged communities. Figure 1 depicts destinations for ride-hailing trips in Chicago for these communities. Economically Disconnected Areas (EDAs), identified by the Chicago Metropolitan Agency for Planning (CMAP), are neighborhoods with high concentrations of low-income households, minorities, or households with limited English proficiency. Approximately one-third of the city’s inhabitants live in an EDA [41]. Figure 1a shows destinations for ride-hailing trips beginning in EDAs during morning rush hour on weekdays, and Fig. 1b shows the same for ride-hailing trips beginning in non-EDAs. Since we look at weekday morning rush hours (6–9 AM, Monday through Friday), it is reasonable to assume these trips are principally by EDA and non-EDA residents, respectively, commuting to work. The most common destination for trips by the non-disadvantaged is downtown Chicago (Fig. 1b), the central business district of the city and a hub of offices. Even though the city’s downtown is extensively connected by public transportation, as shown by the red box in Fig. 1c, people living in non-EDA regions still choose ride-hailing over public transit (i.e., convenience rather than necessity). On the other hand, residents of EDAs take ride-hailing trips to other parts of the city not as well served by public transportation (i.e., necessity). They generally do not take ride-hailing trips to regions well connected by public transportation (e.g., downtown Chicago). Travel patterns for evening weekday rush hours are similar: non-disadvantaged residents of Chicago start ride-hailing trips overwhelmingly from downtown Chicago, a region well serviced by public transit. In contrast, disadvantaged residents start ride-hailing trips far from downtown. These behavior patterns strengthen the argument that ride-hailing services are helping fill a gap left by public transportation in disadvantaged communities.

To address this disparate impact due to the ride-hailing services’ black-box pricing algorithms, we consider bias in ride-hailing to be a significant difference in the average number of rides taken by the disadvantaged versus those of non-disadvantaged groups and corresponding affordability. We adopt the Chicago Metropolitan Agency for Planning’s designation of certain census tracts as Economically Disconnected Areas (EDAs) to denote disadvantaged neighborhoods and all other neighborhoods (i.e., non-EDAs) as non-disadvantaged neighborhoods. EDAs are regions that are disconnected from economic growth and prosperity and may be experiencing disinvestment [17, 41]. We further define EDA-trips to denote trips (or rides) that either begin or end in an EDA region. Non-EDA-trips denotes trips (or rides) that do not. In the remainder of this paper, we interchangeably use the terms EDA regions and disadvantaged neighborhoods as well as non-EDA regions and non-disadvantaged neighborhoods for readability.

While addressing mobility inequity in all forms is important, addressing certain forms is more urgent than others. For example, mobility inequity that hinders someone’s work commute is more critical than mobility inequity when going to watch a movie or sporting event, because of the potential economic impacts that might have cascading effects on an individual’s life. Therefore, along with all EDA trips, we also look at trips originating in EDA regions between 6–9 AM and trips ending in EDA regions between 4–6 PM, Monday through Friday (i.e., the weekday rush hours). The intuition is that trips originating in EDAs during morning rush hour and trips terminating in EDAs during evening rush hour are highly likely to be EDA residents commuting to or from work. While there may be other critical trips one might take, such as going to the hospital or getting vital medication from the pharmacy, such trips are hard to pinpoint from transportation data alone, especially when locations are represented at the level of the census tract and not points of interest (e.g., hospitals, pharmacies, schools, et cetera). On the other hand, work commutes tend to be easier to isolate in data due to timing patterns, and therefore, we focus on them.
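As an illustration of the commute filter described above, the following is a minimal sketch, not the implementation used in this work. The function name, the boolean EDA flags, and the half-open hour windows (start hour in [6, 9), end hour in [16, 18)) are assumptions made for the example; the released data rounds timestamps to 15 minutes, so the exact boundary handling is a judgment call.

```python
from datetime import datetime


def is_commute_eda_trip(start: datetime, end: datetime,
                        pickup_in_eda: bool, dropoff_in_eda: bool) -> bool:
    """Flag a trip as a likely work commute by an EDA resident.

    Hypothetical helper: a trip qualifies if it starts in an EDA during the
    weekday morning rush, or ends in an EDA during the weekday evening rush.
    """
    # Morning rush: pick-up in an EDA, Monday-Friday, start hour in [6, 9).
    morning = (pickup_in_eda
               and start.weekday() < 5
               and 6 <= start.hour < 9)
    # Evening rush: drop-off in an EDA, Monday-Friday, end hour in [16, 18).
    evening = (dropoff_in_eda
               and end.weekday() < 5
               and 16 <= end.hour < 18)
    return morning or evening
```

For example, a trip picked up in an EDA at 7:30 AM on a Monday would be flagged, while the same trip on a Saturday would not.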

4 Unveiling ride-hailing bias

Despite notions of bias and fairness having been debated in philosophy for millennia [42,43,44] and for over a decade in computer science, with over twenty-five statistical metrics already proposed [23], there is still no consensus about which definition is appropriate for which context [24]. In this work, we study bias in ride-hailing from two angles. The first, Relative Rideability, is inspired by statistical parity, which has garnered support as an intuitive metric for bias in past research [23, 24]. The second approach considers affordability, which is commonly used in public policy as a metric to study how prices of essential goods and services such as housing, transportation, electricity, et cetera affect different groups in society and to propose policies to mitigate adverse impacts on disadvantaged communities [45,46,47].

4.1 Relative rideability

Inspired by demographic parity and aligned with the \(p\%\)-rule [48, 49] used by the U.S. Equal Employment Opportunity Commission (EEOC) to evaluate disparate impact, we introduce the Relative Rideability (RR) score to quantify the previously discussed bias in ride-hailing shown as the significant difference in the average rides across different groups. Specifically, the \(p\%\)-rule states that if the selection rate for a particular group is less than \(p\%\) of the selection rate for the group with the highest selection rate, then there is a substantially different rate of selection and may be considered disparate impact. Analogously, RR can be mathematically defined as:

$$\begin{aligned} RR = \frac{\min \{d_1,\dots , d_i\}}{\max \{\lnot d_1,\dots , \lnot d_j\}} \end{aligned}$$
(1)

where \(d_{i}\) is the average number of trips by residents of disadvantaged group i, and \(\lnot d_{j}\) is the average number of trips by residents of non-disadvantaged group j. In our setting, RR is the average number of trips by EDA residents divided by the average number of trips by non-EDA residents. While we have one disadvantaged and one non-disadvantaged group, RR is not restricted to binary settings. With \(\lnot d_{j}\) commonly greater than \(d_{i}\) in inequitable ride-hailing services, the higher the RR, the fairer the model.
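Eq. 1 can be computed directly from per-group averages. The sketch below is illustrative only: the function names and input format are assumptions, and the default threshold of 80% reflects the EEOC four-fifths convention rather than a value fixed by this work.

```python
def relative_rideability(disadvantaged_avgs, non_disadvantaged_avgs):
    """Eq. 1: smallest average trip count among disadvantaged groups divided
    by the largest average trip count among non-disadvantaged groups."""
    if not disadvantaged_avgs or not non_disadvantaged_avgs:
        raise ValueError("each side needs at least one group average")
    denominator = max(non_disadvantaged_avgs)
    if denominator <= 0:
        raise ValueError("non-disadvantaged average must be positive")
    return min(disadvantaged_avgs) / denominator


def violates_p_rule(rr, p_percent=80.0):
    """True if RR falls below p% (assumed default: the four-fifths rule)."""
    return rr < p_percent / 100.0


# Hypothetical averages: 1.5 trips per EDA resident vs. 3.0 per non-EDA resident.
rr = relative_rideability([1.5], [3.0])
print(rr, violates_p_rule(rr))  # 0.5 True
```

With several groups on either side, the same function compares the worst-off disadvantaged group against the best-off non-disadvantaged group.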

4.2 Affordability

Next, we look at another way of quantifying bias in ride-hailing: difference in ride affordability. A measure of affordability can be captured by consumer surplus [50,51,52], an economics concept that captures the difference between a consumer’s willingness to pay for a particular product or service and the price of that product or service. If the price of the product or service is less than the amount the customer is willing to pay for it, then the consumer surplus is positive. Otherwise, it is negative, which can happen if the good or service is necessary (e.g., food or life-saving medication). In other words, the higher the consumer surplus, the more easily the consumer can afford that product or service.

However, the information needed to quantify consumer surplus in ride-hailing, such as the number of people who looked at the quoted price but did not make a request and the price itself, is unavailable in publicly released ride-hailing data. We circumvent this challenge by computing price elasticity (\(E_p\)) instead, which measures the change in demand of a product or service with respect to a change in its price, to obtain an estimation of consumer surplus:

$$\begin{aligned} E_{p} = \frac{\delta _Q}{\delta _P} \end{aligned}$$
(2)

where \(\delta _Q\) is the percentage change in the quantity of the product or service demanded, and \(\delta _P\) is the percentage change in its price. As a note, price elasticity will also be used in our following fair pricing mechanisms to estimate how the number of trips might change with a change in price.

Price elasticity is considered when there is a change in the price of a good or service. However, ride-hailing trips are not assigned a flat per-mile rate; thus, simply studying trips at different prices is not ideal since the difference in the price of two trips might be due to multiple factors. Therefore, we study similar trips that were shown different prices due to how ride-hailing platforms compute surge level, a multiplier applied to the estimated ride price at times of high demand or low supply. In addition, ride-hailing platforms typically compute a continuous surge level for rides but show a discrete value to consumers for simplicity and ease of understanding. For example, a trip for which a surge level was computed as 1.449 will result in a discretized surge of 1.4x. In contrast, a surge level of 1.451 will result in a discretized surge of 1.5x, even though the difference between the two trips’ continuous surge levels is only 0.002. We use regression discontinuity design around these discretized surge points to estimate price elasticities when surge levels go from 1.2x to 1.3x, 1.3x to 1.4x, 1.4x to 1.5x, and so forth. We run a linear probability model, a special case of ordinary least squares regression commonly used in economics, in which the outcome (dependent) variable is binary and the independent variables may be binary or continuous. We fit the linear probability model below for each surge discontinuity, with Ride indicating whether a particular request leads to a trip, and include all trips on either side of the surge discontinuity:

$$\begin{aligned} \begin{aligned} Ride = \beta _0 + (\alpha * i_1 * i_2) + (\beta _1 * i_1) + \\ (\beta _2*(1 - i_1) * i_2) + (\beta _3 (1 - i_2) * x_1) \\ + (\beta _4 * i_2 * x_1) + \epsilon \end{aligned} \end{aligned}$$
(3)

where \(\alpha \) indicates the drop in rides around a discontinuity; \(i_1\) is an indicator variable for whether the surge of that particular trip lies within 0.01 of a surge price discontinuity; \(i_2\) is an indicator variable for whether the surge for this particular trip is to the right of the price discontinuity (i.e., its discretized surge level is higher than the discretized surge point, so a trip with a continuous surge level of 1.451 discretized to 1.5x surge has a value of 1, whereas 1.449 discretized to 1.4x surge has a value of 0); \(x_1\) is the actual (non-discretized) surge value; the \(\beta \)s are the coefficients; and \(\epsilon \) is the error term. To compute this at each jump level, we make use of trips that are on either side of the discontinuity. Since we compute this for each surge discontinuity, \(\alpha \) helps capture the change in number of trips due to change in price because of the surge level. Our calculation of consumer surplus and price elasticity follows the approach detailed in an economics paper [53] that computes consumer surplus for the service UberX across four major markets in the United States. To sum up, price elasticities are first estimated at different surge levels using the regression in Eq. 3, and those price elasticities are then used to compute consumer surplus.
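To make the structure of Eq. 3 concrete, the sketch below builds the six regressors for one surge discontinuity and fits them by ordinary least squares using only the Python standard library. It is an assumption-laden illustration: the function names are hypothetical, the normal equations are solved with a small Gaussian elimination routine, and no standard errors are computed. A real analysis would normally rely on a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(surge, cutoff, window=0.01):
    """Regressors of Eq. 3 for one request with continuous surge `surge`."""
    i1 = 1.0 if abs(surge - cutoff) <= window else 0.0  # near the discontinuity
    i2 = 1.0 if surge >= cutoff else 0.0                # right of the discontinuity
    # Order: beta0, alpha, beta1, beta2, beta3, beta4.
    return [1.0, i1 * i2, i1, (1 - i1) * i2, (1 - i2) * surge, i2 * surge]


def fit_lpm(surges, rides, cutoff, window=0.01):
    """Least-squares fit of Eq. 3; `rides` holds 0/1 outcomes per request."""
    rows = [design_row(s, cutoff, window) for s in surges]
    k = len(rows[0])
    # Normal equations: (X^T X) coef = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, rides)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    names = ["beta0", "alpha", "beta1", "beta2", "beta3", "beta4"]
    return dict(zip(names, coef))
```

Calling `fit_lpm(surges, rides, cutoff=1.45)` on requests around the 1.4x to 1.5x boundary returns a dictionary whose `"alpha"` entry is the estimated drop in rides at that discontinuity. The fit needs requests on both sides of the cutoff, both inside and outside the 0.01 window.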

This regression discontinuity design we compute at each price discontinuity (Eq. 3) helps estimate \(\alpha \) (the drop in rides around a surge level discontinuity), which indicates the change in the number of trips that occur at that surge level discontinuity. We can then make use of this \(\alpha \) to compute price elasticities for price discontinuities as below:

$$\begin{aligned} E_{p} = \frac{\delta _Q}{\delta _P} = \frac{\frac{\alpha }{N_p}}{\delta _P} \end{aligned}$$
(4)

where \(N_p\) is the proportion of trips at a specific price p.
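As a worked instance of Eq. 4, the sketch below converts an estimated \(\alpha \) into a price elasticity. The function names are hypothetical, and measuring the percentage price change relative to the lower surge level (so that 1.0x to 1.1x is a 10% increase) is an assumption of this example.

```python
def pct_price_change(surge_low, surge_high):
    """Relative fare increase when the displayed surge steps from low to high."""
    if surge_low <= 0:
        raise ValueError("surge level must be positive")
    return (surge_high - surge_low) / surge_low


def price_elasticity(alpha, n_p, delta_p):
    """Eq. 4: E_p = (alpha / N_p) / delta_P."""
    if n_p <= 0 or delta_p == 0:
        raise ValueError("N_p must be positive and delta_P non-zero")
    return (alpha / n_p) / delta_p


# Hypothetical numbers: a drop of 0.03 in rides at the 1.0x -> 1.1x step,
# with N_p = 0.6 at the lower price.
delta_p = pct_price_change(1.0, 1.1)
print(round(price_elasticity(-0.03, 0.6, delta_p), 3))  # -0.5
```

An elasticity of -0.5 would mean that a 10% fare increase is associated with roughly a 5% decrease in trips.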

At this point, we run into another data-related challenge: surge levels are unavailable in ride-hailing data. To get around this, we use RANdom SAmple Consensus (RANSAC) regression to determine when surge pricing was in effect and what the surge levels were. When data contains outliers, RANSAC can be used to robustly estimate model parameters from a subset of ‘inliers’ (i.e., the data points that are not outliers) from the dataset. The intuition behind using RANSAC is as follows. According to Uber, surge pricing goes into effect when an unusually large number of people request trips simultaneously [54]. If all trips that occur at 1.0x (no surge), after accounting for trip duration, time of day, and locations, are standard trips or inliers, then trips that occur at surge pricing would be considered outliers. Comparing each trip’s fare with the fare predicted by the inlier model yields a continuous surge multiplier, which we can then discretize similarly to Uber (for example, 1.449 gets discretized to 1.4, and 1.451 gets discretized to 1.5). Once we have the surge level for each ride (1.0x or higher), we can use rides on either side of a surge level in the regression discontinuity design to estimate price elasticities.
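The idea can be sketched with a deliberately simplified RANSAC that fits fare against a single feature. Everything here is an assumption made for illustration: the only feature is trip duration (time of day and locations are ignored), the tolerance and iteration count are arbitrary, the final refit on inliers that full RANSAC usually performs is omitted, and taking the surge multiplier as the ratio of observed to predicted base fare is one plausible reading rather than a confirmed procedure.

```python
import random


def ransac_line(xs, ys, n_iter=200, tol=0.25, seed=0):
    """Return (a, b) of the line y = a + b*x with the largest consensus set."""
    rng = random.Random(seed)
    best, best_count = None, -1
    for _ in range(n_iter):
        i, j = rng.sample(range(len(xs)), 2)
        if xs[i] == xs[j]:
            continue  # two points with equal x cannot define a fare line
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        # Count the points that lie within `tol` of the candidate line.
        count = sum(abs(y - (a + b * x)) <= tol for x, y in zip(xs, ys))
        if count > best_count:
            best, best_count = (a, b), count
    if best is None:
        raise ValueError("no valid sample pair found")
    return best


def estimate_surge(durations, fares, **ransac_kwargs):
    """Estimate continuous and discretized surge multipliers per trip.

    Assumes the predicted base fare is positive for every trip.
    """
    a, b = ransac_line(durations, fares, **ransac_kwargs)
    # Surge is the observed fare relative to the no-surge fare, floored at 1.0x.
    continuous = [max(1.0, fare / (a + b * d))
                  for d, fare in zip(durations, fares)]
    # One-decimal rounding stands in for the platform's discretization rule.
    discretized = [round(s, 1) for s in continuous]
    return continuous, discretized
```

Trips whose fare matches the inlier model receive a surge of 1.0x, while a trip charged 50% above its predicted base fare receives 1.5x.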

Algorithm 1 Affordability

To reiterate, consumer surplus is the difference between the price a consumer is willing to pay for a product or service and the price they actually pay. Algorithm 1 sketches how we estimate it. Specifically, we look at ride-hailing data for a group (EDA or non-EDA residents) and first estimate trips’ surge levels (line 1). Next, we compute a regression discontinuity design around each successive surge level and then estimate price elasticity at that surge level (lines 2-5). When considering trips at 1.0x surge (i.e., no surge), we take the price elasticity at the next surge level, 1.1x, and calculate the number of trips that would have happened if the customers shown 1.0x surge had been shown a 1.1x surge instead. We then multiply this number of potential trips by the difference in fare (which is 10% in this case) and the average price paid at 1.0x surge. This is the surplus up to surge 1.1x for those riders charged 1.0x surge. We then replicate this for each pair of successive surge levels and sum the results to arrive at the total consumer surplus estimate (lines 6-8). Finally, Algorithm 1 outputs the consumer surplus for trips for the group.
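The summation step can be sketched as follows, taking surge levels and elasticities as already estimated. This follows one literal reading of the paragraph above: each observed surge level contributes only the one-step surplus up to the next level. Whether surplus should also be accumulated over further steps for the same riders is not specified here, so it is left out. The dictionary keys and the function name are hypothetical.

```python
def consumer_surplus(levels):
    """Sum one-step surplus over successive surge levels.

    `levels` is a list of dicts sorted by surge, each with keys:
      'surge'      - discretized surge multiplier (e.g., 1.0, 1.1, ...)
      'n_trips'    - number of trips observed at this surge level
      'avg_price'  - average price paid at this surge level
      'elasticity' - price elasticity estimated for the step INTO this level
    """
    total = 0.0
    for current, nxt in zip(levels, levels[1:]):
        # Relative fare increase between the two successive surge levels.
        pct = (nxt["surge"] - current["surge"]) / current["surge"]
        # Trips that would still have happened at the higher surge level.
        retained = current["n_trips"] * (1.0 + nxt["elasticity"] * pct)
        retained = max(retained, 0.0)
        # Those riders valued the trip at least `pct` above what they paid.
        total += retained * pct * current["avg_price"]
    return total


# Hypothetical group: 1000 trips at 1.0x averaging $10, elasticity -0.5 at 1.1x.
levels = [
    {"surge": 1.0, "n_trips": 1000, "avg_price": 10.0, "elasticity": 0.0},
    {"surge": 1.1, "n_trips": 200, "avg_price": 11.0, "elasticity": -0.5},
]
print(round(consumer_surplus(levels), 2))  # 950.0
```

In this example, 950 of the 1000 riders at 1.0x would still have ridden at 1.1x, and each of them gains about $1 of surplus (10% of the $10 average fare).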

5 Fair pricing mechanisms

With the tailored fairness metrics for ride-hailing, we now detail fair pricing mechanisms to help reduce disparate impact on disadvantaged communities. We focus our efforts on pricing EDA trips to help address the disparate impact on EDA residents, which can help the city of Chicago in addressing concerns about mobility inequity for its residents [17]. Further, to allow for flexibility when accounting for different situations, we define two variants of our pricing mechanism: variable discounting and fixed discounting. With variable discounting, the discount given to EDA-trips may vary depending on many factors, such as time of day, demand, waiting times, et cetera, while with the alternative fixed discounting mechanism, all EDA-trips under consideration would receive the same discount (e.g., 15%). Such a dual approach reflects the needs of the platform and the government’s policy-making. Specifically, the platform might require fixed discounting because that may be more straightforward to implement; on the other hand, the government may think it is justified to give different trips different discounts depending on the conditions at the time and the increased complexity in implementing such a system is not a significant concern. Thus, we propose both methods to ensure versatility.

In Sect. 6 we study the impact of our discounting mechanisms by comparing the bias metrics on all EDA-trips as well as only on work-commute EDA-trips to tackle mobility inequity in work commutes, which can have significant economic impacts on an individual’s life.

5.1 Variable discounting: FairRide

We first propose a new pricing mechanism called FairRide. The intuition behind FairRide is that a pricing mechanism should consider different riders’ differing abilities to afford rides to reduce the disparate impact on the most disadvantaged in society. To that effect, we propose a mechanism that prices rides for EDA regions separately. We focus on EDA-trips, upsample affordable trips, and run a multiple regression model to determine pricing only for rides that begin or end in an EDA region (EDA-trips). Considering only EDA-trips and ensuring ample representation of affordable trips leads to rides being priced in accordance with the riders’ ability to afford them. In Sect. 6, we compare FairRide with machine learning models commonly employed for pricing in the literature [55,56,57,58,59,60]. We find that, when trained only on EDA-trips with affordable trips upsampled, most models yield more trips for EDA residents and a higher relative rideability (RR) than the current pricing mechanism, and FairRide outperforms them all. Two baselines are also implemented: a ‘naive’ baseline that simply applies a $5 discount on all rides, and a ‘control’ baseline that computes RR and affordability for a randomly selected one-third of the census tracts in Chicago.

5.2 Fixed discounting: FixedFairRide

The second fair pricing mechanism, called FixedFairRide, solves an optimization problem to maximize the number of EDA-trips while maintaining or increasing revenue in a manner that offers a consistent and fixed discount (\(\delta \)) to all EDA-trips. In other words, we determine a fixed amount to discount EDA-trips by to ensure a fixed discounting policy. Our overarching goal is to determine the optimal value, \(\delta \), for discounting EDA-trips such that relative rideability increases and the number of EDA trips is maximized. This can be mathematically formulated as:

$$\begin{aligned} \eta = \max _{\delta } \sum _t \left[ \sum _{l} N (\delta * p) \right] \end{aligned}$$
(5)

where \(\eta \) represents the total number of rides, t denotes a particular time period, l represents EDA location pairs, \(\delta \) denotes the discount for EDA-trips, p is the price of the EDA trip, and N is the number of EDA trips at that price.

We introduce two constraints to this optimization according to two possible scenarios. In the first setting, the government subsidizes the discount for EDA residents. Thus, the platform and the driver receive the same revenue they currently receive from a ride, while the EDA resident pays a discounted price and the government subsidy covers the difference. Here, we do not need to consider revenue since the price charged by the platform stays the same. Nor do we need to consider the feasibility of the trip for the driver since the driver still receives the same fare as the pre-discount price. We note that it is not in the interest of the platform to increase prices by the amount of the discount since that would not lead to an increase in trips by EDA-riders, as the riders would still be charged the original prices that are unaffordable for them. However, the government may wish to set a ceiling for the discount they offer (e.g., 30%), which can be a constraint for this setting:

$$\begin{aligned} 0< \delta < n \end{aligned}$$
(6)

where n is the maximum discount a government can subsidize.

In the second setting, we assume the platform offers to cover the discount to help address the disparate impact. However, the platform would likely prefer that its total revenue not fall. Thus, the alternative constraint is:

$$\begin{aligned} Revenue \ge r, \end{aligned}$$
(7)
$$\begin{aligned} 0< \delta < 1 \end{aligned}$$
(8)

where r denotes the total revenue by current pricing, while \(\delta \) must be greater than 0, indicating the discount for EDA-trips is always positive. In addition, we observe that we can increase revenue beyond r simply by adding a tremendous amount of new, heavily discounted trips. However, a trip may not be worth it for the driver below a certain price since they may not earn enough from it to cover their costs and make a profit. To protect drivers, we consider the feasibility of the trip from their perspective and enforce another floor for revenue so that the driver is not in danger of earning little from the trip, and the floored constraint thus becomes:

$$\begin{aligned} Revenue \ge \max (r,\ \eta *p_{min}) \end{aligned}$$
(9)

where r is the total revenue with current pricing, \(\eta \) is the total number of rides, and \(p_{min}\) is the minimum price for a trip, which can be a function of factors deemed important by the platform, such as travel time, distance, and demand.

The discontinuities in price due to surge levels make this a non-convex problem, and not straightforward to solve. We therefore employ grid search [61] while varying values for \(\delta \).
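As an illustration of the grid search under the platform-funded constraints (Eqs. 8 and 9), the following is a minimal sketch rather than our exact implementation. It assumes a constant-elasticity demand response at each surge level and takes the largest feasible discount as the objective; the function name `grid_search_discount`, the `levels` input format, and the grid step are hypothetical choices made only for this example.

```python
def grid_search_discount(levels, p_min, step=0.01):
    """Return the largest grid value of delta meeting Eqs. (8)-(9), or None.

    levels: list of (price, trips, elasticity) tuples, one per surge level
            (hypothetical input format for this sketch).
    p_min:  minimum acceptable price per trip.
    """
    # r: total revenue under current pricing.
    r = sum(price * trips for price, trips, _ in levels)
    best = None
    n_steps = int(round(1 / step))
    for k in range(1, n_steps):  # open interval 0 < delta < 1 (Eq. 8)
        delta = k * step
        revenue = 0.0
        eta = 0.0  # total number of rides after applying the discount
        for price, trips, elasticity in levels:
            # ASSUMPTION: constant-elasticity demand response to the discount.
            new_trips = trips * (1 - delta) ** elasticity
            revenue += price * (1 - delta) * new_trips
            eta += new_trips
        # Floored revenue constraint (Eq. 9).
        if revenue >= max(r, eta * p_min):
            best = delta  # grid is ascending, so the last hit is the largest
    return best


if __name__ == "__main__":
    # One surge level: $20 trips, 1000 rides, elastic demand.
    print(grid_search_discount([(20.0, 1000.0, -1.5)], p_min=13.5))
```

Because every grid point is evaluated independently, the search does not rely on convexity; the cost is that the answer is only as fine as the chosen step.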

While other factors, such as people’s social behaviors or preferences, could be considered when designing fairer pricing mechanisms, exhaustively exploring every factor is beyond the scope of this work. Grounding ourselves in economics, we focus on mitigating the disparate impact of current pricing mechanisms, which results in worse affordability for marginalized groups.

6 Experimental evaluation

In this section, we run experiments on real-world data from the city of Chicago. We use this dataset because it is, to our knowledge, the only publicly available dataset that contains price (or fare) information for trips, which is necessary to address bias in ride-hailing pricing [62]. Data from all ride-hailing platforms operating within Chicago are aggregated, standardized, and anonymized before release, and the platform that serviced a trip is not identified. For each trip, the dataset provides the trip start (or ‘pick-up’) and trip end (or ‘drop-off’) time, each rounded to the nearest 15 min; trip pick-up and drop-off locations at the level of Chicago census tract or community area; duration of the trip in seconds; and the fare of the trip, rounded to the nearest $2.50. Locations for pick-ups or drop-offs outside the city limits of Chicago are unavailable. Drivers and riders are not identified. We look at trips from January to October 2021 and focus on trips not authorized as ‘shared,’ since most ride-hailing services disabled the option for shared rides after the Covid-19 pandemic began in early 2020. Rush hour EDA-trips are trips that originate in EDA regions between 6–9 AM or end in EDA regions between 4–6 PM, Mondays through Fridays, on the assumption that these are likely EDA residents commuting to or from work. We also repeat our analyses with data for the same 10-month period in 2023 (January through October) for consistency.
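The rush-hour definition can be sketched as a simple filter. This is an illustrative sketch only: it reads the definition as two separate categories (morning trips that start in an EDA, evening trips that end in an EDA), treats the time windows as half-open intervals, and uses a hypothetical function name and arguments that are not part of the dataset's schema.

```python
from datetime import datetime


def is_rush_hour_eda_trip(start, end, pickup_in_eda, dropoff_in_eda):
    """Return True if a trip counts as a rush-hour EDA-trip.

    start, end:      datetime objects for pick-up and drop-off.
    pickup_in_eda:   True if the pick-up location lies in an EDA region.
    dropoff_in_eda:  True if the drop-off location lies in an EDA region.
    """
    # weekday() is 0 for Monday ... 4 for Friday.
    # ASSUMPTION: windows are half-open, i.e. [6:00, 9:00) and [16:00, 18:00).
    morning = pickup_in_eda and start.weekday() < 5 and 6 <= start.hour < 9
    evening = dropoff_in_eda and end.weekday() < 5 and 16 <= end.hour < 18
    return morning or evening


if __name__ == "__main__":
    # Monday, 1 March 2021, 7:15 AM pick-up inside an EDA.
    print(is_rush_hour_eda_trip(datetime(2021, 3, 1, 7, 15),
                                datetime(2021, 3, 1, 7, 45), True, False))
```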

Finally, we perform a spatial join between the pick-up and drop-off coordinates of this dataset and a spatial dataset released by the city of Chicago that identifies EDA regions, which lets us identify the trips in our data that begin or end in EDA or non-EDA regions.

6.1 The profound bias

We use the metrics we proposed in Section 4 to unveil the intrinsic discrimination of the ride-hailing companies’ current pricing mechanisms. We find that \(RR = 0.33\) when looking at all EDA-trips. Such a low RR indicates how pervasive the bias is and the extent of the disparate impact it could have on disadvantaged communities. While RR is somewhat better for EDA-trips during the weekday morning rush hour, \(RR = 0.56\), it is still fairly low. We also observe below a stark difference in consumer surplus and affordability (Fig. 2b).

Fig. 2 Surge levels and affordability (\(\Delta _c\))

Next, we explore affordability via consumer surplus. We first use RANSAC regression to estimate when surge pricing was in effect and the surge levels for rides. Fig. 2a shows the general trend: the number of trips decreases as the surge multiplier increases. While we do not have ground truth for surge levels to measure how accurate the predictions from our model are, the trend we observe (Fig. 2b) is in line with the surge level trends in [53], who had access to ground truth as the work was in conjunction with Uber. We can also make the following observations from Fig. 2a. First, a greater share of EDA-trips occurs at lower surge levels than non-EDA-trips. Second, the number of EDA-trips falls below 1,000 at surge level 4.4x, when the average trip price is $62.06. For non-EDA-trips, however, the number of trips does not fall below 1,000 until surge level 5.9x, at an average price of $97.02.
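To make the idea of RANSAC-based surge estimation concrete, the sketch below fits a robust baseline fare and reads the surge multiplier as the ratio of the observed fare to that baseline. It is a hedged illustration, not the model used in our experiments: the single predictor (trip duration), the linear baseline, the tolerance, and the function names `ransac_line` and `surge_multiplier` are all assumptions made for this example.

```python
import random


def ransac_line(points, n_iter=200, tol=1.0, seed=0):
    """Robustly fit y = slope * x + intercept; returns (slope, intercept)."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        # Propose a candidate line through two randomly chosen points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate line, skip
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Consensus set: points within tol of the candidate line.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found")
    # Refit by ordinary least squares on the largest consensus set.
    n = len(best_inliers)
    mx = sum(x for x, _ in best_inliers) / n
    my = sum(y for _, y in best_inliers) / n
    sxx = sum((x - mx) ** 2 for x, _ in best_inliers)
    sxy = sum((x - mx) * (y - my) for x, y in best_inliers)
    slope = sxy / sxx
    return slope, my - slope * mx


def surge_multiplier(fare, duration, slope, intercept):
    """ASSUMPTION: surge is the observed fare over the fitted baseline fare."""
    return fare / (slope * duration + intercept)


if __name__ == "__main__":
    # Synthetic (duration, fare) pairs: a linear baseline plus surged trips.
    base = [(float(x), 2.0 * x + 3.0) for x in range(1, 21)]
    surged = [(5.0, 39.0), (10.0, 69.0), (15.0, 99.0)]
    slope, intercept = ransac_line(base + surged)
    print(surge_multiplier(39.0, 5.0, slope, intercept))
```

The appeal of RANSAC here is that surged trips act as outliers relative to the unsurged baseline, so they do not distort the baseline fit the way they would in ordinary least squares.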

Table 1 Number of trips, RR, and affordability from different models for EDA trips during weekday morning rush hour (6–9AM)
Table 2 Number of trips, RR, and affordability from different models for all EDA trips
Table 3 Discount \(\delta \) for different minimum trip prices \(p_{min}\)
Table 4 Number of trips, RR, and affordability from different models for all EDA trips (2023)

Now that we know the trip surge levels, we use the equations from Sect. 4.2 to calculate price elasticities and consumer surplus. Price elasticity is negative for both EDA-trips and non-EDA-trips, except when surge levels are very high (> 8.0x) and there are very few rides (typically <100). This indicates that as prices increase, demand typically decreases, and vice versa. We find that the consumer surplus of non-EDA-trips is much higher than that of EDA-trips. We examine only surge levels with a reasonable number of rides; at the highest surge levels, we observe fewer than 100 rides at each level, so we do not include these in the following figures. Over the ten-month period examined, the total consumer surplus for non-EDA-trips is $17,972,629.829, while the surplus for EDA-trips is $2,326,555.876 (Fig. 2b). Dividing by the total number of rides serviced in each category, we get an average consumer surplus of $67.11 for non-EDA-trips and $36.76 for EDA-trips.

In 2016, [53] estimated that the consumer surplus for UberX across the U.S. in 2015 was $6.8 billion. By their estimates, one day’s consumer surplus across all cities in the U.S. would thus be roughly $18.6 million. Further, these figures are for 2015 and would likely be higher now. Compared to these figures, and considering that Chicago is a major city in the U.S.A., our estimates are likely lower than the true values. Even so, they indicate that riders beginning or ending trips in non-EDA areas can afford ride-hailing services far more easily than riders who begin or end trips in EDA neighborhoods.
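The comparison with [53] can be reproduced as back-of-the-envelope arithmetic. The sketch below is only a rough check under stated assumptions: it spreads the national figure evenly over 365 days, and it assumes our ten-month window runs from 1 January through 31 October 2021. The variable names are illustrative.

```python
from datetime import date

# National UberX consumer surplus for 2015, as estimated in [53].
us_annual_surplus = 6.8e9
us_daily_surplus = us_annual_surplus / 365

# Our totals for Chicago over the ten-month window (non-EDA plus EDA).
chicago_total_surplus = 17_972_629.829 + 2_326_555.876

# ASSUMPTION: the window covers 1 January through 31 October 2021 inclusive.
window_days = (date(2021, 11, 1) - date(2021, 1, 1)).days
chicago_daily_surplus = chicago_total_surplus / window_days

print(f"U.S. per day:    ${us_daily_surplus:,.0f}")
print(f"Chicago per day: ${chicago_daily_surplus:,.0f}")
```

Under these assumptions, the Chicago per-day figure is a small fraction of the national per-day figure, which is consistent with our totals being conservative.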

6.2 Pricing mechanisms

6.2.1 FairRide

We compare FairRide to machine learning models used for pricing within the finance literature [55,56,57,58,59,60]. We also compare against two baselines. The first is a naive baseline, which applies a discount of $5 to all EDA rides under consideration. That is, when considering all EDA-trips, we apply the $5 discount to all EDA-trips, and when focusing on morning rush hour EDA-trips, the discount applies only to them. The second baseline, termed the control baseline, randomly selects one-third of the census tracts in Chicago and computes relative rideability and affordability for them. We choose one-third because one-third of Chicago’s residents live in an EDA [41]. Since the census tracts are chosen randomly, this control baseline can contain both EDA-trips and non-EDA-trips. RR for this baseline is the average number of trips by EDA residents within the control group divided by the average number of trips by non-EDA residents within the control group.
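The construction of the control baseline can be sketched as follows. This is a minimal illustration, not our experimental code: it reads "average number of trips" as a per-tract mean, and the function name `control_baseline_rr`, the `tracts` input format, and the fixed seed are assumptions made for this example.

```python
import random
from statistics import mean


def control_baseline_rr(tracts, seed=0):
    """RR for a random one-third sample of census tracts, or None.

    tracts: dict mapping tract id -> (is_eda, n_trips)
            (hypothetical input format for this sketch).
    """
    rng = random.Random(seed)
    # Randomly pick one-third of the census tracts as the control group.
    sample = rng.sample(sorted(tracts), len(tracts) // 3)
    eda = [tracts[t][1] for t in sample if tracts[t][0]]
    non_eda = [tracts[t][1] for t in sample if not tracts[t][0]]
    if not eda or not non_eda or mean(non_eda) == 0:
        return None  # RR is undefined without both kinds of tract
    # ASSUMPTION: "average number of trips" is the per-tract mean.
    return mean(eda) / mean(non_eda)


if __name__ == "__main__":
    # Synthetic tracts: EDA tracts see 10 trips each, non-EDA tracts 30.
    tracts = {i: (i % 2 == 0, 10 if i % 2 == 0 else 30) for i in range(300)}
    print(control_baseline_rr(tracts, seed=1))
```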

Effectively, we can consider the bias scores of this control group to be the upper limit we would like to reach, so that EDA-trips become as fair as the random mix of EDA and non-EDA trips. Table 1 presents results for all EDA-trips during the morning rush hour on weekdays (6–9 AM, Monday to Friday), and Table 2 presents results for all EDA-trips.

First, we observe that in both Tables 1 and 2, the original number of EDA-trips and their RR and \(\Delta _c\) are significantly lower than those of the control. This shows a bias against EDA-trips: the number of EDA-trips is lower than that of a random mix of EDA and non-EDA trips. One can consider reaching the numbers for the control group as the fairness goal. As shown in the last four rows of the tables, our two approaches reach these numbers.

Next, we find that training on data organized by region leads to lower prices for approximately 35.7% of all EDA-trips, and a 35.59% increase in rides beginning or ending in an EDA for the period we examine in 2021. In other words, as rides become more affordable, an additional 1,803,514 rides begin or end in an EDA. As a result, with FairRide, Relative Rideability (RR) increases by 35.6% to 0.457, and affordability (\(\Delta _c\)) increases by 22.5% to $2,850,552.77 for all EDA-trips. When focusing on morning rush hour trips (6–9 AM, Monday through Friday) originating in EDAs, RR increases by 30.4% to 0.73, and affordability increases by 26.6% to $639,832.42. While FairRide and FixedFairRide perform better than all competitors in both scenarios, FixedFairRide with \(p_{min} = 12\) outperforms the control group. One may consider this too large a discount for EDA-trips and perhaps not ideal. On the other hand, FixedFairRide with \(p_{min} = 14\) stays below the control group while still outperforming the others. We leave the policy-making to the policymakers but note that the discount is a knob in our mechanism that can be changed to match a given policy, as demonstrated by varying \(p_{min}\). We emphasize again that the control group has high RR and \(\Delta _c\) because it contains both EDA- and non-EDA-trips, and the higher average consumer surplus of the non-EDA-trips raises its scores. The trends for evening rush hour trips ending in EDAs (4–6 PM, Monday through Friday) are similar to those for the morning rush hour trips; therefore, we omit them for brevity.

6.2.2 FixedFairRide

We then test FixedFairRide with different values of \(p_{min}\) (the minimum price for a trip) and observe that as \(p_{min}\) increases, the value of \(\delta \) (the trip discount) decreases (Table 3).

Each of these values of \(p_{min}\) results in more EDA-trips than the current pricing mechanism, and values up to \(p_{min} = 15\) also lead to more trips than FairRide. There is a significant increase in Relative Rideability (RR) and affordability (\(\Delta _c\) or consumer surplus) for EDA residents when considering all EDA-trips and morning rush-hour trips originating in EDAs. The trends for trips ending in EDAs during the evening rush hour are similar to those for morning rush hour trips; therefore, we omit them for brevity. In the dataset we use for our experiments, the lowest fare value is $2.50, and the highest proportion of trips (approximately 14% of all trips) takes place around $10. Thus, setting the minimum trip price much higher than $10 is unlikely to be practical.

On repeating our analyses with more recent data for the same duration (January through October) from the year 2023, we find that although more trips take place overall, the overarching trend of bias remains similar. As Table 4 shows, despite many more EDA trips taking place, the bias as denoted by the RR score has not changed significantly, as non-EDA trips have also increased proportionally. The persistence of such bias across the years increases the urgency of interventions to assist residents in EDA regions.

7 Conclusion

Although much work in the recent past has explored pricing for ride-hailing services, none has looked at fairer pricing to reduce disparate impact on disadvantaged residents. Using real-world data, we first observed that disadvantaged residents use ride-hailing services as a necessity (e.g., for work commutes) due to a lack of public transportation alternatives. We then introduced two new bias metrics, Relative Rideability and Affordability, to unveil this intrinsic bias in ride-hailing, and proposed flexible pricing mechanisms to price rides more fairly and make trips more affordable for disadvantaged residents. Empirical experiments reveal the profound bias that existing ride-hailing pricing causes for disadvantaged residents as compared to a control group (consisting of a random mix of rides from disadvantaged and non-disadvantaged residents). They also show that pricing trips with our mechanisms leads to more affordable and equitable ride-hailing services, which could, in turn, assist the government in implementing fair ride-hailing policies.