The return on investment for taxi companies transitioning to electric vehicles

A case study in San Francisco


We study whether taxi companies can simultaneously save petroleum and money by transitioning to electric vehicles. We propose a process to compute the return on investment of transitioning a taxi corporation’s fleet to electric vehicles. We use Bayesian data analysis to infer the revenue changes associated with the transition. We do not make any assumptions about the vehicles’ mobility patterns; instead, we use a time series of GPS coordinates of the company’s existing petroleum-based vehicles to derive our conclusions. As a case study, we apply our process to a major taxi corporation, Yellow Cab San Francisco (YCSF). Using current prices, we find that transitioning their fleet to battery electric vehicles and plug-in hybrid electric vehicles is profitable for the company. Furthermore, given that gasoline prices in San Francisco are only 5.4% higher than in the rest of the United States, but electricity prices are 75% higher, taxi companies with similar practices and mobility patterns in other cities are likely to profit more than YCSF by transitioning to electric vehicles.



  1. This equation only applies when studying BEVs with switching stations. The different scenarios we study are given in Case study EV scenarios.


  1. Aecom: Economic Viability of Electric Vehicles. Aecom Report (2009)

  2. Bay City News: SFMTA Approves Controversial Taxi Medallion Plan. Accessed 20 Oct 2012

  3. Becker, T.A., Sidhu, I., Tenderich, B.: Electric Vehicles in the United States: A New Model with Forecasts to 2030. Center for Entrepreneurship & Technology (2009)

  4. Berman, O., Larson, R., Fouska, N.: Optimal Location of Discretionary Service Facilities. Operations Research pp 1–36 (1990)

  5. Berman, O., Larson, R., Fouska, N.: Optimal location of discretionary service facilities. Transp. Sci. 26:1–11 (1992)


  6. Better Place: Better Place to Bring Electric Taxi Program to the San Francisco Bay Area. (2010). Accessed 27 April 2011

  7. Bloomfield, N.G.: Better Place Unveils Europe’s First Battery Switch Station in Denmark. (2012). Accessed 20 Oct 2012

  8. Boulanger, A.G., Chu, A.C., Maxx, S., Waltz, D.L.: Vehicle electrification: status and issues. Proc. IEEE 99(6):1–23 (2011)


  9. Canizares, et al.: Towards an Ontario Action Plan For Plug-In Electric Vehicles (2010)

  10. CBS San Francisco: San Francisco Taxi Fares Go Up. (2011). Accessed 20 Oct 2012

  11. Chevrolet: Chevrolet Volt Specifications. (2011). Accessed 5 July 2011

  12. Cornell University Law School: US Code, Title 26,30. New Qualified Plug In Electric Drive Motor Vehicle Credit. (2010). Accessed 5 July 2011

  13. Crawdad: GPS Mobility Data Set. (2009). Accessed Oct 2010

  14. Darovsky, et al.: Electric avenue: two case studies on the economic feasibility of the electrification of transportation. Master’s thesis, Duke University (2010)

  15. Delucchi, M.A., Lipman, T.E.: An analysis of the retail and lifecycle cost of battery-powered electric vehicles. Transportation Research Part D (2001)

  16. Drezner, Z., Hamacher, H.: Facility Location. Applications and Theory. Springer, Berlin (2002)


  17. Elgowainy, A., Burnham, A., Wang, M., Molburg, J., Rousseau, A.: Well-to-wheels energy use and greenhouse gas emissions analysis of plug-in hybrid electric vehicles. Center for Transportation Research, Energy Systems Division. ANL/ESD/09-2 (2009)

  18. EPRI: Comparing the Benefits and Impacts of Hybrid Electric Vehicle Options. EPRI Report (2001)

  19. Farrington, R., Rugh, J.: Impact of Vehicle Air Conditioning on Fuel Economy, Tailpipe Emissions, and Electric Vehicle Range. Earth Technologies Forum, National Renewable Energy Laboratory (2000)

  20. Galbraith, K.: Better Place Unveils Battery Swap Station. (2009). Accessed 25 April 2011

  21. Gao, H.O., Kitirattragarn, V.: Taxi owners’ buying preferences of hybrid-electric vehicles and their implications for emissions in New York city. Transp. Res. Part A: Policy Pract. 42(8):1064–1073 (2008)


  22. García, I., Miguel, L.J.: Is the electric vehicle an attractive option for customers?. Energies 5(1):71–91 (2012)


  23. GasBuddy: San Francisco Gas Prices. (2011a). Accessed 12 May 2011

  24. GasBuddy: Toronto Gas Prices. (2011b). Accessed 12 May 2011

  25. Glover, F., Laguna, M.: Tabu Search. Kluwer Academic Publishers, Norwell (1997)


  26. Hensley, R., Knupfer, S., Pinner, D.: McKinsey Quarterly: Electrifying Cars: How Three Industries Will Evolve. (2009). Accessed 17 July 2011

  27. Hensley, R., Knupfer, S., Pinner, D.: Green Tech Media: EV Batteries Plummet in Price: Down to $400 a kWh. (2010). Accessed 17 July 2011

  28. Kamat, H.: California Air Resources Board: Lithium Ion Batteries for Electric Transportation: Costs and Markets. (2009). Accessed 17 July 2011

  29. Kempton, W., Tomic, J.: Vehicle-to-grid power implication: from stabilizing the grid to supporting large-scale renewable energy. J. Power Sour. 144(1):280–294 (2005)


  30. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598):671–680 (1983)


  31. Kleinrock, L.: Queueing Systems, Volume 1: Theory. Wiley, New York (1975)

  32. Kliesch, J., Langer, T.: Plug-In Hybrids: An Environmental and Economic Performance Outlook. American Council for an Energy-Efficient Economy Report (2006)

  33. Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge (2009)


  34. Korte, B., Vygen, J.: Combinatorial Optimization: Theory and Algorithms 3rd edn. Springer, Berlin (2006)


  35. Kuby, M., Lim, S.: The flow-refueling location problem for alternative-fuel vehicles. Socio-Economic Plan. Sci. 39(2):125–145 (2005)


  36. Kuby, M., Lim, S.: Location of alternative-fuel stations using the flow-refueling location model and dispersion of candidate sites on arcs. Netw. Spatial Econ. 7(2):129–152 (2006)


  37. Kuby, M., Lines, L., Schultz, R., Xie, Z., Kim, J.G., Lim, S.: Optimization of hydrogen stations in Florida using the flow-refueling location model. Int. J. Hydrogen Energy 34(15):6045–6064 (2009)


  38. Kanellos, M.: Green Tech Media: EV Batteries Plummet in Price: Down to $400 a kWh. (2010). Accessed 4 July 2011

  39. National Research Council: Transitions to Alternative Transportation Technologies: Plug-in Hybrid Electric Vehicles. The National Academies Press, Washington (2010)

  40. Nissan: Nissan Leaf Pricing Information for California. (2011a). Accessed 5 July 2011

  41. Nissan: The Nissan Leaf. (2011b). Accessed 28 June 2011

  42. Owen, S.H., Daskin, M.S.: Strategic facility location: a review. Eur. J. Oper. Res. 111(3):423–447 (1998)


  43. Prud’homme, R.: Electric Vehicles: A Tentative Economic and Environmental Evaluation. International Transport Forum (2010)

  44. Reuters: GM Sets $41,000 Price for Electric Chevy Volt. (2010). Accessed 4 Mar 2011

  45. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs (2003)


  46. Samaras, C., Meisterling, K.: Life cycle assessment of greenhouse gas emissions from plug-in hybrid vehicles: Implications for policy. Environ. Sci. Technol. 42(9):3170–3176 (2008)


  47. Schaller Consulting: The New York City Taxi Fact Book. (2006). Accessed 4 July 2011

  48. Scientific American: Electric Cars: How Much Does it Cost per Charge?. (2009). Accessed 27 April 2011.

  49. Shidore, N., Bohn, T.: Evaluation of Cold Temperature Performance of the JCS-VL41M PHEV Battery Using Battery HIL. Argonne National Laboratory, USA. (2008)

  50. Shukla, A., Pekny, J., Venkatasubramanian, V.: An optimization framework for cost effective design of refueling station infrastructure for alternative fuel vehicles. Comput. Chem. Eng. 35(8):1–8 (2011)


  51. Simpson, A.: Cost-benefit analysis of plug-in hybrid electric vehicle technology. 22nd International Battery, Hybrid and Fuel Cell Electric Vehicle Symposium (2006)

  52. Snyder, L.V.: Facility location under uncertainty: a review. IIE Trans. 38:547–564 (2004)


  53. The Automobile Association: UK and Overseas Fuel Prices. (2011). Accessed 1 Aug 2011.

  54. The Exploratorium: Cabspotting. (2008). Accessed 12 Feb 2011.

  55. Tom, G., Kurt, P.: California’s Residential Electricity Consumption, Prices, and Bills 1980–2005, California Energy Commission Staff. (2007). Accessed 5 July 2011

  56. Tzeng, G.H., Lin, C.W., Opricovic, S.: Multi-criteria analysis of alternative-fuel buses for public transportation. Energy Policy 33(11):1373–1383 (2005)


  57. US Department of Labor: Average energy prices in the San Francisco Area. (2010). Accessed 1 July 2011

  58. Upchurch, C., Kuby, M., Lim, S.: A model for location of capacitated alternative-fuel stations. Geogr Anal 41:1–22 (2009)


  59. US Bureau of Labor Statistics: Average Energy Prices In The San Francisco Area: May 2011. (2011a). Accessed 14 July 2011

  60. US Bureau of Labor Statistics: San Francisco Electricity Prices. (2011b). Accessed 8 June 2011

  61. Wirasingha, S., Schofield, N., Emadi, A.: Feasibility analysis of converting a Chicago Transit Authority (CTA) transit bus to a plug-in hybrid electric vehicle. In: Vehicle Power and Propulsion Conference, 2008. VPPC ’08. IEEE, pp. 1–7 (2008)

  62. Yarow, J.: The Cost of a Better Place Battery Swapping Station: $500,000. (2009). Accessed 11 Jan 2011

  63. Yellow Cab San Francisco: Yellow Cab San Francisco Rates. (2011). Accessed 4 July 2011

  64. Zfacts: Current Gas Prices and Price History. Figures from Department of Energy. (2010). Accessed 12 July 2011



The authors would like to acknowledge Jason Baek, Augustin Chaintreau, Earl Oliver, Lisa Patel, and Dr. Catherine Rosenberg for their assistance with this research.

Author information



Corresponding author

Correspondence to Tommy Carpenter.


Appendix A: Brief CLGN background

In this appendix we provide a brief background on CLGNs. We assume knowledge of standard Bayesian networks; an excellent reference text is (Koller and Friedman 2009).

We first present some necessary definitions.

  • The graphical model of a problem is a directed acyclic graph \(G(V, E)\), where each vertex is a variable and each edge represents a causal effect. The variables may be known (we can directly observe or compute their values) or hidden (we estimate their values because we cannot observe them directly).

  • The set of parents Pa(X) of a node X in a graphical model is defined as all nodes Y such that \((Y,X) \in E\) and \(Y \neq X\).

  • Variables can be either discrete or continuous; discrete variables can only take values from a countable set of values, such as the integers, whereas continuous variables can be any real number.

  • A Bayesian network is a directed acyclic graph that defines the relationship P(X|Pa(X)) between every variable and its parents. Given its parents, any variable X is conditionally independent of all other variables in the network that are not its descendants.

Hybrid models

Standard Bayesian networks contain only discrete variables. A hybrid model contains a mix of both continuous and discrete variables. Several different hybrid models exist; we chose to use conditional linear Gaussian networks (CLGNs). In linear Gaussian models, each variable X is modeled as a linear combination of its parents. CLGNs are extensions of linear Gaussian models that allow for both discrete and continuous variables.

In CLGNs, three types of relationships are defined:

  • A discrete child with only discrete parents

  • A continuous child with only continuous parents

  • A continuous child with a mixture of continuous and discrete parents.

Note that CLGNs do not allow for discrete variables with continuous parents. Other models address this issue, but we do not need these extensions for our application.

Querying conditional linear Gaussian networks

To query a variable is to return its Gaussian distribution. To query each of the three types of variables, we use the following formulas.

We consider the simplest case first: a discrete variable with only discrete parents. To express this conditional relationship, we use a discrete conditional probability table (CPT) as in standard Bayesian networks.
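As a minimal sketch of this first case, a CPT can be stored as a lookup table keyed by the values of the discrete parents. The variable names and probabilities below are purely hypothetical and are not taken from our model.

```python
# Hypothetical example: a discrete child "busy" with two discrete parents,
# "weekend" and "raining". Keys are tuples of parent values; each row maps
# a child value to its probability. All names and numbers are illustrative.
cpt_busy = {
    (False, False): {True: 0.25, False: 0.75},
    (False, True): {True: 0.5, False: 0.5},
    (True, False): {True: 0.5, False: 0.5},
    (True, True): {True: 0.75, False: 0.25},
}


def query_discrete(cpt, parent_values):
    """Return P(child | parents) as a dict, checking the row is normalised."""
    row = cpt[tuple(parent_values)]
    if abs(sum(row.values()) - 1.0) > 1e-9:
        raise ValueError("each CPT row must sum to 1")
    return row


print(query_discrete(cpt_busy, (True, True)))
```

The table has one row per combination of parent values, which is why it stays finite only when all parents are discrete.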

Next we consider a continuous variable with only continuous parents. A continuous variable X can take on any real number in the domain of X. Therefore, we cannot use a CPT, because it would be infinitely large. Instead, we maintain a function of its parents’ values that is used to generate a Gaussian over X. Let X have k parents with means \(pa_1,pa_2,\ldots, pa_k\). Under the CLGN model, we specify k + 2 parameters, namely the coefficients \(\alpha_0, \alpha_1, \ldots, \alpha_k\) and a variance \(\sigma^2\), and compute \(P(X|pa_1,pa_2, \ldots, pa_k)\) as

$$ P(X|pa_1,pa_2,\ldots, pa_k) = {{\mathcal{N}}}\left( \alpha_0 + \sum_{i=1}^k\alpha_i\cdot pa_i, \sigma^2 \right) \tag{22} $$

That is, the set of αs are linear combination constants; we are calculating a new Gaussian that is a linear combination of other Gaussians (its parents).
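The computation for a continuous variable with only continuous parents can be sketched as follows. This is an illustrative helper under the fixed-variance assumption, not the implementation used in the case study; the class name `LinearGaussianCPD` and the numbers in the usage example are assumptions.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LinearGaussianCPD:
    """Hypothetical container for the k + 2 parameters of one variable."""

    alpha0: float              # intercept alpha_0
    alphas: Tuple[float, ...]  # coefficients alpha_1 .. alpha_k
    variance: float            # sigma^2, assumed independent of the parents

    def query(self, parent_means: Sequence[float]) -> Tuple[float, float]:
        """Return (mean, variance) of the Gaussian over X given parent means."""
        if len(parent_means) != len(self.alphas):
            raise ValueError("expected one mean per parent")
        mean = self.alpha0 + sum(
            a * p for a, p in zip(self.alphas, parent_means)
        )
        return mean, self.variance


# Usage with arbitrary numbers: two parents with means 3.0 and 4.0.
cpd = LinearGaussianCPD(alpha0=1.0, alphas=(2.0, 0.5), variance=4.0)
mean, variance = cpd.query([3.0, 4.0])
print(mean, variance)
```

The returned pair fully describes the Gaussian over X, so a query never needs to enumerate parent values.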

Before we examine the third case, we note how \(\sigma^2\) is obtained. There are two widely used versions of CLGNs: those where the variance of each variable depends on the variances of its parents, and those where the variance is assumed not to depend on its parents (Koller and Friedman 2009). We use the latter, simpler model because we do not have data for the variables in Table 1. This model is less accurate because it ignores covariance between variables and their parents, but it is commonly used when the variances of variables in the network are not known, and it still captures most of the meaningful relationships (Koller and Friedman 2009). Because we use this model, we do not present background on CLGNs where the variance of each variable X depends on Pa(X), but note that these models rely on the theory of multivariate Gaussian distributions.

Finally, we consider the most complex case, a continuous variable with both continuous and discrete parents. Let X be a continuous random variable with j discrete parents and k continuous parents. Let \({\bf D} = \{ D_1, \ldots , D_j \}\) represent the discrete parents of X. Let \({\bf C} = \{C_1, \ldots , C_k \}\) represent the continuous parents of X; we denote the mean of the ith continuous parent by \(c_i\). Together, \({\bf D} \cup {\bf C} = Pa(X)\). For every combination \({\bf d}\) of values of \({\bf D}\), we have a (possibly different) vector of k + 2 constants, namely the coefficients \(\alpha_{d_0}, \alpha_{d_1}, \ldots , \alpha_{d_k}\) and a variance \(\sigma^2_d\), such that

$$ P(X|{\bf D = d}, {\bf C =c}) = {{\mathcal{N}}}\left( \alpha_{d_0} + \sum^k_{i=1} \alpha_{d_i}\cdot c_i ; \sigma^2_d \right) \tag{23} $$

Again, the set of αs are the linear combination constants.

The problem with this approach is that the set of all combinations \({\bf d}\) may be massive; even if each discrete parent were binary, we would still have \(2^j\) combinations and would need to store \((k + 2)2^j\) constants for every variable. A better idea is to store a function for each variable that calculates these constants from its parents’ values at any time. This also allows us to set the α values based on X’s discrete and continuous parents, if needed. Therefore, we introduce a function \({\phi_X({\bf d}, {\bf c}): {\mathbb{R}}^{j+k} \rightarrow {\mathbb{R}}^{k+1}}\). This function \(\phi_X\) takes in all of Pa(X) and generates the α values used in the linear combination. Creating the ϕ functions requires knowledge of the problem; we need to encode our knowledge of how variables depend on each other into the network via the ϕ functions. If we were instead storing the constants, then we would need to derive the constants for each variable based on our knowledge of the problem.
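As a worked illustration with hypothetical sizes (not taken from the case study), suppose X has j = 10 binary discrete parents and k = 3 continuous parents. Storing the constants explicitly would require

$$ (k+2)\cdot 2^{j} = 5 \cdot 2^{10} = 5 \cdot 1024 = 5120 $$

constants for this single variable, whereas the function-based representation needs only one rule that maps the values of Pa(X) to the k + 1 = 4 coefficients.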

Having introduced the \(\phi_X\) function, we rewrite Eq. 23 as:

$$ P(X|{\bf D = d}, {\bf C =c}) = {{\mathcal{N}}}\left( \alpha_0 + \sum^k_{i=1} \alpha_i\cdot c_i ; \sigma^2_{{\bf d}} \right) \tag{24} $$
$$ \{\alpha_0,\ldots, \alpha_k\} = \phi_X({\bf d}, {\bf c}) \tag{25} $$

Whenever we query a variable, we use Eqs. 24 and 25 to calculate the distribution.
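A query that follows Eqs. 24 and 25 could be sketched as below. The function names, the single discrete parent `is_electric`, and every numeric value are hypothetical; they only illustrate how a ϕ function replaces a table of constants.

```python
from typing import Callable, Sequence, Tuple


def query_clgn(
    phi: Callable[[Tuple, Sequence[float]], Sequence[float]],
    variance: Callable[[Tuple], float],
    d: Tuple,
    c: Sequence[float],
) -> Tuple[float, float]:
    """Return (mean, variance) of X given discrete values d and continuous means c."""
    alphas = phi(d, c)  # Eq. 25: phi yields alpha_0 .. alpha_k
    if len(alphas) != len(c) + 1:
        raise ValueError("phi must return k + 1 coefficients")
    # Eq. 24: the mean is a linear combination of the continuous parents.
    mean = alphas[0] + sum(a * ci for a, ci in zip(alphas[1:], c))
    return mean, variance(d)


# Hypothetical phi with one binary discrete parent and one continuous parent.
def example_phi(d, c):
    (is_electric,) = d
    return [1.0, 2.0] if is_electric else [0.0, 3.0]


def example_variance(d):
    (is_electric,) = d
    return 0.5 if is_electric else 1.0


print(query_clgn(example_phi, example_variance, (True,), [10.0]))
print(query_clgn(example_phi, example_variance, (False,), [10.0]))
```

Because `example_phi` also receives `c`, the coefficients could be made to depend on the continuous parents as well, which a stored table indexed only by d cannot express.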

Appendix B: General switching station optimization algorithm

As discussed in Related Literature, placing facilities to maximize the number of miles travelled or maximize the number of intercepted flows does not necessarily maximize revenue. We assume the taxi company’s objective is to maximize its overall revenue, and therefore introduce a new optimization framework based on the discretized locations of the taxis and their charge levels. Our framework is also a variation of the flow-based facility location model.

We now provide the details of our approach to computing locations for switching stations. First, we show that the problem is NP-hard, which implies that it is unlikely to be solvable by a polynomial-time algorithm. Then, we formulate it as an integer program, and propose an algorithm for the problem that works on small instances.

We outline a proof that the switching station location problem is NP-hard by a reduction from the facility location problem. The facility location problem is stated as: given a set of clients, each with some demand for a facility, and a cost to build each facility, find the placement of facilities that minimizes the cost of the facilities plus the cost of serving the clients. The two problems correspond by treating the taxis as clients whose demand varies over time and the switching stations as the facilities that can meet that demand, so an instance of the facility location problem can be expressed as a switching station location problem. Since the facility location problem is NP-hard, finding the set of optimal switching station locations is also NP-hard.

We now formally describe the switching station location problem. First, we introduce the necessary notation. Let L be the set of locations where a switching station can be placed. We denote a taxi by x and the set of all taxis by T. We assume knowledge of a cost function cost(l) for each location \(l \in L\) that gives the price of placing a station at l. We use \(\mathrm{Loc}_t(x)\) to denote the location of x at time t; \(\mathrm{fare}_t(x)\) is true when x has a passenger at time t and false when it does not. We use a binary variable y(l) to indicate whether location l has been selected for a switching station. The charge level of \(x \in T\) at timestep \(t_k\) is denoted by \(CL(x, t_k)\). Let \(o_t(x, r)\) be the opportunity cost of an EV x with charge level r at time t. For a taxi x, \(o_t(x, r)\) should be zero when x’s battery is sufficiently charged; however, as its charge level drops, there is some opportunity cost because the driver will not be able to complete trips over some length, and thus may lose revenue because some passengers cannot be transported to their destination. In our analysis, we define \(o_t(x, r)\) to be the sum of taxi x’s fares for the remainder of its shift, once it cannot complete a trip because its charge level r is too low. That is, if x cannot complete a trip at time t, then its opportunity cost is the fares for the trips it would have completed from time t until the end of its shift. Finally, we use \(\tau\) to denote the battery level below which a taxi will always swap its battery if it is at the same location as a switching station.

The objective of our optimization problem is: given the set of taxis and their temporal mobility patterns, find the location(s) for switching stations that maximize the taxi company’s profits. Our mathematical formulation of the switching station location problem is as follows:

$$ \min \sum_{l \in L}{\rm cost}(l) \cdot y(l) + \sum_{x \in T} \sum_t o_t(x, CL(x,t)) \tag{26} $$

subject to:

$$ \begin{aligned} y(l) &\in \{ 0,1\} \quad \forall l \in L \\ CL(x,t_{k}) &= \begin{cases} \text{Full} & \text{if } y(\mathrm{Loc}_{t}(x)) = 1 \text{ and } CL(x,t_{k-1}) < \tau \text{ and } \mathrm{fare}_{t}(x) = \text{false} \\ CL(x,t_{k-1}) - u(t_{k-1},t_{k}) & \text{otherwise} \end{cases} \\ u(t_{k-1},t_{k}) &= \text{energy used from } t_{k-1} \text{ to } t_{k} \end{aligned} $$

When L does not contain too many locations (for example, as in our case study), we can solve the switching station location problem optimally using brute force. That is, we evaluate Eq. 26 for all possible placements of \(1, 2, \ldots, k\) switching stations. The value of k is found by determining the number of switching stations sufficient so that no revenue is lost due to opportunity costs (i.e., we have \(\sum_{x \in T} \sum_t o_t(x, CL(x,t)) = 0\)). Beyond this point, Eq. 26 increases monotonically as more switching stations are added, so we can safely conclude that Eq. 26 is minimized with k or fewer switching stations.
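The brute-force search can be sketched on a toy instance as follows. This is a simplified illustration, not the code used in the case study: it assumes each taxi starts its shift fully charged, treats a taxi as unable to complete a trip once its charge would drop below zero, and simply enumerates every placement of up to `max_stations` stations (including the empty placement as a baseline) instead of first determining k. The trace format and all function names are hypothetical.

```python
from itertools import combinations


def lost_fares(trace, stations, full, tau):
    """Opportunity cost of one taxi: fares from the first failed step onward.

    Each trace step is (location, has_fare, energy_used, fare_revenue).
    """
    charge = full  # assumption: the shift starts with a full battery
    for k, (loc, has_fare, energy, _fare) in enumerate(trace):
        if loc in stations and charge < tau and not has_fare:
            charge = full       # battery swap, as in the CL constraint
        else:
            charge -= energy    # otherwise the step consumes energy
        if charge < 0:          # assumption: the trip cannot be completed
            return sum(step[3] for step in trace[k:])
    return 0.0


def total_cost(stations, traces, cost, full, tau):
    """Station costs plus opportunity costs, in the spirit of Eq. 26."""
    build = sum(cost[l] for l in stations)
    lost = sum(lost_fares(t, stations, full, tau) for t in traces)
    return build + lost


def brute_force(candidates, traces, cost, full, tau, max_stations):
    """Evaluate every placement of 0..max_stations stations; return the best."""
    best_value, best_set = None, None
    for n in range(max_stations + 1):
        for combo in combinations(sorted(candidates), n):
            value = total_cost(set(combo), traces, cost, full, tau)
            if best_value is None or value < best_value:
                best_value, best_set = value, combo
    return best_value, best_set


# Toy instance with made-up numbers: one taxi, two candidate locations.
trace = [
    ("C", True, 10, 20.0),
    ("A", False, 8, 0.0),
    ("C", True, 10, 30.0),
    ("B", False, 2, 0.0),
    ("C", True, 10, 40.0),
]
cost = {"A": 5.0, "B": 50.0}
print(brute_force({"A", "B"}, [trace], cost, full=20, tau=15, max_stations=2))
```

The number of placements grows combinatorially with the size of L, which is why this enumeration is only practical for small candidate sets.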

This brute force approach may not be feasible over larger areas with more locations. In this case, it is possible to use heuristic algorithms to find a solution, though these heuristics cannot guarantee the optimality of their solution. Algorithms such as simulated annealing, tabu search, and hill climbing are general optimization methods, and could be used to find approximate solutions to the switching station location problem (Glover and Laguna 1997; Kirkpatrick et al. 1983; Russell and Norvig 2003).
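As one illustration of such a heuristic, the sketch below applies steepest-descent hill climbing over sets of station locations, where a neighbouring solution adds or removes a single station. The objective used in the example is a hypothetical stand-in for Eq. 26 with made-up numbers, and the result is only a local minimum in general.

```python
def hill_climb(candidates, objective, start=frozenset()):
    """Greedy local search: toggle one location at a time while it helps."""
    current = frozenset(start)
    current_value = objective(current)
    ordered = sorted(candidates)
    while ordered:
        # Neighbours differ from the current set by exactly one location.
        neighbours = [current ^ {loc} for loc in ordered]
        best = min(neighbours, key=objective)
        best_value = objective(best)
        if best_value >= current_value:
            break  # no neighbour improves: local minimum reached
        current, current_value = best, best_value
    return current, current_value


# Hypothetical objective: each station costs 5, and every location without
# a station incurs its (made-up) lost revenue.
lost_revenue = {"A": 20, "B": 3, "C": 12}


def toy_objective(stations):
    build = 5 * len(stations)
    lost = sum(v for loc, v in lost_revenue.items() if loc not in stations)
    return build + lost


print(hill_climb(lost_revenue.keys(), toy_objective))
```

Simulated annealing and tabu search explore the same neighbourhood structure but occasionally accept non-improving moves in order to escape local minima.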

Our formulation of the switching station location problem relies on time series locations of the vehicles that will use the switching stations. Ideally, this location data is collected from multiple vehicles over multiple weeks; however, this data may not be obtainable in some situations. In this case, it is still possible to optimize the placement of switching stations using stochastic facility location algorithms (e.g., Owen and Daskin 1998; Snyder 2004). Such algorithms are designed to optimize facility locations when there is substantial uncertainty in the input. These algorithms take as input a probability distribution of the amount of time vehicles spend at given locations. This distribution could be estimated from, e.g., road congestion statistics or logs of passenger pickups and drop-offs.


About this article

Cite this article

Carpenter, T., Curtis, A.R. & Keshav, S. The return on investment for taxi companies transitioning to electric vehicles. Transportation 41, 785–818 (2014).



  • Electric vehicles
  • Bayesian networks
  • Public transportation
  • Taxis