Utilization rate of the fleet: a novel performance metric for a novel shared mobility

Car-sharing systems have burst into our cities following the shared mobility paradigm. They have evolved the personal mobility market from product-based into service-oriented, which ultimately provides a positive impact on the city’s sustainability. Car-sharing systems are complex interactive services whose dynamics can dramatically affect their operational viability. In order to better assess this viability, we must rely on data to produce novel metrics that characterize both the user behavior and the service performance. To date, research has focused on modeling the demand on the basis of the number of rentals that start within a specific time slot. However, this approach seems unable to provide a representative metric of the performance of a car-sharing system. In this paper, we propose a novel metric, the utilization rate of the fleet, which considers the precise number of vehicles within a fleet that are in service every minute of the day. From this basic metric, we derive a key performance indicator (KPI) to reflect the viability of any car-sharing system in economic and sustainability terms. We have applied this new metric and KPI to a dataset with 449 days of car2go data, collected in 10 European cities.


Introduction
Car-sharing services have come to stay. This is good news because shared mobility represents a sustainable strategy to alleviate climate change in cities. It introduces cleaner technologies and reduces vehicle ownership (Liao et al. 2020), which ultimately contributes to mitigating negative externalities of transport such as pollution and congestion.
Car-sharing services are organized upon similar rules in different cities. However, the exact way in which this service fits into each city's everyday life depends on distances, urban layouts and infrastructures, together with habits and preferences of citizens, which considerably vary across cultures (Salvia and Morello 2020).
The basic idea behind car-sharing is simple: instead of paying to buy your own car, you pay to use a shared car. This means that car-sharing shifts the traditional fulfillment of mobility needs from a product-based approach to service provision. There are two business models (Remane et al. 2016; Perboli et al. 2018): (a) the traditional business-to-consumer (B2C) model, in which a company makes a fleet of vehicles available to its customers; and (b) the newer peer-to-peer (P2P) model, in which private owners make their own vehicles available to other users. We can find three categories among B2C systems: (i) two-way services, also called round-trip systems, with stations and trips starting and ending at the same station; (ii) one-way station-based services, which include stations, but the origin and destination of the trip may differ; (iii) one-way free-floating services, with no stations, as cars are picked up and dropped off anywhere within a predefined zone, called the operational area. So far, P2P platforms have a marginal presence in the sector, although they do represent a certain market share in some countries such as France (Münzel et al. 2020).
By far, the most successful modality is free-floating car-sharing (FF-CS). In Europe, car2go was the first company to implement a FF-CS service (Ulm, Germany, in October 2008). The demand for this type of service has grown gradually over the last few years for several reasons, including flexibility, 24/7 availability, and competitiveness. First, the possibility of picking up the car right where it is needed and returning it to the most suitable place, given the user's destination, is undoubtedly a clear advantage for the service. Second, companies provide a spatially distributed supply of vehicles, even assuming rebalancing operations or oversizing the fleet of vehicles (as we will see later in this paper); this is key to providing a high quality of service. Last, a competitive pricing policy makes the service more attractive to users, and so the service becomes a substitute for other transport modes, especially at certain times of the day.
For the time being, the pricing structure implemented by the car-sharing operators is quite simple: the price is directly proportional to the rental time, so the price per minute is the relevant figure of the service. This policy may have some indirect drawbacks, such as encouraging users to speed.
In order to evaluate and optimize FF-CS services, we need to improve our current knowledge about the user's behavior, the patterns of operation, and the corresponding economic impact. We propose to do that by drawing on some concepts from the service and planning field of the telecommunications industry. Initially, the proposed service was mere speculation, but now, after several deployments, there is the potential to analyze real user behavior. In this respect, the new web-based methodologies best fit this purpose given the low but rapidly increasing number of users, which rules out other traditional approaches based on surveys or interviews. In addition, by focusing on data we are able to get closer to individuals and avoid the fundamental limitations of traditional transport planning techniques, mainly based on aggregations and estimates (Vecchio and Tricarico 2019).
This data-driven analysis allows us to unravel the complex relationship between supply and demand inherent to FF-CS. From the user's point of view, the service is attractive if it guarantees availability; from the operator's, the key is efficiency. So, we have two conflicting requirements: users need a high density of empty available cars, whilst operators require a high proportion of occupied cars.
Nevertheless, data alone does not provide a complete solution for the analysis of FF-CS. We also need to construct new metrics capable of reflecting the service-oriented nature of the novel shared mobility. The tools we applied to classical production-based economies no longer apply to this new scenario of the sharing economy. We cannot model the economic viability of an FF-CS system by observing how many cars are demanded; we must instead observe the service rate they provide.
Consequently, in this paper we propose the utilization rate of the fleet to characterize an FF-CS service. The rate is defined as the percentage of cars in use with respect to the total, in a certain time interval, for example, 1 minute. Despite the simplicity of this metric, its temporal dimension allows a fine-grained description of usage patterns. Globally, through aggregates and averages, it represents a key performance indicator (KPI) capable of determining whether the service is economically viable for the operator and socially desirable for the citizens.
The remainder of the paper is structured as follows. The Related work section describes other scientific approaches to the characterization of FF-CS services in order to define the framework of our paper. The following sections justify the need for a new metric, formally describe the proposed metric, and present the empirical dataset we used for our experiments. We then present the results obtained from these experiments in 10 European cities and perform an economic analysis based on them. Finally, the Conclusion highlights the benefits of the proposed metric and its applicability in future mobility services.

Related work
Methodologically, scientific literature has analyzed FF-CS services from three different perspectives: relying on surveys or interviews, developing simulation models, or conducting data-driven analysis. From a different perspective, regarding the investigated subject, the emphasis has been on proposing reallocation strategies or modeling supply and demand.
Through a before-and-after survey in 5 cities in North America (N ∼ 6 000), the results in (Martin and Shaheen 2016) show that car-sharing members show a statistically significant reduction in the number of vehicles they own, despite the fact that 60 % of all households joining car-sharing are car-less. Furthermore, on average shared vehicles are more modern and hence more efficient (less polluting) than the average owned vehicle, a trend that will remain because the usage of shared vehicles is higher, requiring a more frequent replacement rate. In addition, FF-CS both substituted and complemented public transportation. However, we must be cautious about the conclusions extracted from surveys, which could be biased, given that they tend to be conducted among service users and only a small percentage of the population uses the service. This fact also suggests that the use of traditional household mobility surveys may not be a reliable solution.
Simulation-based models show implicit limitations as they rely on assumptions that do not always conform to reality (Cohen and Shaheen 2016). This is particularly critical when trying to simulate a non-existent service, like FF-CS in some cities. In (Ciari et al. 2014) an agent-based simulation software is used to model the FF-CS demand in Berlin. The projection of the model for the year 2015 predicted a rental profile with a peak around 13:00. However, an analysis carried out with real data in Berlin 2015 (Schmöller et al. 2015) showed a totally different profile of car-sharing bookings, with a peak in demand around 20:00. Thus, an inaccurate model generated wrong conclusions and led the authors in (Ciari et al. 2014) to state that FF-CS users are mainly commuters and that work, and not leisure, is the main purpose of their journeys. There are numerous studies that show that the behavior of users is contrary to these statements (see Boldrini et al. 2019 for an interesting discussion on this matter). Nevertheless, such simulators are a must when it comes to designing a system for its future implementation. These and other scientific and industrial works often show a biased representation of car-sharing mobility. Thus, data-driven models and generalized metrics are yet to be explored (Shaheen et al. 2019).
The missing ingredient in building models and construction planning processes is the understanding of the user motivation. This is obviously difficult in the context of a proposed service where no data is available and everything is based on speculation. However, we are now in the position to have some example services in place with available data to be analyzed.
The telecommunications industry has a long history of service design and planning and has spent decades analyzing the relationship between the service and the environment in which it is used, for example (Norton 1992). This bridges the gap between the engineering of the service and the capacity planning required to make it economically viable. In practical terms, it requires making an economic connection, which in turn is defined by a strong link to the motivations and the objectives of the target user base. For example, access to and support for a service can be controlled by modifying the applied tariff (DaSilva 2000). Such concepts have already been embraced; for example, (Febbraro et al. 2019) proposes a user-based relocation methodology in which the users enjoy fare discounts if they accept leaving the car in a specific location. However, this immediately makes it evident that systematically characterizing user behavior is going to be essential in fully understanding these services.
As noted in the previous paragraphs, failure to correctly characterize the user behavior leads to incorrectly predicted results, despite the fact that the underlying model is viable. Although the authors believe that there is a great potential for extracting comparisons and methodologies from this economic and user psychological perspective explored by the telecommunications industry, it is only sensible to start with the simplest of comparisons. What we propose here is to look at some of the more basic planning concepts and start to define the relationship between these basic parameters and user characterization. Following a similar approach to the design of telecommunication networks, we can either use Erlang's function to measure the probability of not having an available circuit given a specific data traffic, or use its inverse in order to calculate the number of circuits needed to provide the required service for that data traffic (Berezner et al. 1998). These principles can be transposed to the car-sharing field using the analogy of circuits with cars and traffic with users' mobility demand. Therefore, our objective is constructing a simple metric that can link service provision and user demand in order to optimize the design of FF-CS systems.
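The circuit-to-car analogy can be illustrated with a minimal sketch. It assumes that "Erlang's function" refers to the Erlang B loss formula, which in turn assumes Poisson arrivals and that a request finding no free car is lost rather than queued; neither assumption is verified here for FF-CS. The function names and the demand figures are hypothetical and serve only as an illustration.

```python
def erlang_b(cars: int, offered_load: float) -> float:
    """Probability that a request finds all `cars` busy (Erlang B).

    offered_load is expressed in erlangs: request rate x mean rental time.
    """
    blocking = 1.0  # with zero cars every request is blocked
    for m in range(1, cars + 1):
        # Standard recurrence: B(m) = A*B(m-1) / (m + A*B(m-1))
        blocking = offered_load * blocking / (m + offered_load * blocking)
    return blocking


def cars_needed(offered_load: float, target_blocking: float) -> int:
    """Inverse problem: smallest fleet whose blocking is at most the target."""
    if not 0.0 < target_blocking <= 1.0:
        raise ValueError("target_blocking must be in (0, 1]")
    cars = 0
    while erlang_b(cars, offered_load) > target_blocking:
        cars += 1
    return cars


# Hypothetical demand: 2 rental requests per minute, 25-minute mean rental.
offered_load = 2 * 25  # erlangs
print(cars_needed(offered_load, target_blocking=0.05))
```

The first function answers "how likely is it that no car is available for a given demand?", and the second answers "how many cars are needed for a required service level?", mirroring the two uses described for telecommunication circuits.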
All of this is also related to real-world constraints. For example, (Kypriadis et al. 2020) considers how the battery level of the electric cars, along with the time the employee takes to walk from one car to another, is taken into account when deciding whether a car should be relocated and where to. This would again be affected by the ability to influence user behavior, illustrating the potential that such understanding could bring.
Obviously, to make any headway we need access to real data and data-driven approaches are the most relevant for our work; there are still few studies, but we review the most relevant below.
The authors in (Ciociola et al. 2017) present an integrated system to harvest data in real time (at 1-minute intervals) from car-sharing platforms and, simultaneously, from Google Directions API. Thus, in addition to the information on rentals, they also get trip times by private vehicle and public transport for the same origin-destination pair. The authors take the city of Turin as a use case, collecting 167 000 rentals in 52 days (from 10/12/2016 to 31/01/2017), from 2 operators. The study includes temporal, spatial, and usage analyses. The temporal analysis shows the total number of rentals recorded on each day, along with temporal profiles for weekdays and weekends (hourly averaged over the entire period). The spatial analysis identifies the main departure and arrival zones. Finally, the analysis of usage habits studies the duration and distance of trips, the user's driving patterns, and the correlation with the public transport system. Results show that rentals are 36 % shorter on average than public transport time, but only start to be selected when public transport time is higher than 10 minutes.
(Habibi et al. 2017) analyzed vehicle availability data, sampled every 60 seconds, between 2014 and 2016 from 3 different operators in 22 cities in Europe and North America. In total, they identified 27 million trips. They calculated usage variables such as the average distance (geometric linear distance rather than traveled distance) and duration of the trips, and efficiency variables such as the number of trips per day and vehicle and the utilization rate (ratio between the time cars are used in a day and the time they can potentially be used, in percentage). The authors also studied when and where the vehicles are mostly used. The analysis is mainly descriptive and shows very different usage patterns among cities.
Authors in (Hardt and Bogenberger 2018) analyze the spatio-temporal behavior of a FF-CS service in Munich's business area, from April 2016 to March 2017, including over 100 000 customer rentals and discriminating 4 types of zones: residential areas, commercial areas, satellite areas, and downtown areas. The studied variables were rentals, drop-offs, and availability of vehicles. The definition of availability is qualitative (and normalized) as the percentage of time in which at least one vehicle is supplied in a specified time interval. Patterns were created by averaging these data by hour-of-day.
In (Boldrini et al. 2019) the authors used pickup and drop-off times of vehicles in 10 European cities. Data were collected on a 1-minute basis during 45 days. The authors studied the utilization rate for vehicles (defined as the number of daily trips per vehicle), investigated how demand is related to sociodemographic indicators (using a multivariate linear regression model) and how that demand is distributed in space and time, and finally compared different demand forecasting algorithms, concluding that random forest provides the most accurate predictions in most cases. The same dataset is used in our work, although we analyze it in a different way.
An analysis of two FF-CS services in Madrid is presented in (Ampudia-Renuncio et al. 2020). This city is an interesting case study because, as already highlighted in (Habibi et al. 2017), Madrid has one of the highest utilization rates for FF-CS vehicles (21.6 %); in addition, the fleet is entirely electric or hybrid. Data in (Ampudia-Renuncio et al. 2020) were acquired every 30 seconds between 28/11/2017 and 12/04/2018. The authors studied the spatio-temporal distribution of trips for each day of the week. Each trip was also simulated in public transport (using Google Maps API) and the results show a prevalence of short-distance trips that are faster than the corresponding trip in public transport.
The study in (Alencar et al. 2019) gathered data for more than one year, between 2017 and 2018, in Vancouver (Canada). The authors compared three CS services: two-way, one-way, and free-floating. They present a spatio-temporal characterization of the service, separating weekdays and weekends, and extract patterns of users' habits. Results show that one-way and FF-CS services have similar characteristics and are mostly used for short- and medium-duration trips.
On the supply side, the study in (Münzel et al. 2020) analyzes data from 177 European cities to explain the car-sharing supply using 14 explanatory variables. The conclusions show that car-sharing is popular in cities with a high educational level and many green party votes, being less popular in cities with many car commuters.

The need for a new metric
In car-sharing services, demand characterization is usually performed observing the number of rentals, which is calculated as the number of trips that begin in a 1-hour time interval. All data-driven studies described in Sect. Related work followed this approach. Nevertheless, this magnitude has two limitations related to the temporal definition and the representativity of the metric.
First, for fine-grained knowledge (for example, one measurement every minute), the number of rentals is not a good choice due to its high variability. As an illustrative example, Fig. 1 shows the number of rentals per minute in Milan on Thursday 18/06/2015. These continuous changes in the signal obstruct its interpretation, which forces researchers to rely on aggregated values (1-hour periods or even entire days) that are then averaged over a set of days.
Second, the number of rentals on its own hides an extremely useful piece of information for car-sharing companies: the duration of the rental. All operators within the sector originally applied a very simple rate policy: a fixed per-minute rate. Therefore, the key is not only the number of cars rented per unit time, but more significantly, how long those cars remain rented. As an illustrative example, consider a fleet of 100 cars that are rented at the same time. The number of rentals would be equal to 100, suggesting a 100 % utilization. However, this would only remain true if all those trips had the same duration. Otherwise, as each of those trips ends, the actual utilization of the fleet decreases by 1 percentage point, reaching 0 % if none of the cars is rented again. This fact would be ignored by the number of rentals, but precisely highlighted by the utilization rate of the fleet. Relating back to a basic service model (Berezner et al. 1998), and just examining the concept of the queued service structure, it is clear that the actual rate of service uptake is only one of the important aspects. The holding time, or more specifically the time the vehicle is rented, is a second, equally important factor. In these types of queue models, the queue occupancy is the observable that best characterizes these combined parameters. The rate encodes information relating to the need of the user, and the holding time relates to the intent of the user.
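The 100-car example can be reproduced with a short sketch. The trip durations below are hypothetical (car i is kept for i + 1 minutes) and are chosen only to make the two signals easy to compare; the function names are likewise illustrative.

```python
FLEET_SIZE = 100

# Hypothetical durations: every car is picked up at minute 0 and
# car i is returned after i + 1 minutes (1, 2, ..., 100 minutes).
durations = [car + 1 for car in range(FLEET_SIZE)]


def rentals_started(minute: int) -> int:
    """Number of rentals that begin in the given minute."""
    return FLEET_SIZE if minute == 0 else 0


def utilization_rate(minute: int) -> float:
    """Percentage of the fleet still in use during the given minute."""
    in_use = sum(1 for duration in durations if duration > minute)
    return 100.0 * in_use / FLEET_SIZE


for minute in (0, 25, 50, 75, 100):
    print(minute, rentals_started(minute), utilization_rate(minute))
```

After minute 0 the number of rentals drops to zero and stays there, while the utilization rate decays by one percentage point per minute as each trip ends, which is the information the rental count alone cannot show.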
In this work we propose a new magnitude to characterize the demand for car-sharing services, namely the utilization rate of the fleet, which we define as the percentage of vehicles that are in use at a given moment. Figure 2 shows the metric obtained from the same data used in Fig. 1. This is analogous to the queue occupancy of the simple service models found in queuing theory. Analysis of this simple quantity can give us a great deal of information pertaining to the requirements of the user base.

Notation
Let us consider a car-sharing service with a fleet of $N$ vehicles, and let $(n, t, \tau)$ be a tuple denoting the rental of car $n$, at time $t$, with duration $\tau$. We divide a 24-hour day into intervals of duration $\Delta = 1$ min, so we have a sequence of discrete times, $t_k$, for $1 \le k \le 1440$.
We define the utilization of the fleet as
$$U_d[t_k] = \left| \{\, n : (n, t, \tau) \text{ is a rental on day } d \text{ with } t \le t_k < t + \tau \,\} \right|$$
where the subscript $d$ refers to a specific day. Note that the value of $U_d[t_k]$ represents the number of cars which are in service that day during the interval $[t_k, t_{k+1})$.
Using percentages, we have the utilization rate of the fleet:
$$u_d[t_k] = 100 \cdot \frac{U_d[t_k]}{N}$$
Finally, we analyze mean values of the rate by averaging measurements in the same time slot of different days. In these cases, we will use the following notation:
$$\langle u_d[t_k] \rangle_{d \in D_*} = \frac{1}{|D_*|} \sum_{d \in D_*} u_d[t_k]$$
where $D_*$ is a wildcard symbol that represents a set of days; for example, $D_{\text{mon}}$ will refer to Mondays, $D_{\text{work}}$ will indicate workdays, etc.; in the most general case, we will use $D_{\text{all}}$ when we consider all the days included in the database.
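As a minimal sketch of how these quantities could be computed, the following code counts the cars in service in each one-minute interval of a day, converts the counts to percentages of the fleet, and averages the same time slot over a set of days. It assumes rentals are given as (car, start minute, duration in minutes) tuples, that a car never has overlapping rentals, and that rentals are clipped at the end of the day; the function names are hypothetical, and the slot index is 0-based here rather than 1-based.

```python
MINUTES_PER_DAY = 1440


def fleet_utilization(rentals, minutes=MINUTES_PER_DAY):
    """Cars in service in each one-minute slot of a single day.

    rentals: iterable of (car, start_minute, duration_minutes) tuples.
    """
    delta = [0] * (minutes + 1)  # +1 when a rental starts, -1 when it ends
    for _car, start, duration in rentals:
        end = min(start + duration, minutes)  # clip at the end of the day
        if 0 <= start < minutes and end > start:
            delta[start] += 1
            delta[end] -= 1
    in_service, running = [], 0
    for k in range(minutes):
        running += delta[k]
        in_service.append(running)
    return in_service


def utilization_rate(rentals, fleet_size, minutes=MINUTES_PER_DAY):
    """Utilization of the fleet expressed as a percentage of fleet_size."""
    counts = fleet_utilization(rentals, minutes)
    return [100.0 * count / fleet_size for count in counts]


def mean_rate(daily_rates):
    """Average the same time slot over a set of days (e.g. all Mondays)."""
    days = len(daily_rates)
    return [sum(slot) / days for slot in zip(*daily_rates)]


# Toy day with 4 slots and a fleet of 4 cars.
toy_day = [(1, 0, 3), (2, 1, 1)]
print(utilization_rate(toy_day, fleet_size=4, minutes=4))
```

The difference-array approach keeps the computation linear in the number of rentals plus the number of slots, which is convenient when the same profile has to be built for many days and cities.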

The dataset
The dataset used in this research was provided by the National Research Council of Italy and it contains information about car2go pickup and drop-off times in 10 European cities. car2go is one of the main FF-CS operators.
The data collection process was as follows. Custom code was developed to interact with the car2go public API: this software sends a query every minute, receiving a JSON file with information about the available vehicles, that is, a snapshot of the cars that are parked and ready for rental at that specific moment. Note that the frequency of the data capture directly affects the error in the duration of the trips: the higher the frequency, the lower the error. On the other hand, increasing the frequency adds complexity to the data capture software. Considering these two opposing constraints, we chose to capture data in 1-minute windows as a compromise.
The process ran for several days. The collected data provide the following information for each car: timestamp, vehicle identifier, geographical coordinates (longitude and latitude) and some additional fields such as the type of engine (electric or not), the amount of fuel remaining (in percentage) and the state of cleanliness of the vehicle (both internal and external). For electric vehicles, data also includes information about whether the car is charging or not, in that instant.
Comparing successive snapshots offered by the JSON files in these data (namely "raw-DB-1"), we can generate a second dataset with information about the movements of the vehicles. Consider the following sequence of events: a car is seen at location A at time t; for times t + 1, t + 2, ... the car disappears; at time t + n the car reappears at location B. So we can infer that the vehicle was picked up in (A, t) and returned in (B, t + n) , and, therefore, the car was rented for n time steps. This trip produces an entry in a new database (named "raw-DB-2"). In this case, the information is organized as follows. For each car rental, city, vehicle identifier and trip time are recorded; in addition, both for the pickup and for the drop-off of the car, timestamp, coordinates, type of engine, remaining fuel, cleanliness status and whether it is charging or not (when applicable) are also stored. Finally, we add two additional fields: the trip time and the trip distance returned by Google API for the given origin-destination pair, corresponding to the trip.
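The inference of trips from successive snapshots can be sketched as follows. This is an illustration under simplifying assumptions, not the code used to build the dataset: each snapshot is modeled as a dictionary that maps the identifiers of the available cars to their locations, snapshots are taken at consecutive time steps, and the field names of the resulting records are hypothetical.

```python
def infer_trips(snapshots):
    """Infer rentals from consecutive snapshots of available cars.

    snapshots: list where element t is a dict {car_id: location} holding
    the cars parked and ready for rental at time step t.
    """
    last_seen = {}  # car_id -> (time step, location) of latest sighting
    trips = []
    for t, available in enumerate(snapshots):
        for car, location in available.items():
            if car in last_seen:
                seen_t, seen_location = last_seen[car]
                if t - seen_t > 1:
                    # The car was missing from at least one snapshot:
                    # picked up at (seen_location, seen_t), returned at
                    # (location, t), hence rented for t - seen_t steps.
                    trips.append({
                        "car": car,
                        "pickup_time": seen_t,
                        "pickup_location": seen_location,
                        "dropoff_time": t,
                        "dropoff_location": location,
                        "duration": t - seen_t,
                    })
            last_seen[car] = (t, location)
    return trips


# Car "c1" is seen at A, disappears for two snapshots, and reappears at B.
example = [
    {"c1": "A", "c2": "X"},
    {"c2": "X"},
    {"c2": "X"},
    {"c1": "B", "c2": "X"},
]
print(infer_trips(example))
```

A car that is still missing when the snapshots end produces no record in this sketch, since its drop-off has not been observed.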
Data was collected from 17/05/2015 to 01/07/2015 in nine of the cities, thus collecting 44 entire days. However, we removed the data registered on 24/06/2015 from the analysis, due to a long period of disconnection in the API, which altered the actual operation of the FF-CS service. On the other hand, data from Munich was captured from 11/03/2016 to 13/05/2016, obtaining 62 entire days. See Table 1 for details.
One of the main challenges in transforming data into information for smart cities is managing the data quality (Lim et al. 2018). Consequently, data were preprocessed to eliminate invalid entries (mistakes in reported GPS coordinates). In addition, a 4-hour filter was applied on the maximum trip duration, removing records in the dataset with abnormal characteristics corresponding to failures in the data recording system of the FF-CS system (some records have origins and destinations separated by hundreds of kilometers, which suggests potential errors in the GPS operation) or to vehicle maintenance tasks. As a result, the pre-processed database included trips with durations between 1 minute (the time window of the data capture) and 4 hours. All records falling outside this time interval were considered outliers. The final number of trips and cars in each city is shown in Table 1.
Figures 3 and 4 show the average utilization rate of the fleet (one measurement every minute), for each type of day and for the ten cities. Gray shadings show the standard deviation around the mean (in blue). Boxes indicate the minimum and maximum values, as well as the mean and standard deviation, for all values obtained in each type of day. Holidays have been analyzed separately.
Although there does seem to be a certain base time pattern, there are obvious differences among individual types of day. For example, in the case of Rome (see Fig. 4b) the behavior on Wednesdays is different from that on Thursdays; furthermore, these two days show a significantly dissimilar variability. This same observation applies to Mondays compared to Tuesdays in Milan (see Fig. 3e).
The profiles for each type of day are very different from each other:
• Workdays: In general, from Monday to Friday there are two peaks, one in the morning and one in the afternoon (although Milan sometimes shows three peaks). It is noticeable that the afternoon peak is remarkably higher than the morning peak. This is a distinct feature of car-sharing services (Boldrini et al. 2019), which greatly differ from the usual behavior in other transport modes like public transport (Yap et al. 2018). For their part, Copenhagen, Stockholm and Turin present a fairly flat demand profile throughout workdays. This behavior may be related to the performance of the service, because in the first two cities the car-sharing operator has since stopped providing this service.
• Weekends/Holidays: The underlying pattern is a wave-shaped curve (a deep valley followed by a prominent peak); on this baseline, some additional variability appears in the segment with the highest utilization, mainly on Saturdays; see, for example, the cases of Munich (Fig. 4a) and Vienna (Fig. 4e).

Global profiles
Figure 5 shows $\langle u_d[t_k] \rangle_{d \in D_{\text{all}}}$ for each city. The statistical characterization is shown in Table 2. This measurement, the average utilization rate of the fleet, is postulated as a KPI in FF-CS services. It is a simple but powerful analysis tool. Let us show its validity.
On the one hand, it is a clear and precise measure of the service demand, which gives an idea of both its magnitude and its temporal distribution. By simple visual inspection, we can easily identify the times of day when the number of rented cars escalates and when it declines precipitously. Let us observe the case of Milan: almost as many cars are rented at two in the morning as at eleven in the morning. The highest peak in demand occurs soon after 22:00. The case of Copenhagen is very different: the demand at night is comparatively much lower. In this city, the metro runs 24/7, which may have a direct impact on the distinct demand profile we observed relative to the rest of the cities.
On the other hand, this KPI enables the operator to quickly calculate the income, and therefore, the viability of the service. Using only one figure (the mean of the KPI), it is possible to immediately calculate the turnover. Let's see how.
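The formula for calculating the total income in a year is:
$$I = \frac{\bar{u}}{100} \cdot p \cdot N \cdot 1440 \cdot 365$$
where $\bar{u}$ is the mean of the KPI (in percent), $p$ is the price per minute, and $N$ is the size of the fleet, as usual; the factors 1440 and 365 are the number of minutes in a day and the number of days in a year.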
Taking the city of Berlin as an example, the income follows directly from its fleet size and mean utilization rate. The price per minute varies slightly from city to city, but we can take 0.3 €/min as a representative value of the fare in Europe. Table 3 shows the results of the calculation for every city. In addition to the annual figures, numbers are provided per vehicle and per day. We can draw some interesting conclusions. The two cities in which the service has disappeared (Copenhagen and Stockholm) show an average daily income per car of approximately 13 €. This means that each vehicle is operating less than 45 minutes a day. In this respect, the metric we proposed could have been used as a decision support tool to forecast the low levels of demand of this service, which eventually made this FF-CS unfeasible.
With regard to vehicle usage, let us remark that, on average, a private car is used only 5 % of the time (Shoup 2017), that is, 72 minutes a day (0.05 × 1440). Translated into money, considering a 0.3 €/min fare, this time is equivalent to an income of 21.6 € per car and day. Thus, this figure could be considered as the threshold that differentiates sustainable systems from those that are not, from the point of view of social efficiency. A service may be economically viable, but it would be failing to contribute to the sustainability of cities if it results in an increase in the number of parked cars that occupy parking spaces on public roads, thus wasting a scarce public resource. It should be noted that only six cities out of the ten successfully meet this requirement.
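The arithmetic behind these figures can be sketched in a few lines. The code assumes that income is simply the minutes in use multiplied by a flat per-minute fare, with 1440 minutes per day and 365 days per year; the function names and the fleet size in the last call are hypothetical, while the 5 % rate and the 0.3 €/min fare are the values quoted in the text.

```python
MINUTES_PER_DAY = 1440
DAYS_PER_YEAR = 365


def minutes_in_use_per_day(mean_rate_percent: float) -> float:
    """Average minutes a car is rented per day for a given mean rate."""
    return mean_rate_percent / 100.0 * MINUTES_PER_DAY


def daily_income_per_car(mean_rate_percent: float, price_per_minute: float) -> float:
    """Income per car and day under a flat per-minute fare."""
    return minutes_in_use_per_day(mean_rate_percent) * price_per_minute


def annual_income(mean_rate_percent: float, price_per_minute: float, fleet_size: int) -> float:
    """Yearly turnover of the whole fleet."""
    per_car = daily_income_per_car(mean_rate_percent, price_per_minute)
    return per_car * fleet_size * DAYS_PER_YEAR


# Private-car benchmark from the text: 5 % usage at 0.3 EUR/min.
print(round(minutes_in_use_per_day(5.0), 1))     # minutes per day
print(round(daily_income_per_car(5.0, 0.3), 1))  # EUR per car and day

# Hypothetical fleet of 500 cars at the same rate and fare.
print(round(annual_income(5.0, 0.3, fleet_size=500)))
```

Read in the other direction, a daily income of about 13 € per car at 0.3 €/min corresponds to 13 / 0.3 ≈ 43 minutes of use, consistent with the "less than 45 minutes a day" figure quoted for Copenhagen and Stockholm.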
This KPI can also support the definition of new fare policies. Traditionally, car-sharing companies have set a simple pricing structure: the rental price is directly proportional to the rental time. Nevertheless, they are lately beginning to apply variable fares, based on certain parameters: surge pricing and proximity-based or demand-based pricing, to name a few. Since utilization precisely characterizes demand, it would undoubtedly contribute to developing pricing algorithms that take user behavior into account.

Conclusion
FF-CS services are an increasingly used transport alternative in large urban centers, and they will be an efficient option to the extent that they manage to populate the streets with fewer and cleaner cars.
User characterization is an intrinsic part of modern information systems, and service provisioning is a fundamental component of the communications industry. By combining some of the more basic features of these approaches with the now available information about real services, we have demonstrated that it is possible to better characterize the user base and hence provide tools to evolve the service to be more profitable and more usable.
In combination with previous ideas relating to service modeling, we have demonstrated that there is the opportunity to mine the data for easily understandable user characteristics. This demonstrates the possibility of further improving the modeling and planning process, as well as allowing the redesign of the service components.
The improvement of the design process will be key for the convergence of three flourishing technologies and markets in our cities: automation, electrification and mobility on demand (MOD) or mobility-as-a-service (MaaS) strategies (Shaheen et al. 2019). The utilization rate of the car-sharing fleet will serve as the basis for the analysis of the viability of present shared-mobility services and the optimization of their operation. Furthermore, given that this metric represents the actual utilization of the fleet and its evolution in time, it will play an important role in the definition of future multi-modal services for the transport of people and goods. In addition, let us remark that the utilization rate of the fleet can be applied to any type of vehicle: from traditional human-driven cars to novel unmanned air drones.
Although the metric has only been applied to the data at a city level, geo-spatial analyses would provide the opportunity to analyze demographic differences, all the more so in this era of big data.
Merging the data captured from the service use with that of the user data (although we do not have access to such data) would greatly enhance the user profile of such services enabling the tailoring of such services almost to specific individuals. The joint analysis of these data sources will eventually provide meaningful information to design user-centric mobility services, which is a fundamental principle of the MOD approach (Shaheen et al. 2020).
Consequently, the utilization rate of the fleet provides useful insights for the optimization of the design and operation of current and future shared-mobility services.