Introduction

The use of ride-hailing services (RHS) has increased globally since their launch in early 2010. Ride-hailing is an emerging mobility service that connects commuters and drivers via a mobile app interface and an internet connection. It offers a higher level of comfort and convenience through flexible payment options, easy access, and the comfort of travelling in a car. In India, for example, Uber was the first RHS company to launch its service, in 2013 [24]. Since then, the RHS market has grown tremendously, especially in metropolitan cities, as new operators such as Ola entered the picture. The two largest RHS companies in India, Uber and Ola, offered an average of seventy million monthly trips between 2015 and 2016 [8]. Recent studies, however, report contradictory consequences of the introduction of RHS for the transportation system. For example, Agarwal et al. [1] indicated that peak-hour road congestion in major cities might increase significantly with the introduction of RHS. At the same time, ride-hailing services can substitute for the traditional modes of transportation, for discretionary trips as well, and be a sustainable alternative. Multiple studies, mostly from the Global North, show that the traditional alternatives have suffered significant losses in market share, revenue, labour force, and facilities due to the rapid growth of RHS [41, 42]. RHS, however, play a very different role in the developing world, where personal vehicle ownership is far lower than in developed nations. RHS have experienced enormous success in developing countries, which can be attributed to several factors, such as inefficient alternative transportation services coupled with the absence of strong regulatory frameworks, ill-developed public transportation, an increase in vehicle ownership, and other socio-demographic factors [5, 25, 33].

Several studies conducted over the past few decades have suggested that the built environment (BE) plays a key role in determining travel mode choice, sometimes even more than the conventional mode choice attributes (e.g., travel characteristics and demographics) [7, 10, 21]. Although built environment studies for other modes are on the rise, very few focus explicitly on the choice of ride-hailing services, owing to the lack of publicly available data [22]. In particular, previous studies largely relied on two types of datasets: (i) household-level travel surveys, where information related to transportation network companies (TNCs) is rarely reported separately, and (ii) trip-level data obtained from TNC operators. While the former type of data usually lacks the spatial granularity needed to relate the built environment to ride-hailing usage, the latter misses out on the non-adopters and even the demographics of the trip-makers [14, 26]. At the same time, a few recent studies have attained the required level of spatial granularity, and the present study builds on them [3, 23, 45, 46]. To the best of our knowledge, most of these do not distinguish between first-time use (adoption) and subsequent usage (frequency), which potentially leads to a biased understanding. Moreover, all these studies are based on developed economies (mostly the USA). Since the organic urban form and prevalent mixed land uses in the Global South may have varying effects on RHS penetration, their findings are not directly transferable. Besides, the present work also fills a few critical research gaps by modelling mandatory (work) alongside non-mandatory (discretionary) trip purposes while accounting for the joint decisions of adoption and usage. In terms of methodology, the contemporary studies by Malik et al. [29, 30] adopted a very similar approach, though they did not investigate whether (and how) the BE affects different trips (work and discretionary) to varying degrees.
Based on the motivation elucidated above, the objective of the current study is to estimate the factors that affect both ride-hailing adoption and the frequency of use by residents of Kolkata in India. To achieve that, we estimated a semi-ordered bivariate probit model with the adoption of ride-hailing (binary outcome) and the frequency of ride-hailing (ordered outcome) as dependent variables for both work and discretionary trips.

The remainder of this paper is organised as follows. Section Literature Review provides an extensive review of the relevant literature while briefly discussing the critical research gaps filled by the current research. Section Survey Design and Dataset discusses the survey design and sample characteristics, while Sect. Descriptive Analysis presents the descriptive analysis and heat maps of ride-hailing adoption and frequency. The descriptive analyses lead to hypotheses that are tested through the modelling methodology outlined in Sect. Methodology. In Sect. Estimation Results, the findings of the semi-ordered bivariate probit model and goodness-of-fit measures are presented. Finally, Sect. Conclusions and Discussion explores the practical consequences of our findings before summarising them and identifying possibilities for future study.

Literature Review

The adoption of ride-hailing services is influenced by various factors, as reported by studies over the past few decades, including socio-economic and demographic characteristics, trip-related characteristics, and attitudes [13, 33]. Studies on the relationship between the built environment and ride-hailing services are scarce, though. Although built environment factors have been shown to have a significant impact on the choice of conventional modes of travel, the evidence for how they affect ride-hailing is less clear. The studies that do find some evidence of a link measure the built environment only partially. Based on previous research and the literature, this study investigates how the built environment influences an individual’s adoption and use of ride-hailing services.

Built Environment and Its Relation with Ride-Hailing Services

The built environment can be defined in various ways; the most widely used definition is Ewing and Cervero’s, famously referred to as the 5 D’s: density, diversity, design, destination accessibility, and distance from public transit stops [17]. Several empirical studies [7, 13, 29] have examined mode choice as a function of trip-maker characteristics (e.g., age, gender, household income, household size and composition), mode features (cost of travel, accessibility, safety, and security), and built environment factors. Other studies [20, 32] have extensively investigated the relationship between the built environment and travel behaviour and report that the built environment significantly influences mode choice. However, not all previous research has found a significant and consistent influence of the built environment on mode choice compared to other socio-demographic and ride-hailing attributes. Research to date falls into two categories. The first typically concentrates on the characteristics of RHS users but misses out on the non-users, who comprise a significantly bigger share than the users [7, 41]. These studies use socio-demographic data and behavioural attitudes to identify traveller preferences that a model can precisely predict. In the second category of literature, the demand for RHS trips is examined using trip-based data acquired from RHS providers, which can explicitly capture the total demand for RHS trips. However, this type of data typically does not include socio-demographic information, primarily due to privacy concerns. Based on the data and variables available, this study followed a different approach to arrive at the final research outcome.

Table 1 Summary of previous studies

Table 1 summarises the research approaches and key variables employed in studies conducted in various regions to gain a clear understanding of the association between the built environment and the adoption of ride-hailing services. A study conducted in New York City by Malik et al. [29] uses the walk score method to suggest that people who live in lively, walkable neighbourhoods use ride-hailing more frequently, and the lack of public transportation encourages more people to use ride-hailing services [30]. In contrast, Loa et al. [28] indicated that ride-hailing services have more of a complementary relationship with public transit; however, in specific contexts, RHS attract people away from the latter. Gehrke et al. [18] further explored the effect of demographics on the substitution effect of ride-hailing and found that high-income individuals are less likely to replace ride-hailing with public transit, which points to ridership competition between these modes. According to a study by Correa et al. [13] in New York City, higher taxi/Uber demand is associated with shorter transit access times, longer roadway length, lower vehicle ownership, higher income, and more job opportunities. In a similar study conducted in China in 2019 [27], recreation and entertainment points of interest (POI) and residential district POI are the most influential factors in night-time online car-hailing travel (for DiDi), while land-use mix has a positive effect on online car-hailing travel during rush hour. Alemi et al. [3] found that an increase in activity density (number of jobs and housing units per acre) is associated with a decrease in the frequency of use of ride-hailing services, while an increase in land-use mix is associated with an increase in frequency. Yu and Peng [43] investigated the spatial variation of ride-hailing demand and highlighted the positive influence of sidewalk and intersection densities.
While the former attribute suggests a possible complementary relationship between RHS demand and active modes, the latter probably stems from lower waiting times due to a better network. Conversely, Sabouri et al. [34] argued that higher intersection density corresponds to better facilitation of non-motorised modes, leading to lower usage of ride-hailing services. Furthermore, Sabouri et al. and Yu and Peng made contrasting observations on the effect of transit accessibility on ride-hailing demand, reporting a negative and a positive effect, respectively. A later study by Hasnine et al. [23] found an increase in ride-hailing trips where the commuting duration is more than an hour. At the same time, multiple studies underscored the positive relation between population density and the usage of ride-hailing services [15, 43]. Importantly, Dias et al. [15] examined how the effects of the built environment attributes differ across trip purposes. For example, they observe that transit accessibility has a positive relationship with ride-hailing demand for work trips, while an inverse relationship could be detected for other trip purposes, indicating possible synergies. Numerous contemporary studies have found clear links between socio-demographic factors and ride-hailing and/or ride-sharing services [3, 12, 13, 30, 38]. Younger individuals who are more educated and come from higher-income families are more likely to use ride-hailing services than older individuals and those from lower-income families [3, 29].

To summarise, the existing body of literature on ride-hailing services and the built environment has yielded varied results regarding the latter’s impact on the use of the former. The disparity in findings might stem from not explicitly distinguishing the effects on adoption and continued usage while accounting for different trip purposes. Another contributing factor could be that the built environment variables are measured at an aggregated level (census tracts or blocks) instead of assessing accessibility (e.g., distance to destination) at an individual level. Furthermore, trip-specific attributes were mostly not included when modelling ride-hailing usage, which might lead to overestimation of the built environment effect. In the present study, we attempt to address these concerns, and our approach is explained in the following sections.

Survey Design and Dataset

This section discusses the survey design approach, the collection of the primary dataset, and the development of the secondary dataset for this research, which are necessary to achieve the main objectives in a structured manner. To this end, we collect three types of explanatory variables, i.e., (1) built environment attributes, (2) ride-hailing trip-specific variables, and (3) individual and household-level socio-demographic variables.

Study Area

The study area for this research is Kolkata, the capital city of West Bengal and India’s third most populous metropolitan city. With a population of 4.5 million [31] within municipal boundaries, it is also one of the most populous regions globally. The city is located on the bank of the Hooghly River, 120 km from the Bay of Bengal, and is the main port of entry to northeastern India. The city’s public transportation includes the Kolkata Suburban Railway, Kolkata Metro, trams, and buses. Currently, Kolkata city is divided into 144 wards. In Kolkata, thirteen per cent of respondents said they utilise ride-sharing, which is the highest among the Indian metropolitan cities [9], whereas ninety-one per cent of surveyed city commuters intend to buy a personal vehicle in the next five years. Interestingly, though, the same respondents showed the largest propensity to forego buying a car if ride-sharing could match their transportation expectations.

Survey Administration

The data were collected through a questionnaire survey conducted over three weeks in March 2021, when COVID infections had significantly decreased. Data collection was intentionally avoided during January and February 2021, so that travel patterns could return closer to their typical levels and the survey could capture potential long-term post-COVID behaviours and preferences. It is worth noting that all major public transportation services, including buses, suburban railways, and subways (metro rail), had resumed operations in November-December 2020. For this study, the authors collected both primary and secondary datasets. The primary dataset (intercept data collected at households) includes information on existing travel habits, adoption and usage of ride-hailing services, and socio-demographics (household and individual). The secondary dataset comprises the built environment characteristics pertinent to the origins of ride-hailing trips (mostly home locations), extracted on the basis of the geo-locations provided by the respondents. The survey was administered through computer-aided face-to-face interviews using mobile tablets. Respondents’ answers were recorded in an internet-based Google Form to mitigate human error and reduce the workload associated with data entry.

Dataset

Primary data: Travel and Socio-demographics

As part of the primary data collection, we relied on random sampling, as our research objective included both RHS users and non-users (non-adopters) who are adults (older than 18 years). Based on the sample size criterion for an infinite population, the study required responses from at least 385 individuals to maintain representativeness [39]. Accordingly, 976 individuals were approached, of whom 137 (14%) declined, resulting in 839 responses. Of these 839 responses, 35 were discarded as incomplete, leaving 804 complete responses that were used in the analysis. Following data cleaning, the survey results were divided into two groups according to the types of trips respondents took: work trips and discretionary trips. Notably, a respondent who uses RHS for both work and discretionary trips is included separately in both datasets. Although the two sets of responses have a few attributes in common (for example, monthly spending and number of monthly trips), they differ in trip-specific details (for example, trip distance and time of day of the trip). For work trips, a total of 526 (out of 804) respondents were travelling to educational institutions or workplaces. Of them, 267 used RHS at least once during the previous month, while 259 did not adopt RHS (see Table 4). Any trip other than home-based work and home-based school is considered a discretionary trip. A total of 582 respondents participated in discretionary activities, of whom 307 used RHS at least once during the survey period and 275 did not adopt RHS (see Table 4). Table 2 summarises the sample distribution, which largely reflects the population in terms of individual demographics, i.e., age and gender, except that millennials (25–40 years) are over-represented.
Nonetheless, this does not undermine the objective of the present paper, which focuses on exploring the correlation between the endogenous variables (RHS adoption and frequency) and the BE attributes and demographics, rather than delving into the impact of altering exogenous variables on the endogenous variables. Additionally, since the sampling procedure was random in nature, i.e., independent of the modelled outcome variables (adoption and frequency of ride-hailing use), no weighting was applied to the sample.
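
The sample-size threshold and the response bookkeeping reported above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the formula behind the figure of 385 is not reproduced in this section, so Cochran's large-population formula with a 95% confidence level (z = 1.96), a 5% margin of error, and maximum variability (p = 0.5) is an assumption that happens to reproduce it. The function and variable names are hypothetical.

```python
import math


def cochran_sample_size(z: float = 1.96, p: float = 0.5, margin: float = 0.05) -> int:
    """Minimum sample size for a very large ("infinite") population.

    Assumed inputs: 95% confidence (z), maximum variability (p), 5% margin.
    """
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))


required = cochran_sample_size()

# Counts taken from the text.
approached = 976   # individuals approached for an interview
declined = 137     # individuals who declined
incomplete = 35    # responses discarded as incomplete

responses = approached - declined
complete = responses - incomplete
decline_rate = declined / approached

print(required, responses, complete, f"{decline_rate:.1%}")
```

Under these assumptions the required minimum is 385, and the 804 complete responses exceed it comfortably.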

Table 2 Key sample characteristics of survey data

Secondary data: Built Environment Variables

The term built environment refers to various structures and urban areas that are distinct from the natural environment, particularly those that can be altered by policies and people’s behaviour. The most widely used set of descriptive dimensions is drawn from Ewing and Cervero [17] and is famously termed the “five Ds”. We initially selected the “five Ds” based on prior research and data on the characteristics of the built environment (see Table 3). We then adopted a workable “four Ds” set of built environment factors for this study, since land use diversity data for Kolkata city were not available.

Table 3 Built environment indicators

As discussed earlier, the built environment indicators of the trip origins and destinations were estimated using secondary data sources. BE indicators were classified into four dimensions, namely density, design, destination accessibility, and distance from public transit stops. We obtained population data from the Census of India website, which was used to determine the population density of the wards of Kolkata city. Political boundary datasets, such as city and ward boundaries, were obtained from the Kolkata Municipal Corporation (KMC). As the questionnaire survey was conducted using mobile tablets, surveyors captured the geographical coordinates through Google Maps. This made it possible to locate precisely the origins (see Fig. 1) and destinations of the work and discretionary trips reported by the respondents. Road infrastructure data, comprising street network geospatial datasets from Open Street Map (OSM) and the Bhukosh Geological Survey of India website, were used to calculate road and intersection density. Bus stop location data were obtained through Google Earth Pro. Using QGIS software, the density variable was calculated by estimating the population density for each origin with a geotagged address, and the design variables (intersection density and road density) were measured within a buffer of one square kilometre around the origin. For the destination accessibility variable, the distance between the origin and the destination is used; a programme calculates the distance and time taken by each mode using the Google Maps API. The distance from public transportation is calculated in QGIS as the Euclidean distance to the nearest bus stop, together with the number of bus stops within the one-square-kilometre buffer.
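
The two transit indicators described above (Euclidean distance to the nearest bus stop and the number of stops inside the buffer) are simple to express outside a GIS. The following is a minimal sketch, not the QGIS workflow used in the study. It assumes that coordinates are already projected to metres and that the "one square kilometre" buffer is a circle with an area of 1 sq. km (radius of about 564 m); the text does not specify the buffer shape. All names and sample coordinates are hypothetical.

```python
import math

# Assumption: circular buffer whose area is 1 sq. km.
BUFFER_AREA_M2 = 1_000_000.0
BUFFER_RADIUS_M = math.sqrt(BUFFER_AREA_M2 / math.pi)  # roughly 564 m


def nearest_stop_distance(origin, stops):
    """Euclidean distance (m) from the origin to the closest bus stop."""
    ox, oy = origin
    return min(math.hypot(sx - ox, sy - oy) for sx, sy in stops)


def stops_in_buffer(origin, stops, radius=BUFFER_RADIUS_M):
    """Number of bus stops whose Euclidean distance to the origin is within the radius."""
    ox, oy = origin
    return sum(1 for sx, sy in stops if math.hypot(sx - ox, sy - oy) <= radius)


# Made-up projected coordinates in metres, for illustration only.
origin = (0.0, 0.0)
stops = [(120.0, 90.0), (-400.0, 300.0), (900.0, 0.0)]

distance_m = nearest_stop_distance(origin, stops)
stop_count = stops_in_buffer(origin, stops)
stop_density_per_km2 = stop_count / (BUFFER_AREA_M2 / 1_000_000.0)

print(distance_m, stop_count, stop_density_per_km2)
```

With real data the coordinates would first need to be projected from latitude and longitude into a metric coordinate system, otherwise Euclidean distances are not meaningful.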

Fig. 1 Distribution of trip origins in the study area

Descriptive Analysis

User Characteristics Related to Ride-Hailing Adoption and Usage

Fig. 2a, c depict heat maps of ride-hailing adoption for work and discretionary trips, with dark red indicating a higher concentration of ride-hailing adoption in that area. Fig. 2b, d depict heat maps of the usage frequency of ride-hailing services for both types of trips, with darker red indicating a higher number of trips made by individuals in that area. Usage is defined here as the number of times the respondent used RHS in the last month, and this frequency has been categorised into eight groups: (1) 0 (never), (2) 1 (once), (3) 2 (twice), (4) 3 (thrice), (5) 4–6 times, (6) 7–9 times, (7) 10–14 times, and (8) more than 14 times. This subsection also introduces the sample’s socio-demographic characteristics, which reasonably reflect the census population. Categories such as gender, age group, occupation, and household size have been used to compare the adoption and the frequency of RHS for both work and discretionary trips.
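
The eight frequency groups translate directly into an ordinal coding. The sketch below shows one way to derive the two outcome variables from a reported monthly trip count. The function names are hypothetical, and the category codes 0 to 7 follow the coding later used for the ordinal outcome in Equation 6 rather than the labels (1) to (8) in the list above.

```python
def frequency_category(trips_last_month: int) -> int:
    """Map a monthly RHS trip count to an ordinal code 0..7."""
    if trips_last_month < 0:
        raise ValueError("trip count cannot be negative")
    if trips_last_month <= 3:
        return trips_last_month      # 0 never, 1 once, 2 twice, 3 thrice
    if trips_last_month <= 6:
        return 4                     # 4-6 times
    if trips_last_month <= 9:
        return 5                     # 7-9 times
    if trips_last_month <= 14:
        return 6                     # 10-14 times
    return 7                         # more than 14 times


def adoption(trips_last_month: int) -> int:
    """Binary adoption indicator: 1 if RHS was used at least once last month."""
    return 1 if trips_last_month >= 1 else 0


for trips in (0, 1, 5, 9, 14, 20):
    print(trips, adoption(trips), frequency_category(trips))
```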

Fig. 2 Heat maps with RHS adoption and use frequency for (a, b) work trips; and (c, d) discretionary trips

Table 4 Distribution of sample for work and discretionary trips

Table 4 shows that, among RHS users who took work trips, 69.57% are male and 30.43% are female. Similarly, among RHS users who made discretionary trips, 53.42% are male and 46.58% are female. In terms of age group, more than half of the respondents who use RHS are from the millennial generation, while nearly half of the respondents who do not use RHS are older than 45 years, primarily middle-aged and older people. Employed people typically use ride-hailing services more frequently for both work and discretionary travel. Interestingly, unemployed people mostly use ride-hailing services for discretionary purposes, while students use them less frequently because they lack the necessary income. More than 75% of respondents with education levels above high school use ride-hailing services for both work and discretionary travel, whereas less educated respondents use the services less frequently. More than 60% of respondents have incomes between INR 20,000 and INR 50,000, and they are the most frequent users of ride-hailing services, while those with incomes over INR 50,000 use them less frequently. Besides, respondents from households without a car use RHS more frequently than their car-owning counterparts for both work and discretionary trips.

Ride Hailing Services Trip Characteristics

Table 5 shows that, across work and discretionary trips, almost 30% of RHS journeys are short distance (less than 5 km), nearly 45% are medium distance (between 5 and 10 km), and the rest are long distance (more than 10 km). For both work and discretionary trips, nearly 60% of respondents spend less than INR 500 per month on app-based services, nearly 35% spend between INR 500 and INR 1000 per month, and less than 7% spend more than INR 1000 per month.

Table 5 Ride-hailing service trip characteristics for work and discretionary trip

Most respondents made ride-hailing trips during the daytime. For work, 64.43% of respondents make their trips on weekdays, while nearly 60% of respondents use RHS during weekends for discretionary trips. Of the respondents who said they make work trips, 31.63% use RHS for getting to or from work, while 43.32% of the respondents who said they make discretionary trips use RHS for recreational or social events. Among respondents who said they made work and discretionary trips, 17.39% and 24.10%, respectively, use RHS for visiting doctors or for medical emergencies.

Built Environment Characteristics

Built environment factors included in this study were density, urban design, destination accessibility, and distance to transit stops (See Table 6), which were based on the literature review, as presented in Sect. Dataset. In this section, the built environment variables are studied and organised by their classification into one of these four dimensions.

Table 6 Built environment characteristics—work trip and discretionary trip

The population density of each ward is taken directly from the Census of India 2011 and then linked with Kolkata’s ward map using QGIS software for each type of trip. The average population density in this study is considerably higher than in previous studies [14, 43]. Google Earth Pro was used to calculate the distance from a user’s origin to the nearest bus stop. The average bus stop density (number of bus stops per sq. km) and distance to the nearest bus stop for work trip origins are 12.58 and 161 m, respectively, whereas for discretionary trips they are 12.51 and 156 m, respectively. In contrast, the distance to the nearest transit stop was greater in a previous study [14], and the average bus stop density was lower [27, 45], than in the current study context. Intersection density and road density are commonly used as indicators of an area’s street connectivity, which is an important aspect of its urban design. Data were collected from Open Street Map and processed using QGIS software to calculate the design variables. The average road density within a 1 sq. km buffer zone is 23.84 km per sq. km for work trip origins and 24.49 km per sq. km for discretionary trip origins. By comparison, road density was lower in earlier studies [43]. The average intersection density is 593.55 intersections per sq. km for work trip origins and 622.82 intersections per sq. km for discretionary trip origins. The distance-to-destination variable, used to define the destination accessibility of a trip origin, was calculated as the distance between trip origin and destination using the Google Maps API for different modes. According to the data, the average travel distance is 7.98 km for a work trip and 5.91 km for a discretionary trip.

However, the average values alone do not give a clear idea of the relationship between ride-hailing usage and BE attributes. Hence, we visualised this interaction with heat maps that plot the density of BE variables against average RHS use frequency. Based on these maps (see Figs. 3a–d; 4a–d), the use of ride-hailing services is more likely in areas with higher population density, whereas variables such as bus stop density, intersection density, and road density follow a reverse or unclear trend. Therefore, to investigate empirically how the built environment affects ride-hailing usage for both work and discretionary travel, a discrete choice model was estimated, as described in the following section.

Fig. 3 Heat maps with RHS use frequency (work trips) for built environment variables

Fig. 4 Heat maps with RHS use frequency (discretionary trips) for built environment variables

Methodology

Hypotheses Development

The study tests the following hypotheses:

  1. Hypothesis 1 (H1): The built environment variables have similar but varied effects for adoption and usage frequency for a particular type of trip purpose.

  2. Hypothesis 2 (H2): The built environment variables have significantly different effects for different trip purposes, i.e., work and discretionary.

  3. Hypothesis 3 (H3): The predictive power of the model improves when effects of the built environment variables are estimated in conjunction with other explanatory variables (trip characteristics and socio-demographics).

It is worth mentioning that when the dependent variables are categorical and their error terms are expected to be highly correlated, the bivariate ordered probit model allows more accurate predictions than other models [35]. Hence, we employ a semi-ordered bivariate probit model, estimated using the CMP module of Stata, in this study.

Semi-Ordered Bivariate Probit Model

The semi-ordered bivariate probit model [19] was used because two dependent variables (Y1, Y2), one binary and the other ordinal, have to be modelled together as a function of some explanatory variables. The binary variable represents whether the respondent is a user or non-user of RHS, i.e., whether they used RHS at least once in the last month, whereas the ordinal variable represents the number of ride-hailing trips made by the respondent in the last month.

The two dependent variables for each individual i are the adoption (binary) of ride-hailing services (\(y_{i,1}\)) and the frequency of use (ordinal) of ride-hailing (\(y_{i,2}\)). Two latent variables are defined to estimate a bivariate model: \(y_{i,1}^*\) and \(y_{i,2}^*\) (Equations 1 and 2). The explanatory variables \(X_{i,1}\) and \(X_{i,2}\) are used to model these latent variables, while \(\beta _1\) and \(\beta _2\) are the sets of coefficients to be estimated. As shown in Equation 3, the error terms \(\epsilon _{i,1}\) and \(\epsilon _{i,2}\) are assumed to be correlated and to have a bivariate normal distribution, and \(\rho\) (rho) is the correlation between the error terms. A value of \(\rho = 0\) indicates that no correlation exists between the error terms. The results of the model estimation are discussed in the following section.

$$y_{i,1}^* = \beta _1 X_{i,1} + \epsilon _{i,1} \tag{1}$$
$$y_{i,2}^* = \beta _2 X_{i,2} + \epsilon _{i,2} \tag{2}$$
$$\begin{pmatrix} \epsilon _{i,1} \\ \epsilon _{i,2} \end{pmatrix} \sim N \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right] \tag{3}$$

Equation 4 gives the ordering that the unknown cut-offs must satisfy, while Equations 5 and 6 define the observed binary variable \(y_{i,1}\) and the observed ordinal variable \(y_{i,2}\), respectively.

$$\mu _1< \mu _2< \mu _3< \mu _4< \mu _5< \mu _6 < \mu _7 \tag{4}$$
$$y_{i,1} = \begin{cases} 0 \; (\text{non-users}), & y_{i,1}^* \le \delta _1 \\ 1 \; (\text{users}), & \delta _1 < y_{i,1}^* \end{cases} \tag{5}$$
$$y_{i,2} = \begin{cases} 0 \; (\text{never}), & y_{i,2}^* \le \mu _1 \\ 1 \; (\text{once}), & \mu _1< y_{i,2}^* \le \mu _2 \\ 2 \; (\text{twice}), & \mu _2< y_{i,2}^* \le \mu _3 \\ 3 \; (\text{thrice}), & \mu _3< y_{i,2}^* \le \mu _4 \\ 4 \; (\text{4--6 times}), & \mu _4< y_{i,2}^* \le \mu _5 \\ 5 \; (\text{7--9 times}), & \mu _5< y_{i,2}^* \le \mu _6 \\ 6 \; (\text{10--14 times}), & \mu _6< y_{i,2}^* \le \mu _7 \\ 7 \; (\text{more than 14 times}), & \mu _7 < y_{i,2}^* \end{cases} \tag{6}$$
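
The mapping from latent to observed outcomes in Equations 1 to 6 can be sketched in a few lines. The example below is illustrative only: the threshold values, linear predictors, and correlation are made-up numbers rather than estimates from this study, and all function names are hypothetical. Correlated standard normal errors are built from two independent draws, which gives unit variances and correlation \(\rho\) as in Equation 3.

```python
import bisect
import math
import random


def draw_correlated_errors(rho: float, rng: random.Random):
    """Draw (eps1, eps2) from a standard bivariate normal with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2


def observe_adoption(y1_star: float, delta_1: float) -> int:
    """Equation 5: user (1) if the latent value exceeds delta_1, else non-user (0)."""
    return 1 if y1_star > delta_1 else 0


def observe_frequency(y2_star: float, cutoffs) -> int:
    """Equation 6: category k such that mu_k < y* <= mu_{k+1}, coded 0..7."""
    # bisect_left counts the cut-offs strictly below y*, so y* == mu_1 maps to 0.
    return bisect.bisect_left(cutoffs, y2_star)


# Hypothetical parameter values, for illustration only.
delta_1 = 0.0
mu = [-0.5, 0.0, 0.4, 0.8, 1.3, 1.8, 2.4]   # must be increasing (Equation 4)
rho = 0.6
xb1, xb2 = 0.2, 0.3                          # linear predictors beta_1 X_1, beta_2 X_2

rng = random.Random(42)
eps1, eps2 = draw_correlated_errors(rho, rng)
y1 = observe_adoption(xb1 + eps1, delta_1)
y2 = observe_frequency(xb2 + eps2, mu)
print(y1, y2)
```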

The probability that \(y_{i,1} = j\) and \(y_{i,2} = k\) is given by Equation 7. The bivariate normal cumulative distribution function is represented by \(\phi _2\). Please note that Equation 7 applies to a general ordered bivariate model. In the current case, one of the two outcomes (adoption) is binary and therefore has a single threshold, which means \(\delta _k = \delta _1\) and \(\delta _{k-1} = 0\). These probabilities are the ones that enter the log-likelihood function. The model parameters \(\beta _1\), \(\beta _2\), \(\mu _1\), \(\mu _2\), \(\mu _3\), \(\mu _4\), \(\mu _5\), \(\mu _6\), \(\mu _7\), \(\delta _1\) and \(\rho\) are estimated using full-information maximum likelihood estimation.

$$\begin{aligned} \begin{aligned} Pr (y_{i,1} = j, y_{i,2} = k | X_{i,1},X_{i,2})&= \left( {\begin{array}{c}\phi _2\left[ (\mu _j-\beta _1*X_{i,1}), (\delta _k-\beta _2*X_{i,2}), \rho \right] \\ -\phi _2\left[ (\mu _{j-1}-\beta _1*X_{i,1}), (\delta _k-\beta _2*X_{i,2}), \rho \right] \end{array}}\right) \\&- \left( {\begin{array}{c}\phi _2\left[ (\mu _j-\beta _1*X_{i,1}), (\delta _{k-1}-\beta _2*X_{i,2}), \rho \right] \\ -\phi _2\left[ (\mu _{j-1}-\beta _1*X_{i,1}), (\delta _{k-1}-\beta _2*X_{i,2}), \rho \right] \end{array}}\right) \end{aligned} \end{aligned}$$
(7)

Estimation Results

The results of the semi-ordered bivariate probit model are presented in this section (see Table 7). The effects of the built environment variables, socio-demographic characteristics, and ride-hailing trip characteristics on the adoption and use frequency of ride-hailing are discussed in turn.

Table 7 Estimation results

Influence of Built Environment Attributes

The estimation results suggest that both Hypothesis 1 and Hypothesis 2 hold for our study. All the variables that significantly influence RHS adoption for a particular trip purpose also affect the usage frequency, albeit with varied intensity (H1), with the bus stop density parameter being the only exception. However, the sets of BE variables that influence work and discretionary trips are quite different (H2), with road density being the only common attribute. In the case of work trips, the adoption and frequency of ride-hailing trips are influenced by four BE variables: (1) distance to destination; (2) bus stop density; (3) road density; and (4) population density. On the other hand, the ride-hailing trip adoption and frequency for discretionary trips are influenced by three BE variables: (1) bus stop density; (2) bus stop distance; and (3) road density. It is also worth noting that the road density and intersection density variables are highly correlated (0.95 and 0.97 for work and discretionary trips, respectively). Hence, only one of them, the road density variable, is retained in the final model to avoid estimation errors arising from this collinearity.

The population density of the area where an individual resides significantly affects RHS adoption for work trips, but it has no effect on discretionary travel. People are more likely to adopt RHS as population density increases, which contradicts previous findings [13, 41] that reported a negative association. However, population density has the least influence on RHS frequency. This finding likely points to a lack of proportionate growth in traditional travel choices relative to population growth. Regarding neighbourhood design variables, road density is negatively associated with RHS adoption and use frequency for both trip purposes. Interestingly, this contradicts findings from the global north [43, 44], which showed that RHS trips increase as road network density increases. A possible reason for the difference is the better availability of various travel alternatives in a well-connected area. Such an effect could also be linked to the fact that, unlike in the global north, a large section of RHS users in developing countries belongs to non-car-owning households [6].

Individuals living in areas with limited public transport connectivity (i.e., lower bus stop density) are more likely to adopt RHS for work and discretionary trips. In contrast, bus stop density has no significant impact on RHS use frequency. This finding contradicts earlier studies, largely from the USA [4, 13, 43, 44], which found that areas with better transit accessibility have a higher propensity for ride-hailing. This disparity indicates that, in developing countries, RHS is gathering momentum particularly where public transit services perform poorly. Interestingly though, a higher frequency of ride-hailing is associated with a shorter distance between the origin and the nearest bus stop. This possibly suggests a supplementary effect between the two modes when the trip is not time-bound, i.e., for discretionary trips, and such trip-specific diversity is in line with earlier findings [15]. Distance to destination, the other BE factor, significantly influences the choice and usage of RHS for work trips but not for discretionary trips. With increased distance to destination, i.e., longer trip length, individuals are likely to make more RHS trips. A possible explanation is that longer trips often involve several trip chains, leading to travel inconveniences (e.g., extra transfer time, longer waiting time, lower reliability) that can be avoided with a single ride-hailing trip. It is also worth noting that this relation is significant for work trips, which are more time-bound. Besides, this is consistent with the findings of a recent Hong Kong study [40], which found that the accessibility of residential neighbourhoods to destinations had a significant effect on travel time and trip frequency.

Finally, we also compute average treatment effects (ATEs) for the BE attributes to understand their relative impact on the variables of interest, i.e., adoption and frequency (see Table 8). An ATE gives the change in posterior probability due to a treatment change in one particular attribute while keeping the rest constant. Since the BE variables are continuous, we compute the treatment effect by changing the respective variable from a base value at its 25th percentile to its 75th percentile. For ease of presentation, the changes in the shares of the ordered outcome, i.e., frequency, are reported for three combined categories. In this paper, ‘Freq: Never’ denotes zero usage, ‘Freq: Low’ covers three levels (1, 2 and 3 times in a month), and ‘Freq: High’ covers the remaining categories (4 times to more than 14 times in a month).
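
The percentile-based treatment effect described above can be illustrated with a short Python sketch. For brevity it uses only the marginal adoption probability implied by Equations 1 and 5, \(1 - \Phi (\delta _1 - \beta _1 X_{i,1})\), whereas the ATEs in Table 8 come from the joint model; the function names, the quartile method (the default of `statistics.quantiles`) and the toy inputs in the usage note are assumptions.

```python
from statistics import NormalDist, mean, quantiles

_PHI = NormalDist().cdf  # standard normal CDF


def adoption_prob(x, beta, delta1):
    """Marginal P(y1 = 1) = 1 - Phi(delta1 - beta'x)."""
    xb = sum(b * v for b, v in zip(beta, x))
    return 1.0 - _PHI(delta1 - xb)


def average_treatment_effect(X, col, beta, delta1):
    """Mean change in adoption probability when column `col` moves from its
    25th to its 75th percentile, with all other attributes held at observed values."""
    q25, _, q75 = quantiles([row[col] for row in X], n=4)

    def with_value(row, value):
        changed = list(row)
        changed[col] = value
        return changed

    return mean(adoption_prob(with_value(row, q75), beta, delta1)
                - adoption_prob(with_value(row, q25), beta, delta1)
                for row in X)


def collapse_frequency(probs):
    """Combine the eight frequency-category probabilities into Never / Low / High."""
    return {"Never": probs[0], "Low": sum(probs[1:4]), "High": sum(probs[4:])}
```

For example, `average_treatment_effect(X, 0, beta, delta1)` returns the average shift in adoption probability attributable to the first attribute of a (hypothetical) design matrix `X`; a negative value would correspond to an attribute such as road density that lowers adoption.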

Overall, for the adoption dimension of both trip purposes, road density is observed to have the largest effect. For work trips, it is closely followed by population density, bus stop density and distance to destination, in that order. Since the population density and distance to destination attributes were not significant for discretionary trips, bus stop density has the second largest effect on adoption for that purpose. In terms of usage frequency for work trips, distance to destination emerges as the most influential factor, with population density having the least impact. A possible explanation is that trip-specific factors contingent on trip distance (for example, cost and time of day) significantly influence continued usage decisions but are less relevant, or not relevant, for adoption. Interestingly, for discretionary trips, road density is observed to have the highest influence on both adoption and frequency, which probably points to the spatial skewness of the existing ride-hailing supply.

Table 8 Average treatment effects of built environment attributes

Influence of Socio-Demographic Attributes

Results suggest that RHS are more likely to be adopted by younger users (Gen Z and Millennials) for both work and discretionary trips. These younger cohorts also show a significant propensity to use ride-hailing more frequently for both work and discretionary travel. This finding is consistent with earlier research [3, 11, 26] and may be attributed to the higher tech-savviness of the younger generation, whereas the older generation is unfamiliar with new technology and the specialised apps of RHS. Education also turns out to be a significant predictor of RHS adoption for work and discretionary trips, but not of use frequency for work trips. This indicates that education influences the early inertia (adoption decision) of using app-based RHS, which is consistent with previous findings [3, 36], but loses its significance for subsequent usage (frequency decision). In a similar vein, employment (including student status) has a positive effect (represented through a negative coefficient for non-employee) on ride-hailing adoption for work trips but not on the frequency. This is in contrast to the observation made in [29], which could be related to higher RHS usage for discretionary trips in the Indian context [5]. On the other hand, the model coefficients of household income show a positive relationship with the use frequency of ride-hailing services for work trips. Individuals belonging to higher-income households (more than INR 50,000) are more likely to use RHS frequently than the rest, which is consistent with previous research [2, 14, 36], but income is not a significant predictor of RHS adoption. This highlights that household income (a proxy for affordability), unlike education, is related to frequency decisions rather than adoption. At the same time, ride-hailing adoption for discretionary trips is influenced by car ownership, which is understandably correlated with household income. Therefore, the statistical insignificance of household income for discretionary trips is probably a correlation effect rather than a true income effect. Individuals who do not own a car appear more likely to adopt and frequently use ride-hailing services. This can also be attributed to worries about parking availability and cost, particularly in and around city centres.

Influence of Ride-Hailing Trip Attributes

Trip distance is an important factor to take into account when analysing the use frequency of RHS [46], both for work and discretionary trips. Respondents use RHS more for longer trips (specifically 5–10 km) than for short trips (less than 5 km), which might be linked to both a lack of convenient travel alternatives and better value for money for long-distance RHS trips. This reasoning is supported by the fact that individuals with average expenditure (i.e., INR 500–1000 per month on travel) show the highest usage inclination. The time of day of travel has a significant impact on RHS frequency for both work and discretionary trips. Intuitively, coefficient estimates are highest in the morning (AM) peak (7 am to 12 pm) and evening (PM) peak (5 pm to midnight), and less significant in the off-peak hours. In contrast, previous studies [13, 26] reported peak use in the evening (6 pm to 9 pm), while another study [14] indicates that RHS is more commonly used at night (10 pm to 7 am).

The model’s estimated rho values are 0.578 and 0.577 for work and discretionary trips, respectively. Both are significantly different from 0, thus rejecting the null hypothesis (\(\rho = 0\)) and confirming that the error terms in the two equations are correlated. The likelihood ratio (LR) test statistics are also significant (at the 95% confidence level) for both work and discretionary trip models (see the critical \(\chi ^2\) values in Table 7). This means that the effects of unobserved variables on ride-hailing adoption are highly correlated with those affecting ride-hailing frequency of use. Moreover, the thresholds, defined as points on a continuous unobservable variable, differentiate the adjacent levels of the response variable (RHS usage frequency in our case). The estimated values all follow the order \(\mu _1< \mu _2< \mu _3< \mu _4< \mu _5<\mu _6 < \mu _7\). Predictably, the differences between the higher thresholds are smaller than those between the lower ones. We also performed the Wald test to assess whether adding socio-demographic and ride-hailing trip-specific attributes increases the explanatory power of the model. The test statistics are significant at the 95% confidence level, indicating that excluding these variables could lead to a misinterpretation in which their effects are wrongly attributed to the built environment factors in the model. This finding corroborates Hypothesis 3 (H3).
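
As an illustration of the test of \(\rho = 0\), the sketch below computes a likelihood ratio statistic and its p-value in Python. It assumes that the LR test compares the joint model with two independently estimated equations, which imposes a single restriction (one degree of freedom); the log-likelihood values in the usage note are hypothetical and are not taken from Table 7.

```python
import math

CHI2_1_CRIT_95 = 3.841  # 5% critical value of a chi-square with one degree of freedom


def lr_test_rho_zero(ll_joint, ll_independent):
    """Return the LR statistic and p-value for H0: rho = 0.

    ll_joint       : maximised log-likelihood of the bivariate model
    ll_independent : sum of the maximised log-likelihoods of the two separate models
    """
    lr = max(2.0 * (ll_joint - ll_independent), 0.0)
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


if __name__ == "__main__":
    stat, p = lr_test_rho_zero(-1000.0, -1010.0)  # hypothetical log-likelihoods
    print(stat > CHI2_1_CRIT_95, p)
```

A statistic above the critical value leads to rejecting independence of the two error terms, which is the situation the paragraph above describes for both trip purposes.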

Conclusions and Discussion

In developing countries like India, the majority of ride-hailing research focuses on user behaviour and demographics, whereas only a few studies look into the role of the built environment (BE) in influencing ride-hailing usage. The few recent studies on the BE topic have all been conducted in the context of developed economies, which are typically very different from developing economies. In India, for example, unlike in developed countries of the global north, the organic urban form and prevalent mixed land uses may have a variable impact on the adoption and use of ride-hailing services. In addition, the current study attempts to extend the existing methodological framework in order to estimate the frequency of usage in conjunction with RHS adoption.

The present study developed a semi-ordered bivariate probit model to estimate RHS adoption (binary) and frequency of RHS usage (ordinal). This framework enables the use of a binary and an ordinal variable as the two dependent variables of a single model in which the error terms of the two equations are expected to be highly correlated. It identified several built environment factors, socio-demographic and trip-specific variables that have a significant impact on the likelihood of using RHS for both work and discretionary trips. Built environment variables such as destination accessibility, road density, intersection density, and population density show significant yet varied impacts: their impact on RHS adoption differs from that on use frequency, and their impacts also differ between work and discretionary trips. The users residing in areas with better connectivity and greater public transport facilities, i.e., higher density of bus stops, roads, and intersections, are less likely to adopt RHS. On the other hand, individuals who live in densely populated areas are more likely to use RHS. As the distance between the origin and the destination grows, commuters are more likely to adopt and use RHS. Bus stop density affects RHS adoption, but it is not a significant factor in RHS use frequency for work trips. For discretionary trips, the built environment factors of bus stop density and road and intersection density influence RHS adoption and usage frequency.

Findings from this research show that people who reside in areas with poor access to public transportation are more likely to adopt RHS. This finding indicates a complementary effect which could be used by policymakers and public transit authorities to diversify RHS and bring it into the domain of Mobility as a Service (MaaS). Besides, RHS operators could extend dedicated shuttle services (for instance, the OLA SHUTTLE service launched in India), especially in second-tier cities, which seldom have a robust public transport infrastructure. This could prove to be a win-win situation for both public transport authorities and RHS operators while augmenting overall sustainability. The other major policy takeaway concerns the use of TNCs in suburban areas, where the distance to workplaces is high and road density is likely to be low (as compared to the CBD). This requires TNC companies to maintain sufficient supply in suburban areas, especially during the morning peak hour. Lastly, RHS adoption is more likely among those who do not have access to a car (but have higher income) than among those who do. Such behavioural propensity indicates that RHS might be able to reduce car ownership, provided they deliver car-equivalent convenience and can significantly counter the affective emotions related to owning a vehicle. To this end, RHS operators should focus on affordable options in the CBD areas (high road density) to attract such users, which may include auto-rickshaw or two-wheeler-based ride-hailing services. As travel distances in CBDs are likely to be shorter and parking is not easily available, such affordable and quick options may increase adoption among this user group. Overall, by combining BE variables, trip-specific covariates, and demographics, our study should aid policymakers in gaining deeper insights into both aspects, i.e., RHS adoption and use.

The current empirical model has some drawbacks that should be noted for further study. Most importantly, we acknowledge that a random and larger sample could lead to more reliable results. In fact, a panel dataset with measured attitudes would be preferable to the cross-sectional one used in the current study, as it is expected to better explain behavioural heterogeneity. Besides, including psycho-social indicators might aid in developing a more comprehensive behavioural framework. Further research may model activity-travel preferences and separate the various types of RHS, such as two-wheelers, sedans, large SUVs, etc., in order to better understand how travel behaviour evolves as a result of RHS.