1 Introduction

Many cities in low- and middle-income countries have introduced reforms to transform decentralized, poorly regulated, informal, or semi-formal public transport (PT) systems into more organized schemes through restructuring, reorganization, or formalization processes (Hidalgo and Carrigan 2010). These processes seek to correct deficiencies in service and to reduce negative externalities (Kash and Hidalgo 2014; Kumar et al. 2021), improve labor conditions (Kim and Shon 2011; Hidalgo and King 2014), improve the quality of service (Hidalgo and Graftieaux 2006; Kim and Shon 2011; Ida and Talit 2015; Rodriguez-Valencia et al. 2022b) and increase ridership (Attard 2012; Ida and Talit 2015; Kim and Shon 2011). Retaining existing PT riders and attracting new ones is the ultimate objective of the system itself, but it is also fundamental for achieving the sound long-term financial performance that PT operators need, and a catalyst for attaining the goal of urban sustainability (Hickman et al. 2013; Bocarejo et al. 2014).

Several failed or incomplete PT reforms, such as those documented by Tun et al. (2020), Muñoz et al. (2009), and Dhingra (2011), highlight the importance of a participation process with different stakeholders, in which users are central to a successful reform. Sohail et al. (2006) identify the most common flaws in regulation and in communication between stakeholders that need to be overcome to attain an effective regulation process.

Understanding what users like, prefer, and need (and what they dislike or avoid) allows for better retention and attraction of riders and, consequently, better financial and sustainability outcomes. However, transportation planners and operators often base their objectives and targets on supply or performance indicators, such as IPK (index of passengers per kilometer), bus bunching, vehicle occupancy, or saturation, while users might be seeking to achieve other aims. Kash and Hidalgo (2014) analyze this transit “vision dissonance”, that is, the differences in vision, awareness, aspirations, and expectations between transit users on the one hand and transit planners and operators on the other, under given regulation policies.

Despite the introduction of bus rapid transit (BRT) in Bogotá in 2000, 15,000 traditional privately-operated, semi-regulated buses remained in operation in mixed traffic. Their oversupply, lack of integration and coordination, and disproportionate negative externalities were used to justify a PT reform in Bogotá to improve and modernize the system (Alberti and Pereyra 2020). It included a gradual fleet replacement and reduction, route redesign, and fare integration by means of a smart card (called TuLlave), aimed at fully covering public transportation in the city. An important element of the transformation was regulation: route permits were changed into zonal franchise concession contracts. The reform should have been completed by 2014, but financial and governance problems, especially among small-vehicle operators that formed new companies to operate some of the zonal concessions, precluded a full and punctual implementation. As a result, previous operators were invited to temporarily manage some zones of the city under the traditional scheme of route permits. More than a decade into the process, the traditional semi-regulated routes are still operated in Bogotá, simultaneously with the newly formalized integrated bus system.

This situation provided a natural experiment to compare quality of service from the user perspective. We analyze user satisfaction surveys for this comparison, following the concept developed by Drucker (2007): service is something that the customer receives, not something that the supplier provides. User satisfaction surveys have proven to be useful to understand user needs and the perception of a service (Imam 2014; Tam 2004). Some authors prefer other methods (Voß et al. 2020), but eliciting user perceptions directly is still the most common mechanism to assess quality of service in public transport (De Oña and De Oña 2015). The user responses to the same survey instrument in the two systems are analyzed through Importance-Performance Analysis (IPA), discrete choice models, and experts’ judgement. Our research provides insights to planners on where to concentrate their improvement efforts in PT reforms, not only in Bogotá, but also in other cities in the Global South seeking formalization of their semi-regulated PT services.

2 Literature review

The literature provides valuable evidence of bus transit reforms across a variety of locations in the Global South, such as Africa (Kumar et al. 2021), Latin America (Tun et al. 2020), Santiago, Chile (Muñoz et al. 2009), Rio de Janeiro, Brazil (Golub et al. 2009), León, México (Hidalgo and Graftieaux 2007), Tel Aviv and Faisalabad, Pakistan (Russell and Anjum 1997), and Cali, Colombia (Hidalgo and King 2014) and also within developed countries such as Malta (Attard 2012), Seoul, Korea (Kim and Shon 2011), and 22 cities in Israel (Ida and Talit 2015). Among the many examples, reforms could be divided into two groups: those seeking formalization and regulation (e.g., Santiago de Chile, Rio de Janeiro, Cali) and those seeking efficiency and good service (e.g., Malta and Tel Aviv). All cases have different issues and contexts, but they all sought to find the best way to organize regulated transit provision and define the roles of different stakeholders to fix perceived issues. In this paper we will concentrate on the systems seeking formalization, which are similar to the Bogotá case.

Formalized PT systems differ in various dimensions from semi-regulated or informal services. Semi-formal PT services generally provide ample coverage and frequent services at a relatively low cost for the users and with few direct subsidies from government, but they often result in increased traffic congestion, air and noise pollution, and traffic accidents (Cervero and Golub 2007; Tun et al. 2020; Kumar et al. 2021). Previous literature shows that formalization processes face multiple difficulties: from the inconveniences of an abrupt implementation, due to the interruption of passengers’ routines and the difficulty of addressing large-scale changes (Muñoz et al. 2009), to incomplete transitions (Hidalgo and King 2014) and financial mismatches in the newly regulated transit system (Kash and Hidalgo 2014). Despite all these difficulties, PT formalization has proven to reduce accessibility gaps between low- and high-income populations through improvements in road safety and air quality (Bocarejo and Urrego 2022).

The low level of service perceived by the users of informal PT systems is one of the most common arguments used to justify the need to transition to a formalized transit system. The aspect that weighs most on this negative perception is service quality, which includes poor comfort, safety, and security. Moreover, there are other, less frequent causes that prompt interventions, such as poor planning, which includes a lack of modal integration and inefficient routes (Hidalgo and Graftieaux 2006; Muñoz et al. 2009; Kim and Shon 2011); operational costs and unsustainable fares (Golub et al. 2009; Muñoz et al. 2009); and the unmeasured extent of informality (Golub et al. 2009; Muñoz et al. 2009; Kim and Shon 2011; Hidalgo and King 2014).

Implementation delays, financing difficulties, and unmet user needs are the most common issues experienced during formalization processes. The most frequent problem found in these types of restructuring processes is that the time and budget for completing the reform usually depend on political cycles and city finances, not on process needs. Sun et al. (2016) provide evidence of the difficulties of financing regulated transit systems, due to public funding constraints, to achieve favorable conditions for the reform. In many processes, user needs, opinions, or desires were not taken into account (Russell and Anjum 1997; Muñoz et al. 2009; Dhingra 2011; Kim and Shon 2011).

Another major problem in the restructuring processes is the vision dissonance between what the users expect from the service and what the planners are working to achieve (Kash and Hidalgo 2014). Resolving vision dissonance implies knowing and understanding what users value and matching their desires and expectations with what the systems provide. By means of mixed qualitative and quantitative methods, Garcia-Suarez et al. (2018) provided evidence of this vision dissonance by confronting user values in Bogotá’s BRT with the transit vision. Fully qualitative methods have also been applied, for example by Santana et al. (2020).

Inquiring about a user’s perception of the service is valuable for the operators and planners (Garcia-Suarez et al. 2018). However, user opinions are often disregarded during the design and planning processes. User satisfaction surveys have proven adequate for identifying PT user needs to help practitioners in focusing their efforts (Imam 2014; Rodriguez-Valencia et al. 2019), identifying positive perceptions that enhance user loyalty and attract new users (Olsen 2002). Perception is also a crucial determinant of the long-term financial performance of transit operations, a greater understanding of which may prevent a fall into the vicious cycle of diminishing revenues, frequency, and quality of service. User surveys also help identify mismatches between the operator’s efforts and the user needs (Rohani et al. 2013; Kash and Hidalgo 2014). As there is not a consensus about the conceptual difference between perceived quality of service and customer satisfaction (Tam 2004), we use both terms interchangeably.

Besides transit service flaws during the operation phase, case studies have unveiled common mistakes in planning, design, and implementation stages (Golub et al. 2009; Hidalgo and King 2014). For example, Kash and Hidalgo (2014) provide a framework on how to identify user needs during the process of reorganizing bus transit systems based on semi-structured user interviews, but not regarding specific service attributes. The current literature falls short in identifying both common failures in new formalized transit schemes, and positive attributes from the informal services identified from the user perspective.

3 Bogotá’s PT reform

In 2005 in Bogotá, a dense city of 8 million people, almost 78% of PT trips were made in a traditional semi-regulated privately-operated bus system, with only 22% in the formalized system, which at the time comprised only the BRT corridors and their feeder routes (SDM 2005, 2019). Traditional buses operated under route permits granted to bus companies, which leased them to bus owners, who in turn rented the bus operation to drivers on a daily basis. Individual bus drivers then competed for passengers on the road to maximize their revenue; this fierce competition was dubbed the “penny war” (Guerra del centavo) (Hidalgo and King 2014). This scheme led to some positive outcomes (i.e., frequent services, ample coverage, affordable fares) and some negative ones (i.e., oversupply, poorly maintained and aged buses, low comfort, low road safety and appalling labor conditions) (Montezuma 2005; Kash and Hidalgo 2014).

The city decided to eliminate the “penny war” with a new Integrated Public Transport System (SITP in Spanish) to overcome its negative externalities. The SITP integrates the mass transit system (BRT) with regular bus services, called zonal buses (hereafter referred to as SITP-Z), with the aim of covering 100% of public transportation services in the city. To avoid Santiago’s (Chile) chaos derived from a sudden implementation of a PT reform, labeled “the big-bang” (Muñoz et al. 2009), Bogotá chose a gradual implementation approach between 2011 and 2014. In contrast to Santiago’s sudden implementation, Bogotá was not only slow, but it was also incomplete due to financial difficulties experienced by two concessionaries that were responsible for three out of the eleven operational areas. From 2012 onwards, incumbent operators were invited to retain their traditional bus services under semi-regulated route permits with the name “Provisional SITP” (hereafter referred to as SITP-P) in order to provide service coverage for the above-mentioned three areas and ensure connections to other areas in the city. These provisional services continued operating in direct competition, thereby extending the duration of the “penny war” (Kash and Hidalgo 2014).

The SITP-Z structure included private capital investments in new and overhauled bus fleets, private operations, third-party fare collection, and public resource planning and management (Kash and Hidalgo 2014). The new system was structured under long-term concession contracts with private parties, with responsibility for buying and operating the local bus services in Bogotá for 24 years, in accordance with the structuring principles of the previous BRT implementation in 2000. Fare collection was modernized using electronic cards instead of cash and operated through a specialized contractor, which was responsible for guaranteeing fare integration with the BRT system. The operator was also in charge of technology provision for dispatching and controlling, as well as user information services. The SITP-Z brought improved working conditions for the drivers, an upgrade of the vehicle fleet, improved accessibility, increased road safety and reduced tailpipe emissions (SITP 2012).

By 2015 the city reported approximately 5.8 million trips per day using three types of services: SITP-P, SITP-Z and the BRT (trunk plus its feeder services) out of the 14.9 million total trips (SDM 2015). Bogotá’s mode share is shown in Table 1. The reform resulted in subsidies exceeding USD 200 million a year (Martínez 2019) and private operators facing financial difficulties, with some filing for bankruptcy or facing severe credit constraints (Financiera de Desarrollo Nacional 2019). The city ultimately renegotiated the concession contracts, to provide better conditions to the concessionaries linked to improved service performance requirements, while bankrupt operators’ contracts were cancelled (Medina 2019).

Table 1 Bogotá mode share in 2015

Both systems, the formalized (SITP-Z) and the semi-regulated (SITP-P) transit services, coexisted from 2012 to 2021. Table 2 describes the main characteristics of the two local bus services; the BRT was excluded from this comparison and from the analysis because its service patterns differ from those of the SITP-Z and SITP-P. The main differences between zonal and provisional services derive from fare validation, fare level, operational schemes, and labor conditions of the drivers. As some buses of the SITP-P fleet were overhauled, painted according to the system image, and introduced to the SITP-Z service, some conditions, such as bus capacity, seats, and handrails, were similar in both services, but only in a fraction of the fleet. Typical buses from both systems are shown in Fig. 1.

Table 2 SITP-Z and SITP-P operational and vehicle characteristics
Fig. 1 Bus of SITP-Z (left) and SITP-P (right) services

4 Methods

A user satisfaction survey was administered to SITP-Z (N = 301) and SITP-P (N = 252) riders in November 2017. Transit users were selected at bus stops by random sampling. The survey points were located in different city districts and distributed based on data from the 2015 City Household Transportation Survey.

The users were requested to respond to the survey for only one of the two types of services. Therefore, the first questions of the survey classified the respondent as a frequent SITP-Z or SITP-P user. Individuals who reported using both services with a similar frequency were assigned to the last system they had used. The survey inquired about satisfaction, perceptions, passenger behaviors, and socio-demographic data. An open-ended question asked what system attributes needed changing to improve satisfaction, serving as a proxy for the important attributes requiring improvement that are at the forefront of users’ minds. The main section of the survey inquired about different service attributes that explain user satisfaction (dependent variable). Table 3 shows the questions and statements included in the surveys. Satisfaction was evaluated using a 5-point Likert scale with 1 being very low and 5 being very high, starting with a question about the overall satisfaction with the service. Then the survey asked respondents how they would rate different service components, operational attributes and in-vehicle disturbances.

Table 3 Survey variable description

The survey results were first analyzed by comparing the mean responses of SITP-Z and SITP-P using hypothesis testing. We then developed an ordered probit model to explain user satisfaction. Ordered probit models estimate a latent (unobserved) variable \({\varvec{z}}\), built on a linear predictor, whose value is compared with estimated thresholds to determine the ordinal categorical outcome; in this case, satisfaction was measured from 1 to 5. This variable is used to calculate the probability of observing each category given a set of predictor variables. The \({\varvec{z}}\) variable is usually a linear function expressed as (Washington et al. 2011):

$${\varvec{z}} = {\varvec{\beta}} {\boldsymbol{X}} + {\boldsymbol{\varepsilon}}$$
(1)

where X corresponds to a vector of n predictors, \({\varvec{\beta}}\) is the vector of estimable parameters, and \({\boldsymbol{\varepsilon}}\) represents the random disturbance. Knowing \({\varvec{z}}\), the observed ordinal data, \({\varvec{y}}\), can be defined by:

$$y = \begin{cases} 1 & {\text{if}}\; z \le \mu_{0} \\ 2 & {\text{if}}\; \mu_{0} < z \le \mu_{1} \\ 3 & {\text{if}}\; \mu_{1} < z \le \mu_{2} \\ \vdots & \\ I & {\text{if}}\; z > \mu_{I - 2} \end{cases}$$
(2)

Equation (2) shows estimable ordinal thresholds \({\varvec{\mu}}\) that account for the intercepts between the categories \({\varvec{y}}\), with \({\varvec{I}}\) being the number of ordered responses. If random disturbances are assumed to be independent and normally distributed with mean 0 and variance 1, the probabilities for each of the \({\varvec{I}}\) categories can be calculated by:

$$\begin{aligned} P\left( {y = 1} \right) &= {\Phi }\left( {\mu_{0} - {\varvec{\beta}} {\boldsymbol{X}}} \right) \\ P\left( {y = 2} \right) &= {\Phi }\left( {\mu_{1} - {\varvec{\beta}} {\boldsymbol{X}}} \right) - P\left( {y = 1} \right) \\ P\left( {y = 3} \right) &= {\Phi }\left( {\mu_{2} - {\varvec{\beta}} {\boldsymbol{X}}} \right) - P\left( {y = 2} \right) - P\left( {y = 1} \right) \\ &\;\;\vdots \\ P\left( {y = I - 1} \right) &= {\Phi }\left( {\mu_{I - 2} - {\varvec{\beta}} {\boldsymbol{X}}} \right) - \mathop \sum \limits_{k = 1}^{I - 2} P\left( {y = k} \right) \\ P\left( {y = I} \right) &= 1 - \mathop \sum \limits_{k = 1}^{I - 1} P\left( {y = k} \right) \end{aligned}$$
(3)

where \({\Phi }\left( * \right)\) is the cumulative normal distribution, and \(P\) represents the probability of obtaining one of the \(I\) ordinal categories \(y\) (Vallejo-Borda et al. 2020). Parameters were estimated using R statistical software.
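As an illustration of how category probabilities follow from the thresholds, the sketch below computes them as successive differences of the standard normal cumulative distribution evaluated at each cut point. It is written in Python rather than R for brevity and is not the estimation code used in this study. The function name `ordered_probit_probs`, the threshold values, and the linear-predictor values are hypothetical and are not estimates from the fitted models.

```python
from statistics import NormalDist


def ordered_probit_probs(linear_predictor, thresholds):
    """Return the probability of each ordered category, lowest to highest.

    linear_predictor: systematic part of the latent variable (beta times X),
        without the random disturbance.
    thresholds: increasing cut points; k cut points define k + 1 categories.
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    phi = NormalDist().cdf  # standard normal CDF (mean 0, variance 1)
    # Cumulative probability of falling at or below each cut point,
    # followed by 1.0 for the highest category.
    cumulative = [phi(mu - linear_predictor) for mu in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # probability mass between two cut points
        previous = c
    return probs


# HYPOTHETICAL values for illustration only (not results from this study).
example_thresholds = [-1.0, 0.0, 1.0, 2.0]  # four cut points, five categories
low = ordered_probit_probs(0.2, example_thresholds)
high = ordered_probit_probs(1.2, example_thresholds)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

A larger linear predictor shifts probability mass from the lowest category toward the highest one, which is why a positive coefficient is read as raising overall satisfaction.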

For the Importance-Performance Analysis (IPA), each attribute is plotted on a Cartesian plane, with performance on the x-axis and importance on the y-axis. The mean rating of each attribute serves as its performance, and the level of significance of its coefficient in the probit models (t-stat) serves as an indirect indicator of its importance (Rodriguez-Valencia et al. 2019). A standard practice to set the limits between bad/good performance and low/high importance is to use the averages of performance and importance (Matzler et al. 2003; Chen and Chang 2005; Chou et al. 2011). In this study, we set the limit of the performance axis at the midpoint of the scale (3.0/5.0) and the limit of the importance axis at t = 1.96 (95% confidence). Attributes that need urgent action (quadrant of low rating and high importance) can be prioritized for service improvement. Further interpretation of the quadrants is presented by Matzler et al. (2003). IPA serves to synthesize the results and, in this analysis, enables the comparison of both systems.
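The quadrant assignment described above can be sketched as a small classification rule. The function `ipa_quadrant` is a hypothetical helper, not part of the study's code; the quadrant labels follow the conventional IPA terminology, and treating importance as the absolute value of the t-statistic is an assumption made here for the two-sided 95% threshold.

```python
def ipa_quadrant(performance, t_stat, perf_cutoff=3.0, t_cutoff=1.96):
    """Classify an attribute into an Importance-Performance quadrant.

    performance: mean rating on the 1 to 5 scale (x-axis).
    t_stat: t-statistic of the attribute's probit coefficient (y-axis).
    """
    # Assumption: importance is judged on |t|, i.e. a two-sided test.
    important = abs(t_stat) >= t_cutoff
    good = performance >= perf_cutoff
    if important and not good:
        return "concentrate here"
    if important and good:
        return "keep up the good work"
    if not important and not good:
        return "low priority"
    return "possible overkill"


# Waiting time in SITP-Z: mean rating 1.76, t = 5.29 (values reported in the Results).
print(ipa_quadrant(1.76, 5.29))
# Fare in SITP-P: mean rating 3.49, t = 3.868 (values reported in the Results).
print(ipa_quadrant(3.49, 3.868))
```

With cutoffs fixed at the scale midpoint and at t = 1.96, the same rule applies to both systems, so an attribute's quadrant can be compared directly between SITP-Z and SITP-P.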

5 Results

Despite the good intentions behind the bus reform in Bogotá, results show that users rated the semi-regulated system (SITP-P) better than the new formalized system (SITP-Z). There is a statistically significant difference in overall mean satisfaction: while SITP-P is rated with an average satisfaction of 3.20/5.00, the SITP-Z is rated 2.67/5.00 (measured on a 1 to 5 Likert scale, where 3.0 is considered a passing grade). Regarding the perceived attributes of the services, SITP-P rates better in five of the attributes, SITP-Z in seven attributes, and there is no statistical difference in 14 attributes. Table 4 presents the mean differences of all the assessed attributes between both systems using statistical hypothesis testing. Waiting time (Q8) shows one of the largest differences in user satisfaction and is the worst-rated attribute for SITP-Z. Despite the modernization of the payment method (using a smart card for SITP-Z), users did not rate the variable “Easy payment” differently (Q14). Two additional variables were rated higher for the SITP-Z than for the SITP-P: vehicle cleanliness (Q9) and ease of transfers within the system (Q12). This is in accordance with the reform objectives, which included improving vehicle conditions and offering convenient transfers among SITP-Z services.

Table 4 Mean differences in system attributes between SITP-Z and SITP-P from the user’s perspective

Conversely, several attributes did not show statistical differences between the systems. Despite the introduction of a new fleet for SITP-Z, comfort (Q6) and noise (Q27) attributes did not present any statistical differences with SITP-P. Similarly, road safety (Q13) and sudden braking (Q29) perceptions are not statistically different, i.e., users do not perceive changes in drivers’ behavior. Discomfort due to vendors in buses (Q26) and personal security (Q7) also receive similar ratings in both systems, as well as other operational attributes such as speed (Q20) and time reliability (Q19) and service attributes such as information (Q11), comfortable seats (Q24), and management of complaints (Q18).
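The mean comparisons between the two systems rely on two-sample hypothesis tests. The specific test is not stated here, so the following is only a minimal sketch assuming Welch's unequal-variance t-test with a normal approximation for the p-value. The means and sample sizes are those reported for overall satisfaction, whereas the standard deviations are hypothetical placeholders.

```python
from math import sqrt
from statistics import NormalDist


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t-statistic for the difference between two independent means."""
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def two_sided_p_normal(t_stat):
    """Two-sided p-value using the normal approximation (large samples)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


# Means (3.20 vs 2.67) and sample sizes (252 vs 301) come from the text.
# The standard deviations of 1.0 are HYPOTHETICAL; they are not reported here.
t_overall = welch_t(3.20, 1.0, 252, 2.67, 1.0, 301)
p_overall = two_sided_p_normal(t_overall)
print(round(t_overall, 2), p_overall < 0.05)
```

Under these assumed standard deviations the difference would be significant at the 5% level; with the actual dispersion of responses the statistic would differ.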

Table 5 shows the two independent ordered probit models for the two systems. Probit models provide insights about the attributes that contribute significantly to explaining user satisfaction. Satisfaction with four service attributes, i.e., fare (Q5), comfort (Q6), security (Q7), and waiting time (Q8), is significant in both systems. In other words, the user perception of these four specific attributes is among the most relevant service features that explain overall user satisfaction, even when running a full model including all survey attributes described in Table 3. Coefficients for these variables are positive, indicating that a higher satisfaction with these attributes will result in an increase in the overall satisfaction.

Table 5 Ordered probit model results for general service attributes

Two more variables appear as significant only for SITP-P: availability of seats (Q15) and driver’s performance (Q10). The influence of seat availability can be explained by the fact that SITP-P runs more buses than would be deemed necessary if the occupancy standards of SITP-Z were used. It is interesting to observe that drivers’ behavior does not significantly influence the overall perception of SITP-Z despite the efforts to improve their labor conditions, provide training, and monitor their performance.

For the SITP-P, the effect of waiting time on overall satisfaction (\(\beta =\) 0.182) is much lower than in SITP-Z (\(\beta =\) 0.437). The large coefficient of waiting time for SITP-Z indicates that increasing user satisfaction regarding the waiting time (which is the lowest rated attribute) will result in higher overall system satisfaction rates. For the SITP-P, the users’ perception of bus comfort (Q6) appears to have the largest contribution to overall satisfaction. Currently, SITP-P buses are obsolete, with poor maintenance and deficient standards of cleanliness. Furthermore, drivers often play loud radio music, while in SITP-Z there is no music within the vehicle. On the other hand, SITP-P buses have cushioned seats as opposed to the hard plastic seats of SITP-Z buses. However, there is no statistical difference in the way users rate comfort (Q6), despite efforts in that regard in the operation of SITP-Z. The contribution of comfort (Q6) to satisfaction in the integrated system buses is almost half of that for the semi-regulated service.

The users’ approval of fares (Q5) positively and significantly influences the overall satisfaction of both systems. To be specific, the lower the price, the higher the rating for the fare variable (Q5). The net effect on overall satisfaction is greater for the SITP-P because of the combined effect of a larger coefficient (\(\beta =\) 0.276 for SITP-P, \(\beta =\) 0.184 for SITP-Z) and a lower fare level. SITP-P performs much better in the users’ average rating of this attribute (3.49), while the SITP-Z fails (2.73). The actual fare difference of 22% (500 COP = 0.17 USD at 2985 COP per USD in 2017) between both systems appears to produce a significant difference in the perception of the fare. In this case, changing the relative fare implies a higher net difference in the overall satisfaction of SITP-P.

Perceived security (Q7) is poorly rated in both systems (ratings are below 3.00, which is perceived as a fail). This is concerning, given that security contributes significantly to the overall user satisfaction. There is no statistical difference in the security perception; robbery (mostly in the form of pickpocketing) occurs in both systems with similar frequencies (four daily thefts in SITP-Z and five in SITP-P) (Bogotá Cómo Vamos 2019).

Figure 2 presents the IPA graphics for the SITP-Z and the SITP-P. Attributes resulting in the low-performance high-importance quadrant (located in the upper-left quadrant) are our focus. Martilla and James (1977) provide the quadrant interpretation as “concentrate here”. It is remarkable that the same three attributes fall into this quadrant in both systems: Waiting time (Q8), comfort (Q6), and personal security (Q7). These attributes require attention because they are poorly rated by users, while significantly affecting the perceived quality of service. For user satisfaction in the SITP-Z, waiting time (Q8) is clearly a bigger issue. The user waiting time is related to the frequency of services and proper operation (dispatch discipline and drivers’ behavior to avoid bunching). In the newly regulated system, it is perceived as very important (t = 5.29) but very poorly rated (average = 1.76).

Fig. 2 IPA results from ordered probit for general service attributes for SITP-Z (left) and SITP-P (right)

The perceived satisfaction with the fare of the system (Q5) is important in both systems; however, for the SITP-P it achieves a good performance level (3.49/5.00), moving into the “keep up the good work” quadrant. The 22% lower fare (see Table 2) and the much higher statistical importance of the fare attribute (t = 3.868) imply that the overall satisfaction of SITP-P derives from its affordability. In fact, this is the only attribute located in the “keep up the good work” quadrant in either system.

The bottom two quadrants (low importance) correspond to “low priority” (bottom-left) and “possible overkill” (bottom-right) according to Martilla and James (1977). Paradoxically, the SITP-Z shows better performance on indicators in these quadrants, such as transfers (Q12), easy recharge (Q16), and cleanliness (Q9), which do not seem to be crucial to the user’s satisfaction.

6 Discussion

Bogotá experienced a long and costly process of transit reform, which resulted in a lower perceived quality of service for users in the new system (SITP-Z) than the traditional semi-regulated one (SITP-P). It also resulted in a considerable reduction in public transport ridership. The positive intentions of the bus reform in Bogotá did not translate into improvements in user satisfaction; planners failed to address the user needs adequately, falling into “vision dissonance”. The reform, initially intended to correct previous system-inherent drawbacks and perverse incentives (summarized under the “penny war” operation), did not meet the expectations. The reform neglected and dismissed advantages that the previous semi-regulated scheme had, such as high frequency, good coverage, no subsidies, financial stability for service providers and bus owners, and user’s affordability.

There are both positive and negative outputs when assessing the transit system reform in Bogotá. Table 6 describes the outcome of the SITP-Z compared with the previous scheme. This comparison and assessment of the reform brings the same flavor as the IPA outcome. There were improvements in attributes that do not contribute significantly to a better service perception, but the perception of SITP-Z was lower in the most crucial ones. The SITP-Z improved elements like the drivers’ labor conditions, image, or institutional arrangements. However, these improved features and attributes are not necessarily associated with a positive change in overall user satisfaction. As presented in the results, the lower fares (Q5) and lower waiting times (Q8) are the main drivers that influence SITP-P user satisfaction.

Table 6 Operational and financing outputs with SITP-Z

Users are the final recipients of the service. Planners had the possibility to consider user needs directly by asking users, through surveys, focus groups, or any other means, what key aspects should be included in the system design. Analyses that take users’ input into account provide key feedback, which is necessary not only when planning and designing a transit reform, but also in daily transit management and operations. Changes and interventions can then become part of an iterative improvement process. Recent publications on the perception of the quality of sidewalks (Rodriguez-Valencia et al. 2020, 2022a) and bicycle infrastructure (Barrero and Rodriguez-Valencia 2021) show how users respond to infrastructure and environment features (Ortiz-Ramirez et al. 2021). Rodriguez-Valencia et al. (2022a) suggest the need for more user customization in the design of walking and cycling infrastructure by considering the user experience.

An analysis of user surveys needs to go beyond descriptive statistics, using statistical methods like regressions, discrete choice modeling and causal models. The application of sensitivity analyses is also recommended when estimating regression or choice models, in order to account for multicollinearity among regressors. This is especially important for variables at the limit of the significance threshold (for more details, see Rodriguez-Valencia et al. 2019). Finally, we want to draw attention to the potential trap behind the analysis of direct open-ended questions, which can be misleading. Issues at the forefront of the minds of users can be useful as a direct measure when systematically analyzed as frequencies. However, Rodriguez-Valencia et al. (2019) and Garcia-Suarez et al. (2018) have shown that the results from direct questions versus thresholds from econometric models can be quite different. Forefront-of-mind answers can be mediated by recent experiences or events, and are less likely to provide information about the complex cognitive process behind the users’ thinking. IPA is a very valuable tool to help identify what really matters for users.

Successful or partially successful cases of citywide bus reforms are less frequent than unsuccessful ones. They provide hints to uncover factors to be considered, like the cases of Medellín, Colombia (Área Metropolitana del Valle de Aburrá 2018) and León, Mexico (Hidalgo and Graftieaux 2007). These cases evolved from agreements with existing semi-formal operators, rather than full-service replacement. These agreements were aimed at improving societal issues but maintaining the core of the existing business structures. Labor formalization and upgraded fleets were implemented, while the responsibility for service management remained fully with the private operators; also enhancing the incentives to provide good coverage and frequency. Gradual upgrades of existing semiformal systems appear to be an attractive option to be considered, as recommended by Kumar et al. (2021).

The financial difficulties of the SITP-Z have been central to the outcome of this reform, leading to the bankruptcy of some of the private investors, higher fares (compared with the SITP-P), and burdensome public subsidies to operate the system. The main cause of this financial situation was the initial underestimation of operational costs and the overestimation of demand, which resulted in much lower revenues than expected and, ultimately, in losses. Operating costs in the SITP-Z were higher than in the SITP-P for several operational reasons. In the SITP-P, drivers operated as independent contractors who rented the bus on a daily basis and were responsible for maintenance and cleaning. Fare collection used cash on board, and dispatching was empirical (i.e., no large effort went into planning frequencies). The fare covered the full costs of operation, maintenance, and cleaning. SITP-Z operations are evidently far more expensive. SITP-Z operators hire drivers under labor laws that limit working hours per day and per week and provide vacations and paid time off. Labor regulations require 2.5 drivers per bus and the payment of the corresponding benefits. Technology (on-board units, fare validators, cards, etc.) and a fare collection contract are also expensive to acquire and operate.

In this context, if PT formalization is to be accomplished, the city needs to ensure that the additional costs of formalization are covered, while user needs such as low waiting times, affordable fares, and ample coverage are considered. We do not suggest that a city with semi-formal PT services should refrain from reforming them, as this type of service produces negative impacts for users, the city, and the operators; but it seems very unlikely that these three aspects can be improved without strong financial and institutional support. Funding for such increased costs would preferably come from charges for the use of private vehicles, in order to compensate for their negative externalities (Ardila-Gomez and Ortegón-Sánchez 2016).

In the case of Bogotá, a vicious cycle is evident. Low bus frequency and reduced coverage emerged as a consequence of the reduction of the bus fleet during the reform (Hidalgo and King 2014); consequently, reliability and convenience diminished because of the increase in waiting, access, and travel times. This reduction in service quality and the slow implementation of the formalized scheme led to reduced demand, because users preferred cheaper and more convenient options, like the SITP-P, bicycles, motorcycles, new mobility options (such as ride-sourcing, scooters, and other micro-mobility options), or informal transportation options (pedi-cabs, unregulated para-transit, etc.). A fall in SITP-Z ridership is associated with reduced revenues, which tighten budget constraints and can imply a further reduction in fleet and coverage, perpetuating the cycle of worsening user satisfaction and quality of service.
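The feedback loop described above can be made concrete with a deliberately stylized sketch. It is not a model calibrated to Bogotá: every parameter (fleet size, ridership, fare, cost per bus, subsidy, and the `elasticity` of ridership with respect to fleet) is a hypothetical value chosen only to show the mechanism, and the assumption that ridership responds to fleet size through a constant elasticity is ours.

```python
def simulate_cycle(fleet, ridership, fare, cost_per_bus, subsidy, elasticity, periods):
    """Iterate a stylized fleet-ridership feedback loop.

    Each period, the operator can only run the fleet that fare revenue plus
    subsidy pays for; ridership then reacts to the relative change in fleet.
    Returns the list of (fleet, ridership) pairs, starting with the initial state.
    """
    history = [(fleet, ridership)]
    for _ in range(periods):
        budget = ridership * fare + subsidy
        affordable_fleet = budget / cost_per_bus
        new_fleet = min(fleet, affordable_fleet)  # cut service only when the budget binds
        ridership = ridership * (new_fleet / fleet) ** elasticity
        fleet = new_fleet
        history.append((fleet, ridership))
    return history

# Hypothetical starting point: fares cover 1000 of the 1200 needed to run 100 buses.
common = dict(fleet=100.0, ridership=1000.0, fare=1.0,
              cost_per_bus=12.0, elasticity=0.5, periods=10)

unfunded = simulate_cycle(subsidy=0.0, **common)    # operating gap left uncovered
funded = simulate_cycle(subsidy=200.0, **common)    # subsidy equal to the operating gap

print("unfunded final (fleet, ridership):", unfunded[-1])
print("funded final (fleet, ridership):", funded[-1])
```

In this toy setting, leaving the operating gap uncovered makes fleet and ridership fall together period after period, whereas a subsidy equal to the gap keeps both at their initial levels. With an elasticity below one the decline slows down and settles at a lower level; with an elasticity above one it would accelerate.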

With the intention of conducting the best possible reform, transport planners followed best practices from high-income countries, imagined the ideal PT system, or used their instincts to define what should be provided in the transit reforms. Including people’s requirements, needs, and desires in the planning and design processes can provide critical inputs to the design of the new system. The challenge is not only the formalization of the service per se, but also the pursuit and allocation of the funding required to cover the gap between system revenues and costs. Formalization brings the added value of better public and private institutional arrangements, improved labor conditions for drivers, and fleet improvements (a safer, cleaner, and more accessible fleet). Results show that better technology, newer buses, uniformed and trained drivers, or painted buses do not significantly contribute to an improvement in users’ perception; but frequent, reliable, and affordable services do, and these should be the main focus of service design.

After the study was completed, by late 2020, the city renegotiated the contracts with the private operators and completed bidding processes for the areas without coverage. By the end of 2021, the reform was completed and the SITP-P service was phased out. To renegotiate the contracts, the city assigned additional funding. For the new contracts, the city included low- and zero-emission buses in the bid requirements.

In 2020 the pandemic significantly reduced ridership, but having the formalized system in place allowed the city to keep the service up and running by increasing the operational subsidy. Without the reform, private operators might have had to scale down services, affecting the mobility of essential and low-income workers. In hindsight, the reform provided resilience during the pandemic, despite the issues identified in this study.

Limitations of this analysis include the role of reference points at the time the surveys were applied (Abou-Zeid et al. 2012), although homogeneity in survey locations was sought based on the city’s 2015 Household Transportation Survey. Changes in routine increase awareness of happiness and satisfaction when surveys are applied. In this case, it was not a before-and-after study; the users were experiencing both types of services at the same time. The study does not consider the two user groups to be different in their socio-economic and travel characteristics, as there were no spatial coverage differences between the two services. The survey did not ask for any comparative statement between the two types of services; it only asked the user to rate the service characteristics. Such a question would probably have provided additional insights. Using other marketing methods, like the Mystery Shopper coupled with augmented and virtual reality-based simulation (Voß et al. 2020), may be very useful in advancing improved service delivery.

7 Conclusions

The difficulties reported by the users of the SITP-Z in Bogotá indicate that the new regulated service is significantly worse than the remaining SITP-P services in several dimensions. The transit reform failed in one of its most important purposes: improving service quality. The reform has also created difficulties for the city’s finances, reductions in the coverage and frequency of public transport, and no significant gains in road safety. New operators also report financial difficulties.

Our research, besides expanding the knowledge on unsuccessful transit reforms in the Global South, contributes to clarifying this phenomenon by comparing user perceptions of the previous and the new systems in Bogotá. We believe that, by primarily focusing on some reform goals (e.g., modernization, reduction of externalities, etc.), transportation planners often provide formalized bus systems that do not necessarily meet users’ service needs and, in many cases, result in a heavy budgetary burden for cities and transit agencies.

In this research we have identified three key issues to be addressed when planning a citywide bus reform. First, reforms require sufficient public funds to pay for the new features and conditions that formalization brings, which are not necessarily important for the service users receive but are desired for the good of society. Second, the bus passengers are the actual recipients of the service and the main reason for many of the reforms. The users are the key to identifying the aspects that need to be improved, replaced, or maintained. User satisfaction surveys and IPA are useful tools to integrate the user perception. Finally, the concept of the new system, rather than resembling a developed world “ideal” system (modern buses, elegant drivers, fancy logos, aesthetic interiors), should consider what matters to the users, while not completely dismissing the benefits of the current service (frequencies, autoregulation, self-financing). Gradual upgrades of existing systems, rather than complete reinvention from scratch, are recommended.

Despite the difficulties observed in Bogotá, the existence of a formalized system provided resilience during the pandemic, and an opportunity to incorporate the low- and zero-emission buses needed to address air quality and climate change mitigation issues.