Is it possible to attract private vehicle users towards public transport? Understanding the key role of service quality, satisfaction and involvement on behavioral intentions

This paper contributes to the public transport literature by ascertaining the role of involvement in the service quality-satisfaction-behavioral intentions paradigm from the point of view of private vehicle users. This is the first study that provides a comprehensive understanding of this framework based on the private vehicle users’ perspective. The added value of this research is that, by using a structural equation modeling (SEM) approach, it provides a comparison of alternative models and uses data from different samples collected in five large metropolitan areas (Berlin, Lisbon, London, Madrid and Rome) for model validation. In addition, a SEM-MIMIC approach was applied to control for the heterogeneity of the data due to specific characteristics of the interviewees (territorial setting, place of residence, demographic and socio-economic characteristics and travel-related variables). The findings show that involvement is a full mediator between satisfaction and behavioral intentions, and that satisfaction is a full mediator between service quality and involvement. Furthermore, the SEM-MIMIC results revealed that the four latent factors investigated (service quality, satisfaction, involvement and behavioral intentions) dealt with highly heterogeneous data. However, the most important finding is that private vehicle users’ involvement is the factor that contributes most to their behavioral intentions towards public transport. Hence, public transport managers might benefit from these outcomes when establishing detailed policies and specific guidelines for public transport systems to engage private vehicle users in a higher degree of usage of public transport services.


Introduction
In 2015, the United Nations approved the 17 Sustainable Development Goals as a call for action by all countries to promote prosperity while protecting the planet. They recognize that ending poverty must go hand-in-hand with strategies that build economic growth and address a range of social needs while tackling climate change and environmental protection. The eleventh goal is focused on sustainable cities and communities; among its numerous targets is the promotion of public transport (PT) over the private vehicle, due to a progressive saturation of the existing transportation infrastructure in the wake of population expansion and rapid urban growth. For this reason, strategies fostering the modal switch remain essential for achieving more sustainable mobility, so that problems such as traffic congestion and pollution do not continue to worsen.
To attract private vehicle users to PT and retain them as users, it is necessary to better understand the aspects that most influence private vehicle users' behavioral intention (BI) or loyalty towards PT. In fact, the central idea behind the theory of planned behavior is that a person's actual behavior is immediately determined by the BI (Fu and Juan 2017), and in turn, the BI is a measure of the strength of an individual's willingness to perform a certain behavior (Ajzen 1991). The BI factor is usually considered together with the concepts of service quality (SQ) and satisfaction (SA), all of them embedded in the well-researched SQ-SA-BI paradigm. This paradigm suggests that SA is the link between SQ and BI (de Oña and de Oña 2015, de Oña 2021b), although some authors have found evidence that SQ may also bear a direct impact on BI (Minser and Webb 2010, Chou, Lu and Chang 2014, de Oña, Machado and de Oña 2015). Likewise, some authors have demonstrated the relevance of involvement (INV) for this paradigm in the case of PT users. Wei and Kao (2010) identified significant differences between the BI of users with high and with low attraction towards PT, while de Oña (2020) determined that INV is the latent factor that contributes most to BI. Previous studies approaching this task are centered on PT users (de Oña 2020, Allen et al. 2019, van Lierop, Badami and El-Geneidy 2018, Wei and Kao 2010), yet as far as the authors know, this issue has not been addressed for regular private vehicle users. In fact, most of the recently published articles based on regular car users or other private vehicle users (e.g., motorcycle, bicycle, etc.) (Pedersen, Kristensson and Friman 2012, Redman et al. 2013, Abou-Zeid and Ben-Akiva 2014, Abou-Zeid and Fujii 2016, Kang et al. 2019, Li et al. 2019) focus on identifying the main attributes of service that could help attract users to PT, but they are not centered on understanding the whole paradigm.
Thus, further effort should be dedicated to pinning down the effect of INV on the SQ-SA-BI paradigm from the regular private vehicle users' point of view. This information could help PT managers and marketers in defining effective strategies, to convince potential users to make a modal switch.
Structural equation modeling (SEM) is generally considered to be one of the best-integrated strategic methods for measuring latent factors and assessing the structural relationships between these factors. Numerous authors have used SEM methodology to examine the framework related to users' BI with respect to using PT (Jen, Tu and Lu 2011, Chou et al. 2014, Allen et al. 2019, de Oña 2020, 2021b). Likewise, multiple indicators and multiple causes structural equation modeling (SEM-MIMIC) stands as an excellent means of taking into account users' heterogeneous opinions owing to sociodemographic characteristics, patterns of mobility or other variables. Because SQ, SA, INV and BI are factors that deal with highly heterogeneous subjective data (de Oña 2020, Allen et al. 2020), this method has undergone a sharp increase in usage in the PT field in the past few years (de Oña 2020, Allen et al. 2020, Ingvardson and Nielsen 2019, Allen, Munoz and Ortuzar 2018). De Oña (2020) studied the effect of heterogeneity upon SQ, SA, INV and BI as latent factors of PT users from five European cities. Allen et al. (2020) analyzed the heterogeneity in user perceptions regarding SA and loyalty in the railway services offered in the hinterland of Milan. The variables used for heterogeneity correction were related to travel characteristics, travel habits and socioeconomic characteristics. Ingvardson and Nielsen (2019) studied the relationship between norms, SA and PT use with data from six European cities. They focused on the effects of certain socioeconomic attributes on these latent factors. Allen et al. (2018) investigated an urban bus system, controlling for heterogeneity through travel habits and demographic variables over ten latent factors linked to SA. Based on these experiences, one may conclude that SEM-MIMIC is a proper means of dealing with heterogeneity, allowing researchers to arrive at very robust parameter estimates.
Against this background, the main purpose of the present study is to ascertain the role of involvement in the SQ-SA-BI paradigm when regular private vehicle users are investigated. Four alternative SEM models are compared, and five independent data samples are used for testing and validation. According to Kline (2015), the outcomes obtained in such research can be extrapolated when equivalent models are considered, not just the researcher's preferred model; and once the best model is selected, the results are replicated for different independent data samples. In this way, alternative explanations behind the data are not disregarded, and different model structures that share the same pattern of observed covariances can be accounted for. A second objective of this paper is to capture the heterogeneity present in the latent factors analyzed so as to identify which factors tend to be more heterogeneous, and derive the main sources of heterogeneity among private vehicle users. The heterogeneity analysis is performed under the SEM-MIMIC approach once the preferred SEM model is identified.
As far as the authors know, this is the first study that analyses the role of private vehicle users' involvement towards PT upon the service quality-satisfaction-behavioral intentions paradigm considering different alternative roles, several independent data samples, and controlling for heterogeneity of respondents due to specific characteristics (territorial setting, place of residence, demographic and socio-economic characteristics and travel related variables).
The article is divided into the following parts: introduction (Sect. 1), a theoretical background of studies investigating the role of involvement in the SQ-SA-BI paradigm and private vehicle users' behavioral intentions (Sect. 2), a description of the survey and characteristics of the five samples (Sect. 3), an overview of the methodology and research models (Sect. 4), the results (Sect. 5), discussion (Sect. 6) and, finally, the presentation of the main conclusions (Sect. 7).

Theoretical background
Involvement (INV) is related to an individual's subjective sense of the concern, care, importance, personal relevance, and significance attached to an attitude (Olsen 2007). Chen and Tsai (2008) state that a consumer's level of involvement with a given object of interest serves as an important determinant of consumer evaluations and behaviors. That is, in the PT context, passengers with higher PT involvement are those who have higher needs, values, and interest regarding PT (Lai and Chen 2011).
In the literature centered on the marketing domain, and particularly on PT services, other factors have some conceptual overlap with involvement, as they describe how a customer is attracted to or engaged with PT. These factors are image (Ni et al. 2020, Fu, Zhang and Chan 2018, Chang and Yeh 2017, Chou and Yeh 2013, Kuo and Tang 2013, Minser and Webb 2010, Chou and Kim 2009, Park, Robertson and Wu 2004) and attitudes (Simsekoglu, Nordfjaern and Rundmo 2015, Borhan et al. 2014). Corporate image (or image) is based on how an individual views the contribution of PT to one's own well-being and to society (van Lierop and El-Geneidy 2018), while attitude refers to a person's overall evaluation of the willingness or unwillingness towards performing a behavior (Borhan et al. 2014). Hence, although image and attitudes are not exact equivalents of involvement, here they will be treated interchangeably, as in previous studies (de Oña 2020).
In recent years, involvement, image or attitude have become recognized as factors influencing customers' behavioral intentions (Lai and Chen 2011, Borhan et al. 2014). Borhan et al. (2014) hold that attitude has contributed to the shift made from private vehicle use to PT. Different studies have aimed to ascertain the roles of involvement, image or attitude in the SQ-SA-BI paradigm, acknowledging their contribution to this framework (de Oña 2020, Irtema et al. 2018, Machado-Leon et al. 2018, Borhan et al. 2014, Chou and Yeh 2013, Lai and Chen 2011, Wei and Kao 2010). The above studies are based on PT users' point of view; to date, no publication has performed such an analysis from the standpoint of private vehicle users. According to this paradigm, SQ depends on passengers' perceptions about a series of attributes of PT service, while SA is defined as an overall affective response to a perceived discrepancy between prior expectations and the perceived performance afterwards (Oliver 1981, 1999). Moreover, Oliver (2010) explains that SQ is more oriented towards cognitive judgments, whereas SA is more holistic and associated with affective judgments. Behavioral intentions, in turn, are related to attitudinal and behavioral measures that are associated with willingness to reuse and willingness to recommend to others (Oliver 2010).
Most of the literature has confirmed the mediator effect of involvement (INV). This latent factor partially or fully mediates one or various relationships established in the SQ-SA-BI framework. However, no consensus exists about how it participates. In many cases INV partially mediates the relationship between SQ and SA (Chou and Yeh 2013, Kuo and Tang 2013, Chang and Yeh 2017, Fu et al. 2018) and/or between SQ and BI (Lai and Chen 2011, Borhan et al. 2014, Fu et al. 2018). Research by Irtema et al. (2018) and de Oña (2020) discerned a mediating effect of INV between SA and BI, although this mediating effect was partial in the work by Irtema et al. (2018) and full in that of de Oña (2020). In some cases, INV mediates the relationship between other latent factors and BI. For example, Lai and Chen (2011) pinpointed a mediator effect of INV between perceived value and BI.
A smaller number of studies have suggested an antecedent role of involvement, image or attitudes on SQ (Minser and Webb 2010, Ni et al. 2020), SA (Minser and Webb 2010, Machado-Leon et al. 2018, Ni et al. 2020) or BI (Simsekoglu et al. 2015). This implies that INV exerts a direct effect on one or more of these latent factors, but it does not mediate any relationship of this paradigm.
Finally, other studies point to a direct effect role (Allen et al. 2019) or a moderator role (Wei and Kao 2010, Machado-Leon et al. 2016) for INV. Wei and Kao (2010) analyzed INV by comparing users with high and with low INV regarding PT. They found significant differences between the two groups. Machado-Leon et al. (2016) looked into different roles of INV. They calibrated five different models using data from an LRT system: INV as a moderator, as a mediator (full or partial mediator), and as an antecedent. Even though the results supported the moderating role of INV, they were not conclusive, as none of their models were validated (all their models showed one or more non-significant paths). Finally, Allen et al. (2019) suggested a positive effect of loyalty over INV. In their work, however, INV was measured as an intention and not as an attitude, as considered in the previous research. This change in the INV conceptualization could explain the different role considered for INV in their paper.
Still, the perspective of non-PT users is not considered in any of the previous papers. Most recent studies based on regular car users try to shed light on this topic by attempting to pinpoint which indicators or specific latent factors are crucial for explaining public transport SQ, SA or travel intentions from the standpoint of regular private vehicle users (de Oña 2021a, de Oña, Estevez and de Oña 2020, Al-Ayyash and Abou-Zeid 2019, Li et al. 2019, Kang et al. 2019, Mahmoud and Hine 2016, Redman et al. 2013, Bamberg, Rolle and Weber 2003, Hine and Scott 2000). With regard to SQ and SA factors, Hine and Scott (2000) carried out several in-depth interviews and focus groups with PT users and car users about their decision to either undertake a PT trip or to use the car. They found that for car users the most important aspects of PT were journey time, trip flexibility, frequent services, regularity, bus stop located nearby, and discomfort during cold and wet seasons. Mahmoud and Hine (2016) quantified the relationships between several SQ indicators and the overall evaluation of PT in the United Kingdom among PT users and regular car users who occasionally used PT. By calibrating binary logistic regression models, they found that both groups had very similar perceptions. Al-Ayyash and Abou-Zeid (2019) used ordinal logit models to detect which SQ variables explained car users' satisfaction with PT in Beirut. The results of these models showed that only three service variables (bus travel time, cost, and shared taxi travel time) were significant. De Oña, Estevez and de Oña (2020) investigated the variables that influence the perception of private vehicle users about the PT services in Madrid (Spain). They analyzed the contribution of several SQ attributes to overall SA and found that the most important PT attributes for private vehicle users in Madrid are speed, frequency, and intermodality.
Concerning car users' travel intentions, Bamberg et al. (2003) analyzed the contribution of two attributes of PT (cost and information) to the possibility of generating a modal switch to PT in a scenario of change of household location. Their results suggested that both variables were significant for the modal switch. Redman et al. (2013) published an extensive research review on quality attributes of PT that attract car users. They showed that both perceived quality attributes (e.g., convenience, safety, comfort) and physical ones (e.g., reliability, frequency, price, accessibility) played an important role in the modal switch. Kang et al. (2019) tried to identify the predictors of drivers' intentions to switch from car to PT and their willingness to use PT. Significant associations were derived between the intention to switch and the latent factors flexible service, convenience and commute impedance. Li et al. (2019) analyzed the intentions of commuting by PT among car users in China. The results of the logistic regression models indicated that reliability, economics, and comfort had a significant influence on these intentions. Recently, de Oña (2021a) clarified the relationship between SQ, SA and BI from the point of view of private vehicle users who use PT at least occasionally. With data from Lisbon and Madrid, that study found SA to exert a full mediator effect between SQ and BI, and moreover derived that frequency, punctuality, intermodality and information were the most important SQ attributes for the samples of both cities.

Survey and sample description
This research is supported by data collected through an online panel survey in the metropolitan areas of Lisbon, Berlin, London, Madrid and Rome, from May to July of 2019. The survey was the same for the five sites. The questionnaire, with an average duration of seven minutes, was translated into the local language, and was made up of eight sections. This paper only uses five parts of the questionnaire:
• Part 1, referring to demographic, socio-economic and travel-related variables (Table 1).
Over 500 questionnaires were completed by regular private vehicle users in each European city (2,531 in total). In general, male interviewees are predominant, except in Berlin. Regarding age composition, the highest rates refer to the age brackets of 25-44 and 45-64. Most of the participants had been living in the same area all their lives, so they had a good knowledge of the PT system in the area. Most respondents were located in the metropolitan area, except in Berlin. Except in the case of Berlin, the level of studies was very high in all the cities, with most respondents having higher education. Concerning occupation status, most respondents were employees. With the exception of Berlin, all the cities have a similar family size distribution, with a prevalence of families having two, three and four members. In Berlin, families having only one member also made up a large percentage of the sample. Most respondents used PT less than once per week. Finally, the income level clearly differentiates the respondents of each city. In Madrid and Lisbon, the predominant net family income was over the sum of three minimum wages. On the contrary, in Berlin, most respondents had less than two minimum wages. In Rome, income is equally distributed between the three levels considered, and in London the two extreme levels are the most represented. To homogenize household income, the minimum wage in 2018 for each country was used as the value of reference.
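The income homogenization described above can be illustrated with a minimal Python sketch. It assumes three levels (below two, two to three, and over three national minimum wages), inferred from the thresholds mentioned in the text; the function name, the treatment of the exact boundaries, and the figures in the example are hypothetical and are not taken from the survey.

```python
def income_level(net_household_income: float, national_minimum_wage: float) -> str:
    """Classify household income relative to the national minimum wage.

    Assumption: three levels split at 2 and 3 minimum wages; how the
    survey treated incomes exactly on a boundary is not stated.
    """
    if national_minimum_wage <= 0:
        raise ValueError("national_minimum_wage must be positive")
    ratio = net_household_income / national_minimum_wage
    if ratio < 2:
        return "below two minimum wages"
    if ratio <= 3:
        return "two to three minimum wages"
    return "over three minimum wages"


# Hypothetical figures for illustration only (not survey data).
print(income_level(2500.0, 1000.0))
```

Expressing income as a multiple of each country's own minimum wage is what makes the levels comparable across the five cities.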
The sample is representative of the general population in each city because stratified sampling by gender and age was performed, with assignment proportional to the real size of the strata for each city (EC 2019). For the other variables in Table 1, we cannot ensure the representativeness of the sample because the distribution of these variables is not available for all the cities in the case of regular private vehicle users. For example, in the case of level of education completed, it might seem that people with a high level of education are over-represented while people with a low level of education are under-represented, based on data from the general population in those cities (EC 2020). However, the data could still be representative of regular private vehicle users if people with a higher level of education have higher incomes and use the private vehicle more than the population with a lower level of education and lower incomes. In any case, the reader should be cautious with the generalization of the results. Table 2 shows the average values for all the indicators considered in the survey (more detailed information is provided in Table A1 in the Appendix). Regarding SQ and SA, Berlin, London and Madrid show the highest ratings, over 3 points for each indicator. In contrast, in Rome, all the indicators related to SA are rated below 2.5 points. For the full data, proximity, accessibility and safety were the best rated indicators for SQ, while punctuality, individual space and security were the worst rated. Only safety shows full agreement and was rated as one of the three best indicators of SQ among the five cities. On the negative side, a fair degree of agreement is seen for individual space, valued as one of the three worst indicators, excluding Rome. Some of the highest and lowest rated indicators for each city that were not mentioned before deserve our attention.
For example, Madrid had high ratings for intermodality, and one of its lowest ratings went to cost. Contrary to Madrid, Rome gave a very high rating to cost and a low one to cleanliness. Berlin included among its best rated indicators proximity and among the worst temperature and cost. Among Lisbon's most appraised indicators is intermodality, while among the worst we find punctuality. London included among its most valued indicators frequency, service hours and information; among the least appraised were cleanliness and temperature.
Indicator labels: SQ1 (Service hours); SQ2 (Proximity); SQ3 (Frequency); SQ4 (Punctuality); SQ5 (Speed); SQ6 (Cost); SQ7 (Accessibility); SQ8 (Intermodality); SQ9 (Individual space); SQ10 (Temperature); SQ11 (Cleanliness); SQ12 (Safety); SQ13 (Security); and SQ14 (Information). BI1 (I will use PT for one-off trips); BI2 (I will take PT for regular trips); BI3 (I will increase PT usage); and BI4 (I will recommend PT).
Regarding INV, environment and reduce traffic were the highest-rated statements in the five samples, although in London freedom is also one of the most appraised indicators. In contrast, there is total agreement for low income and judgement, which obtained the lowest ratings. The indicator low income is formulated inversely, so a lower agreement indicates a better attitude. Again, in this case, Rome attained the lowest values for almost all the indicators related to attitudes toward PT. With reference to behavioral intentions, London attains the highest values for all the statements, followed by Madrid and Berlin. In Madrid, the indicator I will take PT for regular trips received a remarkably lower value (2.47) than the other three indicators. There is agreement for the statement I will take PT for one-off trips as being the most valued indicator for all the cities. It is noteworthy that although Berlin, London and Madrid are given the highest ratings in SQ, SA and BI, the differences in the evaluations of the BI indicators with respect to Rome and Lisbon are quite minor.

Competing models
Four competing models are considered to analyze the effect of INV on the SQ-SA-BI paradigm when regular private vehicle users are investigated. To the best of the authors' knowledge, no research has pinned down the role of INV in the framework of private vehicle users. Thus, recent literature focused on PT users served as a starting point for proposing the competing models.
Studies specifically centered on PT postulate that SA is a mediator between SQ and BI, with discrepancies as to whether this mediating effect is full or partial. Recently, though, de Oña (2021b) confirmed a full mediator effect of SA by using five sample populations for calibrating two alternative models. In addition, the full mediator role of SA between SQ and BI was validated for a sample of private vehicle users in de Oña (2021a). In that paper two competing models were tested in order to discover how the SQ, SA and BI factors were linked. Kline (2015) recommends using competing models and different samples for model validation and generalization, as is frequently done in disciplines such as marketing.
With regard to the effect of INV in the SQ-SA-BI paradigm, different roles are considered for this factor: full mediator, partial mediator, or antecedent (Fig. 1). For PT users, de Oña (2020) determined that SA and INV were completely full mediators in the SQ-SA-BI framework. That study used data from multiple cities with a common survey, and it compared eight alternative structural equation models. In particular, SA was found to play a full mediator role between SQ and INV, and INV played a full mediator role between SA and BI (Fig. 1.a). Other studies uphold a completely full mediator role of INV, and a partial mediator role of SA (Fig. 1.b) (Machado-Leon et al. 2016, Olsen 2007). In this case, INV fully mediates the effect of SQ and SA on BI. Nevertheless, the model most supported in the PT literature is the completely partial mediator role of INV (Fig. 1.c). In this model INV acts as a partial mediator between SQ and BI and between SA and BI. SA furthermore acts as a partial mediator between SQ and INV, and between SQ and BI (Park et al. 2004, Lai and Chen 2011, Borhan et al. 2014, Machado-Leon et al. 2016, Irtema et al. 2018). Finally, Simsekoglu et al. (2015) suggested that INV acts as an antecedent of the three factors that make up the SQ-SA-BI paradigm, while SA plays a full mediator role between SQ and BI (Fig. 1.d).
Therefore, four competing models are used to test the theoretical effect of INV on the SQ-SA-BI paradigm. The proposed models are the following (Fig. 1):
• Model 1 (Fig. 1.a): SA is a full mediator between SQ and INV, and INV is a full mediator between SA and BI.
• Model 2 (Fig. 1.b): INV is a full mediator of the effects of SQ and SA on BI, and SA is a partial mediator between SQ and INV.
• Model 3 (Fig. 1.c): INV is a partial mediator between SQ and BI and between SA and BI, and SA is a partial mediator between SQ and INV and between SQ and BI.
• Model 4 (Fig. 1.d): INV is an antecedent of SQ, SA and BI, and SA is a full mediator between SQ and BI.
Following de Oña (2020), a four-step analytical procedure is used to test the theory underpinning the relationships between INV and the factors SQ, SA and BI, and to control for heterogeneity:
(1) The model is re-specified as a Confirmatory Factor Analysis (CFA) measurement model (the CFA measurement model is the same for all the alternative models, since they consider the same constructs).
(2) Four structural regression models (Fig. 1) are tested with the pooled sample and compared to one another using their goodness-of-fit indices and their capacity to explain the variation in BI. A model is considered invalid if the data do not support it (i.e., if the signs of the paths are not consistent with the theory or if non-significant paths are identified). If more than one model is valid, those with the best goodness-of-fit indices and ability to explain the variation in BI are selected for the next step.
(3) The selected model(s) is (are) calibrated using the data from each of the five cities. Only a model showing statistically significant and consistent parameter estimates, reasonable goodness-of-fit indices, and a satisfactory explained variance of BI for all the cities is considered valid.
(4) To capture heterogeneity, some attributes regarding the territorial setting, place of residence, specific demographic and socio-economic characteristics (e.g., age, gender, education, income level, etc.) and travel habits are introduced as regressors for the latent factors considered. These models are calculated for the five cities and for the pooled data, which allows differences across cities to be identified.
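The validity rule in step (2) can be sketched in a few lines of Python. The path lists below are an assumed encoding of the four models as they are described in the text for Fig. 1, the expected sign of every path is assumed to be positive, and the estimates in the example are hypothetical placeholders rather than results from this study.

```python
# Structural paths (cause, effect) assumed for each competing model,
# read off the textual description of Fig. 1 (hypothetical encoding).
MODELS = {
    "Model 1": [("SQ", "SA"), ("SA", "INV"), ("INV", "BI")],
    "Model 2": [("SQ", "SA"), ("SQ", "INV"), ("SA", "INV"), ("INV", "BI")],
    "Model 3": [("SQ", "SA"), ("SQ", "INV"), ("SQ", "BI"),
                ("SA", "INV"), ("SA", "BI"), ("INV", "BI")],
    "Model 4": [("INV", "SQ"), ("INV", "SA"), ("INV", "BI"),
                ("SQ", "SA"), ("SA", "BI")],
}


def is_valid(estimates, alpha=0.05):
    """Return True only if every path is positive and significant.

    `estimates` maps a (cause, effect) path to (coefficient, p_value).
    Assumption: theory predicts a positive sign for all paths.
    """
    return all(coef > 0 and p < alpha for coef, p in estimates.values())


# Hypothetical estimates for illustration only.
supported = {("SQ", "SA"): (0.80, 0.001), ("SA", "INV"): (0.60, 0.001),
             ("INV", "BI"): (0.85, 0.001)}
unsupported = dict(supported)
unsupported[("SA", "BI")] = (-0.05, 0.40)  # wrong sign, not significant

print(is_valid(supported), is_valid(unsupported))
```

A model that fails this check is discarded before fit indices are compared, which mirrors the order of the procedure above.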

Results
We present the results in four steps: (a) Confirmatory Factor Analysis (CFA), (b) competing SEM models, (c) model validation and (d) SEM-MIMIC results.
The data preparation and screening included the standard checks (sample size, outliers, missing values, relative variances, normality, collinearity) for this kind of analysis and for all the cities (Berlin, Lisbon, London, Madrid and Rome). Low income showed a negative bivariate correlation with four of the other indicators used by the INV factor, so we decided to exclude it from the models. Moreover, we performed univariate and multivariate tests for normality. As most indicators were not normally distributed (see Table A1), univariate and multivariate normality were rejected. The Satorra-Bentler estimator (Satorra and Bentler 1994) was used to address this issue. The following results report the χ² corrected using the Satorra-Bentler estimator, as well as all the corrected model fit indices that use χ². Stata/MP 16.1 was used for the statistical analysis.
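As a reminder of what the correction does, the Satorra-Bentler scaled statistic is usually written as the maximum likelihood χ² divided by a scaling correction factor. The symbols below are notation introduced here for illustration, not taken from the paper.

```latex
% T_ML : chi-square test statistic under normal-theory maximum likelihood
% \hat{c} : scaling correction factor estimated from the data
%           (it exceeds 1 when the indicators are heavier-tailed than normal)
T_{SB} = \frac{T_{ML}}{\hat{c}}
```

Fit indices that are functions of χ², such as RMSEA, CFI and TLI, are then recomputed from the scaled statistic.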

Confirmatory Factor Analysis (CFA)
The CFA model consisted of four latent factors: SQ, SA, INV and BI. Each factor was related to several observed indicators (Fig. 2). For the pooled sample, Table 3 shows the parameter estimates and the goodness-of-fit statistics for the initial CFA model. The model was re-specified for three reasons: the approximate fit indices did not give excellent values (Hooper, Coughlan and Mullen 2008); the standardized factor loading of Environment (0.495) was low according to Hair et al. (2010); and the factor INV presented a low value for Average Variance Extracted (AVE) (0.467). We re-specified the model step by step. However, due to space limitations, Table 3 only provides the results of the initial and the final CFA model. In the final model the latent construct INV included just six indicators (Environment was also excluded), and nine measurement error correlations were specified. All these correlations are plausible and theoretically justified. Table 3 shows that all the goodness-of-fit indices for the final CFA model are excellent (Hooper et al. 2008): CFI (> 0.95), TLI (> 0.95), SRMR (< 0.05) and RMSEA (< 0.05). All standardized estimates are over 0.5. The construct validity also improved. The four latent constructs presented values above 0.7 for Cronbach's Alpha and Construct Reliability (CR): ranging from 0.831 to 0.937 for Cronbach's Alpha, and from 0.844 to 0.943 for CR. AVE was above the recommended threshold (0.50) in all cases except for SQ (0.495). However, following Fornell and Larcker (1981), the convergent validity of the factor can be considered adequate if AVE is less than 0.5 but CR is high.
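For readers who want to reproduce the convergent validity measures, the following Python sketch computes AVE and CR from standardized factor loadings using the standard Fornell and Larcker formulas. It assumes uncorrelated measurement errors, which is a simplification given the error correlations specified in the final model, and the loadings in the example are hypothetical, not the values in Table 3.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)


def construct_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    With standardized loadings, each error variance is 1 - loading^2.
    Assumption: measurement errors are uncorrelated.
    """
    total = sum(loadings)
    error_variance = sum(1 - l * l for l in loadings)
    return total * total / (total * total + error_variance)


# Hypothetical standardized loadings for a four-indicator construct.
example = [0.7, 0.7, 0.7, 0.7]
print(round(average_variance_extracted(example), 3))
print(round(construct_reliability(example), 3))
```

In this illustration AVE falls just below 0.50 while CR remains well above 0.7, which is the kind of situation the Fornell and Larcker (1981) argument refers to.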

Competing SEM models
At this step, four competing models (Fig. 1) were tested for the pooled sample. Table 4 shows the model goodness-of-fit indices (using Satorra-Bentler estimation), structural paths, and the ability to explain variation in BI. All the goodness-of-fit statistics were excellent, with RMSEA equal to 0.043, SRMR ranging from 0.038 to 0.042, TLI from 0.954 to 0.955, and CFI from 0.959 to 0.960. R² values for BI ranged from 0.768 to 0.783.
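The comparison against cut-off values can be made explicit with a small Python check. The thresholds are the ones used for the CFA in this paper (CFI and TLI above 0.95, SRMR and RMSEA below 0.05); the dictionary of indices takes the least favourable end of each range reported above, and the names `CUTOFFS` and `excellent_fit` are illustrative.

```python
import operator

# Cut-offs for an excellent fit as used in this paper (Hooper et al. 2008).
CUTOFFS = {
    "CFI": (operator.gt, 0.95),
    "TLI": (operator.gt, 0.95),
    "SRMR": (operator.lt, 0.05),
    "RMSEA": (operator.lt, 0.05),
}


def excellent_fit(indices):
    """Return True if every fit index satisfies its cut-off."""
    return all(compare(indices[name], cutoff)
               for name, (compare, cutoff) in CUTOFFS.items())


# Least favourable values across the four competing models (from the text).
worst_case = {"CFI": 0.959, "TLI": 0.954, "SRMR": 0.042, "RMSEA": 0.043}
print(excellent_fit(worst_case))
```

Because even the least favourable values pass, fit alone cannot discriminate between the four models; the signs and significance of the paths do.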
Results in Table 4 show that the data did not support two models (Models 3 and 4). In Model 3, where INV was considered as a partial mediator, the relationships between SA and BI and between SQ and BI were not statistically significant and showed inconsistent negative effects. Model 4, which proposed INV as an antecedent, also showed a negative effect for the relationship between SA and BI, which is inconsistent with the theory.
On the contrary, Models 1 and 2 presented significant and consistent parameter estimates. As both models presented similar values for the goodness-of-fit statistics and almost the same ability to explain BI (R² values), both of them were retained for the following step.

Model validation
Model 1 and Model 2 were estimated using the data of each one of the five cities (Berlin, Lisbon, London, Madrid and Rome) separately. Table 5 shows the models' goodness-of-fit statistics, R² values for explaining BI and the structural paths.
Although the approximate fit indices are very similar in both models across the five independent samples, data from Lisbon did not support a direct relationship between SQ and INV under Model 2 (the unstandardized estimate for this parameter was not statistically significant). As Model 2 is not validated for one of the databases, we have to reject this model (Kline 2015) and continue the following steps with Model 1. This finding agrees with previous studies focusing on PT users (de Oña 2020).
Model 1 presents fit indices around the recommended cut-off values according to Hooper et al. (2008)  pooled sample. The R² values among the five independent samples range from 0.594 to 0.840. Therefore, these results suggest that Model 1 (Fig. 1.a) is the only valid model. It shows that INV fully mediates the relationship between SA and BI, and SA fully mediates the relationship between SQ and INV. This finding contributes to the literature, as no studies in the transport field have determined the role of INV in the SQ-SA-BI paradigm for private vehicle users.
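The validation rule applied here (a model is retained only if every hypothesized path is significant in every independent sample) can be sketched as follows. This is a minimal illustration under stated assumptions: the 0.05 significance level, the set of paths listed for Model 2 and all p-values are invented for the example and are not taken from Table 5.

```python
ALPHA = 0.05  # assumed significance level; the text does not state the one used


def validated_everywhere(p_values_by_city, alpha=ALPHA):
    """Return (is_valid, failures) for one model.

    A model is kept only if every hypothesized path is statistically
    significant in every independent sample.
    """
    failures = [
        (city, path)
        for city, paths in p_values_by_city.items()
        for path, p_value in paths.items()
        if p_value >= alpha
    ]
    return (not failures, failures)


# Hypothetical p-values and path set, for illustration only.
model_2 = {
    "Berlin": {"SQ->SA": 0.001, "SQ->INV": 0.020, "SA->INV": 0.001, "INV->BI": 0.001},
    "Lisbon": {"SQ->SA": 0.001, "SQ->INV": 0.310, "SA->INV": 0.001, "INV->BI": 0.001},
}

is_valid, failures = validated_everywhere(model_2)
print(is_valid, failures)
```

In this invented example a single non-significant path in one city is enough to reject the model, mirroring the reasoning applied to Model 2 and the Lisbon sample.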

Table 5
Structural relationships and goodness-of-fit indices for the five metropolitan areas

SEM-MIMIC results
The SEM-MIMIC approach reveals differences owing to various attributes of the territorial setting, household location, demographic and socio-economic characteristics and travel habits.
The following dummy variables were introduced as regressors in the SEM-MIMIC model, for every latent factor:
• City (metropolitan area is the value of reference vs. city center).
• Male (female is the value of reference vs. male).
• Old (44 or younger is the value of reference vs. 45 or older).
• Frequent (occasional users are the value of reference vs. frequent users). In this study, a frequent user is considered to undertake one or more trips/week.
• University (without any university degree is the value of reference vs. with a university degree).
• Dependent (non-dependent members is the value of reference vs. having dependent members).
• High income (households with income levels below two minimum wages is the value of reference vs. households above that threshold).
• Berlin, Lisbon, London and Rome are included in the model as dummy variables, considering Madrid as value of reference.
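The coding scheme listed above can be sketched as a small function that turns one survey record into 0/1 regressors, with the reference category coded as 0. All field names are hypothetical, and the treatment of a household earning exactly two minimum wages (coded here as reference) is an assumption, since the text does not specify it.

```python
CITY_DUMMIES = ("Berlin", "Lisbon", "London", "Rome")  # Madrid is the reference


def code_dummies(respondent):
    """Code one respondent as 0/1 regressors (reference category = 0)."""
    dummies = {
        "City": int(respondent["residence"] == "city center"),
        "Male": int(respondent["gender"] == "male"),
        "Old": int(respondent["age"] >= 45),
        "Frequent": int(respondent["pt_trips_per_week"] >= 1),
        "University": int(respondent["university_degree"]),
        "Dependent": int(respondent["dependent_members"] > 0),
        # Assumption: exactly two minimum wages counts as the reference group.
        "HighIncome": int(respondent["income_in_minimum_wages"] > 2),
    }
    for city in CITY_DUMMIES:
        dummies[city] = int(respondent["metro_area"] == city)
    return dummies


# Hypothetical respondent, for illustration only.
example_respondent = {
    "residence": "city center",
    "gender": "female",
    "age": 52,
    "pt_trips_per_week": 0.5,
    "university_degree": True,
    "dependent_members": 0,
    "income_in_minimum_wages": 3.0,
    "metro_area": "Madrid",
}
coded = code_dummies(example_respondent)
print(coded)
```

A respondent from Madrid receives 0 on all four territorial dummies, so the coefficients of Berlin, Lisbon, London and Rome are read as differences with respect to Madrid.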
It is hypothesized that the latent factors differ across all these attributes, and the SEM-MIMIC model tests whether these differences hold. The SEM-MIMIC model was calibrated for each one of the five cities and for the pooled sample. The dummy variables for the territorial context were only used in the pooled sample's model. The SEM-MIMIC results are presented in Table 6. All the regressors, with the exception of Dependent, were significant for one or more latent factors, so the MIMIC model corrects for possible bias due to heterogeneity.
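For reference, a MIMIC specification is often written in the generic form below. This is a textbook-style sketch and not necessarily the exact specification estimated here; the notation is assumed for illustration: η collects the latent factors (SQ, SA, INV and BI), x the dummy regressors, y the observed indicators, and ζ and ε the disturbance and measurement error terms.

```latex
\begin{aligned}
\boldsymbol{\eta} &= \mathbf{B}\,\boldsymbol{\eta} + \boldsymbol{\Gamma}\,\mathbf{x} + \boldsymbol{\zeta} \\
\mathbf{y} &= \boldsymbol{\Lambda}\,\boldsymbol{\eta} + \boldsymbol{\varepsilon}
\end{aligned}
```

Under this notation, a significant element of Γ indicates that the corresponding group differs from its reference category on that latent factor, after controlling for the other regressors.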
As more parameters were introduced into the model, the model's goodness-of-fit indices deteriorated slightly (Table 5 for the city samples and Table 4 for the pooled sample). However, there is evidence that increasing the number of variables in the model tends to worsen some of these indices (Kenny and McCoach 2003, Allen et al. 2018, de Oña 2020). In fact, the final fit indices were: RMSEA ranging from 0.034 to 0.048, SRMR from 0.035 to 0.054, CFI from 0.915 to 0.974, and TLI from 0.904 to 0.970. Nevertheless, in most of these models the fit indices remain excellent or good. Indeed, RMSEA always remains excellent. Remarkably, in all cases they presented higher R² values (ranging from 0.660 to 0.883), showing that they outperformed the original model in explaining BI's variation. Thus, the MIMIC models generated valuable policy-related information about heterogeneity in the perception of the latent factors. Table 6 shows that the regressors produced significant results for all latent factors, which highlights that all these latent factors present heterogeneity. It is key to note which regressors affect each individual factor, and it is also important to assess the signs.
For each one of the latent constructs, several regressors were identified as significant in the case of the pooled sample. Being male increased the SQ perception, as did being frequent  The models' results for the five cities also illustrate interesting findings. Table 6 shows the constructs with higher heterogeneity, the regressors that contribute more to such heterogeneity, and whether a consistent pattern is identified in the regressors' behavior (i.e., the direction of the influence is stable). For the five European cities, all the latent factors show heterogeneity, except in Madrid and London, where SA does not present significant heterogeneity. In Rome, the heterogeneity of SQ was greater, with significant differences detected for three dummy variables: Male, Old and Frequent. The regressor Frequent was also identified as significant in all five independent samples, indicating that being a frequent user increased the SQ perception relative to being an occasional one. Moreover, in Berlin, London and Rome, people 45 or older perceived significantly lower SQ. In contrast, in Rome, SQ perceptions increased for males. It is interesting to highlight that only in Madrid did higher income levels contribute to better SQ perceptions. In Madrid, in fact, only two regressors generated heterogeneity in the latent factors: the income level (which involves heterogeneity only in terms of SQ) and being a frequent user, which generates significantly better ratings for SQ, INV and BI. This pattern is replicated in Berlin, Lisbon, London and Rome, as being a frequent user also contributes positively to SQ, INV and BI. In Rome, the regressor Male exerts different influences on the four latent factors relative to its base category (Female). That is, being male contributes positively to SQ and SA, but decreases the attitudes toward PT and the BI. In the case of Berlin, people 45 or older showed significantly lower SQ, SA and INV. The sign of this regressor could be considered consistent across the latent factors. Only in Lisbon and Rome did the dummy variable City present significant differences. Heterogeneity was present for the latent factor INV and the sign of this regressor was negative, indicating that people living in the city center had less favorable attitudes toward PT. Finally, heterogeneity in SA due to university degree was only identified in Lisbon. In this case, people with a university degree had lower SA than people without a university degree. This regressor was significant only for this latent factor and for this European city.
Table 7 shows the total effects among factors for each one of the cities and for the pooled sample. The factor with the highest contribution to BI is INV, with total effects ranging from 0.686 to 0.953. In second place, SQ's total effects on BI range from 0.395 to 0.564, depending on the city. Finally, SA contributes with total effects ranging from 0.359 to 0.506. Previous studies about PT users (de Oña 2020, Lai and Chen 2011) reported the same order of importance. In all the cities, with the exception of Rome, SQ presents a greater total effect on INV than SA. The reason lies in the high direct effect of SQ upon SA, which exceeds 1.0 in every city except Rome. This means that a 1-point increase in SQ predicts a more than 1-point increase in SA. This result is highly relevant for PT managers, as increasing perceived SQ would lead to a more than proportional increase in SA.
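Because Model 1 is a fully mediated chain (SQ to SA, SA to INV, INV to BI), each total effect is the product of the direct paths along the chain. The sketch below illustrates this arithmetic; the function name and the three path coefficients are hypothetical and are not the estimates reported in Table 7.

```python
def total_effects(sq_to_sa, sa_to_inv, inv_to_bi):
    """Total effects in the chain SQ -> SA -> INV -> BI (full mediation).

    With no direct paths skipping a mediator, every total effect is the
    product of the direct effects along the chain.
    """
    return {
        ("SQ", "SA"): sq_to_sa,
        ("SA", "INV"): sa_to_inv,
        ("INV", "BI"): inv_to_bi,
        ("SQ", "INV"): sq_to_sa * sa_to_inv,
        ("SA", "BI"): sa_to_inv * inv_to_bi,
        ("SQ", "BI"): sq_to_sa * sa_to_inv * inv_to_bi,
    }


# Hypothetical unstandardized path coefficients, for illustration only.
effects = total_effects(sq_to_sa=1.2, sa_to_inv=0.5, inv_to_bi=0.8)
for (source, target), value in effects.items():
    print(f"{source} -> {target}: {value:.3f}")
```

The example makes the pattern discussed above explicit: whenever the direct effect of SQ on SA exceeds 1.0, the total effect of SQ on INV (and on BI) is larger than that of SA, even though SQ acts only through SA.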

Discussion of results
To improve the readability of the discussion section, we have organized it into two subsections. First, we further discuss the competing structural regression models. Then, we detail the results obtained from the SEM-MIMIC approach concerning the heterogeneity of respondents due to territorial setting, place of residence, demographic and socioeconomic characteristics and travel related variables, comparing our results with previous studies.

Competing SR models
By comparing four alternative structural equation models and using data from five different European cities (Berlin, Lisbon, London, Madrid and Rome), we ascertained how private vehicle users are involved with PT in the theory underlying the SQ-SA-BI paradigm. This is the first study to date that can characterize the relationship existing among these four latent factors when dealing with private vehicle users on a collective level.
Three models (Models 1, 2 and 3 in Fig. 1) analyzed the possible mediator role of INV between SA and BI and between SQ and BI. This is an often-explored trend in the literature on regular PT users (Irtema et al. 2018, Machado-Leon et al. 2016, Borhan et al. 2014, Lai and Chen 2011). Yet there is a lack of consensus as to whether that role is partial or full, and whether it acts as a mediator only between SA and BI, or also mediates between SQ and BI. Model 4 was calibrated to test INV as an antecedent of SQ, SA and BI. This latter role of INV is likewise sustained by researchers on regular PT users (Ni et al. 2020, Machado-Leon et al. 2018, Machado-Leon et al. 2016, Simsekoglu et al. 2015, Minser and Webb 2010). Recently, de Oña (2020) tested and validated the relationship between the latent factors of SQ, SA, INV and BI for regular PT users. The findings indicate that INV is a full mediator between SA and BI, and SA is a full mediator between SQ and INV.
In our case, by observing the results obtained with the four competing models, calibrated with the pooled sample, Models 1 and 2 were preselected. Thereafter, by looking into the ten models calibrated within the five independent samples, it could be concluded that Model 1 was the valid one. According to this model, INV fully mediates the relationship between SA and BI. SA additionally exerts a full mediator role between SQ and INV. These results replicate the model structure suggested by de Oña (2020) for PT users, denoting it as the most plausible framework for the INV factor upon the SQ-SA-BI paradigm, both for regular PT users and for regular private vehicle users. Besides, using data from Madrid and Lisbon, de Oña (2021a) analyzed the relationship between SQ, SA and BI from the point of view of private vehicle users, determining that SA exerted a full mediator effect between SQ and BI. In another study, de Oña (2021b) supported the superiority of the full mediator approach of SA over the partial mediator one in urban and metropolitan PT systems from the standpoint of regular PT users.
Thus, several previous research works (de Oña 2020, 2021a, 2021b) support Model 1, as all of them rejected the partial mediator role of SA, and one of them (de Oña 2020) also rejected the partial mediator role of INV. The other three models analyzed here (Models 2, 3 and 4) are considered invalid because they presented parameter estimates that are inconsistent or non-significant for the whole sample or in one of the cities sampled. Although these models showed adequate goodness-of-fit indices, they presented non-significant paths and had to be re-specified (Kline 2015).
Furthermore, as far as the authors know, this is the first study in the field of PT, and centered on private vehicle users, that compares a meaningful number of competing models with data from different cities. Most PT studies ignore the possibility of alternative models, failing to consider other possible explanations behind the data, which is a form of confirmation bias (Kline 2015). Thus, considering competing models and using different independent samples stand as an adequate strategy to discover the best model structure underlying a phenomenon under study, one that allows the outcomes of research to be extrapolated.
Our results show the importance of introducing INV in the relationship between SQ-SA-BI to improve the model's ability to explain the variation in BI of private vehicle users. While the model proposed by de Oña (2021a) only considered SQ, SA and BI and presented a low ability to explain BI's variation (ranging from 0.249 in Lisbon to 0.338 in Madrid), Model 1 presents a very high ability to explain BI's variation, ranging from 0.660 in Lisbon to 0.883 in London (Table 6).

Influence of the territorial setting, demographic and socio-economic characteristics, household location, and travel habits
Through the SEM-MIMIC model we identified the influence of the territorial setting, demographic and socio-economic characteristics, household location, and travel habits on the SQ, SA, INV and BI latent constructs. All the regressors showed significant results for at least one factor (with the exception of the regressor Dependent members in the family) both for one or more of the five cities and for the pooled sample.
The pooled data model allowed us to appraise the influence of the territorial setting when controlling for all the other demographic and socio-economic characteristics and travel habits (Table 6). While there are significant differences for all the latent factors between Madrid and Rome, the differences between Madrid and Lisbon, Madrid and London, and Madrid and Berlin are significant for three out of the four latent factors, varying in each case. Some similarities are found in the sign of these regressors across the latent factors. The differences between Rome and Madrid are positive for INV and BI, and negative for SQ and SA. This means that the ratings of INV and BI are significantly higher in Rome than in Madrid, and significantly lower with respect to the SQ and SA factors. This pattern is also found for the other cities, given that the differences between Lisbon and Madrid are negative for SQ and SA and positive for INV. In addition, the differences are negative for SA between Madrid and London and between Madrid and Berlin, and positive for BI in both these regressors (London and Berlin). It is relevant to highlight that, contrary to the trend seen before, London shows positive differences from Madrid regarding SQ, and Berlin exhibits negative differences from Madrid regarding INV.
Gender, age, level of education, household income, household location, and PT use frequency produced statistically significant results for at least one latent construct. In summary:
• GENDER: Males present a negative effect on BI, yet a positive effect upon SQ, SA and INV. Recent studies about PT users (de Oña 2020, Allen et al. 2018, Ingvardson and Nielsen 2019, Allen et al. 2020) have reported similar findings about SQ and SA. This could be associated with several facts: more females travel with children or make shopping trips, meaning PT may be less comfortable for them; women furthermore tend to perceive less safety than men, which indirectly affects their SA with PT; and there is a higher proportion of captive users among women, whereas males could have more access to private vehicles. For PT users, Ingvardson and Nielsen (2019) and de Oña (2020) uncovered non-significant differences in terms of gender for BI. Likewise, de Oña (2020) identified an opposite effect of gender upon INV when regular PT users are analyzed. In the present study, male private vehicle users tended to express higher INV towards PT, while regular PT male users had a lower INV (de Oña 2020). However, this effect is only significant in the case of the pooled sample and Lisbon for the private vehicle users.
• AGE: In general, people older than 45 present lower SQ perceptions, lower SA (only significant for Berlin) and lower INV. In contrast, and only for the pooled sample, the older age bracket shows more positive BI. The effect of age on these latent factors has also been analyzed in the PT users' literature, although some discrepancies can be found. For example, de Oña (2020) uncovered that the effect of age on SQ varied from city to city, being more positive for older users in some of the independent samples, while the opposite was seen in other cities. In relation to SA, Allen et al. (2018) found that older people tend to be less satisfied, whereas Allen et al. (2020) arrived at the opposite conclusion at another location. As for BI, Allen et al. (2020) and de Oña (2020) determined that older PT users expressed higher loyalty or BI towards PT. A possible explanation for the positive effect of age on BI could be that, even if older private vehicle users are less satisfied with the PT service, they might develop a greater dependence upon PT as they grow older and drive less.
• LEVEL OF EDUCATION: It is not significant in most cases, though it is significant for SA in the case of Lisbon and for BI for the pooled sample. In this case, the effect on SA is negative, while the effect on BI is positive. The effect upon SA agrees with previous studies on PT users (de Oña 2020, Allen et al. 2020) in that having a university degree contributed negatively to overall SA. It can thus be hypothesized that more highly educated private vehicle users are more demanding and would raise standards for the PT service. In contrast, the effect upon BI disagrees with the study of de Oña (2020), where regular PT users with a higher level of education presented worse BI; in the present study, regular private vehicle users show better BI when their level of education is higher.
• HOUSEHOLD INCOME: It was identified as significant only for SQ in Madrid: people with higher incomes gave higher appraisals. This finding agrees with previous studies on PT users (de Oña 2020, Allen et al. 2020) in which SQ increased for PT users of middle-high incomes. This could be associated with the fact that low-income people make a greater economic effort to use PT; hence they perceive the price as higher than middle-high income users do.
• HOUSEHOLD LOCATION: It does not present any significant effect on SQ, SA or BI. However, living in the city center shows a negative effect on INV for the pooled sample as well as for Rome and Lisbon. This effect could explain why, even if PT services are generally better in the city center (e.g., higher frequency and punctuality) and they are perceived with higher SQ by people living downtown (de Oña 2021a), regular private vehicle users still prefer to use their private vehicles because they present low INV with PT. In contrast, regular private vehicle users living around the metropolitan area could present higher INV, but in many cases PT does not offer suitable service for them.
• PT USE FREQUENCY: Significant differences are also seen regarding the PT use frequency. It presents a constant pattern for three of the four constructs analyzed: frequent users (defined in this study as people undertaking one or more trips per week) perceive the SQ better than occasional users (less than one trip per week), and they express higher levels of INV and BI. SA presents this same trend for Lisbon and the pooled sample. However, de Oña (2020) and Allen et al. (2020) did not find this regressor to significantly affect the latent factors considered when analyzing PT users.

Conclusions and recommendations
This paper contributes to ongoing research on the shape of the structural framework underlying SQ, SA, INV and BI or loyalty factors when regular private vehicle users are concerned. The findings lead us to affirm that INV is the most relevant factor affecting the intentions of private vehicle users towards using and/or recommending PT to others, given that this factor fully mediates the relationship between SA and BI; this is also the factor that contributes most to BI when total effects are examined. The findings therefore point to a need for policy makers, transport authorities and PT operators to develop campaigns to improve attitudes toward PT, as a means to encourage the modal shift of regular private vehicle users to more sustainable transport modes. Enhancing SQ perceptions and SA in isolation is not an effective strategy if the level of INV is low. Hence, if the primary focus is attracting private vehicle users to PT, awareness campaigns should be oriented to engage the entire population from an early age. Differences found in the SEM-MIMIC models between different types of users demonstrate that the SQ, SA, INV and BI factors deal with highly heterogeneous data, and that such heterogeneity should be taken into account to avoid biased results. This information should prove useful for PT operators and administrators when making policy or formulating personalized marketing and awareness campaigns. Incorrect conclusions might mislead PT operators and authorities towards an inefficient use of their resources.