1 Introduction

Level 2 driver assistance systems [1], which provide steering and brake/acceleration support, are already widely available in modern vehicles and hold great potential to increase traffic safety. To realize this potential, however, assisted driving technology must actually be used by the driver. Therefore, such systems must be developed with a user-centered approach in order to create a good user experience and a willingness to use them.

Several aspects such as acceptance and trust are important variables affecting actual usage of new driving automation, e.g. [2,3,4]. The acceptance and positive evaluation of driver assistance systems is one central research area in the field of driving automation, e.g. [5, 6, 4]. In general, there are different traditions of acceptance measurement [7, 8], of which technology acceptance models have played a central role in recent years' discussion. The origin of this approach lies in the Theory of Planned Behavior (TPB; [9]), based on which the Technology Acceptance Model (TAM; [10,11,12]) was developed, and further extended in the Unified Theory of Acceptance and Use of Technology (UTAUT; [13]). On the other hand, acceptance can also be operationalized and measured as an overall evaluative attitude towards a system, with the scale by van der Laan et al. (VDL; [14]). Several authors have used, adapted or extended these classic technology acceptance approaches for the context of interacting with driver assistance or automation systems, e.g. [8, 15,16,17,18]. However, in most cases these measures incorporate a broad array of variables, each with numerous items, which makes them difficult to use in applied research and testing settings during driving. Acceptance can change and develop over the course of driving and interacting with the system [5]. To track these changes, it has to be measured repeatedly. It also needs to be measured during the drive, which requires an unintrusive and quick measurement that meets safety requirements and respects the driver's limited cognitive capacity. Therefore, only short versions of scales or one-item approaches are feasible in such settings.

Against this background, this study explores the validity of a single-item acceptance attitude measure (SIAM) for application in practical settings and compares it to available multi-item measures. The goal of this research is not to provide a validated scale for scientific research, but to gather evidence for or against the applicability of a one-item scale for acceptance in practical user research. To investigate its construct validity, a driving simulator study was conducted in which the SIAM, the TAM, and the VDL were rated throughout an assisted drive. To frame the present work in its theoretical context, we describe two widely used acceptance measurement approaches in the following paragraphs. Further, the necessity of a simple measurement for practical applications is elaborated, and the benefits and drawbacks of single-item scales are pointed out.

1.1 Theories of Technology Acceptance and the automotive context

Technology acceptance has been found to play an important role in the automotive context, e.g. [8, 17]. For the usage of new driver assistance systems, Beggiato et al. [5] found that acceptance develops over time when interacting with a system. Rahman et al. [3] took different measurement approaches into account to investigate which acceptance models can best be used in the automotive context. The measurement of technology acceptance is strongly affected by the variety of theoretical conceptualizations of acceptance. According to Adell [7, 8] there are five approaches towards a definition of acceptance in driver assistance systems. First, there are approaches which just use the word acceptance to assess the construct, e.g. [7, 19]. The definitions further range from meeting the requirements of the user, to different attitudes, to the willingness to use or the actual use of a system [7]. To assess the acceptance of information technology, several different theories have been applied [8]. Among the most prominent and widely used ones [3] are the TPB [9], the TAM [10,11,12, 20], the UTAUT [13], and the acceptance measures conceptualized by van der Laan et al. [14].

The TPB was derived from the Theory of Reasoned Action (TRA; [21]) and assumes that the intention to perform a certain behavior is affected by three factors: attitude, social norm, and perceived behavioral control [9]. The three postulated determinants of intentions are further influenced by respective salient beliefs. According to Ajzen [9], depending on the specific context, not every factor plays an equally important role. Rahman et al. [3] found, for example, that the attitude towards behavior had the strongest effect on behavioral intention in the context of driver assistance systems. The general idea of the TRA was further transferred to the domain of technology interaction by Davis [10], resulting in the TAM, a model to explain the usage/adoption of information technology. The original TAM postulates attitude towards using a system as a mediating variable between beliefs and actual usage [10]. Accordingly, the modeled beliefs of the TAM are used to predict the attitude towards using a system [10, 12] and further the behavioral intention to use [20] by the two factors perceived usefulness and perceived ease of use (see Fig. 1). At the same time, perceived ease of use affects perceived usefulness. Perceived ease of use can be understood as “the degree to which a person believes that using a particular system would be free of effort” ([11], p. 320). Perceived usefulness is the belief that a certain system supports one's performance. This construct is widely used to predict technology acceptance and has been adapted for different contexts in automotive and automation research, e.g. [7, 22, 23]. Roberts and Ghazizadeh [24], for example, applied the scale to measure the acceptance of real-time and post-drive distraction mitigation systems. To investigate age-related influences on the acceptance of driver assistance systems, the TAM was extended and validated by Günther and Proff [25], showing differences between the age groups.
To gain insights into the intention to use the ADAS system in their vehicle, Kaye et al. [26] applied the TAM in an online survey with car owners. Herein, perceived usefulness and perceived ease of use significantly predicted the drivers' intention to use the system [26]. To understand the underlying reasons for usage intentions in more detail, Rahman et al. [27] investigated different additional components in the TAM structure. Herein, besides the existing TAM factors, endorsement, compatibility, and affordability were found to be relevant [27]. Stiegemeier et al. [4] also integrated several factors into the TAM, such as hedonic motivation or safety issues in the context of in-vehicle technology. Another area within driving research was approached by Oviedo-Trespalacios et al. [28], focusing on the reduction of mobile phone usage. In this study, the TAM provided the best explanation for the intention to use an application designed to avoid interaction with the mobile phone while driving [28].

A different approach towards acceptance is proposed by van der Laan et al. [14]. They understand acceptance as the “direct attitudes towards that system” ([14], p. 2) with attitudes being “predispositions to respond, or tendencies in terms of ‘approach/avoidance’ or ‘favourable/unfavorable’” ([14], p. 2). For their conceptualization they proposed two scales, usefulness and satisfaction, with five items for the first and four items for the second scale. The proposed measurement has been used in several studies in the automotive context, e.g. [5, 29]. Blömacher et al. [29], for example, used the scale to depict the influence of a preliminary system description on acceptance. Feinauer et al. [30] recently used the VDL to investigate whether prior knowledge of a system influences the attitude towards the driving automation. Winkler et al. [31] examined two warning stages of a collision warning system, measuring acceptance with the VDL in a driving simulator study. Van den Beukel et al. [32] proposed a framework to assess early concepts of ADAS interfaces for real-world scenarios and used the VDL to measure the concept's acceptance. Besides using established and validated scales, some studies adapt items and scales to fit their specific context and application. For instance, Seter et al. [33] incorporated satisfaction, usefulness, and usability as acceptance factors tailored to their research context of driver assistance in eco-driving and the safety-relevant area of school zones. Badweeti et al. [34] used some wording from the VDL according to Son et al. [23], adapted to their context of ADAS in a field operational test.

1.2 Yet another acceptance measure?

In view of the various existing and validated measurement methods, one may ask why another measure should be examined. The described conceptualizations share the view that acceptance is built up from different factors. This leads to several subscales and associated items. As a result, the scales are necessarily long, which poses challenges to their use in applied research settings while driving (both in simulated and real-world settings). Here, the time and opportunity to repeatedly measure acceptance with a large number of items are restricted. Accordingly, brief and economic measures are already applied in different areas. In the field of acceptance research, one way to measure the construct is to inquire about the acceptability of a system, in some cases with only one item [8, 19]. Muslim et al. [35] measured acceptance with a single item regarding usage intentions to gain insights into the acceptance of lane change collision avoidance systems. On the basis of the TAM, Braun et al. [36] used different single items to assess aspects of ADAS acceptance, such as usefulness and intention to use, in order to tailor the questionnaire to their sample of elderly participants. However, the validation and standardization of such single items is often not described. As a result, the findings cannot be compared to each other. Those results would gain informative value if the measure were investigated in terms of validity and reliability. Therefore, exploring a measurement approach which can be applied quickly and efficiently and which leads to comparable results would be useful.

In addition, different reasons for short measures in applied contexts should be considered especially to get insights during dynamic and complex situations.

First, to explore the development of the system's acceptance evaluation over time, the participant should experience the system repeatedly and evaluate the interaction at multiple stages, as acceptance can develop with drives and experience with the system [5]. Taking this further, examining more complex interactions in, e.g., urban scenarios requires more frequent evaluations of experienced situations. It was found that dense traffic situations and driving contexts with high system interaction frequencies are problematic for the use of ADAS [37]. This highlights the need to further investigate complex use cases to gain a deeper insight into how to improve systems in these situations. However, given the constraints of experimental duration and the number of use cases, the measure needs to be short to effectively gather insights into the multiple use cases of interest. Importantly, the situation per se should not be changed too much by lengthy questionnaires that interfere with driving and the ‘natural’ experience of the situations.

Second, the evaluation should be conducted during the test drive in both real traffic and simulator settings. The differentiation between an online measurement and a retrospective query plays an especially important role here. In this regard, Endsley [38] raises the need for unintrusive measures while driving, as important memory contents can change, or relevant information can even be lost, by the time the drive has ended. Additionally, von Janczewski et al. [39] argue that simulator sickness is reduced if the participant does not have to stop. They further highlight the time saved. Therefore, “the instrument should be administrable auditively and should be able to be responded verbally” ([39], p. 212). This procedure of verbally presenting a single-item questionnaire during the experimental drive has already been applied, e.g., by Stoll et al. [19]. They examined the acceptance of cooperative behavior in automated driving functions regarding lane changes by questioning the participants while they continued driving on a highway [19].

Third, the measure needs to be applicable in real traffic. Importantly, the instrument must not distract the participant too much, in order to ensure traffic safety. Therefore, the scale needs to be kept as simple as possible without requiring the participant to take too many considerations into account. In this context, keeping the workload low is another relevant aspect. Different factors already influence the workload while driving, such as situation complexity or driving experience, see e.g. [40]. When asking during the drive, additional workload should be minimized, not just to collect unintrusive, valuable results [38], but also to avoid cognitive overload and putting the participant at risk.

Additionally, the participants' knowledge background needs to be considered. As participants do not know the underlying constructs [39], it needs to be possible to answer the scale's items without any specific knowledge. In addition, when aiming for diverse samples, we need to account for their specific needs. As already outlined by, e.g., Braun et al. [36], specific requirements regarding the examined sample can justify simpler measurement approaches.

These considerations clearly underline that while multi-item scales are a preferable way to measure acceptance, their meaningfulness and applicability in study settings during an experimental drive is somewhat restricted. Consequently, one-item measurement approaches provide a necessary proxy for investigations in the context of driving automation and have already been applied to various research questions e.g., see [36, 19]. Therefore, it is required to explore the validity and reliability of such an economic measure in order to achieve a better fit with the situational context and to meet the requirements of applied research.

1.3 Using a single-item scale vs. multi-item scales

One approach to solve the issues just mentioned could be the use of a one-item measurement. However, there is an ongoing discussion on single- and multi-item measurement methods, e.g. [41]. Based on the research of Churchill [42], multi-item scales have been considered the method to use in marketing research [43, 41]. A multi-item scale has a higher reliability by definition, e.g. [42], as there are more items for the correlation when calculating internal consistency [43]. Additionally, a single item cannot fully represent the several facets of a model or concept [43], as one item might not be enough to fully describe a targeted model [44, 45]. Consequently, if a single item is used, the underlying reasons cannot be grasped. A multi-item scale, in contrast, can capture a more holistic range of a given construct. Additionally, as there are more response categories within multi-item scales, they can possibly discriminate responses better [43]. Especially for very complex constructs, the use of single-item measures does not seem suitable [46].

In some areas, however, it is justified to use single items. Practical implementation should not be underestimated, as Malhotra et al. [41] argued: “Practical constraints in time, monetary costs, respondent fatigue and survey refusal might be other factors that qualify researchers’ measurement choice” ([41], p. 843). They conclude that in simplified settings that test changes of a construct, single-item measurement is useful, provided that no specific explanation for the reason of the change is needed [41]. In line with this, Bergkvist and Rossiter [43] compare multi- and single-item measures in marketing research. They focus on the practical and theoretical reasons for using a single item, such as fewer data- and participant-related problems, as well as the straightforward measurement of concrete constructs. Additionally, with regard to workload, which includes the processing capacity needed for a certain task (see e.g. [47]), using only one item should be less demanding than using multiple items of the same complexity.

To date, many researchers have developed single-item scales in various areas of research for an effective measurement of different constructs [48, 46]. Within the automotive context, von Janczewski et al. [39] presented a measurement for cognitive load while using in-vehicle infotainment systems. The authors developed their scale based on the NASA-TLX [49], aiming for a subjective measure to account for cognitive workload while driving [39]. Himmels et al. [50] developed a single-item measurement for user experience in the automotive context. Comparing a good and a bad performing highly automated driving system, they could demonstrate that the scale was sensitive to differences in user experience, indicating its construct validity.

According to the applied context, both approaches are useful and can complement each other (Table 1). To investigate complex relations and to generate a deep understanding of a given construct, multi-item scales are the method to use. Concerning the need for evaluations during the drive with time constraints as well as safety issues, the use of an efficient and economic measurement tool plays a central role. However, psychometric drawbacks should be kept in mind while using such items carefully in a suitable context.

Table 1 Positive aspects of multi- and single-item scales

1.4 Testing the usage of a single-item measurement for acceptance attitude

The usage of short measures in research and the specific context requirements when measuring acceptance while driving raise the need to explore the validity and reliability of single-item scales to measure acceptance attitude. Therefore, we explored and validated a single-item measurement which we define as an acceptance attitude proxy, predicting behavioral intention to use a system according to the TAM [10].

Attitude towards a system plays a central role in system usage. The just presented acceptance approaches have in common that they all include the attitude aspect [10, 14]. Davis et al. [20], for instance, propose the attitude towards using a system as well as behavioral intention to use it to predict system usage, while van der Laan et al. [14] use direct attitudes towards a system to explain acceptance. Attitude was shown to play an important role which is often underestimated [51]. Especially if it is strong, attitude fully mediated the influence of perceived usefulness and perceived ease of use on behavioral intention to use, outweighing the direct effect of perceived usefulness [51].

Taking these considerations into account, the acceptance evaluation of new systems should be included directly in the validation process to meet the users' needs. Especially in terms of traffic safety, the usage of new driver assistance systems has great potential (see e.g. [52]), which is lost if the system is turned off. Therefore, users' attitudes should be considered as early as possible in the development process. This approach needs to be holistic and efficient, evaluating the technology in an overall manner. The aim is to foster positive acceptance and a satisfactory driving experience with the new systems while saving development costs. As already outlined by Adell [7], the actual usage of a system is ultimately the prerequisite for realizing the system's positive effects on today's traffic.

Taking the different requirements into account, we used the acceptance attitude evaluation of the system “I perceive the system as good” (in German, see Table 2) to approximate acceptance attitude. It covers the attitude part as defined in Davis [10, 12], which is an antecedent of the intention to use the system [10, 20, 53]. Here, the term “good” approximates the overall positive evaluation of the system, including different aspects of the construct. As we use a single-item measurement, we need a term that incorporates different aspects of acceptance attitude.

Fig. 1 TAM including Acceptance Attitude based on Davis [10, 12] and Davis et al. [20] using the SIAM

1.5 Research question and hypotheses

Keeping the requirements for a measure while driving in mind and considering the different approaches of measuring acceptance in the driving context, this paper aims to explore whether an overall single item can efficiently approximate acceptance, defined as the attitude towards using driver assistance systems. For such a measure to be applicable, it should be comparable in terms of validity and reliability to more extensive measures used in current acceptance research. Single items have been shown to be equally valid in other areas of automotive research [50, 39]. Therefore, a single-item approach is investigated regarding its validity compared to existing measures. As outlined above, the VDL is one of the most widely used scales in acceptance research. Therefore, we used this scale to establish construct validity (see [41, 54]) for the SIAM. Additionally, although several models to measure ADAS acceptance exist, the TAM is one of the most widely used ones (see, e.g. [3, 28]) and was included in the study as a second construct to test the validity of the SIAM.

According to the outlined attitude approach, we hypothesize that:

  • the SIAM correlates with the used VDL over different points of measurement (convergent validity).

  • the SIAM and the VDL can similarly differentiate between a group that has received a full and a group that has received an incomplete introduction to the system's functionalities (construct validity).

  • the SIAM predicts the intention to use a driver assistance system as attitude measure in the TAM (predictive validity).

2 Method

To assess the validity and reliability of the SIAM, we conducted a simulator study measuring acceptance and acceptance attitude at several points of measurement before, during, and after a drive through complex scenarios. Herein, we applied the item scales of the TAM, VDL, and SIAM to compare the measurement approaches. Further scales were used for different research questions, which are not considered further here.

2.1 Sample

In total, 87 participants were recruited for the present study. Thereof, fourteen participants dropped out due to technical issues and nine participants aborted the drive due to experiencing symptoms of simulator sickness, leading to a final sample size of N = 63 participants. There were twelve female participants (7.56%). Participants were aged from 21 to 61 years (M = 34.04, SD = 10.19). Participants received a pre-questionnaire including questions on demography prior to their participation in the study, which was not completed by six participants. Hence, these are not included in the demographic sample description provided here.

2.2 Driving simulator and system settings

The study was conducted in the BMW Driving Simulator Centre, an extensive research centre [55]. We used a high-fidelity driving simulator suitable for scenarios of varying complexity. A full-scale BMW one-series vehicle mock-up was used. The motion system included a medium-sized hexapod with six degrees of freedom. Visualization was implemented with LED walls surrounding the simulator platform in a 220° field of view. The driving scene was implemented with the BMW simulation framework Spider [56].

After the participant activated the Level 2 driving automation, the vehicle supported the driver laterally and longitudinally, including keeping speed and distance from other vehicles. The activation was shown with different light segments and icons on the steering wheel, instrument cluster, and head-up display. Further, system-assisted lane changes were conducted. Additionally, the system supported the driver at traffic lights and in intersections. In this study, the different systems were communicated as one overall system and, according to the definition of Level 2 systems, responsibility remained with the driver. Further, the study design was positively evaluated by the internal ethics board and piloted before inviting participants.

2.3 Design

The study was conducted as a between-subjects design including the factor system introduction. One group was fully introduced to the system's functionalities; the other only received an incomplete instruction. In this way, construct validity can be assessed by testing whether the single item is capable of distinguishing between the two groups, with higher acceptance for the full introduction group and lower acceptance for the incomplete introduction group. Established measures of acceptance (TAM, VDL) were moreover included in the study to demonstrate the SIAM’s convergent validity. Strong correlations of the proposed SIAM with the TAM and VDL can be interpreted as evidence of convergent validity (see [41]). Predictive validity (see [54]) can be shown by including the SIAM in the TAM structure predicting behavioral intention to use.

2.4 Procedure and use cases

For an overview of the procedure see Fig. 2. Upon arrival, participants were briefed on the general study setting, COVID-19 specific precautionary measures, and the consent forms for data privacy and simulator use. In the latter, the participants were informed that they could end the experiment at any time without explanation. The data privacy form included, e.g., explanations on collected data, storing procedures and the rights of the participants regarding their data. The demographic questionnaire was filled out online beforehand to keep direct contact low due to the specific COVID-19 arrangements. Upon arrival at the simulator site, participants were assigned either to the full introduction or to the incomplete introduction group. After completing a short pre-questionnaire, participants either received detailed instructions explaining the functionality and capabilities of the driver assistance systems which would be used throughout the study drive (full introduction group) or received an incomplete briefing on the functionality of the system (incomplete introduction group). The incomplete introduction group received fewer details regarding system behavior, limitations, and visualizations. After the instruction, and hence before initial system exposure, participants were asked to rate their initial acceptance using the TAM, the VDL and the SIAM.

Upon entering the simulator, participants received safety instructions for the simulator and were again briefed to stop the experiment if they felt any indicators of simulator sickness. In addition, the experimenter carefully watched for any signs of simulator sickness during the drives. After a short manual familiarization highway drive, the experimental drive was initiated. There were four different driving scenarios, in each of which participants drove from a highway section over a rural sequence into an urban environment. In the urban drive, the participants had to pass several traffic lights and turns with different numbers of pedestrians crossing. Further, the scenarios varied slightly regarding the surrounding traffic, speed limits, and overall visibility of the course. Additionally, an edge case scenario was implemented to present an SAE Level 2 system [1] which needs to be supervised. Participants were asked to use the system as much as possible during the four use cases. Each scenario took approximately 15–20 min to complete depending on traffic and driving. For a higher immersion, interacting pedestrians were included in the urban use cases. After the highway exit and at the end of the city drive, participants were asked for their acceptance with the SIAM and VDL while the vehicle was standing. After the whole driving section, acceptance was measured again with the TAM, the VDL and the SIAM. As this study was part of a bigger experiment, other scales were included and rated before, during and after the drive; these are not considered further in the present paper.

Fig. 2 Experimental procedure

2.5 Dependent variables

Acceptance was measured based on the TAM [10] and the VDL [14]. This was compared with the single-item acceptance attitude measurement approach SIAM. All items were asked one after another without distinguishing between the constructs. Only the VDL, with its differing scale, could possibly be distinguished. The participants were asked to respond to the items one by one verbally during the drive and with an online form before and after the drive. The used and adapted items can be seen in Table 2. As the SIAM was included in a bigger questionnaire in the current study, we suggest a simple application in Appendix 1.

The TAM items were measured on a 7-point Likert scale ranging from strongly disagree to strongly agree. The scales for the VDL were slightly adapted in their wording (see [57]) and, because participants responded verbally while driving, the fields were labeled with numbers ranging from one to five. According to the statistical analysis suggested by van der Laan et al. [14], the data were transformed into a scale ranging from −2 to +2. All items were translated into German.
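The rescaling of the spoken VDL responses can be sketched as follows. This is only a minimal illustration, assuming that responses are coded as integers from 1 to 5 and that any reverse-keyed items have already been flipped so that higher numbers are more positive; the function names and the example responses are hypothetical and not taken from the study.

```python
def rescale_vdl(response: int) -> int:
    """Map a verbal 1..5 response onto the -2..+2 range used for VDL scoring."""
    if response not in (1, 2, 3, 4, 5):
        raise ValueError(f"expected a response from 1 to 5, got {response!r}")
    return response - 3


def subscale_score(responses: list[int]) -> float:
    """Average the rescaled item responses of one subscale."""
    rescaled = [rescale_vdl(r) for r in responses]
    return sum(rescaled) / len(rescaled)


# Hypothetical answers of one participant to the five usefulness items.
print(subscale_score([4, 5, 3, 4, 4]))  # 1.0
```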

Table 2 Scales and related items

2.6 Statistical analysis

IBM SPSS Statistics 26 [59] software was used for statistical analysis. First, we tested the internal consistency of the scales and estimated the reliability of the single item using the correction for attenuation formula [60]. Second, we tested validity. Third, several linear regression analyses and a mediation analysis were carried out to test the SIAM as attitude measure in the TAM.

2.6.1 Internal consistency of scales

As the scales used were translated into German and therefore differ slightly from the original versions, we calculated the reliability of the different scales. Cronbach's alpha was used to assess internal consistency. Values above .70 were used as the criterion [61]. For the single item, Cronbach’s alpha cannot serve as a measure of internal consistency. For the reliability of the SIAM, we used the correction for attenuation formula as suggested by Wanous and Hudy [60], because the study setting did not allow measuring the preferable test-retest reliability. The reliability of a single item is hereby estimated based on its correlation with another scale measuring the same construct (in this case, the VDL) and that scale’s reliability.
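Both reliability estimates can be sketched in a few lines. The sketch below is illustrative only and does not reproduce the SPSS analysis: the function names and example values are hypothetical, and the default `true_corr = 1.0` (the single item and the scale are assumed to measure exactly the same construct) is an assumption that yields a conservative lower-bound estimate. The attenuation relation r_xy = true_corr · sqrt(r_xx · r_yy) is solved for the single-item reliability r_xx.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha; `items` holds one list of participant scores per item."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sum score per participant across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def single_item_reliability(r_xy: float, scale_reliability: float,
                            true_corr: float = 1.0) -> float:
    """Solve r_xy = true_corr * sqrt(r_xx * r_yy) for the item reliability r_xx.

    r_xy: observed correlation between the single item and the multi-item scale.
    scale_reliability: reliability (e.g. Cronbach's alpha) of the multi-item scale.
    true_corr: assumed correlation between the underlying constructs (assumption).
    """
    return r_xy ** 2 / (true_corr ** 2 * scale_reliability)


# Hypothetical values: observed correlation of .80, scale alpha of .90.
estimate = single_item_reliability(0.80, 0.90)
print(round(estimate, 2))
```

Assuming a true correlation below 1.0 raises the estimate, so the choice of `true_corr` should be reported together with the result.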

2.6.2 Validity: VDL and SIAM

We computed Pearson correlations between the VDL and the SIAM at all points of measurement to test convergent validity (see [41]). The linearity assumption was checked with scatter plots of VDL and SIAM, and statistical outliers were checked with boxplots. The normality assumption was not tested, as the sample size was > 30 and the correlation is therefore robust against violations of normality [61]. For the group differences approach [54], we assessed whether a difference in acceptance as measured by the VDL or the SIAM occurred between the full introduction group and the incomplete introduction group.
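The two statistics behind these checks can be sketched with the standard library alone. This is a minimal illustration with hypothetical function names, not the SPSS procedure used in the study. The group comparison assumes a pooled-variance (Student) t-test, and the p-value, which would be read from a t distribution with the returned degrees of freedom, is omitted.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of ratings."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def student_t(group_a: list[float], group_b: list[float]) -> tuple[float, int]:
    """Independent-samples t statistic with pooled variance, plus its df."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical ratings of a full and an incomplete introduction group.
t, df = student_t([2, 3, 4], [1, 2, 3])
print(f"t({df}) = {t:.2f}")
```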

2.6.3 Validity: predicting behavioral intention

First, we checked for linearity, outliers, autocorrelation, multicollinearity, and homoscedasticity of residuals as well as normality of residuals. To predict the behavioral intention (BI), we conducted several linear regression analyses, first with the attitude measure only and then including the TAM factors perceived ease of use (PEOU) and perceived usefulness (PU). This was calculated twice, using the VDL and the SIAM as attitude measure. We further checked for mediation using the procedure described by Baron and Kenny [62] and applied to the TAM by Rahman et al. [3]. For this, we conducted three regression analyses according to Fig. 1. First, the mediator was regressed on the independent variable (SIAM/VDL on PU). Second, the dependent variable was regressed on the independent variable (BI on PU). Finally, the dependent variable was regressed on both the mediator and the independent variable (BI on SIAM/VDL and PU). Each step must yield a significant effect, and in the third step the effect of PU on BI must be smaller than in step two, or no longer significant.
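The three regression steps of the mediation check can be sketched as follows. This is a simplified illustration with hypothetical names and made-up data: it only computes the unstandardized slopes via ordinary least squares and leaves out the significance tests that each step additionally requires.

```python
from statistics import mean


def _sxy(x: list[float], y: list[float]) -> float:
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))


def slope(x: list[float], y: list[float]) -> float:
    """Unstandardized slope of y regressed on a single predictor x."""
    return _sxy(x, y) / _sxy(x, x)


def slopes_two_predictors(x1, x2, y) -> tuple[float, float]:
    """Unstandardized slopes of y regressed on x1 and x2 simultaneously."""
    s11, s22, s12 = _sxy(x1, x1), _sxy(x2, x2), _sxy(x1, x2)
    s1y, s2y = _sxy(x1, y), _sxy(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


def baron_kenny(pu, attitude, bi) -> dict[str, float]:
    """Slopes for the three steps: a (PU -> attitude), c (PU -> BI),
    and b / c_prime (attitude and PU -> BI)."""
    a = slope(pu, attitude)                               # step 1
    c = slope(pu, bi)                                     # step 2, total effect
    b, c_prime = slopes_two_predictors(attitude, pu, bi)  # step 3
    return {"a": a, "c": c, "b": b, "c_prime": c_prime}


# Made-up ratings in which BI depends on attitude only (full mediation).
pu = [1, 2, 3, 4, 5, 6]
attitude = [2, 1, 4, 3, 6, 5]
bi = [2 * m for m in attitude]
result = baron_kenny(pu, attitude, bi)
print({name: round(value, 2) for name, value in result.items()})
```

In this constructed example the direct effect `c_prime` drops to zero once the mediator is entered, while the total effect `c` equals `a * b`, which is the pattern a full mediation would produce.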

3 Results

3.1 Reliability of scales

For internal consistency, Cronbach’s alpha was calculated for every subscale in the pre- and post-questionnaire. For the VDL, every scale showed high internal consistency, with Cronbach’s alpha above .70. For the TAM, the subscales PU (αpre = .85; αpost = .90), PEOU (αpre = .80; αpost = .76), and BI (αpre = .88; αpost = .93) all showed good internal consistency.
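For reference, the internal consistency coefficient reported here can be computed as in the following sketch. The function name and the data layout (one list of scores per item, with participants in the same order) are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale given as a list of per-item score lists."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    n = len(items[0])
    if n < 2 or any(len(scores) != n for scores in items):
        raise ValueError("all items need the same number (>= 2) of respondents")
    # Sum score per respondent across all items.
    totals = [sum(row) for row in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when the sum scores do not vary")
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)
```

Because the formula relates the item variances to the variance of the sum score, it requires at least two items, which is why it cannot be applied to the SIAM.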

The estimated reliability of the SIAM as calculated based on the correction for attenuation formula [60] and the overall VDL internal consistency (Cronbach’s alpha) is presented in Table 3.

Table 3 Reliability of scales before, during the scenarios for Highway (HW), City and after the drive

3.2 VDL and SIAM

There was a strong positive correlation between the SIAM and the VDL subscales Usefulness and Satisfaction as well as the overall acceptance measured by the VDL, ranging from r = .64 to r = .89 across the different points of measurement (Table 4). Next, for the group differences approach [62], the full introduction group and the incomplete introduction group were compared on the SIAM and VDL measures. Contrary to expectations, neither the VDL, t(61) = 0.52, p = .603, nor the SIAM, t(61) = 0.85, p = .399, reflected a significant group difference.
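The group comparison can be illustrated with an independent-samples t statistic. The sketch below assumes the pooled-variance (Student) variant, which is consistent with the integer degrees of freedom reported above but is not stated explicitly in the text. The function name is hypothetical, and the p value is omitted because it requires the t distribution.

```python
import math


def pooled_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    m1 = sum(group_a) / n1
    m2 = sum(group_b) / n2
    ss1 = sum((v - m1) ** 2 for v in group_a)
    ss2 = sum((v - m2) ** 2 for v in group_b)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    if pooled_var == 0:
        raise ValueError("t is undefined when neither group varies")
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df
```

With this variant, the degrees of freedom equal the total number of participants in both groups minus two.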

Table 4 Correlations of SIAM and VDL before, during the scenarios for Highway (HW), City and after the drive

3.3 TAM and SIAM

The SIAM predicted a large proportion of variance in BI before the drive, adj. R² = .44. Attitude as measured by the VDL showed similar results (Table 5). Attitude and PU also significantly predicted BI (Table 5). Further, when the VDL was replaced by the SIAM as the attitude measure in the TAM, a similar proportion of variance was explained, adj. R² = .47, as when the VDL was used, adj. R² = .47. Additionally, PU and PEOU predicted the SIAM just as well as the VDL (Table 5). A similar pattern of results emerged in the post-drive ratings shown in Table 5.
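The adjusted coefficient of determination reported for these models corrects the explained variance for the number of predictors. A minimal sketch is given below. The function name and the example values are hypothetical and are not the study's data.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a linear model with n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical example: R^2 = .50 with 11 observations and 1 predictor.
print(round(adjusted_r2(0.50, n=11, p=1), 3))  # 0.444
```

Because the correction grows with the number of predictors, adjusted values allow a fairer descriptive comparison between the single-predictor model and the models including PU and PEOU.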

The mediating effects of attitude could be confirmed for the pre- and the post-questionnaire. The structure of the TAM with PU influencing attitude and attitude predicting BI was confirmed. PU taken alone significantly predicted BI, Bpre = 0.65, Bpost = 0.82, p < .001, and PU predicted the SIAM as attitude measure, Bpre = 0.76, Bpost = 0.88, p < .001. When the SIAM was entered into the regression model, it significantly predicted BI, Bpre = 0.47, Bpost = 0.70, p < .001, and the effect of PU on BI was reduced, Bpre = 0.29, p = .036, Bpost = 0.21, p = .155. The mediating effects were also shown for the TAM when the VDL was entered as attitude measure, which reduced the effect of PU on BI, Bpre = 0.41, p = .001, Bpost = 0.46, p = .003 (Table 5).

Table 5 Linear regressions and mediation analysis for the TAM using VDL and SIAM as attitude measure

4 Discussion

In the present study, we explored the applicability and validity of a single-item approach to assess acceptance attitude (SIAM) towards driver assistance systems in applied settings. To investigate the SIAM’s construct validity, a driving simulator study was performed in which the SIAM, the VDL and the TAM were rated throughout an assisted drive in complex traffic scenarios.

The SIAM showed consistently high correlations with attitude as measured by the VDL across the different points of measurement, supporting its convergent validity. Neither the VDL nor the SIAM revealed a difference between the fully introduced and the incompletely introduced group. The manipulation was thus not confirmed, which is why the group differences approach [62] could not be used as an indicator of the SIAM’s construct validity. Because both the experience with the system and simulator driving were new to the participants, the system may have been evaluated positively regardless of how it was introduced. Further, a comparable proportion of variance in BI was explained by the TAM using either the VDL or the SIAM as a measure of attitude, e.g. [3]. Descriptively, the TAM explained an even larger proportion of variance using the SIAM than when using the VDL as attitude measure. The mediation analysis proposed by Baron and Kenny [63] suggests that the SIAM is influenced by PU and influences BI, hence confirming that the SIAM matches attitude as positioned in the TAM structure. The results were consistent for data collection before and after experiencing the driving situations. Taken together, these outcomes show that the SIAM can supplement the usually applied VDL as attitude measure within the TAM, supporting our hypothesis. It could further be used as a valid substitute for the VDL as acceptance attitude measure if the methodological approach requires a quick and efficient measurement of acceptance. This emphasizes the capability of the SIAM to capture attitude in the acceptance context. The SIAM represents a holistic approach allowing for an economic rating of an acceptance attitude proxy of new systems while driving. The explored single-item approach therefore offers a considerable practical benefit, as an approximation of acceptance attitude can be measured quickly and directly during the driving situation.

The SIAM fulfils all the pre-defined requirements to serve as an easy-to-use proxy for the attitude of acceptance. It can be applied repeatedly, which allows the development of acceptance over time to be investigated, it is simple, and it does not require a high degree of elaboration by the participant. Nevertheless, such single-item approaches should only be used if the application context requires it. Regarding reliability, we used the correction for attenuation formula suggested by Wanous and Hudy [61] to provide a rough estimate of the SIAM’s reliability. The internal consistency of the overall VDL and the estimated reliability of the SIAM were comparable. The TAM was also sufficiently reliable. It should be noted, however, that providing indicators of internal consistency cannot replace a retest. The provided reliability estimates, i.e., both Cronbach’s alpha and the indicators provided using the correction for attenuation formula, should accordingly be treated with caution, and the scales’ reliability should be reconsidered in subsequent investigations including retests.

Nevertheless, the existing models including multiple items and factors are indispensable if specific reasons for acceptance are to be measured in more detail. More information can be captured, and the different facets of the underlying construct can be addressed, e.g. [44, 43]. Consequently, the TAM can help to better understand and work on specific parts of acceptance with the aim of achieving positive perceptions of new systems. For the practical context we are aiming at, it is especially important to grasp the key issues during driving in order to further develop new systems in an efficient and user-oriented way. In that case, long item scales are not applicable. Therefore, the SIAM can be an economic tool. To bridge the information gap on detailed reasons for acceptance, additional scales can be used before or after the drive to supplement the method. Besides the existing models, it could be useful to apply open questions after the drive to gain deeper insight into the underlying aspects of the evaluation. That way, the impression during the drive could be captured efficiently and safely while still providing hints for further elaboration of the reasons behind the given answers. Taking the different advantages and disadvantages together, single- and multi-item approaches are not contradictory but rather complementary, with each approach being useful in its respective application context.

4.1 Limitations and future research

Several limitations of the SIAM should be considered. Due to the positive formulation of the single item, there could be a general positive bias in the evaluation. The usually applied VDL, on the other hand, contrasts positive and negative aspects as a semantic differential, which may allow the participant to elaborate the answer more precisely. As the VDL and SIAM showed similar results, such a bias does not appear to differ across the instruments. However, this bias might be stronger in other research settings. The bias and the validity for the specific context could be tested with a short pilot phase or an additional application of the VDL before or after the drive.

Additionally, the version of the VDL used here was reduced in items. The shortened scale has already been used successfully in human-robot interaction, showing differences in acceptance evaluations of negative and neutral conflict resolution strategies of a robot in an interaction scenario [58], and the internal consistency of the scale was high in this context. We therefore rated the scale to be applicable. However, it should be tested whether the comparison of the SIAM with the full VDL leads to different results. Cronbach’s alpha, which is the reliability estimate provided in the present investigation, is positively associated with the number of items included in a scale [63]. Having used the shortened version of the VDL, it can hence be expected that we arrive at lower reliability estimates for both the VDL and the SIAM when compared to a full version. Therefore, the presented reliability estimates are more likely underestimated than overestimated.
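The relation between scale length and Cronbach’s alpha can be illustrated with the standardized alpha (Spearman-Brown) formula. The sketch below assumes a fixed mean inter-item correlation. The function name and the values are hypothetical and do not correspond to the item counts or correlations of the scales used here.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized alpha for k items with mean inter-item correlation mean_r."""
    if k < 2:
        raise ValueError("alpha requires at least two items")
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical example: doubling the number of items at the same mean correlation.
print(round(standardized_alpha(4, 0.5), 2))  # 0.8
print(round(standardized_alpha(8, 0.5), 2))  # 0.89
```

At a constant mean inter-item correlation, adding items increases alpha, which is why a shortened scale is expected to yield a lower estimate than its full version.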

Furthermore, no significant difference was observed between the experimental groups with either the VDL or the SIAM. The treatment thus failed to induce a difference in attitude towards the system, that is, the instructions could not generate a suitable difference in the acceptance attitude. As neither scale reflected a group difference, this cannot be taken as proof of a disadvantage of the SIAM compared to the VDL. However, considering that convergent validity is a rather weak form of validation, future investigations should again aim at investigating whether differences in acceptance can be reflected by the SIAM. It could additionally be an option to retest the SIAM by comparing two differently acceptable systems instead of manipulating the knowledge about the same system. Additionally, it would be useful to further validate such a single-item approach using the measure by itself without additional scales in order to control for influencing effects.

Additionally, several disadvantages concerning single items apply in this context, as described earlier. First, reliability is lower by definition, as internal consistency using Cronbach’s alpha can only be measured with multi-item scales by correlating the scale’s items with each other [42]. Second, less information can be extracted because fewer response categories are available to the rater, resulting in a lower ability to discriminate and less sensitivity for the underlying factors [43]. These are important facts to consider when using single-item measures. However, as single items have been shown to offer several benefits [43, 41], the economic advantages should be emphasized in this context.

Further, the SIAM was applied in German. To test its applicability in other languages, it should be translated and validated in the respective languages.

Additionally, the generalizability of the results should be investigated. As the item is designed in a very general way, it should be applicable in different contexts. Not only different driving scenarios but also other user groups, such as elderly people, could benefit from such simple measures, as outlined by Braun et al. [36]. However, the transfer to other research areas should be investigated further.

Finally, we only tested the SIAM for its validity, but not its actual applicability while driving. As this might differ, e.g., depending on the drivers’ capabilities and situation complexity, further research is needed to examine the safe usage of inquiries while driving. Even the use of such very short measures while driving must be considered in the context of the situation, and they should only be used if safely applicable.

5 Conclusion

The present paper showed that the SIAM as a single-item measure could serve as an efficient approximation to gain insights into acceptance attitude in specific applied research settings. The SIAM was found to correlate highly with the VDL, demonstrating its convergent validity. The mediation analysis further demonstrated that the SIAM could be used analogously to the VDL as a more economic attitude measure within the TAM. The SIAM is thus a useful tool to gain quick insight into the acceptance attitude at multiple stages during a study if the setting requires a more economic measure. The SIAM therefore appeals due to its simplicity and efficiency. However, the psychometric drawbacks of such measures should be kept in mind and important advantages of multi-item scales should be considered. Additional scales may for instance be included in post-questionnaires to gain deeper insights into underlying reasons for the ratings.