1 Introduction

According to the Society of Automotive Engineers (SAE 2018), highly automated (Level 4) driving is “the driving mode-specific performance by an automated driving system of all aspects of the dynamic driving task, even if a human driver does not respond appropriately to a request to intervene”. Examples of such vehicles are the automated buses and shuttles mostly used for demonstration and research purposes in many cities around the world, where conventional vehicle controls such as steering wheel and pedals are removed. In these vehicles, a “safety driver” monitors the system in case of emergencies, or system limitations, and some form of joystick and/or button is used to control the vehicle, if and when required. These are usually pod-like vehicles, which can carry around 12–15 passengers, and travel at low speeds (less than 20 km/h) along pre-defined routes (Alessandrini et al. 2015). Examples of such vehicle demonstrations, normally used for first/last mile transport, include those deployed by the CityMobil2 (Alessandrini et al. 2015), GateWay (GateWay 2016), and NS WePods (WePods 2016) projects in Europe, or the Navya trials in Las Vegas (Navya 2017).

Due to complexities of the urban environment, including its varied infrastructure, there is a mix of traffic between automated and non-automated vehicles, as well as other road users such as pedestrians and cyclists. Conventional vehicles with Level 4 capability are not yet available on our roads, although Original Equipment Manufacturers (OEMs) and new players of automated vehicle technologies, such as Waymo, Google and Apple, are currently conducting trials in this context in different cities around the world. In addition, national and European projects such as HumanDrive (https://humandrive.co.uk/), and L3Pilot (https://www.l3pilot.eu/) are working with a number of vehicle manufacturers to conduct on-road trials of automated driving at SAE Levels 3 and 4. However, in terms of human factors, the focus in these trials has purely been on the on-board users’ experience of the system, rather than any interactions with external road users.

In recent years, the interest in how, and whether, these vehicles should communicate and interact with other road users in complex urban traffic (including other drivers, and vulnerable road users (VRUs) such as pedestrians and cyclists), has increased (see Madigan et al. 2019; Rasouli and Tsotsos 2018; Rothenbucher et al. 2016). However, currently, official consideration of the value or success of bilateral interaction and communication methods between AVs and other road users is at a very early stage, although recent efforts by OEMs suggest that there may be an appetite for providing some form of externally facing communication by the AV, as evidenced by promotion of various prototypes and concepts in the general media (Lagström and Lundgren 2015).

Although there is currently a limited understanding of the effects of these designs on actual road-user behaviour, example prototypes have included different forms of lighting on vehicles to signify automation state (on/off) and movement intention (stopping/starting/decelerating), as well as use of light reflections on roads and pavements, to provide messages and information. Discussions on this topic have also been initiated by standardisation organisations such as the United Nations Economic Commission for Europe (UNECE), the Society of Automotive Engineers (SAE J3016™) and the International Organisation for Standardisation (ISO, TC22/SC39/WG8 2020). Finally, following early reports from road users that some form of external communication method for acknowledging detection and relaying movement information by AVs may be useful (e.g. Merat et al. 2018; Schieben et al. 2018), a number of current and recent research projects have begun to examine the value of different forms of externally facing interfaces (collectively termed external human machine interfaces or eHMIs) on pedestrians’ crossing behaviour (e.g. Ackermann et al. 2018; Clamann et al. 2017; Fridman et al. 2017). Studies in this context, which attempt to investigate road users’ reactions to, and impressions of, these eHMIs have relied on a number of techniques, including use of Wizard of Oz studies, which provide a “fake” automated vehicle (e.g. Habibovic et al. 2016; Rothenbucher et al. 2016), or virtual reality/head mounted displays (e.g. Ackermann et al. 2018; De Clercq et al. 2018; Deb et al. 2018; Hudson et al. 2019). Some studies have found beneficial effects of the presence of an eHMI in helping pedestrians make safe crossing decisions (e.g. Dey et al. 2017) and increasing their perceived safety around AVs (e.g. Habibovic et al. 2016). However, it is important to ensure that the eHMI conveys the intended message, as Lee et al. (2019) found that the same eHMI could potentially be communicating two very different messages. For instance, their study showed that a pulsing light-based cue combined with a fast auditory sound cue was rated as the best signal for conveying two different messages, ‘I am giving way’ and ‘I will start moving’. Other studies have failed to find any improvements in road users’ comprehension of AV intentions with the inclusion of eHMI and have suggested that road users focus more on the movement patterns and speed of the vehicle to make their decisions (e.g. Clamann et al. 2017). To date, the focus of these studies has been on the factors influencing pedestrian and other road users’ crossing decisions, and there have been no explorations of critical conflict scenarios.

1.1 Observations of current traffic interactions

In addition to studying road users’ interactions with fabricated AVs, one area of research which may be beneficial for providing information in this field is observation of road user behaviour in current settings, to understand what cues pedestrians and car drivers use for successful interaction and communication in a mixed traffic environment.

Road user interactions have been studied by a range of research disciplines. These include studies from a traffic conflict and safety perspective, and investigations based on communication- and linguistics-based disciplines (see Markkula et al. 2020 for summary). As early as the 1960s/1970s, Goffman (1971/2010) was using a sociological perspective to identify how humans moving as “mobile shells” or “vehicular units” do not move irrespective of other such units, but are sensitive to, and recognise them, coordinating and adjusting their movements, to avoid collision and mutual obstruction. His work highlights how road users are repeatedly faced with challenges around how they can make their actions recognisable by others, and how they can establish, and maintain, mutual orientation with others, to engage in joint coordination of their mobile actions (Haddington and Rauniomaa 2014). Goffman identified how interaction rituals develop, whereby rules for behaviour are often governed by the frame through which participants interpret, and define, the situation at hand. In traffic scenarios, this interpretation is often influenced by road infrastructure, traffic rules, cultural expectations, etc. (Markkula et al. 2020). This type of work provides a strong theoretical framework for understanding the, often subtle, manner in which road users coordinate their movements, based on slight changes in trajectory, or informal communication chains, and how these movement patterns may differ across different locations/cultures.

Previous work regarding interaction of pedestrians in present road settings suggests that different forms of non-verbal interaction (such as eye contact, hand and/or head movements, or vehicle-based signals) are used. This form of communication is thought to be used by drivers to confirm to vulnerable road users (VRUs) that they are noticed, or given priority, in an interaction situation, i.e. a situation where the behaviour of at least two road users can be interpreted as being influenced by the possibility that they are both intending to occupy the same region of space at the same time in the near future (Markkula et al. 2020). This form of communication is also thought to increase compliance with traffic instructions and rules (Hamlet, Axelrod and Kuerschner 1984; Kleinke 1977). Such findings are used, for instance, by policy makers, such as the US Department of Transportation, to recommend that pedestrians ‘make eye contact with drivers as they approach you to make sure you are seen’ (California Department of Motor Vehicles 2015) and ‘Make eye contact with the drivers of approaching vehicles whenever possible.’ (Arizona Department of Transportation 2018) and also ‘Make eye contact with drivers before you cross the street’ (Alberta Transportation 2013).

Thus far, research studies attempting to understand how drivers and pedestrians communicate in current urban settings have delivered mixed results, with some studies suggesting that eye contact does play a role (Guéguen et al. 2015; Walker 2007; Sucha 2014), while others report that eye contact does not play a major role (Dey and Terken 2017), and that sometimes the presence of drivers is not even perceived (Risto et al. 2017; Sucha et al. 2017; Straub and Schaefer 2018). Explicit communication is also reported to be ‘rare to non-existent’ (Dey and Terken 2017; Rasouli et al. 2018), whereas the dynamic movement and behaviour of vehicles, such as speed changes (Várhelyi 1998), inching forward at intersections (Wang et al. 2014) and stopping behaviour (Dey and Terken 2017) are reported as more important cues.

Currently, over a third of globally reported traffic-related deaths and injuries occur between pedestrians and vehicles (Crandall et al. 2002; World Health Organisation 2013). A number of factors contribute to these figures, including the fact that pedestrians can determine their own course within the urban space, as they think best (Jian et al. 2005). Such flexibility also causes pedestrians to be one of the most unpredictable groups of road users, who cannot be effectively controlled by regulations (e.g. Jian et al. 2005; Cambon de Lavalette et al. 2009; Ward et al. 1994). Therefore, as the move towards higher levels of automation in vehicles increases, and the driver’s responsibility for the control and manoeuvring of the vehicle is removed, it is reasonable to imagine that other forms of communication must be provided by the AV, to replace the driver’s role in this context. The hope is that this will ensure the same level, if not better, compliance with road rules and regulations, by these new forms of transport, ideally leading to higher trust and acceptance of AVs, and assisting in successful traffic flow between all actors in a mixed urban environment of the future, without compromising road safety.

As well as understanding how generic, and internationally agreed, rules of interactions between road users might inform AV behaviour, the role of ‘social norms’ in this context is also important (Goffman 2010; Rasouli and Tsotsos 2018). This includes, for example, understanding whether pedestrians from a particular region are more likely to use hand gesture as a form of communication while crossing the road, or whether the approaching car always yields to the pedestrians in certain regions, more than others. These are the informal rules which play a vital role in road users’ interactions, including the ability to predict intention and behaviour (Evans and Norman 1998; Farber 2016; Lee and Sheppard 2016; Wilde 1980). Social norms are thought to differ from region to region, and even for different scenarios, and use cases (Straub and Schaefer 2018), which makes their adoption by AVs challenging. In addition, these norms may impact on road users’ preferences about how an AV communicates with them, e.g. through auditory or visual cues (see Merat et al. 2018). Conte et al. (1998) mentioned that norms could make social behaviour more predictable. Therefore, it can be argued that conforming to social norms may increase an AV’s safety, efficiency and acceptance on the road.

1.2 Current study

To further understand how pedestrians interact with vehicles in a mixed traffic setting, and establish whether social norms influence this behaviour, the current study used a newly developed observational protocol, and a follow-on on-site questionnaire, applied at six observed locations, in three European cities (Leeds, Munich and Athens). It should be noted that the focus of this study was on identifying communication patterns which may be possible to replicate or enact, using technology available to automated vehicles, and thus the focus was on easily identifiable explicit and implicit cues. The study aimed to explore a number of issues: (1) Do approaching vehicles provide explicit communication messages? (2) What types of vehicle movement behaviours are used to convey messages? (3) What movements do pedestrians make on the approach to a crossing point and during a crossing? The on-site questionnaire provided a further method for exploration of the use of vehicle- and driver-based information for pedestrians’ crossing decisions, along with the manner in which pedestrians themselves believe they convey their crossing intentions.

2 Method

2.1 Observation sites

Six different observation sites were chosen across the three countries (see Fig. 1). These sites were chosen following a workshop between interACT project partners, which included the selection of suitable use cases for investigation (Wilbrink et al. 2017). Sites were chosen based on the criteria that communication would occur between road users, and that it was an urban area with a designated speed limit of 50 km/h. A separate set of studies has explored the types of communication occurring between pedestrians and vehicles in low-speed areas (see Uttley et al. 2020, under review).

Fig. 1

Top left to right: The intersections at Leeds, Athens and Munich. Bottom left to right: Munich traffic island (two sections), Munich jaywalking, and Munich zebra crossing. Yellow arrows represent the location and direction of pedestrians’ crossings. The blue and green lines represent where the approaching vehicles were coming from, and their respective arrows show the directions in which they were travelling. The red stars represent the location of the group of two observers; the questionnaire administrator was mobile while administering the questionnaire (color figure online)

To maximise the number of interactions between pedestrians and vehicles, one of the “must have” use cases chosen was one which included the need for a vehicle to ‘react to crossing of non-motorised traffic participants at crossings without traffic lights’ (Wilbrink et al. 2017). The final locations for each city were chosen due to regular and numerous crossing opportunities for pedestrians, in the presence of approaching vehicles, which provided opportunity for frequent interactions between vehicles and pedestrians. Table 1 provides more information about each location, such as the map coordinates, location descriptions, road users’ established priority rules, as well as the road’s posted speed limit. Particular effort was placed on ensuring similarities between the sites as much as possible, although, for practical reasons, such as availability of suitable locations for video-recordings (not reported here), some compromises had to be made. In particular, these locations allowed a clear observation of where pedestrians looked during road crossings, along with providing some information of the different ways in which drivers might communicate with other road users in un-signalised settings, where priority is unclear. Although this type of interaction might change across different infrastructure and driving environments, the inclusion of multiple sites allowed us to draw some conclusions on what typical interaction patterns might look like, and whether these were different across the three cities.

Table 1 More information about the observation sites

2.2 Developing the observation protocol

To provide a uniform method for recording behaviour at the chosen sites, an observation protocol was designed and converted into a digital application by the project team, which was then used for all locations. This application was usable on different electronic devices, available off-line, and also allowed the synchronisation of data between the video recordings (not reported here) and questionnaires (see also Dietrich and Ruenz 2019).

This pedestrian–vehicle observation protocol consisted of three main sections (see Fig. 2): The first section included 99 ‘event types’, which captured observable behaviours of the pedestrians and vehicles, as they interacted with each other at each location, such as their looking behaviour, any observed hand gestures, signals and movements (see Camara et al. 2018). The second section of the application allowed recording of environment-based information, such as time of day and weather conditions. This section also included a recording of participants’ demographic data (see Table 2). The third section provided a schematic representation of the junction, which allowed observers to draw any further details of the interaction, not recorded on the app, such as the direction of movement of the observed pedestrian and vehicle (not reported here).
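As a rough illustration of the three-section layout described above, one observation record could be represented as follows. This is only a sketch: the class name `Observation` and every field name are hypothetical and are not taken from the project's application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    """One record of the three-section protocol (all field names are hypothetical)."""

    site: str
    # Section 1: event types observed for the pedestrian and the vehicle/driver
    event_types: List[str] = field(default_factory=list)
    # Section 2: environment-based and demographic information
    time_of_day: str = ""
    weather: str = ""
    age_group: str = ""
    # Section 3: free-form notes standing in for the junction sketch
    sketch_notes: str = ""


# Made-up example record, for illustration only
record = Observation(
    site="Leeds intersection",
    event_types=["pedestrian looked at vehicle", "vehicle decelerated"],
    weather="overcast",
    age_group="young adult",
)
print(record.site, len(record.event_types))
```

Keeping the event types as a list per record reflects the fact that a single interaction can contain several observed behaviours.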

Fig. 2

A screen shot of the observation protocol application

Table 2 Demographic data of pedestrians at each of the six observation sites

2.3 Developing the post crossing questionnaire

To understand what factors pedestrians use for their crossing decisions, and whether vehicle- and driver-based information is used in this decision-making process, a short, 15-item, questionnaire was devised by the project team. These 15 questions included (1) Where are you travelling from? (5-item multiple choice question, MCQ) (2) Where are you travelling to? (5-item MCQ) (3) How regularly do you use this crossing? (7-item MCQ) (4) How safe did you feel during that crossing? (4-item Likert scale from very unsafe to very safe) (5) Why? (6) What information from the vehicle, if any, did you use to decide it was safe to cross? (8-item MCQ) (7) What information from the driver did you use to decide it was safe to cross? (6-item MCQ) (8) Was there any other information you used to determine how safe it was to cross? (9) How long did you feel you were waiting to find a suitable crossing gap? (3-item Likert from longer than usual to shorter than usual) (10) Did the presence of other people affect your decision of when to cross? (11) If so, in what way? (12) How did you indicate your intention to cross the road? (13) Who do you think has priority in this situation? (14) Are you a car driver? and (15) the ARUBQ questionnaire (see Elliott and Baughan 2004).

Here, pedestrians were also asked to report if they use any form of gesture and body language to communicate their intentions when crossing the road. The questionnaire was developed in English and translated into German and Greek by the relevant project partners. Before final administration, each translation was then checked by an independent German- or Greek-speaking colleague at Leeds. Due to its length, only some sections of the questionnaire (questions 6, 7, 8, 12), regarding pedestrians’ use of cues, are included here.

2.4 Procedure

Ethical approval for this study was granted by the University of Leeds Ethics Committee (AREA 17-010). The studies were conducted on weekdays, between November and December 2017, with the weather conditions mostly sunny, or overcast, at all locations. Rush-hour times were avoided as it became too difficult to capture the details of interactions between two specific individuals. The same procedure was adopted for each observation site, with three observers for each site sharing the task of completing the observation protocol, and follow-on questionnaire (Sect. 2.3). As shown in Fig. 1, the three researchers (two observers and one questionnaire administrator) were positioned in a group, at a location, which was far enough from the intersection to avoid influencing the behaviour of the road users, but close enough for observing and recording the interactions.

The observers’ main task was to identify a potential crossing manoeuvre by a pedestrian and record their behaviour during this exercise. For each observation recorded, one of the observers was responsible for choosing a pedestrian to observe, and the other focussed on the driver. Once the observation team were ready to begin an observation, they selected the next pedestrian whose trajectory suggested that they would be crossing the road, where there was also a vehicle approaching the intersection that they may have to interact with. A recorded observation started when the pedestrian was approximately 2–5 m away from the road (approaching phase) and continued until the pedestrian had arrived on the other side of the road (crossing phase). When observing the pedestrian/driver, the researchers took turns to vocalise the event types observed, with this data captured using a voice recorder. Once the interaction was complete, both observers completed the digital observation protocol together. This procedure was practiced extensively in a series of pilot observation studies lasting around a week, before the main data collection process occurred. In addition to assisting the observers with their roles and responsibilities, and fully familiarising them with their task, these pilot studies allowed sharing of any knowledge across the team, and an update of the app, if required. Following these pilot studies, some categories/sections were altered. For example, the category “clearly made eye contact” was removed after initial piloting showed that this was too difficult to establish with certainty, and the order of items was changed to make the app easier for users to navigate, by clustering the pedestrian and driver behaviours together.

Once a particular pedestrian’s behaviour had been noted using the observation protocol, the questionnaire administrator approached the pedestrian and asked if they would be willing to complete a short questionnaire about the crossing that they had just made at the Leeds, Athens and Munich intersections. If they responded positively, they were provided with a short verbal introduction to the study and asked to sign a consent form. The questionnaire administrator then read the questions aloud, and also presented the questions visually to each participant, noting their answers on the questionnaire document. This change to verbal format followed the pilot study, since participants found it difficult to read the questions on the side of the street, and the process was found to be more efficient when the questions were read aloud. For multiple choice questions, the administrator left a gap after each option which allowed participants to respond immediately, rather than having to memorise the list. Participants were not compensated for their time to respond to the questionnaires, which took approximately 10 min to complete.

2.5 Participants

Table 2 provides a breakdown of the demographic information for pedestrians in each of the observation sites. Pedestrians’ age was estimated to be one of five categories (child, teenagers, young, middle-aged, or older adults). The percentages reported are based on the number of valid cases, excluding cases where this detail was not recorded—see Table 2. Table 3 provides an overview of the demographic data of the pedestrians who took part in the post crossing questionnaire, summarised for each city.

Table 3 Demographic data of the pedestrians at each location for post-crossing questionnaire

A total of 989 observations were made. However, not all observations ended up in an interaction between the pedestrian and a vehicle, and therefore 288 cases were excluded. Thus, there were a total of 701 interactions across the six sites, where a pedestrian and vehicle approached the shared space at about the same time.

3 Results and discussion

3.1 Observation protocol

A total of 99 predefined temporal “event types” were recorded on the application, to define interactions between the pedestrians and vehicles/drivers during the pedestrians’ approaching and crossing phase (see Camara et al. 2018). To allow a more simplified analysis framework, these event types were further collapsed and categorised according to similarities in gesture. For example, original event types ‘waved hand’, ‘raised hand sideward’, and ‘raised hand in front’ were collapsed into a new category described as ‘pedestrian’s hand gestures’. This provided 20 event types, which were defined by the pedestrians’ behaviour (movement, looking behaviour, hand gesture) as well as the vehicles’/drivers’ behaviour (movement, signal used, and drivers’ hand gestures) during the whole course of the crossing (combining the approaching and crossing phase).
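The collapsing step can be pictured as a simple lookup from original event types to broader categories. The sketch below is illustrative only: the three hand-gesture labels and their target category are the example given in the text, while the function name `collapse_events` and the rule that unmapped labels pass through unchanged are assumptions made for this example, not the project's actual procedure.

```python
# Illustrative mapping from original event types to collapsed categories.
# Only these three labels and their category come from the text; the full
# protocol collapsed 99 event types into 20.
COLLAPSE_MAP = {
    "waved hand": "pedestrian's hand gestures",
    "raised hand sideward": "pedestrian's hand gestures",
    "raised hand in front": "pedestrian's hand gestures",
}


def collapse_events(event_types):
    """Return the sorted, de-duplicated collapsed categories for one interaction.

    Labels without a mapping are kept unchanged (an assumption of this sketch,
    so that no recorded behaviour is silently dropped).
    """
    return sorted({COLLAPSE_MAP.get(event, event) for event in event_types})


observed = ["waved hand", "raised hand in front"]
collapsed = collapse_events(observed)
print(collapsed)
```

Using a set before sorting means that two different gestures from the same category count once per interaction, which matches a per-interaction "did this behaviour occur" reading of the collapsed event types.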

The percentages of each event type observed were then calculated for each site (see Table 4). Since a particular observation could include more than one event type, these behaviours were not mutually exclusive, and the data presented in this table, therefore, adds up to more than 100%. The following sections summarise the main findings from the observations.
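To see why the percentages in such a table can sum to more than 100%, consider the following minimal sketch. The function name `event_percentages`, the event labels, and the four interactions are made up for illustration and are not data from the study.

```python
from collections import Counter


def event_percentages(interactions):
    """Percentage of interactions at one site in which each event type occurred.

    `interactions` is a list of collections of event types, one per interaction.
    """
    n = len(interactions)
    if n == 0:
        return {}
    # Count each event type at most once per interaction
    counts = Counter(event for events in interactions for event in set(events))
    return {event: 100.0 * count / n for event, count in counts.items()}


# Made-up data for a hypothetical site with four interactions
site = [
    {"vehicle decelerated", "pedestrian looked at vehicle"},
    {"pedestrian looked at vehicle"},
    {"vehicle stopped", "vehicle decelerated", "pedestrian looked at vehicle"},
    {"pedestrian looked at vehicle"},
]
pct = event_percentages(site)
total = sum(pct.values())
print(pct, total)
```

Because each interaction can contribute to several event types, the denominator is the number of interactions rather than the number of recorded events, so the column total is not constrained to 100%.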

Table 4 The percentage (%) of event types observed at each observation site

3.1.1 Do approaching vehicles provide explicit communication messages?

From the 701 interactions observed across the six different locations, the results showed that there was almost no explicit communication message provided to pedestrians by the vehicles observed (less than 1% of vehicles honked their horn, or flashed their lights). In addition, only about 4% of drivers used hand gestures to communicate with pedestrians. Therefore, in contrast to previously proposed hypotheses (e.g. Wilde 1980; Clay 1995; Sucha et al. 2017), this study suggests that, currently, drivers do not use explicitly observable forms of communication when interacting with pedestrians in these particular urban use cases. Of particular value here are the uniform results observed across the three countries. Of course, it is important to highlight that this absence of explicit messages was observed for the particular range of road crossing situations studied here, where the posted speed limit was around 50 km/h. This is in contrast to studies conducted at sites with lower speed limits (e.g. Schneemann and Gohl 2016). Therefore, further observation studies in different urban environments are needed to ensure these results are representative (Uttley et al. 2020, under review).

3.1.2 Movement behaviour of the approaching car

When pedestrians had priority, such as at the zebra crossing in Munich, the approaching vehicles were more likely to decelerate (74%) and stop (56%) to allow the pedestrians to cross. A less than 100% adherence to the law by these drivers may be due to the, sometimes, unclear placement of pedestrians and possible unpredictability of their intention. Such non-compliance is not unique to this site: Sucha et al. (2017) observed that, in the city of Olomouc, 36% of drivers failed to yield to pedestrians at zebra crossings, even when pedestrians were waiting with the intention to cross (see also Várhelyi 1998).

On the other hand, in situations where the drivers had priority, i.e. in the absence of zebra crossings, observations showed that a much lower number of vehicles stopped or decelerated, with varied behaviours observed across the different infrastructure. For example, no vehicle decelerated or stopped for pedestrians in the jaywalk location in Munich, but 22% of vehicles decelerated, and 23% stopped when pedestrians attempted to cross at the intersection in the same city. In addition, at these sites, drivers were more likely to decelerate (average of 25% across sites) than stop (average of 17% across sites) for pedestrians. This suggests that, in the future, when AVs are adjusting their manoeuvre or yielding for pedestrians, it may be acceptable for them to simply decelerate, rather than come to a full stop.

3.1.3 Pedestrians’ movements

On average, only about 6% of pedestrians were seen to use some sort of hand gesture when crossing the road, although most (89%) were seen to look towards the approaching vehicles. Since the large majority of pedestrians did not use an explicit communicative gesture before crossing the road, identifying their intention to cross is clearly a problematic process for both drivers and, arguably, future automated vehicles.

For all of the sites, most of the observed pedestrians decelerated and/or stopped before crossing the road (average of 80% across sites), but this was noted less often at the Munich zebra crossing (66%) and Athens intersection (56%). Around 39% of pedestrians at the Leeds intersection and 38% of those in Athens continued in their path and did not stop before crossing the road.

In summary, these observation studies, from the 701 interactions across three European cities, seem to suggest that pedestrians and drivers do not, in fact, exhibit explicit communicative gestures to indicate intention when interacting with each other in a mixed traffic environment. To investigate this matter further, and establish whether pedestrians are aware of, or indeed believe they are using any obvious external cues from drivers and vehicles when attempting a road crossing, more information was sought from the observed pedestrians via a short questionnaire, administered immediately after their crossing was observed.

3.2 Questionnaire study

3.2.1 Use of vehicle- and driver-based information

Participants were provided with a series of options and asked to state what vehicle-based and driver-based information they used to make their crossing decision for the particular crossing they had just made. They were allowed to choose more than one option (see Figs. 3 and 4).

Fig. 3

Percentage of pedestrians who responded ‘yes’ for each vehicle-based information used, for each city

Fig. 4

Percentage of pedestrians who responded ‘yes’ for each type of driver-based information used, for each city

The number of ‘yes’ responses provided by pedestrians for each category in this section was divided by the number of options available. For instance, for vehicle-based information, if a pedestrian said that they used ‘speed’ and ‘distance’ only (2 of the 7 categories provided), they were taken to use 2/7 = 28.6% of the vehicle-based information provided.
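The scoring rule above can be written out as a short sketch; the function name `proportion_used` is hypothetical, and the example reproduces the 2-of-7 case given in the text.

```python
def proportion_used(selected, n_options):
    """Percentage of the available options that a respondent selected."""
    if n_options <= 0:
        raise ValueError("n_options must be positive")
    # A set guards against the same option being counted twice
    return 100.0 * len(set(selected)) / n_options


# Worked example from the text: 'speed' and 'distance' out of 7 categories
vehicle_score = proportion_used({"speed", "distance"}, n_options=7)
print(round(vehicle_score, 1))  # 28.6
```

Expressing both the vehicle-based and driver-based scores as a proportion of the options offered puts them on the same 0–100% scale, which is what allows them to be compared within each respondent even though the two questions offered different numbers of options.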

Three paired-sample t tests (one for each city) were conducted to compare the proportion (%) of vehicle-based and driver-based information used by pedestrians in each location. These revealed that pedestrians in Leeds and Athens used vehicle-based information significantly more often than driver-based information when deciding whether or not to cross the road, whereas no such difference was found in Munich (Leeds: t(66) = 6.15, p < 0.001; Athens: t(63) = 3.68, p < 0.001; Munich: t(14) = 0.489, p = 0.633). However, the non-significant result for Munich could be due to its smaller data set. About half of the pedestrians questioned in Leeds and Athens reported not using driver information at all while crossing at the intersection, whereas about 20% of the pedestrians questioned at the Munich intersection reported not using driver information.
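For readers unfamiliar with the test, a paired-sample t statistic is the mean of the within-respondent differences divided by its standard error, with n − 1 degrees of freedom (so, for example, t(66) corresponds to 67 paired responses). The sketch below uses only the Python standard library and made-up scores; it is not the authors' analysis code, and the function name `paired_t` is hypothetical. Obtaining a p value additionally requires the t distribution, which is available in a statistics package.

```python
import math
import statistics


def paired_t(x, y):
    """Paired-sample t statistic and degrees of freedom for two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up proportions (%) for five hypothetical respondents
vehicle_based = [40, 30, 50, 20, 60]
driver_based = [20, 20, 30, 20, 30]
t_stat, df = paired_t(vehicle_based, driver_based)
print(round(t_stat, 2), df)
```

A paired design is appropriate here because each respondent contributes both a vehicle-based and a driver-based score, so the comparison is made within respondents rather than between groups.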

When pedestrians were asked to provide details of any other information used when crossing (‘Was there any other information you used to determine how safe it was to cross?’), additional responses were rare, but included auditory cues, i.e. ‘listening’, ‘hearing’, and ‘sound of traffic’ (Leeds, N = 3), the use of nearby traffic lights to determine the manoeuvre of approaching vehicles (Leeds, N = 3), and ‘knowledge of normal behaviour at this crossing’ (Leeds, N = 1), ‘common sense’ (Leeds, N = 1) and ‘instinct’ (Leeds, N = 1). In Munich, only one pedestrian provided an additional response, which was ‘driving style’ (Munich, N = 1). Only two pedestrians provided further information in Athens, which included ‘no, I just passed’ and ‘I don’t remember’.

3.2.2 Indicating their intention to cross

Pedestrians were also asked to indicate what kind of information they provided to show their intention to cross, for the crossing they had just made (see Fig. 5). As before, they were allowed to choose more than one option.

Fig. 5

Percentage of pedestrians who responded ‘yes’ for each intention information provided, for each location

For each type of intention-signalling information, over 50% of pedestrians did not report using it to show their crossing intention; the exception was ‘stepping forward’, which was reported by 70% of Munich pedestrians. It should be noted that although these results provide an indication of the information the pedestrians were conscious of considering, they do not necessarily capture any unconscious processing which may have occurred. In addition, considerably fewer data are available for the Munich intersections than for Leeds and Athens, and these data should be interpreted with some caution.

4 General discussion and conclusions

This study observed the movement and behaviour of pedestrians and drivers/vehicles at six designated locations in three European cities. The aim of the study was to establish if, and how, these actors communicate with one another when interacting in a mixed traffic environment, and especially when negotiating priority during use of the same road space. The ultimate aim of this study was to establish how such human–human interactions can be used to design suitable external interfaces for future automated vehicles, when the driver is no longer in charge of negotiating with pedestrians in such circumstances. Importantly, the focus of this paper has been on identifying patterns of human behaviour which can be easily adapted for use in human–robot interactions, such as communication by future automated vehicles, where finer, and more discreet movements and gestures observed by previous sociological and linguistic studies may not be useful.

The results from the observation studies showed relatively consistent behaviour across the three cities, with pedestrians and drivers rarely using explicit body language to communicate with each other, relying instead on kinematic cues (such as distance, speed, braking). Responses to the follow-on questionnaires indicated that pedestrians rely on factors such as vehicle speed and distance when deciding whether to cross the road, and rely less on driver communications such as eye contact, or head/hand movements. Only 27% of pedestrians reported using eye contact with the driver to signal that they intended to cross, while reciprocal eye contact by the driver, acknowledging pedestrian intention, was reported by 27% of pedestrians. To understand these findings in more detail, it would be useful if future studies used eye-tracking methods to determine whether eye contact has been established during interactions and how it affects communication and interactions.

Generally, in this study, we found that pedestrians reported using more vehicle-based than driver-based information to make a crossing decision. This finding was consistent across locations. These results support previous studies (Dey and Terken 2017; Risto et al. 2017; Sucha et al. 2017). However, there are a few important points to consider when interpreting these results. First, it is important to stress that these observations were conducted on roads with a 50 km/h speed limit, and that more explicit communication may well occur between drivers and pedestrians sharing the same space on roads with lower speed limits (Schneemann and Gohl 2016; Uttley et al. 2020 under review). Such explicit communication may also be more prominent during deadlock scenarios, or encounters at short distances, where road users have to negotiate priority. Second, although quite rare, use of driver-based information was still reported. Therefore, future studies should consider the circumstances in which explicit communications between road users do actually occur. Third, although we have used these data to speculate about what information may be needed from future automated vehicles, it is also important to address this question when pedestrians are interacting with actual AVs (e.g. Merat et al. 2018), since the shape and behaviour of these vehicles may be fundamentally different to that of current, manually driven vehicles. Finally, future studies should explore the impact of individual differences, such as road users’ attitudes towards road interactions. For example, it can be argued that courteous and altruistic behaviour, or attitudes towards violations, may influence such interactions, with previous studies showing that these characteristics are linked to pedestrians’ propensity to engage in risky and delinquent behaviours (see Elliott and Baughan 2004).

In terms of differences across the cities, our findings suggest that, compared to the other two cities, pedestrians in Athens were more likely to use eye contact to indicate crossing intention and were more likely to refrain from using any vehicle- or driver-based information. This highlights the potential need to consider the influence of different cultural and social norms when deploying AVs in different cities. It is also important to note that factors such as country-specific road design and infrastructure, traffic density, weather, etc. may have impacted on the interactions observed (see Madigan et al. 2019), and, therefore, any cross-cultural comparisons should be interpreted with caution.

Crucially, this exploratory observation study provides valuable insights, showing that, despite some country-specific infrastructural differences, across the range of locations, road users rarely used explicit communication to convey information about crossing intentions. This suggests that, in higher speed urban environments, road users will base their interactions on each other’s movement patterns, indicating that there may be limited requirement for automated vehicles to adopt explicit communication solutions in these environments. Another interesting finding from our study was that drivers were more likely to decelerate (16%) than to come to a complete stop for pedestrians (11%). Results in this context are somewhat conflicting: Sucha et al. (2017) and Lee et al. (2019) suggest that pedestrians are more likely to attempt crossing only when the vehicle has come to a full stop, whereas others show that pedestrians often cross before the vehicle has come to a complete stop (e.g. Risto et al. 2017; Domeyer et al. 2019). At present, it seems reasonable to suggest that, to improve traffic flow, future AVs do not need to come to a complete stop to aid pedestrians’ crossing. Future studies should expand the scope provided here, to explore interactions between other road users such as vehicle–vehicle interactions and interactions with other vulnerable road users such as cyclists or older or younger age groups.