Introduction

Birds that perform coordinated vocal displays are known as duetting or chorusing bird species (Hall 2004). A vocal duet occurs when two individuals combine their vocalizations non-randomly, usually to deliver a conspicuous synchronized vocal performance (Thorpe 1972; Farabaugh 1982). It is usually broadcast by a mated pair to defend its territory, advertise its mated status, and maintain the pair bond (Grafe and Bitz 2004; Logue and Gammon 2004; Odom et al. 2017; Odom and Omland 2017; Wheeldon et al. 2020). It also occurs during courtship displays (Soma and Iwama 2017), and it can even be performed by two or more males (Foster 1981; Trainer 2002). When birds perform such a coordinated display, the interplay of the single components produced by each individual forms a new meta-signal, also known as a collective signal (Brumm and Slater 2007). Such a collective signal has its own properties, which are shaped at the individual level by the acoustic features of each bird and by the manner in which the partners combine their songs (Logue and Krupp 2016). Duets have many functions and can transmit different information depending on the context and the species (Dahlin and Benedict 2013; Hall 2004; Mennill and Vehrencamp 2008). When more than two birds, usually from the same social group, join their songs to broadcast a collective and cooperative vocal display, they perform a chorus. Choruses have functions in territorial defence similar to those of duets (Baker 2004, 2009; Radford 2003; Seddon and Tobias 2003; Wu 2013).

There is a lack of knowledge regarding the coordination strategies used by chorusing bird species in which all participants start singing at the same time. Such a coordinated group display requires that individuals share information and agree to perform a collective action. In this context, a specific intra-group signal could be used by group members to synchronize their song. In several African barbet species (Piciformes, Lybiidae), specific vocalizations combined with visual displays precede vocal duets and choruses and may play such a role. This was first described by Skead (1950) in the Black-collared Barbet (Lybius torquatus). The author made three significant points: first, the pre-duet calls sound different from the song; second, the bird that initiated the duet gave the pre-duet calls while its mate simply replied by starting its own song sequence (Payne and Skinner 1970); finally, birds combined their calls with a visual display. Later, the presence of pre-duet/chorus vocalizations combined with visual postures, defined as a “greeting ceremony”, was reported in at least eight species of African barbets (Short and Horne 1983). The authors suggested that because these pre-duet notes are relatively soft compared to the loud duet/chorus song and can hardly be heard beyond 20–30 m from the birds, they might have a function in intra-group relationships and pair-bond maintenance. However, few studies across barbet species describe how birds use these vocalizations, which makes any assumption regarding their exact functions difficult.

The aim of our study is to describe the use of the pre-duet and chorus vocalizations (described as “chewp note” by Short and Horne 2001) to determine whether they might constitute a specific signal used by the Yellow-breasted Barbet (Trachyphonus margaritatus somalicus) to initiate a group vocal display. This barbet species is considered a chorusing species that introduces its communal songs with greeting ceremonies, but its group vocal behaviour has not yet been studied. We used the term “introductory sequence” instead of “greeting ceremony” because we considered the former more neutral, since the function of this behaviour remains unknown. We organized our investigations around two main hypotheses:

Hypothesis 1: The chewp notes are a specific signal used by the Yellow-breasted Barbet to introduce vocal duets and choruses. The chewp notes should be directly followed by a duet or chorus song. If these vocalizations have a function in the initiation of group singing, we should not observe a duet or a chorus starting without such a signal.

Hypothesis 2: The introductory sequence could act as a recruitment signal used by the initiator to attract other group members and trigger them to sing. If this is true, we expect that the initiator systematically gives an introductory sequence to induce a group vocal display, while the followers could answer simply by starting their song without an introductory sequence. Followers should immediately react to the signal and join the initiator. Alternatively, if both the initiator and the followers behave the same way during a duet and chorus initiation, the introductory sequence could be a greeting ceremony in the sense of an intra-group social behaviour used as mutual agreement to sing in a group or to reinforce bonds.

To investigate these hypotheses, we used playback stimulations and video-recorded the duets and choruses of different groups of Yellow-breasted Barbets in Djibouti. We analysed the vocal and visual components of their group displays.

Methods

Species and sites of study

The Yellow-breasted Barbet occurs in thorn-bush and acacia savannah, usually along dry watercourses in the semi-desert sub-Saharan regions. It is a non-migratory bird species that lives in pairs or small social groups, which defend their territory throughout the year, and it is described as a group-breeding species (Soma and Brumm 2020, personal field observations). The species is sexually dimorphic: the male has a black patch on its throat that is lacking in the female (Redman et al. 2016; Fig. 1c). We monitored and recorded two wild populations of the species in the Republic of Djibouti during February–March 2019 and 2020 (Djalélo population: N 11 21.266 E 042 47.842; Assamo population: N 11 01.501 E 042 54.139). We fitted each individual with a unique combination of colour rings and saved the GPS coordinates (Garmin 64) of each pair or group territory. In total, 72 individuals were ringed at 12 distinct sites (we identified 18 sites in total over the 2 years; Supplementary Information S1, S2).

Fig. 1

a Spectrogram of the beginning of a duet song with the introductory vocal sequence (black-grey), given here by the female and consisting of a series of chewp notes, followed by the duet sequence with the song of each duetter coloured (blue for the male and orange for the female). b Spectrogram of the two types of chewp note series: on the left the low chewp notes, on the right the high chewp notes. c Picture of a duetting pair of Yellow-breasted Barbets. The bird on the left is the male, recognizable by the black patch on its throat; the female is on the right, with her crest erected

Playback experiments

We conducted playback experiments that simulated a territorial intrusion to induce a group defensive display. The aim was to attract birds close enough to allow us to film their group display with a video camera. The playbacks were performed in the morning from 6:15 to 11:00 a.m. (average sunrise 6:20 a.m.) during February and March 2019–2020. In 2019, we used a “Natural playback song” created from barbet songs previously recorded in 2016 in Djibouti (recorded either with or without playback of songs obtained from xeno-canto). We selected the best-quality recorded songs, which were filtered (high pass filter 800 Hz) and normalized. In 2020, we had enough data and knowledge to create a “Synthesized playback song” made with the soundgen package in R. Each playback consisted of 30 s of a solo, a duet, or a chorus song that included an introductory vocal sequence directly followed by the song sequence itself. Playbacks were broadcast through a JBL CHARGE3 loudspeaker (power: 20 W, frequency response: 65 Hz–20 kHz, signal-to-noise ratio: > 80 dB), just after we visually localized the tested group in its territory. The amplitude of the playback was matched to the natural amplitude of the barbets' song, which we measured using a sound meter (model: testo 816, automatic mode, range from 30 to 130 dBA, fast time weighting mode A). The behavioural response of the birds was recorded with a Sennheiser MKE600 shotgun microphone connected to a Canon XF400 camera (audio: 44.1 kHz, 16 bits; video: 1080p, 30 fps). We conducted 55 recording sessions in total (two field recordings made in 2016 were included in our data because of their good quality and the similar way the playback sessions were conducted). We selected the 49 best-quality recordings of the 55 for the analysis. These correspond to 26 groups of barbets recorded at 17 sites (one group in 2016, 12 groups in 2019 and 13 groups in 2020). The number of groups is higher than the number of sites because, in many cases, the group of birds caught at a specific site in 2019 was different from the group caught at the same site in 2020 (Supplementary Information S1–S2). The recordings are available at https://doi.org/10.7479/gyfm-af02.

Data extraction

To describe the bird behaviour when replying to playback stimulations and investigate the possible existence of a multimodal signal, we created a list of behavioural variables inspired by what was observed in the closely related D'Arnaud's Barbet (Trachyphonus darnaudii) and the Black-collared Barbet (Short and Horne 1982):

  • Tail display: we considered that the tail is raised and fanned over the back when the tilt angle was greater than 45 degrees and it was not caused by the bird's balance adjustment while perching or moving.

  • Song sequence: the individual song sequence during a duet or chorus.

  • Introductory sequence: the chewp notes series emitted before the start of the song sequence.

The videos collected were analysed with BORIS (Behavioral Observation Research Interactive Software, Friard and Gamba 2016) to extract data regarding the visual displays. We analysed the audio part of our recordings in Praat (version 6.1.15; method: Fourier; window shape: Gaussian; time steps: 1000; frequency steps: 250) to precisely annotate the start and end of the acoustic events of the recorded individuals. We then combined the acoustic and visual data to construct the behavioural timeline(s). In each group display analysed, we divided the participants into two categories: the leader is the individual that gives the very first vocalisation, either of the introductory sequence or of the song sequence; the follower is an individual that replies to the leader's vocalisations.
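The leader and follower definitions can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names `assign_roles` and `mean_sem`, the individual labels, and the onset times are all hypothetical, and the only assumption is that each individual's first vocalisation onset (in seconds) has been annotated.

```python
from math import sqrt
from statistics import mean, stdev


def assign_roles(onsets):
    """Return (leader, delays) for one group display.

    onsets maps an individual ID to the onset time (s) of its first
    vocalisation, whether introductory sequence or song sequence.
    The leader is the individual with the earliest onset; delays maps
    each follower to its delay after the leader's first vocalisation.
    """
    leader = min(onsets, key=onsets.get)
    t0 = onsets[leader]
    delays = {bird: t - t0 for bird, t in onsets.items() if bird != leader}
    return leader, delays


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical annotated onsets (s); these are NOT data from the study.
displays = {
    "group_A": {"female_1": 12.0, "male_1": 13.5},
    "group_B": {"male_2": 4.0, "female_2": 5.0, "male_3": 7.0},
}

# One value per group: the delay of the first follower only.
first_follower_delays = []
for onsets in displays.values():
    leader, delays = assign_roles(onsets)
    first_follower_delays.append(min(delays.values()))

avg, sem = mean_sem(first_follower_delays)
print(f"first-follower delay: {avg:.2f} ± {sem:.2f} s")
```

Taking one value per group before averaging mirrors the per-group summaries reported in the Results, so that each group contributes equally.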

Acoustic analysis of the chewp notes

We isolated 54 chewp notes from our recordings using Raven Lite 2 software. The notes were filtered with a high pass filter at 800 Hz to reduce background noise and resampled to 16,000 Hz. The acoustic analyses were conducted in R using the seewave package (wl = 128, overlap = 75%). We extracted the mean, maximum and minimum of the fundamental and the first harmonic, the peak frequency, the first and third quartiles (Q25, Q75) and the inter-quartile range (IQR) of each chewp note analysed. The average slope of the fundamental frequency was calculated using the frequency values at the beginning and the end of each note. To determine the time and frequency of the main stationary and inflexion points, we tracked the dominant frequency, used it to build a polynomial model that fits the dominant frequency modulation, and calculated the first and second derivatives of this model. The stationary and inflexion points gave us more details about the modulation pattern (increasing or decreasing frequency as well as the curvature of the signal).
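The derivative-based description of the frequency modulation can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: the polynomial degree is not given in the text, so a cubic is assumed; time is expressed in normalised units (0 to 1 over the note); and the frequency track is synthetic. The helper names `polyfit`, `derivative` and `cubic_landmarks` are hypothetical.

```python
import math


def polyfit(ts, fs, degree):
    """Least-squares polynomial fit solved through the normal equations.

    Returns [c0, c1, ..., cd] for f(t) = c0 + c1*t + ... + cd*t**d.
    """
    n = degree + 1
    A = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(f * t ** i for t, f in zip(ts, fs)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def cubic_landmarks(coeffs):
    """Stationary points (f' = 0) and inflexion point (f'' = 0) of a cubic."""
    d1 = derivative(coeffs)   # quadratic: d1[0] + d1[1]*t + d1[2]*t**2
    d2 = derivative(d1)       # linear:    d2[0] + d2[1]*t
    disc = d1[1] ** 2 - 4 * d1[2] * d1[0]
    stationary = []
    if disc >= 0:
        root = math.sqrt(disc)
        stationary = sorted([(-d1[1] - root) / (2 * d1[2]),
                             (-d1[1] + root) / (2 * d1[2])])
    inflexion = -d2[0] / d2[1]
    return stationary, inflexion


# Synthetic dominant-frequency track (Hz) on normalised time; NOT study data.
ts = [i / 10 for i in range(11)]
fs = [1800 + 600 * t - 1500 * t ** 2 + 1000 * t ** 3 for t in ts]

coeffs = polyfit(ts, fs, degree=3)
stationary, inflexion = cubic_landmarks(coeffs)
# Average slope from the first and last frequency values, as in the text.
slope = (fs[-1] - fs[0]) / (ts[-1] - ts[0])
```

The sign of the first derivative between stationary points indicates rising or falling frequency, and the sign change of the second derivative at the inflexion point marks the change in curvature.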

Statistical analyses

We first classified each of the 54 chewp notes into two categories based on their acoustic features: 33 high chewp notes (n = 11 individuals, 3 notes per individual) and 21 low chewp notes (n = 7 individuals, 3 notes per individual). Then, we conducted a Hierarchical Clustering on Principal Components (HCPC) using the factoextra package in R to determine whether our two categories, high and low chewp notes, were meaningful. The first step was to perform a Principal Component Analysis (PCA) on the acoustic measurements to reduce the dimensionality of the dataset (FactoMineR package in R, data scaled, three dimensions kept). We then used the results of the PCA to perform an HCPC based on the first two principal components (factoextra package in R). We limited the hierarchical clustering to 2 clusters to focus on the main distinction between high and low chewp notes, rather than continuing the clustering based on other acoustic similarities or dissimilarities (e.g. sex-specific acoustic features, acoustic similarities between individuals).
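The clustering step can be illustrated with a small Python sketch of agglomerative clustering cut at two clusters. Several things here are assumptions rather than facts from the text: the linkage criterion is not stated, so Ward's criterion is assumed; the PCA step is omitted and the input is taken to be scores on the first two principal components; the scores are invented; and `ward_clusters` is a hypothetical helper, not a function of FactoMineR or factoextra.

```python
def ward_clusters(points, k=2):
    """Naive agglomerative clustering with Ward's criterion.

    At each step, merge the pair of clusters whose fusion gives the
    smallest increase in within-cluster sum of squares:
        cost(A, B) = |A||B| / (|A| + |B|) * ||centroid_A - centroid_B||^2
    Stops when k clusters remain; returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    dim = len(points[0])

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dim)]

    def cost(a, b):
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical (PC1, PC2) scores; NOT data from the study.
scores = [(-2.1, 0.3), (-1.8, -0.2), (-2.4, 0.1),              # "low"-like notes
          (1.9, 0.4), (2.2, -0.3), (2.0, 0.0), (1.7, 0.2)]     # "high"-like notes
a_priori = ["low", "low", "low", "high", "high", "high", "high"]

clusters = ward_clusters(scores, k=2)
for members in clusters:
    labels = sorted(a_priori[i] for i in members)
    print(sorted(members), labels)
```

Comparing the cluster membership with the a priori labels, as in the final loop, is the kind of check used to decide whether the two categories are meaningful.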

We counted the number of duets and choruses that started with or without an introductory sequence. We then calculated the mean delay with which the follower answered after the emission of the leader's first vocalization in each group. We compared the use of chewp notes between leader and follower individuals. First, we calculated the percentage of introductory sequences that contained only high chewp notes, only low chewp notes, or both chewp note types, using one group display per group recorded. Then, we performed a Mann–Whitney–Wilcoxon test (Wilcoxon rank-sum test, two-sided with continuity correction) to test whether leaders emit more introductory notes than followers do. We used the average values of leaders and followers per group so that each group contributed equally to the statistics. We also analysed how birds combine acoustic and visual components by determining when the visual display occurred during the full vocal sequence timeline (introductory vocal sequence + song sequence). Finally, using one recording per group, we counted the number of individuals that introduced their song with the introductory vocal sequence only (“acoustic only”), with the visual display only (“visual display only”), or with both the introductory vocal sequence and the visual tail posture (“multimodal display”), as well as the individuals that neither performed any visual display nor used pre-duet and chorus calls and directly started their song (“no display”). We conducted a Pearson's Chi-squared test for homogeneity to determine whether the two categories, Leader and Follower, differ in the way birds introduce their song, and we compared the Pearson's residuals to determine on which categorical variables they differ the most, using the R package corrplot. All statistical analyses were conducted with R software (version 4.0.3).
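The per-group averaging and the rank-sum statistic can be sketched in Python as below. The records are invented, the helper names are hypothetical, and the sketch computes only the W statistic (the number of leader/follower pairs in which the leader value is larger, with ties counted as one half), which corresponds to the statistic reported by a two-sample Wilcoxon rank-sum test. It does not compute the p-value.

```python
from statistics import mean


def per_group_means(records, role):
    """Average note count per group for one role (leader or follower)."""
    by_group = {}
    for group, r, count in records:
        if r == role:
            by_group.setdefault(group, []).append(count)
    return [mean(counts) for counts in by_group.values()]


def rank_sum_w(x, y):
    """W statistic: pairs (x_i, y_j) with x_i > y_j, ties counted as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in x for b in y)


# Hypothetical (group, role, number of high chewp notes); NOT study data.
records = [
    ("g1", "leader", 6), ("g1", "follower", 2), ("g1", "follower", 3),
    ("g2", "leader", 5), ("g2", "follower", 1),
    ("g3", "leader", 4), ("g3", "follower", 4),
]

leaders = per_group_means(records, "leader")      # one value per group
followers = per_group_means(records, "follower")  # one value per group
w = rank_sum_w(leaders, followers)
print(w, "out of a maximum of", len(leaders) * len(followers))
```

Averaging within groups first keeps a group with many followers from weighing more heavily in the test than a pair.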

Results

Acoustic analysis of the chewp notes

The Yellow-breasted Barbet initiates its song sequence with a series of chewp notes, which we defined as the introductory vocal sequence (Fig. 1a, Supplementary Information S3). We divided these introductory notes into two categories according to their loudness and frequency shape: “High chewp” notes are short-medium range, high-pitched notes (mean fundamental frequency = 1931 Hz ± 47, mean 1st harmonic = 3804 Hz ± 89.6; the dominant frequency was on the fundamental or the 1st harmonic depending on the individual, Fig. 1b-right, Table 1); “Low chewp” notes are a softer version of the high chewp, meaning they are lower in intensity and frequency range (mean fundamental frequency = 1405 Hz ± 47; the 1st harmonic was too soft to get consistent measurements, Fig. 1b-left, Table 1). We performed a PCA using 13 acoustic parameters of the 54 chewp notes. The first two principal components together explained 68% of the variance. The 1st PC accounts for 52%, with the main contribution from frequency features (freq_stationary, mean_fundfreq, Q25, fund_min and fund_max). The 2nd PC explains 16% of the variance, with mainly time-related features (IQR, duration, Q75 and slope). For more details about eigenvalues and variable contributions, please refer to Supplementary Information S4. The HCPC assigned 20 chewp notes to cluster 1 and 34 chewp notes to cluster 2. Cluster 1 contained 20 low chewp notes, and cluster 2 contained all 33 high chewp notes and one low chewp note. Overall, the HCPC correctly assigned high and low chewp notes to two distinct groups based on their acoustic features. Only one low chewp note was misclassified.
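As a quick arithmetic check of the clustering result, the cross-tabulation of clusters against the a priori categories can be written out in Python. The counts are those reported in the paragraph above; the agreement rate is derived from them here and is not a figure stated by the authors.

```python
# Cluster membership by a priori category, as reported in the text.
table = {
    "cluster 1": {"low": 20, "high": 0},
    "cluster 2": {"low": 1, "high": 33},
}

total = sum(sum(row.values()) for row in table.values())
# Each cluster is identified with its majority category.
agree = sum(max(row.values()) for row in table.values())
misclassified = total - agree
agreement = agree / total

print(f"{agree}/{total} notes agree with the a priori category "
      f"({agreement:.1%}); {misclassified} misclassified")
```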

Table 1 Mean ± SEM of 16 acoustic measurements conducted on 54 chewp notes (21 low chewp and 33 high chewp notes)

Acoustic structure of the duet and chorus introduction

All of the duets and choruses recorded started with an introductory vocal sequence from at least one individual. The mean reaction time of followers per group to reply to the leading individual was 2.88 ± 0.61 s (mean ± SEM, n = 19 groups). This reaction time concerns only the individuals that were present when the playback stimulation started. We did not account for individuals that joined the chorus with some delay because they were far from the rest of the group at that moment. When looking at the first follower only, the mean reaction time per group was 1.73 ± 0.48 s (n = 19). About half of the introductory sequences of leaders were pure series of high chewp notes, and about half were a combination of high and low chewp notes. Only 3.8% were pure series of low chewp notes (Fig. 2a, LEADER, n = 26 leaders). Regarding followers, 53.1% of the introductory sequences contained a mix of high and low chewp notes, and 25.0% contained high chewp notes only. However, 21.9% were composed of low chewp notes only (Fig. 2a, FOLLOWER, n = 32 followers). We also found that leaders produced significantly more high chewp notes per introductory sequence than followers did (W = 63.5, n = 22 groups for the leader category, 25 groups for the follower category, p < 0.001, Fig. 2b). However, we did not find any significant difference between leaders and followers in the number of low chewp notes emitted per introductory sequence (W = 182, n = 20 groups for the leader category, 17 groups for the follower category, p = 0.72, Fig. 2c). We controlled for the sex of individuals and found no significant difference between males and females in the number of chewp notes emitted, regardless of whether they were leaders or followers. This suggests that the emission of chewp notes did not depend on the sex of the individual but rather on its role as leader or follower during the initiation of a group display.

Fig. 2

a Frequency of use of three types of introductory vocal sequence: with high chewp notes only, with low chewp notes only, and with a combination of high and low chewp notes. Almost all the leaders emitted high chewp notes: half of them used high chewp note series only, while the other half used a mix of high and low chewp notes. The majority of the followers also used high chewp notes, either alone or combined with low chewp notes. However, around 22% of the follower sequences recorded contained only low chewp notes. b-c Violin plots representing the number of high and low chewp notes per introductory sequence for both leaders and followers. Leaders emitted significantly more high chewp notes to introduce their song than followers did (Mann–Whitney–Wilcoxon test: W = 63.5, n = 22 groups for the Leader category, 25 groups for the Follower category, p < 0.001). There was no difference in the number of low chewp notes emitted by leaders and followers (Mann–Whitney–Wilcoxon test: W = 182, n = 20 groups for the Leader category, 17 groups for the Follower category, p = 0.72). We also noticed that the number of high chewp notes emitted by followers did not differ from the number of low chewp notes emitted by either followers or leaders, while leaders emitted more high chewp notes than low chewp notes per sequence

Who is displaying? The use of a multimodal signal

We found that the visual tail posture occurred at the beginning of the vocal display (Fig. 3a). The average duration of introductory vocal sequences was 4.24 ± 0.35 s (n = 16). All visual displays started during the chewp note series with some delay (average delay = 2.06 ± 1.41 s) and, for most of the individuals, ended after the end of the introductory vocal sequence (n = 12/16; average delay after the end of the introductory vocal sequence: 3.43 ± 1.09 s). These results show that some birds combined the chewp notes with a visual tail display, but without fine temporal coordination. Birds also erected their dark forehead crest (Fig. 1c), but we did not find that this constituted a specific signal used during duet and chorus introduction; it seemed to depend more on the general emotional state (excitation, stress, aggressiveness) of the individual during the group vocal displays, as well as in other contexts. Finally, we found that leaders and followers did not introduce their song in the same way (χ2 = 18.1, df = 2, n = 20 leaders, 35 followers, p < 0.001, Fig. 3b). Though both leaders and followers mostly used purely acoustic displays, leaders performed multimodal displays as often, while some followers did not introduce their song with either chewp notes or a tail display (Pearson’s residuals, Fig. 3c).
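The chi-squared test of homogeneity and its Pearson residuals can be written out as a short Python sketch. The contingency table is invented for illustration and does not reproduce the study counts; a 2 × 3 layout is assumed only because the reported test has df = 2. The function name `chi_square_homogeneity` is hypothetical. The p-value uses the closed form exp(-χ2/2), which holds only for df = 2.

```python
import math


def chi_square_homogeneity(table):
    """Pearson chi-squared statistic, df, and Pearson residuals.

    table is a list of rows of observed counts. Expected counts are
    row_total * column_total / n; each Pearson residual is
    (observed - expected) / sqrt(expected).
    """
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    expected = [[rt * ct / n for ct in col_tot] for rt in row_tot]
    residuals = [[(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
                 for obs_row, exp_row in zip(table, expected)]
    chi2 = sum(r * r for row in residuals for r in row)
    df = (len(table) - 1) * (len(col_tot) - 1)
    return chi2, df, residuals


# Hypothetical counts; NOT data from the study.
# Columns: acoustic only, multimodal display, no display.
observed = [
    [6, 6, 0],    # leaders
    [12, 2, 6],   # followers
]

chi2, df, residuals = chi_square_homogeneity(observed)
# Survival function of the chi-squared distribution, valid for df = 2 only.
p_value = math.exp(-chi2 / 2) if df == 2 else None
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.3f}")
```

Positive residuals mark cells that are over-represented relative to homogeneity. In this toy table the largest positive residual is for leaders with a multimodal display, which is the kind of pattern a residual plot makes visible.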

Fig. 3

a Gantt chart, drawn as barplots, of the 16 individuals that displayed a multimodal signal. The introductory vocal sequence in dark grey is followed by the song sequence in light grey. The visual tail display starts with the introductory vocal sequence and ends during the beginning of the song sequence. b Barplots comparing whether or not leaders and followers introduce their song with a specific display when duetting or chorusing. A chi-square test of homogeneity reveals a significant difference between the two categories (χ2 = 25.746, df = 2, p < 0.001, n = 29 leaders, 47 followers). c Pearson’s residuals show which variables account the most for the difference between leaders and followers. Leaders are positively associated (blue) with multimodal display while followers are positively associated (red) with no display

Discussion

In this study, we investigated how the Yellow-breasted Barbet starts a coordinated group vocal display. We found that barbets initiate their song with a specific vocalization named chewp notes. Acoustic measurements allowed us to identify two variants of this call type: the high chewp, which is louder and has a “wave shape” frequency modulation, with the dominant frequency on either the fundamental or the 1st harmonic; and the low chewp, which is softer in intensity, lower in frequency range and more variable in terms of frequency modulation and duration. We also found that the initiator of a group vocal display used more high chewp notes than followers and sometimes combined chewp notes with a visual tail display to perform a multimodal display.

The pre-duet and chorus vocalizations were previously interpreted as a “greeting ceremony”, which is, according to its initial definition, a kind of ritualized social behaviour executed by two or more individuals that takes the form of a multimodal display involving specific calls and visual postures. Such a display may or may not lead to a duet/chorus song (Short and Horne 1982). The authors defined the term “greeting ceremony” based on what they observed in the Black-collared Barbet. They found that greeting ceremonies were more common than duets and identified three kinds of calls: grating, chatter, and “tyaw” calls. Moreover, they found that most greeting ceremonies that led directly to a duet sequence were shorter in duration and contained “tyaw” calls, while greeting ceremonies that did not introduce a duet song were longer in duration and ended with soft grating or chatter calls. Based on this information, we suppose that greeting ceremonies might encompass two distinct social behaviours: one that serves exclusively in “meeting” or “joining” events (which we consider a true greeting ceremony) and a second that serves specifically in duet initiation. It is important to mention, though, that David Ward admitted that it was difficult to perceive any differences between these calls because frequency and volume were quite variable. He therefore grouped all these vocalizations as grating calls (Ward 1986).

In our study on the Yellow-breasted Barbet, we found that chewp notes are a specific vocalization used during the initiation of a group vocal display, and we did not record these calls in any context other than duet and chorus displays. Moreover, we made a few observations in which the group members did not reply to the chewp notes of an individual. In these cases, the bird that gave chewp notes did not start its song sequence, or gave just a few syllables and quickly stopped. Another use of chewp notes, which was not mentioned in the results because we observed it on very few occasions, was during troop movements. When birds were actively singing and one individual suddenly flew away to another tree, the moving individual broke its song sequence with chewp notes while flying. The rest of the group immediately followed to continue their chorus in the other tree. However, if the group did not follow, the individual that moved remained silent after it had landed. Finally, in three instances the leading individual broke its song sequence with a few chewp notes during a chorus when one individual that was initially absent joined the chorusing group with some delay. These additional observations, combined with our main results, led to two conclusions. First, these chewp notes are used specifically in the context of duet and chorus displays. Second, barbets use these vocalisations for intra-group interactions to induce an immediate behavioural response from the other group members. Similarly, in the Laughing Kookaburra (Dacelo novaeguineae), typical introductory syllables usually emitted by one individual appear to cause others in the group to join and sing in chorus (Baker 2004). Introductory vocal elements are also mentioned in the song of the duetting White-eared Ground-sparrow (Melozone leucotis) and described as the first part of the song (Sandoval et al. 2015). In the White-browed Sparrow Weaver (Plocepasser mahali), a duet initiation consists of harsh notes emitted by both sexes, or one to three introductory syllables at the beginning of a duet (Voigt et al. 2006). The emission of introductory vocal elements to initiate a coordinated group display seems to be common in many duetting and chorusing bird species and therefore needs more attention and dedicated studies of their usage and possible functions.

We found that the leader and follower individuals did not behave in the same way when introducing their song. This result is consistent with the “leader–follower strategy”, where the follower is the individual that adapts its behaviour to that of the leading individual, which promotes the activity (Fairhurst et al. 2014). A leader within a group is generally necessary to induce a joint action from other members (Wheatcroft and Price 2018). For example, troop movements are usually initiated by one individual considered as a leader, which gives specific calls when flying to invite the rest of the group members to follow (Koykka and Wild 2015; Radford 2004). Synchronizing behaviour to sing in a coordinated way can become quite challenging when several individuals are involved. Thus, the emission of chewp notes could serve as a recruitment signal that informs all the group members in the vicinity about the start of a communal vocal display and that they must join the leading individual. Since followers could also reply with chewp notes, and we did not find a significant difference in the number of low chewp notes given by leaders and followers, the introductory vocal sequence could serve as an agreement between participants to act together. Chewp notes could also be used as a coordination signal that helps participants to coordinate their song sequences from the beginning. In the White-browed Sparrow Weaver, it was hypothesized that the song onset of the bird that joins the duet represents the common cue that defines the onset of vocal coordination in both birds (Hoffmann et al. 2019). Thus, the introductory part of the duet in this species might not be used by duetters as a common cue to coordinate their song. In duetting songbirds in general, partners coordinate their song by following a precise set of rules known as the duet code (Logue and Krupp 2016), which needs to be learnt when they are young and even in the adult stage (Rivera-Caceres et al. 2016, 2018). However, many duetting and chorusing bird species, such as barbets, are non-oscine birds, and the strategies of song coordination in those species have received less attention.

We found that the leading individual sometimes combined chewp notes with a specific visual display consisting of the tail raised and fanned. Visual displays synchronized with the vocalizations are observed in several duetting barbet species, e.g. Lybius torquatus, Lybius vieilloti and Trachyphonus darnaudii (Payne 1971). The author suggested that those displays may be directed toward the mate. In the Black-collared Barbet, all birds engaged in duetting or chorusing behaviour may combine acoustic and visual displays. However, the most active bird, considered as the leading individual during greeting ceremonies and duets, was the bird with the cocked tail, and it was never reported that more than one bird in a group displayed in such a way (Short and Horne 1982). The authors thus concluded that such a specific visual display might be associated with sex and/or dominance status. These observations are consistent with our findings in the Yellow-breasted Barbet. However, in our case, both the male and the female could lead the duet and chorus and use the tail display. Performing a multimodal signal in the context of a collective group behaviour could increase the detection and/or the discrimination of the leading individual from the rest of the group, helping followers to focus their attention on the right individual. The visual component could serve as an “amplifier” of the vocal component or as an “alerting signal” that decreases the receiver’s reaction time to the chewp note sequence (Hebets and Papaj 2004). In D'Arnaud's Barbet, one individual constantly performs oscillating movements with its tail raised when it is actively singing in a duet (Payne and Skinner 1970, Wickler 1973). This suggests that the tail display may play a role in song coordination in this barbet species. Recent work on the Australian Magpie-lark (Grallina cyanoleuca) also revealed that mated pairs use wing movements like a conductor's baton to enhance their song coordination, resulting in a more threatening signal for neighbours (Ręk and Magrath 2020). In our model species, however, the tail display was restricted to the song introduction.

Further investigations are needed to understand the role of chewp notes and the tail display in the Yellow-breasted Barbet and how such a signal affects group cohesion and song coordination. It would be interesting to investigate whether the chewp note sequence could give some information regarding the threat perceived and the level of investment the participants must provide during the group vocal performance. The features of the chewp notes could transmit information related to the emitter's identity, its sex and its status in the group, as well as information on how intensively the participants must sing in terms of rhythm and song duration. Females often started the duets and choruses, but we did not find any difference between males and females in the number of chewp notes emitted, whether they were leaders or followers. Moreover, we recorded one chorus in which three males started together and were joined later by a female of the group. These observations suggest that sex roles in communal displays are similar, even though there is a sexual dimorphism in the presence or absence of the black patch on the throat, as well as in the pitch of the song sequence, which is higher in the female than in the male (Short and Horne 2001). In D'Arnaud's Barbet, experimental removal of one duetter resulted in its replacement by another, subordinate group member that sang the appropriate duet role, with a male singing the duet part that was assigned to the female and vice versa (Short and Horne 1983).

More data need to be collected on duetting and chorusing barbet species to conduct a comparative behavioural analysis to investigate the role of sex during group displays, as well as the different strategies of song coordination and mutual attention. This would certainly help to understand why and how certain species perform coordinated chorus displays while others do not.