Research on social facilitation and social inhibition is one of the oldest fields in social psychology (Triplett, 1898). Despite over a century of research, the underlying mechanisms of the effects are not entirely clear. Spectator effects are especially relevant in a sports context, since difficulties performing in front of an audience may contribute to choking under pressure in competitive contexts (Wallace, Baumeister, & Vohs, 2005). Note, however, that the presence of an audience and competitive pressure tend to be confounded in real life: Larger audiences are usually present at the more important sporting competitions.

Strauss (2002) used the classification of motor performances by Bös and Mechling (1983) to differentiate spectator effects between different types of motor tasks. Strauss describes a facilitation of performance of tasks predominantly relying on strength and conditional factors, whereas the performance of coordination-based tasks is more likely to be inhibited (Strauss, 2002). According to a recent review, the majority of studies have used coordination tasks like pursuit rotor tracking (van Meurs et al., 2022). To our knowledge, the performance of dancing routines as a dynamic balance and coordination task has not yet been investigated in the social facilitation literature. Only a few studies have investigated the influence of spectators on balancing, a gross-motor coordination-based skill. We argue that dancing involves balance abilities. Balance (or postural control) is defined as “the act of maintaining, achieving or restoring a state of balance during any posture or activity” (Pollock, Durward, Rowe, & Paul, 2000, p. 405). It can be divided into static balance (maintaining a base of support with minimal movement) and dynamic balance (maintaining a stable balance while performing a task) (Hall, 2012; Winter, Patla, & Frank, 1990). Previous studies on audience effects on balance have led to different results, with most findings pointing towards social facilitation. For example, facilitation effects were found for balancing on stabilometers (Murray, 1983). Studies conducted with free-standing ladders either found facilitation effects (Landers & Landers, 1973; Landers, 1975) or reported no effects (Livingston, Landers, & Dorrance, 1974). Neither positive nor negative effects were identified on a balance board task (Lau, Schwarz, & Stoll, 2019), while MacCracken and Stadulis (1985) observed positive effects in their line/beam walking task only when the performer’s balance ability was high.

In addition to these conflicting results, previous studies on spectator effects tend to lack ecological validity. Participants were often asked to perform motor tasks in highly controlled laboratory environments, with spectators being instructed to deliberately focus their attention on other aspects of the situation (for example, by reading a book), and to not show any reaction to the actor’s performance (“mere presence” paradigms). To close this research gap, we developed a study design with high external validity that asked a group of female dancers to perform a dance routine repeatedly with and without an audience. Dancing puts high demands on dynamic balance, coordination, rhythmicity, gross-motor precision, interpersonal timing, and cognition (Bläsing et al., 2012; Vicary, Sperling, von Zimmermann, Richardson, & Orgs, 2017; Zardi, Carlotti, Pontremoli, & Morese, 2021). Many aesthetic sports (like rhythmic gymnastics, dancing, synchronized swimming, high diving) are judged by experienced raters. Athletes have to be able to perform highly complex coordination tasks in front of an audience, often with the aim of perfect synchrony across several team members.

Task difficulty is one prominent factor influencing social facilitation or inhibition effects. According to the traditional social facilitation hypotheses, simple (well-learned) tasks are facilitated, whereas in complex (not well-learned) tasks, an inhibition of performance occurs (Zajonc, 1965). We therefore decided to assess spectator effects in a specific dance sport, carnival dancing (“Gardetanz”), and to follow a group of dancers across one competitive season during which the choreography can be classified as difficult (not well learned) at the beginning and easy (well learned) at the end.

The type of dancing is a kind of line dance or can-can dance, mainly performed in German-speaking countries. A carnival dance choreography puts a high demand on dynamic balance skills and coordination, and may also include acrobatic components. Teams who take part in competitions prepare a specific dance choreography for one season, and compete against other teams with their individual dance choreography in several competitions from mid-September to mid-December. Before the competitive period starts, the teams practice their choreography over the course of several weeks. The competitions take place in front of audiences, but the atmosphere is less cheerful and light-hearted compared to the typical carnival setting. The atmosphere and the typical reactions of the audience are more comparable to other dance competitions, for example to competitive ballroom dancing. For the skill level of the dancers who were recruited for our study, audiences in competitions usually consist of 50 to about 200 people, who are other competitors, people interested in carnival dancing as a competitive sport, and friends and relatives of the athletes. While a dance is performed in a competition, spectators do not clap their hands, swing to the music (“schunkeln”), or sing along, but rather watch the performance attentively.

We expected the dancers’ performance to improve over several test sessions. Concerning social facilitation effects, the assumptions formulated by Strauss (2002) would indicate that dancing as a complex gross-motor coordination task should suffer from the presence of an audience, especially when the skill is not well learned and comparatively difficult. With increasing practice (when approaching the start of their competitive season), the team should be increasingly able to perform the dance in front of an audience, potentially even leading to social facilitation effects.

Methods

Participants

We tested a carnival dancing group consisting of 15 female dancers (age: 19 ± 5.12 years). All participants were active in the same dance club and had been training together for approximately 3 months. On average, the group members had been active in carnival dance for 13 years (± 5.40) and can therefore be considered experienced. The study was approved by the ethics committee of Saarland University, and it complied with ethical standards. The authors report no competing interests.

Apparatus and experimental task

Participants performed their carnival dance choreography alone and in front of spectators four times (timepoints – T1–4: T1 & T2 = not well learned; T3 & T4 = well learned) over the course of 8 months (within-subjects design). The dance took 3:30 min to complete. On each occasion, two video cameras filmed their performances, both of them facing the dance group. Session 1 took place in June, immediately following the finalization of the choreography. Session 2 took place in August. Session 3 (mid-September) was the final training before the first competition, and session 4 (mid-December) was the final training session before the most important competition of the season (the final competition of the series, which determines the ranking). On each occasion, the dance choreography was performed twice, once in front of spectators, and once alone. The order of presentation (in front of spectators or alone) was counterbalanced across the test sessions to control for practice and fatigue effects. Sessions 1 and 3 started with the “alone” condition, and sessions 2 and 4 with the “spectator” condition. The passive spectator group consisted of about 15 to 25 different people in every session. Spectators entered the gym hall before the dance started, quietly watched the performance without showing verbal or nonverbal reactions, politely clapped their hands after the dance (without showing strong enthusiasm), and left after the dance was finished. We argue that this type of audience behavior corresponds to typical carnival dance competitions.

We used seven professional raters to evaluate the eight resulting dance performances. All raters were official jury members of the “Bund Deutscher Karneval” (Society of German Carnival). Their mean age was 46 years (± 15). On average, they had been jury members in competitions for 8.4 years (± 8). The raters received the video clips, and performed their ratings at home. Since the same choreography was judged several times, a modified version of the official competition rules in German carnival dance sport was used (Bund Deutscher Karneval, 2006). Competitive ratings for the choreographies usually consist of ten dimensions: walking to the start formation, initial position, quality and uniformity of the dress, charisma of the dancers during the dance, diversity of steps, difficulty level of the dance, elegance and joyfulness, synchronicity of the dancers, choreography of the music, and choreography of the dance. For the current study, the dance group was not wearing their official dance dress, but black t‑shirts and black sport tights. They also did not walk to their starting positions, but started their dance choreography with the dance itself. Therefore, the dimensions “walking to the start formation”, “initial position”, and “quality and uniformity of the dress” were excluded from the scoring procedure. Since the same dance was judged eight times by each rater, there was no variation across ratings for the dimensions “diversity of steps”, “difficulty level of the dance”, and “choreography of the music”. These dimensions were therefore also excluded from the overall rating for the current study. The remaining categories were “charisma of the dancers during the dance”, “elegance and joyfulness”, “synchronicity of the dancers”, and “choreography of the dance”, resulting in a maximum score of 55 points. Following the tournament rules, the best and worst ratings were replaced by the mean rating of the other raters.
All raters were blinded to experimental condition and timepoint, since the performances of the group were video-recorded and presented in a randomized order. The raters were allowed to watch each video recording once, and they always saw the perspective from the front, same as in a competitive setting.
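The tournament rule mentioned above (replacing the best and worst rating with the mean of the remaining raters) can be sketched as follows. This is only an illustration, not the scoring script used in the study: the function name and the example scores are hypothetical, and the sketch assumes that the rule is applied separately to each performance, to the single highest and single lowest total score.

```python
from statistics import mean


def adjust_extreme_ratings(scores):
    """Replace the highest and the lowest rating by the mean of the others.

    `scores` holds one total score per rater for a single performance.
    Assumption: if several raters share the extreme value, only the first
    occurrence is replaced.
    """
    if len(scores) < 3:
        raise ValueError("need at least three ratings")
    hi = scores.index(max(scores))
    lo = scores.index(min(scores))
    if hi == lo:  # all ratings identical, nothing to adjust
        return list(scores)
    rest = [s for i, s in enumerate(scores) if i not in (hi, lo)]
    fill = mean(rest)
    return [fill if i in (hi, lo) else s for i, s in enumerate(scores)]


# Hypothetical total scores from seven raters (maximum = 55 points)
example = [40, 42, 45, 38, 50, 41, 44]
adjusted = adjust_extreme_ratings(example)
```

In this hypothetical example, the scores 38 and 50 are both replaced by the mean of the five remaining ratings, so that a single very strict or very lenient rater cannot dominate the team score.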

Procedure

The study took place in a gym hall of a German carnival dance club. Each test session lasted approximately one hour. Participants signed informed consent before the study started. The consent form explained that the dance routine would be videotaped, in order to enable experienced raters to judge the quality of the dance. In the first session, participants were informed about the study procedure and completed demographic and sport-specific questionnaires. Prior to performing the choreography, the dancers did a self-administered warm-up of about 20 min. A rest period of 20 min was implemented in between the two dancing conditions (alone, spectator).

Overview of analysis

The statistical analysis was performed with SPSS Statistics (version 25; IBM Corp., Armonk, NY, USA). The data are available on the Open Science Framework. To examine the influence of time and spectators on dance scores, a repeated measures analysis of variance (ANOVA) with the within-subject factors spectators (2, present or absent) and time (4, T1–T4) was calculated. Sphericity and normality were checked with Mauchly’s test and Kolmogorov–Smirnov tests, respectively. The alpha level used to interpret statistical significance in the analyses was 0.05. Significant main effects or interactions were further investigated by post hoc analysis using paired-samples t‑tests, including effect size measures.
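To make the structure of the 2 × 4 repeated measures ANOVA concrete, the following minimal sketch computes the sums-of-squares decomposition for two within-subject factors in plain Python. It is not the SPSS procedure used in the study: the function name `rm_anova_2way` and the randomly generated scores are hypothetical, p values and sphericity corrections are omitted, and the sketch assumes that the seven raters serve as the units of analysis (an inference from the degrees of freedom reported in the Results).

```python
import random


def rm_anova_2way(data):
    """Two-way repeated measures ANOVA for data[unit][a][b].

    Returns {effect: (F, df_effect, df_error, partial_eta_squared)}.
    """
    n, a, b = len(data), len(data[0]), len(data[0][0])

    def avg(values):
        values = list(values)
        return sum(values) / len(values)

    units, lev_a, lev_b = range(n), range(a), range(b)
    G = avg(data[s][i][j] for s in units for i in lev_a for j in lev_b)
    S = [avg(data[s][i][j] for i in lev_a for j in lev_b) for s in units]
    A = [avg(data[s][i][j] for s in units for j in lev_b) for i in lev_a]
    B = [avg(data[s][i][j] for s in units for i in lev_a) for j in lev_b]
    AB = [[avg(data[s][i][j] for s in units) for j in lev_b] for i in lev_a]
    AS = [[avg(data[s][i]) for i in lev_a] for s in units]  # indexed [s][i]
    BS = [[avg(data[s][i][j] for i in lev_a) for j in lev_b] for s in units]

    # Effect sums of squares
    ss_a = n * b * sum((A[i] - G) ** 2 for i in lev_a)
    ss_b = n * a * sum((B[j] - G) ** 2 for j in lev_b)
    ss_ab = n * sum((AB[i][j] - A[i] - B[j] + G) ** 2
                    for i in lev_a for j in lev_b)
    # Error terms: interaction of each effect with the units
    ss_as = b * sum((AS[s][i] - A[i] - S[s] + G) ** 2
                    for s in units for i in lev_a)
    ss_bs = a * sum((BS[s][j] - B[j] - S[s] + G) ** 2
                    for s in units for j in lev_b)
    ss_abs = sum((data[s][i][j] - AB[i][j] - AS[s][i] - BS[s][j]
                  + A[i] + B[j] + S[s] - G) ** 2
                 for s in units for i in lev_a for j in lev_b)

    def summarize(ss_eff, ss_err, df_eff):
        df_err = df_eff * (n - 1)
        f = (ss_eff / df_eff) / (ss_err / df_err)
        return f, df_eff, df_err, ss_eff / (ss_eff + ss_err)

    return {
        "spectators": summarize(ss_a, ss_as, a - 1),
        "time": summarize(ss_b, ss_bs, b - 1),
        "spectators x time": summarize(ss_ab, ss_abs, (a - 1) * (b - 1)),
    }


# Hypothetical scores: 7 units x 2 spectator conditions x 4 timepoints
random.seed(1)
scores = [[[random.gauss(40 + 2 * j + i, 3) for j in range(4)]
           for i in range(2)] for _ in range(7)]
result = rm_anova_2way(scores)
```

With seven units, the sketch yields 1 and 6 degrees of freedom for the spectator factor and 3 and 18 for the time factor and the interaction. In practice, a statistics package would additionally supply p values and sphericity checks.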

Intraclass correlation coefficients (ICC) using a 2-way mixed-effects model (absolute agreement) and their 95% confidence intervals (95% CI) based on a mean rating (number of raters – k = 7) as well as Cronbach’s alpha (α) were used to calculate interrater reliability.
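Both reliability indices can be derived from the mean squares of a two-way ANOVA on a matrix of rated performances × raters. The sketch below uses the usual mean-squares formulas for an average-measures, absolute-agreement ICC and for Cronbach's alpha. It is an illustration rather than the SPSS routine used in the study: the function name and the toy ratings are hypothetical, confidence intervals are not computed, and rows are assumed to be the rated performances and columns the raters.

```python
def icc_and_alpha(ratings):
    """Return (ICC, alpha) for ratings[target][rater].

    ICC: absolute agreement, average of k raters.
    alpha: Cronbach's alpha with raters treated as items.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)              # between rated performances
    ms_cols = ss_cols / (k - 1)              # between raters
    ms_err = ss_err / ((n - 1) * (k - 1))    # residual

    icc = (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)
    alpha = (ms_rows - ms_err) / ms_rows
    return icc, alpha


# Toy data: the second rater always scores one point higher than the first
toy = [[1, 2], [2, 3], [3, 4]]
icc, alpha = icc_and_alpha(toy)
```

In the toy data, the two raters rank the performances identically, so alpha equals 1, whereas the constant one-point offset between them lowers the absolute-agreement ICC to 0.8. This illustrates why the two indices can differ for the same set of ratings.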

Results

The influence of spectators on dance performance

The average rating of T1–T4 is represented in Fig. 1. Average points for the performance ratings in the alone and spectator condition for each timepoint are also shown in Table 1.

Fig. 1 Influence of spectators on the carnival dance performance. Error bars = standard error of the mean (SE); max = maximum

Table 1 Points (means and standard deviations, maximum = 55) for the dance performance ratings in the alone and spectator condition for each timepoint

The Kolmogorov–Smirnov tests indicated normally distributed data for 5 out of 8 test occasions (spectators, T1: D(7) = 0.32, p = 0.026; T2: D(7) = 0.17, p = 0.200; T3: D(7) = 0.25, p = 0.200; T4: D(7) = 0.19, p = 0.200; no spectators, T1: D(7) = 0.26, p = 0.151; T2: D(7) = 0.28, p = 0.096; T3: D(7) = 0.31, p = 0.037; T4: D(7) = 0.36, p = 0.007). We decided to report the ANOVA in the current paper, since it is not very sensitive to moderate deviations from normality. Simulation studies, using a variety of non-normal distributions, have shown that the false positive rate is not affected very much by this violation of the assumption (Glass, Peckham, & Sanders, 1972; Harwell, Rubinstein, Hayes, & Olds, 1992; Lix, Keselman, & Keselman, 1996). Concerning the assumption of sphericity for the repeated measures ANOVA, Mauchly’s test indicated that the assumption had been met for timepoint (χ²(5) = 4.64, p = 0.469), and for the interaction of timepoint and spectator condition (χ²(5) = 7.58, p = 0.189).

In the ANOVA, there was a significant main effect of timepoint, F(3, 18) = 11.25, p < 0.001, ηp² = 0.65. Figure 1 shows that there was a tendency to improve performance over time (linear trend for the timepoint factor, p = 0.001), although the group performed worst at T3, as reflected by the cubic trend of the timepoint factor reaching significance as well (p = 0.002). There was a positive influence of spectators on performance, F(1, 6) = 14.84, p = 0.008, ηp² = 0.71. However, the spectator effect changed over the course of the training period, as shown by the significant interaction of timepoint and spectator condition, F(3, 18) = 3.84, p = 0.027, ηp² = 0.39. As a follow-up analysis for the significant interaction effect, paired-samples t‑tests showed that the dance performance was rated higher in front of spectators at T2, t(6) = 2.70, p = 0.036, d (repeated measures) = 1.02, but not at the other timepoints.
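The reported effect sizes can be related to the test statistics through two standard conversions: partial eta squared from an F value and its degrees of freedom, and a repeated-measures d from a paired t value and the number of units. The sketch below applies both to the values reported above. The function names are illustrative, and treating the reported d as t divided by the square root of n (with n = 7) is an assumption, not a description of how the value was actually computed.

```python
import math


def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from F and its degrees of freedom."""
    return f * df_effect / (f * df_effect + df_error)


def d_repeated(t, n):
    """Standardized mean difference for paired data, d = t / sqrt(n)."""
    return t / math.sqrt(n)


eta_time = partial_eta_squared(11.25, 3, 18)        # main effect of timepoint
eta_spectators = partial_eta_squared(14.84, 1, 6)   # main effect of spectators
eta_interaction = partial_eta_squared(3.84, 3, 18)  # interaction
d_t2 = d_repeated(2.70, 7)                          # paired t-test at T2

print(round(eta_time, 2), round(eta_spectators, 2),
      round(eta_interaction, 2), round(d_t2, 2))
# prints: 0.65 0.71 0.39 1.02
```

Under these assumptions, the recomputed values match the reported effect sizes to two decimal places.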

Table 2 presents the dance performance ratings for each rater. For the “alone” condition, the ICC between the observed variables was 0.85 (95% CI 0.50–0.99), α = 0.93. For the “spectator” condition, the ICC was 0.75 (95% CI 0.25–0.98), α = 0.84. If both conditions are considered together, the resulting ICC was 0.87 (95% CI 0.62–0.99), α = 0.93. Based on the ICC and Cronbach’s alpha, the interrater reliability between the seven judges was excellent (Cicchetti, 1994).

Table 2 Points (means and standard deviations) for the dance performance ratings in the alone and spectator condition for each rater (maximum = 55)

Discussion

The current study investigated the effects of spectators on the dance performance of a carnival dance group. As expected, the dance performance level increased over the course of the test sessions. The performance decrements in session 3 had not been predicted. It seems that despite regular training, situational factors (for example, day-to-day performance fluctuations) also played a role. The experimenter reported that subjects appeared rather nervous and irritated in session 3, possibly due to the start of the competitive season in the near future. To elucidate such influences on performance, future research with this paradigm should include self-reports of the dancers on their level of nervousness or their subjective evaluation of performance.

When planning the study, we simply dichotomized the dance choreography into not well-learned (sessions 1 and 2) and well-learned (sessions 3 and 4). In reality, however, learning curves may be nonlinear, and qualitative changes of the choreography probably also depend on the feedback that is provided by the coach over the course of training. Contrary to our predictions, there were no systematic increases in social facilitation with increasing practice of the dance choreography. Over time, the dance team may have approached a ceiling effect: After several months of practice, and immediately before the most important competition of the season (timepoint 4), room for further improvement may have been limited. In future studies, additional timepoints and multiple performance measures could be used to more precisely map the timecourse of performance changes.

Being watched by others led to performance increments in session 2. On the other test occasions, there was no systematic influence of the spectators. Social inhibition was not observed at any timepoint. As far as we know, our study is the first to assess the effects of spectators on performance in dancing. It is therefore not possible to compare our results directly with other dance studies. In gymnastics, a sport similar to dancing, Paulus and Cornelius (1974) and Paulus, Shannon, Wilson, and Boone (1972) found decrements in performance in a gymnastic routine of college students when spectators were present. Paulus and Cornelius (1974) showed that the performance decrement in a gymnastics routine in front of spectators was even stronger in more skilled individuals, and in a setting where individuals had received a prior “warning” that the upcoming trial would have to be performed in front of spectators (see also Paulus et al., 1972). The authors interpret this as support for the importance of social evaluations for spectator effects (for example, Cottrell, Wack, Sekerak, & Rittle, 1968).

One strength of the current study is that several timepoints were investigated. In addition, the study design is high in ecological validity, since a real dance choreography was used. Seven experienced raters provided information on performance quality, and the use of video-recordings enabled us to blind raters with respect to the presence or absence of an audience as well as the assessment timepoint. The raters’ evaluations were highly reliable.

Limitations lie in the fact that only one dance team was assessed. Future studies should take differences between groups into account. In addition, we did not implement a “pure” alone condition in the current study. Participants were aware that their performances were evaluated, even when no spectators were present, and the experimenter was always present. Furthermore, our sample consisted of females only. Recent research has shown that audience effects may differ between males and females (Heinrich, Müller, Stoll, & Canal-Bruland, 2021), so gender effects should be included in future research.

It should be kept in mind that carnival dancing puts high demands on the ability of group members to synchronize their movements. Performance ratings suffer strongly if individual team members fail to keep the rhythm or deviate from the choreography. The current study did not assess individual performances and their influence on the rating of the whole team. Future research may disentangle such individual contributions in a systematic manner, and also investigate whether these are influenced by personality characteristics (for example, Uziel, 2007). Furthermore, dancing may involve conditioning abilities as well (Bös & Mechling, 1983), especially if a dance routine is rather long. The current dance routine lasted for 3 and a half minutes. Dancers do not appear to show clear signs of physical exhaustion in the videotaped dances, but there may be individual differences in this respect. A systematic evaluation of the cardiovascular challenge, for example by monitoring participants’ heart rates over time, would allow for a clearer classification of dancing as a motor skill in future studies.

In the long history of research on social facilitation and inhibition effects, early approaches (for example, Zajonc, 1965) often reduced the spectator condition to the mere presence of other people, who often did not even pay attention to the performance (for reviews concerning motor tasks, see Strauss, 2002, and van Meurs et al., 2022). This has been criticized, since social evaluations of performances may be a prerequisite for audience effects (Cohen, 1980; Henchy & Glass, 1968). The current study implemented a spectator condition that included possible social evaluations, at least from the point of view of the actors. Spectators attentively watched the dancers, and also signaled some appreciation by clapping their hands after the dance ended. Future research should take a closer look at the effects of specific positive or negative audience reactions on performance (Baumeister, Hutton, & Cairns, 1990; Harb-Wu & Krumer, 2019; Wallace et al., 2005). For example, the use of prerecorded video clips of virtual audiences could be an elegant way to construct such situations, allowing for optimal experimental control.

From an applied perspective, it should be kept in mind that “typical” audience reactions differ considerably across sports. Different sports elicit different spectator behaviors like cheering, booing, crowd noise, and applause. Sometimes, spectators are not even permitted to show strong reactions (for example, in tennis), while sports like soccer often involve highly emotional and noisy spectator behavior. The current study tested competitive carnival dancers, who are used to performing in front of rather “serious” audiences while competing. In a real carnival setting, the atmosphere will be a lot more relaxed and cheerful.

Our study recruited experienced athletes in a sport that requires superior balance and coordination performances. Performing sophisticated movement skills in an aesthetic way is an integral part of carnival dancing, or of any group dance (Vicary et al., 2017). To succeed, athletes have to be able to perform in front of an audience and under competitive pressure. People who have a high tendency for choking under pressure (Mesagno & Beckmann, 2017) may drop out of such sports after a while. We argue that studies contrasting experts and novices should take such selection effects into account when interpreting their findings.