1 Introduction

Medical error is reported to be the third leading cause of death in the US [1]. Communication failures are reported to cause adverse events during surgery [2]. Communication consists of both verbal and nonverbal interactions. It has been suggested that abusive verbal communication during surgery contributes to increased incidence of errors [3]; however, verbal communication is essential in daily clinical practice, and communication becomes more difficult and challenging without verbal transmission. To reduce communication failures, it is crucial to convey focused messages in a clear and articulate manner [4].

Especially since coronavirus disease 2019 (COVID-19) emerged, aerosol-generating medical procedures such as tracheal intubation and extubation have been reported to place anesthesiologists in close proximity to the airway, potentially increasing their exposure to the virus [5]. Consequently, during such periods, many practical guidelines recommend the routine use of personal protective equipment (PPE), including an N95 mask and face shield, to protect against aerosol-spread infections [6]. In addition, the use of PPE during intraoperative care has been discussed in some guidelines [6]. It has been reported that use of a surgical mask or an N95 mask significantly impairs speech performance [7,8,9]. Furthermore, it has been suggested that an N95 mask significantly impairs speech intelligibility and perception compared to a surgical mask [8, 9]. It has been confirmed that voice signal attenuation is strongly determined by the type of layered PPE combination [10]. In addition, the presence of background noise further impairs speech intelligibility and perception [9]. It is likely that speakers unconsciously increase vocal intensity to overcome the impairment [7], and it has been reported that an N95 mask combined with a face shield increased the speech reception threshold, which is the lowest hearing level at which speech can barely be recognized or understood, by 1.5-fold [2]. These findings were obtained using explicit and unambiguous words, and it is unclear whether the same results would be obtained with ambiguous words that have multiple possible meanings. Moreover, it has been suggested that potentially ambiguous language is common in the field of surgery and has the potential to cause miscommunication [11].
In addition, several reports show that noise in the operating room decreases effective intraoperative communication [12, 13]; however, there have been no reports regarding the effects of noise emitted by operating room equipment on the speech intelligibility of receivers wearing PPE. During surgery, effective verbal interactions are crucial to reduce communication failures. Taken together, it is highly probable that the use of PPE in the operating room environment increases communication failures, although this concern has not been investigated. To avoid such situations, supplementary means of communication may be required, because communication failures may cause medical errors during surgery, potentially leading to severe consequences [2, 14].

In this study, we investigated the effects of a surgical mask and an N95 mask, combined with a face shield, on speech intelligibility of explicit and ambiguous sentences. In addition, the effects of noise specific to the operating room environment on the speech intelligibility of these sentences were also investigated.

2 Methods

This study was registered with the University Hospital Medical Information Network Clinical Trials Registry (UMIN R000050373). Considering the nature of the study, the requirement for institutional ethical approval was waived by the institutional review board of Fukushima Medical University Hospital. However, informed consent for participation was obtained from all participants. Twenty-eight healthy volunteers, who had no hearing disorders in previous biannual medical examinations, were recruited from the operating room staff for the study. Absence of hearing disorders was defined as the ability to hear < 30 dB at 1000 and 4000 Hz.

2.1 Preparation of speech discrimination test for Japanese syllables

Even among individuals with normal hearing ability, the effectiveness of hearing can vary depending on both the person and the specific circumstances. In daily life, therefore, speakers usually adjust the volume of their voice according to the situation [7]. Thus, we created a speech discrimination test for Japanese syllables to determine the optimal volume for each subject. We made 10 sets of 10 Japanese syllables. Each set consisted of a different combination of syllables. Each set was spoken by a speaker at a rate of one syllable per second and recorded using generic PC software (Windows Voice-Recorder, Microsoft Corporation, Redmond, WA, USA), with the recording PC placed 50 cm away from the speaker. All recordings, including those described in the following sections, were performed under the same conditions and by the same speaker.

The speaker was instructed to maintain a constant volume during each recording. These recordings were performed in a room in the operating center of our hospital, which was free from noise generated by operating room equipment, but had background noise that did not exceed the night-time noise limit for hospitals (40 dB) [15].

2.2 Recording of explicit and unambiguous short sentences

We made 10 sets of short sentences consisting of 2–4 semantically explicit and unambiguous words. Each set included 5 sentences. Under the same recording conditions as for the speech discrimination test, each set was recorded while the speaker wore a surgical mask (JN-M37X, JMS Co., Ltd., Hiroshima, Japan) or an N95 mask (Hi-Luck 350, KOKEN-LTD, Tokyo, Japan), combined with a face shield (MeGUARD, MITAS Inc., Fukui, Japan) (Fig. 1). A total of 20 sets of speech data were recorded.

Fig. 1

Pictures of a speaker wearing a face shield combined with a surgical mask (left) or an N95 mask (right)

2.3 Recording of ambiguous short sentences

We made 10 sets of short sentences containing two to four words with multiple possible meanings, chosen on the basis of our experience. Each set included 5 sentences. Each set was recorded in the same manner as described for the explicit sentences. A total of 20 sets of speech data were recorded. Because the sentences consisted of Japanese words, it may be difficult for non-Japanese readers to appreciate their ambiguity. A few sample sentences in English are shown for illustration, although they are not identical to the Japanese sentences: "I’ve seen this desert (dessert)", "I saw a bat (bad) in the park", and "My aunt (ant) is strong".

2.4 Recording of background noise using operating room equipment

We selected the sounds of a vacuum suction system, a fan running at normal speed, and a medical drill as representatives of 50, 60, and 70 dB noises, respectively [16]. Each sound was recorded for 3 min in the operating room of our hospital using Windows Voice-Recorder. When the noise samples were played back during the tests, their output levels were adjusted to the respective target levels using a smartphone sound level meter application (NIOSH Sound Level Meter, EA LAB, Ljubljana, Slovenia).

2.5 Speech discrimination test and adjustment of the output level

Each subject was positioned 2 m away from the PC loudspeaker and underwent the speech discrimination test for Japanese syllables (Fig. 2). Regarding the subject position, we assumed the distance between a surgeon and an anesthesiologist or operative nurse who was not part of the surgical team; 1 m may be too close and 3 m may be too far. Therefore, we employed 2 m as an appropriate distance for this study. The volume was initially set at 100 points and was lowered by 10 points each time the subject was able to answer correctly. The output level of the Windows Voice-Recorder was then set 10 points higher than the audio volume threshold recognized in the speech discrimination test. When noise was applied, a PC loudspeaker for noise reproduction was placed next to the PC loudspeaker for listening tests, which was positioned 2 m away from the subject. The output level of each noise sound was adjusted using the NIOSH Sound Level Meter. During each noise reproduction, the audio volume of the PC loudspeaker for the listening test was adjusted to the level determined in the speech discrimination test, because speakers usually adjust vocal volume depending on the situation [7]. These tests were performed in a room in the operating center of our hospital, which was exactly the same room in which the test recordings were made.
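The volume adjustment described above can be sketched as a simple descending procedure. This is only an illustrative sketch, not the study's actual protocol: the function name `determine_output_level`, the callback `answers_correctly`, the lower bound of 10 points, and the cap at 100 points are all assumptions introduced here, and the criterion for a correct answer to a syllable set is left to the callback because the text does not specify it.

```python
from typing import Callable


def determine_output_level(
    answers_correctly: Callable[[int], bool],  # hypothetical: True if the subject passes at this volume
    start: int = 100,   # initial volume in points (from the text)
    step: int = 10,     # decrement after each correct answer (from the text)
    margin: int = 10,   # offset added to the threshold (from the text)
    floor: int = 10,    # assumed lowest volume that is tested
) -> int:
    """Return the playback volume (in points) to use for one subject."""
    volume = start
    threshold = None  # lowest volume answered correctly so far
    while volume >= floor and answers_correctly(volume):
        threshold = volume
        volume -= step
    if threshold is None:
        raise ValueError("subject could not answer correctly at the starting volume")
    # Assumption: the output level cannot exceed the starting volume.
    return min(start, threshold + margin)
```

For example, a hypothetical subject who answers correctly down to 40 points would be tested at 50 points.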

Fig. 2

Experimental scene. Each subject was positioned 2 m away from the PC loudspeaker and underwent the speech discrimination test so that the output level of the Windows Voice-Recorder could be adjusted for the subject. The subject was then asked to listen to explicit and ambiguous sentences and to write down the speech. A PC loudspeaker for noise reproduction was placed next to the PC loudspeaker for listening tests. The sounds were reproduced at 50, 60, and 70 dB, respectively, at the subject’s position

2.6 Listening test in normal background noise baseline

Each subject was asked to listen to one set from each of the four combinations of speech data (explicit or ambiguous sentences, recorded with a surgical mask or an N95 mask) at the output level adjusted for the subject, and to write down the speech. The set used in each test was randomly drawn from the pooled recordings.

2.7 Listening test with noise exposure

Next, the effects of noise exposure on speech intelligibility were evaluated. The sounds of the vacuum aspiration system, fan, and medical drill were reproduced at 50, 60, and 70 dB, respectively, at the subject’s position. After the volume for the listening test was determined under each noise exposure, the listening test was performed as described above at each of the three noise exposure levels.

In total, each subject underwent 16 listening tests: eight for explicit sentences and eight for ambiguous sentences. A correct answer was defined as writing down exactly the same sentence. The set of sentences used in each test was randomly chosen by the envelope method and was used only once per subject.
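The 16 tests per subject follow from crossing two sentence types, two mask types, and four noise conditions. The sketch below enumerates these conditions and assigns sentence sets at random, as a software analogue of the envelope method. It is an illustration under assumptions: the function name `build_schedule` and the labels are hypothetical, and it assumes that each of the 10 sets per sentence type is used at most once per subject regardless of mask type.

```python
import itertools
import random

MASKS = ("surgical mask", "N95 mask")
SENTENCE_TYPES = ("explicit", "ambiguous")
NOISE_LEVELS = ("baseline", "50 dB", "60 dB", "70 dB")
SETS_PER_TYPE = 10  # 10 sets of 5 sentences per sentence type (from the text)


def build_schedule(rng: random.Random) -> list[dict]:
    """Return one subject's 16 listening tests with randomly assigned sets."""
    schedule = []
    for sentence_type in SENTENCE_TYPES:
        conditions = list(itertools.product(NOISE_LEVELS, MASKS))  # 4 x 2 = 8
        # Draw 8 of the 10 set numbers without replacement, so that no set
        # is repeated for this subject (assumed reading of "only once").
        set_ids = rng.sample(range(1, SETS_PER_TYPE + 1), k=len(conditions))
        for (noise, mask), set_id in zip(conditions, set_ids):
            schedule.append(
                {
                    "sentence_type": sentence_type,
                    "noise": noise,
                    "mask": mask,
                    "set_id": set_id,
                }
            )
    return schedule
```

Calling `build_schedule(random.Random(seed))` with a different seed for each subject would give each subject an independent assignment.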

2.8 Sample size and statistical analysis

For the purpose of sample size calculation, we hypothesized that the percentages of correct answers for speech pronounced with a surgical mask and an N95 mask were 80% and 40%, respectively, and that the common standard deviation of the accuracy rate for ambiguous sentences was 20%. We estimated that six subjects would be required to provide a power of 0.95 with a type I error probability of 0.05. Considering multiple comparisons in normal background noise and at three noise exposure levels, we calculated the sample size to be 24 (6 × 4) subjects. Finally, we recruited 28 volunteers to allow for deviation from these estimates. Age is presented as the median and interquartile range. Sex is presented as the number of participants. The numbers of correct answers are presented as boxplots. The Wilcoxon signed-rank test was used for comparisons of numbers of correct answers. For multiple comparisons, more conservative probability values calculated using the Holm correction were used to determine statistical significance. When corrected p values exceeded 1, the Dunn-Sidak correction was used. Analyses were computed using R version 4.0.4 (R Foundation for Statistical Computing, Vienna, Austria). A value of p < 0.05 was considered statistically significant. In addition, the absolute effect size r, which can be interpreted as a classical Pearson’s correlation coefficient, was calculated for each comparison. By conventional, somewhat arbitrary criteria for Pearson’s r, an r of 0.1 is considered small, 0.3 medium, and 0.5 large [17, 18].
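As an illustration of the two quantities named above, the following sketch implements the standard Holm step-down adjustment and one common definition of the effect size for a Wilcoxon test, r = |Z| / sqrt(N). The analyses in this study were run in R, so this Python code is not the authors' implementation. The formula for r and the choice of N are assumptions, since the text does not state how r was computed, and the function names are hypothetical.

```python
import math


def holm_adjust(p_values: list[float]) -> list[float]:
    """Holm step-down adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p value is multiplied by m, the next by m - 1, and so on;
        # values are capped at 1 and forced to be non-decreasing.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def effect_size_r(z: float, n: int) -> float:
    """Absolute effect size r = |Z| / sqrt(N) (assumed definition)."""
    return abs(z) / math.sqrt(n)


def label_effect(r: float) -> str:
    """Label r using the 0.1 / 0.3 / 0.5 thresholds cited in the text."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"
```

A comparison would then be reported with its Holm-adjusted p value alongside `label_effect(effect_size_r(z, n))`, where `z` is the standardized test statistic of the Wilcoxon signed-rank test.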

3 Results

Sixteen males and 12 females, with a median age of 28 years (interquartile range 26.8–33.5), participated in this study.

The numbers of correct answers for the test sentences are presented in Fig. 3. For explicit sentences, speech intelligibility was significantly lower for sentences produced with an N95 mask than for those produced with a surgical mask in 60 dB background noise. The effect size for the difference between a surgical mask and an N95 mask was slightly greater than medium in the presence of 50 and 60 dB noise. For explicit sentences pronounced with a surgical mask, intelligibility was significantly decreased only in the presence of 70 dB noise, with an effect size close to large. For explicit sentences pronounced with an N95 mask, intelligibility was significantly decreased in the presence of 60 and 70 dB noise, with effect sizes greater than medium.

Fig. 3

The number of correct answers (sentences) in listening tests of explicit (left) and ambiguous (right) sentences. The gray and white box plots represent the results of listening tests for sentences produced with a surgical mask and an N95 mask, respectively. The sounds of a vacuum aspiration system, a fan running at normal speed, and a medical drill were used as representatives of 50, 60, and 70 dB, respectively, in the operating room setting. "p" denotes the p value of each comparison. "r" denotes the effect size, indicating the practical magnitude of the difference between the compared conditions

With regard to ambiguous sentences, intelligibility of speech produced with an N95 mask was not significantly lower than that with a surgical mask at any noise level. The effect size for the difference between a surgical mask and an N95 mask was greater than medium only in the presence of 70 dB noise. For ambiguous sentences pronounced with a surgical mask, intelligibility was not significantly decreased at any noise level; however, the effect size was greater than medium in the presence of 60 and 70 dB noise. For ambiguous sentences pronounced with an N95 mask, intelligibility was significantly decreased in the presence of 60 and 70 dB noise, and the effect size was medium to large at all noise levels.

Overall, speech intelligibility of explicit sentences was significantly higher than that of ambiguous sentences, regardless of mask type and noise level, and the effect size was close to or greater than large under every condition. The statistical values for these comparisons are not shown, to avoid cluttering the figure.

4 Discussion

The present study showed no significant difference in intelligibility of recorded speech produced with an N95 mask or a surgical mask, combined with a face shield, in normal background noise. In addition, speech intelligibility was affected more greatly by the presence of background noise and language ambiguity than by mask type. However, it is worth noting that intelligibility of speech produced with an N95 mask was sometimes reduced in the presence of noise exposure and language ambiguity. It should be recognized that speech produced with an N95 mask can be difficult to understand in the noise of the operating room setting, even if it is explicit and unambiguous.

As described previously, a surgical mask and an N95 mask significantly impair speech performance [7,8,9]. Furthermore, an N95 mask significantly impairs speech intelligibility and perception compared to a surgical mask [8, 9]. In the present study, intelligibility was not significantly affected by mask type under the baseline condition and 50 dB noise exposure, even when the speech consisted of explicit words. These results were contrary to our expectations. A likely reason is that the baseline noise level in the quiet operating room we used did not differ much from the 50 dB noise exposure. In addition, we used a series of incoherent sentences without medical content because comprehension of sentences can be affected by sequential coherence in a sentence series [19]. Therefore, it is possible that the subjects occasionally misunderstood the content due to the sequential incoherence, even though each sentence was explicit, which might have masked the effect of the N95 mask on speech intelligibility. However, the negative effect of the N95 mask on speech intelligibility of explicit sentences was enhanced in the presence of 60 dB noise. Sound attenuation is more prominent with an N95 mask than with a surgical mask [20], which might cause this difference in speech intelligibility between the N95 mask and the surgical mask. Regarding the presence of background noise, 70 dB may have been too loud for a difference between mask types to be detected. In the listening test of the present study, the volume was adjusted for each subject depending on the noise level. Nevertheless, the higher the noise level, the lower the speech intelligibility. This finding suggests that eliminating ambiguity alone does not overcome the impairment of speech intelligibility caused by noise exposure.

For ambiguous sentences, the number of correct answers was significantly lower than for explicit sentences, regardless of mask type or noise exposure. This suggests that the ambiguous sentences we used were appropriate for the purpose of the study. Intelligibility of speech produced with an N95 mask and with a surgical mask was significantly reduced to the same extent, and degraded as the noise level increased. In speech perception, the visual aspects of speech are as important as the auditory aspects [21]. The listening test in the present study used recorded speech data, which may have further impaired speech intelligibility of ambiguous sentences due to the lack of visual information. In actual settings, intelligibility may improve due to visual information from the surroundings, even if the speech contains ambiguous words and is produced with some kind of mask. The specific conditions in this study might have masked the difference in speech intelligibility between the mask types. However, we need to keep in mind that, on top of these already unfavorable conditions, the N95 mask further degraded the intelligibility of ambiguous sentences in the highest noise condition. In addition, regardless of mask type and presence of noise, the median accuracy rate for ambiguous sentences was less than 50%. This result suggests that use of ambiguous language can be a main cause of communication failures, resulting in medical errors. To improve the safety of medical services, it is crucial to avoid ambiguous language when speaking, and to restructure or summarize the content for clarity when necessary [22].

Our methodology resembles the Revised Speech Perception in Noise (SPIN-R) test, which is standard in audiology research and similarly uses predictable versus unpredictable words. The SPIN-R test has traditionally used a signal-to-noise ratio (SNR, S/N) of 8 dB [23]. In this study, the sound level at 100 volume points, measured 2 m away from the PC loudspeaker, was 70–80 dB, varying with the words. However, we adjusted the audio volume of the PC loudspeaker for the listening test to the level determined in the speech discrimination test and did not measure dB levels at that time. Therefore, the SNR was not constant during the study and varied among the examinees.
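To make the SNR point concrete, the sketch below computes the SNR in dB as the difference between the speech level and the noise level, using only the figures stated in this paper: 70–80 dB for speech at 100 volume points and 50, 60, and 70 dB for the noise conditions. It is a rough illustration, not a measurement. The names are hypothetical, and the resulting ranges apply only to playback at 100 points; because the actual playback volume was set per subject and not measured in dB, the true SNRs in the tests are unknown and may well have been lower.

```python
def snr_db(speech_db: float, noise_db: float) -> float:
    """SNR in dB, as the difference between speech and noise levels."""
    return speech_db - noise_db


# Speech level at 100 volume points, measured 2 m from the loudspeaker (from the text).
SPEECH_RANGE_DB = (70.0, 80.0)
# Target noise levels at the subject's position (from the text).
NOISE_LEVELS_DB = (50.0, 60.0, 70.0)


def snr_range_at_full_volume() -> dict[float, tuple[float, float]]:
    """Map each noise level to the (min, max) SNR if playback were at 100 points."""
    low, high = SPEECH_RANGE_DB
    return {
        noise: (snr_db(low, noise), snr_db(high, noise))
        for noise in NOISE_LEVELS_DB
    }
```

Under these assumptions, playback at 100 points against 70 dB noise would give an SNR between 0 and 10 dB, a range that includes the 8 dB traditionally used in the SPIN-R test.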

There are several limitations to the current study. The first limitation is that we did not evaluate intelligibility of speech produced without face masks or face shields. In the operating room setting, wearing a surgical mask, and a face shield if there is a risk of blood exposure from the operative field, has been standard since before the COVID-19 pandemic. Therefore, we did not consider situations where neither masks nor face shields are worn during surgery. The second limitation is that it remains unclear whether the explicit and ambiguous sentences we used were appropriate and reasonable. The explicit sentences we used might have had a certain ambiguity at the lexical level, which, in addition to the incoherence of the test sentences, might have affected the number of correct answers. It is also unclear whether the ambiguity of the sentences we used was sufficient to yield satisfactory results. It is very difficult to evaluate the degree of ambiguity. Considering the difference in the number of correct answers between the explicit and ambiguous sentences, we believe that the ambiguous sentences were sufficiently ambiguous. In addition, we recorded the sounds generated by the surgical equipment to simulate the noise in the operating room. However, these were recordings rather than live sounds, so the fidelity of the sound played back may be limited. Regarding the playback of speech, we aimed to provide equal conditions to all examinees. However, recorded speech may be unnatural as a representation of oral communication in the operating room. Even considering these drawbacks, we think that this simulation study can help us to understand failures of verbal communication. Furthermore, the sentences we used in this study did not contain medical content, which is another limitation. Lastly, we conducted this study in Japanese. Therefore, care is needed in judging whether these results would be obtained in other languages. However, every language has both explicit and ambiguous sentences of its own. Thus, we believe that our results could be useful for understanding verbal communication in a specific situation.

5 Conclusions

Intelligibility of speech in Japanese in normal background noise was impaired with a surgical mask and an N95 mask, combined with a face shield, even when the speech consisted of explicit words. The impairment of speech intelligibility was significantly exacerbated by ambiguous language use and by noise exposure specific to the operating room environment, regardless of mask type. It is of note that speech intelligibility was occasionally further reduced with the use of an N95 mask. To improve verbal communication in the operating room environment, it is important to avoid ambiguous language and to consider complementary means of communication to reduce ambiguity and enhance intelligibility. In addition, the study results may be affected by other factors, such as the locations of the speaker and the receiver or the PPE combination. Further studies are required to address these questions.