1 Introduction

The coronavirus disease 2019 pandemic has diversified work styles and workplaces, increasing opportunities for office workers to work outside traditional office spaces. The environmental characteristics of a home or commercial space differ from those of conventional office spaces, yet the goal in such environments is to achieve productivity equal to or greater than that of conventional workplaces. Numerous studies have examined the impacts of environmental factors on intellectual productivity [1], including the effects of the sound [2], visual [3], air quality [4], and thermal environments [5]. For the sound environment, the influence of various sound sources has been considered, but the cases examined have generally involved environmental sounds with relatively high sound pressure levels (SPLs), such as broadband noise [6], road traffic noise [7, 8], and conversation sounds [9]. Broadband noise with a relatively high SPL of around 75–95 dB has, as expected, been shown to have a large impact [6], whereas even traffic noise with a moderate SPL has been shown to affect cognitive performance [7, 8]. Asakura and Tsujimura [10] found that household sounds, which have much lower SPLs than road traffic noise, can affect cognitive performance. They also reported that the upper acceptable level of annoyance for these sounds may lie at SPLs around 40 dB [10], depending on the noise sensitivity of individuals. The effects of low-level sounds (i.e., 34–45 dB) on intellectual tasks have also been studied. Tamura reported that when sounds with low SPLs are perceived as noise instead of unavoidable phenomena, a given task can be quantitatively maintained, but qualitative performance tends to be reduced [11]. On the other hand, no case studies have examined the effects of environmental noise with an SPL < 30 dB on human productivity. However, it has been noted that various sounds generated in an office environment can affect work, even if none of the sounds has a significant SPL [12].

In the present study, an experimental investigation was conducted to examine the psychological and physiological effects of sound environmental stimuli with a low SPL on the performance of intellectual tasks. The psychological factors related to the degree of disturbance, concentration, and stress, as measured by subjective evaluation, were compared with the degree of physiological stress, as measured using salivary alpha-amylase (sAA) activity.

2 Methods

2.1 Experimental Procedure

The experimental setup and flow are shown in Fig. 1. The experiment was conducted in a soundproof room to avoid the contamination of low-level sounds via exterior noise. The study participants sat at point P inside the room, as shown in Fig. 1c.

Fig. 1

The experimental flow: a the overall flow of the experiment, including the six auditory conditions, and the partial flow of each auditory condition; b the experimental setup showing a participant inside the soundproof room; c the dimensions of the soundproof room; d-1 the measurement situation for each auditory stimulus; d-2, d-3 the spatial relationships between the sound source and the HATS recording device; and e a screen showing the task calculations in the software [14] using the Uchida–Kraepelin (UK) method

In this experiment, four types of artificial fluctuating sounds and two types of static sounds with equivalent continuous A-weighted SPLs of 24–29 dB were used as acoustic stimuli. The participants, while exposed to these sounds through headphones (ATH-W1000Z; Audio-Technica, Tokyo, Japan), performed a 5-min intellectual task based on the Uchida–Kraepelin (UK) psychodiagnostic test [13] under each experimental condition using Kraepelin Training free software [14] running on a smartphone (iPhone 13; Apple, Cupertino, CA, USA).

For sound playback, the headphones were connected to a laptop PC via an audio interface (UR22mkII; YAMAHA Steinberg, Tokyo, Japan). After the task was completed, the participant rated the degrees of disturbance, concentration, and stress perceived during the task. Saliva was collected and sAA was analyzed before and after the intellectual work, as shown in Fig. 1a. These experimental procedures were carried out for all six auditory stimuli. A 5-min break was provided between experimental conditions. The order in which the six types of sounds were reproduced was randomized for each participant to reduce order effects. Before the experiment, the participants were given sufficient practice with the intellectual task on the smartphone. They were then instructed on the situation to be assumed during the experiment, as follows. The participants were instructed to carry out the intellectual work carefully, as though it were actual work rather than a hobby. For the auditory stimuli, artificial sounds were generated at a distance of 1 m to the left of the participants. No specific instructions were given as to the location in which they were supposed to be working.
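The per-participant randomization of the stimulus order could be scripted as in the following minimal sketch. The stimulus labels follow Sect. 2.3 (written here with ASCII hyphens); the function name `presentation_orders` and the fixed seed are illustrative assumptions and do not describe the authors' actual procedure.

```python
import random

# Stimulus labels from Sect. 2.3 (ASCII hyphens used for convenience).
STIMULI = ["S-SI", "S-PN", "S-PB", "S-TC", "S-TN", "S-CC"]

def presentation_orders(n_participants, seed=0):
    """Return one independently shuffled stimulus order per participant.

    The seed is a hypothetical choice that only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    # sample() with k equal to the list length yields a random permutation.
    return [rng.sample(STIMULI, k=len(STIMULI)) for _ in range(n_participants)]

orders = presentation_orders(20)  # 20 participants, as in Sect. 2.2
```

Simple shuffling does not guarantee that every stimulus appears equally often in every position; a Latin-square design would be one alternative if strict counterbalancing were required.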

2.2 Participants

The study participants were 20 healthy young male volunteers (mean age ± standard deviation [SD]: 21.7 ± 1.3 years, age range: 20–23 years). Only men were recruited because of the known existence of a gender difference in psychological responses to noise [15]. In accordance with EN 50332-1 and EN 50332-2 proposed by the European Committee for Electrotechnical Standardization [16, 17] as sound pressure regulations for portable audio players and the university’s ethical guidelines, this experimental study was designed to be noninvasive. Informed consent was obtained from all participants. The subjects were advised in advance to get adequate sleep the day before to avoid sleep deprivation on the day of the experiment. The experimental collaborators were briefed on the study purpose and methods, as well as the anonymization and use of data. Furthermore, prior to the study, all participants were asked about their hearing ability, and all assured that their hearing was normal. The participants were also asked to confirm that they were not in a state of hypersensitivity to sound.

2.3 Acoustic Stimuli

In this experiment, six types of sounds were used as auditory stimuli: writing with a ballpoint pen on paper (S-PB), typing and clicking (S-TC) to evoke working sounds in the office, TV news (S-TN), chewing potato chips (S-CC) to evoke the sound of breaks between office work, pink noise without meaning (S-PN), and silence (S–SI). Previous studies have investigated the effects of background noise, including chewing sounds, on the learning efficiency of individuals with highly sensitive misophonia [18], finding that learning efficiency decreased in the presence of gum chewing sounds and increased in a quiet environment [19]. The most typical evoked sounds in misophonia are chewing [20, 21] and repetitive tapping noises, such as pen clicking [21]. On the other hand, as the prevalence of misophonia has been estimated to be around 20% of the population [22], it is possible that chewing sounds may increase annoyance to a small extent, even in relatively healthy individuals. Nevertheless, many evaluations of the quality of crispy sounds have been conducted in recent years, and some aspects of these sounds have been accepted positively in terms of improved texture [23]. As described above, chewing sounds include various aspects of context; for example, in a place shared with other people, a situation in which people may be working in close proximity to those who are eating and drinking can also be assumed. Thus, S-CC was also included as a test sound, as it may have the potential to increase or reduce annoyance.

The auditory stimuli were recorded using a head and torso simulator (HATS) system (type 8328A; ACO, Tokyo, Japan) in the soundproof room, where reverberation was suppressed by sound-absorbing materials (Fig. 1d). Assuming an actual office environment, the ear of the HATS dummy head was set at a height of 1.2 m, taken as the ear height of a seated person, and the HATS was placed 1 m directly to the right of the person generating the sound (Fig. 1d-2, d-3). The equivalent continuous A-weighted SPLs of the recorded 5-min auditory stimuli are shown in Table 1. The background noise inside the room was around 20 dB, indicated as S–SI. The other auditory stimuli were therefore measured so that the signal-to-noise (SN) ratio was > 6 dB; for (b), the SN ratio was about 5 dB, because the SPL at 1 m was lower than that of the other sounds. The acoustic output level of the audio interface was adjusted so that the recorded auditory stimuli played back through the headphones reproduced the SPLs shown in Table 1. The frequency characteristics of each stimulus reproduced at the SPLs in Table 1 are shown in Fig. 2. As the figure indicates, S-CC has relatively strong high-frequency components, whereas the other sounds are generally dominated by lower frequencies.
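As a hedged illustration of the two quantities used above, the sketch below computes an equivalent continuous level by energy-averaging short-term levels and expresses the SN ratio as the difference between a stimulus level and the background level in dB. The input levels are hypothetical, A-weighting is assumed to have been applied already, and the function names are not taken from the study.

```python
import math

def leq(levels_db):
    """Equivalent continuous level: 10*log10 of the mean energy of short-term levels."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)

def sn_ratio(signal_db, noise_db):
    """SN ratio in dB, taken here as the level difference (an assumption)."""
    return signal_db - noise_db

# Hypothetical short-term A-weighted levels (dB) of one recorded stimulus.
stimulus_leq = leq([24.0, 27.0, 30.0])
# Background level of about 20 dB, as reported for the soundproof room.
margin = sn_ratio(stimulus_leq, 20.0)
```

Because the average is taken over energy rather than over decibel values, brief louder events dominate the result, which is why a fluctuating sound can have a higher equivalent level than its typical instantaneous level suggests.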

Table 1 SPLs of the auditory stimuli: silence (S–SI), pink noise (S-PN), writing with a ballpoint pen on paper (S-PB), typing and clicking (S-TC), TV news (S-TN), and chewing (S-CC).
Fig. 2

Frequency characteristics of each of the acoustic stimuli

2.4 Intellectual Tasks

The participants were presented with a calculation task based on the UK method [13]. This method is used to ascertain the effects not only of auditory stimulation on the performance of intellectual tasks, but also of moderate mental stress on the participants and how this stress is affected by the auditory stimulation. It has been used for these purposes in many previous studies [24, 25]. The participants were instructed to add single-digit numbers continuously using the Kraepelin Training software [14] on the smartphone (Fig. 1e), working as quickly and accurately as possible. The participants began the calculations when a start cue was given, and a cue for the end of the test was given after 5 min. The test performance of each participant was assessed in terms of the number of correct answers. While the number of errors was assumed to reflect attentional control [13, 26], the numbers of errors of all the participants were very small relative to the total number of answers (< 1%). One of the reasons for this was that the overall duration of the test was relatively short (5 min). In a previous study [27], the error rate for the UK test was also < 1.0%, even under longer observation. Therefore, the number of errors was not considered in this study.
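The scoring described above can be sketched as follows. This assumes the convention of the paper-based UK test, in which the answer to each pair of adjacent digits is the ones digit of their sum; the exact rule used by the smartphone software is not stated in the text, and the function name and data are hypothetical.

```python
def score_uk(digit_pairs, answers):
    """Return (number of correct answers, error rate) for one task run.

    Assumes the expected answer is the ones digit of each pair's sum.
    """
    correct = sum(
        1 for (a, b), ans in zip(digit_pairs, answers) if (a + b) % 10 == ans
    )
    attempted = len(answers)
    error_rate = (attempted - correct) / attempted if attempted else 0.0
    return correct, error_rate

# Hypothetical responses: the third answer is wrong (9 + 9 = 18, ones digit 8).
n_correct, err = score_uk([(3, 4), (8, 7), (9, 9)], [7, 5, 7])
```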

2.5 Physiological Measurements

In this experiment, physiological responses to mental stress were inferred from saliva. Previous studies have assessed the degree of the stress response using salivary cortisol [28], amylase [29], or chromogranin A [30]. A comparison of these three methods confirmed that their results are consistent, although gender differences in the statistical trends have been observed [31]. Regarding gender differences, it was originally noted that stress responses differed between men and women [32], and several studies have found a stronger salivary cortisol response in men than in women in reaction to stressors [33]. On the other hand, Maruyama et al. [34] showed that sAA levels displayed a rapid increase and recovery, returning to baseline levels 20 min after a stressor, whereas salivary cortisol responses showed a delayed increase that remained significantly elevated from baseline levels 20 min after the stress challenge. Their analyses revealed no gender differences with regard to the sAA response, but did find significantly higher salivary cortisol responses in females. Furthermore, they indicated that younger subjects tended to display higher sAA activity. Therefore, sAA was adopted in the present study, which targeted a young age group and was conducted in men only in view of these gender differences. The sAA levels were measured using a portable monitor (DM-3.1; NIPRO, Osaka, Japan). The participants were carefully instructed about the saliva sampling procedure, as described in the manufacturer’s manual (NIPRO). Next, they collected their own saliva by placing a filter paper under their tongue for 30 s, and the sAA concentration was measured with the monitor. The analysis was initiated by promptly inserting the saliva chip into the unit, and after about 60 s, the sAA was displayed in units of KU/L. This procedure followed the methods used by Yamaguchi et al. [35].

2.6 Psychological Measurements

The degree of sound disturbance was adopted as an evaluation item with reference to a study on sound disturbance during intellectual work using electroencephalography (EEG) signals [36]. The psychological state of the participants who were exposed to the auditory stimuli was then evaluated based on three indicators: the degrees of concentration, stress, and disturbance. The participants were instructed to rate the three assessment items based on their immediate recollection and not to revise their ratings after subsequent reconsideration. They were also instructed to respond within 1 min, which is the time required for the sAA measurements. A 10-point Likert-type scale was used to evaluate the three assessment items, with responses from 1 = “not disturbed at all” to 10 = “strongly disturbed” for the degree of disturbance, from 1 = “high level of concentration” to 10 = “no concentration at all” for the degree of concentration, and from 1 = “very stressed” to 10 = “not stressed at all” for the degree of stress.

2.7 Statistical Analyses

A multi-way analysis of variance (ANOVA) was conducted to investigate the effects of each auditory factor on the psychological and physiological indicators. If the ANOVA results showed significant main effects, the differences in mean values between groups were evaluated in detail through multiple comparisons with Bonferroni correction. Separately, differences in sAA before and after the auditory stimulation were assessed using a paired t-test. All statistical analyses in the present study were performed using BellCurve for Excel (Social Survey Research Information, Tokyo, Japan), with p values < 0.05 considered significant.
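The two simpler ingredients of this analysis can be illustrated with a minimal standard-library sketch: the paired t statistic for pre/post measurements and the Bonferroni-adjusted significance threshold for all pairwise comparisons among six stimuli. This is not the BellCurve for Excel implementation; the function names and the data are hypothetical, and whether the software adjusts the threshold or the p values is not stated in the text.

```python
import math
from itertools import combinations
from statistics import mean, stdev

def paired_t(pre, post):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def bonferroni_alpha(n_groups, alpha=0.05):
    """Per-comparison threshold when all pairs of groups are compared."""
    n_pairs = len(list(combinations(range(n_groups), 2)))
    return alpha / n_pairs

# Hypothetical pre/post values for four participants.
t_stat, df = paired_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
# Six stimuli give 15 pairwise comparisons.
alpha_pair = bonferroni_alpha(6)
```

The p value for `t_stat` would be obtained from Student's t distribution with `df` degrees of freedom, which requires a statistics package beyond the standard library.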

3 Results

The mean numbers and SDs of correct responses obtained for each of the auditory stimuli are shown in Fig. 3. The results of the one-way ANOVA with the auditory stimulus as a factor showed no significant difference for the main effect of the auditory stimulus. This result is similar to that reported in a previous study on the degree of disturbance during intellectual work using EEG signals [36], where no significant difference was found in the main effect of the auditory stimulus when pure tones were used as the auditory stimulus. In any case, the effect of the auditory stimulus on work was small, indicating that it was not reflected in the workload. This was attributed to the simplicity of the experimental tasks, which were easy to carry out even if the participants felt disturbed by the sound or stressed by the task.

Fig. 3

Mean and standard deviation for the numbers of correct answers (error bars show 95% confidence intervals) for each auditory stimulus

Next, the mean subjective scores and SDs for the degrees of disturbance, concentration, and stress under each auditory stimulus are presented in Fig. 4a–c, respectively. One-way ANOVAs with the auditory stimulus as a factor showed significant main effects of the auditory stimulus (p < 0.01) on the degrees of disturbance, concentration, and stress. Multiple comparisons between the auditory stimuli showed significant differences for the pairs marked in each figure. In terms of the degree of disturbance, S-TN and S-CC showed significantly higher disturbance than did the other four auditory stimuli. In terms of the degree of concentration, S-TN and S-CC showed significantly lower concentration than did S–SI. With regard to the degree of stress, the S-TN and S-CC auditory conditions showed significantly higher stress than did S–SI and S-PB, respectively. No significant differences were found between S–SI and S-PB in any of the categories.

Fig. 4

Mean and standard deviation of the subjective scores (error bars show 95% confidence intervals) for each auditory stimulus: a degree of disturbance, b degree of concentration, and c degree of stress. *p < 0.05, **p < 0.01

Next, the physiological results are presented. The baseline sAA values may have differed among the subjects owing to stress and other factors caused by the experimental environment, which differed from their usual surroundings. Therefore, the pre- and post-work sAAs and the difference between them (ΔsAA) are shown in Fig. 5a and b, respectively. First, paired t-tests of the sAAs measured pre- and post-work showed significant differences (p < 0.01) only for S-PN and S-TN. Next, a one-way ANOVA on ΔsAA with the auditory stimulus as a factor showed a significant main effect of the auditory stimulus (p < 0.05). In the multiple comparisons, a significant difference (p < 0.01) was found only between S-TN and S-CC, as shown in Fig. 5b.
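The ΔsAA values and the 95% confidence intervals used for the error bars could be computed as in the following sketch. It assumes that ΔsAA is the post-work value minus the pre-work value and that the interval is based on Student's t distribution; the critical value 2.093 applies to 19 degrees of freedom (20 participants), and all sAA values are hypothetical.

```python
import math
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 19 degrees of freedom.
T_CRIT_DF19 = 2.093

def delta_saa(pre, post):
    """Per-participant change in sAA (post-work minus pre-work, assumed sign)."""
    return [b - a for a, b in zip(pre, post)]

def mean_ci(values, t_crit=T_CRIT_DF19):
    """Return (mean, lower bound, upper bound) of the t-based confidence interval."""
    m = mean(values)
    half_width = t_crit * stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width

# Hypothetical sAA values (KU/L) for 20 participants under one condition.
pre = [10.0] * 20
post = [12.0, 14.0] * 10
m, lower, upper = mean_ci(delta_saa(pre, post))
```

Working with the within-participant difference removes the between-participant variation in baseline sAA noted above, which is the reason for reporting ΔsAA alongside the raw values.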

Fig. 5

Mean and standard deviation for sAA scores (error bars show 95% confidence intervals) for each auditory stimulus: a pre- and post-work sAAs and b ΔsAA, which is the difference between the pre- and post-work sAAs. *p < 0.05, **p < 0.01

4 Discussion

The correlations among the subjective scores are shown in Fig. 6. As shown in Fig. 6a, the results of the present subjective evaluation experiment indicate that concentration decreased as the degree of disturbance increased. This is consistent with the finding of a previous study that the degree of concentration is strongly related to the degree of disturbance [36]. It is also a natural consequence that the degree of stress increased, as observed in Fig. 6b, because increased disturbance and reduced concentration can be obstacles to task performance. It should be noted that the S-TN and S-CC conditions showed almost the same values for each of the subjective indicators.
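A relationship of this kind can be quantified with Pearson's correlation coefficient, sketched below using only the standard library. The text does not state which coefficient or unit of analysis (condition means or individual ratings) underlies Fig. 6, so the choice of Pearson's r and the six hypothetical condition means are assumptions for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical mean scores for six conditions (not the study's data).
disturbance = [2.0, 3.0, 3.5, 4.0, 6.5, 6.0]
concentration = [3.0, 3.5, 4.0, 4.5, 6.0, 6.5]
r = pearson_r(disturbance, concentration)
```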

Fig. 6

Correlations between the subjective scores: a concentration and disturbance and b concentration and stress

Next, regarding the physiological results in Fig. 5b, only the S-TN auditory stimulus showed a significantly higher ΔsAA, whereas S-CC, which showed subjective ratings similar to those of S-TN, showed the lowest value. The relationships between ΔsAA and the subjective scores of (a) disturbance, (b) concentration, and (c) stress are presented in Fig. 7. While ΔsAA increased to some degree with the subjective impression under all conditions except S-CC, the values of ΔsAA differed significantly between S-CC and S-TN despite their almost identical subjective impression scores. This trend appears to be particularly pronounced in the relationship between stress and ΔsAA in Fig. 7c. First, as S-TN is a meaningful noise containing speech, the sense of work disturbance was markedly increased, and the resultant stress may have increased the sAA. However, the degree of stress caused by S-PB and S-TC, which may also increase annoyance, was relatively moderate. On the other hand, S-CC resulted in the same level of stress as S-TN, but in contrast to S-TN, post-work sAA was the most reduced among all conditions. In other words, the sound of chewing crispy chips was subjectively disturbing and stressful during work, but it did not cause stress physiologically. There have been several cases where differences in stress states have been predominantly detected in sAA-based analysis results [29, 34]; therefore, the degree of physiological stress indicated in the present study also seems plausible. Although the participants were instructed to assume they were engaged in office work, it is possible that such chewing sounds were unconsciously recognized as sounds that occur in a more casual setting than an office, suggesting that the participants may consequently have performed the intellectual work in a state of physiological relaxation. However, the participants may have experienced an increase in subjective stress because of the common knowledge that chewing sounds are an unfavorably received human-made sound that may increase annoyance for a certain proportion of people. This also suggests the importance of examining in detail the effects of sounds on people, even sounds generally considered to be unpleasant, by comparing not only subjective impression ratings but also physiological responses.

Fig. 7

Correlations between ΔsAA and the subjective scores of a disturbance, b concentration, and c stress

4.1 Limitations

The present study has some limitations. First, sample size was calculated a priori, and the moderate sample size of 20 subjects somewhat restricted opportunities to examine differences between each of the physiological conditions. Specifically, a sample size of 20 participants gave post hoc powers of 35, 79, 36, 26, 75, and 27% to detect differences between the pre- and post-work sAA under the S–SI, S-PN, S-PB, S-TC, S-TN, and S-CC acoustic conditions, respectively, using a two-group t-test with a two-sided significance level of 0.05.
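To give a rough sense of how such power values relate to effect size and sample size, the sketch below uses a normal approximation for a two-sided paired comparison. It is not the calculation performed in the study: the exact method would use the noncentral t distribution, the standardized effect size `d` (mean difference divided by the SD of the differences) is hypothetical, and the function name is illustrative.

```python
from statistics import NormalDist

def approx_power_paired(d, n, alpha=0.05):
    """Normal-approximation power of a two-sided paired test.

    d is the standardized mean difference and n the number of pairs.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * n ** 0.5
    # Probability of exceeding the critical value in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical medium effect with 20 participants.
power_medium = approx_power_paired(0.5, 20)
```

Under this approximation the power rises with both the effect size and the square root of the sample size, which is consistent with the higher power reported for the conditions that showed significant pre/post differences.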

Second, intellectual tasks were performed using artificial and non-artificial sounds reproduced at low SPLs; however, none of the auditory conditions had any effect on work performance. Therefore, whether the reproduction of artificial sounds at very low SPLs can significantly change the work efficiency of simple tasks remains unclear.

Third, to examine whether the intellectual tasks used in this study had a learning effect on the participants, the number of correct answers was calculated for each minute of the overall 5-min task, and the main effect of the order of task execution was tested by ANOVA. However, no main effect of the task order on the number of correct answers was found. Therefore, it can be said that no particular learning effect was observed in the current 5-min experiment. Nevertheless, it cannot be said that a learning effect does not occur when the task is actually carried out for a longer period of time. Thus, it is necessary in the future to examine the effect of longer task durations.
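The per-minute counts used in this check could be derived from the time stamps of correct answers as in the following sketch. The availability of time stamps from the task software, the function name, and the example data are assumptions for illustration.

```python
def correct_per_minute(timestamps_s, duration_s=300, bin_s=60):
    """Count correct answers in consecutive one-minute bins of a 5-min task.

    timestamps_s holds the times (in seconds from the start cue) at which
    correct answers were entered; times outside the task are ignored.
    """
    n_bins = duration_s // bin_s
    counts = [0] * n_bins
    for t in timestamps_s:
        if 0 <= t < duration_s:
            counts[int(t // bin_s)] += 1
    return counts

# Hypothetical time stamps; the last one falls outside the 5-min window.
counts = correct_per_minute([0.5, 59.9, 60.0, 299.9, 300.0])
```

A steady increase across the five bins would point to a learning effect within the task, whereas roughly constant counts would agree with the absence of such an effect reported above.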

In future research, the effects of auditory stimuli on work efficiency should be investigated by controlling the type, duration, and difficulty of tasks and by increasing the number of sound types and physiological indicators. In doing so, it would be useful to examine the degree of stress in more detail by combining other physiological indices such as EEG signals [37] and heart rate [38]. In addition, the present results are limited to men in their 20s. It may be possible to draw more general conclusions by investigating a wider range of participants, including women and older people.