1 Introduction

Traditional films are designed to be perceived through two senses: sight and hearing [1, 2]. Technological developments in recent years have allowed audiovisual experiences to become multisensory ones, referred to as 4D cinema in some works (Cardini, 2011; [3, 4, 56, 76]). Several experiments have used smell, taste, or touch stimuli in films or video games [7, 8, 9]. For instance, Lemmens et al. [10] designed a suit that provided vibrotactile stimulation to boost the emotions of the viewers during the projection of audiovisual works. None of the above studies used brain activity measurements, basing their results solely on the subjective impressions of the participants.

To evaluate the impact of film viewing on humans, several neuroscience studies have used electroencephalography (EEG) to investigate the viewers' brain activity [114, 12, 13, 14, 15, 16, 92]. Some of these studies have focused on emotions, showing that the temporal and orbitofrontal areas are the most relevant ones in the processing of emotional content.

Regarding hemispheric lateralization of brain activity, several competing hypotheses remain controversial. One suggests a greater involvement of the right hemisphere in emotion perception, regardless of valence, i.e., whether the emotion is positive or negative [17]. Another proposes a different involvement of each hemisphere as a function of emotional valence: the left hemisphere is more closely related to the valence of emotions, while the right one is more related to their intensity [17]. A further proposal links positive emotions with activation of the left hemisphere, while activation of the right one would be elicited by negative emotions [18, 19, 20, 21, 22], though further studies are needed to reinforce these valence differentiations. Notwithstanding, at least one consensus can be found in the literature: the temporal lobes and orbitofrontal areas are the brain regions activated in any emotional processing [23, 24]. Although the number of basic emotions is still a controversial topic, there is consensus on at least the following six: anger, fear, sadness, happiness, disgust, and surprise [25, 26, 27, 77]. Alternatively, emotions can be analyzed in terms of valence (negative/unpleasant to positive/pleasant) and arousal (low to high) [28, 96].

Regarding the relation between emotions and brainwaves, Jung et al. showed that explicit violent content in films increases EEG activity mainly in the delta, theta, and alpha bands [97], while other studies widen the range of frequencies related to emotional reactions [29, 98], also showing the preponderance of the theta band [72]. Some EEG-based studies support the hypothesis that negative emotions produce a higher intensity of brain activity than positive ones [30, 31]. The origin of this activity could lie in cortico-limbic structures and anterior orbitofrontal and temporal areas [32, 89].

Espenhahn et al. analyzed the brain activity of young people while viewing films, using somatosensory evoked potentials measured with EEG [78]. Their results, however, did not address the evaluation of emotional responses under different ways of watching the movies, as in the work presented here. Closer to our approach, Raheel et al. [33, 99] performed a study in which brain activity was analyzed while watching a film with and without image and sound, and with additional stimulation such as heating or ventilation. They concluded that those stimuli increased the emotional intensity induced by the movie.

Furthermore, other studies have designed tactile devices, such as vests, to enhance people's affective communication through multimedia [34], or to evoke specific emotions through vibration and heat [35]. Additional works have focused on enhancing the movie-watching experience through tactile stimulation [10, 36, 37], similar to Lemmens et al.'s [10] multisensory jacket, designed with 16 segments of 4 motors each, capable of generating vibrotactile stimuli synchronized with moments in a film. Other studies have designed haptic vests applied to video games and augmented reality [38]. In that vibrotactile feedback jacket, fourteen vibration actuators were integrated to generate an immersive experience by activating different vibration patterns and intensity levels according to the game scenes.

Other recent studies have used vibrotactile stimuli to boost the attentional or emotional response in people with disabilities [39, 40], or even in non-disabled people [14, 41]. Recently, a study was conducted with hearing-impaired individuals in which the cortical activity of the participants was recorded using EEG [42]. While they viewed a video of a neutral landscape, two different soundtracks were played, each evoking a clearly different emotion. In one mode, subtitles were projected indicating the type of music, its title, and its authorship. In the other mode, the same video was projected along with synchronized vibrotactile stimulation through a haptic glove instead of subtitles. The response of the hearing-impaired participants proved to be mainly attentional in the first case (frontal lobe activation), but markedly emotional in the second (temporal lobe activity).

Nevertheless, to the best of our knowledge, no relevant studies have addressed the brain activity of people without disabilities when tactile stimulation alone is added while viewing films. The goal of this work is to analyze brain activity while viewing a traditional film with sound and image, and to compare it with the activity recorded when tactile stimulation is added on the hands, focusing on the differences in order to understand how the brain processes this new stimulus. This will pave the way to separating different neural processes, and to identifying the exact role of the brain areas involved, as a foundation for later analyses with people with disabilities.

Despite the existing studies analyzing viewers' brain activity and emotions, there are no EEG studies that assess the differences between the traditional viewing of audiovisual works and multisensory ones, such as those involving tactile stimulation. We therefore conducted a study that provides results on these differences, considering that the perception of movies, video games, and other audiovisual content will increasingly be influenced by multisensory stimuli in the near future.

2 Materials and Methods

Thirty-five participants, aged between 18 and 75, made up our sample, in line with other similar EEG-based studies [12, 43, 44]. Prior to the experiments, subjects were asked about phobias, mental disorders, psychiatric and neurological pathologies, and/or the use of psychotropic substances. None of the participants reported any of the above conditions. This study is part of a line of research whose clinical trials were approved by the Clinical Research Ethics Committee (CEIC) of the San Carlos Clinical Hospital, Madrid (Spain), on April 4th, 2019.

2.1 Audiovisual material

Following the conclusions of studies that used audiovisual stimulation to analyze emotions with EEG [4, 45], and, above all, that of Pereira et al. [46], which showed that videos longer than one minute are suitable for detecting emotions through EEG, we used a film sequence lasting 5 min. We shot it ourselves with a small film crew: two professional actors (an actress and an actor) performed a tender scene in a well-lit but intimate atmosphere, kissing and caressing on the bed of a room. The director's aim was to create an environment that defined the personal relationship between two lovers.

The video was produced in a hotel room in the central area of Madrid. It involved a director, a camera operator/director of photography, a sound technician, and a production assistant. The technical equipment used included a Sony PXW-FS5 camcorder, LED lighting, and a Zoom H5 sound recorder with Rode NTG-4 microphones. Recording files were stored in MXF format with the XAVC codec, at a resolution of 1920 × 1080 in progressive mode, 25 frames per second, and a bitrate of 50 Mbps. Subsequently, these files were converted to the Apple ProRes 422 codec so they could be edited in Final Cut Pro X. The final export of this process was an MP4 file with the H.264 codec, at a resolution of 1920 × 1080 and a bitrate of 8 Mbps.

To check its validity, 110 students from the Faculty of Information Sciences of the Complutense University of Madrid were surveyed to assess the kind of emotion transmitted by the film. A questionnaire was applied using a model of emotions that defines a space of limited, discrete, basic emotions, as well as some complex emotions [28, 47] (Zhao and Ge, 2018). The students watched the video in a classroom at the Faculty of Information Sciences of the Complutense University of Madrid, in three different sessions. When the screening was over, the attending students were asked to complete a questionnaire posted on the Virtual Campus using their computers or mobile devices.

In this preliminary study, the students rated the film as 'pleasant', as opposed to 'unpleasant', with an average score of 4.1 (5 being 'pleasant' and 1 being 'unpleasant'). In the questionnaire, we also asked them to rate the perceived intensity of specific emotions on a scale of 1 to 5; the emotions were happiness, relaxation, satisfaction, and surprise. The students associated the video more with feelings of relaxation (3.56) and satisfaction (3.01) than with happiness (1.72) and surprise (1.06).

After the preliminary study, we projected the same film to 35 participants (different from the previous 110) in two different modes: 1) image and sound, and 2) image, sound, and tactile stimulation (Fig. 2). Thus, in mode 1 viewers watched and heard the video in a traditional way, whereas in mode 2 the same video was synchronized with tactile stimuli.

The director of the film chose the exact moments at which the tactile stimuli had to be triggered (Fig. 1). These were mainly images in which the action included explicit physical contact between the actors, such as kisses and caresses. Once these moments were determined in milliseconds, they were included in the protocol so that, through the "EEG Control" software, they could be launched in complete synchronization with the video. Before conducting the experiment, the director viewed the audiovisual work with tactile stimulation to ensure that all stimuli were correctly placed. It is worth noting that the exact placement and duration of the stimuli were established as a purely creative film decision, just as the lighting, framing, or setup could have been [48, 79].

Fig. 1 
figure 1

Sample frames from the video showing tactile interaction between the two characters

We employed a protocol that encompassed all commands for automatic synchronization with the video through our proprietary software ("EEG Control"). The software serves as a protocol hub that unifies communication across multiple systems. The protocol consists of commands with timestamps in milliseconds, such as "play video," "glove vibration," or "EEG marker" (Fig. 2). This ensured precise synchronization between the video and the tactile stimuli or EEG markers (data collection of brain activity).
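As an illustration of how such a timestamped command protocol can be dispatched, the following minimal Python sketch models the hub behaviour described above. The command names, timestamps, and data layout are our assumptions for illustration only; the actual "EEG Control" software is proprietary.

```python
# Hypothetical sketch of the timestamped command protocol used by the
# synchronization hub. Each event is (timestamp_ms, command, payload).
PROTOCOL = [
    (0, "play_video", None),
    (12500, "eeg_marker", "stim_01"),   # mark the EEG record
    (12500, "glove_vibration", 1600),   # 1.6 s vibrotactile sequence
    (48200, "eeg_marker", "stim_02"),
    (48200, "glove_vibration", 1600),
]

def build_schedule(events):
    """Return the events sorted by timestamp, so the hub dispatches them in order."""
    return sorted(events, key=lambda e: e[0])

def dispatch(events, handlers):
    """Call the handler registered for each command, in timestamp order.

    handlers: dict mapping command name -> callable(timestamp_ms, payload).
    Returns a log of (timestamp_ms, command) pairs for verification.
    """
    log = []
    for t_ms, cmd, payload in build_schedule(events):
        handlers[cmd](t_ms, payload)
        log.append((t_ms, cmd))
    return log
```

In a real setup, the handlers would forward each command to the video player, the Arduino driving the gloves, or the EEG acquisition system, all referenced to a common clock.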

Fig. 2
figure 2

Schematic representation of the two modes of the experiment. Mode 1: image and sound. Mode 2: image, sound, and tactile stimulation. In both modes there was a simultaneous EEG recording

We must note, however, that the order of the modes was alternated across viewers, thus avoiding any memory effect in the brain responses: half of the participants first viewed mode 1 and then mode 2, while the other half did the opposite. The first participant watched the video first in mode 1 and then in mode 2, participant 2 watched it first in mode 2 and then in mode 1, and so on.

2.2 Tactile stimulation and protocol

Two haptic gloves were used for the vibrotactile stimulation. Each glove was an Inesis Golf Glove 100 on which 6 Uxcell 1030 micromotors were placed, one on each finger pad and one on the palm. These micromotors provided haptic stimulation by vibration (Fig. 3). The motors operated at 3 V DC and consumed up to 70 mA each, with a maximum of 630 mA per glove. All other main electronic elements were placed on a printed circuit board (PCB), or Arduino shield, with three L293D motor controllers (to drive the 12 motors of the 2 gloves) and all the appropriate connections to the gloves and power supply. In order to provide enough power and to avoid damaging the Arduino board through excessive current demand by the haptic stimulation devices, we designed a drive circuit to control each motor switch. This circuit was made up of a power bank, another L293D motor controller (which controlled up to 4 motors and provided up to 0.6 A per channel), and flyback diodes to protect against the back currents of the motors.

Fig. 3
figure 3

Gloves used, with the coin motor locations clearly visible

Each positive haptic stimulation lasts 1.6 s. The haptic stimuli are modulated as a square PWM signal with a variable duty cycle for intensity control, in bursts of 1 kHz. The stimulation is applied finger by finger, with the following order of cumulative activations: palm, thumb, index, middle, ring, and little finger, followed by reverse deactivation towards the palm, in a full 1.6 s sequence. This pattern is produced in both hands simultaneously. To generate the stimuli, an Arduino UNO rev3 was used, which in turn was triggered by a control PC and synchronized with the viewing of the film. Exact timing was mandatory, as well as a perfect fit of the glove to the user's hand, to ensure correct haptic stimulation through the motors.
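The cumulative activation and deactivation sequence can be sketched as follows. This is a didactic illustration only; the equal division of the 1.6 s window into 12 steps is our assumption, not something stated in the protocol.

```python
# Sketch of the 1.6 s cumulative vibrotactile sequence applied to each hand.
MOTORS = ["palm", "thumb", "index", "middle", "ring", "little"]
SEQUENCE_MS = 1600

def build_sequence(motors=MOTORS, total_ms=SEQUENCE_MS):
    """Return (time_ms, active_motor_set) steps: cumulative activation from
    the palm out to the little finger, then reverse deactivation back
    towards the palm. 6 + 6 = 12 steps over the full sequence."""
    steps = []
    active = []
    for m in motors:              # cumulative activation: palm ... little
        active.append(m)
        steps.append(set(active))
    for m in reversed(motors):    # reverse deactivation: little ... palm
        active.remove(m)
        steps.append(set(active))
    dt = total_ms / len(steps)    # assumed equal spacing: 1600 / 12 ms
    return [(round(i * dt), s) for i, s in enumerate(steps)]
```

At the midpoint of the sequence all six motors of the hand vibrate together, and the final step returns the glove to rest.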

At every moment at which the director of the film decided to insert an emotional tactile stimulus, a mark was generated in the EEG so that the following 200 ms could be analyzed (Fig. 4). We selected this analysis window in the EEG recordings because, according to several studies, it is suitable for analyzing the emotional activity of viewers during the viewing of a visual or audiovisual work [42, 49, 50, 51, 52].

Fig. 4
figure 4

Design of the marks and tactile stimuli for the experiment recording through EEG

Each participant underwent a control test to verify that the vibration itself did not produce brain activity beyond tactile detection. This vibration consisted of a signal with a constant frequency of 2 Hz and a duty cycle of 10%, with the 6 motors of each hand on for 50 ms and off for 450 ms. The duration of this test was 3 min. The reason for using such a low duty cycle was twofold: to avoid discomfort over its long duration, and to keep the tactile stimulus constant.
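The control-vibration parameters are internally consistent (50 ms on within a 500 ms period gives the stated 2 Hz and 10% duty cycle); a minimal sketch of this check:

```python
def control_pattern(duration_s=180, period_ms=500, on_ms=50):
    """Summarize the control vibration: frequency, duty cycle, and the
    number of 50 ms pulses delivered over the 3-minute test."""
    freq_hz = 1000.0 / period_ms          # 500 ms period -> 2 Hz
    duty = on_ms / period_ms              # 50 / 500 -> 10% duty cycle
    n_pulses = int(duration_s * freq_hz)  # pulses over the whole test
    return freq_hz, duty, n_pulses
```

With the defaults above, the test delivers 360 short pulses per hand over its 3-minute duration.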

All the stimulations were tracked in this way to evaluate the differences in brain activity among the viewers. Once the EEG record was obtained, a Self-Assessment Manikin (SAM) questionnaire was administered to evaluate the emotional experience of the viewers in terms of valence, arousal, and dominance [28, 47, 53]. Each participant individually reported their emotional experience for each mode of the film on a 5-point scale. For valence, they chose between pleasant (5 points), pleased (4 points), neutral (3 points), unsatisfied (2 points), and unpleasant (1 point). For arousal, they selected between excited (5 points), wide-awake (4 points), neutral (3 points), dull (2 points), and calm (1 point). Finally, for dominance, the participants chose between dependent (5 points), powerlessness (4 points), neutral (3 points), powerful (2 points), and independent (1 point). The duration of the whole experiment, including EEG instrumentation, glove setup and initialization, and the questionnaires, was around 60 min per participant. Participants received no payment for this study.

2.3 EEG recording method

A 64-channel Neuroscan Quik-Cap was used to acquire the EEG recordings, with ATI Pentatek © software (from Advantel SRL). Prior to obtaining the EEG recording, the impedance was checked to be under 5 kΩ. Reference electrodes were placed on the two mastoids. The sampling frequency was 1 kHz. The data were then re-referenced to the average reference. Each recording was visually inspected to remove artifacts due to eye or muscle movements, and noisy channels were substituted by a linear interpolation of adjacent channels. Additionally, channels whose squared magnitude was more than four standard deviations above their mean power were substituted by the mean of the adjacent channels [54].
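The four-standard-deviation power criterion can be sketched in Python as follows. This is a minimal reading of the rule as stated above; the adjacency mapping between channels is a placeholder, since the actual neighbours depend on the cap montage.

```python
import numpy as np

def replace_noisy_channels(data, adjacency, k=4.0):
    """Replace channels whose power (mean squared magnitude) exceeds the
    across-channel mean power by more than k standard deviations with the
    mean of their adjacent channels.

    data: (n_channels, n_samples) EEG array.
    adjacency: dict mapping channel index -> list of neighbour indices
               (placeholder; real neighbours come from the cap layout).
    """
    power = np.mean(data ** 2, axis=1)        # per-channel power
    mu, sigma = power.mean(), power.std()
    cleaned = data.copy()
    for ch in np.flatnonzero(power > mu + k * sigma):
        neigh = adjacency.get(ch, [])
        if neigh:                             # average the neighbours
            cleaned[ch] = data[neigh].mean(axis=0)
    return cleaned
```

Good channels pass through unchanged; only outlier channels are rebuilt from their neighbours.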

2.4 Brain sources localization

To localize the origin of the neural activity, the EEG inverse problem was solved using the Low Resolution Electromagnetic Tomography (LORETA) method [55]. The solution for each model was restricted to one specific anatomical structure, or a combination of several. These restrictions derived from segmenting the average brain atlas of the Montreal Neurological Institute (MNI) [56] into 90 regions, following the automated anatomical labeling (AAL) atlas [57].

LORETA was applied through the Neuronic software [49, 58, 59, 90] to 50 to 70 artifact-free 200 ms windows for each participant and mode. It produced a series of bioelectrical activation maps revealing the areas of maximum activation for each group.

Once those areas were located, statistical parametric maps (SPMs) were computed through Hotelling's T-square test against zero, voxel by voxel, to determine statistically significant sources. Applied to independent groups [60], this allowed obtaining probability maps thresholded at a false discovery rate (FDR) of q = 0.05 [61], depicted as 3D activation images overlaid on an MNI brain model. Once the probability maps were obtained, the anatomical structures larger than 10 voxels and above the threshold (according to the AAL atlas) were identified and highlighted [61]. Subsequently, local maxima were located using the MNI XYZ coordinate system.
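The voxel-wise statistics can be illustrated with the following sketch of a one-sample Hotelling's T² test against zero plus Benjamini-Hochberg FDR thresholding. This is a didactic approximation only: the actual SPMs were computed within the Neuronic software, and treating each voxel observation as a 3-component dipole moment is our assumption.

```python
import numpy as np
from scipy import stats

def hotelling_t2_vs_zero(samples):
    """One-sample Hotelling's T^2 test against zero.

    samples: (n_subjects, p) array, e.g. p = 3 dipole-moment components
             at one voxel (an assumed data layout).
    Returns (T2 statistic, p-value via the exact F transformation).
    """
    n, p = samples.shape
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)          # sample covariance
    t2 = n * mean @ np.linalg.solve(cov, mean)   # n * m' S^-1 m
    f = (n - p) / (p * (n - 1)) * t2             # T^2 -> F(p, n-p)
    return t2, stats.f.sf(f, p, n - p)

def fdr_bh(pvals, q=0.05):
    """Benjamini-Hochberg FDR: boolean mask of significant voxels."""
    pvals = np.asarray(pvals)
    order = np.argsort(pvals)
    m = len(pvals)
    thresh = q * np.arange(1, m + 1) / m         # step-up thresholds
    passed = pvals[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True                       # largest k that passes
    return mask
```

Running `hotelling_t2_vs_zero` at every voxel and feeding the resulting p-values into `fdr_bh` yields a thresholded significance mask analogous to the q = 0.05 probability maps described above.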

3 Results

The LORETA results reveal maximum activity both in the left orbitofrontal area (X = 2, Y = 51, Z = –5, T2 = 4.07) and the right orbitofrontal area (X = 3, Y = 52, Z = –4, T2 = 4.21), with a maximum intensity of 4.21 in mode 1 (image + sound). In mode 2 (image + sound + touch), the average brain activity was located in the superior right frontal area (X = 27, Y = 62, Z = 6, T2 = 26.40), the medial right frontal area (X = 40, Y = 53, Z = 8, T2 = 22.99), as well as the left orbitofrontal (X = –5, Y = 62, Z = –5, T2 = 18.054) and right orbitofrontal (X = –4, Y = 62, Z = –43, T2 = 19.718) areas, with a maximum intensity of 26.40. These results are shown through the brain activity maps in Fig. 5.

Fig. 5
figure 5

EEG results in: left.—Mode 1 (image and sound) and right.—Mode 2 (image + sound + touch). Maximal intensity projection areas are displayed in yellow/red colour. SPMs were computed based on a voxel-by-voxel Hotelling's T2 test against zero

The questionnaires administered after the EEG experiment show differences in the emotional response of the viewers between mode 1 and mode 2. While in mode 1 the viewers scored the emotional valence at 4.2 (pleasant), that of mode 2 was slightly higher, at 4.3 (pleasant). Regarding arousal, the differences were larger: the average score of mode 1 was 3 (neutral), whereas the multisensory mode 2 reached 4.1 (wide-awake). Moreover, participants scored dominance at 4.1 (powerlessness) in mode 1, and at 2.1 (powerful) in mode 2.

4 Discussion

Our results show that tactile stimulation produces higher activity in frontal and orbitofrontal areas during the viewing of a film with multisensory stimuli. According to several studies [59, 62], emotional processing is located in the orbitofrontal brain areas, while attentional processes arise from the frontal areas.

Interestingly, we find a remarkable increase in the brain activity of mode 2 (image + sound + touch) with respect to mode 1 (image + sound) in superior frontal areas. Several studies have related superior frontal areas with attentional processes [63, 64]. Therefore, the reason for the observed increase in those areas may be the generation of attentional processes in viewers when a new stimulus is added to the traditional viewing of a film, since, as some authors state [65, 66, 83], viewers are not used to perceiving audiovisual works in this way. Furthermore, although there is a significant difference in brain activity between both modes (4.21 in mode 1 compared to 26.4 in mode 2), the results are consistent with studies [67] suggesting that when we are accustomed to a stimulus, the introduction of a new one has a greater impact on brain activity. In this experiment, participants were accustomed to viewing movies with sound and images, but not with a tactile stimulus.

This conclusion can readily be linked with research on habituation in psychology [68, 69, 70, 76]. In a cinematographic environment, the incorporation of a new stimulus could be compared to the situation produced in the early stages of cinema, at the beginning of the 20th century, when viewers saw the first frames on a screen. Among the audience, some were not able to perceive persons but "flying heads", owing to not seeing the whole bodies of the actors on the screen [71]. This was a new kind of stimulation for which the brain needed a certain training, hence requiring a volitional attentional process.

On the other hand, it is remarkable that the results show a greater lateralization of brain activity towards the right hemisphere in mode 2 (image + sound + touch) when compared to mode 1. According to Pralus et al. [17], right hemisphere activation is related to a higher intensity in the perception of an audiovisual work, while left hemisphere activation is more related to the valence of the emotion. Our EEG results seem to reinforce these conclusions, because they match the viewers' answers to the questionnaires [28, 96]. Although the participants provided a similar valence score in both modes (pleasant), they differed remarkably in arousal and dominance. While arousal was neutral in mode 1, viewers felt more awake in mode 2. Indeed, they qualified the mode with tactile stimulation as "powerful", whereas mode 1 was rated as "powerlessness".

Finally, all these conclusions agree with the increase of frontal area activity in mode 2, which several authors have linked with attentional processes [63].

5 Conclusions

We conclude that tactile stimulation increases the intensity of viewers' brain activity while watching a film with emotional content. Viewers perceive a higher emotional intensity, and they also develop greater cognitive attention focused on the projected film.

We would like to point out some limitations of the present study. First, the tactile stimuli were applied only to the hands; further research is therefore needed to analyze brain activity with additional olfactory or gustatory stimuli, and/or tactile stimulation on other parts of the body. Second, for future research we believe a comparison should be made between two groups: one stimulated by a sequence similar to that of our mode 2, and another that had previously been trained on the sensory stimuli. In this way, we could assess the real effect of the tactile stimulation on viewers perceiving works with multisensory stimuli.