
1 Introduction

A brain-computer interface (BCI) is a system that records brain activity (the electroencephalogram, EEG) to allow the brain to communicate in real time with external devices without relying on peripheral nerves or muscles [1,2,3]. Even though BCI is a non-invasive and mature technology, its applications are nowadays considered an emerging research field focused on improving brain-machine communication [4, 5]. Owing to these features, the BCI has become an important tool for neurofeedback applications [6, 7], which allow users to become aware of changes in their brain activity and react to those changes. All these mechanisms can be translated into data and processed for sound synthesis or the metaphorization of musical structures [8,9,10,11], allowing the user to navigate between limited control, automation, and creation, and introducing the concept of the Brain-Computer Music Interface (BCMI).

The earliest example of transforming the EEG signal into sound appears in the literature shortly after the invention of EEG. In 1934, Adrian and Matthews [12] were the first researchers to monitor their own EEG with sound, replicating the earliest EEG descriptions of the posterior dominant rhythm (PDR). However, the first brain composition was performed in 1965 by the composer and experimental musician Alvin Lucier, who controlled percussion instruments via the strength of his EEG PDR [11, 13]. Following Lucier’s experience, David Rosenboom created a music piece in 1970 in which the EEG signals of several participants were processed through individual electronic circuits, generating a visual and auditory performance for the Automation House in New York [14]. These early BCI music performances were based on EEG signal sonification. Later, in 1997, Rosenboom introduced an attention-based performance that drove music by detecting several EEG parameters [15]. A few years later, Eduardo Miranda started to create music using EEG signal processing, allowing composers to modulate the tempo and dynamics of the music and thus giving rise to the BCMI concept [11, 16]. After Miranda’s first work, several researchers, such as Dan Wu [17, 18], Ian Daly [10], Thomas Deuel [9], and Mei Lin Chen [19], among others, proposed different BCMI compositions and sound representation systems using diverse EEG patterns and processing techniques.

In this context, a real-time graphical sound representation based on metaphorical interaction design is proposed, which allows navigation through a motor imagery cognitive task in a two-dimensional plane. This representation was developed using the OpenBCI EEG acquisition system [20] to build an information-processing pipeline based on data-packet transmission over Open Sound Control (OSC) [21] from the BCI to Max/MSP [22], together with a metaphorical graphical user interface (GUI) programmed in Processing [23]. The system provides neurofeedback interaction through graphical and sound outputs: the first is related to a sound representation based on image processing, and the second to the convolution of sound syntheses resulting from the user’s motor imagery interaction. In the Python coding stage, data storage and normalization were carried out to obtain the spectral energy of the brain waves in the frequency domain. The data were then packed and transmitted via the OSC protocol to Max/MSP, where all brain data were mapped to floating-point values. There, discrete statistical processing allowed the classification and parameterization of the data into a sound representation. This metaphorization was generated through amplitude modulation (AM), frequency modulation (FM), granular, and subtractive sound synthesis and their convolution, setting up the graphical sound representation through quadraphonic signals and 3D navigation. The navigation is associated aurally with synthesis and sonification in order to make the visual synchronism with the spatial control conscious. Accordingly, the background color and the image pointer change in an interpolated way in the 3D GUI. The sound and visual responses evoke stimulation changes, allowing users to make decisions and execute variations of their conscious brain activity.
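To make this pipeline concrete, the following minimal Python sketch packs a set of normalized band energies into a single OSC message directed at Max/MSP; the python-osc package, the port, the address pattern, and the band list are illustrative assumptions, since the paper does not fix these details.

    from pythonosc.udp_client import SimpleUDPClient  # assumed OSC library

    client = SimpleUDPClient("127.0.0.1", 7400)  # host and port are assumptions

    def send_band_energies(energies):
        """Pack normalized band energies (values in [0, 1]) into one OSC message."""
        bands = ["delta", "theta", "alpha", "mu", "beta", "gamma"]  # assumed order
        payload = [float(energies.get(b, 0.0)) for b in bands]
        client.send_message("/eeg/bands", payload)  # hypothetical address pattern

    # Example packet of already normalized band energies
    send_band_energies({"alpha": 0.42, "mu": 0.31, "beta": 0.15})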

The layout of the paper is as follows: Sect. 2 presents a detailed description of the materials and protocol used; Sect. 3 introduces the results and their discussion; and finally, Sect. 4 offers the conclusions of the research.

2 Materials and Methods

2.1 Brain-Computer Music Interface

A Brain-Computer Music Interface (BCMI) is a BCI used specifically in music and sound representation applications. The BCMI is an acquisition and processing system that measures EEG signals from the user’s brain electrical activity in order to generate commands in real time. It provides a communication channel between the brain and external sound software or devices [2, 7, 10, 11]. The BCMI system allows users to learn how to consciously regulate their brain activity in order to control a specific external music application [11, 24].

2.2 Interpolation BCMI Prototype Design

Python Patch for the OpenBCI. The proposed prototype uses the OpenBCI EEG acquisition system, in which the Cyton and Daisy boards were used for acquisition, amplification, signal processing, and data transmission. Each board acquires 8 EEG channels, and the electrodes are placed according to the international 10–20 standard (Fig. 1).

Fig. 1. 10–20 electrode locations.

OpenBCI is an open-source platform whose hardware and software capabilities are described in Table 1:

Table 1. OpenBCI technical specifications

For the development of the Interpolation BCMI prototype, an adaptation of the patch by Hurtado [25] was made. This patch creates a graphical interface that represents the EEG data in the time domain and the percentage of spectral energy of the brain rhythms in the frequency domain. In addition, the brain rhythms are computed through the FFT, and the resulting data are packed and transmitted to Max/MSP via OSC.
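As a rough sketch of the frequency-domain part of this stage (not the exact patch of [25]), the percentage of spectral energy per brain rhythm can be obtained from a windowed FFT as follows; the sampling rate and band limits are assumptions.

    import numpy as np

    FS = 250  # Hz, assumed Cyton/Daisy sampling rate
    BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
             "beta": (13, 30), "gamma": (30, 45)}  # assumed band limits in Hz

    def band_energy_percent(eeg_window):
        """Return the percentage of spectral energy of each brain rhythm for a
        1-D array holding a few seconds of one EEG channel."""
        spectrum = np.abs(np.fft.rfft(eeg_window * np.hanning(len(eeg_window)))) ** 2
        freqs = np.fft.rfftfreq(len(eeg_window), d=1.0 / FS)
        total = spectrum[(freqs >= 0.5) & (freqs <= 45)].sum()
        return {name: 100.0 * spectrum[(freqs >= lo) & (freqs < hi)].sum() / total
                for name, (lo, hi) in BANDS.items()}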

Max/MSP Patch. The interpolation system is associated with a conscious motor imagery task that fluctuates between 13 and 15 Hz, mixing with alpha waves from occipital brain activity and related at the same time to the \(\mu \) brain rhythm. In this interaction, the electrochemical brain signals are translated into data, and the processes are related to sound synthesis. The system allows the user to navigate within the constraints of control, automation, and creation. For this purpose, a real-time graphical sound representation is proposed, which implies basic training and a pipeline communication between OpenBCI, Python, Max/MSP, and Processing.

In the formulated prototype, active and passive brain states are processed in parallel to control a GUI and the parameter variation of the AM, FM, granular, and subtractive sound synthesis schemes: the active, or conscious, control process operates through motor imagery, while the passive, or unconscious, control process operates through the energy variation of the brain rhythms. The Max/MSP programming is focused on discrete data treatment, synthesis parameterization, and sound representation, while the Processing programming is centered on the visual representation (GUI).

The general programming of Max/MSP is divided into the following steps:

  1. Connection system between Python and OpenBCI, in charge of analyzing the incoming brain information in the frequency domain and scaling it to execute the audio control (a simplified sketch of this scaling and classification stage is given after the list).

  2. Visualization of the brain rhythm variations.

  3. Cursor variation and its position randomness, modified by high-alpha and low-gamma rhythm variation, whose discrete classification results in the motor imagery task.

  4. Positioning of the cursor in the navigation interface and selection of the synthesis components.

  5. Sound synthesis for each extreme of the navigation GUI, distributed as follows: left, granular synthesis; up, FM synthesis; right, AM synthesis; down, subtractive synthesis.

  6. OSC communication with Processing to send and receive the relevant information about the cursor position and the data related to the volume and audio reproduction.

  7. Reverb subpatch applied to all the synthesis signals.

  8. Output audio waveform visualization and a compressor stage that avoids overload at the audio output.
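The following Python stand-in sketches the scaling and discrete classification of steps 1 and 3–5; the thresholds, axes, and region labels are illustrative assumptions rather than the values of the actual Max/MSP patch.

    SYNTH_BY_REGION = {"left": "granular", "up": "FM", "right": "AM",
                       "down": "subtractive", "center": "silence"}

    def scale(value, in_min, in_max, out_min=-1.0, out_max=1.0):
        """Clip and linearly rescale an incoming band value (step 1)."""
        value = min(max(value, in_min), in_max)
        return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)

    def classify_cursor(motor_imagery, high_alpha, low_gamma):
        """Map scaled brain features to a GUI region and a synthesis (steps 3-5)."""
        x = scale(motor_imagery, 0.0, 1.0)            # assumed left/right axis
        y = scale(high_alpha - low_gamma, -1.0, 1.0)  # assumed up/down axis
        if abs(x) < 0.2 and abs(y) < 0.2:             # assumed neutral threshold
            return "center", SYNTH_BY_REGION["center"]
        if abs(x) >= abs(y):
            region = "right" if x > 0 else "left"
        else:
            region = "up" if y > 0 else "down"
        return region, SYNTH_BY_REGION[region]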

Sound Synthesis Process. The sound synthesis process is divided into amplitude modulation, frequency modulation, granular, and subtractive synthesis. These syntheses were inspired by the behavior of brain signals, based on the hypothesis that brain rhythms can change as a function of a specific brain feature, much like a signal modulation.

Amplitude Modulation (AM) Synthesis. The amplitude modulation is based on the ring modulation described in the Max/MSP Cycling ’74 tutorials, Parts 2 and 3 [26]. Here, the carrier and modulator signals are multiplied sample by sample, resulting in a signal in which the amplitude of the carrier follows the amplitude of the modulator signal, while the frequency of the carrier remains the same. This modulation process is described in Eq. 1, where \(f_m\) is the modulator frequency, \(f_a\) is the frequency for the amplitude variation, \(f_c\) is the carrier frequency, and k is a constant.

$$\begin{aligned} V_{AM}(t)=(0.25\,k\sin (2\pi f_m t))+\sin (2\pi f_a t)\sin (2\pi f_c t) \end{aligned}$$
(1)
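A direct NumPy rendering of Eq. 1 could look as follows; the sampling rate and the example frequencies are arbitrary assumptions.

    import numpy as np

    def am_synthesis(f_m, f_a, f_c, k=1.0, duration=1.0, fs=44100):
        """Amplitude-modulated signal following Eq. 1."""
        t = np.arange(int(duration * fs)) / fs
        return (0.25 * k * np.sin(2 * np.pi * f_m * t)
                + np.sin(2 * np.pi * f_a * t) * np.sin(2 * np.pi * f_c * t))

    v_am = am_synthesis(f_m=5.0, f_a=2.0, f_c=440.0)  # example frequencies (assumed)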

Frequency Modulation (FM) Synthesis. Frequency modulation encodes information in a constant-amplitude carrier signal by varying its instantaneous frequency in proportion to the modulator signal. The FM synthesis used here multiplies the carrier signal by the harmonic ratio \(F_m / F_c\). FM allows the emulation of instruments and the generation of sounds with a complex spectral response. The behavior of the FM synthesis is described in Eqs. 2 and 3.

For \(0 < t \le t_a\)

$$\begin{aligned} V_{FM} (t) = \frac{t_{mi}}{4} \sin \left( f_c + f_{mo} t_{mi} \left[f_{mo} t_{mi} - floor \left( \frac{1}{2} + f_{mo} t_{mi} \right) \right]\right) \end{aligned}$$
(2)

where \(f_c\) is the carrier frequency, \(f_h\) is the harmonic frequency and \(f_{mo}\) is the modulation frequency given by \(f_{mo}=f_c \sin (2 \pi f_h t_{mi}) \)

For \( t \ge t_a \)

$$\begin{aligned} V_{FM} (t) = \frac{t_a}{4}\sin \left( f_c + 2 f_c t_a \sin \left( 2 \pi f_h t_a \right) \left( f_{mo} t_a - floor\left( \frac{1}{2} + f_{mo} t_a \right) \right) \right) \end{aligned}$$
(3)
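Since the symbols \(t_{mi}\) and \(t_a\) in Eqs. 2 and 3 are specific to the Max/MSP implementation, the sketch below uses the classical two-operator FM formulation with the harmonic ratio \(f_m / f_c\) mentioned in the text; it is a simplified stand-in, not the paper's exact equations.

    import numpy as np

    def fm_synthesis(f_c, harmonic_ratio, index, duration=1.0, fs=44100):
        """Classical two-operator FM: carrier f_c modulated by f_m = ratio * f_c.
        A simplified stand-in for Eqs. 2 and 3, not the exact formulation."""
        t = np.arange(int(duration * fs)) / fs
        f_m = harmonic_ratio * f_c
        return np.sin(2 * np.pi * f_c * t + index * np.sin(2 * np.pi * f_m * t))

    v_fm = fm_synthesis(f_c=220.0, harmonic_ratio=1.5, index=3.0)  # assumed values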

Granular Synthesis. Granular synthesis consists of converting a waveform into wave grains. It is based on the generation of small acoustic events, called acoustic grains, with a duration of less than 50 ms. The granular synthesis applied here consists of creating a simple waveform in a signal segment; the signal is then multiplied by a sawtooth signal to granulate it, and an envelope with a phase shift is applied to avoid unwanted spikes. This synthesis is described in Eq. 4.

$$\begin{aligned} r_1(t) = \left\{ \begin{array}{ll} 0, &{} t< 0\\ t, &{} 0\le t < t_t \\ t_a, &{} t \ge t_t \end{array} \right. \end{aligned}$$
$$\begin{aligned} \delta \left( \frac{3}{4} t\right) = \left\{ \begin{array}{ll} 0, &{} t \not = 0\\ \frac{3}{4} t_a, &{} t = 0 \\ \end{array} \right. \end{aligned}$$
$$\begin{aligned} P(t)= \frac{1}{2} (1 + saw(t)) = \frac{1}{2} + \left[ f_g t - floor \left( \frac{1}{2} + f_g t\right) \right] \end{aligned}$$
$$\begin{aligned} V_{GN}(t)= 50 \sin (2 \pi f_0 t)\, P(2 \pi f_g t)\, r_1(t)\, \frac{1}{2} \left[ 1 + \sin \left( P (2\pi f_g t ) + \delta \left( \frac{3}{4} t_t\right) \right) \right] \end{aligned}$$
(4)

where \(f_0\) is the original carrier frequency, \(f_g\) is the grain frequency, \(t_a\) is the amplitude of \(t\), and \(t_t\) is the slope of \(t\).
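The sketch below illustrates the granular principle with a generic grain/window/overlap-add construction (grains kept under the stated 50 ms); it is not the exact sawtooth and phase-shift construction of Eq. 4.

    import numpy as np

    def granulate(signal, fs=44100, grain_ms=40, hop_ms=20):
        """Cut the input into grains shorter than 50 ms, apply a Hann envelope
        to each grain, and overlap-add the result."""
        grain = int(fs * grain_ms / 1000)
        hop = int(fs * hop_ms / 1000)
        window = np.hanning(grain)
        out = np.zeros(len(signal))
        for start in range(0, len(signal) - grain, hop):
            out[start:start + grain] += signal[start:start + grain] * window
        return out

    t = np.arange(44100) / 44100
    v_gn = granulate(np.sin(2 * np.pi * 220.0 * t))  # example carrier (assumed)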

Subtractive Synthesis. Subtractive synthesis consists of a selective band-pass filter that eliminates specific harmonics of the signal. The bandwidth of the filter is determined by the delta rhythm bandwidth, and the central frequency of the filter is calculated according to the highest spectral energy of the gamma rhythm.
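A possible realization of this filter is sketched below with a Butterworth band-pass from SciPy; the delta bandwidth of 3.5 Hz (0.5–4 Hz), the filter order, and the example gamma peak are assumptions.

    import numpy as np
    from scipy.signal import butter, sosfilt

    def subtractive_filter(signal, gamma_peak_hz, fs=44100, delta_bw_hz=3.5):
        """Band-pass the signal around the gamma spectral peak with a bandwidth
        equal to an assumed delta-rhythm width; 4th-order Butterworth filter."""
        low = max(gamma_peak_hz - delta_bw_hz / 2, 1.0)
        high = gamma_peak_hz + delta_bw_hz / 2
        sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
        return sosfilt(sos, signal)

    rng = np.random.default_rng(0)
    v_sub = subtractive_filter(rng.standard_normal(44100), gamma_peak_hz=40.0)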

Processing Programming. Processing is an open-source software environment based on Java and widely used in the visual arts. The proposed system uses it to create a real-time visual representation of the BCI through a GUI that includes a cursor and a brain illustration. The cursor can move among five regions according to the brain activity responses addressed by the auditory synthesis and the sonification process, allowing users to improve the visual synchronism with the brain control commands.

Processing sends and receives information packets via OSC; these packets are verified and compared to generate new parameters for the 3D brain illustration. Figure 2 shows the five cursor regions. Each region corresponds to a specific element: the horizontal location represents the speaker sound source (left or right), the vertical position represents the volume intensity (high or low), and the middle region is silence. Every time the cursor moves between regions, both the GUI background and the cursor change their color in an interpolated way.

Fig. 2. GUI response according to the cursor location.
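Since Processing is Java-based, the sketch below illustrates the OSC reception and the region-to-parameter mapping in Python for consistency with the other examples; the address pattern, port, and thresholds are assumptions.

    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    def on_cursor(address, x, y):
        """Map a received cursor position to the pan/volume regions of Fig. 2."""
        pan = "left" if x < -0.2 else "right" if x > 0.2 else "center"
        volume = "high" if y > 0.2 else "low" if y < -0.2 else "silence"
        print(f"cursor ({x:.2f}, {y:.2f}) -> pan={pan}, volume={volume}")

    dispatcher = Dispatcher()
    dispatcher.map("/gui/cursor", on_cursor)  # hypothetical address pattern
    server = BlockingOSCUDPServer(("127.0.0.1", 7500), dispatcher)  # assumed port
    server.serve_forever()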

Neurofeedback. The prototype proposes a neurofeedback model through four spatialized speakers that reproduce three-dimensional panning according to the GUI navigation. The user is located at the center of the speakers, in front of a screen (Fig. 3). Here, the user is able to perform the actions shown in Table 2. In a controlled environment, the user may be able to move and transform the cursor at will, as well as to stop the movement through a relaxation state, allowing the system to return to a neutral state.

Fig. 3. User experiment location.
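The quadraphonic spatialization can be illustrated with a simple equal-power panning law driven by the 2D cursor position; this is an illustrative sketch, not the prototype's exact spatialization.

    import numpy as np

    def quad_gains(x, y):
        """Equal-power gains for a quadraphonic layout (FL, FR, RL, RR) from a
        cursor position x, y in [-1, 1]; an assumed panning law."""
        lx, rx = np.cos((x + 1) * np.pi / 4), np.sin((x + 1) * np.pi / 4)
        fy, ry = np.cos((1 - y) * np.pi / 4), np.sin((1 - y) * np.pi / 4)
        return {"front_left": lx * fy, "front_right": rx * fy,
                "rear_left": lx * ry, "rear_right": rx * ry}

    print(quad_gains(0.0, 0.0))  # centered cursor -> equal gain on all speakers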

2.3 Participants

Six healthy musicians who self-reported normal hearing participated in the study. This group was divided into two subgroups: an experimental group with three blind participants and a control group with three sighted participants. Each participant was informed about the purpose of the study and signed an informed consent form prior to participation. Each participant took part in the experiment three times, at intervals of one week. Participants were monetarily compensated for their time.

2.4 Experiment Setup

The experiment was performed in the moving image laboratory of the Universidad de Caldas (Colombia). The laboratory provided a quiet, acoustically isolated environment. The equipment used in this experiment consisted of an OpenBCI system for EEG signal acquisition, a screen for the GUI visualization, and two cameras, one for recording the participants’ faces and the other for recording the overall experiment. Two computers were also required, one for the signal acquisition and pattern recognition process and the other for the Max/MSP processing, together with a quadraphonic audio system for the sound immersion (Figs. 4 and 5).

Table 2. Neurofeedback controlled process
Fig. 4. Experimental scene.

Fig. 5. Final prototype.

Prior to the experiment, musical and interface-control training was necessary. The procedure for each experiment consisted of the following steps: 5 min in a relaxed state; 5 min for placing the EEG acquisition system; 2 min in a neutral state; and 2 min for the motor imagery control experiment. After each experiment, the participants answered a self-assessment test focused on, among other aspects, sound level perception, integration difficulty level, mental control level, cognitive strategies, memory, sensations, neurofeedback cognitive awareness, experience, and movement imagination level.

Table 3. Self-assessment results, interpolation prototype
Fig. 6. Average response of the sixteen EEG channels for all experiment participants.

3 Results

For the interpolation prototype, the analysis was based on the variation of brain rhythms as a function of the motor imagery control of all participants. These responses were measured using static and dynamic levels. The general results (Table 3) show that the high dynamic responses of the participants can be associated with the motor imagery learning process, allowing them to control the system step by step. The static level reflects low interaction between the four control stages of the prototype. As the user starts to navigate the system in a conscious way, the static level decreases and the dynamic level increases. These measured levels also allow inferring whether the participants reached a reproduction, integration, or neurofeedback level.

According to Table 3, participant I1 had low control of the interpolation system: motor imagery control was possible during only \(50\%\) of the experiment time, which corresponds to a low reproduction level. In the case of participants I2, I3, C1, and C3, the high dynamic level responses showed an integration process with the sound generation as a function of the system navigation. Finally, the highest dynamic response, that of participant C2, exhibited complete control of the sound generation mechanism, related to the interactivity between the conscious navigation and the desired sound, and achieved a neurofeedback level. Likewise, it was possible to see that during the sound moments produced by the visual control of the interpolation system, the occipital and parietal regions showed several changes in their activity levels (Fig. 6), confirming the relation of the visual and auditory synthesis to these brain regions. Also, the average EEG level showed a synchronized response in the temporal lobes of the research participants I1, I2, and I3, while the control participants C1, C2, and C3 showed a desynchronized response.

Similarly, in the EEG activity recorded in the control participants, two activity patterns stood out on the O1, O2, C3 and C4 channels associated with the occipital and central brain regions, respectively. These EEG characteristics were generated through the sound control using the visual motor imagery interface. However, these signal patterns did not appear in the research participants’ data due to their visual limitations. The sound control process of the research participants was carried out through auditory navigation, because of the training process performed before the experimental phase.

4 Conclusion

The interpolation system was proposed as a sound navigation model based on motor imagery. This system used a BCI as a communication channel between the brain and an external technological device. Here, the sound generation was proposed through the convolution of four sound syntheses: AM, FM, granular, and subtractive. To evaluate the operation of the sound system, an experiment was conducted under controlled conditions with six professional musicians. The participants were divided into two subgroups: the first was formed by three blind participants (research participants), and the second by three participants without any visual limitation (control participants).

In the system evaluation, it was found that all participants achieved different control levels associated with their static and dynamic responses, as a consequence of the training process, their musical experience, and their degree of commitment. During the experiment, it was observed that all the participants with normal vision controlled the system through the graphical navigation interface based on motor imagery, where high brain activity in the central and occipital regions was evident. On the other hand, it was found that the blind participants controlled the system through auditory processing as a perception tool for the navigation interface, allowing them to generate sound changes intentionally. This kind of control showed brain activation in the temporal regions, which are responsible for human auditory processing. During the experiment, all the participants reported a sensation of physical movement and of speed of thought, accompanied by a floating sensation.

Finally, based on the results obtained, we propose the hypothesis that the EEG signals acquired during the experimental process present deflections at latencies between 8 and 15 ms, which, according to the literature [27], can be related to early auditory evoked potentials in response to the sound generated by the system.