1 Introduction

Virtual reality (VR) is increasingly becoming part of our daily lives and its applicability is widening [29]: from video games and entertainment to education, medical treatment and rehabilitation, military applications, architectural and urban design, digital marketing and activism, engineering and robotics, fine arts, heritage and archaeology, occupational safety, social sciences, psychology and many more. For this reason, it is important not only to understand how this technology can affect our brains, but also to explore its potential to extend the current scope of detection methods for neurological diseases. Using VR in new, specific triggering scenarios will allow controlling and extending the functionalities of the visual stimulation systems used for the detection of such pathologies [4].

This reflection is not new. Back in the 1990s, research on the risks of exposure to television and video games resulted in a set of recommendations for TV manufacturers and video game developers. At the same time, those studies became the basis for many new uses of these techniques, in education [9] or in rehabilitation [8], to name a few.

Photosensitivity is an abnormal visual sensitivity of the brain resulting in a photoparoxysmal response (PPR), i.e. an epileptic brain discharge consisting of a cortical spike or a spike-and-wave provoked by a flash or another visual stimulus [26]. There are four different types of PPR resulting from different brain responses to intermittent light stimulation [31]. Even though PPR can be found in non-epileptic electroencephalogram (EEG) recordings, it is strongly associated with epilepsy [16]. The relevance of the PPR lies in its association with specific epileptic syndromes [34] and in the monitoring of treatment in a clinical context [20]. The photosensitivity range is related to the likelihood of reflex seizures in daily life; thus, early detection of PPR is of crucial interest. The most commonly used procedure to detect PPR, known as intermittent photic stimulation (IPS), is described in [23]. The subject is exposed to a series of light flashes while brain activity is simultaneously monitored using EEG signals, according to the European consensus group methodology for visual stimulation defined in 2012 [27]. The flash frequency is first increased from a minimum to a maximum or until a PPR is observed (whichever happens first). If no PPR is observed, the process finishes. Should a PPR occur, the procedure is repeated in the inverse direction, i.e. starting with a top flickering frequency that is gradually decreased until a minimum is reached or a PPR occurs. The aim is to detect the minimum and the maximum frequencies at which the subject shows PPR, if any, with minimal exposure.
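The two-pass logic of the IPS procedure can be illustrated with a minimal Python sketch. The function name `ips_sweep` and the `shows_ppr` callback are hypothetical placeholders (in practice the decision is made by the clinician reviewing the EEG), and the sketch assumes that the response at a given frequency is reproducible.

```python
from typing import Callable, Optional, Sequence, Tuple


def ips_sweep(
    frequencies: Sequence[float],
    shows_ppr: Callable[[float], bool],  # hypothetical: True if a PPR is seen at f
) -> Tuple[Optional[float], Optional[float]]:
    """Return the (lowest, highest) flash frequency provoking a PPR, if any."""
    ascending = sorted(frequencies)
    # Ascending pass: stop at the first frequency that provokes a PPR.
    lower = next((f for f in ascending if shows_ppr(f)), None)
    if lower is None:
        return None, None  # no PPR observed: the procedure finishes
    # Descending pass: start from the top and stop at the first PPR.
    upper = next((f for f in reversed(ascending) if shows_ppr(f)), None)
    return lower, upper
```

Because each pass stops at the first PPR, the subject is never stimulated at frequencies strictly between the two limits, which is how the exposure is kept to a minimum.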

The detection of PPR is usually performed by physicians, i.e. clinical neurophysiologists and nurses, who manually review the variability of the EEG signals in search of PPR [2, 14, 22], taking into account each subject’s clinical context, such as age, seizures and family history. To the best of our knowledge, no automated method for the detection of PPR has been developed so far. This research is mainly focused on the design of a new and safe procedure for the automatic detection of PPR within EEG signals using digital bio-markers implemented with VR and AI techniques.

In this sense, [25] designed a PPR detection method by analysing the potential and oscillation of the response provoked by a flashing stimulation, but following a stimulation pattern different from the standard one. Other recent studies analyse photosensitivity and epilepsy based on generalized discharges or seizures other than PPRs: in [18], a detection method is proposed based on the band amplitude fluctuation computed from the high-frequency and low-frequency components of the EEG windows in each EEG channel; [30] applied the extreme gradient boosting technique for the classification of seizures in two different ways (applying a standard partitioning of the data and applying a leave-one-out cross-validation scheme), while a channel-independent long short-term memory (LSTM) network is used in [5]; the information extracted from EEG and electrocardiogram (ECG) signals is used in [35] in a multi-modal neural network which analyses the data in three different ways (only EEG data with a convolutional LSTM network; only ECG data with a residual convolutional network; and a fused network which combines the outputs of the individual networks to perform the final classification); in [6], K-nearest neighbours and artificial neural networks are used for the detection of ictal discharges and inter-ictal states; [32] proposed a single-channel EEG analysis applying three types of visibility graphs (basic, horizontal and difference) to represent different EEG patterns.

Other studies make use of additional biometric measures for the same purpose, such as electrocardiograms (ECG) [10, 11, 28], electromyograms (EMG) [3, 36] or magnetoencephalograms (MEG) [24].

This study proposes an alternative to the conventional IPS procedure for PPR detection using VR and machine learning (ML). This research is focused on the most frequent PPR type only; thus, PPR detection is not completely solved and still needs more research work. However, introducing VR would eventually allow new and safer procedures for PPR detection to be studied and developed. Since the VR infrastructure is much more flexible than the standard one, it can be easily configured to carry out new assays and stimulation paradigms, allowing for an advanced IPS/visual stimulation system. Our proposal includes a VR device with a wireless connection to a computer that has access to the data gathered by the EEG sensors. The subject must wear both the EEG cap and a head-mounted display (HMD) at the same time. Furthermore, a plausible solution for the automatic PPR detection is proposed using some features extracted from the EEG signals in an average montage, together with classic ML techniques. An average montage is used so that the electronegative PPR discharge is expressed as an upward deflection of the EEG signal in the affected channel.

The main contributions of this research are:

  • To introduce VR in a closed loop with AI and ML models, so that medical procedures can be revisited and enhanced. This contribution can lead in the near future to more advanced diagnostic tests and procedures.

  • To provide the neurophysiology department at Burgos University Hospital with a novel instrument to analyse the impact of VR in relation to photosensitivity by integrating this solution in its daily work.

  • To develop ML models to detect anomalies in the EEG recordings when the patient is flashed using either VR-ML IPS or conventional IPS.

The structure of this study is as follows. The next section focuses on the description of the proposal, detailing the different elements in the loop. Section 3 gives details of the experimental set-up used to evaluate the proposal, while Sect. 4 includes all the obtained results and their discussion. The final section draws the conclusions of this research.

2 A prototype for VR-ML IPS procedure

The proposed solution complements the conventional set-up by means of introducing a VR device that the subject must wear along with an EEG cap (see Fig. 1). The signals from the EEG sensors are analysed using well-known ML techniques. The intelligent module will eventually control the VR contents to gain increased capabilities and to perform more complex assays. This section gives details on each of the main modules: the VR part (next subsection) and the ML module (Sect. 2.2).

Fig. 1 To the left, the conventional set for IPS procedure. To the right, the new VR-ML set for IPS procedure, including automatic EEG analysis and PPR detection

2.1 VR design for IPS and PPR detection

Flashing lights are one of the main triggers of photosensitive responses. VR-Photosense [15] is a software tool designed to detect photic-driving and PPR while using VR and wearing a head-mounted display (HMD). VR-Photosense offers a VR scenario with IPS in order to measure brain responses to flashing lights at various frequencies and using different sequences. The main goal of this software is to simulate conventional IPS tests in a virtual reality environment.

Conventional IPS places the light stimulator at a very short distance from the patient’s eyes, creating high exposure to flickering lights, which are perceived intensely even when the patient’s eyes are closed. In order to emulate the exposure and sensory effect caused by the conventional IPS light stimulator, a virtual reality scene has been designed as a 3D enclosing spherical dome with the patient’s viewpoint placed at its centre (see Fig. 4). This design suppresses the patient’s peripheral vision and increases the focus on the visual stimuli, even with the eyes closed.

The VR-Photosense set-up is fairly simple. After downloading and starting the app, the cardboard viewer, into which a smartphone is inserted, is secured to the subject’s head using the adaptable straps, leaving the upper part of the head clear. Then, the EEG cap is placed on the subject’s head, covering all necessary points of contact as shown in Fig. 2.

Fig. 2 VR-Photosense set-up with a cardboard viewer and an OpenBCI EEG headset

The software system is divided into two parts: the light stimulation and the monitoring software. The default configuration of VR-Photosense stimulates with white light on a dark black background, emulating the conventional IPS. As an innovation with respect to conventional IPS, VR-Photosense allows the colour of the flickering light and of the background to be changed from the default configuration, so we designed two coloured settings in addition to the white one: (i) one with bright red flashes on a deep blue background, and (ii) another with deep blue flashes on a dark black background. These tones may influence the brain discharges and, combined, may be more or less provocative than the default one, making it possible to study how the brain reacts to the stimulation in both of these scenarios.

VR-Photosense has also been designed to resemble the conventional stimulation set-up and to facilitate the work of physicians, both clinical neurophysiologists and nurses, while conducting photosensitivity tests at the hospital. The system includes a monitoring feature that allows what is happening in the VR stimulation to be observed in real time via a web server. This is possible thanks to WebSockets, a communications protocol that offers full-duplex communication channels over a single TCP connection, which keeps the monitoring view updated quickly and with low latency.

This monitoring feature, along with the EEG recordings, translates into a full-coverage IPS scenario that closely resembles the conventional set-up, with the difference of replacing the traditional stimulation device with a low-cost VR headset and the VR-Photosense software. The EEG set-up used to carry out VR-Photosense testing at the hospital consists of Natus Brain monitoring and Neuroworks software for EEG recording. Furthermore, the different hardware and software elements can be easily mixed and replaced, as they are completely independent from each other. For example, VR-Photosense can be configured to work with higher-end HMDs such as Oculus, or EEG recording can be performed using a different device, such as the OpenBCI 3D-printed device and open source software, which were used at the university EEG laboratory for preliminary experiments.

All in all, VR-Photosense offers an innovative, low-cost and cross-platform IPS scenario in virtual reality, with more features to be included in the future, aimed at achieving a more detailed diagnosis.

This light stimulation software has been implemented in Unity 3D using the programming language C#. Unity is a video game development engine that allows designing 3D scenes by means of a visual editor and the programming of gameplay events via scripting. These scripts are associated with the game objects included in the scene so that they behave in the desired way.

Fig. 3 VR-Photosense software package diagram

Within the scene, the assets included are structured in software components as shown in Fig. 3.

  • Controls: auto-generated script and input system. The new InputSystem 1.0.2 of Unity has been used, which allows the desired inputs to be set up through an interface. Different inputs from different devices can be mapped to the same action. This makes configuration and connectivity with different types of input hardware, such as keyboards and game controllers, seamless.

  • GoogleVR: an SDK for Android that allows the creation of virtual reality applications to be used with Google Cardboard HMD.

  • Plugins: set of Android plugins needed to export applications for devices using this operating system.

  • Resources: all the scripts developed for lighting and connection with the monitoring side. This package also includes shaders and materials used in the scene.

  • Scenes: the designed virtual reality scene (Fig. 4).

  • XR: default Unity configuration for extended (virtual and augmented) reality apps.

Fig. 4 Scene diagram implemented in Unity 3D

The developed scene has four objects (GameObjects): the camera, two point lights and a sphere (see Fig. 4). The purpose of the sphere is to provide a black background: the camera is placed inside it and its normals are inverted, thus avoiding transparency. This inversion is done by adding a custom shader to a new material and assigning it to the sphere. As for the lights, one of them is the main white light and the other provides a blue background for the custom colour functionality.

All developed scripts use the MonoBehaviour built-in class, since Unity uses a standard Mono run-time implementation. These classes are considered blueprints and, each time they are associated with a GameObject, a new instance of the object defined by that blueprint is created. Two methods are predefined in them: Start and Update. In this case, only Start is used, which is called by Unity when loading the scene. It is also important to note the use of coroutines. Coroutines are functions that can pause their execution and resume it in a later frame from the point at which it was paused. In this case, they are used to implement the flickering effect of the lights and the stimulation sequences.
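The coroutine idea can be illustrated with a Python generator, which likewise pauses at each `yield` and resumes on the next call. This is only an analogy, not the Unity C# code of VR-Photosense; the function name `flicker`, the 60 frames-per-second rate and the 50% duty cycle (light on during the first half of each flash period) are assumptions made for the sketch.

```python
from itertools import islice
from typing import Iterator


def flicker(frequency_hz: float, frame_rate: float = 60.0) -> Iterator[bool]:
    """Yield True (light on) or False (light off) once per rendered frame."""
    frame = 0
    while True:
        # Position inside the current flash period, in the range [0, 1).
        phase = (frame * frequency_hz / frame_rate) % 1.0
        yield phase < 0.5  # assumed duty cycle: on for the first half period
        frame += 1  # execution resumes here when the next frame is requested


# At 10 Hz and 60 frames per second, one flash period spans six frames.
first_frames = list(islice(flicker(10.0), 7))
```

At flash frequencies close to half the display frame rate, a period covers only a few frames, so the display refresh rate limits which stimulation frequencies can be rendered faithfully.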

As for the input system, two options were considered: remote control and keyboard. Considering the number of configurable parameters and the controlled actions to be performed by the person in charge of the stimulation, the use of only a VR remote control was considered insufficient. Bearing in mind the end user and the technology they use on a daily basis, the use of a Bluetooth keyboard was chosen. As a result, more configurations are available and the commands are set in the way these experts consider more comfortable to perform the stimulation in the most efficient possible manner. However, in order to enable the use of the stimulation with the Oculus Quest 2 glasses, this configuration has also been adapted for the two remote controls associated with this HMD. Finally, the list of commands available is shown in Table 1.

Table 1 List of available commands in VR-Photosense application

2.1.1 Monitoring

In addition to simulating the IPS environment, it is necessary to know what is happening inside the scene during the tests. To address this problem, a monitoring component, a web page (https://vrphotosense.herokuapp.com), was added to the system. This solution allows the physicians to see in real time what is happening in the VR stimulation, e.g. the evolution of the sequences and the commands entered by the user.

Another possibility was to implement a sound system to communicate the situation through audio, but it was considered less efficient, since audio is volatile and leaves no visual record of the simulation.

Fig. 5 Theoretical functionality of the monitoring system

To implement the monitoring system, a WebSockets component was used, which allows fast, real-time communication between the smartphone application and the web page without the use of a database. The desired functionality is shown in Fig. 5. Client 1, the smartphone application, sends information about the stimulation to the server, and the server passes it to client 2, the web page, which displays it on the screen along with a time stamp.

Fig. 6 Sequence diagram of the system using WebSockets

WebSockets is a protocol that provides bidirectional, full-duplex communication over a single TCP socket. The VR-Photosense application is used in a fairly sensitive medical testing environment, which requires information to be exchanged as fast as possible. WebSockets is probably the most popular protocol for use cases of this type, where data needs to be sent in real time. As noted above, Unity uses the Mono platform as its scripting engine, which means that it works with .NET (C#). This framework provides default support for the protocol, which is also supported by Mono, through the System.Net.WebSockets namespace. The operation in this case would be as shown in Fig. 6.

The server has been developed using Node. It is a simple server that receives information from the client and returns it. In this case, nothing is done in Unity with the returned information; it is the monitoring client that makes use of it. This client is an HTML page that connects to the server, opens a connection, gets the information coming from the server that is part of a message and displays it on the screen. This HTML client has been implemented in the simplest possible way since it is an add-on to the main work and its appearance in terms of design is not relevant for the EEG laboratory staff.
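The relay pattern described above (one client publishes, the server forwards, the monitoring client displays with a time stamp) can be sketched in Python with an in-memory hub. This is a hedged illustration of the message flow only: it does not open any network connection or use the WebSocket protocol, it is not the Node server of the authors, and the names `RelayHub`, `subscribe`, `publish` and the message text are hypothetical.

```python
import asyncio
from datetime import datetime, timezone


class RelayHub:
    """In-memory stand-in for the relay server: forwards messages unchanged."""

    def __init__(self) -> None:
        self._monitors = []  # one queue per connected monitoring client

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._monitors.append(queue)
        return queue

    async def publish(self, message: str) -> None:
        # The server does no processing: it passes the message to every monitor.
        for queue in self._monitors:
            await queue.put(message)


async def demo() -> str:
    hub = RelayHub()
    monitor = hub.subscribe()           # client 2: the monitoring page
    await hub.publish("frequency=10Hz")  # client 1: the stimulation app
    message = await monitor.get()
    # The monitoring side attaches a time stamp when displaying the message.
    stamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return f"{stamp} {message}"


line = asyncio.run(demo())
```

Keeping the server a pure relay, as the text describes, means no state or database is needed: a monitor that connects late simply sees the messages sent from that moment on.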

2.2 ML-based PPR detection

PPR can be found in epileptic syndromes that present seizures with or without visual stimulus trigger, and in some subjects both types of seizures are observed. Waltz classification [31] is used to define the expression of the PPR from an electroencephalographic point of view, introducing up to four different types of PPR:

  • Type-1: spikes within the occipital rhythm.

  • Type-2: parieto-occipital spikes with a biphasic slow wave.

  • Type-3: parieto-occipital spikes with a biphasic slow wave and spread to the frontal region.

  • Type-4: generalized spikes and waves or polyspikes and waves.

All of them are depicted in Fig. 7. Type-4 PPR is the most frequently found in epileptic syndromes where photosensitivity constitutes a clinical concern, as it seems to have a strong association (higher than 90%) with epileptic seizures; the detection of this type of PPR is the challenge on which this research focuses. Besides, in clinical practice, the expression of PPR is frequently variable, and we often obtain PPRs that do not necessarily fit within only one of the categories initially defined by Waltz. Moreover, the morphological characteristics of a given subject’s PPR may vary because of clinical variables such as doses of anti-epileptic treatment, sleep quality, etc.

To identify Type-4 PPR, we propose the use of sliding windows (one second in length, with a shift of a tenth of a second) followed by a pre-processing stage that subtracts the window average and performs a feature extraction. Well-known ML techniques then use these features to assign a final label to the window.
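As an illustration of the windowing step, the following Python sketch cuts a single-channel signal into one-second windows shifted by a tenth of a second and subtracts each window's average. The function name `sliding_windows` and the parameter `fs` (sampling rate in Hz) are assumptions made for the example; only the window length and shift come from the text.

```python
from typing import Iterator, List, Sequence


def sliding_windows(
    signal: Sequence[float],
    fs: int,                  # sampling rate in Hz (assumed parameter name)
    length_s: float = 1.0,    # window length: one second
    shift_s: float = 0.1,     # window shift: a tenth of a second
) -> Iterator[List[float]]:
    """Yield mean-subtracted windows of a single EEG channel."""
    width = round(length_s * fs)
    step = round(shift_s * fs)
    for start in range(0, len(signal) - width + 1, step):
        window = signal[start:start + width]
        mean = sum(window) / width
        yield [x - mean for x in window]  # remove the window average
```

For a recording sampled at 500 Hz, this produces windows of 500 samples that advance 50 samples at a time, so consecutive windows overlap by 90%.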

These ML techniques are applied in three parts that will be described in this section. First, the set of transformations that will be applied and their rationale are explained in Sect. 2.2.1. Then, the design and deployment of the ML models are detailed in Sect. 2.2.2 and, finally, the training of the ML part is described in Sect. 2.2.3.

Fig. 7 The four types of PPR: (a) type-1: spikes within the occipital rhythm; (b) type-2: parieto-occipital spikes with biphasic slow wave; (c) type-3: parieto-occipital spikes with biphasic slow wave and spread to the frontal region; (d) type-4: generalized spikes and waves

2.2.1 Feature extraction

The selection of mathematical transformations for signal representation is, per se, a problem that needs to be carefully addressed. The main point is to select features that, taken together, include the information that the experts use to make a decision. Therefore, all the possible windows must be analysed and significant differences must be established between the anomalies and the normal signal state. Nowadays, the problem of feature transformation is commonly addressed with deep learning (DL) and, more specifically, with auto-encoders; however, due to the lack of data to train the networks, we leave this issue for future research.

For this study, we focused on channels Fz and O2 because these are the channels where PPRs appear most frequently, regardless of their type. We pay attention to their normal and abnormal behaviour for the considered PPR type. Some examples of the windows that might be faced are depicted in Fig. 8. From the analysis of the signals, the following feature set has been selected to represent each EEG data window, where w is the width of the window, t refers to the time stamp for which the feature is computed, c is the channel (either Fz or O2) and \(d_t^c\) is the signal of EEG channel c with an average montage (so that the PPR is expressed as an upward deflection) at time stamp t:

Fig. 8 Three EEG fragments from different conditions: the left-most and the centre recordings are considered normal conditions, while the recording at the right shows a PPR. The recordings include, from top to bottom, signals from the F3-AVG, Fz-AVG, F4-AVG, O1-AVG and O2-AVG channels, respectively, where AVG stands for the average of all recorded electrodes

  • Cumulative First derivative, also known as the intensity of the signal, computed as \(CFD_t^c=\frac{1}{w}\sum _{i=0}^{w-1}|d_{t+i+1}^c-d_{t+i}^c|/\Delta\). This feature has been chosen because PPRs present a high rate of change in the value of the channel. \(\Delta\) represents the interval between consecutive samples, which is kept constant.

  • Cumulative Second derivative, computed as \(CSD_t^c=\frac{1}{w}\sum _{i=0}^{w-1}|d_{t+i+2}^c- 2\times d_{t+i+1}^c+d_{t+i}^c|/\Delta ^2\). This feature has been selected because there is also a high rate of change in the first derivative, although not as high as for artefacts.

  • Number of relevant peaks using the \(S_1\) measurement proposed in [19] and computed as follows. Equation 1 defines the calculation of \(S_1\), where k is the predefined number of samples and p is the current sample time stamp for which we are determining whether it is a peak or not. The \(S_1\) transformation represents a scaling of the time series (TS), which makes the peak detection easier using a predefined threshold \(\alpha\).

    $$\begin{aligned} S_1(p)=\frac{1}{2}\times \left\{\max _{i=p-k}^{p-1}{(d_p^c - d_i^c)} + \max _{i=p+1}^{p+k}{(d_p^c - d_i^c)}\right\} \end{aligned}$$
    (1)

    So, for each point of the EEG sliding window for which \(S_1(p)\) can be computed, we compute it; a peak occurs at time p if the value \(S_1(p)\) is higher than \(\alpha\) and is the highest in its 2k neighbourhood. In the original report, all the parameters (k, \(\alpha\)) were carefully determined for each problem in order to optimize the peak detection. In this research, k is set to 10 and \(\alpha\) is set to three times the standard deviation of the EEG channel values when no activity is shown (upper right corner in Fig. 8).

  • Sum of the absolute values, to measure the area under the curve of the EEG signal.

  • Maximum differences, also known in some fields as amount of movement [1], which measures the differences between the highest and the smallest values—expected to be a high value. It is calculated as \(MD_t^c=|\max _{i\in [t, t+w]}(d_i^c) - \min _{i\in [t, t+w]}(d_i^c)|\).

  • Average Energy as proposed in [33], computed as the sum of the squared magnitudes of the discrete FFT components of the signal.
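The features listed above can be sketched in plain Python for one mean-subtracted window `d` of a single channel. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the derivative features average over the differences that fit inside the window, the "highest in its 2k neighbourhood" rule is interpreted as a strict maximum of the \(S_1\) values of the neighbours, and the energy uses a naive discrete Fourier transform instead of an FFT routine.

```python
import cmath
from typing import Sequence


def cfd(d: Sequence[float], delta: float) -> float:
    """Cumulative first derivative: mean absolute first difference over delta."""
    n = len(d) - 1  # number of first differences that fit inside the window
    return sum(abs(d[i + 1] - d[i]) for i in range(n)) / (n * delta)


def csd(d: Sequence[float], delta: float) -> float:
    """Cumulative second derivative: mean absolute second difference."""
    n = len(d) - 2
    total = sum(abs(d[i + 2] - 2 * d[i + 1] + d[i]) for i in range(n))
    return total / (n * delta ** 2)


def sum_abs(d: Sequence[float]) -> float:
    """Sum of the absolute values of the window."""
    return sum(abs(x) for x in d)


def max_difference(d: Sequence[float]) -> float:
    """Difference between the highest and the smallest value of the window."""
    return abs(max(d) - min(d))


def average_energy(d: Sequence[float]) -> float:
    """Sum of squared magnitudes of the DFT components (naive O(n^2) DFT)."""
    n = len(d)
    total = 0.0
    for k in range(n):
        x_k = sum(d[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        total += abs(x_k) ** 2
    return total


def s1(d: Sequence[float], p: int, k: int) -> float:
    """S1 of Eq. 1: mean of the largest rises over the k left and right samples."""
    left = max(d[p] - d[i] for i in range(p - k, p))
    right = max(d[p] - d[i] for i in range(p + 1, p + k + 1))
    return 0.5 * (left + right)


def count_peaks(d: Sequence[float], k: int, alpha: float) -> int:
    """Count samples whose S1 exceeds alpha and all neighbouring S1 values."""
    scores = {p: s1(d, p, k) for p in range(k, len(d) - k)}
    peaks = 0
    for p, score in scores.items():
        neighbours = [scores[q] for q in range(p - k, p + k + 1)
                      if q != p and q in scores]
        if score > alpha and all(score > other for other in neighbours):
            peaks += 1
    return peaks
```

By Parseval's identity, the sum of squared DFT magnitudes equals the window length times the sum of squared samples, so in this form the energy feature could also be computed directly in the time domain.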

All the features are standardized; given a data set, the average and the standard deviation of each feature are computed and used to transform its values so that they have mean 0.0 and standard deviation 1.0.
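The standardization step can be sketched as follows. The function name `standardize` is hypothetical, and two choices are assumptions of the sketch rather than details given in the text: the population standard deviation is used, and a constant feature (zero deviation) is left centred instead of being divided by zero.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each feature (column) to mean 0.0 and standard deviation 1.0."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    # Assumption: population standard deviation; 1.0 replaces a zero deviation.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]
```

In a train and evaluate setting, the averages and deviations would normally be computed on the training windows only and then reused for the evaluation windows.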

2.2.2 Designing and deploying the ML part

The goal in this stage is to obtain models able to label the pre-processed EEG signal windows as normal or as PPR. Due to the data imbalance, the most interesting approach is to use unsupervised learning, so that anomalies can be detected. However, this might generate too many false positives, so it is also worth developing a complementary supervised learning solution. Therefore, for this research we propose to use, first, unsupervised learning to obtain a model that flags anomalous windows and, then, a supervised approach to classify the anomalous windows as PPR or normal. The unsupervised model is specific to a given subject, while the supervised model is a generalised one.

For this stage, we will develop models following the workflow proposed in Fig. 9. Data from the EEG sensors is windowed as explained before. For each window, the average is calculated and subtracted; the transformations from the previous subsection are calculated afterwards. A one-class classifier, learned for the current subject data, labels the window as normal or not. In case a window is labelled as an anomaly, then the two-class classifier, learned from other subjects, labels the window as PPR or not.
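The two-stage decision described above can be sketched as a small cascade. The function `label_window` and the two callbacks are hypothetical stand-ins for the trained one-class and two-class classifiers.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def label_window(
    features: Features,
    is_anomaly: Callable[[Features], bool],  # subject-specific one-class model
    is_ppr: Callable[[Features], bool],      # generalised two-class model
) -> str:
    """Label a pre-processed window as 'normal' or 'PPR'."""
    if not is_anomaly(features):
        return "normal"  # most windows stop here
    # Only anomalous windows reach the supervised classifier.
    return "PPR" if is_ppr(features) else "normal"
```

With this arrangement the two-class classifier is only consulted for windows already flagged as anomalous, which is how the second stage filters the false positives of the first.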

For the one-class classifier, an unsupervised one-class k-nearest neighbours model (1C-KNN) [12, 17] is tested: this model has been selected for its fast training and evaluation times while still being sufficiently accurate.
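One common way to build a one-class nearest-neighbour detector is to score a window by its distance to its k-th nearest training window and to flag it when that distance exceeds a threshold learned from the training data. The sketch below follows that formulation in plain Python so that it runs without dependencies; it is an assumption-laden stand-in, not the model used by the authors, and the class name `OneClassKNN` and the `quantile` parameter are hypothetical.

```python
import math
from typing import List, Optional, Sequence


class OneClassKNN:
    """Flags a sample whose k-th nearest-neighbour distance is unusually large."""

    def __init__(self, k: int = 3, quantile: float = 0.95) -> None:
        self.k = k
        self.quantile = quantile  # assumed way of setting the threshold
        self.X: List[List[float]] = []
        self.threshold = 0.0

    def _kth_distance(self, x: Sequence[float], skip: Optional[int] = None) -> float:
        dists = sorted(math.dist(x, y) for j, y in enumerate(self.X) if j != skip)
        return dists[self.k - 1]

    def fit(self, X: Sequence[Sequence[float]]) -> "OneClassKNN":
        self.X = [list(x) for x in X]
        # Leave-one-out score of every training window (all assumed normal).
        scores = sorted(self._kth_distance(x, skip=i) for i, x in enumerate(self.X))
        index = min(len(scores) - 1, int(self.quantile * len(scores)))
        self.threshold = scores[index]
        return self

    def is_anomaly(self, x: Sequence[float]) -> bool:
        return self._kth_distance(x) > self.threshold
```

Training only stores the windows and one threshold, which is consistent with the fast training time mentioned above; evaluation cost grows with the number of stored windows.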

For the two-class classifier we propose K-nearest neighbour (2C-KNN) due to the small number of instances in the available data set.

Both classifiers are from the scikit-learn library for Python [21].

Fig. 9 The workflow of the designed approach. The data gathered from the current subject are pre-processed. When a data window comes from frequencies smaller than or equal to fc (the cut frequency, which is the stimulation frequency value at which the first PPR appears), the window is preserved for the training of the one-class models; otherwise, the window is labelled as normal or as anomaly. In this latter case, the two-class classifier labels the window as PPR or not. However, when no window is labelled as including a PPR for the current frequency, the windows for this frequency are also considered and the one-class model is re-trained

2.2.3 Training the ML part

Training the models has two main stages, as can be seen in Fig. 10: (i) training the one-class model and (ii) training the two-class model. At this moment we have two collections of data: (a) a collection of windows from the current subject (CPData), all labelled as normal, and (b) a collection of windows from the historical records (HRData), each window with its corresponding normal or PPR label. CPData is used in the one-class training, while HRData is used in the two-class training. The first part of the training is the 2C-KNN learning using the HRData; if the data set is highly imbalanced, SMOTE will be used. Different values of the parameter K are tested for both classifiers to find the best performing model.

When analysing the data recorded for the current subject, an incremental training is proposed. The idea is to repeat the one-class classifier training until a real PPR is detected at a certain flashing frequency. That is, if the frequency to be tested is increased, for example, from 4 to 6 Hz and no PPR is detected at this new stimulation frequency, then the 1C-KNN is trained including the windows gathered from the frequency range covered so far (1–6 Hz); the next flashing frequency (8 Hz) is then evaluated, and the process is repeated until a PPR is detected. The stimulation frequency at which the first PPR is detected is the cut frequency (fc). The process is illustrated in Fig. 11.
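The incremental training loop can be sketched as follows. All names are hypothetical: `windows_at(f)` returns the feature windows recorded at frequency f, `train_one_class(data)` returns an anomaly predicate fitted on the windows kept so far, and `is_ppr_window` stands for the two-class classifier. The sketch also assumes that the windows of the first frequency are taken as the initial normal baseline, since no one-class model exists yet at that point.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Window = Sequence[float]


def find_cut_frequency(
    frequencies: Sequence[float],
    windows_at: Callable[[float], List[Window]],
    train_one_class: Callable[[List[Window]], Callable[[Window], bool]],
    is_ppr_window: Callable[[Window], bool],
) -> Tuple[Optional[float], List[Window]]:
    """Return (fc, training windows); fc is None when no PPR is detected."""
    training: List[Window] = []
    for index, f in enumerate(sorted(frequencies)):
        windows = windows_at(f)
        if index > 0:
            # Re-train on every window gathered at the frequencies tested so far.
            is_anomaly = train_one_class(training)
            if any(is_anomaly(w) and is_ppr_window(w) for w in windows):
                return f, training  # first PPR: f is the cut frequency
        # No PPR at this frequency: its windows join the training data.
        training.extend(windows)
    return None, training
```

The returned training list contains only windows from frequencies below fc, all of which were considered normal for the current subject.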

Fig. 10 All the data are windowed and the features are computed. Although all the training stages seem to be simultaneous, they are performed at different times

This process has been described for an increasing sequence of flashing frequencies; the standard frequencies used are 1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 25, 30, 40 and 50 Hz.

Additionally, the PPR analysis also includes a decreasing frequency study, starting at a certain frequency (e.g. 50 Hz) and decreasing the frequency until a PPR is found. The whole process should then be repeated but with fc set to the corresponding limit; the same workflow shown in Fig. 11 is mostly valid with minor changes: (i) the comparison operation is \(\ge\), and (ii) the 2C-KNN is specific to the decreasing flashing frequencies.

Fig. 11 Incremental learning of the one-class models once no PPR has been found for a certain flashing frequency f. The windows gathered for this f are also used to train the one-class models until the fc value is found. The figure shows the increasing flashing frequency case; a similar structure is proposed when decreasing the flashing frequency

3 Experimentation set-up

3.1 Evaluation of VR-ML IPS procedure

In the preliminary stage of the study, a series of intermittent light stimulation tests were performed under the supervision of physicians (clinical neurophysiologists) at the hospital. The experimentation was divided into two series of tests in order to determine whether the virtual reality stimuli have a photosensitivity impact, both on healthy subjects and on previously diagnosed photosensitive subjects.

As a starting point, the simplest response to look for in an EEG recording is a photic-driving response, that is, a common physiological response to light stimuli during an IPS, which is triggered by an intermittent photic retinal stimulation that can be detected in the alpha rhythm [7]. It consists of brain activity that matches the time and frequency or harmonic frequency of intermittent light stimuli [13]. It is usually greater when the light stimulation is close to the subject’s alpha rhythm.

A second series of tests was carried out to determine the ability of VR-Photosense to elicit photoparoxysmal responses. These tests looked for PPR induced by photic stimulation, as previously described in this paper, in existing EEG records corresponding to previously diagnosed photosensitive subjects.

The study is ongoing. Although the low number of subjects does not yet allow generalizations to be established, the initial series of tests yielded very promising results regarding the use of VR-IPS for the detection of both photic-driving and PPR responses, as described in the Results and Discussion section.

3.2 Evaluation of the ML-based PPR detection

The data set gathered from Burgos University Hospital is used to study the performance of the ML method for PPR detection. This data set includes ten anonymized EEG recording sessions from different photosensitive subjects, recorded with the hospital’s own equipment. Each session consisted of two continuous recordings while the IPS (the conventional IPS and the proposed VR-IPS) procedure was performed: the frequency of the photic stimulation was increased in the range 1–50 Hz for the detection of the minimum frequency (fc) that causes a PPR in each subject. The duration of each session varies in the range 3–5 minutes. The EEG signals were recorded at a sampling rate of 500 Hz from 19 electrodes placed according to the standardized 10–20 system, as shown in Fig. 12. Each recording was then manually labelled by the expert clinical neurophysiologist by visual analysis, labelling every PPR with its corresponding type: five subjects showed Type-4 PPR, while the other five subjects showed other types of PPR.

Fig. 12

Position of the 19 scalp electrodes used to record EEG signals according to the international 10–20 system of electrode placement as defined by the international federation of EEG societies

Both the one-class and the two-class classifiers are trained following a leave-one-participant-out cross-validation scheme. EEG windows from the current subject are therefore not used to train the 2C-KNN model; instead, the labelled windows from the remaining subjects form the HRData data set. Conversely, the data from the current subject form the CPData data set, which is used by the 1C-KNN classifier.
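
The leave-one-participant-out partition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the `Window` record layout, the function name `leave_one_participant_out` and the toy values are all hypothetical.

```python
from typing import Iterator, List, Tuple

# Hypothetical record layout (an assumption for this sketch):
# (subject_id, flash_frequency_hz, feature_vector, is_ppr)
Window = Tuple[str, float, List[float], bool]


def leave_one_participant_out(
    windows: List[Window],
) -> Iterator[Tuple[str, List[Window], List[Window]]]:
    """Yield (subject, hr_data, cp_data) for every subject in turn.

    hr_data: labelled windows from all other subjects (trains the 2C-KNN).
    cp_data: windows from the held-out subject (used by the 1C-KNN).
    """
    subjects = sorted({w[0] for w in windows})
    for current in subjects:
        hr_data = [w for w in windows if w[0] != current]
        cp_data = [w for w in windows if w[0] == current]
        yield current, hr_data, cp_data


# Toy data: three subjects with two windows each (values are made up).
toy = [
    ("P0", 10.0, [0.1, 0.2], False),
    ("P0", 20.0, [0.9, 0.8], True),
    ("P1", 10.0, [0.2, 0.1], False),
    ("P1", 20.0, [0.3, 0.2], False),
    ("P2", 10.0, [0.1, 0.1], False),
    ("P2", 20.0, [0.8, 0.9], True),
]

splits = list(leave_one_participant_out(toy))
for subject, hr, cp in splits:
    print(subject, "HRData windows:", len(hr), "CPData windows:", len(cp))
```

Splitting by subject rather than by window keeps every window of the held-out subject out of HRData, so the 2C-KNN is never evaluated on a subject it has seen during training.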

The HRData is used to train the 2C-KNN classifier; to do so, tenfold cross-validation is used to select the best parameter value. On the other hand, the CPData is used to train the 1C-KNN but also to evaluate the proposal. Therefore, the CPData is split in two: i) the data for a frequency \(f \le f_c\), where \(f_c\) is the current subject’s cut frequency—used to train the 1C-KNN—and ii) the data for a frequency \(f > f_c\), used to evaluate both the 1C-KNN and the 2C-KNN for those instances labelled as anomalies.
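
Selecting K for the 2C-KNN by tenfold cross-validation could look like the sketch below. Several points are assumptions made only for illustration: accuracy is used as the selection criterion (the text does not name one), the candidate values of K are arbitrary, the folds are formed by simple interleaving, and a synthetic two-cluster data set stands in for HRData. The oversampling step applied in the study is not included.

```python
import math
import random
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label: 1 = PPR, 0 = no PPR)


def knn_predict(train: Sequence[Sample], x: List[float], k: int) -> int:
    """Majority vote among the k training samples closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cv_accuracy(data: List[Sample], k: int, folds: int = 10) -> float:
    """Mean accuracy of k-NN over `folds` cross-validation folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]  # samples whose index j satisfies j % folds == i
        train = [s for j, s in enumerate(data) if j % folds != i]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds


def select_k(data: List[Sample], candidates: Sequence[int]) -> int:
    """Return the candidate K with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(data, k))


# Synthetic stand-in for HRData: two Gaussian clusters (made-up values).
rng = random.Random(0)
hr_data = [([rng.gauss(0, 1), rng.gauss(0, 1)], 0) for _ in range(60)]
hr_data += [([rng.gauss(4, 1), rng.gauss(4, 1)], 1) for _ in range(60)]
rng.shuffle(hr_data)

best_k = select_k(hr_data, candidates=[1, 3, 5, 7, 9])
print("best K:", best_k, "CV accuracy:", round(cv_accuracy(hr_data, best_k), 3))
```

Odd candidate values avoid tied votes in a two-class problem. On heavily unbalanced data, a criterion such as sensitivity may be a more informative choice than accuracy.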

The accuracy (ACC), the sensitivity (SEN) and the specificity (SPE) are used to measure the quality of the models. Furthermore, the mean, median and standard deviation across the ten folds of the cross-validation help in the evaluation of this model.
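
For reference, the three measures follow the usual confusion-matrix definitions, ACC = (TP + TN) / N, SEN = TP / (TP + FN) and SPE = TN / (TN + FP). The sketch below computes them and summarises them across folds. Two choices are assumptions of this sketch rather than details taken from the study: a ratio with an empty denominator is reported as 0, and the sample standard deviation is used.

```python
from statistics import mean, median, stdev
from typing import Dict, List, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """ACC, SEN and SPE for binary labels (1 = PPR, 0 = no PPR)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = (tp + tn) / len(pairs)
    # Assumed convention: an undefined ratio (empty denominator) is reported as 0.
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    spe = tn / (tn + fp) if (tn + fp) else 0.0
    return {"ACC": acc, "SEN": sen, "SPE": spe}


def summarise(fold_metrics: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Mean (MN), median (MDN) and sample standard deviation (STD) per measure."""
    summary = {}
    for name in ("ACC", "SEN", "SPE"):
        values = [m[name] for m in fold_metrics]
        summary[name] = {"MN": mean(values), "MDN": median(values), "STD": stdev(values)}
    return summary


# Three made-up folds; the second one contains no positive window at all.
folds = [
    confusion_metrics([1, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 1, 0]),
    confusion_metrics([0, 0, 0, 0], [0, 0, 0, 0]),
    confusion_metrics([1, 0, 0, 0], [1, 0, 0, 0]),
]
summary = summarise(folds)
for name, stats in summary.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

With this convention, a recording without any positive window in which nothing is flagged yields SEN = 0 together with ACC = SPE = 1, so a sensitivity of 0 in that situation reflects the absence of positives rather than missed detections.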

For the sake of simplicity, only the training and analysis for increasing flashing frequencies are shown. The complementary procedure for decreasing flashing frequencies would be performed in a similar manner, so it is omitted to avoid an overload of figures and numbers.

The experimentation is divided into two main parts: training and evaluation of the 2C-KNN using the HRData, and training and evaluation of the 1C-KNN using the CPData. Because this research focuses only on Type-4 PPR, the experimentation is performed considering only these phenomena; however, we also replicate the same experiments on the complete data set, considering all PPR types.

4 Results and discussion

4.1 Photic-driving response and PPR detection using VR-Photosense

Our initial testing of VR-Photosense, on both healthy and previously diagnosed subjects, confirms the expected correlation between the IPS and the EEG recording. Moreover, we have identified photic-driving responses (see Fig. 13) as well as PPR (see Fig. 14) using the Natus Neuroworks monitor. In the former, the expected photic-driving responses of a healthy subject can be seen after the photic stimulation, more strongly in the occipital and parietal channels, and at stimulation frequencies similar to the subject’s alpha rhythm, that is, their basal oscillating frequency when awake and at rest. In this case, the frequency spectrum of the brain response after stimulation at 10 Hz reveals a strong component around the same frequency, as expected. The latter shows how a PPR was triggered in a subject diagnosed with epilepsy right after photic stimulation at 30 Hz started. Each stimulation starting point is indicated by the purple labels at the top of the plots. As artefacts, we can see the electropositive blink deflection in both frontopolar regions and some muscular contraction activity, shown as bursts of fast frequencies that appear darker in the recording.

Feedback from this stage of the study has been used to refine the software and accommodate it for clinical use at the hospital.

Fig. 13

Photic-driving response identified on a healthy subject using VR-Photosense at the flash frequency 10 Hz

Fig. 14

PPR response identified on a previously diagnosed subject using VR-Photosense at the flash frequency 30 Hz

In this research, no differences have been found between the EEG responses recorded with the conventional IPS procedure and those recorded with the proposed VR-IPS procedure, suggesting that they are completely comparable. This assertion is based on the evidence that a clinical neurophysiologist cannot tell the difference between two EEG recordings, one registered while submitting the subject to the conventional IPS procedure and the other obtained while submitting the same subject to the new VR-IPS procedure.

At this point, testing VR-Photosense among diagnosed subjects is crucial to advance this study. At the time of writing this paper, the research team has the approval of the ethics committee of Burgos University Hospital to conduct a trial on several subjects who suffer from photosensitivity, which will allow us to measure the performance of VR-Photosense in detecting photoparoxysmal responses.

4.2 Performance of the ML-based PPR detection

The performance of the PPR detection is measured in two stages: the evaluation of the 2C-KNN and the evaluation of the 1C-KNN. In this way, it is possible to determine whether either of the classifiers needs improvement or whether the whole system is able to detect the PPR.

For the evaluation of the 2C-KNN, Table 3 includes the mean (MN), median (MDN) and standard deviation (STD) of the ACC, SEN and SPE for each case and for the best parameter K found in the leave-one-participant-out tenfold cross-validation of the corresponding HRData.

For the evaluation of the 1C-KNN, Table 2 includes the mean (MN), median (MDN) and standard deviation (STD) of the ACC, SEN and SPE for each subject and flashing frequency, and for the best parameter K found. In this case, the cut frequency (fc) started at 1 Hz to evaluate all possible flashing stimulation frequencies. Then the 1C-KNN is trained, and the next frequency data is labelled as anomalies or not. Finally the fc is increased in order to repeat the process with the next frequency data.
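
The sweep over the cut frequency can be sketched as follows. The scoring rule is an assumption of this sketch: a window is flagged as anomalous when its distance to its K-th nearest training window exceeds the largest such distance observed inside the training set itself (leave-one-out). The study may use a different one-class KNN decision rule, and all names and values below are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def kth_neighbour_distance(x: Vector, reference: List[Vector], k: int) -> float:
    """Distance from x to its k-th nearest point in `reference`."""
    dists = sorted(math.dist(x, r) for r in reference)
    return dists[min(k, len(dists)) - 1]


def fit_threshold(train: List[Vector], k: int) -> float:
    """Largest leave-one-out k-th-neighbour distance within the training set."""
    scores = []
    for i, x in enumerate(train):
        others = train[:i] + train[i + 1:]
        scores.append(kth_neighbour_distance(x, others, k))
    return max(scores)


def sweep_cut_frequency(
    windows_by_freq: Dict[int, List[Vector]], k: int
) -> Dict[int, List[bool]]:
    """For each f_c, train on f <= f_c and flag the windows of the next frequency."""
    freqs = sorted(windows_by_freq)
    flags: Dict[int, List[bool]] = {}
    for idx in range(len(freqs) - 1):
        train = [w for f in freqs[: idx + 1] for w in windows_by_freq[f]]
        if len(train) < 2:
            continue  # too few windows to estimate a threshold
        threshold = fit_threshold(train, k)
        next_freq = freqs[idx + 1]
        flags[next_freq] = [
            kth_neighbour_distance(w, train, k) > threshold
            for w in windows_by_freq[next_freq]
        ]
    return flags


# Made-up 2-D feature vectors grouped by flashing frequency (Hz).
toy = {
    1: [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]],
    2: [[0.1, 0.1], [0.05, 0.05]],
    3: [[0.0, 0.05], [5.0, 5.0]],  # the second window lies far from the rest
}
flags = sweep_cut_frequency(toy, k=2)
print(flags)
```

Because the training set grows with each increase of the cut frequency, the threshold is re-estimated at every step, and only windows at the next frequency are scored against the data recorded so far.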

In both tables, two different sub-tables are shown: the upper one corresponding to the performance of each classifier when taking into account all types of PPR; the lower one corresponding to the case where only Type-4 PPR are considered.

Table 2 Results of the anomaly detection performance for all PPR types (upper) and only for Type-4 PPR (lower) in channel Fz of the 1C-KNN classifier with K=15 neighbours. Mn = Mean. Mdn = Median. StD = Standard Deviation. *Since subjects from \(P_5\) to \(P_9\) do not have any Type-4 PPR, their results are not highlighted
Table 3 Results of the leave-one-out cross-validation of PPR detection performance for all PPR types (upper) and only for Type-4 PPR (lower) in channel Fz of the 2C-KNN classifier with K=21 neighbours. Mn = Mean. Mdn = Median. StD = standard deviation

As can be seen, the detection results are better in the case where only Type-4 PPR are considered by both classifiers, as might be expected since our method was designed to focus only on the detection of this type of PPR. However, there is still room for further improvements.

The results of the 1C-KNN classifier, computed with the best value found for the parameter K, i.e. K = 15 neighbours, are shown in Table 2. The first thing to note is that the results for subjects 5 to 9 show ACC and SPE values equal to or very close to 1, while SEN is equal to 0. This is because those subjects did not show any Type-4 PPR throughout their recordings. All other subjects presented at least one Type-4 PPR among all the PPR triggered during their sessions.

Although the figures in Table 2 are not impressive, the results obtained for the specific Type-4 PPR detection are slightly better than those obtained when all PPR types are considered, but only for some subjects; for two of them the performance of the 1C-KNN decreased. This might be because some PPR are not pure Type-4 PPR, making their detection even more challenging. Besides, the 1C-KNN trained only for Type-4 PPR performed perfectly for those subjects who did not show any Type-4 PPR; furthermore, the ACC and SPE values are very high.

As for the results of the 2C-KNN classifier shown in Table 3, the same comments apply. In this case, one of the main reasons for the poor performance could be the high imbalance of the HRData data set, which forces high rates of oversampling of the minority class and severe undersampling of the majority class when using SMOTE. As a consequence, the outcome of SMOTE might not represent the variability of the initial space, penalizing the overall results.
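
The effect described above can be illustrated with a simplified SMOTE-style interpolation combined with random undersampling. This is a sketch written for illustration, not the SMOTE implementation used in the study: the function names, the class sizes and the target size are assumptions.

```python
import math
import random
from typing import List

Vector = List[float]


def smote_like(minority: List[Vector], n_new: int, k: int, rng: random.Random) -> List[Vector]:
    """Create n_new synthetic minority samples by interpolating between neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself).
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: math.dist(base, m),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def undersample(majority: List[Vector], n_keep: int, rng: random.Random) -> List[Vector]:
    """Keep a random subset of the majority class."""
    return rng.sample(majority, n_keep)


rng = random.Random(42)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # only three real samples
majority = [[rng.uniform(2, 6), rng.uniform(2, 6)] for _ in range(100)]

target = 30  # assumed class size after balancing
new_points = smote_like(minority, target - len(minority), k=2, rng=rng)
balanced_minority = minority + new_points
balanced_majority = undersample(majority, target, rng)
print(len(balanced_minority), len(balanced_majority))
```

Every synthetic point lies on a segment joining two real minority samples, so a high oversampling rate adds no variability beyond what the few original samples already span, while undersampling discards most of the majority class.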

This study proposes several features to represent the EEG TS; the obtained results may suggest that these features are not sufficiently representative and that a more complete study of possible transformations is needed. Moreover, a feature extraction stage would be required when introducing more features, to avoid the difficulty of finding an ideal feature subset.

Additionally, the performance of KNN, either one-class or two-class, may suggest introducing more complex ML methods, such as random forests. However, due to the limited size of the data set available for this study, advanced ML methods were not used. Provided more data become available, these methods should be considered as robust alternatives.

Undoubtedly, the experimentation and the results shown in this study represent a very first step in this research; they must be considered a proof of concept. This is due to two main reasons. On the one hand, only ten patients have been studied so far, so many more patients must be analysed before concluding that VR-IPS is totally equivalent to the conventional IPS; if it is, however, the potential of VR will easily surpass the old procedures, introducing new medical research lines. On the other hand, the limited amount of data and its unbalanced nature represent a challenge for ML; obtaining accurate models certainly requires a larger amount of data, so that evaluation methods using either leave-one-participant-out or K-participant-fold cross-validation schemes become feasible and credible. Nonetheless, what is clear from the experiments and the obtained results is that VR-IPS represents very promising research that can potentially be extended to many other areas, such as the evaluation of Alzheimer’s disease.

There are several improvements that can be introduced to the ML model. First, a more in-depth analysis of a wider set of transformations from different domains must be carried out, such as temporal energy, statistical properties or spectral measures. Afterwards, unsupervised learning could reveal relationships between these features and the labels, pointing to more promising transformations. Furthermore, feature extraction (either principal component analysis or locally linear embedding) must be applied to reduce the feature set. With this new set of features, different models can be studied, such as random forests or support vector machines, or perhaps the KNN can prove to perform well. In any case, the models must be suited to this type of one-class and two-class problem on highly unbalanced data. Besides, data augmentation techniques can also be employed to increase the number of available experiments. If enough data are gathered from the experiments currently being carried out at Burgos University Hospital, deep learning (auto-encoders plus dense layers, or TS classification using, for instance, long short-term memory networks) and/or XGBoost could also be applied in combination with different techniques such as High Frequency Oscillations.

5 Conclusions

For this study, our research team has developed a novel and low-cost VR system that mimics and updates the conventional intermittent photic stimulation (IPS) systems. It is cross-platform and can be used with multiple types of VR devices. It can also be used in any professional environment and with any type of EEG recording device, as VR-Photosense is independent from the type of EEG headset being used.

Our VR-IPS stimulation has proved effective in identifying photic-driving responses in healthy subjects as well as PPR in previously diagnosed photosensitive subjects during the initial testing. We are currently extending this work to conduct a clinical trial on a large number of patients.

We also proposed an ML-based PPR detection procedure that extracts six standardized features (Cumulative First Derivative, Cumulative Second Derivative, Number of Relevant Peaks, Sum of Absolute Values, Maximum Differences, Average Energy) from the EEG windows and then sequentially applies two different versions of the K-nearest neighbours algorithm: the first one is an unsupervised one-class KNN classifier (1C-KNN) that detects anomalous activity in the EEG window; the second one is a supervised two-class KNN classifier (2C-KNN) that decides if the previously detected anomalous windows belong to PPR activity or not. This technique is designed to target only the detection of Type-4 PPR, as they represent a more dangerous photosensitivity state than the other types.

Despite the ML results, which were not as high as expected, the proposal can be considered a good proof of concept in that, with more research, a robust and resilient method for detecting PPR with VR+ML would be possible. However, due to the lack of large, good-quality EEG data sets, this has not yet been possible, although the team at Burgos University Hospital is working on gathering new data from subjects.

From the operational point of view, the learning curve of the new VR-IPS is similar to that of the conventional IPS system used every day at the hospital, the Natus Nicolet V44, according to the feedback provided by the physicians of the neurophysiology department at Burgos University Hospital who have been using the new system so far.

Furthermore, more alternatives can be explored concerning the ML solution: introducing more complex ML methods such as random forests, feature extraction techniques such as principal component analysis or locally linear embedding, or, when the amount of available data allows it, deep learning solutions such as auto-encoders plus dense layers or long short-term memory networks. All of these improvements represent future research work.