Spatiotemporal EEG Dynamics of Prospective Memory in Ageing and Mild Cognitive Impairment

Prospective memory (PM, the memory of future intentions) is one of the first complaints of those who develop dementia-related disease. Little is known about the neurophysiology of PM in ageing and in those with mild cognitive impairment (MCI). Using a novel artificial neural network to investigate the spatial and temporal features of PM-related brain activity can uncover new insights. Young adults (n = 30), healthy older adults (n = 39) and older adults with MCI (n = 27) completed a working memory task and two PM (perceptual, conceptual) tasks. Time-locked electroencephalographic potentials (ERPs) from 128 electrodes were analysed using a brain-inspired spiking neural network (SNN) architecture. Local and global connectivity from the SNNs was then evaluated. SNNs outperformed other machine learning methods in classifying brain activity between younger adults, older adults and older adults with MCI. SNNs trained using PM-related brain activity had better classification accuracy than those trained using working memory-related brain activity. In general, younger adults exhibited greater local cluster connectivity compared to both older adult groups. Older adults with MCI demonstrated decreased global connectivity in response to working memory and perceptual PM tasks but increased connectivity in the conceptual PM models relative to younger and healthy older adults. SNNs can provide a useful method for differentiating between those with and without MCI. Using brain activity related to PM in combination with SNNs may provide a sensitive biomarker for detecting cognitive decline. Cognitively demanding tasks may increase the amount of connectivity in older adults with MCI as a means of compensation.


Introduction
Mild cognitive impairment (MCI) represents an intermediary stage between typical ageing and dementia [1]. Individuals meeting the criteria for MCI are ten times more likely to develop Alzheimer's disease than those without MCI [2]. While there are currently no cures for dementia-related diseases, understanding early neurophysiological differences between those experiencing cognitive decline and healthy ageing may offer useful ways to evaluate the efficacy of interventions. Cognitive domains such as executive function and attention have been extensively investigated in MCI [3,4]. However, prospective memory (PM) is posited as a sensitive early biomarker for those at risk of dementia-related disease [5].
Prospective memory is the ability to remember to perform an action at a future point in time (e.g. remembering to take a medication at the correct time) [6]. PM constitutes a large part of everyday memory [7] and everyday memory failures [8]. The ubiquity of PM underpinning many activities of daily living is, therefore, an essential feature of autonomy. Often, PM is one of the first patient-reported complaints to family members or health professionals [9] and is subsequently of clinical relevance.
Little is known about the neurophysiology of prospective memory (PM) in healthy older adults and older adults experiencing mild cognitive impairment (MCI). Neurocognitive research has particularly implicated the anterior prefrontal cortex (aPFC) in PM function, demonstrating the importance of the aPFC in PM encoding, maintenance, and retrieval [10,11]. Concerning PM intention retrieval, cortical areas such as the insula, posterior cingulate cortex (PCC), and medial temporal lobe (MTL) are of particular importance [12]. These areas, and connections between them, are found to be impaired in older adults with MCI [13,14].
Electroencephalography (EEG) signals that are time-locked to the presentation of a stimulus or behaviour are known as event-related potentials (ERPs). ERPs provide high temporal resolution of underlying cortical activity [15]. It is understood that there are co-occurrences of the N300 ERP (300-500 ms) over the posterior scalp regions and the frontal positivity in anterior cortices [16], related to cue detection. A late parietal positivity complex (600-1200 ms), known as the prospective positivity, is related to the retrieval of the PM intention from memory [17]. ERP differences between young and older adults are routinely reported in PM [11,17,18]; however, only one previous study has evaluated neurophysiological differences in older adults with MCI, suggesting that poorer PM performance is due to early processing deficits in stimulus familiarity and the ability to reorient attention [19].
Past studies have provided important insights into understanding PM in ageing and cognitive decline. However, most extant analytical techniques create models by separately processing spatial and temporal information. For ERP data, this is problematic as the temporal component may display complex interactions that dynamically alter over time [20]. Recent developments in artificial intelligence allow us to overcome this limitation through models that can learn changes in temporal information over time while preserving the spatial relationship of the data. The third generation of artificial neural networks, known as spiking neural networks (SNN), can model spatiotemporal interactions with spatiotemporal brain data (STBD) in a biologically plausible way [21]. The application of SNNs to high-dimensional STBD has proved to be an effective way of modelling and extracting knowledge from a variety of data sets that possess time and space qualities [21]. Previous studies have proven the efficacy of SNN modelling in fMRI [22], resting-state EEG [23] and ERP data [20,24], enabling multidimensional learning, which can be interpreted through 3D visualisation, pattern recognition and classification.

Aims and Hypotheses
The current study aims to build on the SNN methodologies for modelling the spatiotemporal dynamics of ERP data. The study proposes new approaches for modelling, learning, visualising and extracting knowledge from ERP data related to working memory, PM and cognitive decline. The study aims to explore and further understand the spatiotemporal and functional differences between younger adults, older adults and older adults with MCI. Through the machine learning functionality of SNNs, this study will evaluate the efficacy of using PM as an early indicator of cognitive decline in older adults. For the most part, classification studies of MCI have used resting-state EEG data [25]. However, little research has evaluated the ability to classify brain activity of individuals with and without MCI when performing those tasks most relevant to their diagnosis, namely memory. Evaluating cognitive aspects of memory may be more effective than using resting-state or structural MRI data for classification of individuals with MCI [26]. By comparing the neurocognitive functioning of individuals performing working memory and PM tasks, the current study will determine which of these aspects of cognition is more effective in differentiating between the groups. Given that PM is one of the first cognitive complaints of those who go on to develop MCI [9], greater classification accuracy would be expected for PM stimuli. The current study is broken down into two experiments: firstly, ERP responses to working memory and PM will be modelled and classification accuracies evaluated; secondly, using statistical methods, as demonstrated in previous research [27], new knowledge of local (intra-region) and global (inter-region) connectivity will be extracted from the SNN models.
To this end, the current study hypothesises that (1) there will be differences in visualised SNNs between young adults, older adults and older adults with MCI; (2) SNNs will provide better classification accuracy between the groups when modelling responses to PM stimuli relative to working memory stimuli; (3) SNNs will have superior classification accuracy compared to traditional machine learning methods; and (4) there will be differences in local and global connectivity between young and older adults, and older adults with MCI will have decreased levels of connectivity at the local and global level.

Participants
Thirty young adults (YA; 13 females, mean age = 24.7 years, SD = 3.43), thirty-nine typically ageing older adults (OA; 24 females, mean age = 72.87, SD = 4.18) and twenty-seven older adults with a diagnosis of MCI (MCI; 12 females, mean age = 77.54, SD = 6.49) were recruited for the study. All participants were right-handed, had no history of dyslexia or of drug or alcohol abuse, and were fluent in English. Older adults with a diagnosis of MCI were referred via memory assessment clinics or through Join Dementia Research. All participants completed the Hopkins Verbal Learning Test-Revised (HVLT-R) [28] to ensure that individuals in the healthy ageing group were not likely to have Alzheimer's disease or to be experiencing MCI. Participants also completed the geriatric depression scale [29] to ensure symptoms of depression did not confound the results. The study was approved by the Health Research Authority, UK (REC reference: 17/EM/1010).

Procedure
Participants were seated approximately 60 cm from an LCD monitor. EEG equipment was attached before impedance checks were made. Participants first completed an ongoing-only working memory task before completing two PM tasks. The two PM tasks built upon the ongoing task with the PM cues embedded within the stream of ongoing task stimuli (Fig. 1). Participants, therefore, completed the PM task and the ongoing task simultaneously. PM instructions were given at the start of the task and were followed by a delay before the first PM cue appeared. In line with past research [30], only 10% of all stimuli were PM cues, allowing participants to re-engage with the ongoing stimuli and better simulating real-life PM events.

Ongoing Task
A 1-back word categorisation task was used as the ongoing task (Fig. 1a). Participants continuously decided whether the presented word was of the same category as the preceding word. Participants were instructed to press a button on a response box with the right index finger if the word was semantically related to the previous word. The ongoing task comprised 300 stimuli from the category norm database [31], with a 25% chance of a word belonging to the same semantic category as the previous word. Each word was presented for 500 ms with a 2-s stimulus onset asynchrony between words. Stimuli requiring a response during the ongoing task will be referred to as 1-back target.
Fig. 1 a The ongoing task. Related words are those which are from the same category as the previous word and require a response. Unrelated words are those that did not belong to the same category as the previous word and did not require a response. b The embedded prospective memory tasks. Examples of the perceptual and conceptual prospective memory cues are highlighted with grey bars

Prospective Memory Task
The PM paradigm (Fig. 1b) incorporates two PM conditions: perceptual and conceptual. The perceptual condition relates to visual features of the cue that prompt a PM response, whereas the conceptual condition relates to a semantic cue that is less salient than perceptual cues and therefore requires greater attention to remember. Participants were instructed to press a button with their right index finger when a word was of the same semantic category (ongoing task) and to press another button with their left index finger if there was a PM cue. For the perceptual PM condition (PM percept), participants were told to respond to words appearing in capitals (e.g. 'SWORD'). For the conceptual PM condition (PM concept), participants were told to respond to words denoting four-footed animals (e.g. 'cat'). Each PM task contained 600 stimuli, 10% of which were pseudo-randomly presented PM stimuli.

Electrophysiological Recording and Processing
Electroencephalographic activity was recorded with a 128-channel ActiveTwo acquisition system (BioSemi, Amsterdam, Netherlands) sampling at 2048 Hz. Data were collected using ActiView V6.05 (National Instruments, TX, USA). EEG preprocessing was performed in MATLAB R2019a (The MathWorks, Inc.) using custom scripts and EEGLAB. Data were referenced to linked mastoids and downsampled to 256 Hz. A high-pass finite impulse response (FIR) filter was applied at 0.01 Hz and a low-pass FIR filter at 35 Hz. Line noise was removed using CleanLine. Independent components (runica analysis) were visually inspected and artefacts were rejected. ERP epochs were 1000 ms from stimulus onset with a baseline correction.

Spiking Neural Network Computational Architecture for Modelling and Visualising Working Memory and Prospective Memory Activity
The proposed SNN architecture is an evolving spatiotemporal data machine (eSTDM) modelled on neuromorphic, brain-inspired SNN processing concepts [32]. It is designed to map brain data into a 3D brain space of spiking artificial neurons (AN) while preserving the topological information of the recorded brain activity. Principally, this architecture draws its inspiration from the biological rules (e.g. small-world (SW) connectivity and leaky integrate-and-fire models (LIFM)) which govern the memory and learning dynamics of neurons exhibited within the brain.
Each AN within the SNN behaves as an information-processing unit. It learns from the temporal data that is propagated through it, adapting and memorising the patterns of activity by influencing the interconnected neurons within the network. Akin to the brain, SNNs incorporate time into their computation and thus are superior in biological plausibility compared to previous neural networks that do not account for temporal dynamics. The architecture employed comprises several modules based on the evolving SNN framework [33]. As illustrated in Fig. 2, the modules consist of an input-encoder module (Fig. 2d), where data is encoded into spike-trains and the spatiotemporal variables are mapped into input neurons that transfer the spike-trains to the SNN model; a 3D SNN module (Fig. 2e), where the characteristics of space and time are recorded and learnt in an unsupervised mode; a visualisation module (Fig. 2f), where captured spatiotemporal connectivity of the brain can be visualised; an SNN classification module (Fig. 2g), where the spatiotemporal patterns from the 3D SNN module are classified or used to predict an output; an optimisation module (Fig. 2h), to fine-tune the parameters of the system; and a pruner module (Fig. 2i), where inactive ANs are removed and only functional ANs (ANs that emitted spikes during the unsupervised learning) and neural connections are retained for further analysis. The following steps detail the methods applied in this study:
1. The temporal data are encoded into sequences of spikes using the threshold-based representation algorithm [34].
2. A 3D SNN model of LIFM ANs is created, where the spatial mapping of the ANs is defined using the Talairach brain template [35].
3. The EEG channels are mapped as input ANs to their corresponding location in the Talairach template.
4. The mapped SNN model is initialised, where ANs are connected using the SW connectivity proposed in Braitenberg and Schüz [36].
5. The initialised SNN model is trained with the encoded spike sequences from ERP data, entering via the input ANs. The learning rule is the unsupervised spike-timing-dependent plasticity (STDP) [37] that changes the weight of the connection between every pair of connected ANs. During this process, the SNN model learns from the temporal information and forms pathways that can be interpreted, which the SNN will use to classify new information.
6. The spike sequences of the EEG data are again propagated through the SNN for supervised learning related to the classification tasks. Output ANs are created for each sample (i.e. one output AN for each participant). Each output AN is connected to all ANs of the 3D SNN model.
7. The deSNN algorithm is applied for supervised learning [38] and adapts the connections between the 3D SNN model and the output ANs.
8. For the classification of new temporal data, steps 6 and 7 are repeated. Then, the data are classified by applying the K-nearest neighbours algorithm, comparing the new output AN with the K nearest (most similar) output ANs created during step 6.

Input ERP Data Encoding in the SNN Model
The pre-processed and baseline corrected ERP data were firstly ordered into a temporal sequence of real-value vectors. These vectors were then encoded into a series of discrete spike-trains using a threshold-based representation method (TBR), demonstrated to be able to construct large scale networks with arbitrary, configurable synaptic connectivity [39]. This algorithm is employed to identify relevant changes in the ERP signal thus reducing noise. If the value of change in the signal surpasses a predefined threshold value, then a spike is encoded. Upward and downward changes in the ERP data are interpreted as positive or negative (1 or − 1) spikes. Bi-directional algorithms like this are well suited to EEG data due to their sensitivity to significant changes within gradient signal changes.
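The encoding rule described above can be illustrated with a minimal Python sketch. It assumes a fixed, user-supplied threshold applied to the sample-to-sample change of a single channel; the function name `tbr_encode`, the example values and the threshold of 1.0 are hypothetical and are not taken from the study, whose exact encoding is given by Formula 1 in Appendix 1.

```python
from typing import List, Sequence


def tbr_encode(signal: Sequence[float], threshold: float) -> List[int]:
    """Encode a real-valued signal into a bi-directional spike-train.

    A spike is emitted whenever the change between consecutive samples
    exceeds `threshold` in magnitude: +1 for an upward change, -1 for a
    downward change and 0 (no spike) otherwise.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    spikes = [0]  # the first sample has no preceding value to compare with
    for previous, current in zip(signal, signal[1:]):
        change = current - previous
        if change > threshold:
            spikes.append(1)
        elif change < -threshold:
            spikes.append(-1)
        else:
            spikes.append(0)
    return spikes


# Hypothetical single-channel ERP segment (microvolts), for illustration only.
erp = [0.0, 0.2, 1.5, 1.6, 0.1, 0.0, -1.2]
spike_train = tbr_encode(erp, threshold=1.0)
print(spike_train)  # [0, 0, 1, 0, -1, 0, -1]
```

Small fluctuations (here, changes of 0.1 or 0.2) produce no spike, which is how a threshold of this kind suppresses noise while retaining the larger deflections of the ERP.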

SNN Initialisation, Mapping, Learning, Classification and Visualisation
Following the encoding of the ERP signal to spike-trains (Formula 1, Appendix 1), a 3D SNN structure was created that can map the functional and structural characteristics of the data from which it is recorded. To this end, Talairach coordinates [35] were used to map the 128 EEG channels to the 3D SNN model (visualised as the green ANs in Fig. 2e). These coordinates define the position of the spiking ANs in a brain-like SNN model and the position of the EEG electrodes [40] as the input ANs. Each neuron in the network then represents 1 cm³ of the human brain and the entire network consists of 1471 ANs [41]. A LIF architecture was used to model the ANs [42] (Formula 2, Appendix 1). The SNN was initialised according to a biologically plausible model of SW connectivity [43], where neurons that are topographically closer possess stronger interconnectedness and therefore capture patterns of interest from the model. Following completion of the unsupervised learning, a deSNN algorithm [38] was used to train an output classifier in a supervised learning method. A rank-order (RO) learning rule [44] was applied to initialise the connection weights, and then the STDP rule [37] was used to adjust these weights according to the spikes that follow the initial spikes to the postsynaptic AN (Formula 3, Appendix 1). The STDP accounts for the timing of pre- and postsynaptic action potentials, causing automatic adjustments to be made to the synaptic strengths and sensitivity of the postsynaptic ANs, and consequently captures the spatiotemporal dynamics of the input data. Two other important variables for the classifier module are mod and drift (Formulas 4 and 5, respectively). Each training sample provided to the model is associated with an output AN, which is connected to all the other ANs in the 3D SNN, and the connection weights are initially set to zero. The weights of these output ANs change as a function of the RO learning rule, which itself is calculated by the order of incoming spikes (mod) from different connections. The earlier a spike arrives in the output AN from the 3D SNN, the greater its importance in increasing the corresponding connection weights. In terms of ERP data, it is useful to think of this as where the greatest amount of emphasis is placed within the ERP (i.e. toward the start or the end of the ERP). These newly formed connection weights will then increase or decrease according to the number of following spikes (drift) as the next spikes arrive at the AN over time.
Fig. 2 Proposed spiking neural network architecture for prospective memory ERP analysis. a ERP data collection across the three participant groups: younger adults, healthy older adults and older adults with mild cognitive impairment. b Recording of the data during the experimental working and prospective memory tasks. c ERP data is extracted and cleaned. d Cleaned ERP data is transformed into spike-trains. e Spike-trains for each EEG channel are propagated into a 3D space of artificial neurons via the input neurons, which are depicted as green artificial neurons. f Trained SNN can be visualised. g Output neurons are created and represent the final classification of the data. Each of the artificial neurons connects to the output neurons. h A grid search method is used to find optimal parameter settings for classifying between the participant groups. i Pruner module removes all connections which did not change for each group to create sparse models for each participant group, which are then visualised
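The roles of mod and drift described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the study's implementation: the first spike from each neuron sets its weight to mod raised to the arrival rank, each later spike adds a positive drift, and each later time step without a spike subtracts a negative drift. The function name, the neuron labels and the tie-breaking by dictionary order are hypothetical; Formulas 4 and 5 in Appendix 1 remain the authoritative definitions.

```python
from typing import Dict, List


def desnn_output_weights(
    spike_trains: Dict[str, List[int]],
    mod: float = 0.4,
    drift_up: float = 0.002,
    drift_down: float = 0.004,
) -> Dict[str, float]:
    """Weights from reservoir neurons to one output neuron for one sample.

    spike_trains maps a (hypothetical) neuron label to a 0/1 list, one
    entry per time step. The first spike of each neuron sets its weight by
    rank order (mod ** rank); later time steps drift it up or down.
    """
    weights = {name: 0.0 for name in spike_trains}  # weights start at zero
    fired = set()  # neurons whose first spike has already arrived
    rank = 0       # arrival order of first spikes (0 = earliest)
    n_steps = max(len(train) for train in spike_trains.values())
    for t in range(n_steps):
        for name, train in spike_trains.items():
            spike = t < len(train) and train[t] != 0
            if name not in fired:
                if spike:
                    # Earlier arrivals get larger weights because mod < 1.
                    weights[name] = mod ** rank
                    fired.add(name)
                    rank += 1
            elif spike:
                weights[name] += drift_up
            else:
                weights[name] -= drift_down
    return weights


# Hypothetical three-neuron example over three time steps.
example = {"a": [1, 0, 1], "b": [0, 1, 0], "c": [0, 0, 0]}
weights = desnn_output_weights(example)
```

In this example neuron "a" spikes first and receives the largest initial weight, "b" spikes second and starts at mod, and "c" never spikes, so its weight stays at zero.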

Experimental Framework
To extract the learnt patterns of activity of the SNN models, computational experiments are performed for each of the memory tasks and their class (i.e. participant group). Each class contains n samples which are used to train the SNN model and are validated through a tenfold cross-validation to assess the accuracy of the model. Once the best model is found, then it is possible to extract the individual contributions of each of the network classes over epochs of interest (200-400 ms, 400-800 ms, 0-1000 ms). After training, those ANs that did not emit a spike were identified and removed (pruned) along with their connections.

Knowledge Extraction from SNN Models
Knowledge was further extracted from the SNN models in two ways: 1. ANOVA was applied to test for differences in local connection weights in each of the SNN models as a function of group and topography (scalp region). 2. Network analysis was applied to uncover the global neurocognitive interactions between the different topographical areas for each of the SNN models.
The weights of the input ANs (i.e. EEG electrodes) were averaged according to the outlined topographical clusters depicted in Fig. 3 (Appendix 2), which were informed through previous PM ERP research [16,30,45,46]. Averaging the connection weights of the input neurons offers a way of understanding differences in local connection weight changes between the groups within a specific area. Average clusters were created as a means of controlling the number of comparisons [47,48] and to capture the mean of the AN weight changes within an area connected to input AN. Analyses were performed in JASP 0.10.2. A series of mixed measures ANOVAs was performed for each of the created SNN models. Therefore, each stimulus type at each time point (i.e. 200-400, 400-800 and 0-1000) was analysed separately using a 4 (Cluster: frontal, central, parietal, occipital) × 3 (Group: YA, OA, MCI) ANOVA to evaluate group differences at midline clusters. For lateral clusters, a 7 (Cluster: frontal, frontocentral, central, frontotemporal, parietal, inferior parietal, occipital) × 2 (Hemisphere: left, right) × 3 (Group: YA, OA, MCI) ANOVA was used to analyse group differences in the networks. Post-hoc (linear contrasts) tests were used to further explore group differences, and Bonferroni corrections were applied to account for multiple comparisons. Partial eta squared was reported for each Group effect as an indicator of effect size [49]. Previous research has demonstrated the effectiveness of using ANOVAs for understanding local intra-region wiring in young adults with and without autism spectrum disorder [27,50].
The averaged connection weights were then used for the network analysis to understand global connectivity between areas across the cortex at each time frame. The network is a graphical representation of the correlations between each of the clustered weights. In these networks, clusters are represented as nodes and the correlations as edges connecting (wiring) nodes together. The line thickness and transparency of the network graph represent the strength of the correlation, where thicker edges represent stronger correlations. The generated network displays are fixed in line with the cluster layout of the scalp map (e.g. Fig. 10) for ease of comparison between groups.
A network model that analyses all possible correlations within the network requires the estimation of many parameters, including n threshold parameters for the nodes and n*(n-1)/2 for pairwise correlations between nodes. Estimations in the current study are equal to 153 parameters. One available solution to this problem is to apply the 'least absolute shrinkage and selection operator' (LASSO) technique [51]. This method enables some edges to shrink to zero and be omitted from the model. This is achieved by LASSO through the continuous shrinking of coefficients toward 0 as λ increases. A benefit of applying LASSO is the ability to handle more variables than observations [52,53]. However, given the high expected correlations between topographically close variables, i.e. averaged cortical clusters, the irrepresentable condition assumption would be violated [53]. This assumption requires that variables relevant to the model may not be highly correlated with irrelevant variables. An alternative method proposed by Zou [54], known as 'adaptive least absolute shrinkage and selection operator' (adaLASSO, Formula 6, Appendix 1), can be employed to adjust for this violation. While variables are all equally penalised with the LASSO method, variables are assigned different weights in adaLASSO and can subvert the irrepresentable condition assumption.
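As a worked illustration of the counts and the penalty discussed above, the block below assumes n = 18 nodes (the 4 midline clusters plus 7 lateral clusters in each hemisphere), which is consistent with the 153 pairwise correlations reported. The second expression is the generic regression form of the adaptive LASSO with data-dependent weights; the symbols are illustrative and the notation of Formula 6 in Appendix 1 may differ.

```latex
% Assumption: n = 18 nodes (4 midline + 7 x 2 lateral clusters)
\frac{n(n-1)}{2} = \frac{18 \times 17}{2} = 153

% Generic adaptive LASSO: each coefficient receives its own penalty weight
\hat{\beta}^{\mathrm{ada}} = \arg\min_{\beta}
  \Bigl\lVert y - \sum_{j=1}^{p} x_j \beta_j \Bigr\rVert^{2}
  + \lambda \sum_{j=1}^{p} \hat{w}_j \lvert \beta_j \rvert ,
\qquad
\hat{w}_j = \frac{1}{\lvert \hat{\beta}_j^{\mathrm{init}} \rvert^{\gamma}}, \quad \gamma > 0
```

Because the weight grows as the initial estimate shrinks, weak edges are penalised more heavily than strong ones, whereas the ordinary LASSO corresponds to setting every weight to 1.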
The adaLASSO produces a sparse, more conservative network model with only a small number of edges, enabling a more interpretable model of the relationship between node weights for each participant group. Prior to analysis, a tuning parameter is required to control the level to which the omission of small correlations is applied. This tuning parameter was selected through bootstrapping and was validated using cross-validation. The importance of each node in the network was then evaluated through betweenness and degree indices. Betweenness provides a measure of the number of shortest paths passing through a specific node. A node with higher betweenness is said to have more network control as more information is passing through that node [55]. Degree is the total number of connections each node has, therefore indicating the strength of links to other areas (reported as 'node strength'). A node with a higher degree can be thought of as having a greater influence on connecting nodes [56].
Fig. 3 Topographical clusters used for ANOVAs and network analyses. Purple, frontal; green, frontocentral; yellow, central; red, frontotemporal; blue, parietal; pink, occipital; grey, inferior parietal
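The two node indices described above can be sketched in plain Python. This is an illustrative simplification, not the analysis software used in the study: node strength is taken as the sum of absolute edge weights, and betweenness is computed on hop counts (ignoring edge weights) by counting shortest paths between all other node pairs. The cluster names and edge values in the example are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]


def node_strength(nodes: List[str], edges: List[Edge]) -> Dict[str, float]:
    """Sum of absolute edge weights attached to each node."""
    strength = {n: 0.0 for n in nodes}
    for a, b, w in edges:
        strength[a] += abs(w)
        strength[b] += abs(w)
    return strength


def _bfs(source, adjacency):
    """Hop distance and number of shortest paths from source to each node."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                paths[v] += paths[u]
    return dist, paths


def betweenness(nodes: List[str], edges: List[Edge]) -> Dict[str, float]:
    """Unweighted betweenness: shortest paths between other node pairs
    that pass through each node (fractional when several paths tie)."""
    adjacency = {n: [] for n in nodes}
    for a, b, _ in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    info = {n: _bfs(n, adjacency) for n in nodes}
    score = {n: 0.0 for n in nodes}
    for i, s in enumerate(nodes):
        dist_s, paths_s = info[s]
        for t in nodes[i + 1:]:
            if t not in dist_s:
                continue  # s and t are not connected
            for v in nodes:
                if v in (s, t) or v not in dist_s:
                    continue
                dist_v, paths_v = info[v]
                # v lies on a shortest s-t path if the distances add up
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    score[v] += paths_s[v] * paths_v[t] / paths_s[t]
    return score


# Hypothetical sparse network of four clusters joined in a chain.
clusters = ["frontal", "central", "parietal", "occipital"]
links = [("frontal", "central", 0.6), ("central", "parietal", -0.4),
         ("parietal", "occipital", 0.5)]
strength = node_strength(clusters, links)
between = betweenness(clusters, links)
```

In this chain the two interior clusters carry every path between the end clusters, so they have the highest betweenness, while node strength depends only on the magnitude of the edges attached to each cluster.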
Graphical models of brain data have proven their efficacy in a variety of imaging modalities, such as fMRI [57], EEG [58,59], magnetoencephalography (MEG) [60] and DTI [61]. Of these, sparse graphical models are distinctly efficient at determining connectivity in highly interconnected brain data and at offering a robust and interpretable model of the most significant interactions between cortical areas [62].
To explore the differences in cognition between YA, OA and MCI, an SNN architecture was used to model, learn, classify and visualise the EEG data related to the different memory tasks (1-back target, PM percept and PM concept). A SW connectivity radius of 2.5 units (distance between two consecutive neurons) was used for the SNN model, which has previously demonstrated its effectiveness for ERP modelling using SNN [24]. The SW connectivity rule allows the network the potential to form neuronal connections two ANs away in each of the x, y, z directions of the coordinate space. Small random weights are applied to each neuron (−0.1, +0.1). In previous studies modelling EEG data with SNNs, an 80/20 positive-negative initial connection weight ratio has been applied [23,63]. This ratio of inhibitory neurons is reflective of the 20-30% of inhibitory, GABAergic neurons found in the mammalian brain and is demonstrated as an optimal percentage for maximising the learning of a neural network [64,65]. However, initial results demonstrated that this ratio was not optimal for modelling the current data. Figure 5 illustrates the trained network for the 80/20 positive-negative ratio (a) and the 50/50 positive-negative ratio (b). The results show a greater amount of model learning for the 50/50 ratio, as evidenced by the greater number of connection changes from the initial connections. Therefore, a model containing 50% inhibitory connections demonstrated a better level of discrimination between the classes and was subsequently used for modelling the ERP data.
Fig. 4 Flow-chart of the proposed work for understanding spatiotemporal dynamics of prospective memory between younger, older and older adults with mild cognitive impairment

ERP Data Modelling Using the SNN Architecture
Similar to a biological neuron, when the simulated LIF AN receives spikes over time, its membrane potential increases until it reaches a pre-defined threshold. When the AN fires and emits an output spike, it cannot produce a new spike within a refractory period. The membrane potential can also leak between spikes, by an amount defined by a leak parameter. The training of the SNN model requires EEG signals to be transformed into a spike-train of binary positive and negative spikes (−1 or 1; Fig. 6). These spikes reflect the changes in amplitude of the EEG signal and are created based on an encoding algorithm. A bi-directional TBR [34] was applied to the gradient of each EEG channel's signal over the time series. The neural connections in the initialised SNN model were later modified during an unsupervised learning process with the input spikes streaming to the SNN model via input ANs.
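The threshold, refractory and leak behaviour described above can be sketched for a single neuron as follows. This is a minimal illustration rather than the model of Formula 2 in Appendix 1: it assumes a subtractive leak applied at every time step, a potential that is floored at zero and reset to zero after firing, and inputs that are ignored during the refractory period. The default values echo the optimised settings reported in this paper, with 0.005 taken here to be the leak, which is an assumption.

```python
from typing import List, Sequence


def simulate_lif(
    inputs: Sequence[float],
    threshold: float = 0.5,
    leak: float = 0.005,
    refractory: int = 5,
) -> List[int]:
    """Return a 0/1 output spike-train for one leaky integrate-and-fire neuron.

    inputs[t] is the summed weighted input arriving at time step t.
    """
    potential = 0.0
    rest_left = 0  # remaining refractory time steps
    output = []
    for current in inputs:
        if rest_left > 0:
            rest_left -= 1  # inputs are ignored while refractory
            output.append(0)
            continue
        # Integrate the input, apply the leak and keep the potential >= 0.
        potential = max(potential + current - leak, 0.0)
        if potential >= threshold:
            output.append(1)
            potential = 0.0  # reset after firing
            rest_left = refractory
        else:
            output.append(0)
    return output


# Hypothetical constant drive of 0.3 per time step for ten steps.
spikes = simulate_lif([0.3] * 10)
print(spikes)  # [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
```

With this constant input the neuron needs two steps to reach threshold, fires, stays silent for the five refractory steps and then repeats the cycle.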
The model was then trained using these spike-trains in an unsupervised mode employing an STDP learning rule [37]. The application of this algorithm allows spiking ANs to learn successive temporal relationships between data points across and within EEG channels. These connections in the model architecture can be analysed and used to draw new understanding of the data. Figure 7 shows the final SNN following the creation of neuronal connections during STDP learning, which reflect the dynamic patterns of connectivity.
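A pair-based STDP update of the kind described above can be sketched as follows. This is a hedged illustration, not the rule of Formula 3 in Appendix 1: it assumes that a connection is strengthened when the presynaptic spike precedes the postsynaptic spike and weakened when it follows, with the change scaled by the inverse of the time difference. The function name and the scaling are assumptions; the default rates echo the optimised values reported in this paper.

```python
def stdp_update(
    weight: float,
    t_pre: int,
    t_post: int,
    rate_pos: float = 0.002,
    rate_neg: float = 0.003,
) -> float:
    """Return the updated weight for one pre/post spike pair.

    t_pre and t_post are spike times in time steps. Spikes closer in time
    produce larger changes because the rate is divided by the time gap.
    """
    gap = t_post - t_pre
    if gap > 0:
        # Pre fired before post: strengthen the connection.
        return weight + rate_pos / gap
    if gap < 0:
        # Pre fired after post: weaken the connection.
        return weight - rate_neg / (-gap)
    return weight  # simultaneous spikes leave the weight unchanged


# Hypothetical spike times, for illustration only.
strengthened = stdp_update(0.05, t_pre=10, t_post=12)
weakened = stdp_update(0.05, t_pre=12, t_post=10)
```

Repeated over every connected pair of ANs and every spike, updates of this kind are what allow consistent temporal orderings across EEG channels to be stored as connection weights.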
When the supervised learning process is completed, the connection weights between the output ANs and the 3D SNN model are established. Then in the validation phase, the new ERP samples which were excluded from the learning phases are used to test the model. For every new testing ERP sample, an output testing AN is evolved and connected to the already trained SNN model, and its connections are modified while the ERP sample is passed to the SNN model. Then for classification of this testing AN, a K-nearest neighbour (KNN) algorithm was used, where the newly formed testing AN connection vector is compared with the existing output ANs' connections, and the top k similar output ANs (referring to the top similar ERP samples) identify the class label of this testing AN (ERP sample). This procedure is repeated for all the testing samples, one by one, through creating different output testing samples and classifying them.
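The classification step described above can be sketched with a small K-nearest-neighbour function in Python. It assumes that similarity between output ANs is measured as Euclidean distance between their connection-weight vectors and that the class is decided by majority vote; both are assumptions for illustration, and the two-element vectors and labels below are hypothetical.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple


def knn_classify(
    new_vector: Sequence[float],
    trained: List[Tuple[Sequence[float], str]],
    k: int,
) -> str:
    """Label a new output-neuron weight vector by its k nearest neighbours.

    trained holds (weight_vector, class_label) pairs, one per training
    sample, i.e. one per output neuron created during supervised learning.
    """
    neighbours = sorted(trained, key=lambda item: dist(item[0], new_vector))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical weight vectors for five trained output neurons.
trained_neurons = [
    ([0.9, 0.1], "YA"),
    ([0.8, 0.2], "YA"),
    ([0.2, 0.9], "MCI"),
    ([0.1, 0.8], "MCI"),
    ([0.5, 0.5], "OA"),
]
predicted = knn_classify([0.85, 0.15], trained_neurons, k=3)
print(predicted)  # YA
```

In the study each weight vector would have one entry per AN in the 3D SNN rather than two, but the comparison and voting logic would follow the same pattern under these assumptions.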
A grid search was used in the current study to fine-tune a combination of parameters and reduce the classification error. Each parameter was searched within a range, specified by a minimum and maximum, through 5 iterations. Tenfold cross-validation was used to validate the results. Therefore, for every model creation, 78,125 iterations of training (using the 9 folds of samples other than the holdout fold) and testing (using the holdout fold) were performed with different combinations of these parameters. The parameters that resulted in the best accuracy are reported as the optimal parameters:
• The threshold for firing was set to 0.5, the refractory time was set to 5 and the LIF neuron model was set to 0.005.
• The STDP rate of the unsupervised learning algorithm was set to 0.002 for positive synaptic modifications and 0.003 for negative connections.
• The mod parameter was set to 0.4 and the positive and negative drifts were set to 0.002 and 0.004, respectively.
• The KNN was set to 13 nearest neighbours.
Fig. 6 Example of how an event-related potential is encoded into a spike-train. a Event-related potential taken from one participant (electrode Cz) in response to a conceptual prospective memory stimulus. b Event-related potential encoded into a binary spike-train using the TBR algorithm
The optimisation procedure finds the best performing model for each of the SNN models. Table 1 presents the results for the final test fold (i.e. the held-out fold) of the best performing model. Each of the SNN models was compared against other machine learning methods. Model accuracies were calculated using confusion matrices. For each memory stimulus type, the SNN model outperformed the other methods. Additionally, the results show that the PM percept and PM concept SNN models were better at classifying brain activity of the groups (83.33% and 80%, respectively) than the 1-back target models (73.94%). With the use of ERP data, YA are well discriminated from the other two groups, while the ERPs of OA and MCI overlap to a certain degree. From the trained 3D SNN networks, we can begin to see the patterns of connectivity emerge.
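The figure of 78,125 combinations quoted for the grid search matches what an exhaustive grid over seven parameters with five values each would produce (5^7 = 78,125). The sketch below enumerates such a grid. The parameter names and ranges in `RANGES` are placeholders chosen for illustration, since the searched ranges are not listed here, and the helper `grid` is hypothetical rather than the software used in the study.

```python
from itertools import product

# Hypothetical (minimum, maximum) search ranges; not taken from the study.
RANGES = {
    "threshold": (0.1, 0.9),
    "refractory": (1.0, 9.0),
    "leak": (0.001, 0.009),
    "stdp_rate": (0.001, 0.005),
    "mod": (0.2, 1.0),
    "drift": (0.001, 0.005),
    "k": (1.0, 21.0),
}


def grid(ranges, steps=5):
    """Yield every combination, using `steps` evenly spaced values per range."""
    names = list(ranges)
    axes = [
        [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
        for lo, hi in ranges.values()
    ]
    for combo in product(*axes):
        yield dict(zip(names, combo))


# Seven parameters with five values each give 5 ** 7 combinations.
n_combinations = sum(1 for _ in grid(RANGES))
```

In a full pipeline, each yielded dictionary would configure one round of tenfold cross-validation, and the combination with the best held-out accuracy would be kept.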
To understand within-group differences, the training samples were separated by propagating only the information for one class (i.e. YA, OA or MCI) through newly created networks that kept the same initialised network and parameter settings attained during the supervised learning stage. By using the same initialised connections and only allowing the EEG data from one group to make changes to the network, different patterns of connectivity can be learnt for each group. Thus, three separate SNN models were trained, one for each class. The initialised SNN models were modified during STDP learning, which adapted the spatiotemporal connections. After training, the neural connections that had not changed in each model were considered inactive and were pruned from the network.
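The pruning rule described above amounts to comparing each connection weight before and after class-specific training and discarding the connections that did not move. The sketch below assumes, for illustration only, that weights are stored in dictionaries keyed by (source, target) neuron pairs; the name `prune_unchanged` and the optional tolerance `tol` are hypothetical.

```python
def prune_unchanged(w_init, w_trained, tol=0.0):
    """Keep only the connections whose weight changed during training.

    Both arguments map (source, target) pairs to weights. A connection is
    treated as inactive when |trained - initial| <= tol.
    """
    return {
        conn: w
        for conn, w in w_trained.items()
        if abs(w - w_init[conn]) > tol
    }


# Toy example with three connections (hypothetical values).
w_init = {(0, 1): 0.05, (1, 2): -0.05, (0, 2): 0.05}
w_trained = {(0, 1): 0.09, (1, 2): -0.05, (0, 2): 0.01}
active = prune_unchanged(w_init, w_trained)
```

Because every class-specific model starts from the same initial weights, the set of surviving connections differs between groups only through the data that each model was trained on.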
The removal of inactive ANs enables the creation of fine-tuned, sparse networks [66] showing only the most important connections for the pre-trained class, enabling better visualisation of the differences between the groups. This step was performed across three different time periods to reveal the neural connections across time. The first time period was selected as 200-400 ms to capture the early cognitive processes associated with cue detection and monitoring [67]. The second time period was 400-800 ms, which encapsulates the later processing of stimuli and is related to deeper contextual and memory processes [16]. Finally, the full epoch was propagated through the network for each class to understand the learnt connections across the entire data range. The pruned networks are visualised for each stimulus type in Fig. 8 for the 1-back target, Fig. 9 for PM percept and Fig. 10 for PM concept.
From Fig. 8, it is apparent that similar patterns of activity occur for each of the defined time periods across the groups. However, it is difficult to see how the network differentiates between the classes in either the 200-400 ms or the 400-800 ms time range. Differences appear to be more prominent when the whole 0-1000 ms epoch is pruned for each of the classes. This may suggest that information outside of the 200-800 ms window is important in understanding ageing and cognitive decline. In particular, differences between OA and MCI appear to be substantial in the 0-1000 ms PM concept model, where there are fewer positive and negative connections for the MCI group. Interestingly, the pruning method appears to have removed many ANs in the left frontotemporal region for the 0-1000 ms model in the MCI group in each of the models, implying relatively little spatiotemporal activity for the 1-back target during the working memory task.
Additionally, for PM stimuli across the wider epoch (0-1000 ms), MCI demonstrated fewer positive and inhibitory connections relative to OA and YA. For the PM percept stimuli, these inhibitory connections are spread more globally across the 3D SNN for the healthy groups (YA, OA), but within the MCI group the inhibitory connections are restricted to occipitoparietal, frontocentral and frontal regions. It appears inhibitory connections are spread similarly across the network, albeit with fewer inhibitory connections overall for the MCI group. However, there appeared to be more positive connections for the healthy groups relative to MCI.
To validate the visualised changes from the pruned SNN models, histograms were plotted showing the pruned SNN connection weights (Fig. 11a). Compared to the initialised weights of each SNN, each model placed more emphasis on negative connection weights through training. Moreover, it is apparent that the weights now follow a somewhat Laplacian distribution, characterised by the heavy tails demonstrated in the QQ-plots (Fig. 11b) and the high kurtosis values (Table 2), with the addition of the failure to reject the distribution being from a normal distribution (Table 2). This distribution type has been shown to respond well to variable selection methods, such as different applied methods of the adaLASSO [68].
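As a point of reference for the kurtosis values mentioned above, a Laplace distribution has a theoretical excess kurtosis of 3, whereas a normal distribution has an excess kurtosis of 0. The sketch below estimates both from simulated samples using only the standard library. The samples are synthetic and are not the study's connection weights, and the helper `excess_kurtosis` is an illustrative name.

```python
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis: m4 / m2**2 - 3, using central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 100_000

# Standard normal sample: excess kurtosis should be close to 0.
normal = [rng.gauss(0.0, 1.0) for _ in range(n)]

# The difference of two independent unit-rate exponentials is Laplace(0, 1),
# whose heavier tails should give an excess kurtosis close to 3.
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]

k_normal = excess_kurtosis(normal)
k_laplace = excess_kurtosis(laplace)
```

A clearly positive excess kurtosis, as in the simulated Laplace sample, is the numerical counterpart of the heavy tails seen in a QQ-plot.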

Knowledge Extraction from Spiking Neural Networks
Due to the non-normal distribution of the SNN weights, the connection weights were first transformed using a natural logarithm. The adaLASSO regularisation was then performed to discover the most important connections for the early (200-400 ms) and later (400-800 ms) pruned SNN epochs, along with the complete (0-1000 ms) epoch. A 10-fold cross-validation was performed for each of the variables. To increase the robustness of the results, each process was repeated 1000 times. Network plots were mapped to a scalp array according to their topographical features. For example, Fig. 12 graphically illustrates the most important edges within the working memory (1-back target) network as a result of non-essential connections being forced to zero. The thickness and colour intensity of the lines are proportional to the wiring connection strength, given as edge weight (EW). Positive wiring connections are displayed in blue and represent the strength of a positive correlation between two regions and red line show

Ongoing Working Memory (1-back target)
A summary of all significant local connectivity Group effects and interactions for the 1-back target stimuli is presented in Table 3. No Group differences were found at midline clusters or in the 0-1000 ms SNN model. Graphical networks are presented in Fig. 12.

Ongoing: 200-400 ms ANOVA Connection Weights
At bilateral clusters, there was a significant Cluster × Group interaction due to greater connection weights for YA relative to OA (frontal, p = 0.019; frontocentral, p = 0.009; central, p < 0.001; frontotemporal, p = 0.005; inferior parietal, p = 0.029) and YA relative to MCI (parietal, p = 0.011). OA had greater connection weights at bilateral central clusters relative to MCI (p = 0.008). The interaction was also explained by an effect of Cluster for all participants, where bilateral central clusters had larger connection weights than all other clusters (ps < 0.001). In YA, bilateral inferior parietal cluster connection weights were larger than in parietal clusters (p < 0.001).

Ongoing: 400-800 ms ANOVA Connection Weights
At lateral clusters there was a significant effect of Group, where YA had significantly greater connection weights across all bilateral clusters than OA (p < 0.001) and MCI (p < 0.001). No differences were found between the OA and MCI groups (p > 0.05). There were no other significant Group effects for 1-back target stimuli.

Ongoing Task: Network Analysis 200-400 ms
For the 200-400 ms network analysis of SNN models of the working memory ongoing task, the number of edges reduced to zero was approximately 95%, 97% and 98% for YA, OA and MCI, respectively. The centrality indices reveal a common area of importance across all participant groups over the right parietal cluster (YA node strength =

Perceptual Prospective Memory (PM percept) Statistical Analysis
A summary of all statistically significant Group effects from the ANOVAs is presented in Table 4. No significant Group differences were found over midline clusters. Graphical networks are presented in Fig. 13.

Perceptual PM Network Analysis: 200-400 ms
Network models for 200-400 ms induced a sparsity level of 97% for OA and 98% for the YA and MCI groups. As seen in Fig. 13, the right frontal cluster has strong associations with neighbouring clusters across all participant groups. In YA, the network reveals a strong positive connection between the right frontal and the right frontocentral cluster (EW = 0.33), which also exhibits significant network control

Perceptual PM Network Analysis: 400-800 ms
For the network models of 400-800ms, approximately 91%, 96% and 98% of all edges were set to zero for the YA, OA and MCI groups, respectively. The models demonstrate the importance of the right frontocentral cluster over this epoch as all participant groups show similar levels of degree strength (YA node strength = 1.55; OA node strength = 1.91; MCI node strength = 1.08), although only significant levels of betweenness for YA (2.70) and OA (3.56) were  connections, where network control is ascribed to the right frontocentral cluster (betweenness = 2.57). The same local network is not found in the MCI group, but a negative connection is found for right frontocentral-left parietal clusters (EW = −0.38).

Conceptual Prospective Memory (PM concept) Statistical Analysis
All significant Group effects from the ANOVAs are presented in Table 5. There were no significant Group effects over midline clusters. Graphical networks are presented in Fig. 14.

Conceptual PM 0-1000 ms ANOVA Connection Weights
At lateral clusters, a significant Hemisphere × Group interaction is explained by a significant effect of Group in the right hemisphere (MCI > YA, p = 0.024) and a significant effect of Hemisphere in OA and MCI, such that the right hemisphere connection weights were significantly greater than those of the left hemisphere (OA: p < 0.001; MCI: p = 0.007).

Conceptual PM Network Analysis: 200-400 ms
Over 200-400 ms, approximately 97% of edges for YA and OA and 95% of edges for the MCI group were set to zero.
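The sparsity percentages and node strengths reported in these network analyses can be computed directly from an edge list. The sketch below is illustrative only: the edge dictionary, the cluster abbreviations and the edge weights are hypothetical, and node strength is taken here as the sum of absolute weights of the edges attached to a node, which is an assumed definition rather than one confirmed by the text.

```python
def sparsity(edges):
    """Fraction of edges whose weight was shrunk to exactly zero."""
    return sum(1 for w in edges.values() if w == 0) / len(edges)


def node_strength(edges, node):
    """Sum of absolute edge weights over all edges that touch `node`."""
    return sum(abs(w) for (a, b), w in edges.items() if node in (a, b))


# Hypothetical regularised network over four scalp clusters.
edges = {
    ("RF", "RFC"): 0.5,    # right frontal to right frontocentral
    ("RFC", "LP"): -0.25,  # right frontocentral to left parietal
    ("RF", "LP"): 0.0,
    ("LF", "LP"): 0.0,
}
share_zero = sparsity(edges)
strength_rfc = node_strength(edges, "RFC")
```

Using absolute values means that strong negative connections contribute to a node's strength as much as strong positive ones.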

Conceptual PM Network Analysis: 400-800 ms
Model sparsity was approximately 90% for YA, 96% for OA and 94% for MCI. There are several network similarities

Discussion
The current study aimed to improve the modelling and understanding of the neurocognitive dynamics that underpin PM across young adults, older adults and older adults experiencing MCI using a brain-inspired computational framework. Through the classification functionality of the SNN architecture, the current experiment sought to determine whether PM stimuli would enable better classification accuracy of brain activity between the groups compared to working memory stimuli. Two conclusions can be drawn from the results: firstly, greater accuracy was achieved by using an SNN methodology relative to the traditional ML methods (SVM, MLP, MLR); secondly, SNNs classified brain activity between young adults, older adults and older adults with MCI more accurately when trained on brain responses to PM stimuli than on responses to the ongoing working memory task.
This study also sought to uncover new insights into spatiotemporal connectivity between the groups through a series of ANOVAs and network analyses on each of the SNN models. In general, younger adults exhibited greater local cluster connectivity compared to healthy older adults and older adults with MCI, indicated through increased wiring across the scalp. Network analyses largely show that older adults with MCI had decreased wiring across the cortex in response to working memory and perceptual PM tasks. However, network activity in response to conceptual PM revealed greater cortical wiring in those with MCI relative to young and healthy older adults.
Prospective memory may have enabled greater classification accuracy relative to working memory due to the higher cognitive demands required to complete the PM tasks [69]. Coping with additional cognitive demand is problematic for older adults with MCI [70,71], which would be reflected in neural signalling and subsequently in the SNNs, allowing for easier discrimination of neurophysiological activity.
Thus, the earlier detection of dementia-related diseases may be achieved through the application of ML methods in cognitively demanding memory tasks as opposed to simple cognitive tasks, although this would need to be confirmed, for example, within a longitudinal design. PM tasks may be particularly suitable for discerning neurocognitive differences in those with MCI, as PM difficulty is one of the first reported cognitive complaints of older adults who go on to develop MCI and dementia [72].
The current study demonstrates that 3D SNN model interpretability can be enhanced by propagating ERP data (as spike-trains) from a specific group and pruning network connections that did not change. The removal of these connections produced sparse SNN models which better reflect patterns of learnt activity. Therefore, these sparse networks offer a novel method to interpret models of task-based neurophysiological activity to gain a deeper understanding of connectivity through spatial and temporal dimensions. Additionally, pruning non-important connections enabled the connection weights to follow distributions that allow for statistical analyses to be performed on the models.
Analyses demonstrated the importance of anterior clusters for all participant groups in response to perceptual PM stimuli. Both younger and older adults had significantly greater levels of wired regions across the cortex relative to older adults with MCI, particularly in frontal and frontocentral clusters. The aPFC plays an important role in the balance of attention between the ongoing task and the internal representation of PM stimuli [73]. fMRI evidence implicates functional deterioration in attention networks (dorsal and ventral) in older adults with amnestic MCI and Alzheimer's disease [74-76]. The decrease of frontocortical connectivity in older adults with MCI possibly represents a deterioration in the functional networks responsible for balancing attention between working memory and PM intention maintenance.
Contrary to the perceptual PM models, older adults with MCI showed greater cluster-to-cluster wiring across the cortex than younger and older adults in the conceptual PM models. This concurs with evidence linking hyperconnectivity with faster memory decline [77]. Some researchers report decreased functional connectivity in those with MCI and early Alzheimer's disease [78-81], while others report increases in cortical connectivity [82-86]. In a comparison of resting and task-based connectivity, Jiang and Zheng [87] demonstrated that while functional connectivity in an MCI group decreased relative to healthy controls at rest, in the working memory task the MCI group exhibited greater inter- and intra-hemispheric connectivity. Jiang and Zheng suggest that levels of cortical connectivity increase as a means of supporting atrophied cortical regions during memory tasks. Although the current study did not compare the results with resting-state activity, similar compensatory mechanisms may be utilised throughout the cortex in the MCI group during the more difficult conceptual PM task relative to the highly salient perceptual PM task and the relatively simple working memory task. Indeed, evidence shows greater inter-hemispheric EEG connectivity in those with MCI when memory task demands increase [88], which was not found in healthy controls. Increased functional connectivity between the parahippocampus and the middle frontal gyrus is associated with decreased episodic memory performance in those with MCI [86]. Similarly, frequent rewiring is seen in individuals with ASD as they adapt to increased cognitive demands [27]. Taken together with the current findings, decreases in connectivity may be apparent in resting-state and simple memory tasks in those with MCI. Potentially, however, as the difficulty of a task increases, so might the connectivity for individuals with MCI, possibly reflecting compensatory mechanisms.
The current study used a 50/50 positive-to-negative connection ratio to initialise the SNN models. While this provided a better rate of learning within the current study than the more commonly used 80/20 ratio, other configurations were not tried. Potentially, there are better positive-to-negative ratios for modelling ERP data in an SNN architecture. Therefore, future researchers are encouraged to experiment with different initial connection weight ratios during the optimisation stage in SNNs.
The proposed approach has shown an excellent ability to discriminate functional brain connectivity networks for young adults, older adults and older adults with MCI. As such, this SNN method may be further utilised to measure the effect of treatments applied over time on PM ability in older adults with MCI. In addition, turning the identified spatiotemporal patterns of functional connectivity into linguistic knowledge (rules), as suggested in [21], can help to better understand PM decline and MCI.
The current study proposes the utility of comparing cortical connectivity in those with MCI between simple cognitive tasks and more cognitively demanding tasks. Future research should explore other cognitive domains that can be varied in their difficulty to confirm the differences in hypo- and hyperconnectivity in MCI. It would be expected that as cognitive decline progresses, the more challenging cognitive tasks will also be associated with decreases of functional connectivity in line with the declines found in AD [89].

Conclusion
The current study shows that the spatiotemporal connectivity in working memory and PM tasks can be modelled and visualised using SNNs to gain an increased understanding of the effects of ageing and cognitive decline. SNNs classified brain activity related to working memory and PM more accurately than conventional ML methods. Moreover, STBD in response to PM stimuli provides better classification accuracy than working memory in SNNs. Visualisation of the task-based memory activity can be improved through pruning of inactive ANs and neural connections in the SNN models. Analyses of the SNN models revealed different spatiotemporal connectivity between the groups at a local and global level. In general, local cluster connectivity was greater for younger adults, but older adults had increased connectivity in frontal clusters in the perceptual PM models. The older adults with MCI had decreased global connectivity relative to healthy older adults and younger adults in the working memory and perceptual PM models, but in the conceptual PM models, older adults with MCI had increased cortical connectivity, potentially reflecting compensatory mechanisms during a more cognitively demanding PM task.
Electrode locations are given in relation to the 10-5 system