The Geras Solutions Cognitive Test for Assessing Cognitive Impairment: Normative Data from a Population-Based Cohort

BACKGROUND: There is a need for the development of accurate, accessible and efficient screening instruments, focused on early-stage detection of neurocognitive disorders. The Geras Solutions cognitive test (GSCT) has shown potential as a digital screening tool for cognitive impairment but normative data are needed. OBJECTIVE: The aim of this study was to obtain normative data for the GSCT in cognitively healthy patients, investigate the effects of gender and education on test scores as well as examine test-retest reliability. METHODS: The population in this study consisted of 144 cognitively healthy subjects (MMSE > 26), all 70 years of age, who had previously been included in the Healthy Aging Initiative Study conducted in Umeå, Sweden. All patients completed the GSCT and a subset of patients (n = 32) completed the test twice in order to establish test-retest reliability. RESULTS: The mean GSCT score was 46.0 (±4.5) points. High level of education (>12 years) was associated with a high GSCT score (p = 0.02) while gender was not associated with GSCT outcomes (p = 0.5). GSCT displayed a high correlation between test and retest (r(30) = 0.8, p < 0.01). CONCLUSION: This study provides valuable information regarding normative test scores on the GSCT for cognitively healthy individuals and indicates education level as the most important predictor of test outcome. Additionally, the GSCT appears to display good test-retest reliability, further strengthening the validity of the test.


Introduction
Due to the irreversible nature of major neurocognitive disorders (MND), early diagnosis for those affected is critical for timely intervention. Modern diagnostic tools, such as various imaging modalities and cerebrospinal fluid biomarkers have improved our diagnostic accuracy substantially (1,2). These tools are however often expensive or time-consuming and are consequently not well-suited for screening large samples of at-risk populations, potentially leaving underserved communities without an opportunity to receive treatment or other interventions (3). Therefore, it is critical to develop a cost-effective and easy-to-use screening tool for MND that can detect early signs of disease before significant cognitive deterioration has ensued, a stage known as mild neurocognitive disorder (4).
Current primary screening methods for neurocognitive disorders still largely depend on analogue "pen and paper" based tests administered to patients by health care providers (5). The best-known and most widely used cognitive tests include the Montreal Cognitive Assessment (MoCA) and the Mini-Mental State Examination (MMSE) (6,7). We have recently reported a proof-of-concept study indicating that the Geras Solutions Cognitive Test (GSCT), a self-administered digital screening tool for cognitive impairment, is a promising option for potential large-scale screening in the setting of cognitive deterioration. The GSCT has been tested on patients with suspected cognitive deterioration and has shown discriminative properties as good as those of the MoCA test in identifying patients with MND and mild neurocognitive disorder (8).
Computerized cognitive tasks offer several advantages over traditional paper-and-pencil assessments that are particularly valuable in repeated testing contexts, including standardized administration, ease of scoring and administration, and ease of generating alternate forms of tasks (9-12). Moreover, the ability to accurately measure reaction times (RTs) makes computerized testing particularly useful for detecting subtle changes in cognitive function (11,13). To further increase the validity of the GSCT, there is a need to obtain normative data and evaluate the effects of education, age and gender on the test scores. This is of importance because it is known that factors such as education, age and sometimes gender affect the scores on cognitive tests including MoCA (14).
Repeated cognitive testing is frequently used to assess changes in cognitive function. Among older adults, repeated testing may be used to detect the onset of neurological disease, to monitor disease progression, to evaluate the effectiveness of interventions designed to slow or prevent cognitive decline, and to assess cognitive side effects of pharmaceuticals intended to target noncognitive functions (11,15,16). However, repeated use of cognitive tasks can lead to performance shifts. Aside from showing minimal performance shifts, tasks employed for repeated assessment must also produce scores that show good test-retest reliability (TRR), i.e. the ability of a test to replicate the relative order between subjects on a test when administered twice (17,18). In contrast to the well-documented retest effects for scores on paper-and-pencil tasks, the literature on repeated computerized cognitive testing among older adults is relatively sparse. For physicians using GSCT, a reliable TRR over time can ensure that detection of cognitive impairment is due to the condition and not to poor reliability of the test itself.
The purpose of the study was to obtain normative data, examine effects of gender and education on test scores, and to determine the TRR of the GSCT.

Patients
This study was a normative trial conducted in Umeå, Sweden, from March 2021 to February 2022. This study involved the analysis of data collected during the Healthy Aging Initiative Study (HAI), which is currently in progress at Umeå University, Sweden, and has been previously described (19). The aim of the HAI is to examine traditional and potentially novel risk factors for MND, cardiovascular disease, and injurious falls in 70-year-old men and women. As the study is population-based, the only inclusion criteria are as follows: (1) age of exactly 70 years at the time of contact and (2) current residence in the municipality of Umeå, Sweden. Contact information was drawn from population registers, and eligible individuals initially received written information about the research project. Follow-up telephone contact was made approximately 7 days later, during which individuals also received verbal information. Those who agreed to participate received the instructions and were scheduled for testing 2-3 weeks later.
The sample selected for this study comprised 147 individuals who underwent HAI testing. Inclusion and exclusion criteria for sample selection in this specific study were as follows. Inclusion criteria: cognitively healthy (MMSE > 26), fluent in Swedish, 70 years of age, provided written informed consent. Exclusion criteria: participation in a cognitive study within the last 3 months, diagnosis and/or symptoms of depression, serious somatic disease, any disease or events affecting the central nervous system, cerebrovascular disease, current medication with psychoactive drugs, burnout or stress-related disorder, fever, anxiety. A subset of 36 participants was asked to perform the GSCT on two separate occasions two weeks apart to establish TRR. The first self-administration of GSCT was done at the clinic. Of the 32 subjects included in the TRR analysis, 13 performed the retest at home while 19 performed it at the clinic.

Geras Solutions cognitive test
The GSCT is a newly developed digital self-administered screening tool for cognitive impairment and is included in the Geras Solutions APP (GSA). GSCT is developed on existing cognitive assessment methods (MoCA and MMSE) and includes additional proprietary tests developed at the memory clinic, Karolinska University Hospital Stockholm, Sweden. The test is suitable for digital administration through mobile devices supporting iOS and Android (8).
The test is composed of 16 different items assessing different domains of cognition, developed in order to screen for cognitive deterioration in the setting of MND and mild neurocognitive disorders (previously mild cognitive impairment). The GSCT is scored from 0 to 59 points in total and has six main subdomains: memory (0-10 points), visuospatial abilities (0-11 points), executive functions (0-13 points), working memory (0-19 points), language (0-1 point) and orientation (0-5 points).
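As an illustration of the scoring structure described above, the following minimal sketch checks that the six subdomain maxima add up to the 59-point total and sums a set of subdomain scores. The dictionary keys, the function name `gsct_total` and the example scores are hypothetical and are not taken from the GSCT software.

```python
# Maximum points per subdomain, as listed in the text above.
# The key names are illustrative assumptions, not identifiers from the app.
GSCT_SUBDOMAIN_MAX = {
    "memory": 10,
    "visuospatial": 11,
    "executive": 13,
    "working_memory": 19,
    "language": 1,
    "orientation": 5,
}


def gsct_total(subscores):
    """Validate per-subdomain scores and return the total score (0-59)."""
    if set(subscores) != set(GSCT_SUBDOMAIN_MAX):
        raise ValueError("expected exactly the six GSCT subdomains")
    for name, score in subscores.items():
        if not 0 <= score <= GSCT_SUBDOMAIN_MAX[name]:
            raise ValueError(f"{name} score {score} is out of range")
    return sum(subscores.values())


# The subdomain maxima are consistent with the stated 0-59 range.
GSCT_MAX_TOTAL = sum(GSCT_SUBDOMAIN_MAX.values())

# Hypothetical example scores (not study data).
example_total = gsct_total({
    "memory": 8,
    "visuospatial": 9,
    "executive": 10,
    "working_memory": 14,
    "language": 1,
    "orientation": 5,
})
```

With these hypothetical subscores the total is 47 points; out-of-range or missing subdomains raise an error rather than being silently accepted.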

Statistics
All statistical analyses were done using Statistica software (version 13). Baseline descriptive characteristics were calculated and are provided in Table 1. Variables were tested for normality and parametric tests were used in the analysis. Differences between groups were calculated using independent t-tests. A general regression model was used to determine the predictive effects of normative variables (gender and education) on GSCT score. TRR was assessed using Pearson correlation, as is commonly done in similar research papers (20), and mean differences between test and retest were examined using a dependent t-test. Of all included patients (n = 147), three subjects were excluded from the final analysis due to lack of complete test results, thus leaving 144 patients for analysis. Of the 36 patients asked to perform the test twice, 32 were included in the analysis of TRR. The four excluded patients displayed large discrepancies between test results of more than 1.5 SD. This was due to clear test irregularities in these patients, for example starting the test multiple times.
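The two test-retest statistics named above can be sketched as follows. This is a minimal standard-library illustration of a Pearson correlation and a dependent (paired) t statistic, not the Statistica procedure used in the study; the function names and the five pairs of scores are hypothetical.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation between paired observations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Return (t, df) for a dependent-samples t-test on the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical scores for five subjects (NOT study data).
first_test = [44, 47, 50, 41, 46]
retest = [46, 50, 52, 44, 49]

r = pearson_r(first_test, retest)
t, df = paired_t(first_test, retest)
```

Because the differences are taken as first test minus retest, an improvement at retest yields a negative t value. The Pearson coefficient captures whether subjects keep their relative order, while the paired t-test captures a systematic shift in mean score, so the two can diverge: a test can have high TRR and still show a practice effect.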

Power analysis
We aimed for 80% power at a significance level of 0.05 for the regression analysis of normative data including two predictor variables (gender and education level). Using Cohen's effect size f2 for regression analysis, a value of 0.15 would represent a medium effect size. The sample size needed to detect this difference would be 69 patients. Previously established sample size recommendations for regression analysis suggest N > 104 + m, where m is the number of predictors, assuming a medium effect size (21). We therefore aimed at a total study sample of 150 to account for dropouts, thus also increasing power to detect smaller differences.
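The quantities in this paragraph can be related with a few lines of arithmetic. The sketch below converts between Cohen's f2 and R2 using the standard definition f2 = R2 / (1 - R2) and evaluates the N > 104 + m rule of thumb for m = 2 predictors. The function names are hypothetical, and the sketch does not reproduce the power calculation that gave 69 patients, which requires the noncentral F distribution.

```python
def cohens_f2(r_squared):
    """Cohen's f2 for a regression model with the given R2."""
    return r_squared / (1.0 - r_squared)


def r_squared_from_f2(f2):
    """Inverse of cohens_f2: the R2 that corresponds to a given f2."""
    return f2 / (1.0 + f2)


def min_n_rule_of_thumb(m):
    """Smallest integer N satisfying N > 104 + m for m predictors."""
    return 104 + m + 1


M_PREDICTORS = 2  # gender and education level

medium_r2 = r_squared_from_f2(0.15)
min_n = min_n_rule_of_thumb(M_PREDICTORS)

print(round(medium_r2, 3))  # 0.13
print(min_n)                # 107
```

Under these definitions a medium effect (f2 = 0.15) corresponds to a model explaining about 13% of the variance, and the rule of thumb asks for at least 107 subjects, which the target sample of 150 exceeds.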

Ethics
Ethical approval was obtained from the Regional Research Ethical Review Board of Umeå University, Sweden (no. 07-031M). All participants provided written informed consent to participate and were made aware of their possibility to terminate their participation at any time. The study was conducted in accordance with the World Medical Association's Declaration of Helsinki.

Results
Descriptive statistics are provided in Table 1. A total of 144 patients completed the study and the mean GSCT score for all participants was 46 (±4.5) points. No significant differences in GSCT total score were observed between males (M = 46.3, SD = 4.5) and females (M = 45.7, SD = 4.4), t(142) = -0.77, p = 0.44. When analysing differences in test scores depending on level of education, individuals with more than 12 years of education (M = 47.0, SD = 4.5) had significantly higher test scores, t(142) = 3.48, p = 0.001, as compared to subjects with less than 12 years of education (M = 44.4, SD = 4.1). The overall regression model, including gender, level of education and their interaction, was statistically significant (R2 = 0.09, F(3, 140) = 4.49, p < 0.01), explaining 9% of the variance. High level of education was a significant predictor of GSCT test score (β = 2.7, p = 0.02) whereas gender was not (β = -0.8, p = 0.5). No significant interaction was seen between gender and education (p = 0.9).
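The reported R2 and F statistic for the overall regression model can be cross-checked against each other with the standard relation F = (R2 / k) / ((1 - R2) / (n - k - 1)), where k is the number of model terms and n the sample size. The sketch below is such a consistency check using only the rounded values printed in the text; the function names are hypothetical.

```python
def f_from_r2(r2, k, n):
    """Overall F statistic and its degrees of freedom for a regression model."""
    df1, df2 = k, n - k - 1
    return (r2 / df1) / ((1.0 - r2) / df2), df1, df2


def r2_from_f(f, df1, df2):
    """Invert the relation above to recover R2 from F."""
    return f * df1 / (f * df1 + df2)


# Three model terms (gender, education, interaction) and 144 subjects.
f_stat, df1, df2 = f_from_r2(0.09, k=3, n=144)
r2_back = r2_from_f(4.49, 3, 140)

print(df1, df2)            # 3 140
print(round(f_stat, 2))    # 4.62
print(round(r2_back, 3))   # 0.088
```

The degrees of freedom match the reported F(3, 140). An R2 of exactly 0.09 gives F of about 4.62, while the reported F of 4.49 corresponds to an R2 of about 0.088, which rounds to the reported 0.09, so the two values agree within rounding.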
The test-retest correlation for the whole retest sample was high (r(30) = 0.8, p < 0.01). A sub-analysis examined TRR depending on whether the second test administration was conducted at home or at the clinic. Patients conducting the retest at home showed a high correlation between test and retest (r(11) = 0.85, p < 0.01), as did patients doing the retest at the clinic (r(17) = 0.77, p < 0.01).
Practice effects were examined, showing a significant improvement between test administrations (t(31) = -6.0, p < 0.01) with a mean difference of 2.5 points between tests.
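For readers who wish to gauge the size of the practice effect, the reported paired t statistic and mean difference imply an approximate standard deviation of the individual differences and a standardized effect size, using t = mean difference / (SD of differences / sqrt(n)) and dz = t / sqrt(n). The sketch below is a back-calculation from the rounded values above, not a result reported by the study; the function names are hypothetical.

```python
import math


def implied_sd_of_differences(mean_diff, t_stat, n):
    """SD of paired differences implied by a paired t statistic."""
    return abs(mean_diff) * math.sqrt(n) / abs(t_stat)


def cohens_dz(t_stat, n):
    """Standardized mean difference for paired data (Cohen's dz)."""
    return abs(t_stat) / math.sqrt(n)


N_RETEST = 32  # t(31) implies 32 paired observations

sd_diff = implied_sd_of_differences(2.5, -6.0, N_RETEST)
dz = cohens_dz(-6.0, N_RETEST)

print(round(sd_diff, 2), round(dz, 2))  # 2.36 1.06
```

Because the inputs are rounded, these figures are approximate; they suggest that individual test-retest differences varied by roughly 2.4 points around the mean gain of 2.5 points.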

Discussion
In this study we present normative data for the GSCT as well as assessment of TRR in order to further validate this tool as a potential screening instrument for cognitive impairment. The GSCT has previously been tested on subjects with suspected cognitive deterioration who were later diagnosed with either MND, mild neurocognitive disorder or subjective cognitive impairment (8). In that study GSCT showed slightly better accuracy in correctly identifying mild neurocognitive disorder/mild cognitive impairment, with a sensitivity of 0.88 compared to 0.83 for MoCA, while both tests showed similar specificity of 0.55 and 0.54 respectively (8). Overall the initial study indicated that the GSCT performed at least as well as currently available screening tools for MND while simultaneously providing several advantages including the possibility of time-efficient large-scale testing.
In this study, no significant differences in total GSCT score were observed between males and females, suggesting that no inter-gender differences exist regarding test outcomes on a population level. Findings from other studies investigating the effect of gender on cognitive test performance are somewhat inconsistent. Some studies indicate an effect of gender, with females showing higher scores on MoCA (22), while other research indicates no effect on MoCA (23) or MMSE (24).
When analyzing differences in the test scores for level of education, we found that individuals with more than 12 years of education had significantly higher test scores, as compared to subjects with less than 12 years of education. This is in line with other studies indicating a significant effect of education on test outcomes (14,22). The mean difference between the groups in our study was 2.6 points. This implies that the level of education should be considered and adjusted for when assessing GSCT scores, which is the case with the traditional assessment of MoCA where subjects with less than 12 years of education receive an extra point (6).
These findings were further validated in the general regression model where we included both gender, level of education as well as their interaction to examine their impact on GSCT score. High level of education (>12 years) was a significant predictor of GSCT test score and associated with a 2.7 points increase in total score as compared to low level of education. Gender was not a significant predictor of test score and no significant interaction was seen between gender and education.
Age is known to correlate with cognitive test scores and is usually examined in normative studies (14). Due to the unique setup of the study, including a population-based cohort from the HAI study, all included patients were 70 years of age and thus analysis of the association between age and GSCT scores was not possible. We know from the previous publication that patients with subjective cognitive impairment, and thus no objective findings of cognitive impairment, had mean scores of 45 points and a mean age of 57 (8). Interestingly, recent research has indicated that the previously observed association between increased age and lower test scores may be due to underlying neuropathology rather than an effect of ageing itself; thus age could be a confounder for the association between neuropathology and test scores (25). Further studies are needed to elucidate the effect of age on GSCT scores.
TRR was also examined in this study. Thresholds for "good" reliability vary depending on the purpose of testing. For example, clinical decision-making typically requires higher reliability than research settings (26). In this paper intraclass correlation coefficients (ICCs) and Pearson's correlations greater than 0.7 are considered to indicate good TRR, values 0.5 to 0.7 to indicate moderate reliability, and values < 0.5 to indicate poor reliability (26). In our study a subset of patients were administered the test twice, with a two week interval between tests, with some subjects conducting the re-test at home while others completed it at the clinic. Overall there was a significant correlation between tests with a correlation coefficient of 0.8, thus indicating good TRR according to previously established criteria (26). Additionally, test-retest correlations remained highly significant independent of the secondary test location indicating that test setting does not affect GSCT outcome. Overall, the GSCT seems to generate reliable results between test administrations.
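The reliability thresholds above, together with the uncertainty around a correlation estimated from a small sample, can be sketched as follows. The classification follows the cut-offs stated in the text; the 95% confidence interval based on the Fisher z-transformation is an addition for illustration and was not reported in the study. The function names are hypothetical.

```python
import math


def classify_trr(r):
    """Classify a reliability coefficient using the thresholds in the text."""
    if r > 0.7:
        return "good"
    if r >= 0.5:
        return "moderate"
    return "poor"


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Reported coefficients: overall (n = 32), home (n = 13), clinic (n = 19).
labels = [classify_trr(r) for r in (0.8, 0.85, 0.77)]
ci_low, ci_high = fisher_ci(0.8, 32)
```

All three reported coefficients fall in the "good" band. Under the assumptions of the Fisher approximation, the interval for the overall coefficient runs from roughly 0.63 to 0.90, so the lower bound lies in the "moderate" band; this is worth keeping in mind when interpreting a correlation based on 32 subjects.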
Practice effects were observed, with subjects scoring significantly higher at retest with a mean difference of 2.5 points. This is in line with other research indicating that tests such as MoCA are prone to practice effects, particularly between first and second administrations, a fact that needs to be considered in the setting of high-frequency testing (20,27,28). Interestingly, a recent study has suggested that practice effects, or rather the lack of practice effects in individuals, could be a potential marker for future cognitive decline (29). Lack of practice effects during longitudinal follow-up has also been associated with increased cerebral burden of both amyloid and tau pathology as well as cognitive decline (30).

Conclusions
Our findings provide important normative data for the newly developed digital cognitive screening test, the GSCT. In this study we have established normal scores for cognitively healthy individuals and identified education level, but not gender, as a significant predictor of test outcome. Furthermore, we have shown that GSCT displays good test-retest reliability, thus strengthening the validity of the tool. Further studies are needed to determine the effect of age on GSCT scores.

Limitations
The lack of different age groups in this study is a clear limitation and future studies are needed in order to elucidate the effect of age on GSCT score. There may also be a selection bias in the study: participants accepting inclusion in the HAI study and subsequently in this substudy may differ from the average 70-year-old in Sweden, as indicated by the large proportion of individuals with a high level of education.
Disclosure statement: This research received funding from Geras Solutions. The study and data collection were conducted independently at Umeå University. Five of the authors are part-time employees of, or associated with, Geras Solutions, the company that created the digital test: J Ulfvarson, V Bloniecki, K Javanshiri, G Hagman and Y Frenud-Levi.
Funding note: Open Access funding provided by Karolinska Institute.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.