Introduction

Telephone-based cognitive screening (TBCS) is crucial to both clinical and experimental telemedicine. Indeed, besides allowing clinicians to deliver first-level assessments to individuals with poor access to in-person healthcare services [1, 2], TBCS eases the implementation of large-scale epidemiological studies [3] and prevention campaigns [4]. In this regard, the ongoing COVID-19 pandemic has accelerated the recourse to TBCS tools, increasing the need for standardized instruments for the remote assessment of subjective and objective cognitive complaints/failures in the general and elderly populations during restrictions and lockdowns [5,6,7].

Moreover, when compared to videoconference-based approaches, TBCS makes it easier to reach underserved or elderly populations, since it requires minimal technological support and expertise [8, 9].

Among TBCS instruments, the Telephone Interview for Cognitive Status (TICS) [10] is one of the most widespread and statistically sound tests, assessing both instrumental domains (orientation, language, and memory) and attentive/executive functions [11]. Its applicability in research settings and clinical usability in different neurological populations have been extensively demonstrated [8, 12, 13].

Several versions of the TICS have been developed worldwide, differing, for instance, in the presence or absence of a delayed recall task [14], although little consensus has been reached on its optimal format [13, 15].

In Italy, the original TICS version [10] has been shown to be administrable face-to-face [16]; as for its remote use, an attempt at its adaptation, with promising albeit preliminary evidence of its psychometric and diagnostic goodness, dates back to 2006 [17]. However, a comprehensive, up-to-date standardization study for the Italian TICS has not yet been provided.

Given these premises, the present study aimed to (1) develop a culture- and language-specific Italian version of the TICS and to assess its (2) psychometric and (3) diagnostic properties.

Methods

Participants

Three hundred and sixty-five participants from different Italian regions were recruited (see Tables 1 and 2). Demographic and occupational data were collected (see Table 1). Occupational status was coded as white- vs. blue-collar based on the type of work mostly carried out over the individual's lifespan (i.e., primarily clerical vs. manual job activities). Exclusion criteria were as follows: (1) having received a clinical diagnosis of neurological or psychiatric diseases; (2) severe internal-medical conditions and organ/system failures; (3) non-compensated metabolic disorders; and (4) uncorrected hearing deficits. Participants were recruited between 2020 and 2021; some of them were personal acquaintances of researchers from the University of Milano-Bicocca, others were recruited via word-of-mouth advertising. The study was approved by the Ethics Committee of the University of Milano-Bicocca. Participants provided their informed consent to participation in the study.

Table 1 Sample stratification for age, education, and sex
Table 2 Demographic and cognitive data

Materials

A back-translated version of the original English TICS developed by Brandt et al. [10] was adopted. Culture-specific items were adapted according to Ferrucci et al.’s [16] guidelines. No disagreements on linguistic aspects emerged among the authors, while minor discrepancies on cultural adjustments were resolved through discussion.

The original TICS score [10] ranges from 1 to 41 and comprises 11 items assessing orientation (personal, temporal, and spatial; score range: 0–12), attention and executive functioning (backward counting, backward calculation, abstraction; range: 0–9), language (naming to description, sentence repetition, and oral comprehension; range: 1–8), and memory (immediate recall, semantic memory; range: 0–12). An off-label delayed recall subtest (DR) of the 10-word list was additionally administered to N = 152 participants as the last task. The total TICS score comprising DR thus ranges 1–51.

Participants were also administered the Italian telephone-based Mini-Mental State Examination (Itel-MMSE) [18], a TBCS test whose validity and reliability have been previously demonstrated [19].

Procedures

Call quality was tested via an in-depth sound-check from both the examiner's and the examinee's standpoints (see Supplementary Material 1). The examinee was preliminarily introduced to the actions required to execute tasks during the assessment. An informant was required to (1) ensure the absence of facilitations within the setting and (2) confirm the address information provided by the examinee (as needed to test spatial orientation). Two raters independently scored N = 57 protocols to test inter-rater agreement. Seventy-seven participants were followed up after 30 days to assess test–retest consistency.

Statistics

Analyses were performed via R 4.1.0 [20], SPSS 27 [21], and Stata 16 [22].

The minimum sample size was estimated at N = 318 based on a correlational model with a small-to-medium effect size (ρ = 0.2; 1 – β = 0.95; two-tailed α = 0.05) via the R package pwr [23].
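The sample-size estimate above can be approximated by hand. The following is a minimal sketch in Python, assuming the textbook Fisher z approximation rather than the exact routine implemented in pwr; the function name `n_for_correlation` is hypothetical. Because the approximation differs from the one used by the R package, the result is expected to fall close to, though not necessarily exactly on, the reported N = 318.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(rho: float, alpha: float = 0.05, power: float = 0.95) -> int:
    """Approximate N needed to detect a correlation rho with a two-tailed test.

    Assumption: Fisher z approximation, N = ((z_alpha + z_beta) / atanh(rho))^2 + 3.
    This is NOT the exact algorithm of the R package pwr, so the figure
    may differ from its output by a few units.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile matching 1 - beta
    return ceil(((z_alpha + z_beta) / atanh(rho)) ** 2 + 3)


# Parameters taken from the text: rho = 0.2, power = 0.95, alpha = 0.05.
n_required = n_for_correlation(0.2)
print(n_required)
```

With either figure, the recruited sample of 365 participants exceeds the estimated minimum.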

Skewness and kurtosis values were taken to indicate non-normality if ≥ |1| and ≥ |3|, respectively [24]. As cognitive measures proved not to be normally distributed, associations of interest were tested through non-parametric techniques. More specifically, the relation between cognitive measures, as well as that between cognitive measures and age and education, was tested through Spearman’s coefficient. Likewise, sex differences in TICS measures were tested via Mann–Whitney tests. Bonferroni corrections for multiple comparisons were applied where appropriate.
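For readers less familiar with these techniques, the sketch below illustrates Spearman's coefficient (the Pearson correlation of average ranks) and a Bonferroni-adjusted significance threshold. It is an independent Python illustration, not the authors' analysis code; the helper names and the toy data are hypothetical. As a side note, dividing α = 0.05 by six or by four comparisons gives roughly 0.008 and 0.0125, which seems compatible with the adjusted thresholds reported in the Results, although the number of comparisons is an inference and is not stated in the text.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical toy data (NOT from the study): age vs. a cognitive score.
age = [25, 40, 40, 58, 71, 83]
score = [39, 37, 36, 36, 33, 30]
rs = spearman(age, score)
print(round(rs, 2), bonferroni_alpha(0.05, 6))
```

A negative `rs` on the toy data mirrors the direction of the age effect described in the Results, but the magnitude is purely illustrative.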

Factorial structure was investigated through principal component analyses. Internal consistency was tested via Cronbach’s α, computed on dichotomous items via the R package ltm [25]. Test–retest and inter-rater reliability were assessed via intra-class correlations.
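As an illustration of the internal-consistency index mentioned above, the following Python sketch computes Cronbach's α from a respondents-by-items matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). It is not the ltm implementation; the function name and the toy responses are hypothetical. For dichotomous (0/1) items, this quantity coincides with the Kuder–Richardson 20 coefficient.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for item scores.

    responses: list of rows, one per respondent; each row holds k item
    scores (0/1 for dichotomous items).
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (NOT study data): 5 respondents x 4 dichotomous items.
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))
```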

Item difficulty and discrimination were examined by means of an item response theory two-parameter logistic model [26] run via the R package mirt [27]. Difficulty values were regarded as canonical if ranging from −4 to +4 [28, 29]. With regard to discrimination, items could be classified as “discriminative” (≥ 1.5) or “highly discriminative” (≥ 1.7) [29].
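To make the two item parameters concrete, the sketch below writes out the two-parameter logistic item response function, P(θ) = 1 / (1 + exp(−a(θ − b))), where a is discrimination and b is difficulty, together with the corresponding item information a²P(1 − P). This is a minimal Python illustration under the standard logistic parameterization, not the estimation procedure run in mirt; all function names are hypothetical, and the classification helper simply encodes the cut-offs stated in the text.

```python
from math import exp


def p_correct(theta, a, b):
    """2PL probability of a correct response for a person with ability theta.

    a: item discrimination (slope); b: item difficulty (location).
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def classify_discrimination(a):
    """Label an item using the cut-offs stated in the text (1.5 and 1.7)."""
    if a >= 1.7:
        return "highly discriminative"
    if a >= 1.5:
        return "discriminative"
    return "below the stated cut-offs"


# Hypothetical item (NOT an estimate from the study): a = 1.8, b = 1.0.
for theta in (-1.0, 0.0, 1.0, 2.0):
    print(theta, round(p_correct(theta, 1.8, 1.0), 3))
```

When ability equals difficulty (θ = b), the probability of a correct response is 0.5 and the item is most informative; a larger a makes the curve steeper around that point, which is why highly discriminative items separate respondents more sharply.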

Receiver-operating characteristics analyses were run to test diagnostic accuracy. A performance below vs. above the 5th percentile on the Itel-MMSE was used as a proxy gold standard.
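The logic of this analysis can be sketched as follows. The area under the ROC curve (AUC) equals the probability that a randomly chosen case classified as impaired by the reference standard obtains a lower screening score than a randomly chosen unimpaired case, with ties counted as one half. The Python code below is a minimal illustration of that definition, assuming that lower scores indicate worse performance; the function names, scores, and cut-off are hypothetical and do not reproduce the study's data or software.

```python
def auc_low_scores_positive(scores_impaired, scores_unimpaired):
    """AUC when LOWER scores indicate impairment.

    Computed as the share of (impaired, unimpaired) pairs in which the
    impaired case scores lower; ties count one half. This equals the
    Mann-Whitney U statistic divided by the number of pairs.
    """
    wins = 0.0
    for s_pos in scores_impaired:
        for s_neg in scores_unimpaired:
            if s_pos < s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(scores_impaired) * len(scores_unimpaired))


def sensitivity_specificity(scores_impaired, scores_unimpaired, cutoff):
    """Treat a score <= cutoff as 'screen positive'."""
    true_pos = sum(s <= cutoff for s in scores_impaired)
    true_neg = sum(s > cutoff for s in scores_unimpaired)
    return true_pos / len(scores_impaired), true_neg / len(scores_unimpaired)


# Hypothetical screening scores (NOT study data).
impaired = [28, 30, 31, 33]        # below the reference-standard threshold
unimpaired = [31, 34, 35, 36, 38]  # above the reference-standard threshold
auc = auc_low_scores_positive(impaired, unimpaired)
sens, spec = sensitivity_specificity(impaired, unimpaired, cutoff=31)
print(auc, sens, spec)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.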

Results

Mean age of participants was 53.16 ± 16.03 years (range: 18–89 years), whereas mean education was 13.01 ± 4.46 years (range: 0–26 years); 147 participants were males, 218 were females. The majority of participants came from Northern Italy (N = 271), whereas 11 were from Central and 83 from Southern Italy. One hundred and fifty-nine participants were classified as white-collar and 206 as blue-collar. Cognitive scores are also summarized in Table 2.

All TICS measures, except for orientation sub-scores, were inversely related to age (−0.36 ≤ rs-365 ≤ −0.12; p ≤ 0.05) and positively associated with education (0.13 ≤ rs-365 ≤ 0.40; p ≤ 0.05); no sex differences were found (p ≥ 0.24).

Total scores on the TICS and Itel-MMSE proved to significantly converge (rs-365 = 0.37; p < 0.001). Itel-MMSE scores were associated with all TICS sub-scores, but the strongest correlation was found with the orientation subtest (see Table 3). All TICS sub-scores were internally related at αadjusted = 0.008 (0.18 ≤ rs-365 ≤ 0.28; p ≤ 0.001) except for the orientation subscale, which was associated with attentive/executive sub-scores only (p < 0.001). TICS total scores correlated with all of its sub-scores at αadjusted = 0.013 (orientation: rs = 0.31; attention/executive functioning: rs = 0.59; memory: rs = 0.86; language: rs = 0.39; all Ns = 365 and ps < 0.001). Consistently, DR items proved to be associated with the memory subtest (rs-152 = 0.73; p < 0.001) and the total score (rs-152 = 0.61; p < 0.001).

Table 3 Spearman’s coefficients between the Itel-MMSE and TICS scores

A clear mono-component structure was detected (here termed “global cognition/cognitive efficiency”) that explained 18.84% of variance, with moderate-to-high loadings (0.31 ≤ r ≤ 0.68), except for repetition, personal, and temporal orientation items (r < 0.3). Inter-rater reliability was excellent (ICC = 0.94), test–retest reliability was good (ICC = 0.78), and internal consistency was acceptable (Cronbach’s α = 0.63).

A summary of item difficulty and discrimination values is reported in Table 4. Overall, TICS items showed moderate-to-high difficulty, with the backward subtraction task yielding the highest difficulty. Backward subtraction items also proved to be the most discriminative.

Table 4 Item difficulty and discrimination for the TICS

The TICS proved to be highly accurate in discriminating between those performing below vs. above the 5th percentile of the Itel-MMSE (see Fig. 1); similar findings were obtained when comparing the TICS with vs. without DR (see Fig. 2), with the latter being slightly more accurate than the former (χ2(1) = 3.84; p = 0.050).

Fig. 1

Receiver-operating characteristics (ROC) curve for the TICS against the Itel-MMSE. The reference measures were a performance above vs. below the 5th percentile on the Itel-MMSE. AUC = .83, SE = .03, 95% CI [.77, .89]

Fig. 2

ROC curves for the TICS with vs. without DR subtest against the Itel-MMSE. The reference measures were a performance above vs. below the 5th percentile on the Itel-MMSE. TICS without DR: AUC = .79, SE = .05, 95% CI [.69, .89]; TICS with DR: AUC = .73, SE = .06, 95% CI [.61, .85]

Discussion

The present work provides Italian clinicians/researchers with updated evidence supporting the validity, reliability, and diagnostic soundness of a back-translated and culturally adapted Italian version of the TICS. Its adoption is indicated for epidemiological studies [3] and clinical trials [30], as well as for telemedicine practice, enabling easier longitudinal studies, a greater reach of underserved [31] or home-locked-down [32] populations, and multi-stage prevention campaigns [4]. Furthermore, since it relies minimally on physical supports, the TICS might be useful for both bedridden and visually impaired patients [16], as well as for administration in infectious environments [33].

This study contributes to the literature on first-level (i.e., cognitive screening) TBCS tools [13], whose utilization is expected to increase with the continuous improvement of telehealth care services [34]. Moreover, the robust statistical framework of the present work aligns with the recently underlined need for greater psychometric rigor when developing/standardizing TBCS instruments [35, 36].

Consistent with previous studies [37], the Italian TICS proved to be a valid measure of general cognitive abilities, thus endorsing its use as a neuropsychological screening test. In this respect, as the highest contribution to its total score was provided by memory items and the orientation subtest yielded the highest correlation with the Itel-MMSE (≈40% of whose items assess orientation) [19], the TICS confirms its potential for Alzheimer’s spectrum disorders [10, 17]. However, it should be noted that the off-label DR item did not increase the diagnostic accuracy of the TICS, in line with previous evidence [38]. The present findings thus quantitatively support the adoption of the original TICS format, although the inclusion of the DR task would provide further relevant semiotic information.

Moreover, its excellent inter-rater reliability ensures that, in spite of the remote administration modality, the TICS is minimally dependent on examiners’ subjectivity, thus being suitable for use by several practitioners of similar backgrounds. This last finding is of key importance, as the reliability of TBCS tools has been questioned [36].

Item-level features have been shown to be highly relevant to the interpretation of TBCS scores [35]. Accordingly, those reported here should lead practitioners to pay particular attention to backward subtraction items, as these proved to be the most informative.

There are some limitations that need to be considered. First, in the present sample, participants aged ≥ 86 years are under-represented: future investigations should therefore focus on the feasibility of the TICS in the very old population, for which complete normative data are often lacking. Second, regional representativeness appears to be moderately biased toward Northern Italy; in this respect it is worth mentioning that the TICS has been shown to be feasible also in Southern Italian individuals in the context of an epidemiological study [39]. Moreover, since the recruitment of participants for the present study occurred during the COVID-19 pandemic, this situation could have, at least to some extent, influenced the present test scores, due to subjective/objective cognitive difficulties observed even in the general, healthy population during COVID-19-related emergencies [5,6,7]. However, it should also be noted that, in the present sample, participants had no history of psychiatric/neurological illnesses, hence any potential influence of the pandemic on the findings reported here is likely negligible.

There is also an intrinsic limitation of the TICS that should be acknowledged, namely, the fact that it requires sufficiently intact hearing for its results to be validly interpreted. Therefore, the individual hearing status should be thoroughly examined before TICS administration, especially in older adults, given the high incidence of audiological decline in the elderly [40].

Finally, it should be borne in mind that further investigations are needed on the clinical usability of the Italian TICS in different neurological populations. Indeed, until now, only one study tested its usability in patients with Alzheimer’s disease [17].