Introduction

Time orientation varies from individual to individual and is generally stable within individuals. Several studies have reported associations between time orientations (including the balance of multiple time-orientation dimensions) and various aspects of individuals’ characteristics, including thought processes such as planning for retirement (Mooney et al., 2017) and cognitive functions such as working memory (Witowska & Zajenkowski, 2021); for a review, see Stolarski et al. (2020).

Despite little evidence of the relationship between long-term memory functioning and time orientation, we can reasonably consider a possible bidirectional relationship between them. How to measure individuals’ time orientation depends on the purpose; examples include conducting a questionnaire survey (Zimbardo & Boyd, 1999; Webster, 2011) and sampling messages on social media (Park et al., 2017). For the following reasons, we expected that time orientation, particularly as observed in natural conversation among older adults, is relevant to their memory abilities.

Conversation and memory functioning

People who routinely talk to others about recently acquired information in their daily lives may have their memory abilities enhanced by this routine. This direction makes sense when we consider episodic memory processing and the nature of conversation, which can facilitate it. Linking newly acquired information to relevant past experiences and prior knowledge while explaining it to other people may facilitate elaborate encoding of the information before it is lost without being consolidated.

Memory retrieval is often classified into two primary processes: recollection (intentional retrieval of detailed context, including spatiotemporal aspects of an episode) and familiarity (a feeling of knowing without recalling the details) (Jacoby, 1991). The literature has reported that the former tends to be impaired in older adults, while the latter does not change much with age (Del Missier et al., 2015; Lustig & Lin, 2016). This is partially attributed to an age-related reduction in deep encoding, which makes retrieval more dependent on general familiarity and can thereby cause less distinctive or false representations of episodic memory (Grady & Craik, 2000). Repeating such less elaborate encoding in daily life may undermine the resolution and accuracy of information; however, specific contexts created with others through conversations can provide schematic support, which helps to overcome encoding deficits. This view is supported by recall tests for numbers and objects, in which age differences in recall performance were negligible when the items were placed in everyday contexts (Castel, 2005; Castel, 2007).

In addition, although age-related deficits in self-initiated effortful memory processing increase dependence on familiarity, thereby leading to failures of successful retrieval (Del Missier et al., 2015; Lustig & Lin, 2016), frequently receiving questions from others in conversations may be beneficial for memory plasticity because they may function as cues that demand effortful retrieval.

Several training strategies for memory enhancement have been proposed, and their effects have been reported to be of some magnitude (Verhaeghen et al., 1992). However, even if these training methods are effective for a specific task, their effects may not generalize to memory in everyday life. Therefore, a combination of multiple training methods and interventions in daily life (i.e., a multimodal approach) has been proposed (Lustig et al., 2009; Ranganath et al., 2011; Gross et al., 2012; Wenger & Shing, 2016). In the present context, the complexity of everyday conversations is promising for improving memory. We can presume that if people have more opportunities for encoding and retrieving recently acquired information, they train their memory more frequently and have a greater ability to memorize new information. This is also supported by a social epidemiological study showing that a higher proportion of family members in later-life social networks reduced the frequency of contact with friends and was associated with lower episodic memory (Sharifian et al., 2020), if we presume that friends are more likely to convey new information than family members.

Conversely, we can expect that those with a greater ability to memorize recently accessed information talk about more recent topics in natural conversations, all else being equal. In experimental settings, the ability to memorize recently accessed information has been measured by several neuropsychological tests, such as the logical memory subtests from the Wechsler Memory Scale-Revised (Wechsler, 1987). In this task, recall performance for the experimenter’s brief talk is regarded as a direct indicator of memory ability. However, little is known about how memory function is reflected in verbal characteristics, including time orientation, in natural conversations.

To examine the latter direction (we return to the former direction later), the present study used dictation data of natural group conversations among participants and investigated how time orientation in their utterances is associated with their memory ability, as estimated by a standard neuropsychological memory test.

Sentence classification

To characterize participants’ utterances, we divided one of the time categories widely used to classify time orientations in previous studies (Zimbardo & Boyd, 1999; Webster, 2011; Park et al., 2017; Demiray et al., 2018), namely the past, into recent and remote categories. There must be a temporal categorization that distinguishes between the periods before and after new information is integrated. The present study therefore introduces a more fine-grained coding scheme in which a “Recent” category is added to the existing time categories of past, present, and future. “Recent” is defined as the most recent month.

In classifying utterances, we did not rely solely on grammatical tenses, but rather, we focused on the time when the event mentioned by the speaker occurred. This is because our primary interest is in the participant’s ability to recall memories. Further, note that languages differ, for example, in how many tenses are distinguished and whether verbs or adverbs are used to express them (Fabricius-Hansen, 2006). Our classification, which emphasizes events instead of tenses, is expected to contribute to the robustness of our results to language differences.

In understanding the meaning of an utterance, its temporal range is sometimes the focus of attention. This is related to the concept of ‘aspect,’ which is expressed in English, for example, in the perfect and progressive forms. Nonetheless, the manner of expression and the degree of grammaticalization vary from language to language (Comrie, 1976; Sasse, 2006). For this reason, this study occasionally uses compound tags such as ‘Past-to-Present’ to classify the temporal period expressed by an utterance. In this case, too, an utterance is classified in terms of its content and context instead of formal grammar.

Additionally, we categorized each utterance based on whether the content was experience-based and/or knowledge-based, given that declarative memory can be divided into two types: episodic memory, which refers to the memory of personally experienced events with a specific source in time and space, and semantic memory, which is the memory of general facts without specific temporal and spatial contexts (Tulving, 1972).

We also employed the person of the subject (i.e., first or third person) for utterance classification to more clearly characterize the utterances describing experience and knowledge. Sentences in which the second person is the subject are regarded as means to learn more about other participants’ episodes and are outside the scope of this study’s analysis, except to examine their frequency. Further, we classified utterances by prioritizing contents instead of grammar. This would increase robustness to differences in how subjects are expressed in different languages, such as the omission of the subjects and the use of dummy subjects.

We then regressed the extracted characteristics of the participants’ utterances on their memory abilities measured in advance using standard memory tests.

Time orientation and intervention

The scope of the present study extends beyond simply specifying the relationship between time orientation and memory function. Returning to the relationship between conversational habits and memory functions, we used the obtained results to develop intervention programs to maintain older adults’ memory functions, in which people are encouraged to habitually focus on specific time orientation in their conversation. This idea is grounded in the semiflexibility of time orientation, as well as neuroplasticity.

If an individual’s time orientation is invariant, even with neuroplasticity, it may be difficult to enhance memory function through an intervention program that focuses on a specific time orientation. However, studies have implied that time orientation can be altered by interventions. A representative example is a clinical study that aimed to balance the time orientation of people with post-traumatic stress disorder symptoms, whose maladaptive negative orientations had become dominant, by encouraging them to focus on the opposite, positive time orientation; the study showed significant improvement (see Sword et al. (2015) for the background and case studies; see Sword et al. (2014) for other promising applications).

Given the semiflexibility of time orientation, if older adults with higher memory functions tend to have a specific time orientation in their natural conversation, we could construct an effective intervention program to maintain or enhance memory functions by encouraging people to adopt the habits of those with higher memory functions.

The coimagination method (Otake et al., 2011; Otake-Matsuura, 2018; Otake-Matsuura et al., 2021) is a conversational program designed to alleviate the decline in cognitive functions, including episodic memory, which is negatively affected by aging (Shing et al., 2010; Craik & Rose, 2012), based on the mechanism of memory consolidation and the semiflexibility of time orientations. In this method, older adults participate in group conversations. The topic of each group’s conversation session is determined in advance. Participants are asked to take photos related to the topic before the session, and during the session, they show and explain their photos to other participants. By implementing the sessions regularly, this method aims to get participants into the habit of encoding recently acquired information, thereby enhancing the ability to retain new information that would otherwise be lost before memory consolidation.

Structure and goal of this paper

The present study used dictation data from a randomized controlled trial conducted in Otake-Matsuura et al. (2021), which compared changes in cognitive functions of older adults in the intervention group adopting the coimagination method with those in the control group of daily conversation. We first specified the participants’ time orientations from their conversations for both the intervention and control groups. When adopting a new approach toward time orientation, we have to be aware that findings on which time orientation individuals weight more heavily vary depending on the sampling method adopted. For example, Demiray et al. (2018) found that individuals’ utterances in real-life conversations have a retrospective bias, a tendency to talk about past events more than future ones, whereas a prospective bias, a tendency to think about future-oriented events more than past-oriented ones, has been found in previous studies focusing on individuals’ thoughts or self-reports (Song & Wang, 2012). Therefore, it is important to investigate the frequency of each type of time-oriented utterance using our novel sampling method and coding scheme, which is reported in “Sentence classification by tags”.

Subsequently, we examined the relationship between memory functioning and extracted time orientations in “Associations with memory functioning”. This was conducted only for the control group because the conversation in the control group can be regarded as a proxy of their everyday conversation. Thus, it suits our purpose to examine the relationship between conversational habits and memory functions.

Based on these results, we next considered how conversational intervention programs could be more effective in maintaining and improving memory function in older adults. In “Proportions of sentences with relevant tags per session in the intervention group”, we examined which session topics increased the frequency of utterances with the features found in the previous step to be associated with higher memory functions. Including more sessions of this kind could lead to improvements in future intervention programs.

In short, the central interest of this paper lies in “Associations with memory functioning”; “Sentence classification by tags” is preliminary to that examination, and “Proportions of sentences with relevant tags per session in the intervention group” informs future interventions. Numbering in the online supplementary materials is prefixed with SM (e.g., Table SM1).

Methods

Data

Participants

This study used the dictation data obtained from an experiment examining the intervention effect of the coimagination method on cognitive performance in Otake-Matsuura et al. (2021). Detailed information on the participants’ characteristics has been reported in their paper. (Descriptive statistics for the demographic variables used in this study are reported in “Associations with memory functioning”.) Community-dwelling healthy adults (aged 65 years or older) were recruited from the Silver Human Resource Center. Sixty-five of the 72 people recruited satisfied the following inclusion criteria: not having dementia, not having a neurological impairment, not having a disease or medication known to affect the central nervous system, and a score on the Japanese version of the Mini-Mental State Examination (MMSE-J) (Sugishita et al., 2001) higher than 24. All participants were Japanese speakers. Thirty-two participants were randomly assigned to the intervention group, and the remaining 33 were assigned to the control group. In both the intervention and control groups, a 30-minute group conversation was conducted once a week for 12 weeks in subgroups of four fixed members, except for one control subgroup that consisted of five members.

Measures

In addition to the MMSE-J, the participants’ cognitive performance was evaluated using several tests in Otake-Matsuura et al. (2021). Among them, logical memories I and II from the Wechsler Memory Scale-Revised (WMS-R) (Wechsler, 1987) are the most relevant to the hypothesis of the present study. The WMS-R is a widely used neuropsychological test to measure verbal episodic memory, that is, the ability to encode novel information and recall it after a delay. In this task, participants were required to listen carefully to the experimenter, who read a short story aloud. Immediately after encoding, they were asked to recall the details of the story as accurately as possible. They were also required to recall the details after a 20- to 30-minute delay. We refer to these two tests as LM1 and LM2, respectively, and to the two together as LM. This study examined how participants’ scores on these tests at baseline relate to the characteristics of their utterances in group conversations.

Collection procedure

Participants in the control group were involved in unstructured group conversations without predetermined topics or guidance and in 30-minute health education about successful aging. Meanwhile, participants in the intervention group based on the coimagination method were involved in the structured group conversation with predetermined topics and guidance and received a 30-minute explanation of the intervention.

The coimagination method is a group conversation method that aims to train older adults’ cognitive functions through strict timekeeping and the use of photos to engage participants in talking with one another (Otake et al., 2011). In addition, the method was implemented using an original robot facilitator and systems that provide strict timekeeping for every participant in the conversation (Otake-Matsuura, 2018; Yamaguchi et al., 2012). In the group conversation, the facilitator robot Bono-05 (Otake-Matsuura & Tokunaga, 2020) played the role of a chairperson and encouraged participants to talk about their experiences with photos. Before attending the experiment, participants were asked to take photos based on a predetermined topic announced by the human operator. Participants were further asked to select the two photos that they thought were the best for each session. Each participant had one minute to explain each photo and two minutes to discuss it with other participants. During the experiment (Otake-Matsuura et al., 2021), the topic changed every week; thus, the participants were encouraged to talk about the latest topic, which they had experienced within the past week, using the two photos for a total of two minutes. Subsequently, each participant discussed their experiences with the other participants for a total of four minutes. The robot controlled the amount of speech of each participant during the group conversation. For example, if a participant talked too much, the robot stopped that participant from speaking and then encouraged the least talkative person to engage in the conversation.

During the experiment, we collected the participants’ voice data with unidirectional microphones, which were intended to record each participant’s voice independently and reduce noise. We also collected the transcription results (participant’s name, start time of utterance, end time of utterance, and transcribed text) with an original transcription system based on the Google Cloud Speech-to-Text API. The transcribed text included some errors arising from the limitations of speech-to-text technology (e.g., confusion of different words with similar pronunciations). Thus, to keep the natural language data as accurate as possible, the text was manually corrected with the help of part-time workers.

Annotations

To characterize the participants’ utterances, we entrusted the annotation tasks to IR-Advanced Linguistic Technologies, Inc., a company dealing with natural language processing located in Japan, according to our guidelines summarized in the following subsections.

The data were first annotated by the workers and then checked and updated by other workers. We used the updated data for the analysis. To examine the robustness of the results against the differences in judgments between the first and second workers, the results using the data before the update are shown in Section SM2. Although some differences were observed between the two results, they did not qualitatively change our claim.

Preprocessing

In this study, the unit of analysis was a sentence. Thus, tags characterizing the utterance, as defined below, were assigned to each sentence by annotators. However, when different tags were assigned to multiple clauses in a compound statement, the sentence was split into clauses with different tags.

We excluded duplicated tags from the analysis for the following two reasons. The first reason is technical. Our dictation data were obtained by automatically converting the speech, with a microphone assigned to each participant, into text by voice recognition, followed by manual correction. Therefore, as a result of picking up the voice of a louder speaker with another participant’s microphone, a similar sentence remained as the utterance of the latter speaker. In this case, annotators detected an actual speaker and set a flag for the duplicated sentence. The second reason is that tags were assigned to sentences that were fragmented because one person’s utterance interrupted another person’s utterance. In this case, the annotators set a flag for one of the fragmented sentences. In addition to these two cases, sentences judged to be meaningless by the annotators were also excluded. The total number of resulting sentences during the intervention period was 89,186.

Intention tags

The annotators assigned one of the following six tags to every sentence according to the intention of the participant’s utterance: Topic provision, Question, Reply, Utterance, Asking-back, and Others. The Topic provision tag was assigned to utterances in which speakers provided topics based on their experiences. The Question and Reply tags were assigned to questions about the provided topic and replies to those questions, respectively. The Utterance tag was assigned to sentences that were not classified into the first three categories. The Asking-back tag was assigned to sentences intended to make other participants repeat their previous utterances. The remaining sentences lacking substance, such as repetitions, exclamations, and rephrasings, were included in the Others tag. As far as possible, the annotators assigned the tags defined in the subsequent sections so that the conversation would make sense when reading only the sentences with Topic provision, Question, or Reply tags.

Time tags

The annotators assigned Time tags to sentences with one of the three types of Intention tags: Topic provision, Question, and Reply. Time tags are classified as follows.

Past:

is assigned if the sentence mentions an event or a fact that occurred more than a month ago, and that has been completed at the time when the participant spoke. The participant’s experience is also regarded as a past event (e.g., “I saw an elephant,” “I have ever seen an elephant,” and “That game was uninteresting.”) If annotators were unsure whether the mentioned event or fact was a completed one or it was still ongoing, the sentence was regarded as a Past-to-Present event, as defined below. For example, the sentence “Paint is evolving, is not it?” is regarded as a past-to-present event, while the Past tag is assigned to the sentence “In 2017, paints have made great progress.”

Recent:

is assigned if the sentence mentions an event or a fact that happened within the past month and has been completed (e.g., “Last week I went fishing.”)

Present:

is assigned if events, facts, states, properties, ideas, etc. mentioned in a sentence are still occurring or the participant still has them at the time of their utterance (e.g., “This house is unoccupied,” “I am female,” “I think that football is uninteresting.”) If a mentioned event or action that started a little while ago is continuing, the Present tag is also assigned to the sentence (e.g., “Bob is waving his hand.”) “I’m in trouble” is another example of a sentence with the Present tag because the participant did not explicitly state that he/she had been in trouble for a long time.

Future:

is assigned if the sentence mentions an event about the future or something that has not occurred yet, such as a plan, prediction, desire, and supposition (e.g., “I’m going to the USA,” “It will rain tomorrow,” “I want to go to the USA,” and “If it rains…”)

Past-to-Recent:

is assigned if the sentence mentions an event that had begun in the past and has been completed recently (e.g., “After graduating from university, I had lived alone until just a week ago.”)

Past-to-Present:

is assigned to a sentence mentioning an event that happened more than a month ago and continues presently, including customs (e.g., “I have been studying history for ten years,” “I climb mountains every summer,” and “Taking the dog for a walk is my daily routine.”) As previously defined, a sentence mentioning states, properties, and ideas is assigned the Present tag. However, if the participant explicitly states that the mentioned event has continued for a long time, that sentence is labeled Past-to-Present (e.g., “This house has not been occupied for three years,” “I have thought that football is uninteresting since childhood.”)

Recent-to-Present:

is assigned to a sentence mentioning an event that has occurred recently and continues presently, including customs (e.g., “I have been studying history since last week.”) As previously defined, states, properties, and ideas are included in the present category. However, if the participant explicitly stated that the mentioned event was still ongoing, the Recent-to-Present tag was assigned to the sentence (e.g., “This house has not been occupied for three days.”)

Present-to-Future:

is assigned to a sentence if an event in the sentence has just begun, and then that will continue, or when a participant wishes that it will be that way (e.g., “I will finish the job that I have just started by tomorrow.”)

Past-to-Future:

is assigned to a sentence if an event in the sentence has already happened in the past and will continue, or if the participant wishes or expects it to be so (e.g., “Let’s stay good friends,” “I want to continue working until death,” and “He will continue the work he started three years ago.”)

Recent-to-Future:

is assigned to a sentence if an event in the sentence has recently happened and will continue, or if the participant wishes or expects it to be so (e.g., “He will continue the work he started a week ago.”)
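
As a compact summary of the Time tag definitions above, the ten tags can be viewed as combinations of the period in which the mentioned event began and the period in which it ended (or is expected to end). The following Python sketch is purely illustrative: in the study, the tags were assigned manually by annotators from content and context. The names `time_tag` and `completed_event_period` are hypothetical, and the 30-day window is an assumption standing in for "the most recent month".

```python
# Periods in chronological order, following the Time tag definitions.
PERIODS = ["Past", "Recent", "Present", "Future"]


def completed_event_period(days_ago: int, recent_window_days: int = 30) -> str:
    """Period of a completed event; the 30-day window is an assumed
    stand-in for 'the most recent month'."""
    return "Recent" if days_ago <= recent_window_days else "Past"


def time_tag(start: str, end: str) -> str:
    """Combine the period in which an event began (start) with the period
    in which it ended or is expected to end (end)."""
    if start not in PERIODS or end not in PERIODS:
        raise ValueError("unknown period")
    if PERIODS.index(start) > PERIODS.index(end):
        raise ValueError("an event cannot end before it starts")
    return start if start == end else f"{start}-to-{end}"


# Enumerating every valid (start, end) pair yields the four simple tags
# and the six compound tags defined above.
ALL_TAGS = [time_tag(s, e) for i, s in enumerate(PERIODS) for e in PERIODS[i:]]

# "I have been studying history for ten years."
print(time_tag("Past", "Present"))
# "Last week I went fishing."
period = completed_event_period(7)
print(time_tag(period, period))
```

The sketch does not capture the annotators' rule for unclear cases (a sentence whose completion status is uncertain is treated as Past-to-Present), which depends on human judgment.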

Experience and Knowledge tags

The annotators also assigned the Experience tag and/or Knowledge tag to a sentence with one of three types of Intention tags: Topic Provision, Question, and Reply. “Experience” and “Knowledge” tags are defined as follows.

Experience:

If a sentence is not a question and the object or subject of the sentence includes “I,” people close to the speaker, the speaker’s possessions, or what the speaker has observed, we assigned the Experience tag to the sentence. In the case of question or confirmation, the Experience tag was assigned to the sentence when the object or subject of the sentence includes “you,” people close to the speaker, the speaker’s possessions, or what the speaker has observed.

Knowledge:

If a sentence is not a question and mentions general knowledge, the sentence is labeled Knowledge. In the case of questions or confirmation, the Knowledge tag was assigned to the sentence if the subject of the sentence was neither “I” nor “you.”

By definition, the concepts of experience and knowledge are not mutually exclusive. Let us consider the following conversation: Person A, showing a picture, says, “This is the giant Buddha in the XX temple that I visited the other day.” Person B asks, “How many meters is that?” Person A replies, “I think it was about 5 meters.” In this case, the first and third sentences seem to be classified as Knowledge in terms of literal meaning. However, those sentences are also about content that Person A would not know until Person A visited and indicated that Person A was recalling what happened there. Therefore, both Knowledge and Experience tags were assigned to the sentences.

Henceforth, the four types of sentences are denoted as follows: sentences with both Experience and Knowledge tags by “Exp-Know,” those with Experience tag and without Knowledge tag by “Experience,” those without Experience tag and with Knowledge tag by “Knowledge,” and those with neither Experience nor Knowledge tags by “No-label.”
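
The four-way labeling can be written as a small mapping from the two tag indicators to a label. This is a minimal sketch for illustration; the function name `exp_know_label` and the example data are hypothetical, and only the four label names come from the text.

```python
from collections import Counter


def exp_know_label(has_experience: bool, has_knowledge: bool) -> str:
    """Map the presence of Experience and Knowledge tags to one of the
    four sentence types used in the analysis."""
    if has_experience and has_knowledge:
        return "Exp-Know"
    if has_experience:
        return "Experience"
    if has_knowledge:
        return "Knowledge"
    return "No-label"


# Hypothetical tagged sentences: (has Experience tag, has Knowledge tag).
sentences = [(True, True), (True, False), (False, True), (False, False), (True, False)]
label_counts = Counter(exp_know_label(e, k) for e, k in sentences)
print(label_counts)
```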

Person tags

The subject in a sentence is also included in the analysis to extract the characteristics of sentences describing experience and knowledge more clearly. If there was any confusion in judging the omitted subject, sentences describing experience were tagged in the priority order of first, second, and third person. In contrast, sentences describing knowledge were tagged in the priority order of third, second, and first person.

If the subject was different between the main clause and the subordinate clause, or the main clause and the adverb clause, as in the following sentence, the utterance was split: “I was surprised because Bob came over.” This sentence is divided into main and subordinate clauses, and the former and latter are tagged as first person and third person, respectively.

We refer to the tags indicating the first, second, and third person as Person1, Person2, and Person3, respectively.

Statistical methods

We first present the distribution of sentences with each type of tag as basic information. To test the differences in the proportions of tags between the intervention and control groups over the entire intervention period, chi-square tests were conducted, and the associated effect sizes were reported. We next observed the distribution of tags in each session, stratified by intervention and control groups.
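
To make the group comparison concrete, the following Python sketch computes a Pearson chi-square test for a 2×2 table (group by presence of a tag) together with the phi coefficient as an effect size. It is an illustration under stated assumptions, not the analysis code of the study: the counts are invented, no continuity correction is applied, and the choice of phi as the effect size is an assumption because the specific measure is not named here.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test for a 2x2 table without continuity correction.

    a, b: intervention-group sentences with / without the tag
    c, d: control-group sentences with / without the tag
    Returns (chi2, p_value, phi).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    phi = math.sqrt(chi2 / n)  # effect size for a 2x2 table
    return chi2, p_value, phi


# Invented counts for illustration only.
chi2, p_value, phi = chi_square_2x2(30, 70, 10, 90)
print(round(chi2, 3), round(phi, 3), p_value < 0.001)
```

With very large sentence counts, even small differences in proportions become statistically significant, which is one reason to report an effect size alongside the test.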

We used hierarchical clustering based on Ward’s method to extract the intra-personal patterns. Each participant’s data were transformed into a vector whose elements are the intra-personal proportions of each type of tag. We then studied the frequency of each pattern within the intervention and control groups. The number of clusters was determined using the pseudo \(t^2\) statistic.
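
The clustering step can be sketched as follows: each participant's tag counts are converted to proportions, and clusters are merged agglomeratively using Ward's minimum-variance criterion. This is a naive pure-Python illustration, not the software used in the study; the function names and the counts are hypothetical, and the number of clusters is passed in directly rather than chosen with the pseudo \(t^2\) statistic.

```python
def proportions(counts):
    """Convert one participant's tag counts to intra-personal proportions."""
    total = sum(counts)
    return [c / total for c in counts]


def ward_clusters(vectors, n_clusters):
    """Naive agglomerative clustering with Ward's criterion.

    At each step, merge the pair of clusters whose merger gives the smallest
    increase in within-cluster sum of squares:
        n_a * n_b / (n_a + n_b) * ||centroid_a - centroid_b||^2
    Returns a list of clusters, each a sorted list of participant indices.
    """
    clusters = [([i], list(v)) for i, v in enumerate(vectors)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                dist2 = sum((x - y) ** 2 for x, y in zip(ci, cj))
                cost = ni * nj / (ni + nj) * dist2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = [(ni * x + nj * y) / (ni + nj) for x, y in zip(ci, cj)]
        merged = (mi + mj, centroid)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(members) for members, _ in clusters]


# Invented counts in the order Present, Past, Past-to-Present, Recent, Future.
counts = [
    [50, 10, 10, 25, 5],   # participant 0: relatively many Recent tags
    [48, 12, 10, 26, 4],   # participant 1: relatively many Recent tags
    [50, 30, 12, 4, 4],    # participant 2: relatively many Past tags
    [52, 28, 10, 5, 5],    # participant 3: relatively many Past tags
]
vectors = [proportions(c) for c in counts]
result = sorted(ward_clusters(vectors, n_clusters=2))
print(result)
```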

Subsequently, we examined whether and how participants’ memory functioning was associated with the frequency with which they generated sentences with the tags of concern. Based on the findings of the previous literature reviewed in “Introduction”, the tags of interest to us are combinations of Past or Recent tags and Experience or Knowledge tags. To estimate the association, four types of generalized linear models were applied to participants’ memory functioning measured at baseline (as explanatory variables) and the number of their sentences with the tags of concern over the entire intervention period (as response variables): Poisson regression, zero-inflated Poisson regression, negative binomial regression, and zero-inflated negative binomial regression. We included zero-inflated models because some participants never uttered a sentence with the tags of concern. In all the models, a log-link function was used, and the logarithm of the total number of sentences for each participant was included as the offset variable. Namely, the expected number \(E(y_i)\) of sentences with a tag of concern generated by participant i is expressed as

$$\begin{aligned} \log E(y_i) = {}& \log N_i + \beta_0 + \beta_1\, score_i + \beta_2\, age_i \\ & + \beta_3\, gender_i + \beta_4\, education_i \end{aligned} \tag{1}$$

where \(N_i\) is the total number of sentences generated by participant i (which was used in the offset variable \(\log N_i\)), \(\beta _0\) is the intercept, \(score_i\) is the variable indicating the LM1 (or LM2) score of participant i, and \(\beta _1\) is the regression coefficient associated with the score variable. We also included the participants’ age at baseline, gender, and years of education as control variables. Age is a continuous variable, gender is a dummy variable, and years of education is a dummy variable, which is dichotomized by whether it is less than 13 years. We denote the associated regression coefficients by \(\beta _{k\in \{2,3,4\}}\). For each coefficient \(\beta _k\), a one-unit increase in the associated variable, with the other variables held fixed, multiplies the expected number of sentences with a tag of concern by \(\exp (\beta _k)\).
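
The role of the offset and the multiplicative interpretation of the coefficients in Eq. (1) can be checked numerically. The following Python sketch evaluates the mean function for hypothetical coefficient values; the numbers are invented for illustration and are not estimates from the study, and the function name `expected_count` is hypothetical.

```python
import math


def expected_count(n_sentences, beta, score, age, gender, education):
    """Mean of the count model in Eq. (1):
    log E(y) = log N + b0 + b1*score + b2*age + b3*gender + b4*education.
    """
    b0, b1, b2, b3, b4 = beta
    linear = b0 + b1 * score + b2 * age + b3 * gender + b4 * education
    return n_sentences * math.exp(linear)  # exp(log N + linear)


# Hypothetical coefficients (b0, b1, b2, b3, b4); not estimates from the study.
beta = (-3.0, 0.02, -0.01, 0.1, 0.05)

base = expected_count(1000, beta, score=10, age=70, gender=1, education=0)
plus_one = expected_count(1000, beta, score=11, age=70, gender=1, education=0)
doubled = expected_count(2000, beta, score=10, age=70, gender=1, education=0)

# A one-point higher score multiplies the expected count by exp(b1).
rate_ratio = plus_one / base
print(rate_ratio, math.exp(beta[1]))
# Because of the offset, doubling the total number of sentences doubles
# the expected count while leaving the expected proportion unchanged.
print(doubled / base)
```

Because \(\log N_i\) enters with a coefficient fixed at one, the model is effectively a model of the proportion of sentences carrying the tag of concern rather than of the raw count.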

For model selection, an overdispersion test was first performed to examine the validity of the Poisson regression assumption that the conditional variance equals the conditional mean for the current data. Model selection between the negative binomial regression model and the zero-inflated negative binomial and Poisson models was then conducted based on the AIC. For multiple testing corrections, we used the Benjamini–Hochberg method with the false discovery rate set to 0.1.
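
For reference, the Benjamini–Hochberg step-up procedure can be sketched in a few lines of Python. The p-values below are invented for illustration; only the false discovery rate of 0.1 is taken from the text, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Benjamini-Hochberg step-up procedure.

    Sort the m p-values in ascending order, find the largest rank k such that
    p_(k) <= k / m * fdr, and reject the hypotheses with the k smallest
    p-values. Returns a list of booleans in the original order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Invented p-values for six hypothetical tests.
p_values = [0.001, 0.04, 0.03, 0.5, 0.2, 0.06]
decisions = benjamini_hochberg(p_values, fdr=0.1)
print(decisions)
```

Note that the procedure rejects every hypothesis up to the largest rank that meets its threshold, even if an intermediate p-value exceeds its own threshold.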

Finally, we examined which sessions in the intervention groups had higher frequencies of tags that were estimated to be associated with memory functioning. Investigating topics for these sessions will provide guidance on what should be included in future intervention programs.

The software and packages used for the analysis are presented in Section SM3.

Table 1 Descriptive statistics for the variables used in the regression analyses
Table 2 Associations between frequencies of tags and memory functioning

Results

Sentence classification by tags

For readability, details are reported in Section SM1. This section briefly describes the classification result for each type of tag during the entire intervention period.

Intention tags

The proportion of utterances with substantive content (i.e., sentences with Topic provision, Questions, and Reply tags) was higher in the intervention group than in the control group.

Time tags

For both the intervention and control groups, the number of Present tags was the largest, followed in order by Past, Past-to-Present, Recent, and Future. Other tags were excluded from subsequent analyses because they were rare.

The proportions of Recent, Past-to-Present, and Present tags in the intervention group were greater than those in the control group (listed in descending order of effect size).

The hierarchical clustering applied to vectors of proportions of sentences with the leading five Time tags generated four clusters. Most of the participants in the intervention group belonged to two clusters with more Recent tags and fewer Past tags whereas almost all participants in the control group belonged to the remaining two clusters with more Past tags and fewer Recent tags.

Experience and Knowledge tags

Proportions of sentences with Exp-Know and Experience tags in the intervention group were greater than those in the control group.

Person tags

For both the intervention and control groups, the number of Person3 tags was the largest, followed by those of Person1 and Person2 tags.

The proportion of sentences with the Person1 tag in the intervention group was greater than that in the control group.

Table 3 Proportions of relevant tags per session

Associations with memory functioning

We investigated the associations between memory functioning measured by LM1 and LM2 scores at baseline (independent variables) and the number of sentences with tags related to the hypothesis (dependent variables), namely the combinations of Past or Recent tags with Experience/Knowledge tags. As mentioned in “Introduction”, to more clearly extract the characteristics of experience and knowledge, we used Experience (Knowledge) tags in combination with Person1 (Person3) tags. For sentences containing both Experience and Knowledge tags, those with the Person2 tag were excluded. The label “notPerson2” was used to indicate this explicitly. Table 1 displays the descriptive statistics of the variables used in the analysis.

Since the dependent variables were count data, we first applied Poisson regression models to all combinations of dependent and independent variables. Overdispersion tests detected overdispersion in all of these models. Since no zeros were observed in the dependent variables related to the Past tag, we selected the negative binomial model for analyses of Past-related tags. Regarding the dependent variables related to the Recent tag, zeros were observed (Table 1). Thus, we conducted model selection among the negative binomial, zero-inflated negative binomial, and zero-inflated Poisson models. For all cases except the regression of the number of Recent:Exp-Know:notPerson2 on LM1, the AIC was smaller for the negative binomial model than for the zero-inflated models, and that for the zero-inflated Poisson model was the largest. Even in the exceptional case, the difference between the AICs of the negative binomial model and its zero-inflated counterpart was minor (172.3 vs. 171.1). Therefore, we selected negative binomial regression models for the analysis of Recent tags.
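The AIC comparison for the exceptional case can be sketched in Python as follows. The two AIC values are those quoted above, read in the order in which the models are named (negative binomial 172.3, zero-inflated negative binomial 171.1); the helper `aic` and the example log-likelihood are illustrative assumptions, not part of the original analysis.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


# AIC values quoted in the text for Recent:Exp-Know:notPerson2 regressed on LM1.
aic_by_model = {
    "negative binomial": 172.3,
    "zero-inflated negative binomial": 171.1,
}

best = min(aic_by_model, key=aic_by_model.get)
# Difference of each model from the best one (delta AIC).
delta_aic = {
    name: round(value - aic_by_model[best], 1)
    for name, value in aic_by_model.items()
}

# Illustration of the definition with a hypothetical log-likelihood.
example = aic(log_likelihood=-80.0, n_params=6)
```

Here the negative binomial model is only 1.2 AIC units above its zero-inflated counterpart. A widely used rule of thumb treats differences below about 2 as too small to separate two models, which is in line with the choice of the simpler negative binomial model, although the text does not state that this particular threshold was applied.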

Table 2 shows the results of the negative binomial regressions. The LM1 and LM2 scores did not have significant associations with frequencies of Experience:Person1 tags when combined with either Past or Recent tags.

Regarding the Knowledge:Person3 tag, the LM1 and LM2 scores had significant positive associations with it when it was combined with the Recent tag, but not when it was combined with the Past tag. The regression coefficients indicated that a one-point increase in the LM1 (LM2) score was associated with an increase in the number of sentences with Recent, Knowledge, and Person3 tags by a factor of \(1.11\ (=\exp (0.105))\) (\(1.13\ (=\exp (0.118))\)).

Regarding the Exp-Know:notPerson2 tag, when combined with the Past tag, it was significantly positively associated with the LM2 score. The regression coefficient indicated that a one-point increase in the LM2 score was associated with an increase in the number of sentences with Past and Exp-Know:notPerson2 tags by a factor of \(1.07\ (=\exp (0.069))\). When the Exp-Know:notPerson2 tag was combined with the Recent tag, it had no significant association with either the LM1 or LM2 score.
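The multiplicative effects quoted above follow directly from exponentiating the coefficients. The short Python sketch below reproduces them; the coefficients are those reported in the text, while the function name `rate_ratio` and the five-point difference are illustrative assumptions that presume the log-linear relationship holds over that range.

```python
import math

# Coefficients quoted in the text (negative binomial regressions, Table 2).
coefficients = {
    ("Recent:Knowledge:Person3", "LM1"): 0.105,
    ("Recent:Knowledge:Person3", "LM2"): 0.118,
    ("Past:Exp-Know:notPerson2", "LM2"): 0.069,
}


def rate_ratio(beta, delta=1.0):
    """Multiplicative change in the expected count for a delta-point change in the score."""
    return math.exp(beta * delta)


for (tag, score), beta in coefficients.items():
    print(
        f"{tag} on {score}: x{rate_ratio(beta):.2f} per point, "
        f"x{rate_ratio(beta, 5):.2f} per 5 points"
    )
```

Because the effects are multiplicative, a five-point difference corresponds to \(\exp (5\beta )\) rather than five times the one-point effect.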

Although the p-value for the association between the Past:Exp-Know:notPerson2 tag and the LM1 score was below 0.05, the association was not significant under the Benjamini–Hochberg criterion. All other results significant at the 5% level in Table 2 met the Benjamini–Hochberg criterion.

Proportions of sentences with relevant tags per session in the intervention group

The previous section revealed that the most notable tags related to memory functioning were the Past:Exp-Know:notPerson2 and Recent:Knowledge:Person3 tags. To inform more effective intervention programs, we calculated the proportions of those two types of sentences for each session in the intervention group and identified the topics of the sessions in which those sentences were frequently observed. We refer to the 12 sessions as S1, S2, ..., S12.

Table 3 lists the session topics and shows the average number per person of all tagged sentences, the average percentage per person of Past:Exp-Know:notPerson2-tagged sentences, and the counterpart of Recent:Know:Person3-tagged sentences for each session. The associated standard deviations are also presented.

The table shows that sentences with Past:Exp-Know:notPerson2 tags were observed relatively more in S2, S10, S1, and S5 than in other sessions, in that order. It also shows that those with Recent:Knowledge:Person3 tags were observed relatively more in S10, S3, S2, and S12 than in other sessions, in that order.

The frequencies of our focal sentences were low (the average of the percentages is at most 3.9%). Their standard deviations were larger than the associated means for every session, except the frequency of Past:Exp-Know:notPerson2 tagged sentences in S2.

Discussion

Summary

This study hypothesized a bidirectional influence between the memory ability of older adults and their time orientation in natural conversations. On the one hand, people who frequently talk about recent events in their daily lives may have an enhanced ability to integrate new information owing to repeated recollection through conversation. On the other hand, the greater the ability to memorize recently accessed information, the more frequently they would talk about recent topics in their daily lives.

We aimed to test the latter hypothesis using data from a previous study that investigated the intervention effect of a conversational program on cognitive functions (Otake-Matsuura et al. , 2021). Based on our results and the former hypothesis, we tried to obtain findings that will refine future intervention programs to improve memory functioning.

To explore older adults’ time orientation through their utterances in group conversations, we assigned Time tags to their utterances (“Time tags”) and considered the percentages of those tags in their total utterances as their time orientation. We customized the Time tags in line with our hypothesis. Specifically, we added a novel time perspective dimension, Recent, which refers to an event within one month, to the existing categorization consisting of Past, Present, and Future. We also categorized the content of the participants’ utterances in terms of experience and knowledge based on two types of declarative memory: episodic and semantic memory.

Since this study is based on a different mode (i.e., group conversation) and different tags (e.g., Recent, and Experience/Knowledge tags) from previous studies, we first checked the prevalence of our tags (“Sentence classification by tags”).

Next, we investigated participants’ time orientation in unstructured group conversations (i.e., the control condition in Otake-Matsuura et al. (2021)), which serve as a proxy for conversation in everyday life. We then regressed it on their memory functioning measured in advance (“Associations with memory functioning”).

We further observed features of utterances in structured group conversations following the coimagination method (i.e., the intervention condition in Otake-Matsuura et al. (2021)), and identified what kinds of group conversation topics encouraged participants to generate sentences with features that were associated with higher memory functioning (“Proportions of sentences with relevant tags per session in the intervention group”).

Prevalence of Time Tags in group conversations

Present-tagged sentences were the most prevalent, followed by Past- and then Future-tagged sentences. This is consistent with Demiray et al. (2018), despite the difference between group conversations in laboratory settings and daily conversations outside the laboratory and the difference in annotation schemes. Considering that the study participants in Demiray et al. (2018) were English and Swiss German speakers, the consistency between their results and those of this study with Japanese speakers indicates, to some extent, robustness to language differences. Future studies on the time orientation of daily conversation in other languages are needed.

The prevalence of our newly defined Recent-tagged sentences fell between those of the Past- and Future-tagged sentences. The result at the aggregate level over the entire intervention period shows that the intervention group was more than twice as likely to generate Recent-tagged sentences as the control group (Table SM1).

Moreover, the hierarchical clustering method revealed different features of group conversations based on the coimagination method in the intervention group and normal group conversations in the control group. In terms of Time tags, the participants were categorized into four clusters (Table SM2). Clusters containing more participants in the intervention group had higher average frequencies of Recent-tagged sentences, and clusters containing more participants in the control group had higher average frequencies of Past-tagged sentences. The two intervention-dominated clusters differed from each other in whether their average frequency of Recent-tagged sentences was high or low, and the two control-dominated clusters differed likewise in their average frequency of Past-tagged sentences. Even so, the lower of the two Recent-tagged frequencies in the intervention-dominated clusters was higher than that of either control-dominated cluster. This result suggests that conversational patterns based on the coimagination method are difficult to obtain in natural conversations. If it is established in the future that the habit of frequently talking about recent events in daily life contributes to the maintenance or improvement of memory functioning (i.e., the former hypothesis in “Summary”), this result suggests that the coimagination method can be an effective intervention program.

Association between older adults’ conversational characteristics and their memory functioning

As shown in Table 2, the LM scores had significant positive associations with the frequency of sentences with Recent tags if they were also Knowledge- and Person3-tagged (i.e., Recent:Know:Person3). However, no significant association was found between the LM scores and the frequency of Past-tagged sentences with Knowledge and Person3 tags. In addition, the LM scores showed no significant association with the frequency of Experience- and Person1-tagged sentences, regardless of the Time tags (Recent or Past). These findings suggest that older adults’ ability to memorize recent events is related to their utterances in natural conversations, i.e., the greater the ability to memorize recently accessed information, the more frequently they would talk about recent knowledge-based information obtained from their daily lives.

The main finding of a positive relationship between LM scores and the frequency of utterances regarding recently acquired knowledge-based information in natural conversations could be explained by the similarity between the behavioral metrics of LM and our conversation-derived metrics. Memory accuracy in the recall tasks in LM (see “Measures”) can be interpreted as the level of ability to memorize recently accessed knowledge-based information from external sources rather than experience-based information from one’s own experiences. This behavioral measure is similar to our conversation-derived metrics, that is, the frequency of utterances with recently acquired knowledge-based information, in terms of their targeted time and type of information. This explanation could be further supported by the finding that there was no significant association between the LM scores and the frequency of utterances with more remotely acquired (i.e., Past-tagged) knowledge-based information. Remote knowledge-based information can be regarded as semantic memory, which refers to the memory of general facts without a specific context in time and space (Tulving , 1972). This type of memory is established based on accumulated episodes and consolidated as abstracted knowledge, which is stable against advancing age (Park et al. , 2002). Thus, it should be distinguished from the recent one, which would be relatively vulnerable owing to a lack of sufficient time for establishment. It is also worth noting that no significant association was found between LM scores and the frequency of sentences with experience-based information, regardless of whether they described recent or remote ones. This result can be explained by the dissimilarity between the two metrics, given that experience-based information can be considered autobiographical memory, which would not be directly measured by the logical memory subtests in the WMS-R.

In interpreting the finding that the LM score was associated with the frequency of topics regarding recently acquired knowledge-based information but not experience-based information, further consideration should be given to the possibility that socio-demographic variables associated with the likelihood of acquiring recent knowledge-based information (e.g., educational background, information literacy, frequency of reading, and richness of social relationships that facilitate information exchange) were confounding factors. In particular, such variables may contribute both to higher memory performance and to a higher frequency of utterances about recently acquired knowledge-based information. To address this issue, our analysis included years of education as a control variable. Nevertheless, given that Lachman et al. (2010) demonstrated that frequent cognitive activity can compensate for the adverse effect of fewer years of education on the performance of a word recall task, the socio-demographic background of the participants needs to be further investigated and included in the analysis to clarify the detailed mechanism and to estimate effect sizes more accurately.

In the present study, we also found that the LM scores had significant positive associations with the frequency of sentences with both Experience and Knowledge tags if they were Past-tagged, i.e., Past:Exp-Know:notPerson2 (although for LM1, the corrected p-value suggests no significance). This implies that participants’ memory ability could be associated with the richness of the content in their utterances. In other words, the greater their memory ability, the more frequently they would speak about balanced information that contains both their own experience and general knowledge. This finding is consistent with a previous finding that the linguistic ability of older adults is associated with a subsequent decline in their cognitive function, including episodic memory (Farias et al. , 2012). As a measure of linguistic ability, Farias et al. (2012) used an idea density score that reflects the amount of information relative to the number of words used in a narrative. Thus, the previous finding can be interpreted as evidence that the quantity of novel information packed into utterances is related to memory function. Our findings extend the previous finding by demonstrating that older adults’ memory functions can be an indicator of the quality of the information included in their utterances, such as richness or complexity (i.e., the degree to which experience-based and knowledge-based information are packed into sentences in a balanced manner). Furthermore, if the richness or complexity of Exp-Know is interpreted as a consequence of recollection (i.e., not relying on familiarity), it is reasonably associated with high memory function, as our results suggest.

Implications for intervention programs

No significant intervention effects of the coimagination method on LM scores were found in the RCT (Otake-Matsuura et al. , 2021). This implies that the magnitude of the intervention may have been insufficient to improve participants’ memory functioning. Although the mechanism linking conversation training to the maintenance or improvement of memory function must continue to be tested, knowing which session topics increased the proportion of sentences with tags relevant to memory functioning would help in developing more effective intervention programs along this line.

The results show that the proportions of sentences with both Past:Exp-Know:notPerson2 and Recent:Know:Person3 tags in the sessions whose topics were “Neighborhood landmarks” (S2) and “Found on a 10-minute walk” (S10) were relatively higher than in the other sessions (Table 3). Conducting more sessions in which participants share what they notice in their neighborhood walks may be a promising approach to making the intervention program effective.

Previous studies have shown that walking is positively associated with not only older adults’ subjective well-being (Ku et al. , 2016), but also lower cognitive decline (Weuve et al. , 2004) and a reduction in the risk of dementia (Abbott et al. , 2004). Some studies have suggested that programs that combine cognitive and physical activities hold promise for facilitating neuroplasticity (Fissler et al., 2013; Bamidis et al., 2014; Wenger & Shing, 2016). Moreover, Sekiguchi et al. (2021) showed that walking is a leisure activity that is relatively easy for various types of inactive older adults to start. The current study suggests that the beneficial effects on memory function may be further enhanced by outputting the information gained during walking in conversations rather than just walking.

If we evaluate S2 and S10 from the viewpoint of reducing the disparity among individuals in the proportions of sentences with focal tags, S2 might be preferable because the coefficient of variation (SD divided by Mean in Table 3) in S2 was slightly smaller than in S10. Incorporating this point will be a future task for the coimagination method, although the method already takes care to equalize the speaking time of each participant (“Collection procedure”).
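The comparison by coefficient of variation can be sketched in Python as follows. The means and standard deviations are hypothetical placeholders labeled as sessions A and B; the actual values for S2 and S10 are those in Table 3.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation: SD divided by mean (mean must be positive)."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return sd / mean


# Hypothetical (mean %, SD %) per session, not the values in Table 3.
sessions = {
    "session A": (4.0, 3.0),
    "session B": (2.5, 3.0),
}

cv = {name: coefficient_of_variation(m, s) for name, (m, s) in sessions.items()}
# The session with the smaller CV shows less relative disparity among participants.
most_even = min(cv, key=cv.get)
```

A smaller coefficient of variation means that the proportion of focal sentences varied less across participants relative to its mean, which is the sense in which one session can be preferred over another here.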

Limitations and future works

The present study adopted a group conversation approach. This has an advantage over observing participants’ speech in their daily lives: participants inevitably listened to other participants’ speech and sometimes received questions from others, which may serve as cues and make it easier for those who usually have few opportunities to talk with others to speak about past or recent events. This can reduce the likelihood of underestimating their potential retrieval ability.

However, it also contains an element that obscures the results: Having adopted the approach of using the utterance content as a trace of having elicited memories, we should consider the factors that may cause the discrepancy between the actual (not necessarily uttered) memory elicited and the frequencies of Time tags assigned to utterances. For example, there can be individual differences in how many conversational turns can be taken. Even for individuals with similar characteristics, the number of turns taken may vary depending on their compatibility with the group members. Even if participants can recall and talk about their own experiences, those who are not good at taking conversational turns may not show this ability in group conversations. We cannot rule out the possibility of this affecting our results. To address this issue, it may be necessary to investigate the conversational characteristics of each participant in situations other than group conversation.

Furthermore, participants had conversations with the same group members throughout the entire period. The characteristics of utterances may vary depending on the conversation partner (Luo et al. , 2019). Whether the results are robust to the composition of the group needs to be addressed.

The present study was conducted on healthy older adults who were willing to work enough to enroll in the Silver Human Resource Center. However, studies have suggested a link between mental state and the time orientation of thoughts. For example, depression is associated with past-orientation and anxiety with future-orientation (Ito et al. , 2019). Whether this link also holds for the time orientation observed in spontaneous natural conversation, as in the present study, needs further examination in order to assess how robust our results are to variations in mental state.

The present study focused solely on tags assigned to sentences. However, to further examine the validity of the interpretation that the ability to construct complex sentences, such as those with Exp-Know tags, was associated with memory functioning, we should consider structural aspects of sentences, such as complexity and dependency, in future analyses.