Key Points

We report here on an international Delphi validation study of the Turkish Inappropriate Medication use in the Elderly (TIME) criteria, carried out with 11 international panelists with particular expertise in the rationalization of drug use in older adults.

The internationally validated version of the TIME criteria includes 134 criteria (101 TIME-to-STOP and 33 TIME-to-START criteria).

This first version of the internationally validated TIME criteria will be used in clinical trials to validate their effectiveness in improving prescribing in older adults.

1 Introduction

Older adults are the largest population group of medication users due to the increased incidence of chronic diseases with aging and the concurrent development of geriatric syndromes. These factors increase the risk of polypharmacy and potentially inappropriate prescribing, both of which are well-known risk factors for adverse drug reactions (ADRs) [1, 2]. ADRs in older adults are a significant cause of morbidity, disability, and mortality, and constitute a serious and ever-increasing public health problem [1]. Accordingly, the optimization of pharmacotherapy is essential in the management of older patients as a means of counteracting the adverse medical, economic, and social consequences of inappropriate medication use. Explicit (criteria-based) screening tools and implicit (judgment-based) evaluation methods have been developed to assist healthcare professionals in the management of pharmacotherapy in older adults. The explicit tools are drug and/or disease oriented, and can be used with little or no clinical judgment. They usually list determinants of inappropriate medication use for several drugs and/or diseases, or drugs to avoid. Implicit evaluation tools, on the other hand, are judgment based and patient specific, and take into account the patient’s complete medication regimen. They combine research data with clinical evaluations, and consider the preferences of the patients/caregivers when assessing the quality of prescriptions [3]. More than 70 tools from many different countries are described in the literature for the assessment of inappropriate prescribing [4,5,6,7,8,9], among which the most commonly used are the Beers criteria and the Screening Tool of Older Persons’ potentially inappropriate Prescriptions/Screening Tool to Alert to Right Treatment (STOPP/START) criteria [10, 11].

Prescribing habits and locally available medications vary considerably between countries, and the evidence on appropriate prescribing in older persons continues to evolve. Within this context, we produced a guide tailored to the specific needs of the Eastern European region and developed an explicit screening tool, the Turkish Inappropriate Medication use in the Elderly (TIME) criteria set (TIME-to-STOP/TIME-to-START), under the leadership of the Rational Drug Use Working Group of the Turkish Academic Geriatrics Society [12].

A variety of explicit tools exist in this complex field, and it is not possible to identify a single ideal tool. Analogous to other tools developed explicitly for use in different European countries that take into account the medications available in the respective national markets, the TIME criteria have been developed primarily for use in Turkey and the Eastern European region, with nationally recognized academicians involved in their development. However, as the TIME criteria readily incorporate aspects of well-accepted central European tools, and are updated based on a thorough literature review, we suggest that the TIME criteria be regarded as an expanded set for broader international use rather than being restricted to Eastern Europe. Hence, the final (fourth) phase in the development of the TIME explicit screening tool was planned as a Delphi validation study, with the aim of validating the tool internationally. In this article, we present the process and results of this Delphi validation phase.

2 Methods

2.1 Design

The flow of the TIME study was as follows: We applied the STOPP/START tool methodology and classified the inappropriate prescribing criteria in a similar way to the STOPP and START criteria [12]. The study was conducted by the TIME study group, comprising a national expert group of 49 academics and a national working group of 23 academics. The academics were from a wide range of specialties involved in the frequent care of older adults, with 17 members from geriatric medicine; four members from psychiatry; three members each from neurology, cardiology, gastroenterology, general internal medicine, and pharmacology; two members each from endocrinology, nephrology, physical therapy and rehabilitation, and urology; and one member each from pulmonology, infectious diseases, gynecology, ophthalmology, and clinical pharmacology. Details of the study have been described elsewhere [12], but briefly, the study was performed in three phases. In the first phase, an initial draft of 115 criteria was created combining the STOPP/START v2 [13] and the CRIME criteria [14]. The expert group reviewed the first draft and provided feedback comments, including suggested revisions or removals of the criteria, and/or the addition of new criteria. A second draft was then formed that took into account the feedback of the expert group. In the second phase, the working group reviewed the second draft and conducted a thorough literature search on each criterion. Two geriatrician members and the criterion-related specialty-specific working group members worked face-to-face, evaluating the second draft in view of the references. This approach was applied to ensure and preserve the geriatric medicine perspective on each criterion. The group revised, removed, and added criteria where sound evidence was found that was in line with the expert group’s comments. Consequently, the third draft was developed at the end of the second phase. 
In the third phase, all working group members evaluated and approved the full criteria set. At the end of this stage, 55 criteria had been added, 17 criteria had been removed, and 60 criteria had been modified relative to the first draft. Accordingly, the final set of TIME criteria was composed of a total of 153 criteria (112 TIME-to-STOP and 41 TIME-to-START criteria) [12].

Following the third phase of the TIME criteria study, which was completed in March 2019 [12], the international Delphi phase was launched in June 2019. Delphi is a method of eliciting and refining group judgments [15], and is designed to facilitate structured group communications with a view to reaching consensus in expert opinions in the face of complex problems, expensive endeavors, and uncertain outcomes. For these reasons, it is applicable to the development of guidance on prescribing in older adults, many of whom have medical complexity and take multiple medications. Furthermore, the Delphi method has already proved useful in several previous studies designed to develop inappropriate medication use criteria sets, such as STOPP/START, the Beers criteria, the GheOP3S tool, and EURO-FORTA [10, 13, 16, 17]. It is conducted in consecutive rounds until a high level of consensus is reached among experts. The structure of the TIME-to-STOP/TIME-to-START tool was designed in the same way as the STOPP/START criteria [12, 13], and so we opted for the Delphi method applied in the STOPP/START study [13].

The panelists of this Delphi study were 11 internationally recognized experts who specialize in geriatric pharmacotherapy and who have experience in the development of explicit tools: nine were academic geriatricians, one was a clinical pharmacologist, and one was an academic community pharmacist. Of the academic geriatricians, one also had expertise in palliative medicine and clinical pharmacology. To ensure broad representation, experts were selected from different regions, with ten of the panelists being from seven countries in Europe (Belgium, Czech Republic, Germany, Italy, Spain, United Kingdom, The Netherlands) and one from Israel. All panelists were members of the International Group for Reducing Inappropriate Medication Use and Polypharmacy (IGRIMUP).

2.2 Delphi Rounds

We used SurveyMonkey® software, an established online survey tool, to conduct the Delphi rounds and support the consensus process. The nationally approved TIME-to-STOP and TIME-to-START tool provides explanations for some criteria to aid clinicians in clinical practice. These explanations were omitted in the present Delphi study, partly because of the technical difficulties of integrating them into the online set and partly to simplify the process for the online panelists.

First, the full textual references for each criterion were sent in a Dropbox™ file to the panelists for their consideration. The first Delphi round then commenced with the distribution of the nationally validated version of the TIME criteria set (153 criteria). In the Delphi rounds, each panelist was asked to indicate to what extent they agreed or disagreed with each TIME-to-STOP and TIME-to-START criterion, considering both the available evidence and their own experience. We used a Likert scale to assess the level of agreement with each criterion, scored as follows: 1 = strongly agree; 2 = agree; 3 = neither agree nor disagree; 4 = disagree; 5 = strongly disagree. The panelists were given the opportunity to comment on each criterion and to add suggestions to the statements. The median and interquartile range values for each criterion were calculated in each Delphi round. Criteria with a median value of 1 or 2 and a 75th centile value of 1 or 2 were accepted and included in the final internationally validated criteria set. Criteria with a median value of > 2 were rejected and removed from the criteria set, while criteria with a median value of 1 or 2 but a 75th centile value of > 2 were retained, to be assessed in the following round. Hence, the criteria that were neither rejected nor accepted formed the content of the next Delphi round for re-evaluation by the panelists, and the scores, the 75th centile values, and the panelists’ comments related to those criteria were included in that round. We then proceeded to the next Delphi round, once again inviting the panelists to score their agreement and comment on each criterion. The panelists re-scored the retained criteria on the Likert scale, considering the former scores and the accompanying comments. The same approach was applied in each round to reject, accept, or retain the criteria. 
In line with this outlined concept, we planned to continue the Delphi rounds until consensus was reached on the rejection or acceptance of each criterion. We completed the international Delphi study in March 2020.
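The decision rule described above can be illustrated with a short Python sketch. This is not the software used in the study: the function name `classify_criterion` and the example scores are hypothetical, and the 75th centile is computed by linear interpolation (`method="inclusive"`), which is an assumption because the interpolation method is not stated in the text.

```python
from statistics import median, quantiles


def classify_criterion(scores):
    """Classify one criterion from its panel of Likert scores (1-5).

    Rule as described in the text:
      - median > 2                      -> "rejected"
      - median <= 2 and 75th centile <= 2 -> "accepted"
      - median <= 2 and 75th centile > 2  -> "retained" (goes to next round)
    """
    med = median(scores)
    # Third quartile cut point; interpolation method is an assumption.
    p75 = quantiles(scores, n=4, method="inclusive")[2]
    if med > 2:
        return "rejected"
    if p75 <= 2:
        return "accepted"
    return "retained"


# Hypothetical score sets for an 11-member panel (not study data).
examples = {
    "clear agreement": [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3],
    "split panel": [1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3],
    "disagreement": [2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5],
}

for label, scores in examples.items():
    print(label, "->", classify_criterion(scores))
```

Under these assumptions, a criterion with a median of 2 and a 75th centile of 3 is classified as retained, which corresponds to the situation of the criteria that did not reach consensus.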

3 Results

The development process and the timeline of the TIME International Delphi validation study are presented in Fig. 1. All 11 panelists provided responses in all Delphi rounds. In total, three Delphi rounds were performed. The first round took place between June 26, 2019 and August 12, 2019. As a result of the first Delphi round, 93 criteria were accepted out of 153 criteria in the nationally validated TIME criteria set; three criteria were rejected. One criterion was modified to improve comprehensibility (Box 1). The remaining 57 criteria (39 TIME-to-STOP criteria and 18 TIME-to-START) that were neither rejected nor accepted were carried forward to the second Delphi round (Fig. 1), which took place between November 14, 2019 and January 6, 2020. In the second round, 34 criteria were accepted and four criteria were rejected. The remaining 19 criteria (nine TIME-to-STOP criteria and ten TIME-to-START) that were neither rejected nor accepted formed the basis for the third Delphi round. The third round took place between February 19, 2020 and March 30, 2020. In the third round, seven criteria were accepted, while none of the criteria were rejected. Hence, there were 12 criteria remaining, which were neither rejected nor accepted.
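The round-by-round counts reported above can be cross-checked with a small bookkeeping sketch in Python. The per-round figures are taken from the text; the function name `tally_rounds` and the data layout are hypothetical and serve only to make the arithmetic explicit.

```python
def tally_rounds(total, rounds):
    """Return (carried_forward_per_round, total_accepted, total_rejected).

    `rounds` is a list of (accepted, rejected) pairs, one per Delphi round.
    Criteria neither accepted nor rejected are carried to the next round.
    """
    remaining = total
    carried = []
    for accepted, rejected in rounds:
        remaining -= accepted + rejected
        carried.append(remaining)
    total_accepted = sum(a for a, _ in rounds)
    total_rejected = sum(r for _, r in rounds)
    return carried, total_accepted, total_rejected


# (accepted, rejected) in rounds 1-3, as reported in the Results.
reported_rounds = [(93, 3), (34, 4), (7, 0)]

carried, accepted, rejected = tally_rounds(153, reported_rounds)
print(carried)   # criteria carried forward after each round: [57, 19, 12]
print(accepted)  # 134
print(rejected)  # 7
```

The totals agree with the reported outcome of 134 accepted criteria, seven rejected criteria, and 12 criteria without consensus.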

Fig. 1 The development process of the internationally validated TIME criteria set. TIME Turkish Inappropriate Medication use in the Elderly

Box 1 Modified criteria (n = 1)

As a result, at the end of Delphi Round 3, 134 criteria were accepted and seven criteria were rejected (Box 2), and consensus could not be reached for 12 criteria (Box 3). The median value of the 12 criteria that were neither rejected nor accepted was 2, indicating an “agree” response, and their 75th centile value was 3, indicating a “neither agree nor disagree” response. The panelists who responded “neither agree nor disagree” commented that they had no clinical experience with these criteria, and so we concluded that the responses to these criteria would not change in subsequent rounds. As every round took about 2 months, we decided to stop the rounds at this stage with the consensus of the international panelist group. In doing so, we aimed to keep the criteria set as up to date as possible.

Box 2 Rejected criteria (n = 7)
Box 3 Criteria that did not achieve consensus and were therefore removed (n = 12)

The final list derived from the Delphi study presented here is given in Online Resource 1 and Online Resource 2, and the final list with a full list of references is given in Online Resource 3 and Online Resource 4 (see the electronic supplementary material).

4 Discussion

In this study, the fourth phase in the development of the TIME explicit medication screening tool was completed. An internationally validated TIME criteria set was obtained through a Delphi validation study involving 11 recognized experts who took part in the study from start to finish. The validated TIME list comprised 134 criteria (101 TIME-to-STOP and 33 TIME-to-START criteria).

In recent years, problems related to inappropriate prescribing in older adults have increased in prevalence as a public health issue. While helpful guides to appropriate prescribing have recently been developed for several specific diseases and medications, higher levels of evidence, such as controlled trial data, are still lacking for the use of many medications in older adults. Recognizing this shortfall, researchers have applied consensus techniques to develop a strong evidence base in areas where high levels of evidence are lacking. Combining expert opinions with evidence from the literature can be considered a good approach to creating a valid, useful tool [4]. Every country operates according to its own standards and approved medications, which makes the country-specific adaptation of explicit criteria necessary [4]. This has led to different tools for the assessment of inappropriate prescribing being developed and published. Some of these tools suggest that they can also address the needs of other regions because they integrate the most recent evidence and study results published internationally from a variety of countries. That said, tools that have been developed by local experts need to be validated internationally if they claim to address the needs of other regions. In the national development process of the TIME-to-STOP/TIME-to-START criteria set, the aim was for it also to be used to improve prescribing in older adults in other regions [12]. Hence, an international validation study was conducted to determine whether these criteria could be applied to address inappropriate drug use at an international level. The Delphi technique applied in the present study is a consensus technique that uses evidence-based literature as a basis, and that involves multiple questionnaire rounds with feedback provided to the panelists between rounds. 
The Delphi validation technique was also applied in the STOPP/START version 2 study [13], which was the model study for the development of the TIME criteria set. Typical aspects of the Delphi methodology are anonymity, iteration, and feedback, with conventional emphasis on the consensus of experts and the associated statistical evaluations implemented in the study. The validated TIME list was formed after three Delphi rounds, based on the responses of 11 panelists (ten from seven countries in Europe and one from Israel). By comparison, STOPP/START v2 was formed based on the contributions of 19 experts from 13 countries in Europe in the Delphi validation phase, with two Delphi rounds performed [13]. The 2019 Updated Beers criteria used a similar Delphi methodology involving 13 panelists from the American Geriatrics Society [18], while the GheOP3S tool from Belgium was validated by 11 panelists from various European countries [16]. A different approach was applied in the EURO-FORTA study [17]. Based on the FORTA2012 list, country/region-specific FORTA lists were developed in seven regions (the United Kingdom/Ireland, France, Poland, Italy, Spain, the Nordic countries, and the Netherlands) using the Delphi method in two rounds. Aside from the country/region-specific FORTA lists, an overarching EURO-FORTA list was also created. The aim was to include a minimum of four participating panelists from each region, and therefore 47 experts were included in total [17]. To the best of our knowledge, no other explicit country-specific tools have been validated internationally [4, 5, 19,20,21,22,23,24,25]. Thus, the TIME criteria set stands among the aforementioned internationally validated criteria.

The original TIME‑to‑STOP/TIME‑to‑START criteria set included 153 criteria [12]. Across the three Delphi rounds, seven criteria were rejected and 12 criteria were neither rejected nor accepted, and were therefore removed. As can be understood from the panelists’ comments in the survey, some criteria were not accepted because the panelists felt they were not sufficiently familiar with the situation or medication in the criterion in their respective clinical practices and personal experiences. Furthermore, the 12 criteria that were neither rejected nor accepted, and therefore failed to achieve consensus at the end of the Delphi process, all had a median Likert score of 2. We suggest, therefore, that the rejection of some of the criteria, or the lack of consensus, does not necessarily indicate that they were considered unacceptable. Furthermore, at the outset of the study, we planned to continue with the Delphi rounds until an accept or reject decision was made for all criteria. But after observing that panelists who indicated point “3” on the Likert scale (neither agree nor disagree) had commented that they lacked clinical experience related to those criteria, we concluded that their answers would not change in subsequent rounds, leading us to terminate the study at the end of the third round. Another factor in the decision to terminate the study after the third round was that we wanted the TIME criteria to be an up-to-date criteria set. As data extraction was performed before the Delphi rounds, and each round took a significant amount of time to be completed, we ended the Delphi rounds with the consensus of the international group.

Currently, a free software application based on the TIME-to-STOP/TIME-to-START criteria is in preparation, and is expected to be available in the near future. This resource will hopefully aid clinicians in their application of the TIME criteria and in their efforts to improve medication use in older adults in their clinical practices. To the best of our knowledge, among the existing tools, only FORTA [17] and Beers [18] include mobile applications. The SENATOR software noted in the literature is more detailed software incorporating further patient-related features in addition to those related to drug use [26]. Keeping a large number of criteria in mind can be difficult, and the potential for risks to be overlooked is high in clinical practice, making a mobile application particularly useful. That said, there has been no study to date investigating specifically whether mobile applications aid clinicians and researchers in this regard. Nevertheless, such an application would be expected to be helpful to users in daily practice. Consequently, we plan to carry out an online survey of physicians in the future to identify whether the application facilitates usage, and whether there are any perceived clinical benefits.

This study has several limitations related to how the Delphi process was conducted. First, we were compelled to terminate the study at the end of the third round, although we had originally planned to continue it until decisions on the acceptance or rejection of all criteria had been reached. Second, no face-to-face meetings were held during the Delphi rounds due to the restrictions associated with the COVID-19 pandemic, and such a meeting could have helped the panelists reach decisions on the final 12 unresolved criteria. That said, the anonymity of the process allowed the panel members to make independent decisions, which is generally the preferred approach in a Delphi process. Another possible criticism relates to the number of panelists and countries involved. Our results reflect a broad European view, but do not represent global consensus, as only 11 panelists from eight countries participated. However, the limited representation of the panel is a potential limitation common to almost all Delphi projects. Another potential criticism is that we did not include the explanations for the criteria, which had been provided in the nationally validated version and were intended to help clinicians in practice. This was a pragmatic decision, based on the potential difficulty of integrating these explanations into an online form, and the possible limited ability of the panelists to absorb and digest such data. Furthermore, similar explanations are not present in most of the explicit criteria sets. These explanations, however, are readily available to the reader, being present in the original study that was very recently published [12]. The research basis for each criterion was firmly ascertained before it was presented to the TIME international working group within the Delphi process. The roots of the TIME criteria were based on the STOPP/START and CRIME criteria sets, which are internationally recognized and widely used tools. 
Each proposed TIME criterion was backed up by systematic reviews, randomized controlled trials, and guidelines, and was approved by internationally recognized academicians from Turkey.

5 Conclusion

We have presented here an international application of the Delphi method to reach consensus on the recently proposed TIME criteria set, involving the collaboration of experienced international experts. The TIME criteria aim to optimize prescribing in older persons, with specific focus on the East European region. This validation study supports the claim that the TIME set can be regarded as a widened explicit tool for applications with older adults not only from the East European region, but also from the other regions included in this consensus study. Further studies are needed to assess whether use of the TIME criteria improves prescribing patterns and decreases adverse health-related outcomes resulting from the inappropriate use of medication in older adults. Such studies are currently underway in Turkey in different healthcare settings. This international project is another example of the importance of global cooperation between experts on the topic of polypharmacy and inappropriate medication use. Consequently, this article joins previous publications that have resulted from the fruitful collaborations of IGRIMUP members, including our position statement [3] and a special IGRIMUP collection of other deprescribing articles [2, 27,28,29,30,31].