An empirically sound telemedicine taxonomy – applying the CAFE methodology

Because the field of information systems (IS) research is vast and diverse, structuring it is a necessary precondition for any further analysis of artefacts. To structure research fields, taxonomies are a useful tool. Approaches aiming to develop sound taxonomies exist, but they do not focus on empirical development. We aimed to close this gap by providing the CAFE methodology, which is based on quantitative content analysis. Existing taxonomies are used to build a coding scheme, which is then validated on an IS project database. After the methodology has been described, it is applied to develop a telemedicine taxonomy. The CAFE methodology consists of four steps, including applicable methods. It helps in producing quantitative data for statistical analysis to empirically ground any newly developed taxonomy. By applying the methodology, a taxonomy for telemedicine is presented, including, e.g. application types, settings or the technology involved in telemedicine initiatives. Taxonomies can serve in identifying both components and outcomes to analyse. As such, our empirically sound methodology for deriving those is a contribution not only to evaluation research but also to the development of future successful telemedicine or other digital applications.


Introduction
As the process of digitisation increasingly affects the health care sector, innovative care concepts, often subsumed under the term 'digital health' (Otto et al. 2020), emerge. One major branch of these solutions is telemedicine, defined as the delivery of health care or health education over a geographical distance using information and communication technologies (ICTs) (Sood et al. 2007). It addresses current problems and future trends in the health care sector worldwide. For example, access to and efficiency of healthcare services can be increased by implementing telemedicine initiatives (World Health Organization (WHO) 2016; CSIRO Futures 2018). People living in rural areas or people with mobility restrictions have easier access to the specialists they need, even if the specialists are not located nearby (Kruse et al. 2017). Also, health care solutions can be tailored to individual patients (Holmen et al. 2017) and the points of care delivery can be extended beyond traditional locations to people's homes. This is possible due to the increasing availability of smart home technologies (Maeder and Williams 2017).
However, despite telemedicine receiving growing attention, currently available solutions are often still in their infancy and have not matured to reach regular care (Huang et al. 2017; Harst et al. 2019b). Barriers to the implementation of telemedicine applications are diverse, originating from a lack of technology acceptance by both providers and patients (Harst et al. 2019a), from insufficient technological or financial infrastructural conditions (Otto and Harst 2019), as well as from low-quality evidence. Telemedicine applications may be highly diverse, potentially encompassing prevention (Alcantara-Aragon et al. 2018), care delivery (Birns et al. 2013) and rehabilitation for different diseases (Anderson et al. 2017). With such various fields of application comes an equally wide range of patient groups to be targeted, health care professionals to be involved and technologies to base the application on. All these factors complicate the effectiveness evaluation of telemedicine applications (Bashshur et al. 2005; Ekeland et al. 2010). Therefore, the framework for digital health evaluation by the National Health Service (NHS) suggests differentiating digital health applications according to their aim (National Institute for Health and Care Excellence 2019). Only those applications aiming at delivering care directly to a patient (evidence tier 3 of the framework) need to prove clinical effectiveness, need to be accepted by patients and health care professionals, and need to fit with current processes of care delivery (National Institute for Health and Care Excellence 2019). However, to evaluate these factors, the wide array of telemedicine application types first needs to be structured according to stakeholders and patients involved as well as technologies used (Ekeland et al. 2012).
To foster a detailed understanding of different telemedicine application types, taxonomies are a suitable tool that is often applied in information systems (IS) research and that allows for systematising highly diverse fields of research (Bashshur et al. 2005; Gregor 2006; Usman et al. 2017). As such, they not only assist in evaluating existing telemedicine applications but also in planning future ones.
Several taxonomies for digital health and related concepts exist. However, there is currently no taxonomy available that offers insights into all the key aspects of especially telemedicine applications holistically or provides guidance during their development. Furthermore, current research on the development of taxonomies shows that, although a taxonomy development method by Nickerson et al. (2013) exists, most taxonomies are developed ad hoc, i.e. without following a stepwise and therefore reproducible method (Lösser et al. 2019). Both Nickerson as well as Lösser and colleagues additionally show that, if a method is applied at all, deductive (not followed by empirical validation) and intuitive (i.e. non-standardised) methods outweigh the inductive methods which are based on statistical analysis. At the same time, Nickerson and colleagues state that inductive methods are especially useful when it comes to structuring a research field without previous knowledge but vast amounts of data about it at hand (Nickerson et al. 2013). Databases such as the German database for telemedicine projects (vesta Informationsportal 2019) constitute the latter.
Given the overall goal of facilitating the evaluation of telemedicine applications and thereby fostering their implementation, the two key contributions of this paper are as follows: A. Method-Application: Development of an empirically grounded taxonomy of telemedicine projects. This should enhance the distinction and understanding of telemedicine applications and provide guidance for both evaluating existing and developing new applications. B. Method-Validation: Extension of an established method for taxonomy development by incorporating empirical steps into the development method of Nickerson et al. (2013).

Background
Taxonomies help in 'comparing and contrasting classes' (Gregor 2006) by providing 'a set of dimensions […,] each consisting of […] collectively exhaustive characteristics' (Nickerson et al. 2013, p. 340). The method for taxonomy development by Nickerson and colleagues aims to facilitate the development of useful taxonomies. It uses meta-characteristics as a conceptual kernel and allows for an iterative design by offering inductive (bottom-up) and deductive (top-down) design steps, both of which can be used depending on the research aim. The inductive design step starts from a subset of objects and determines their common characteristics (empirical-to-conceptual), whereas the deductive design step starts with a conceptualisation of potential taxonomy dimensions and aims to apply this conceptualisation to specific objects later on. As an ending condition for their design process, Nickerson et al. (2013) introduce usefulness of their end product (the newly developed taxonomy). However, their methodology remains abstract and therefore vague concerning the methods applicable in each design step and as such does not provide a distinct operationalisation of the empirical approach.
When developing an empirically sound taxonomy of telemedicine application types, we follow a design science research approach insofar as we first check for existing methodologies to taxonomy development, which are lacking empirical approaches (cf. above). Based on the identified gap, we provide the CAFE (Collect, Align/Fuse and Evaluate) methodology as a new empirical approach. By using our methodology to structure the field of telemedicine, we show the applicability of our methodology (Hevner et al. 2004).

Methods
The proposed CAFE methodology (Fig. 1) consists of four individual steps, which will be explained in detail in the following, along with sub-tasks and applicable methods. The aim is to produce quantitative data for statistical analysis to empirically ground any newly developed taxonomy.
Step 0: Focus specification
Nickerson and colleagues call for defining 'meta-characteristics' (Nickerson et al. 2013, p. 343) as a necessary precondition for the methodological process. Meta-characteristics need to be chosen according to what overarching research aim the taxonomy is to be used for. As such, defining meta-characteristics is needed in order to specify the focus of a taxonomy. Consensus meetings are a useful method to define such meta-characteristics (Donaldson et al. 2015).
Step 1: Taxonomy collection
Existing taxonomies from related research fields can be used as a basis for developing a taxonomy suitable for the specified focus. To find applicable taxonomies, a snowball approach can be applied, starting with a widely cited taxonomy in the field and further including taxonomies or related research items cited within the corresponding paper. This process should be terminated when theoretical saturation is achieved, i.e. when one paper keeps reappearing in the reference lists and, therefore, no new dimensions turn up within the taxonomies found (Wildemuth 2016).
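The snowball search with a saturation stop can be sketched roughly as follows. This is a minimal illustration and not the procedure used in this study: the function name `snowball`, the paper labels and the two dictionaries (`references` for reference lists, `dimensions` for the dimensions of each taxonomy paper) are hypothetical, and saturation is simplified to "one full round of citation-following adds no new dimension".

```python
def snowball(seed, references, dimensions):
    """Collect taxonomy papers by following citations from a seed paper.

    references: paper -> list of cited papers (hypothetical input)
    dimensions: taxonomy paper -> set of dimensions it proposes (hypothetical input)
    Stops when a whole round of newly found papers adds no new dimension.
    """
    collected = [seed]
    known_dims = set(dimensions[seed])
    frontier = [seed]
    while frontier:
        next_frontier = []
        new_dims = set()
        for paper in frontier:
            for cited in references.get(paper, []):
                # Skip papers already collected and papers without a taxonomy.
                if cited in collected or cited not in dimensions:
                    continue
                collected.append(cited)
                next_frontier.append(cited)
                new_dims |= set(dimensions[cited]) - known_dims
        known_dims |= new_dims
        if not new_dims:
            break  # saturation: this round contributed no new dimension
        frontier = next_frontier
    return collected, known_dims


# Hypothetical toy citation network.
references = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A"], "D": ["E"], "E": []}
dimensions = {
    "A": {"application type", "technology"},
    "B": {"technology", "setting"},
    "C": {"application type"},
    "D": {"setting"},
    "E": {"intended outcome"},
}
papers, dims = snowball("A", references, dimensions)
print(papers, sorted(dims))
```

In this toy network the search stops after paper D, so paper E and its additional dimension are never reached. This illustrates that a saturation rule of this kind is a heuristic and can end the search too early.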
Step 2: Taxonomy alignment and fusion
To fuse existing taxonomies into a standardised coding scheme for quantitative content analysis in step 3, they first need to be disassembled. Thereby, the characteristics covered by existing taxonomies become apparent. For grouping these characteristics into overarching categories, from here on called dimensions, qualitative meta-analysis is applied, which has already been used to reinterpret case study results (Berente et al. 2019). In a similar way, existing taxonomies are reinterpreted. Two researchers should familiarise themselves independently with the characteristics found in existing taxonomies. Afterwards, the characteristics are clustered thematically into overarching dimensions with the researchers working together on a mind-map. Any differences between the researchers are resolved in discussion until consensus is reached.
Fig. 1 The CAFE methodology
As suggested by Krippendorff, categories (i.e. dimensions) are defined by verbal designations, e.g. definitions taken from dictionaries or, where the dimensions are more difficult to describe, examples, which are called extensional lists (Krippendorff 2013). When some dimensions of the coding scheme encompass a broad set of plausible characteristics that cannot completely be determined a priori, free text coding is used. The resulting free text inputs are subjected to inductive qualitative content analysis after data collection.
The coding scheme's objectivity and reliability are evaluated by an iterative pre-test with at least ten of the objects described in the database chosen to be classified (Krippendorff 2013). At least two researchers should be involved as coders in this pre-test. Agreement among the coders can be quantified using Krippendorff's alpha, which is applicable to any number of coders. According to Krippendorff, the minimum threshold for accepting coding data is α = 0.67 (Krippendorff 2013).
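To make the reliability check more concrete, the following sketch computes Krippendorff's alpha for a single nominal variable from a coincidence matrix. It is an illustration under assumptions and not the computation used in this study: the function name, the input layout (one list of codes per object, `None` for a missing code) and the toy data are hypothetical, and a full coding scheme would require one such calculation per variable or a pooled calculation.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal codes.

    units: one sequence per coded object, holding the code of each coder
           (None marks a missing code). Works for any number of coders.
    """
    coincidences = Counter()
    for codes in units:
        values = [c for c in codes if c is not None]
        m = len(values)
        if m < 2:
            continue  # an object coded by fewer than two coders is uninformative
        for a, b in permutations(range(m), 2):
            coincidences[(values[a], values[b])] += 1 / (m - 1)
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    # Observed and expected disagreement (both left unnormalised by n).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category was used")
    return 1.0 - observed / expected


ACCEPTANCE_THRESHOLD = 0.67  # threshold named in the text (Krippendorff 2013)

# Hypothetical pre-test: two coders, ten objects, one dichotomous characteristic.
coder_1 = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_2 = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal(list(zip(coder_1, coder_2)))
print(round(alpha, 3), alpha >= ACCEPTANCE_THRESHOLD)  # prints 0.095 False
```

A value below the threshold, as in this toy example, would call for refining the category definitions and repeating the pre-test.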
Step 3: Evaluation
After aligning and consolidating the coding scheme, the artefact derived needs to be evaluated. This step requires quantitative content analysis to gain data for statistical analysis.

Quantitative content analysis
After the coding scheme has been pre-tested successfully on at least ten classified objects, the quantitative content analysis is extended to all objects listed within the database chosen. Coders are independently and randomly assigned to a feasible number of objects in pairs to lower any coding bias. Disagreement between coders is resolved through discussion, or, if this is not possible, through the vote of an independent third coder. All coding data is collected in an Excel spreadsheet, which we call the coding scheme.
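One possible way to implement the random pairwise assignment and the rule for resolving disagreement is sketched below. The function names, the balancing strategy (cycling through all coder pairs in shuffled order) and the example identifiers are assumptions made for illustration; the text only requires that pairs of coders are assigned randomly and independently.

```python
import random
from collections import Counter
from itertools import combinations


def assign_coder_pairs(objects, coders, seed=None):
    """Randomly assign every object to a pair of coders with a balanced workload."""
    rng = random.Random(seed)
    pairs = list(combinations(coders, 2))
    shuffled = list(objects)
    rng.shuffle(shuffled)
    assignment = {}
    for index, obj in enumerate(shuffled):
        position = index % len(pairs)
        if position == 0:
            rng.shuffle(pairs)  # new random order for each full pass over all pairs
        assignment[obj] = pairs[position]
    return assignment


def final_code(code_a, code_b, agreed_in_discussion=None, third_coder_vote=None):
    """Return the code to record: agreement, then discussion, then third coder."""
    if code_a == code_b:
        return code_a
    if agreed_in_discussion is not None:
        return agreed_in_discussion
    if third_coder_vote is not None:
        return third_coder_vote
    raise ValueError("disagreement is still unresolved")


# Hypothetical example: 110 projects and seven coders.
projects = [f"project_{i}" for i in range(1, 111)]
coders = ["C1", "C2", "C3", "C4", "C5", "C6", "C7"]
assignment = assign_coder_pairs(projects, coders, seed=42)
workload = Counter(coder for pair in assignment.values() for coder in pair)
print(assignment["project_1"], dict(workload))
```

Cycling through all pairs keeps the number of objects per coder roughly equal, which is one reading of "a feasible number of objects".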

Descriptive statistics and cluster analysis
The coding scheme yields data that can be analysed statistically. In a first step, a descriptive analysis is performed for all variables in the data set.
To ensure that the method produces clearly distinct groups, cluster analysis is used as a form of validation. Fiedler and colleagues have demonstrated the method to be feasible when aiming to build a taxonomy of IT structures in organisations, based, in their case, on survey data (Fiedler et al. 1996). Cluster analysis is a statistical procedure by which cases (or, in more general terms, objects) are subsumed into overarching groups based on their similarity. Similarity in cluster analysis is defined as proximity, i.e. whether cases in the scatterplot are positioned close to or far away from each other (Stuetzle and Nugent 2010). If a graphical representation of the data set allows for differentiating groups based on proximity and distance, a number of procedures can be applied to differentiate clusters. None of them, however, is completely objective (Everitt et al. 2011). When most of the variables in the coding scheme produced by quantitative content analysis are dichotomous, the single-linkage procedure is applied to form clusters. It picks the nearest neighbour of a case and makes those two the first cluster, before adding the next nearest neighbour to those two and so on (Stuetzle and Nugent 2010).
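The single-linkage procedure on dichotomous variables can be sketched in a few lines. This is a simplified illustration and not the analysis script of this study: the text does not name a distance measure, so the Hamming distance (the number of characteristics on which two cases differ) is an assumption, as are the function names and the toy cases.

```python
from itertools import combinations


def hamming(a, b):
    """Number of dichotomous characteristics on which two cases differ."""
    return sum(x != y for x, y in zip(a, b))


def single_linkage(cases, n_clusters):
    """Agglomerative clustering that always merges the two closest clusters.

    The distance between two clusters is the distance between their two
    nearest members (nearest-neighbour rule). Returns lists of case indices.
    """
    clusters = [[i] for i in range(len(cases))]
    while len(clusters) > n_clusters:
        best = None
        for (i, left), (j, right) in combinations(enumerate(clusters), 2):
            dist = min(hamming(cases[a], cases[b]) for a in left for b in right)
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Hypothetical dummy-coded cases (1 = characteristic present).
cases = [
    (1, 1, 1, 0, 0, 0),
    (1, 1, 0, 0, 0, 0),
    (0, 0, 0, 1, 1, 1),
    (0, 0, 0, 0, 1, 1),
]
print(single_linkage(cases, n_clusters=2))  # prints [[0, 1], [2, 3]]
```

The number of clusters is passed in explicitly here, which mirrors the point made above that choosing it is not fully objective.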

Results: A telemedicine taxonomy
After refining Nickerson et al.'s method for taxonomy development, we applied this method to develop an empirically grounded taxonomy of telemedicine projects. This helps to distinguish and understand telemedicine applications, and to guide evaluation and develop new applications.
Step 0: Focus specification
The addressed research problem can be described as follows: Although telemedicine applications have been known and studied extensively for over 50 years (Singh et al. 2002), they rarely cross the threshold into regular care, a phenomenon commonly referred to as 'pilotitis' (Huang et al. 2017). One reason is the lack of data from methodologically sound studies on the effectiveness of telemedicine (see, for example, Angeles et al. 2011). This is, in turn, due to insufficient knowledge of how predictors and outcomes (such as effectiveness) are linked. Therefore, our overarching aim is to determine effective components of telemedicine solutions. Consequently, our meta-characteristics are components of telemedicine solutions.
All objects within the analysis must fall under Sood and colleagues' definition of telemedicine, which is the delivery of health care or health education over a geographical distance using information and communication technologies (Sood et al. 2007).
Step 1: Taxonomy collection
Starting our snowball approach with a widely cited taxonomy in the field of digital health (Bashshur et al. 2011), we included 14 scientific papers for further analysis, each proposing a digital health taxonomy. Figure 2 shows the taxonomies we included, their authors, the respective methodology used to derive the taxonomies, as well as the dimensions included in each taxonomy.
Step 2: Taxonomy alignment and fusion
To align and fuse the existing taxonomies, qualitative meta-analysis was used to develop a standardised coding scheme. As components of telemedicine applications were our meta-characteristics, we followed a primarily deductive path, yet also developed some dimensions of our coding scheme from scratch, i.e. in an inductive manner. Berente and colleagues call this a semi-open process (Berente et al. 2019).
All in all, application types of telemedicine were considered in eight of the taxonomies included (Dierks 1999; Fitch 2004; Hung et al. 2004; Tulu et al. 2005; Fong et al. 2011; Poenaru and Poenaru 2013; Pearce et al. 2016). However, these taxonomies seldom provided associations to other characteristics, with the exception of Fitch's taxonomy from 2004 (Fitch 2004). Stakeholders, in terms of personnel involved, were included in three taxonomies (Elasy et al. 2001; Fitch 2004; Krumholz et al. 2006), while technological specifications such as mode of data provision, technology used or the setting of the application were covered by all but three taxonomies (Valentijn et al. 2015; Pearce et al. 2016; Baumel et al. 2017). Tulu and colleagues provided the only taxonomy where the clinical field, yet not the diagnosis, was taken into account (Tulu et al. 2005). Six taxonomies encompassed an intended outcome of telemedicine applications (Fitch 2004; Krumholz et al. 2006; Bashshur et al. 2011; Valentijn et al. 2015; Edmunds et al. 2017; Baumel et al. 2017), but none was disease-specific.
Even though each examined taxonomy individually lacked several important components, the taxonomies as a whole complemented each other well. Accordingly, the characteristics they included could be aligned to develop the coding scheme.
As current research stresses the importance of tailoring telemedicine applications to the individual end user (Kayser et al. 2017), we added further characteristics, namely demographics, to the dimension target population, as this dimension was underrepresented in existing taxonomies (see Fig. 2). While some of the taxonomies listed above found tele-health education to be a characteristic of the dimension application type, current research on the treatment of chronic diseases finds education to be an integral part of disease self-management (Whelton et al. 2018). As such, we merged these categories into one application type (digital self-management). Table 1 shows the final coding scheme to be used for quantitative content analysis to validate its applicability as a taxonomy. The scheme includes dimensions, which cannot be measured directly, i.e. latent variables, and characteristics of each dimension, which can be measured, i.e. manifest variables. For each manifest variable, a description is given to facilitate coding.
Step 3: Evaluation
The evaluation was undertaken in two steps: First, the quantitative content analysis was conducted using the GEMATIK database. This also included a pre-test of the coding scheme in order to test for inter-coder reliability. Second, a statistical analysis was conducted based on the data gleaned from the content analysis.
Fig. 2 Characteristics of existing telemedicine taxonomies
To ensure our coding scheme is apt to derive distinct groups, i.e. telemedicine application types, we applied it to all telemedicine projects listed within the German database provided online by the GEMATIK (vesta Informationsportal 2019). Each project within this database describes a certain digital health technology used and pilot-tested in a specific project. All in all, the GEMATIK database lists 156 projects from 2005 to 2012.

Quantitative content analysis
For the quantitative content analysis, projects within the GEMATIK database were selected for further investigation if they fulfilled all three criteria for telemedicine named by Sood and colleagues in 2007: (1) Health care or health education for the patient, (2) delivered over a distance, and (3) using information and communication technology (Sood et al. 2007). The analysis took place between February and June 2018 and followed the selection procedure proposed by colleagues in 2018 (Harst et al. 2018).
The coding scheme's objectivity and reliability were evaluated and enhanced by an iterative pre-test with ten of the projects included. Seven coders with varying coding experience coded all ten projects. In the first iteration, Krippendorff's alpha was 0.64, which is below the threshold for accepting coding data (α = 0.67) (Krippendorff 2013). After the first iteration, the coding scheme was supplemented with further definitions to clarify controversial terms such as mode of data provision and technology used. In the second and final iteration, an acceptable value for Krippendorff's alpha of 0.83 was achieved.
After the pre-test, the quantitative content analysis was extended to all telemedicine projects listed within the GEMATIK database to test the validity of the coding scheme as a taxonomy. The complete analysis was carried out by seven researchers, who were independently and randomly assigned to all projects in pairs.

Descriptive statistics and cluster analysis
In the first step of the statistical analysis, a calculation of descriptive values was performed for the following variables: (a) intended outcome, (b) target disease, (c) technology used, (d) application setting, (e) data provision and (f) medical specialists involved. A scatterplot of all coded projects is depicted in Fig. 3. It shows five clearly distinct groups of cases, pointing towards a five-cluster solution.
After dummy-coding of all multivariate variables, the cluster analysis based on the single-linkage procedure was run. Variables based on which to search for nearest neighbours were the application components and characteristics listed above (a to f). The procedure yielded five clusters, observable in the dendrogram depicted in Fig. 4.
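Dummy-coding turns each multi-category variable into one 0/1 column per characteristic, which is the input format a cluster analysis on dichotomous variables needs. The sketch below is illustrative only: the function name, the record layout (each variable holds a set of characteristics, since characteristics are not mutually exclusive) and the example values are assumptions and not taken from the actual coding data.

```python
def dummy_code(records, variables):
    """Expand multi-category variables into dichotomous (0/1) columns.

    records:   one dict per project; each variable maps to a set of characteristics
    variables: names of the variables to expand, in output order
    Returns the column labels and one 0/1 row per project.
    """
    columns = []
    for var in variables:
        levels = sorted({value for rec in records for value in rec.get(var, ())})
        columns.extend((var, level) for level in levels)
    rows = [
        [int(level in rec.get(var, ())) for var, level in columns]
        for rec in records
    ]
    return columns, rows


# Hypothetical coded projects.
records = [
    {"technology": {"smartphone", "web"}, "setting": {"home"}},
    {"technology": {"landline"}, "setting": {"hospital", "home"}},
]
columns, rows = dummy_code(records, ["technology", "setting"])
print(columns)
print(rows)  # prints [[0, 1, 1, 1, 0], [1, 0, 0, 1, 1]]
```

Each resulting row can then serve as one case in the nearest-neighbour search.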
Of the 110 projects studied, 82 could be successfully allocated to one cluster. The remainder fit into more than one cluster and were therefore excluded from the following analysis. Cross-tabulations were conducted to describe the clusters in detail, with the clusters as the dependent variable and the components and characteristics (a to f) of the projects analysed as predictors. Based on the application components and characteristics coded in the content analysis, we named the clusters as follows:
Cluster 1: Telemonitoring (n = 13)
As is the case for all clusters, the basic medical personnel involved in running the applications comprises clinicians and physicians. Cluster 1, however, includes nurses more often than others (n = 4) and also includes physical therapists (n = 1). It is predominantly smartphone- (n = 7) and/or web-based (n = 12) and mainly based on data provision in the form of texts and/or numbers (n = 12), which are primarily stored and then forwarded (n = 13). The application types subsumed under this cluster are mainly used in the context of cardiovascular diseases (n = 7), knee/hip endoprosthesis (n = 1) and stroke (n = 3). Consequently, when taking a closer look at the doctors involved, these are cardiologists (n = 4), internal (n = 2) and geriatric specialists (n = 1). The intended outcomes of the applications are mostly optimising care processes (n = 3) and clinical values of the patients treated (n = 3). Taken together, the reported characteristics allow for labelling this cluster 'telemonitoring'.
Cluster 2: Teleconsultation (n = 17)
No group of medical personnel is overrepresented in this cluster. The applications clustered here, however, are clearly hospital-based (n = 8) and use image- (n = 7) or real-time video-based (n = 7) data provision from patients to a wide array of doctors, such as diabetologists (n = 2), neurologists (n = 3), psychologists (n = 1) and emergency doctors (n = 1). A fittingly wide array of diagnoses is covered by the cluster, among them overweight (n = 1) and eating disorders (n = 1), as well as neurodegenerative diseases (n = 1) and rheumatism (n = 1). The intended outcomes of the applications are mostly optimising care processes (n = 3) and diagnosis (n = 6). Taken together, the reported characteristics allow for labelling this cluster 'teleconsultation'.

Cluster 3: Telediagnosis (n = 18)
Among the medical personnel subsumed in cluster 3, clinicians stand out (n = 14). As for the location of the application components, neither hospitals, rehabilitation facilities nor patient homes stand out. The underlying technology is mostly the landline (n = 8), and data provision is mostly text-based (n = 16), with a storing and forwarding of data (n = 14). A wide array of diagnoses is covered by the cluster, with cardiovascular diseases (n = 5), diabetes (n = 2) and stroke (n = 2) standing out. The range of doctors involved is equally wide, and this cluster also comprises speech therapists (n = 2) and oculists (n = 2). Enabling access to health care (n = 2) and early diagnosis (n = 2) are the primary intended outcomes of the application types subsumed. Especially the latter allows for labelling this cluster 'telediagnosis'.
Cluster 5: Digital self-management (n = 11)
In addition to the usual set of health personnel, nurses (n = 4) and dietitians (n = 1) are part of this cluster. The applications subsumed here are mostly home-based or portable (both n = 8), with no patterns concerning the mode of data delivery. Diabetes (n = 2), neurodegenerative diseases (n = 2) and stroke (n = 1) stand out, and consequently, so do diabetologists and neurologists. The primary outcome of the applications subsumed is self-management (n = 3), followed by improving patient-reported outcomes such as quality of life (n = 2). Therefore, the cluster is labelled 'tele-self-management'.

The empirically consolidated taxonomy
Using the characteristics and components of the applications studied as variables for a single-linkage cluster analysis proved to be feasible. The clusters differ from each other sufficiently to delimit telemedicine application types. The final taxonomy is depicted as a taxonomic tree in Fig. 5. It now comprises seven major components: application type, personnel involved, target population, setting, technology, data provision and intended outcome. Each dimension is broken down into several characteristics, which are not mutually exclusive.

Discussion
We provided the CAFE methodology for empirically deriving a telemedicine taxonomy (aim A). Given the fact that it produced distinct categories, i.e. phenotypes, of telemedicine applications, it is likely that the methodology can be used to structure the diverse fields of IS research. The centrepiece of the CAFE methodology is the quantitative content analysis used to derive taxonomies right from the set of objects they are supposed to structure. This is in line with the empirical-to-conceptual approach suggested by Nickerson et al. (2013), which we extended by adding a specific empirical method and thereby validated (aim B). However, as taxonomies are always a tool used to achieve a higher research aim (Szopinski et al. 2019), we strongly suggest defining meta-characteristics, including context factors (Esser and Goossens 2009; Greenhalgh et al. 2017), which go along with this aim beforehand. All in all, our approach is a mixture of theoretical and empirical approaches as proposed by Rich (1992).

Applicability of the proposed methodology
To the best of our knowledge, quantitative content analysis has not yet been used to derive taxonomies. This is somewhat surprising as one of the main purposes of quantitative content analysis is to identify and replicate patterns (Krippendorff 2013). The fact that the majority of the projects in the database could be unambiguously clustered underlines that point. Therefore, by applying quantitative content analysis, we validated whether the coding scheme can become a taxonomy. From a purely scientific standpoint, the process from a quantitative collection of existing taxonomies to a qualitative development of a coding scheme, whose applicability is then again validated with a quantitative design, constitutes a concurrent triangulation design, common in mixed methods research (Creamer 2017). As telemedicine application types could be unambiguously derived, the method demonstrated a certain rigor according to the Technical Risk & Efficacy evaluation in Design Science Research (Venable et al. 2016).
As Gregor states, '[r]esearch begins with a problem that is to be solved or some question of interest' (Gregor 2006, p. 619), and as such, a methodology has to be appropriate to solve the problem. As the overall aim of our research is to evaluate the effectiveness of telemedicine, our taxonomy, to be a useful tool to this end, has to be able to identify components of telemedicine applications to be tested for effectiveness. The newly developed taxonomy allows for defining target groups (Tchero et al. 2018) based on their diagnoses (Rush et al. 2018) and medical specialists (Lee et al. 2017) involved in a telemedicine project. As for defining components of the underlying technology, it incorporates basic devices as well as modes of data provision (Shen et al. 2018). It also encompasses the personnel involved, whose consideration has been deemed important by recent studies on the treatment of chronic diseases (Schwarz et al. 2018). Last but not least, the taxonomy names a set of outcomes to be achieved by telemedicine use, fulfilling an important prerequisite for evaluation (Wechsler et al. 2017). Therefore, our method also contributes to the development of telemedicine-specific core outcome sets (Smith et al. 2018). All in all, the newly developed telemedicine taxonomy can serve as a starting point for the evidence-based development and evaluation of telemedicine. This is especially important as generating evidence for the effectiveness of telemedicine is one major barrier to the scale-up (i.e. successful implementation) of telemedicine (Otto and Harst 2019).
As five telemedicine application types could be described in detail, the developed taxonomy can serve as a blueprint for developing such applications in the future. This predictive value of cluster analysis, i.e. our chosen statistical procedure, has also been proven useful for information systems use cases by Lee et al. (2004).

Limitations
Twenty-eight projects from the GEMATIK database (vesta Informationsportal 2019) could not be allocated successfully to a cluster because they fit into more than one. This emphasises, once more, the importance of step seven in Nickerson et al.'s (2013) framework for taxonomy development, asking whether the ending condition is met. The process cannot be terminated until all cases can be unambiguously assigned to a phenotype (Nickerson et al. 2013). Therefore, our method will have to be applied in at least one other database to make sure the basic coding scheme is indeed applicable. As such, our proposed methodology is in line with the design science research process as proposed by Hevner and colleagues, who clearly state that artefacts from information systems research have to meet the requirements of the environment in which they are supposed to be used (relevance cycle). If they do not, re-assessment and refinement of the artefact is essential. By conducting a pre-test of the coding scheme, we have already undertaken the design cycle of the design science research process once (Hevner et al. 2004).
Furthermore, the coding scheme was developed using qualitative research methods, which are subjective by nature. Bias was reduced by discussing the final scheme among the involved researchers and by a quantitative pre-test, involving Krippendorff's alpha measures. The analysis used the German GEMATIK telemedicine database (vesta Informationsportal 2019), which provides basic information on finished projects. Thus, the quality of the database relies on the reports of researchers uploading project-specific information. Over- and especially underreporting of relevant information may have had an influence on the analysis. To mitigate this effect, an additional hand search was conducted to identify project reports and other sources of information on the relevant projects. Finally, cluster analysis does not provide fully objective measures either for the adequate number of clusters to derive, or for their description. Future research should investigate whether novel approaches such as latent profile analysis, which are superior in this regard, can indeed also help to derive empirically sound taxonomies.

Conclusion
Evidence on the effectiveness and adequate tailoring of telemedicine devices, as well as on their successful implementation, remains sparse. One major step towards successful evaluation is defining components which characterise an application, along with outcomes to be achieved by its use. Especially standardisation of outcomes to be measured is fundamental in evaluation research beyond the use case of telemedicine. Taxonomies can serve in identifying both components and outcomes to analyse. As such, our empirically sound methodology for deriving those is a contribution not only to evaluation research but also to the development of future successful telemedicine or other digital applications. Furthermore, in adding a specific empirical method to Nickerson et al.'s (2013) approach, we validate its utility in developing taxonomies.
Authors' contributions LH conceived of the basic methodology of the presented research. All authors contributed to the data collection, LH and PT conducted the data analysis and structured the presentation on the results. LO, BW and HS developed the manuscript structure and provided text blocks, especially for the introduction and the discussion section. LH primarily wrote the manuscript. All authors provided critical feedback to the manuscript and approved it for publication.
Funding Open Access funding enabled and organized by Projekt DEAL. The work for this paper was partly funded by the European Social Fund and the Free State of Saxony (Grant no. 100310385).
Availability of data and materials Additional data on data collection and analysis can be requested from the corresponding author.

Declarations
Competing interests The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Research involving human participants and/or animals Research was based on text material and involved no human participants and/or animals, therefore no informed consent was necessary.
Ethical approval No ethical approval was needed for this work, as no data on participants were gathered.

Consent to participate
The study is solely based on secondary data so that no primary data acquisition and therefore no consent to participate was needed.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.