Systematizing the lexicon of platforms in information systems: a data-driven study

Abstract

While the Information Systems (IS) discipline has researched digital platforms extensively, the body of knowledge appertaining to platforms still appears fragmented and lacks conceptual consistency. Based on automated text mining and unsupervised machine learning, we collect, analyze, and interpret the IS discipline’s comprehensive research on platforms, comprising 11,049 papers spanning 44 years of research activity. From a cluster analysis concerning platform concepts’ semantically most similar words, we identify six research streams on platforms, each with its own platform terms. Based on interpreting the identified concepts vis-à-vis the extant research and considering a temporal perspective on the concepts’ application, we present a lexicon of platform concepts to guide further research on platforms in the IS discipline. Researchers and managers can build on our results to position their work appropriately, applying a specific theoretical perspective on platforms in isolation or combining multiple perspectives to study platform phenomena at a more abstract level.

Introduction

Over nearly 50 years, the Internet has enabled the emergence of an ever-increasing variety of platforms, while the Internet itself evolved from a digital information platform to a digital communications platform. Digital platforms are defined as a set of digital resources enabling interactions between actors while creating value (Boudreau & Hagiu, 2009; Constantinides et al., 2018; Parker et al., 2017). They manifest as boundary objects (Star, 2010; Star & Griesemer, 1989) that enable organizations and individuals to network with each other, and through which business models and markets can be fundamentally disrupted (Iansiti & Lakhani, 2014).

Information Systems (IS) research takes two major perspectives on phenomena related to platforms—especially digital platforms—a technical/engineering perspective and an economic perspective (Gawer, 2014). From a technical perspective, platforms are built on an “extensible codebase of a software-based system that provides core functionality shared by the modules that interoperate with it and the interfaces through which they interoperate” (Tiwana et al., 2010, p. 676). In this perspective, IS scholars investigate the design (e.g., Spagnoletti et al., 2015), (third-party) development (e.g., Ghazawneh & Henfridsson, 2013), and architecture (e.g., Baldwin & Woodard, 2009) of platforms. Adopting an economic perspective, researchers investigate the economic effects and mechanisms of value creation on two- or multi-sided markets (McIntyre & Srinivasan, 2017), define platform launch strategies (Kazan & Damsgaard, 2016; Stummer et al., 2018), and investigate the evolution of platform ecosystems (Asadullah et al., 2018; Ozer & Anderson, 2015; Tiwana et al., 2010) or of online communities (Butler et al., 2014).

The platform terms referred to in different research streams are often neither identical nor even compatible. On the one hand, IS scholars use different terms to refer to the same concept (synonyms: e.g., internet platform and online platform), while, on the other hand, they use the same terms to refer to different concepts (homonyms: e.g., service platform in the domain of service science refers to an environment enabling actors to interact and thereby co-create value, while in computer science it refers to a platform providing IT services). Due to this lexical fragmentation, researchers and practitioners struggle to consistently identify and define the characteristics of platforms and specific platform types (Sørensen et al., 2015). We propose using the metaphor of a rhizome to describe the current configuration of platform terms. A rhizome—a philosophical concept established by Deleuze and Guattari (1979)—resembles a botanic rhizome and describes a system in which different nodes are connected by various edges to form a network without hierarchies—opposing tree structures that use hierarchies.

In a rhizome, every node can and must be connected to every other node, establishing a multiplicity of each node, and allowing investigators to browse through a rhizome using a multitude of entry and exit points (Deleuze & Guattari, 1979). A rhizome can break at any point and still keep growing (Deleuze & Guattari, 1979). In IS, a rhizome has been used sporadically as a metaphor for diverse concepts, including networks (Atkinson & Brooks, 2005; Gachet & Brézillon, 2005), decision processes (Humphreys, 2021; Nolas, 2008), assemblage (Leclercq-Vandelannoitte et al., 2014; Sesay et al., 2016), discourse (Iivari et al., 2017), and transformation (Márton, 2021). We use the rhizome metaphor to describe the interplay of different research approaches on digital platforms as a complex phenomenon that is subject to lexical fragmentation and lacking distinguishable hierarchies. Still, different subtypes of digital platforms might have similar characteristics, constituting a stream of literature that yields a multitude of entry and exit points.

Even though earlier attempts have been made to structure research on the rhizomatic nature of platforms, they had three crucial and recurring shortcomings that have prevented establishing a holistic perspective. First, most related research adopted a reductionist perspective—emphasizing, for example, a technical over an economic perspective—while only a few papers took an integrative approach (e.g., Asadullah et al., 2018; Gawer, 2009; Spagnoletti et al., 2015). However, a rhizome cannot be reduced to its parts, as it represents the multiplicity of the whole and the interconnectedness of its parts (Deleuze & Guattari, 1979). Second, previous research applied specific theoretical lenses to study platforms, e.g., the sharing economy (Sutherland & Jarrahi, 2018) or online communities (Spagnoletti et al., 2015) but without integrating their respective knowledge. Concerning the rhizomatic nature of digital platforms, current research focuses on parts of the rhizome, which again conflicts with the multiplicity and interconnectedness of all nodes. Third, previous research focused on reviewing specific types of platforms (e.g., IoT platforms, see Hein et al., 2018, or multi-sided platforms, see Boudreau and Hagiu, 2009), while not systematizing a consistent lexicon of different types of platforms. Current attempts that systematize IS research on platforms favor a particular research stream’s body of knowledge while neglecting other streams, which means that the work performed by others remains unacknowledged (vom Brocke et al., 2015).

We take up the call from de Reuver et al. (2018) for “developing a typology expressing the variety of digital platforms” (de Reuver et al., 2018, p. 133). We extend this call for research to all platform terms investigated in IS. It has been argued that developing a consistent lexicon (Habermas, 2014)—constituting the structure and definitions of platform terms in IS—constitutes an important step towards consolidating concepts and theory in any field of scientific inquiry (Berente et al., 2019), including platform research in IS. Thus, we attempt to disentangle the platform rhizome, based on reducing its complexity through decomposing the IS platform literature, to identify a structure that is modular, yet connected. Modularity refers to breaking down a system into discrete pieces (i.e., modules) that communicate with each other only through standardized interfaces (Langlois, 2002), thereby decomposing a system into fine-grained, interacting subsystems that can themselves be subject to decomposition. Simon (1962) introduced the notion of (near) decomposability and viewed hierarchy as a prominent organizing principle of nature. Building on this point of view, we posit that decomposition can be a helpful strategy to analyze rhizomatic structures, too. To assemble our home discipline’s knowledge (Tarafdar & Davison, 2018) and provide a forward-looking lexicon of platform research, we formulate our research question as follows: What concepts appertaining to platforms are investigated in the IS discipline, and how do these concepts relate to each other to constitute a decomposed, forward-looking lexicon of platforms for IS research?

We apply an inductive and data-driven research approach, inspired by other papers that restructured a different research field (e.g., Antons & Breidbach, 2018; Sakata et al., 2013), to collect, analyze, and interpret the current lexicon on platforms in IS research. In a mission to develop “computationally intensive theory” (Berente et al., 2019, p. 51), we adapt the research process outlined by Müller et al. (2016). Hence, we first collect 11,049 peer-reviewed papers, representing almost 95% of the papers we identified (see footnote 1), from leading IS journals and conference proceedings. Second, we apply unsupervised machine learning methods to analyze these vast amounts of unstructured data and identify the 26 most influential platform terms along with each term’s semantically most similar words (SSW). We then cluster the identified platform terms hierarchically, visualizing the clusters with a dendrogram. Third, we discuss the results to interpret and systematize the implications of our data-driven findings. Finally, we consolidate our findings, presenting a decomposed, forward-looking lexicon of platform concepts that can be used individually or can be combined to analyze digital platforms in IS.

From a theoretical perspective, this paper contributes unique data-driven findings, elucidating the lexicon of terms that constitutes platform research in IS. Our study is the first to collect and analyze our discipline’s entire body of knowledge on platforms, covering more than 44 years of academic inquiry, dating from 1975 to 2019. Based on an inductive, data-driven approach, we identify, quantify, and systematize the most influential platform terms. Thereby, we complement non-empirical approaches that have structured the field conceptually (e.g., Fu et al., 2018; Schreieck et al., 2016; Tiwana et al., 2010) with data-driven insights resulting in a decomposed model and a lexicon of platform terms. Our results enable other researchers to comprehend the rhizomatic nature of digital platforms by using the extended lexicon of research on platforms and to position their insights vis-à-vis previous theory. Our findings provide a consistent conceptual frame of reference for research on digital platforms that systematizes and consolidates related research focusing on particular concepts and theories in this area. From a managerial perspective, our results outline how platform technologies interplay with economic effects and online communities, enabling practitioners to better understand and design strategies for value creation with platforms.

The remainder of the paper unfolds as follows. In Section 2, we justify and describe our research method. In Section 3, we report our data and their analysis in detail to make our approach transparent and replicable. In Section 4, we code, interpret, and discuss our findings against the backdrop of related research. In Section 5, we propose a decomposition of platform terms and a lexicon to systematize our discipline’s rhizomatic research on platforms. As customary in many inductive studies, the presentation and discussion of related work follow the reporting of our data (Müller et al., 2016). Section 6 concludes the paper and outlines prospects for future research.

Research method

Performing a data-driven study enables us to analyze the entire body of knowledge on platforms in IS. In doing so, we build our research endeavor upon “the idea that research can start with data or data-driven discoveries, rather than with theory” (Müller et al., 2016, p. 291). Several researchers already applied this strategy in service science (Antons & Breidbach, 2018), medicine (Churilov et al., 2005), financial analysis (Sung et al., 1999), and even fishery (Syed & Weber, 2018), to name just a few. We instantiate our data-driven approach to acquire interpretable results, instead of reaching the highest accuracy possible (Müller et al., 2016). After analyzing the data, we interpret our findings in a theory-driven discussion. Our research, therefore, covers all three phases of data-driven studies: Data collection, data analysis, and result interpretation (Müller et al., 2016).

Several methods can be used to identify platform concepts and, thereby, to systematize the lexicon of platforms, from analyzing literature, including citation and co-citation analysis (Osareh, 1996), to manual literature reviews (Webster & Watson, 2002), or text mining approaches. We opted for the latter in the light of the following advantages it entails for our endeavor. Text mining provides several methods and techniques to generate interpretable semantic representations from textual data (Miner et al., 2012). First, as a sub-field of text mining, Natural Language Processing (NLP) provides methods for automatically processing vast amounts of unstructured textual data (Miner et al., 2012), enabling the extraction of new knowledge from the data, while being scalable and reliable (W. Fan et al., 2006; Frawley et al., 1992). Its ability to analyze the entire body of knowledge published on platforms enables us to systematize the lexicon of platform research, covering more literature than researchers can read and analyze manually in a reasonable period of time (Debortoli et al., 2016). Second, NLP does not require making a-priori decisions on including or excluding papers from the analysis, eliminating the selection bias that is inherent in manual literature reviews (Indulska et al., 2012).

Figure 1 displays the six steps of data collection (step 1) and data analysis (steps 2-6). When compiling our data set (step 1), we had to make three decisions. First, as the lexicon of platforms differs for each scientific domain (e.g., platforms in engineering; Simpson, 2004), we decided to emphasize conceptual clarity over completeness, by solely analyzing platform research in IS. Second, consistent with our first decision, we again emphasized conceptual clarity over generating a bigger data set, focusing on peer-reviewed and high-quality papers. Therefore, the search process was limited to journals and conferences that had been ranked B or higher in the VHB JOURQUAL3 ranking, sub-discipline business information systems (Hennig-Thurau et al., 2004), the most prominent ranking of scientific outlets for IS research in the German-speaking community. Across all journals and conference proceedings included in the analysis, the number of papers that contain the term ‘platform’ at least once reveals that the concept has been used broadly in our discipline (Table 1). Third, we did not restrict the year range and included publications from the entire research history on platforms in IS in order to present a full picture of historic and recent understandings of platforms. As seen in Figure 2, platforms are a phenomenon that is of increasing interest to the IS discipline. The whole period, from 1975 to 2019, saw no historic disruption of the foundational concepts of platforms. In total, we identified 11,646 papers, of which we were able to access and analyze almost 95% (11,049 papers).

Fig. 1 Computational steps for identifying and quantifying platform terms in IS

Table 1 Distribution of the term ‘platform’ across selected IS journals and conference proceedings

Fig. 2 Number of papers considering platforms, categorized per year

Figure 2 shows that platforms are receiving an increasing amount of attention in the IS discipline. The number of papers on platforms has roughly doubled every seven years, especially in the last two decades.

We implement a bottom-up coding approach (Urquhart, 2013) to identify platform terms without introducing the bias usually caused by building on extant theory (step 2). Instead, we put aside theory in favor of identifying data-driven results with NLP. First, we tokenized the text (Manning & Schütze, 1999) to identify individual words from the papers contained in the data set. Second, we searched for any occurrences of the word ‘platform’ and extracted them along with the three tokens occurring before the term ‘platform’. In order to leave out terms that do not convey semantic value, we defined function words (stopwords) and excluded them from the results (Gupta & Lehal, 2009; Munková et al., 2013; Pennebaker et al., 2003).

Trimming the results of function words and punctuation resulted in identifying the pure platform terms referred to in a paper. However, we took great care not to remove function words before the full platform terms were extracted, to avoid distorting their meaning. Third, we unified the lexical forms of the platform terms we extracted (Vijayarani et al., 2015) by lemmatization (i.e., restoring the word to its dictionary form; Manning & Schütze, 1999). We did not lemmatize the whole phrase, however, because there is no research available upon which we can build to define all lexical forms of the platform terms. For instance, it is debatable whether a ‘collaboration platform’ is the same as a ‘collaborative platform,’ both of which would have led to the same word stem with lemmatization. The second step resulted in identifying the number of papers that mention each platform term we extracted, before highlighting those terms that are used most frequently in the IS discipline.
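The extraction in step 2 can be illustrated with a minimal sketch. It is not the implementation used in this study: the stopword list, the regular-expression tokenizer, and the trimming rule (keep the unbroken run of content words directly before ‘platform’, looking back at most three tokens) are assumptions made for illustration, and only the head noun is normalized (‘platforms’ to ‘platform’), mirroring the decision not to lemmatize the whole phrase.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption; the study's own list is not reproduced here).
STOPWORDS = {"a", "an", "the", "of", "on", "in", "for", "and", "or", "to",
             "this", "that", "as", "is", "are", "with", "by", "their", "its"}

# Words (optionally hyphenated) or single non-letter characters such as punctuation.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z\-]*|[^\sA-Za-z]")


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation tokens."""
    return [tok.lower() for tok in TOKEN_RE.findall(text)]


def extract_platform_terms(text, window=3):
    """Return platform terms such as 'mobile platform' found in one text."""
    tokens = tokenize(text)
    terms = []
    for i, tok in enumerate(tokens):
        if tok not in ("platform", "platforms"):
            continue
        modifiers = []
        # Walk backwards over at most `window` tokens; stop at the first
        # function word or punctuation mark (assumed trimming rule).
        for prev in reversed(tokens[max(0, i - window):i]):
            if prev in STOPWORDS or not prev[0].isalpha():
                break
            modifiers.insert(0, prev)
        if modifiers:
            # Normalize only the head noun to its singular form.
            terms.append(" ".join(modifiers + ["platform"]))
    return terms


def paper_counts(papers):
    """Count, for each platform term, the number of papers mentioning it."""
    counts = Counter()
    for text in papers:
        counts.update(set(extract_platform_terms(text)))
    return counts
```

Calling `paper_counts` on a list of paper texts yields the kind of per-term paper frequencies that a threshold (such as the 150 papers used later) can be applied to.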

In order to prepare our dataset for analysis, we performed additional pre-processing steps (step 3). First, we analyzed the whole text of the papers regarding the context in which each platform term occurred. We then concatenated the parts of each platform term using underscores, which allowed us to handle them as one word. We also lemmatized and tokenized the remaining words so that the subsequent steps would receive clean input. We then used the resulting data as an input for machine-learning approaches that analyze latent structures and dependencies in our data set. To gain a deeper understanding of the terms, we trained and applied a Word Embedding Model (WEM) to each of the platform terms. WEMs are vector representations of words, learned by a neural network, that describe the words’ semantic similarity (Mikolov et al., 2013). This approach assumes that a word’s meaning is similar to the meaning of the words occurring in its immediate proximity. For our analysis, we applied the word2vec model (Goldberg & Levy, 2014) which, as an implementation of the concept proposed by Mikolov et al. (2013), has been shown to provide high-quality results (Lilleberg et al., 2015; Thomas & Azhuvath, 2018).
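As an illustration of the concatenation in step 3, the following sketch merges known platform terms into single underscore-joined tokens and splits the text into token lists. The function names, the crude plural handling (a trailing ‘s’ is dropped), and the rough sentence splitter are assumptions for this example rather than the procedure actually used.

```python
import re


def merge_platform_terms(text, platform_terms):
    """Replace each known platform term with one underscore-joined token."""
    # Longest terms first, so 'social networking platform' wins over
    # 'networking platform' when both could match at overlapping positions.
    ordered = sorted(platform_terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")s?\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: m.group(1).lower().replace(" ", "_"), text)


def to_sentences(text):
    """Rough sentence and word splitting (assumption: sufficient for a sketch)."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [re.findall(r"[a-z_][a-z_\-]*", s.lower()) for s in sentences if s.strip()]


terms = ["mobile platform", "networking platform", "social networking platform"]
merged = merge_platform_terms(
    "Mobile platforms and the social networking platform differ.", terms
)
corpus = to_sentences(merged)
```

Lists of tokens per sentence, as in `corpus`, are the usual input format for word2vec implementations (for example, the `Word2Vec` class in gensim). Which implementation and hyperparameters were used here is not stated, so none are assumed in the sketch.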

Based on the words occurring in their proximity, we calculated the similarity vector for each platform term (step 4), again applying the word2vec model. From this analysis, we retrieved the semantically similar words (SSW) for each platform term. The SSW enabled us to elucidate the semantic meaning of each platform term in more detail. Since the model only provided us with the similarity vectors of two words, we had to take the edited platform terms that were merged in step 3. We then ranked the words in descending order of their similarity values, to identify the most similar words. Next, we excluded all function words, leaving us with up to 100 SSW per platform term. At this point, we also selected the platform terms to be analyzed further, since the next step would take all remaining SSW that were identified across all platform terms as an input. We limited our analysis to terms that occurred in at least 150 of the identified 11,049 papers (see footnote 2). We defined this threshold based on running the analysis several times, using different thresholds and inspecting the resulting SSW for validity and interpretability.
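The ranking in step 4 can be sketched as follows, assuming that similarity is measured as cosine similarity between embedding vectors. The toy two-dimensional vectors and the function names are hypothetical; in practice the vectors would come from the trained word embedding model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_words(term, vectors, stopwords=frozenset(), top_n=100):
    """Rank all other words by similarity to `term`, keep the top_n,
    then drop function words, which leaves *up to* top_n words."""
    target = vectors[term]
    scored = [
        (word, cosine_similarity(target, vec))
        for word, vec in vectors.items()
        if word != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(w, s) for w, s in scored[:top_n] if w not in stopwords]


# Toy vectors (hypothetical values, for illustration only).
toy_vectors = {
    "software_platform": [1.0, 0.0],
    "api": [0.9, 0.1],
    "market": [0.0, 1.0],
    "the": [1.0, 0.0],
}
ssw = most_similar_words("software_platform", toy_vectors, stopwords={"the"}, top_n=3)
```

Because function words are removed after the cut-off, a term can end up with fewer than `top_n` similar words, which matches the "up to 100" wording above.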

Since the SSW of the remaining terms might still have overlapped semantically, we performed k-means clustering for all relevant SSW (step 5)—one of the most popular, simple, and efficient algorithms used for clustering. As k-means clustering requires a numerical input, we again used a WEM to compile a semantic representation of the identified terms. This time, though, we used a pre-trained model, striving for a general semantic orientation of the words, and not for an orientation that is specific to the context of IS publications. We employed a publicly available model that has been trained on 100 billion words from Google News (Google Inc, 2013). The pre-trained model provided a numerical vector value, which we used as an input for k-means clustering, aimed at reducing the number of words to a cluster size of 10% of the words, so as to not overfit the results. The result of this step was a list of clusters for each platform term.
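To make step 5 more concrete, the sketch below implements plain k-means (Lloyd's algorithm) using only the standard library. Two points are assumptions rather than statements about this study: the 10% rule is read as setting the number of clusters to 10% of the number of words, and the `word_vectors` dictionary stands in for vectors looked up in a pre-trained embedding model.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm; returns one cluster label per input point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster runs empty
                centroids[c] = mean_vector(members)
    return labels


def cluster_words(word_vectors, share=0.10, seed=0):
    """Cluster words with k set to `share` of the vocabulary size (assumed reading)."""
    words = sorted(word_vectors)
    k = max(1, round(share * len(words)))
    labels = kmeans([word_vectors[w] for w in words], k, seed=seed)
    return dict(zip(words, labels))
```

The returned mapping from word to cluster label is what the next step needs in order to count, per platform term, how often each cluster occurs among the term's SSW.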

Completing the lexical framing, we clustered the identified platform terms hierarchically and visualized the clusters as a dendrogram (step 6). Hierarchical clustering is a two-stage process consisting of vector distance calculation and cluster calculation. In our case, each vector represented one platform term and contained the identified clusters of their SSW as dimensions. Thus, the value of each dimension equaled the number of occurrences of one cluster for a particular platform term. We then adopted average linkage (Almeida et al., 2007) to compute hierarchical clusters that can be visualized in dendrograms. A dendrogram is a graph consisting of edges and vertices where, in our case, each vertex represents a platform term or one cluster of platform terms. Our dendrogram reflects the rhizomatic nature of platform concepts but reduces the complexity inherent to a rhizome by identifying different lenses that can serve as entry points to the rhizome.
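The two stages of step 6 (distance calculation and cluster calculation with average linkage) can be sketched in a few lines. The distance measure is not named above, so cosine distance is assumed here; the term vectors are toy counts of SSW clusters, and the optional `max_distance` cut-off is a hypothetical parameter that mimics reading clusters off a dendrogram at a chosen height.

```python
import math


def cosine_distance(u, v):
    """1 minus cosine similarity (assumed distance measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def average_linkage(term_vectors, max_distance=None):
    """Agglomerative clustering with average linkage.

    Returns (clusters, merges); `merges` records which clusters were joined
    at which distance, i.e. the information a dendrogram visualizes."""
    clusters = [[term] for term in sorted(term_vectors)]
    merges = []

    def linkage(a, b):
        # Average of all pairwise distances between members of a and b.
        total = sum(
            cosine_distance(term_vectors[x], term_vectors[y]) for x in a for y in b
        )
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if max_distance is not None and dist > max_distance:
            break  # remaining clusters are further apart than the cut-off
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


# Toy vectors: each dimension counts how often one SSW cluster occurs
# for the platform term (hypothetical values).
toy_terms = {
    "technology platform": [5, 4, 0, 0],
    "IT platform": [5, 5, 0, 0],
    "sharing platform": [0, 0, 6, 1],
    "social platform": [0, 1, 2, 6],
}
clusters, merges = average_linkage(toy_terms, max_distance=0.7)
```

With these toy values, the cut-off leaves two clusters: the two technology-oriented terms merge first at a very small distance, and the two remaining terms merge second. For real data, library routines such as `scipy.cluster.hierarchy.linkage` with `method="average"` together with `scipy.cluster.hierarchy.dendrogram` offer the same technique with plotting support.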

Having completed all data analysis activities, we proceeded with interpreting the results (Müller et al., 2016) in our quest to develop a more consistent lexicon of platform terms. First, we revisited authoritative definitions of these platform terms by identifying the most cited papers defining the terms. Second, we updated the definitions to sharpen their profile as analytical lenses that can be applied to research different aspects of digital platforms. Further research can build on this lexicon by either using one of these lenses to guide their research or by combining different lenses purposefully to investigate the complex interplay of different aspects in a specific platform. For instance, our lexicon provides clear terminology with which others can analyze Airbnb as a digital platform that also establishes a two-sided market, identifying both aspects and their interplay clearly.

Data analysis

Analyzing the resulting data set of 11,049 papers, we identified roughly 35,000 platform terms, with papers ranging from one to over 600 total platform occurrences (Gal-Or et al., 2018). In total, we identified more than 297 unique platform terms that occurred in at least ten different papers. The most frequently used platform terms are listed in Table 2, ranked in descending order of the number of papers in which they occur. A historical analysis showing the number of papers published per year for each platform term since 1975 is available as an online appendix.

Table 2 Platform terms used most frequently in the IS literature (bold line shows a threshold of 150 occurrences)

During the third and fourth steps, we trained a WEM for each of the platform terms listed in Table 2. The SSW of the platform terms, as the output of these models, consist of (1) the word in its lemmatized form, and (2) a similarity value. The similarity value is based on the cosine similarity between both terms, where ‘1’ displays full semantic similarity, and a value close to zero indicates low to no semantic similarity. An excerpt of the SSW for the term ‘software platform’ is presented in Table 3, sorted by their similarity in descending order (the SSW for each platform term are available as an online appendix).

Table 3 Semantically most similar words, exemplified for ‘software platform’

As described previously, we set the threshold for the minimum number of papers in which a specific platform term occurs to 150 (up to ‘cloud platform’, Table 2). The remaining 26 platform terms are further investigated in the fifth and sixth steps of our research process. Finally, this part of the data analysis resulted in a dendrogram (Figure 3) which depicts the platform terms and their hierarchical clustering based on their SSW, with the scale on top indicating the distance value between each pair of vertices.

Fig. 3 Dendrogram, visualizing distances among the most common platform terms in IS research

Interpretation and discussion

Our aim is to develop a decomposition and systematize a consistent lexicon of platform terms to guide future IS research. Building on our data, we now interpret and discuss the clustering of the identified terms (Berente et al., 2019). We started this process by interpreting terms that exhibit a distance of 0.6 to 0.7 to each other (Fig. 3), leading us to identify six clusters in total. Identifying six clusters was a normative decision. Fewer clusters would have required combining clusters that appeared to be different (e.g., ‘social platform’ and ‘computing platform’), while more clusters would have obliterated the similarities between them (e.g., ‘social platform’ and ‘social networking platform’). Our six clusters are (1) abstract technology views on platforms, (2) specific views on hardware and software platforms, (3) social communities and online platforms, (4) economic platforms as digital markets, (5) general properties of platforms as IS artifacts for value co-creation, and (6) sharing platforms. In the following, we interpret and discuss the terms from each of the six clusters, focusing on the terms’ position in the dendrogram (Fig. 3), their SSWs (presented in the online appendix), seminal definitions in the IS literature (presented in the online appendix), and their use over time. Figure 4 visualizes the number of papers published per year for every relevant platform term identified, structured by the six clusters (top left to bottom right).

Fig. 4 Historical development of the platform terms’ utilization, reported per cluster. Each diagram shows the number of papers containing a platform term published per year, for the years 1985 to 2019 (as nearly no papers were published in earlier years, cf. Figure 2)

Abstract technology views on platforms

The first cluster comprises the terms ‘technology platform,’ ‘IT platform,’ ‘technological platform,’ ‘technical platform,’ ‘ecommerce platform,’ and ‘common platform,’ all of which refer to technical aspects of platforms. ‘Technology platform’ is a superordinate/collective concept that describes platforms from a technical perspective on an abstract level and its utilization has been increasing in popularity among IS researchers since its inception. A ‘technology platform’ refers to various contexts related to IT and IS (e.g., Njenga & Brown, 2012; Purao et al., 2018) and is defined as “a set of technologies that have been developed for various applications but share a common underlying basic concept” (L.-S. Fan et al., 2015, p. 2). Thus, the term refers to “a set of design elements and interfaces that make up a technology” (Kraemer & Dedrick, 2002, p. 9). The SSWs underline this view since they primarily consist of terms associated with the technical implementation of platforms (e.g., ‘functionality,’ ‘application,’ ‘device,’ ‘API,’ ‘middleware,’ ‘interoperable,’ ‘backend,’ ‘modular,’ ‘interface’). Another term frequently used in the IS literature is ‘IT platform.’ Both ‘technology platform’ and ‘IT platform’ exhibit the lowest distance between all terms, since they share a substantial set of their SSWs. Unsurprisingly, ‘IT platforms’ are described as “the extensible codebase of a software-based system that provides core functionality shared by the applications that interoperate with it and the interfaces through which they interoperate” (Tiwana et al., 2010, p. 676). The term refers to the technical structure of platforms that features a layered modular architecture. 

Due to the overlap of definitions, the similarity of their meaning, and the inherent focus of the IS discipline on information technology (which is more specific than the term ‘technology’), we propose to abandon the term ‘technology platform’ in favor of ‘information technology platform’, to indicate that IS research concerning platforms invariably focuses on information technology.

‘Technological platforms’ refer to programming languages, frameworks, etc. used in Computer Science (e.g., Blechar et al., 2006; Prechelt, 2011). In IS research, the term is employed rather nonspecifically, as evidenced by its SSWs (the ten highest-ranked SSWs are ‘sophisticated,’ ‘option,’ ‘easily,’ ‘manner,’ ‘complement,’ ‘entire,’ ‘fundamentally,’ ‘configuration,’ ‘effectively,’ and ‘technologically’). Gawer (2014) posits:

Technological platforms can be usefully conceptualized as evolving organizations or meta-organizations that: (1) federate and coordinate constitutive agents who can innovate and compete; (2) create value by generating and harnessing economies of scope in supply or/and in demand; and (3) entail a modular technological architecture composed of a core and a periphery. (p. 1240)

While the term is used frequently in the recent IS literature, our triangulation of the definition by Gawer (2014) and the SSWs reveals that the term ‘technological platform’ is an umbrella term that does not refer to any specific view of platforms. Hence, we recommend that future researchers discontinue its use in favor of more specific platform terms.

At the outset of platform research in IS, the term ‘technical platform’ was amongst the most frequently used terms. However, it seems to have lost its appeal since, making it one of the least frequently used terms in this cluster (cf. Figure 4). It now seems to be confined to a technical context, in which hardware- and software-based infrastructure is described (Gillespie, 2010), primarily in the Computer Science literature (e.g., Elbanna & Linderoth, 2015; Mustonen-Ollila & Lyytinen, 2003). In this context, Benlian et al. (2015) define the term ‘technical platform’ as:

All facets of a platform related to the technical development of third-party applications including, for example, the provision of APIs and SDKs as well as all kinds of regulatory processes (e.g., quality and content checks), documentations (e.g., help files) and communications (blogs or forums in the developer community) that go along with application development. (p. 214)

This definition includes all artifacts that are related to a technical view on platforms. However, because the utilization of the term is declining and its SSWs align with the SSWs of ‘technology platform’ and ‘IT platform’, we recommend not using this term. Instead, researchers could refer to the more frequently used term of ‘information technology platform’ or any other more specific platform terms.

The term ‘ecommerce platform’ (or e-commerce platform) is defined as follows:

E-commerce platform provides users with a variety of business service component [sic], through which the business service component allows users to complete the online transaction process. […] E-commerce platforms not only provide users with functions of online transaction, but also provide users with a series of support services (Huang et al., 2011, pp. 2171–2172).

The term features the highest distance to all other terms in this cluster. Inspecting the temporal distribution of the term in our data reveals that its usage seems to have peaked some years ago and that the term is now outdated, having since been substituted with other terms. Many of this term's SSWs refer to aspects that are now part of other, more specific research streams on platforms (e.g., ‘carsharing,’ ‘collaborate,’ ‘Airbnb,’ ‘crowdsource’). Examples of ‘ecommerce platforms’ provided in the literature (e.g., Amazon, eBay, Airbnb, Tripadvisor) substantiate this observation. Thus, we argue for discontinuing the use of the term ‘ecommerce platform’ in favor of using more specific or differentiated platform terms.

The term ‘common platform’ features a high distance to most of the other platform terms. Also, the word ‘common’ neither refers to the inner workings of a platform nor does it bear semantic value (cf. ‘most common platform’ in Park et al., 2007). Thus, we decided to exclude it from our analysis.

While this cluster takes a broad approach to platforms, we view ‘IT platform’ as the most prominent term to be used in IS research when referring to a digital platform in a general sense.

Specific technology views on hardware and software platforms

The second cluster comprises the terms ‘open platform,’ ‘mobile platform,’ ‘computing platform,’ ‘hardware platform,’ ‘internet platform,’ ‘software platform,’ and ‘development platform.’ The term ‘multiple platform,’ which may have been shortened from its plural form during data pre-processing, is another term that we decided to drop due to its lack of clear semantics, as we did with ‘common platform’. Even if ‘multiple platform’ has a low distance score to ‘computing platform,’ a close inspection of the papers containing this term revealed that the term carries no specific meaning on its own. In contrast, we identified all other platform terms in this cluster as subtypes of the more general term ‘information technology platform’, as identified in cluster one.

The term ‘open platform’ does not refer to a specific platform type, but represents a research stream in IS that studies how platform openness impacts the development, evolution, and commercialization of a platform (Boudreau, 2010). Platform openness has been gaining increasing attention over the last couple of years, as researchers investigate it “as a governance-related concept reflecting the trade-off between retaining and relinquishing control over a platform” (Benlian et al., 2015, p. 210). The term's SSWs reveal topics of particular interest relating to, e.g., licensing, commercialization, proprietary (software), interoperability, and monetization.

The term ‘mobile platform’ is often used in the contexts of smartphones and other mobile devices (SSWs include ‘WhatsApp,’ ‘Symbian,’ ‘tablet,’ ‘smartphone’) that constitute boundary objects in mobile ecosystems that involve mobile device manufacturers, mobile network operators, mobile application developers, and other stakeholders (Basole & Karla, 2011). For a decade, the term was amongst the two most frequently used terms in this cluster. ‘Mobile platforms’ are viewed by Sørensen et al. (2015, p. 196) as:

Multi-sided markets [that] critically rely on architectural leverage (Thomas et al., 2014) through a critical mass of complementors and customers. Boundary resources (Ghazawneh & Henfridsson, 2013; Eaton et al., 2015) can support a highly distributed process subjected to combinations of centralised control and decentralised generativity (Tilson et al., 2010b).

Many papers investigate how different degrees of platform openness can lead to competitive advantage and attract more actors to a platform ecosystem. Accordingly, “the issue of platform openness is therefore a critical issue for mobile platforms” (ibid.).

The term ‘computing platform’ has been used frequently since 1995 and has ranked as the second most frequently used term in this cluster in more recent years (cf. Figure 4). The term's SSWs reveal that research on ‘computing platforms’ concerns, among others, the Internet of Things (IoT), which is the SSW with the highest similarity score. As early as the mid-1990s, Athanas and Abbott (1995, p. 16) argued that ‘computing platforms’ were “emerging as a class of computers that can provide near application-specific computational performance.” Other common topics identified in the SSWs are virtualization, deployment, and scalability. Hence, we conclude that the term ‘computing platform’ is used by IS scholars to report on scenarios in which computational power is outsourced to platforms that can be employed as-a-service to solve computational problems.

The term ‘hardware platform’ is used when referring to hardware issues related to platforms (SSWs include ‘installation,’ ‘mainframe,’ ‘workstation,’ ‘PC,’ ‘microcomputer’). This view is in line with definitions in the IS literature which introduce a ‘hardware platform’ as “a family of architectures that allow substantial re-use of software” (Keutzer et al., 2000, p. 1528) that “executes software application programs” (de Michell & Gupta, 1997, p. 349). In line with this definition, the term ‘software platform’ is used to refer to the development and deployment of software applications (SSWs: ‘linux,’ ‘application,’ ‘deployment,’ ‘SDK,’ ‘apache,’ ‘iOS’). The definition by Taudes et al. (2000) explains how software and hardware platforms are integrated: “A software platform is a software package that enables the realization of application systems. […] Together with the hardware and the organizational knowledge about planning, designing, and operating application systems, the software platforms in use constitute a firm’s information technology infrastructure” (Taudes et al., 2000, pp. 227–228). However, both terms refer to separate views on platforms that need to go hand in hand to successfully design and develop platforms from a technical point of view.

Surprisingly, the IS literature does not define the next term in this cluster, ‘development platform.’ The literature only provides an example of a ‘development platform’ with the open-source development platform Eclipse (Mehra et al., 2011). This observation is in line with SSWs such as (ruby on) ‘rails,’ ‘ocean,’ ‘j2ee,’ and ‘petrel’, all representing other programming frameworks. Thus, research using the term ‘development platform’ concerns tools and frameworks that support programmers in the software development process.

The last term in the second cluster is ‘internet platform.’ It has received relatively stable attention throughout the last decade, while all other terms, apart from ‘hardware platform,’ are used more often in the IS literature. The term refers to the prospects of the Internet connecting distinct groups of users remotely by enabling “transactions [...] by using the Internet platform (e.g. TCP/IP, HTTP, XML) in conjunction with the existing IT infrastructure” (Zhu et al., 2006, p. 601). The corresponding SSWs identified (e.g., ‘interactive,’ ‘channel,’ ‘push,’ ‘ubiquitous’) substantiate this view, making the term generic and outdated, since any contemporary (digital) platform is based on the premises of internet technologies. Thus, we argue that the term has become obsolete.

In sum, our data provides evidence that research on platforms as digital tools considers diverse layers of technologies, comprising hardware such as mobile devices, computing infrastructures, protocols and networking technologies, software development frameworks, and software execution environments. While it is possible to address each layer specifically, the established set of terms is overlapping. We interpret these overlaps as part of a broader trend in which particular layers of technology are abstracted in favour of considering complete technology stacks.

Online communities and social platforms

The third cluster includes the terms ‘social platform,’ ‘social media platform,’ ‘online platform,’ ‘social networking platform,’ and ‘communication platform.’ These terms exhibit a maximum distance of 0.6, which indicates a considerable degree of semantic overlap.

‘Social platform’ as a concept lacks a popular and well-cited definition in IS research, although it is mainly seen as a tool for connecting users to enable interaction and communication (Cheung et al., 2011; Mitchell-Wong et al., 2008). Examples of ‘social platforms’ include social networking websites, online discussion forums, and blogs (Cheung et al., 2014; Cui et al., 2016), thus covering a wide range of such platforms. We choose ‘social platforms’ to represent the cluster itself as an umbrella term covering different platforms that focus on social interactions and connections.

The concept of ‘social media platform’ has evolved to become the most frequently used platform term in the IS knowledge base (cf. Table 2). ‘Social media platforms’ “facilitate information exchange between users” (Kaplan & Haenlein, 2010, p. 60), covering different forms of user-generated content (Wade et al., 2020). Kallinikos and Constantiou (2015, p. 73) propose viewing ‘social media platforms’ as “huge interaction machines rather than algorithms.” The SSWs underline these definitions (e.g., ‘microblogg,’ ‘twitter,’ ‘channel,’ ‘LinkedIn,’ ‘strategically,’ ‘facebook,’ ‘widespread,’ ‘instagram,’ ‘targeted,’ and ‘commercial’). A ‘social media platform’ refers to user-generated content and media, and their sharing in online communities. Thus, the term is a specialized version of ‘social platform.’

‘Online platform’ is another prevalent platform term in this cluster. At first glance, ‘online platform’ seems only remotely connected to the other terms in this cluster, since it does not convey clear semantics apart from its reference to the Internet. Definitions for ‘online platforms’ are scarce, comprising a technological basis, delivering and aggregating services (Batura et al., 2015), and facilitating involvement and interactions for users and crowds (Nevo & Kotlarsky, 2020; OECD, 2019). The context in which platforms are characterized as ‘online platforms’ is, however, diverse and covers a broad set of IS research areas focusing on social interactions and online communities, as evidenced by the cluster analysis and the SSWs. Thus, we propose discontinuing the term due to its unclear and inconsistent meaning, and its nonspecific reference to digital platforms. Instead, we advise IS researchers to use ‘digital platform’ (cf. Section 4.5) or to refer to a more specific platform term.

A ‘social networking platform’ links “networks of users with the providers of various services and applications” (Bakos & Katsamakas, 2008, p. 172). ‘Social networking platforms’ often involve third-party developers, who can integrate third-party content on the platform (Felt & Evans, 2008). In the IS literature and the SSWs, we can identify LinkedIn, Yammer, Facebook, and Twitter as examples of ‘social networking platforms.’ As such, ‘social networking platforms’ focus on promoting interactions between users, helping them with networking and connecting activities. Thus, we view ‘social networking platforms’ as another specialization of ‘social platforms.’

The last term in this cluster is ‘communication platform.’ Considering the evolution of platform terms in the history of IS (cf. Figure 4), this term was deemed to represent the archetype of a ‘social platform’ up to 2008, whereas it has now become the least popular term in this cluster. This observation aligns with definitions of a ‘communication platform’ reaching back to 1998 (Bertino, 1998). It should be noted, however, that the interpretation of ‘communication platform’ has evolved from a client-server architecture, predominantly on the Internet (Bertino, 1998), to a system that enables users to send, read, and reply to direct messages from other Internet users (Chang & Wu, 2014). As such, ‘communication platforms’ can improve knowledge management across organizations by decreasing required human communication (Dullaert et al., 2009; Jin & Kotlarsky, 2012). The SSWs underline this interpretation (e.g., ‘sms,’ ‘intranet,’ ‘messaging,’ ‘channel,’ ‘dialogue,’ ‘connect’). We, therefore, view ‘communication platforms’ as a third specialization of ‘social platforms.’

Economic platforms as digital markets

An economic view on platforms as digital markets covers the terms ‘crowdfunding platform’ and ‘crowdsourcing platform’. Both types of platforms focus on economic effects and crowd involvement. Due to these terms’ definitions and their similarity to (two-/multi-) ‘sided platforms’, we incorporated ‘sided platform’ in this cluster, which thus comprises three terms.

‘Crowdfunding platforms’ enable the “financing of a project or a venture by a group of individuals instead of professional parties” (Schwienbacher & Larralde, 2010, p. 370) via internet services (Burtch et al., 2013; Liu et al., 2015). As such, ‘crowdfunding platforms’ allow individuals to freely present ideas to an online community to raise financial support for the realization of their products or services, matching ideas with investors (Gerber et al., 2012). Examples of ‘crowdfunding platforms’ are RocketHub, Kickstarter, and IndieGoGo (Gerber et al., 2012), which also feature at the top of the list of the term's SSWs.

Crowdsourcing is defined as a “type of participative online activity in which an individual, an institution, a non-profit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task” (Estellés-Arolas & González-Ladrón-de-Guevara, 2012, p. 197). ‘Crowdsourcing platforms’ enable these activities for online communities, propelling rivalry by incentives (Bauer et al., 2016), which is supported by the SSWs (e.g., ‘brokerage,’ ‘intermediary,’ ‘advice,’ and ‘challenge’). Thus, while both terms refer to digital markets, ‘crowdsourcing platforms’ and ‘crowdfunding platforms’ differ by the type of activities they enable and the purpose of the community involvement.

A closer inspection of the papers containing ‘sided platform’ reveals that this term encapsulates more specific terms, such as ‘multi-sided platform,’ and ‘two-sided platform,’ which were automatically shortened during data pre-processing, since numbers are function words. Multi-sided platforms “coordinate the demand of distinct groups of customers who need each other in some way” (Evans, 2003, p. 325), e.g., in dating clubs and yellow pages. Two-sided platforms focus on (in-)direct network effects between two sides of a market, mostly investigating pricing structures (Hagiu, 2007). Research on both two-sided and multi-sided platforms focusses on economic effects, e.g., pricing, economic models, and market competition (Evans, 2003; Hagiu, 2007; Hagiu & Wright, 2015), which is also supported by the SSWs (e.g., ‘champion,’ ‘cryptocurrency,’ ‘instruments,’ and ‘enabler’).
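
The shortening of ‘two-sided platform’ and ‘multi-sided platform’ to ‘sided platform’ can be illustrated with a minimal sketch. It assumes a simple pipeline that lowercases the text, drops function words, and pairs each occurrence of ‘platform’ with the preceding remaining word. The word list and the function name `extract_platform_terms` are hypothetical and do not reproduce the pre-processing actually used in this study.

```python
import re

# Hypothetical function-word list (an assumption for illustration only);
# number words and quantifier prefixes are treated as function words.
FUNCTION_WORDS = {"two", "multi", "and", "the", "a", "of", "on"}


def extract_platform_terms(text):
    """Return '<modifier> platform' terms found after dropping function words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in FUNCTION_WORDS]
    terms = []
    for previous, current in zip(kept, kept[1:]):
        # Treat the plural form as the same term as the singular.
        if current in ("platform", "platforms"):
            terms.append(f"{previous} platform")
    return terms


print(extract_platform_terms("Two-sided platforms and multi-sided platforms"))
# prints ['sided platform', 'sided platform']
```

Under these assumptions, both specific terms collapse into the same shortened term, which is why the papers containing ‘sided platform’ need to be inspected manually to recover the original wording.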

All terms in this cluster focus on investigating the economic effects on platforms. This observation is consistent with the terms' evolution over time (cf. Figure 3). All terms emerged after 2010 and have since become increasingly popular. Interestingly, they refer to different types of platforms, such that they can co-exist without the substantial overlap that we identified in other clusters (e.g., clusters 1 and 2).

General properties of platforms as IS artifacts for value co-creation

We identified the fifth cluster to cover terms that refer to platforms from an abstract perspective, highlighting the general properties that constitute the inner core of platforms and hold across diverse application scenarios.

‘Digital platforms’ refer to platforms as IS artifacts designed to attract and incorporate content supplied by third parties. A prominent review of research on ‘digital platforms’ is provided by de Reuver et al. (2018), differentiating a technical perspective on ‘digital platforms’ as “an extensible codebase to which complementary third-party modules can be added” from a socio-technical view of ‘digital platforms’ as “technical elements (of software and hardware) and associated organizational processes and standards” (de Reuver et al., 2018, p. 127). Both definitions refer closely to earlier conceptualizations of platforms (Sedera et al., 2016; Tiwana et al., 2010) and treat an extensible codebase as an indispensable feature constituting ‘digital platforms.’ In contrast with this technical viewpoint, the SSWs associated with ‘digital platform’ in our dataset show the evolution of the term which now refers to platforms in a much broader sense, comprising strategic (e.g., tactic, ambidextrous, strategizing), economic (e.g., intermediary, payment, marketplace), organizational (e.g., meta-organization, ecosystem, start-up), and technological (e.g., blockchain, architecture, IoT) aspects. Against this backdrop, ‘digital platforms’—the platform term displaying the most significant growth rate in research papers covering recent years—seem to have become the basic concept in IS when referring to platforms in an abstract sense, highlighting the platforms' most central properties, while abstracting from specific aspects of their design and use.

A similarly abstract view on ‘digital platforms’ is reflected in the term ‘service platform.’ A ‘service platform’ is a “modular structure that consists of tangible and intangible components (resources) and facilitates the interaction of actors and resources (or resource bundles)” (Lusch & Nambisan, 2015, p. 162). Thus, the concept focuses on the role of a platform to facilitate value co-creation among the actors interacting on the platform, thereby constituting a service ecosystem (Lusch & Nambisan, 2015). Service, as viewed from the service-dominant logic standpoint (Vargo & Lusch, 2004) on which this definition is based, is an abstract concept, referring to “the application of specialized competences […] through deeds, processes, and performances for the benefit of another entity or the entity itself” (Vargo & Lusch, 2004, p. 2). While particular types of service can be co-created with ‘digital platforms,’ ‘service platforms’ as a concept deals with the mechanisms that lead to co-creating value irrespective of these more specialized platform types. For instance, a ‘service platform’ can explain the mechanisms constituting value co-creation on a multi-sided market or a social media platform by applying an abstract lens on value co-creation, even if the way in which value co-creation works on each platform type might differ.

While a ‘cloud platform’ lacks a consistent definition in the extant literature, its recurring properties concern technical and business model aspects in terms of how the platform “deploy[s] software via the Internet” (Katzan, 2009, p. 256). A boundary-spanning role, integrating technical aspects and business aspects, is reflected in the term's position in our dendrogram, and by its SSWs (business aspects include ‘marketplace,’ ‘consulting,’ ‘commercialize,’ ‘brokerage,’ ‘competency’; technical aspects include ‘virtualization,’ ‘proprietary,’ ‘infrastructure,’ ‘tenancy,’ and multiple classes of applications such as enterprise resource planning or customer relationship management). Consistent with this boundary-spanning role, Katzan (2009, p. 260) defines a ‘cloud platform’ as “an operating system that runs in the cloud and supports the software-as-a-service concept.” From a technical point of view, a ‘cloud platform’ provides infrastructure (infrastructure-as-a-service, e.g., virtualized hardware), development platforms for software (platform-as-a-service, e.g., middleware), or software (software-as-a-service, e.g., business applications) as a shared pool of virtualized resources that is scalable and available. Third-party users can deploy their software and have it hosted in the cloud as a managed service, or they can deploy and run their own applications in the cloud (Katzan, 2009). A prominent example is Google Cloud Platform, which supplies business customers with, amongst others, solutions to design cloud-based data storage, monitoring and analytics solutions, mobile apps, and media solutions. As one aspect of cloud computing, ‘cloud platforms’ refer to as-a-service business models that build on metered or subscription pricing models, depending on the resources consumed by users over time (Katzan, 2009). As-a-service business models help users, amongst others, to adjust their computing resources to flexible demand using pooled resources, to lower their capital lockup, and to have flexible and scalable access to computing resources that are placed “somewhere in the Internet”, i.e., in the cloud (Katzan, 2009, p. 257).

Sharing platforms

In our cluster analysis, the term ‘sharing platform’ shows the highest distances (> 0.8) from all other clusters, marking it out as a cluster in its own right. At a higher level of detail our data indicate that the vocabulary associated with ‘sharing platform’ relates to both platform economics (cf. Section 4.4) and to online communities (cf. Section 4.3), displaying a strong connection with economic transactions (SSW: ‘rental,’ ‘barter,’ ‘lending,’ ‘rideshare,’ ‘commercial,’ ‘money,’ ‘carsharing’) and peer-to-peer interactions, as frequently implemented on ‘sharing platforms’ (SSW: ‘P2P,’ ‘C2C,’ ‘ecosystems’).
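
The criterion applied to ‘sharing platform’ can be sketched as follows. The sketch assumes that each platform term is represented by a numeric vector and that distances are cosine distances; this section does not specify the distance measure, and the three-dimensional vectors, the selection of terms, and the helper names below are made up purely for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def isolated_terms(vectors, threshold):
    """Return the terms whose distance to every other term exceeds the threshold."""
    result = []
    for term, vec in vectors.items():
        others = [cosine_distance(vec, w) for t, w in vectors.items() if t != term]
        if others and min(others) > threshold:
            result.append(term)
    return result


# Toy vectors (hypothetical values, not taken from the study's data).
toy_vectors = {
    "social platform": [1.0, 0.2, 0.0],
    "social media platform": [0.9, 0.3, 0.0],
    "sharing platform": [0.0, 0.1, 1.0],
}

print(isolated_terms(toy_vectors, threshold=0.8))
# prints ['sharing platform']
```

With these toy values, only the term that is far from all others passes the threshold, mirroring the reasoning for treating such a term as a cluster in its own right.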

Despite this apparent overlap of social and economic aspects in sharing, the IS literature uses ‘sharing platform’ differently from economic platforms or online communities. By definition, ‘sharing platforms’ “facilitate sharing among people who do not know each other, and who lack friends or connections in common, […] mak[ing] stranger sharing less risky and more appealing because they source information on users via the use of ratings and reputations” (Frenken & Schor, 2017, p. 4). As such, they provide a mediating technology to enable sharing between different parties (Sutherland & Jarrahi, 2018). Seminal work on sharing (Belk, 2010) outlined why sharing, gift giving, and marketplace exchange represent very different prototypes of interactions, emphasizing that “sharing tends to be a communal act that links us to other people” (Belk, 2010, p. 717), and which can be seen as “nonreciprocal pro-social behavior” (Benkler, 2004, p. 275). Caring for others and social bonding are seen as defining characteristics of sharing (Belk, 2010). The same characteristics can also be found in research on online communities; Karahanna et al. (2018) identify relatedness as a psychological need in the social media context, defined as the “need to interact, be connected to, and experience caring for others” (Karahanna et al., 2018, p. 740). In contrast to sharing, economic exchange is traditionally characterized by a transfer of ownership and an impersonal relationship between exchanging parties (Belk, 2010). While the Internet has brought up new forms of Internet-facilitated sharing (Belk, 2014)—e.g., enabling people to share data with strangers and material goods in their neighborhoods—sharing is still supposed to be a social, not-for-profit interaction.

However, on digital platforms, references to ‘sharing’ are incongruent with its original meaning, rather pointing to a collaborative consumption of under-utilized resources. Collaborative consumption occurs when “people coordinat[e] the acquisition and distribution of a resource for a fee or other compensation,” also termed pseudo-sharing “in that they often take on a vocabulary of sharing (e.g., ‘car sharing’), but are more accurately short-term rental activities” (both: Belk, 2014, p. 1597). For instance, the platforms featured in the SSWs of ‘sharing platform’ (Airbnb, BlaBlaCar, or Lyft) enable users to consume houses or transportation collaboratively, using digital platforms that enable impersonal interactions among service providers and customers on a digital multi-sided market.

We conclude that, as a theoretical lens, proper sharing leans towards the use of platforms to establish social communities, while collaborative consumption leans towards market-based interactions as a domain of platform economics. Moreover, the concept of a ‘sharing platform’ is often used inconsistently with foundational concepts of sharing and, therefore, cannot provide the missing link to connect “social network research, such as research on collective intelligence, with the domain of online social commerce as it is established in C2C interactions” (Puschmann & Alt, 2016, p. 95). We propose that, instead of using ‘sharing platform,’ digital platforms that link actors for the purpose of nonreciprocal social behavior should be referred to as ‘social platforms’, whereas the term ‘(two-/multi-) sided platforms’ should be used to refer to a digital platform that enables impersonal market-based interactions, including collaborative consumption in a peer-to-peer network. With this distinction, research can avoid confusion concerning the terms ‘sharing platform’ and sharing.

Decomposed lexicon of platforms in IS

Based on the interpretation and discussion of the terms we discovered in our data-driven study, we propose a consolidated lexicon of platform terms in IS research that reduces complexity and enables future research to study the phenomenon of digital platforms and their subtypes in detail. We built the lexicon by hierarchically decomposing the knowledge base on digital platforms in IS. Decomposition is a well-known and first-hand solution to break down complex structures (Alexander, 1964), like the rhizome structure currently appertaining to the IS knowledge base on digital platforms. While there is always more than one way to decompose a complex structure or problem, each decomposition is more suitable for some purposes than for others (Alexander, 1964). The main goal of decomposition is to identify subsets “whose internal interactions are very rich” (Alexander, 1964, p. 124) with “as little interaction between subsets as possible” (ibid.). At the same time, it is important to acknowledge that underemphasizing the relationships between the subsets of a decomposed structure is a critical issue that needs to be prevented (Alexander, 1964). Building on decomposition, we aim to develop a lexicon of digital platforms in IS, which enables future research to address specific terms and concepts better than today’s rhizomatic structure of platform terms.

A decomposed model of a structure enables the study of not only the details of the system but also the interactions between all parts of the system as a whole (Simon, 1996). Thus, our decomposed lexicon of digital platforms in the IS literature provides researchers with a systematization that can be applied to study digital platforms and related phenomena from different (isolated) theoretical perspectives and by combining multiple perspectives to study a phenomenon at a more abstract level or as a whole (Fig. 5). They might either decide to use one particular concept to focus on, or they might combine different concepts to investigate more complex phenomena. In this regard, the root node of our model, ‘digital platform’ comprises all features that relate to digital platforms and are decomposed in the subsets. IS artifacts exhibit technical and social properties. A ‘digital platform’ as a whole is, thus, a generative IS artifact that provides a mutual core of technology and organizational arrangements, inviting compatible and complementary resources (e.g., hardware, software, or content) from third parties to enable the emergence of digital online communities or markets (de Reuver et al., 2018).

Fig. 5 A decomposed model of terms that can be used or combined as lenses to study platforms in IS research

Building on digital platforms, we posit that two overarching views can be applied when studying platforms. First, ‘service platforms’ point to the role that platforms play for co-creating value among the stakeholders in service (eco-)systems, such as service providers (including a platform owner and platform provider) and service customers. In this sense, a ‘service platform’ is a view on digital platforms as structures that establish value co-creation in service ecosystems, enabling actors to provide, access, and integrate complementary resources in service-for-service exchanges, building on a mutual core of technology and organizational arrangements (Lusch & Nambisan, 2015; Vargo & Lusch, 2004). Second, from a more technical perspective, ‘cloud platforms’ refer to digital platforms as IT artifacts that ought to be designed in specific ways as a prerequisite to enable actors to co-create value. ‘Cloud platforms’ provide an abstract view on an operating system that runs in the cloud and provides a shared pool of virtualized, scalable and available resources in the form of infrastructure, development platforms for software, or software. Both views are valid since cloud platforms are strongly related to as-a-service business models, as evidenced by their SSWs. For this reason, they provide complementary, yet abstract, views on the design, form, and function of platforms as IT artifacts (cloud platforms), and on the co-creation of value established through users interacting with a digital platform (service platform). Due to their role as views, neither of these platform concepts provides a direct super-structure for the more detailed platform concepts discussed subsequently, which aligns with the purpose of a taxonomy, provided the ‘subclass of’ primitive is not applied to the edge connecting these concepts (Gomez-Perez & Corcho, 2002).

‘Information technology platforms,’ ‘social platforms,’ and ‘(two-/multi-)sided platforms’ are three more detailed perspectives on digital platforms. Triangulating the insights identified from clusters one and two, the term ‘information technology platform’ is identified as an overarching concept referring to technical views on platforms as IT artifacts. All other terms in clusters one and two are outdated (cluster one: ‘technical platform,’ and ‘ecommerce platform;’ cluster two: ‘internet platform’), nonspecific (cluster one: ‘technology platform,’ and ‘technological platform’), refer to a platform characteristic instead of a platform type (cluster two: ‘open platform’ refers to platform openness), or represent subclasses of ‘information technology platforms’ (cluster two: ‘software platform,’ ‘hardware platform,’ ‘development platform,’ ‘computing platform,’ and ‘mobile platform’). To provide a consistent decomposition of platform terms, we conclude that the state-of-the-art of platform research refers to the platform as an IT artifact as an ‘information technology (IT) platform’ (Table 4).

Table 4 A lexicon of significant platform terms for IS research

Combining the insights from the five platform terms in the third cluster, we identified that a ‘social platform’ is a generalization of ‘social media platform,’ ‘social networking platform,’ and ‘communication platform.’ These three terms differ by focusing on direct message exchange and communication (communication platform), creating and sharing user-generated content (social media platform), or networking and direct interactions (social networking platform).

Still, all three terms refer to social interactions that might often take place on the same social platform. Facebook, for instance, implements multiple features of ‘communication platforms’ (e.g., sending messages to peers), ‘social networking platforms’ (e.g., connect with friends and join groups), and ‘social media platforms’ (e.g., enabling influencers to share user-generated content, advertise products, and sell services). We advise researchers to use a more specific platform term when outlining their research, or use the broader term ‘social platform’ when referring to digital platforms that enable social interactions.

Concerning platform terms viewed from an economic viewpoint, our analysis identified ‘crowdfunding platform,’ ‘crowdsourcing platform,’ and ‘sided platform.’ As both crowdfunding and crowdsourcing platforms can be two-sided or multi-sided platforms, we advocate ‘(two-/multi-) sided platform’ to be a more general term, specializing into either ‘crowdfunding platform’—for multi-sided platforms to raise funds—or ‘crowdsourcing platform’—for multi-sided platforms to source work from anonymous online communities.
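
The decomposition described in this section can be written down as a small data structure. The following sketch is one possible encoding and not part of the original model: the dictionary layout, the edge labels `view`, `perspective`, and `subclass`, and the helper functions are illustrative assumptions, while the terms and their grouping are taken from the discussion above.

```python
# Each term maps to (parent term, edge label); the root has no parent.
# Edge labels are illustrative assumptions, not notation from the model.
LEXICON = {
    "digital platform": (None, None),
    "service platform": ("digital platform", "view"),
    "cloud platform": ("digital platform", "view"),
    "information technology platform": ("digital platform", "perspective"),
    "social platform": ("digital platform", "perspective"),
    "(two-/multi-)sided platform": ("digital platform", "perspective"),
    "software platform": ("information technology platform", "subclass"),
    "hardware platform": ("information technology platform", "subclass"),
    "development platform": ("information technology platform", "subclass"),
    "computing platform": ("information technology platform", "subclass"),
    "mobile platform": ("information technology platform", "subclass"),
    "social media platform": ("social platform", "subclass"),
    "social networking platform": ("social platform", "subclass"),
    "communication platform": ("social platform", "subclass"),
    "crowdfunding platform": ("(two-/multi-)sided platform", "subclass"),
    "crowdsourcing platform": ("(two-/multi-)sided platform", "subclass"),
}


def path_to_root(term):
    """Return the chain of terms from the given term up to the root."""
    path = [term]
    parent = LEXICON[term][0]
    while parent is not None:
        path.append(parent)
        parent = LEXICON[parent][0]
    return path


def specializations(term):
    """Return the terms modelled as subclasses of the given term, sorted by name."""
    return sorted(
        child
        for child, (parent, label) in LEXICON.items()
        if parent == term and label == "subclass"
    )


print(path_to_root("crowdfunding platform"))
print(specializations("social platform"))
```

Keeping the edge label separate from the parent link makes it possible to treat ‘service platform’ and ‘cloud platform’ as views on the root without implying that the more detailed terms are their subclasses.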

Conclusion

With our collection, analysis, and interpretation of digital platform terms in IS research, we offer two theoretical contributions. Having identified and analyzed platform terms from a compilation of 11,049 research papers published in leading peer-reviewed IS journals and conference proceedings in the past 44 years, and having applied an inductive text-mining approach using machine learning, our analysis is the first to cover the entire history of platform research in the IS knowledge base. With the identified terms and their SSWs, we contribute much-needed empirical evidence on platform research. Our decomposed model of digital platform terms and our lexicon of platform terms in IS research provide a common baseline to guide future platform research in studying isolated phenomena from a specific theoretical perspective, and to enable investigating broader phenomena on a more abstract level by combining different theoretical perspectives on platforms. At the same time, our analysis might enable researchers to identify and interpret older platform papers that used different terms in the formation phase of platform research. We consider both temporal directions, shaping the future and accessing the past of platform research, as equally important applications of our decomposed model and lexicon. We encourage others to apply the decomposed model of platform terms by, for example, using the term ‘cloud platform’ when conducting future research on technical facets of digital platforms, while we advocate for using the term ‘service platform’ when studying actors and services. Additionally, the ability to switch between different views on platforms also enables researchers to study the relationships between the identified subsets of platform terms. This rationale can be applied to all other elements of the model. Consistently using the terms will help build and refine a consistent lexicon of digital platforms in the IS discipline in the future.

Although we took great precautions with generalizing the results of our data-driven analysis, our study is subject to common limitations inherent to text mining, the application of machine learning algorithms, and interpretive research. While at first glance a total number of 11,049 papers seems sufficient to justify the application of these methods, the size of our data set and choice of sources might still limit the generalizability of our contribution. We restricted our analysis to top-ranked journals and conference proceedings in the IS discipline (Tarafdar & Davison, 2018). As platforms are a topic of high interest to both researchers and practitioners, a more inclusive consideration of practice-oriented journals, conference proceedings, or white papers could have led to the identification of additional or different terms. Since the results of unsupervised machine learning depend heavily on large amounts of data, extending the data set could lead to even better results to inform the manual coding and interpretation processes. On the other hand, extending the data set to include papers of lower scientific quality could also lower the data quality and, therefore, lead to less reliable results. Since our approach was to systematize and constitute the lexicon of platform terms in IS, we decided in favor of scientific excellence, at the expense of considering a slightly reduced but still sizeable data set. Faced with this trade-off, other researchers might make a different decision.

Data quality also leaves its mark on the results of data-driven papers, since some papers might only have been included in the first research step because their bibliographies contained papers that referred to the word ‘platform’ in their titles. Likewise, the second step of our research process could potentially produce sub-optimal results, as authors might not use identical platform terms throughout their papers. To deal with these limitations appropriately, we tested different configurations and verified the quality of the results in each step of our research process to select optimal parameters. We documented all decisions made in the analysis to enable others to replicate our research process. One important decision made in our analysis was to exclude platform terms that appeared in fewer than 150 papers (Table 2). Since we subsequently developed the decomposed model based on the reduced database, it is conceivable that including more terms would have produced a broader (yet less specific and less common) collection of platform terms. A closer inspection of the excluded platform terms, however, suggests that these terms could still fit into our lexicon.
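The exclusion rule described above (dropping platform terms that appear in fewer than 150 papers) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function name `filter_terms_by_paper_count`, its parameters, and the simple case-insensitive substring matching are assumptions made for illustration only.

```python
from collections import Counter


def filter_terms_by_paper_count(papers, candidate_terms, min_papers=150):
    """Keep only terms that occur in at least `min_papers` distinct papers.

    `papers` is an iterable of full-text strings, one per paper.
    Matching is a plain case-insensitive substring test (an assumption).
    """
    counts = Counter()
    for text in papers:
        lowered = text.lower()
        for term in candidate_terms:
            if term.lower() in lowered:
                counts[term] += 1  # each paper counts at most once per term
    return {term: n for term, n in counts.items() if n >= min_papers}
```

With a toy corpus and a lowered threshold, for example `filter_terms_by_paper_count(texts, ["cloud platform", "service platform"], min_papers=2)`, the function returns the surviving terms together with their paper counts. Varying `min_papers` shows how the threshold trades breadth of the term collection against specificity.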

A general limitation of text mining and data aggregation techniques is that they do not always allow conclusions to be traced back to the underlying data. In our case, this means that we cannot refer to single papers or their context to explain the overall results identified from the dendrogram. To reduce the impact of this limitation on our contribution, we discussed the results of our data analysis with reference to definitions in the existing literature. Further, by conducting the majority of the research process automatically, we aimed to limit the (potentially detrimental) influence of the authors (bias and subjectivity), thus increasing the generalizability of our results. We acknowledge, however, that the lack of human input could limit the conceptual insights gained from the results.

Regarding the interpretation of our results, we identified a lack of common definitions for certain platform terms, which somewhat hampers an objective discussion of our results. Additionally, while we are confident that we provide a decomposition of platform terms with maximum inner meaningfulness and minimum interactions between the subsets, others might arrive at different suitable decompositions. However, we do not see this natural bias of decomposition as a weakness of our paper; rather, we invite others to critically review, revise, and enhance the decomposed model and the corresponding lexicon of platform terms.

There are ample opportunities to validate, discuss, and extend our results through quantitative, qualitative, and design-oriented studies. First, the research process presented here could be replicated in neighboring research disciplines of the IS field (e.g., economics, computer science, service science, marketing, engineering) to compare platform research across these disciplines. This effort would enhance our knowledge about the design, implementation, and evolution of platform concepts. An inter-disciplinary analysis might indicate how platforms can serve as boundary objects that bridge platform-related research carried out in different disciplines. Second, since our research process focused on analyzing textual data with data mining and machine learning, conceptual studies could challenge and extend our lexicon of platforms in IS.

Notes

  1. As explained below, the 95% refers to the share of all papers on platforms ever published in the considered IS outlets that we were able to access.

  2. The same analysis with different thresholds of 50, 100, 150, and 200 words can be found in the appendix.

  3. For more information visit https://www.eclipse.org/eclipseide/.

  4. For more information visit https://cloud.google.com/solutions/manufacturing.

Acknowledgements

This paper is part of two research projects. The project DigiBus was funded under the promotion sign 005‐1807-0107 by the Ministry of Economics, Innovation, Digitalization, and Energy of the State of North Rhine-Westphalia (MWIDE.NRW). The project FLEMING was funded under the promotion sign 03E16012F by the Federal Ministry for Economic Affairs and Climate Action (BMWK). Furthermore, we acknowledge the work by Philipp Hansmeier, who supported us in conducting a comprehensive literature review documented in the online appendix.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Corresponding author

Correspondence to Christian Bartelheimer.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Responsible Editor: Yun Wan

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Bartelheimer, C., zur Heiden, P., Lüttenberg, H. et al. Systematizing the lexicon of platforms in information systems: a data-driven study. Electron Markets 32, 375–396 (2022). https://doi.org/10.1007/s12525-022-00530-6

  • DOI: https://doi.org/10.1007/s12525-022-00530-6

Keywords

  • Platform
  • Text mining
  • Machine learning
  • Data communications
  • Interpretive research
  • Systems design and implementation

JEL classification

  • L86