1 Introduction

Based on exponentially increasing computing power (Schaller 1997; Shalf 2020) and big data availability, artificial intelligence (AI) has gained substantial traction in recent years (Haenlein and Kaplan 2019). In 2016, AlphaGo, a deep reinforcement learning algorithm developed by Google, was able to beat Lee Sedol, arguably one of the most renowned players of the board game “Go”, which is considered to be multiple times more complex than chess (Haenlein and Kaplan 2019; Padmanabhan et al. 2022; Silver et al. 2017). With the introduction of ChatGPT, AI has entered broad public awareness, raising expectations about impactful AI use cases in business research and practice.

The potential of AI has aroused the attention of management research (Zhang et al. 2022). This has resulted in a compound annual growth rate of more than 35% in AI references (Footnote 1) between 2015 and 2019 (Fosso Wamba et al. 2021). Especially for the 2020s, we observe a steep increase in AI-related publications in top-tier management journals (Footnote 2) and expect even higher yields in the post-generative AI era.

Although AI-related literature is rich (Padmanabhan et al. 2022), we currently lack a comprehensive understanding of the use of AI algorithms. The absence of a concise AI definition and taxonomy in business (Collins et al. 2021; Samoili et al. 2020; Wirtz et al. 2019) results in a dispersed knowledge base, which has two critical implications. First, it currently seems challenging to assess the extent to which management research has exploited the potential of AI algorithms. Second, an overview of current trends as well as future potentials of applied AI remains opaque.

To further coordinate, extend, and strengthen the use of AI, researchers need to understand in what context, for what purpose, and which types of AI have been used in previous work. With only a few reviews investigating the overall topic from a sociotechnical lens (e.g., Abdel-Karim et al. 2021; Collins et al. 2021), we see a knowledge gap regarding the extent of applied AI algorithms, leaving the following questions unanswered: Do specific use cases or application areas foster the use of AI algorithms in management research? For which purposes do management researchers leverage AI algorithms? Which data type is commonly processed? And which algorithms are most popular?

We argue that answering these questions is critical to compiling a holistic overview of AI applications in management research, to identifying gaps between the use of AI algorithms and their potential, and to discovering avenues for research. We refer to several scholars calling for research in this area. For example, Raisch and Krakowski (2021) point out the need for more interdisciplinary efforts in AI research. They encourage melding perspectives of management research and computer science scholars to “embrace the topic of AI in management” (Raisch and Krakowski 2021, p. 33).

AI affects a broad scope of management research communities, and the field of information systems (IS) accounts for a large share of AI-related publications (Footnote 3). IS, at the intersection of computer science and management, encompasses the social and technical aspects of AI, and consequently, it plays a pioneering role in fostering the establishment of AI in management research (Berente et al. 2021). Therefore, we narrow the scope of this study to IS. In doing so, we align with the research call of Raisch and Krakowski (2021) to combine perspectives from computer science and management research. This study poses the following research question (RQ):

RQ: What is the current state of applied AI algorithms in IS, and how can extant research be synthesized?

By answering this RQ, we build a basis for profound research on the use of AI in IS and provide valuable insights transferrable to the general management research domain. We conduct a systematic literature review (SLR) of AI-related articles published between January 2016 and March 2024 in leading IS journals to identify, structure, and synthesize the use of AI in IS research. According to Rowe (2014, p. 242), literature reviews can build a foundation for research landscaping to “evolve more rapidly towards a more comprehensive and effective research genre’s spectrum”.

To improve the understanding of the use of AI, we propose a conceptual framework covering eight dimensions within the application areas, methods, and algorithms of applied AI. In detail, we categorize the application area into (1) the industry sector and (2) the functional area and dissect the technical aspects of AI applications into (3) AI categories, (4) input data, (5) learning behavior, (6) whether deep learning (DL) is applied, (7) whether explainable AI (xAI) is addressed, and (8) the algorithm used. We draw on a customized generative pre-trained transformer (GPT) based on OpenAI’s GPT-4, along with manual coding to verify its suggestions. Using this hybrid approach, we evaluate the articles of our literature population in terms of these eight dimensions. Overall, we aim to provide a well-developed, organized knowledge base to serve as a foundation for researchers to further explore the field of applied AI algorithms in management research.

Our main findings reveal that the number of AI publications per year and the relative share of articles per year addressing DL and xAI sharply increase within our observation period, and the use of AI falls into distinct topic clusters regarding the functional area and industry sector. Particularly, a large share of articles discusses topics related to the functional area of marketing and sales as well as the industry sectors of information, health care and social assistance, retail trade, and finance and insurance. Furthermore, supervised algorithms that process numerical or text data for predictive purposes constitute most of the applied AI we find in our literature population, and generative AI, reinforcement learning, and semi-supervised learning techniques occur seldom, although they offer vast potential for management research.

Our contribution to the literature is threefold. First, we conduct an in-depth analysis of the use of AI in IS research regarding its application areas, methods, and algorithms. We fully disclose our coding methodology and its results for each article within our literature population. We thereby create transparency in the complex landscape of AI in IS on both a general and specific level. This helps researchers understand the overall dynamics in the field of applied AI as well as specific characteristics on the level of a journal, topic, or article. As such, the overview of the applied AI landscape elucidates the convergence of the areas of computer science and management research within IS.

Second, we propose a conceptual framework of eight dimensions to structure our review, allowing for improved tangibility and increased transparency of AI applications. We use the framework within the field of IS, although its structure is applicable to other management research domains. Such transferability can guide future research conducting structured analyses of applied AI. Our framework aims to disaggregate the evolving research domain and provides a starting point for theory building for management scholars (Rowe 2014; Templier and Paré 2015).

Third, we review IS literature to inform about recent developments in the AI landscape and provide insights that may be transferrable to the broader management research domain. We identify topics that have gained increasing attention as well as underrepresented AI algorithms that show potential for future research. Our study invites scholars to identify gaps in the field of applied AI and further extend and strengthen the use of AI in both IS and the broader management research domain.

The remainder of the article is organized as follows. First, we introduce the research background of our conceptual framework. Next, we describe our research method, then summarize the findings of our SLR. Finally, we suggest avenues for research and discuss the limitations of our approach.

2 Conceptual Framework

Management research lacks clarity about and in-depth analyses of the use of AI algorithms. With neither a universally accepted AI definition (e.g., Collins et al. 2021; Wirtz et al. 2019) nor a mutually exclusive way to categorize AI algorithms (Samoili et al. 2020) in business, we must invent an appropriate structure to systematically evaluate the use of AI.

To derive a conceptual framework for our SLR, we use the European AI expert group’s schematic depiction of AI systems as a starting point, which describes that AI systems perceive information from their environment, process that information to allow decision-making, and react with some kind of action (The European Commission’s High Level Expert Group on Artificial Intelligence, 2018). Accordingly, an AI system comprises its environment, various forms of input, a processing unit, and various forms of output. We use these categories as a foundation to derive the dimensions of our conceptual framework. We divide an AI system’s environment into the dimensions of the industry sector and functional area. Regarding the input and output of an AI system, we introduce the dimensions of AI category and input data. We categorize the processing unit of an AI system according to the dimensions of learning behaviors, DL, xAI, and AI algorithms. The following sections elaborate on each dimension in detail. Our conceptual framework is illustrated in Fig. 1.

Fig. 1 Conceptual framework

AI category: While artificial general intelligence—the ultimate vision of AI research—is multi-versatile, today’s state-of-the-art AI applications are mainly specialized to fulfill one specific purpose (Haenlein and Kaplan 2019). We include the following differentiations in our conceptual framework’s dimension of “AI categories”: descriptive, predictive, prescriptive, and generative.

Descriptive AI refers to the analysis of data from the past to extract insights and detect hidden patterns (Roy et al. 2022). For example, clustering identifies groups of related data points without prior knowledge of relationships (Han et al. 2012; Sarker 2021). In addition to pure clustering tasks, topic modeling detects similarities of data points in the latent space and is therefore commonly used to identify the relevant content of text data (Blei 2012; Vayansky and Kumar 2020). Another text-based AI task is sentiment analysis, the computational evaluation of authors’ opinions and emotions toward other individuals, events, or topics (Feldman 2013; Medhat et al. 2014).

Predictive AI learns from past data to reveal unknown information or forecast future events (Roy et al. 2022). For example, classification assigns a discrete class label to a given input with an unknown class label (Han et al. 2012). Reminiscent of this definition, regression predicts a continuous variable based on a given input (Sarker 2021).
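The difference between the two predictive tasks can be made concrete with a minimal sketch. It uses a one-dimensional k-nearest-neighbors rule, which is one of the algorithms in the framework; the function names and the toy data are assumptions chosen for illustration and are not taken from any reviewed study. The same neighbors feed both tasks: classification returns the majority label, regression returns the mean of a continuous target.

```python
from collections import Counter


def nearest(train, x, k):
    """Return the k training pairs whose input value lies closest to x."""
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]


def knn_classify(train, x, k=3):
    """Classification: predict a discrete label by majority vote of the neighbors."""
    labels = [label for _, label in nearest(train, x, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, x, k=3):
    """Regression: predict a continuous value as the mean of the neighbors' targets."""
    targets = [target for _, target in nearest(train, x, k)]
    return sum(targets) / len(targets)


# Hypothetical toy data: (input value, label) and (input value, numeric target)
labeled = [(1.0, "low"), (2.0, "low"), (3.0, "low"),
           (8.0, "high"), (9.0, "high"), (10.0, "high")]
numeric = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0),
           (8.0, 80.0), (9.0, 90.0), (10.0, 100.0)]

print(knn_classify(labeled, 8.5))  # discrete class label
print(knn_regress(numeric, 2.0))   # continuous value
```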

Prescriptive AI builds on prediction to recommend the best action to achieve a desired objective or outcome (Roy et al. 2022). Among others, prescriptive systems are used for recommender systems (e.g., Luo et al. 2019; Wei et al. 2023), or healthcare applications (e.g., Wang et al. 2023; Yu et al. 2023).

While the previously discussed tasks extract knowledge from data, generative AI refers to the ability to create content from a learned probability distribution (Goodfellow et al. 2020). For example, large language models, such as GPT-4, can create text and, in combination with image-generation models, pictures from users’ descriptions.

Input data: In addition to numerical data, AI can process various other inputs, including text (e.g., Adamopoulos et al. 2018), images (e.g., Wang et al. 2022a, b; Zhang and Ram 2020), and sensor data (e.g., Tofangchi et al. 2021; Zhu et al. 2020). We add the dimension “input data” to our framework and distinguish between numerical, text, audio and signal, and image data.

The type of input data is closely related to different AI fields. Wang et al. (2021, p. 2) state that “AI branches out into closely related fields of computer vision, natural language processing, speech recognition, and machine learning”. Computer vision generally refers to a computer’s ability to perceive objects (Russell and Norvig 2010). Natural language processing (NLP) encompasses “systems and algorithms able to interact through human language” (Lauriola et al. 2022, p. 1). Speech recognition revolves around the conversion of “speech signal to a sequence of words” (Gaikwad et al. 2010, p. 16). Machine learning (ML) is a technique that enables “computer systems to improve with experience and data” (Goodfellow et al. 2016, p. 8). Text data is closely connected to NLP, image data to computer vision, and audio data to speech recognition.

We stress that none of these categorizations are mutually exclusive. First, speech recognition is broadly related to NLP. Second, ML, especially DL, plays a central role in various AI fields. For example, researchers frequently use ML algorithms in NLP (Lauriola et al. 2022; Li 2017), computer vision (Voulodimos et al. 2018), and speech recognition (Nassif et al. 2019).

Learning behavior: Currently, AI is mainly driven by ML algorithms (Jordan and Mitchell 2015). Goodfellow et al. (2016, p. 8) contend that “machine learning is the only viable approach to building AI systems that can operate in complicated, real-world environments”. Hence, we expect to see many ML applications in our SLR, and thus consider it necessary to make more granular distinctions between ML methods. Researchers generally categorize ML algorithms’ learning behavior into supervised, unsupervised, semi-supervised, and reinforcement learning (Jordan and Mitchell 2015; Mohammed et al. 2016).

We integrate the dimension of “learning behavior” into our framework and explain the categories as follows. Supervised learning algorithms perform tasks like classification and regression (Han et al. 2012; Sarker 2021) by processing labeled training data to infer mappings from an input to an output (Mahesh 2020; Mohammed et al. 2016). Unsupervised learning algorithms process unlabeled data to identify underlying patterns (Mohammed et al. 2016), as in clustering (Han et al. 2012; Sarker 2021). Obtaining labeled data can be a bottleneck in AI projects because it requires high manual effort and significant cost (Goodfellow et al. 2020; Roh et al. 2021). Therefore, semi-supervised learning, operating on partially labeled data sets (Han et al. 2012), aims to provide better results than unsupervised learning while requiring less labeled data than supervised methods (Sarker 2021). In contrast to the aforementioned closely related learning behaviors, reinforcement learning refers to agents that improve decision-making by adapting to feedback from the environment in which they operate (Ma and Sun 2020).

Deep learning: Major recent AI achievements and breakthroughs have been associated with DL (Jordan and Mitchell 2015). Deep learning is a particular type of ML (Goodfellow et al. 2016) typically based on artificial neural network architectures with several hidden layers (Ma and Sun 2020). We add the dimension of “deep learning” to our framework and distinguish whether DL is applied.

Explainable AI: With the evolution of AI towards ML and especially DL, which often yields black-box systems whose model predictions users cannot understand, there is a need for more transparency in AI systems (Confalonieri et al. 2021). Explainable AI addresses this challenge by integrating explanation mechanisms into AI models (Doran et al. 2017). Researchers have access to several well-known methods to enhance AI algorithm interpretability. For example, Shapley additive explanations (SHAP) outline the contributions of explanatory variables to a given output (Gramegna and Giudici 2021). Local interpretable model-agnostic explanations (LIME) use explainable, local models as proxies for black-box models (Gramegna and Giudici 2021). We include the dimension of “explainable AI” in our conceptual framework and distinguish whether xAI is addressed.
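To illustrate what a contribution of an explanatory variable means, the following minimal sketch computes exact Shapley values for a single prediction by averaging each feature's marginal contribution over all feature orderings. The scoring function `toy_model`, the feature names, and the all-zero baseline are assumptions made for illustration. Exhaustive enumeration is only feasible for a handful of features; practical SHAP implementations rely on approximations instead.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values of each feature for one prediction.

    Features that have not yet joined the coalition keep their baseline value.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in ordering:
            current[name] = instance[name]          # feature joins the coalition
            output = model(current)
            totals[name] += output - previous_output  # marginal contribution
            previous_output = output
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(x):
    """Hypothetical scoring function with an interaction term."""
    return 2.0 * x["reviews"] + 1.0 * x["rating"] + 0.5 * x["reviews"] * x["rating"]


instance = {"reviews": 4.0, "rating": 2.0}
baseline = {"reviews": 0.0, "rating": 0.0}
phi = shapley_values(toy_model, instance, baseline)
print(phi)
# The contributions add up to the difference between prediction and baseline output.
print(sum(phi.values()), toy_model(instance) - toy_model(baseline))
```

The interaction term is split evenly between the two features, which is why the contributions differ from the purely additive coefficients.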

AI algorithms: AI encompasses various algorithms. For our framework’s dimension “AI algorithms”, we initially consider a selection of algorithms and inductively include others if necessary. Explaining each algorithm is out of the scope of this article, so we refer to the literature for more detail. We include the supervised ML algorithms Bayesian learning, logistic regression (LR), k-nearest neighbors (kNN), support vector machines (SVM), decision trees (DT) and random forests (RF), boosting (for an overview, see Sarker 2021), and autoencoders (Goodfellow et al. 2016). For unsupervised ML algorithms, we include k-means (Sarker 2021) and latent Dirichlet allocation (LDA) (Blei et al. 2003). For DL, we consider the multilayer perceptron (MLP), also known as artificial neural network (Goodfellow et al. 2016), convolutional neural networks (CNN), recurrent neural networks (RNN), including long short-term memory cells (for an overview, see Sarker 2021), and generative adversarial networks (GAN) (Goodfellow et al. 2020). Lastly, we include reinforcement learning (Ma and Sun 2020) and (large) language models (Vaswani et al. 2017).

2.1 Industry sector

Padmanabhan et al. (2022) expect AI to evolve from a technique that companies at the forefront of innovation have developed and applied into a common tool for all organizations. However, innovation adoption speed may vary between industries (Lee and Xia 2006; Porter and Millar 1985). Some industries may lead in integrating AI algorithms or offer more potential for applied AI use cases. As research points to the path for practice, we ask whether the literature focuses on specific industry sectors that use AI algorithms. We use the dimension “industry sector” in our framework to evaluate the distribution of AI use cases in IS research. We use the North American Industry Classification System (NAICS), which is known for its cohesive industry distinction (Krishnan and Press 2003).

2.2 Functional area

The distribution of applied AI may vary not only between industries but also within a company. We integrate the dimension “functional area” into our framework to allocate AI use cases to a specific function in a company. We follow the functional area classification of Cannella et al. (2008), referencing Michel and Hambrick (1992) and Carpenter and Fredrickson (2001), consisting of Production-operations, R&D and Engineering, Accounting and Finance, Management and Administration, Marketing and Sales, Law, and Personnel and Labor Relations. Since we expect many studies to fall under an information technology-related background, we add a new category, IT.

3 Methodology

Thorough literature reviews represent a “trustworthy account of past research that other researchers might seek out for inspiration and use to position their own studies” (Templier and Paré 2015, p. 113). Therefore, a review must systematically identify relevant literature and be executed with methodological rigor, in line with its research objectives (Kitchenham et al. 2009; Rowe 2014; Templier and Paré 2015). To assure methodological rigor, we follow the framework proposed by Templier and Paré (2015) for guiding IS literature reviews. This framework consists of six steps (Templier and Paré 2015): (1) problem formulation, (2) literature search, (3) screening for inclusion, (4) quality assessment, (5) data extraction, and (6) analysis and synthesis. Table 1 illustrates our approach. The following sections detail each step.

Table 1 Review approach according to Templier and Paré (2015)

Step 1—Problem formulation: According to Rowe (2014), literature reviews should not only summarize previous work in the field of interest but also identify knowledge gaps to guide future research directions. Understanding a review’s objectives governs later decisions and matters for the outcome (Kitchenham and Charters 2007). We define the objectives of this study in three ways: (1) Its justification as a standalone literature review, (2) the SLR’s purpose, and (3) the outline of the intended analyses’ core constructs (Templier and Paré 2015). We refer to the introduction regarding the justification and purpose of our standalone review. The core construct of our analysis is to evaluate IS research’s use of AI algorithms in terms of the eight dimensions of our framework.

Step 2—Literature search: The literature search involves selecting sources, i.e., databases to search, and choosing search terms (Rowe 2014). To assemble our initial literature population, we searched the online databases EBSCOhost, Scopus, and Web of Science. The broad scope of online databases results in duplicates, but yields a more reliable selection of relevant work.

We restrict our search to leading IS journals, targeting the intersection of information technology and business research. We draw on the Association for Information Systems’ Senior Scholars’ List of Premier Journals with three modifications. We exclude the journal Decision Support Systems due to its technical scope and include Business and Information Systems Engineering (BISE) as well as Electronic Markets (EM). Both are well-cited journals outside of the list of premier journals and complement our literature corpus. Our final selection comprises the following twelve journals: BISE, EM, European Journal of Information Systems (EJIS), Information and Management (I&M), Information and Organization (I and O), Information Systems Journal (ISJ), Information Systems Research (ISR), Journal of the Association for Information Systems (JAIS), Journal of Information Technology (JIT), Journal of Management Information Systems (JMIS), Journal of Strategic Information Systems (JSIS), and Management Information Systems Quarterly (MISQ). In general, we agree with the critique by Larsen et al. (2019) that restricting a literature search to top journals might exclude relevant work. However, for our purpose, we argue that the selected journals cover the appropriate spectrum of applied AI algorithms we want to investigate, given their position at the intersection of computer science and management within IS.

We closely align the selection of our search terms to our research question (Kitchenham and Charters 2007). We want to extract articles that proactively use AI algorithms. To that end, our search terms include AI abbreviations, synonyms, closely related terms (e.g., ML), and, most importantly, a list of concrete AI algorithms. By dispensing with generic search terms like “robot” or “automation,” which occur in related work (e.g., Collins et al. 2021; Mariani et al. 2022), we propose an approach more focused toward applied AI. We use the following search terms: artificial intelligence, AI, machine learning, ML, deep learning, DL, *supervised learning, reinforcement learning, pattern recognition, neural network, decision tree, random forest, support vector, natural language processing, computer vision, machine vision, expert systems, speech recognition, predictive analytics, generative adversarial networks, generative artificial intelligence, large language models, transformer model, diffusion model, ChatGPT, GPT-4, and zero-shot learning.
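A term list like the one above is typically combined into a single OR-connected query. The following minimal sketch shows one way to do this. The helper `build_query`, the shortened term list, and the default field code `TITLE-ABS-KEY` (the Scopus convention for a title-abstract-keyword search) are assumptions for illustration; the exact query syntax differs between databases, and this is not the query string used in the study.

```python
# Shortened, illustrative subset of the search terms listed above
SEARCH_TERMS = [
    "artificial intelligence",
    "machine learning",
    "deep learning",
    "reinforcement learning",
    "neural network",
    "random forest",
    "large language models",
]


def build_query(terms, field="TITLE-ABS-KEY"):
    """Join search terms into one OR-connected boolean query string.

    Each term is wrapped in double quotes so that multi-word terms
    are searched as phrases.
    """
    quoted = [f'"{term}"' for term in terms]
    return f"{field}({' OR '.join(quoted)})"


query = build_query(SEARCH_TERMS)
print(query)
```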

With the defined set of search terms, we conducted a title-abstract-keyword search on the databases until March 2024 and received 1,299 hits: 347 on EBSCOhost, 377 on Scopus, and 575 on Web of Science.

Step 3—Screening for inclusion: According to Templier and Paré (2018), we define criteria to guide the decision of which studies to include in subsequent steps. We distinguish formal from informal criteria. Formal criteria include document type, language, and publication date. We restrict our results to peer-reviewed articles in English. In light of a surge of interest in AI following the AlphaGo milestone (Haenlein and Kaplan 2019; Zhang et al. 2022), we choose articles published from 2016 on. We have already applied the formal inclusion criteria in Step 2, leading to the initial literature population of 1,299 articles. After removing duplicates, 695 articles remain. Our informal inclusion criterion is that the content of a study has to do with applied AI. Eligible studies use at least one AI algorithm, either to solve the research question or for illustrative purposes. We use a hybrid approach to screen the literature population. First, we use OpenAI’s GPT-4 to screen the studies’ abstracts and recommend whether to include or exclude an article. Then, we manually confirm or revise the AI’s suggestion based on the abstract and full paper reading. After excluding all studies that lack a dedicated use of AI algorithms, the remaining population consists of 143 studies. Table 2 presents this literature population categorized according to the publishing journals.
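Removing duplicates across several database exports can be sketched as follows. The sketch keys each record on its DOI and falls back to a normalized title when no DOI is available. The record fields (`source`, `doi`, `title`) and the sample records, including the made-up DOIs, are hypothetical; the study does not describe its deduplication procedure in this detail.

```python
import re


def record_key(record):
    """Return a database-independent key: the DOI if present, else the normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return ("title", title)


def deduplicate(records):
    """Keep the first occurrence of each record across all database exports."""
    seen, unique = set(), []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical hits; the DOIs are invented placeholders
hits = [
    {"source": "Scopus", "doi": "10.1000/example.1", "title": "An Example Study"},
    {"source": "Web of Science", "doi": "10.1000/EXAMPLE.1", "title": "An example study"},
    {"source": "EBSCOhost", "doi": None, "title": "Another Example: A Study"},
    {"source": "Scopus", "doi": "", "title": "Another example - a study"},
]
unique_hits = deduplicate(hits)
print(len(hits), "->", len(unique_hits))
```

A record that carries a DOI in one export but not in another would not be matched by this simple rule, so a manual check of the remaining list is still advisable.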

Table 2 Literature population of 143 studies

Step 4—Quality assessment: Assessing quality refers to screening the research design and methods used in the literature population for methodological rigor (Templier and Paré 2015). However, this step is not required in every form of literature review. According to Rowe (2014), reviews in IS can be placed into four categories: describing, understanding, explaining, and theory-testing. While explaining and theory-testing reviews include a quality assessment, describing and understanding reviews do not (Templier and Paré 2018). Since we aim to describe and understand the use of AI algorithms in IS research, not to explain and theory-test, we exclude quality assessment from our approach.

Two other arguments support our decision. First, excluding articles that use AI (referring to Step 3) but may have methodological areas of improvement would bias our analysis. Second, due to the restriction to leading IS journals, we expect a high standard of methodological rigor. These journals are highly competitive, and publications are often iteratively refined in review. We confirm these high standards after the reading stage of the previous step (Screening for Inclusion).

Step 5—Data extraction: We apply a hybrid coding approach to extract relevant data for further investigation. First, using a customized GPT, we leverage generative AI to screen the articles of our literature population for the eight dimensions of our framework. Next, we manually review the recommendations to decide how to code each article. For the first step, we use OpenAI’s GPT-4 to build a custom GPT named “Lit Review Analyzer”. After initialization, we proceed with manual coding examples for studies in our literature population. As a result, the Lit Review Analyzer processes PDF articles to output a table of recommendations for each framework dimension, for example, suggesting the best-fitting industry sector or functional area. We provide details on the set-up of the custom GPT in Table 3.
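The two-stage logic of such a hybrid coding approach can be captured in a small data structure that stores the model's recommendation separately from the manually confirmed result. The class `CodingDecision`, its fields, and the example labels below are assumptions for illustration and do not reproduce the set-up of the custom GPT.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class CodingDecision:
    """One framework dimension of one article: model suggestion plus manual review."""
    article_id: str
    dimension: str
    suggested: FrozenSet[str]                   # labels recommended by the language model
    reviewed: Optional[FrozenSet[str]] = None   # labels set by a human coder; None = pending

    def final_labels(self) -> FrozenSet[str]:
        """Return the manually confirmed labels; refuse unreviewed suggestions."""
        if self.reviewed is None:
            raise ValueError(f"{self.article_id}/{self.dimension} not yet reviewed")
        return self.reviewed

    def was_revised(self) -> bool:
        """True if the human coder changed the model's suggestion."""
        return self.reviewed is not None and self.reviewed != self.suggested


# Hypothetical example: the coder adds a second input data type
decision = CodingDecision("article-001", "input data", frozenset({"text"}))
decision.reviewed = frozenset({"text", "image"})
print(sorted(decision.final_labels()), decision.was_revised())
```

Keeping both fields makes it possible to report afterwards how often the manual review overruled the model's recommendation.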

Table 3 Set-up of the custom GPT

EBSCOhost provides a NAICS-based industry sector classification for some articles. If available, we adopt this classification. Otherwise, we apply our hybrid coding procedure. For the dimension of AI algorithms, we differentiate between those that address the core research question and those used as a benchmark, baseline, or other form of comparison.

Importantly, the entries of each dimension of our framework are not mutually exclusive, so a study can be assigned to more than one in each dimension. For example, a study may be assigned to both text data and image data in the dimension of input data. To resolve ambiguity, we conducted discussions within our research team and iteratively refined our coding approach. The results of our coding scheme are summarized in Section 6 of this study.

Step 6—Analysis and synthesis: In this step, we investigate the absolute frequency and the rank of items within each of the eight dimensions.
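Because a study can carry several labels per dimension, absolute frequencies count assignments rather than studies. The following minimal sketch computes frequencies and ranks for one dimension. The function name, the sample codings, and the tie rule (labels with equal counts share a rank) are assumptions for illustration; the study does not state how ties are ranked.

```python
from collections import Counter


def frequencies_and_ranks(codings):
    """Count how often each label is assigned and rank labels by that count.

    `codings` maps an article id to the set of labels assigned in one dimension.
    Labels with equal counts share a rank (competition ranking: 1, 2, 2, 4).
    """
    counts = Counter(label for labels in codings.values() for label in labels)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    table, prev_count, prev_rank = [], None, 0
    for position, (label, count) in enumerate(ordered, start=1):
        rank = prev_rank if count == prev_count else position
        table.append((rank, label, count))
        prev_count, prev_rank = count, rank
    return table


# Hypothetical codings for the dimension "input data"
codings = {
    "article-001": {"text"},
    "article-002": {"text", "image"},
    "article-003": {"numerical"},
    "article-004": {"numerical", "text"},
}
for rank, label, count in frequencies_and_ranks(codings):
    print(rank, label, count)
```

In this example, four studies produce six assignments, which is why the sum of frequencies in a dimension can exceed the number of studies.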

4 Findings and Discussion

4.1 Literature Population

The literature population consists of 143 studies. Table 4 presents the distribution of the studies over the journals and years of publication. We highlight three observations. First, ISR, MISQ, and JMIS account for 65% of the articles in our literature population. Specifically, 93 of 143 articles appear in these journals, with 36, 32, and 25 publications for ISR, MISQ, and JMIS, respectively. Second, EJIS, I and O, ISJ, JIT, and JSIS barely publish studies that use AI algorithms, with three or fewer studies each. Third, we notice an increase in articles that use AI over time, with more than 100 articles published between 2020 and 2023.

Table 4 Literature population distribution over time and journals

The significant increase in publications in our literature population shows that AI algorithms have recently gained substantial traction in the field of IS. However, we think that the establishment of applied AI in IS has only just begun. We expect to see growing attention paid to AI algorithms in the future, further fueled by the introduction of generative AI.

4.2 Dimensions of the Conceptual Framework

4.2.1 Industry Sector

Within the reviewed literature, the leading industry sector is 51: Information. We count 60 articles assigned to the category, which equals a relative frequency (Footnote 4) of more than 35%. With 23 studies assigned to its category, 62: Health Care and Social Assistance is the second most frequent industry sector, followed by 44–45: Retail Trade (16), 52: Finance and Insurance (11), 92: Public Administration (8), 31–33: Manufacturing (6), and 48–49: Transportation and Warehousing (5). Other NAICS categories have two or fewer studies assigned. We also find 23 studies that cannot be assigned to any industry sector. Table 5 presents the distribution of our literature population over the industry sectors.

Table 5 Absolute frequencies and rank of industry sectors

We attribute the prevalence of the industry sector 51: Information to two main reasons. First, since IS research, by definition, investigates information management, we expect information to be particularly relevant. Second, many studies in our population address research questions regarding social media (e.g., Ghiassi et al. 2016; Guo et al. 2018), product reviews (e.g., Liu et al. 2021; Mejia et al. 2019), cybersecurity (e.g., Ebrahimi et al. 2020; Li et al. 2016), or mobile apps (e.g., He et al. 2019; Lee et al. 2020), which commonly relate to the industry sector of information.

We attribute the prevalence of healthcare-related publications in our literature population to research questions at the intersection of healthcare and IS. For example, several studies process sensor data of home observation systems that improve older adult care. Zhu et al. (2020) propose a deep transfer learning approach to meet the challenges of sensor-based home activities of daily living monitoring systems. In a later study, they use deep learning for a multiphase activities of daily living recognition model (Zhu et al. 2021). Other studies source data from social media to advance knowledge on health-related issues, such as research by Xie et al. (2021), who use deep learning to understand opioid use disorder. Mousavi et al. (2020) use ML to evaluate content quality on health-related question-and-answer forums.

The high prevalence of the industry sector 44–45: Retail Trade mainly relates to studies addressing e-commerce research questions. For example, Luo et al. (2019) investigate e-commerce cart targeting and “simulate the treatment effects for every combination of individual demographics and purchase history and, thus, provide e-retailers with an optimal targeting scheme” (Luo et al. 2019, p. 17). Another study uses artificial neural networks to “predict consumers’ future path-to-purchase journeys based on their historical omnichannel behaviors” (Sun et al. 2022, p. 429).

4.2.2 Functional Area

Regarding the functional areas, Marketing and Sales comes in first, with 39 studies assigned to it, followed by IT (15), Accounting and Finance (14), Management and Administration (14), Law (14), R&D and Engineering (10), Production-operations (8), and Personnel and Labor Relations (5). We also report 45 studies that cannot be assigned to any functional area. Table 6 presents the absolute frequencies and the rank of assigned functional areas.

Table 6 Absolute frequencies and rank of functional areas

We find two main reasons to explain the frequent assignment of studies in our literature population to the functional area of marketing and sales. First, we see a close link between marketing and sales and the studies assigned to the retail trade industry sector. Fifteen of 16 studies related to this industry sector are also associated with marketing and sales. Second, we identify several studies addressing research questions regarding consumer reviews, which strongly link to marketing and sales. For example, Kumar et al. (2018, p. 351) “propose a novel hierarchical supervised-learning approach to increase the likelihood of detecting anomalies” in consumer reviews. Gunarathne et al. (2022) use CNNs to identify business-to-customer bias in social media-based customer service. Heightened interest in generative AI and its strong base of potential use cases in marketing, e.g., the AI-assisted development of advertising slogans, indicates that AI will be used even more frequently in marketing in the future.

4.2.3 AI Category

Of the 157 articles assigned to the AI category, the subcategory of predictive AI leads with 106 articles, followed by descriptive (27), prescriptive (19), and generative AI (5). The prevalence of predictive AI fulfills our expectations, since classification and regression tasks, which use predictive algorithms, address a variety of problem domains.
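The relative weight of each subcategory follows directly from the counts reported above. The following minimal sketch only restates those counts as percentages; the variable names are our own and are not part of the original coding scheme.

```python
# Counts per AI subcategory as reported in the text above.
ai_category_counts = {
    "predictive": 106,
    "descriptive": 27,
    "prescriptive": 19,
    "generative": 5,
}

# Total number of assignments to the AI category dimension.
total = sum(ai_category_counts.values())

# Relative share of each subcategory in percent, rounded to one decimal.
shares = {
    name: round(100 * count / total, 1)
    for name, count in ai_category_counts.items()
}

for name, share in shares.items():
    print(f"{name:<12} {ai_category_counts[name]:>4} {share:>5}%")
```

On these numbers, predictive AI accounts for roughly two thirds of all assignments, while generative AI stays at about 3%.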

The use of descriptive AI in our literature population is closely related to text analysis (21 of 27 articles process text data) and topic modeling, such as LDA. Another common use case for descriptive AI is clustering, an effective method for finding patterns in unlabeled data that does not necessarily require AI. Hence, researchers may rely on legacy algorithms that do not qualify as AI to address clustering tasks. We see this as the reason for the few clustering AI algorithms in top IS research. We do not expect significant changes to this observation in the future.

In our literature population, prescriptive AI relates to ML tasks that process numerical data (14 of 19 articles) and DL algorithms (11 of 19 articles) that address recommendations. For example, He et al. (2019) propose a method for personalized mobile app recommendation and Liebman et al. (2019) develop a music playlist recommendation algorithm that integrates the emotional context of the listener.

Despite the rise of large language models in 2023, generative AI is underrepresented in our literature population, with only five studies assigned to this category. Importantly, generative AI has great potential for intensified use by future research. First, generative algorithms, comparable to semi-supervised and reinforcement learning, allow learning without requiring extensively labeled training data (Goodfellow et al. 2020). Second, generating content may reveal new use cases for business research and practice. To date, many applications of generative algorithms, especially GANs, focus on images (Creswell et al. 2018), but the application area may broaden to other use cases. For example, Krahe et al. (2020) use GANs to process 3D objects represented as point clouds for automated product design. With the increasing possibilities of transformers and large language models, we expect a sharp increase in generative AI use cases soon.

4.2.4 Input Data

Of the 179 total articles assigned to the input data dimension, 73 are assigned to the subcategory of numerical data, 75 to text data, 11 to audio and signal, and 11 to images. We also find nine studies we cannot assign to any of the subcategories of input data, five of which process graph-based data.

The prevalence of text data and the prevalence of NLP in IS research are linked, which reflects a significant interest in the research community. We attribute this to two main reasons. First, NLP can process unstructured data like text. This is a valuable information source for IS research, especially in the age of social media. Second, recent advancements in DL have increased the capabilities of NLP techniques and reinforced interest in ML research (Lauriola et al. 2022). In the literature population, 49 out of the 75 studies assigned to text data show the application of DL techniques. We see this trend confirmed by the recent developments of generative AI tools like ChatGPT that further boost NLP’s capabilities. For example, Drori and Te'eni (2023) discuss the use of large language models for peer-reviewing purposes in academia.

Conversely, the use of images and audio and signal data related to computer vision and speech recognition falls short in our SLR. However, computer vision, especially in the image recognition field, is used more frequently in other research communities, e.g., medicine (Esteva et al. 2021). For example, CNNs are applied for image-based cancer detection (e.g., Haenssle et al. 2018). Within our literature population, two studies in which image data is processed with AI address health-related issues (Pfeuffer et al. 2023; Zhang and Ram 2020). Another field that provides interesting use cases for computer vision is social media. Shin et al. (2020) use, among others, social media images to predict post popularity.

Therefore, we expect image processing to grow in future IS research based on two arguments. First, computer vision, especially image analysis, has reached high maturity. For example, CNNs achieve superhuman performance in image classification tasks (Haenssle et al. 2018). Second, our SLR shows that many IS publications that use AI algorithms tackle health care and social media-related research questions. Both of these fields embrace use cases related to images.

In the broader management research scope, audio and signal data processing draws less attention. Based on this observation, we make two main arguments. First, the input data format, such as voice recordings, is unusual for management research issues and impractical to handle. Second, even when voice recordings must be processed, researchers can use established software tools to create transcripts and then process them via NLP. Hence, IS researchers may see fewer opportunities to contribute to the field of speech recognition.

4.2.5 Learning Behavior

Goodfellow et al. (2020) state that most of today’s AI approaches are based on supervised ML. We confirm this with our SLR by counting, of 175 assignments in total, 112 to supervised learning, 10 to semi-supervised learning, 47 to unsupervised learning, and six to reinforcement learning.

Semi-supervised and reinforcement learning algorithms, rarer than supervised and unsupervised ones, allow learning with less labeled data or even without labeled data. For example, Ebrahimi et al. (2020) present a semi-supervised learning approach to identify cyber threats on dark net marketplaces. They address the fact that data obtained from web platforms is usually unlabeled, and manual labeling “requires expensive human labor and expertise” (Ebrahimi et al. 2020, p. 696). Liebman et al. (2019) entirely bypass the need for labeled training data. Based on real-time interaction with the listener, they use reinforcement learning to adapt music playlist generation, since personal music preferences can vary for the same person under different circumstances (Liebman et al. 2019). Given the maturity and corresponding achievements in research and practice of semi-supervised and reinforcement learning, we see them as areas with great potential for future IS and management research. Semi-supervised learning allows the use of ML even when labeled data are sparse. Reinforcement learning allows the integration of real-time feedback within an agent’s environment. Furthermore, it has significantly contributed to AI milestones, e.g., AlphaGo in 2016 (Haenlein and Kaplan 2019).

4.2.6 Deep Learning

We find that 92 of our 143 studies use advanced DL algorithms. Deep learning is associated with major breakthroughs in AI (Jordan and Mitchell 2015) in all branches, i.e., NLP (Lauriola et al. 2022; Li 2017), computer vision (Voulodimos et al. 2018), speech recognition (Nassif et al. 2019), and ML (Sarker 2021). The strong presence of DL in our literature population confirms that IS effectively uses the advantages of DL algorithms. Moreover, we see a large increase in DL studies in the 2020s. Among the studies of our SLR published between 2016 and 2019, approximately every fourth one uses DL, whereas more than 75% of the studies published in the 2020s do. We illustrate the relative share of DL within our literature population over time in Fig. 2. Based on this finding and since DL advances rapidly, embracing new use cases and opportunities, we expect the share of DL in AI-related IS research to stay high.

Fig. 2 Relative share of DL articles by time

4.2.7 Explainable AI

Our results show that 16 articles explicitly address xAI. Comparable to DL, xAI use increased within our literature population over the years: between 2016 and 2019, no study addresses xAI. With one article in 2020, two articles in 2021, six articles in 2022, and seven articles in 2023, the topic of xAI becomes increasingly relevant in more recent publications. Given the importance of transparency in AI recommendations, we expect the share of xAI-related articles to continue to grow.

4.2.8 AI Algorithms

Table 7 shows the absolute frequency of AI algorithms within the literature population. In the following, we highlight the most frequently used algorithms. Artificial neural networks are particularly present. Sixty-one studies use MLPs, 51 use CNNs, 59 use RNNs, and eight use GANs. Regarding legacy ML algorithms, 69 studies use SVMs, and 56 use either DTs or RFs or both. Latent Dirichlet allocation (LDA) is applied in 28 studies. While reviewing our literature population, we found 84 algorithms that are not included in our framework. Particularly present are graph-based learning algorithms and hidden Markov models, with nine and four articles, respectively. Due to the specialization of the remaining algorithms, we refrain from discussing them in more detail. We reference our coding results in the “6.” of this study, which provides details on the algorithms of our SLR.

Table 7 Absolute frequencies and rank of algorithms

As in the section on input data, the distribution of AI algorithms applied in the reviewed literature supports previous findings. For example, the prevalence of artificial neural networks, i.e., MLPs, CNNs, RNNs, and GANs, reinforces the prevalence of DL in IS research, since these algorithms all belong to the field of DL.

Besides advanced DL algorithms, we find legacy ML algorithms widely applied in the reviewed literature, especially for benchmarking. For example, SVM, the most frequently applied algorithm in our study, was introduced in the early 1990s (Zoppis et al. 2019), DTs in the 1980s (Russell and Norvig 2010), and LDA in the early 2000s (Blei et al. 2003).

We have already outlined the potential of generative and reinforcement methods in previous sections. The sparse applications of GANs and reinforcement learning algorithms we found support our impression that generative and reinforcement algorithms are underrepresented in IS research. This is in spite of the current cultural and media attention paid to generative AI and its underlying language models. However, we expect a steep increase in research on generative AI over the next few years.

5 Outlook, Contribution, and Limitations

5.1 Avenues for Research

Computer vision, semi-supervised learning, reinforcement learning, and generative algorithms fall short in the reviewed studies, although they show great potential for IS. Computer vision is a branch of AI that deals with a computer’s ability to perceive objects. Especially in the field of image recognition, the literature and use cases offer established algorithms and examples of high-performance implementation. Because of the unstructured data represented in images, researchers using traditional approaches often disregard images as information sources. We encourage IS researchers to benefit from the capabilities of computer vision algorithms. For example, social media is a thriving research topic in IS, offering access to large amounts of content-rich image data. Researchers may use computer vision to address research questions primarily based on image information. They could also enrich other approaches with additional information gained through image analysis. Convolutional neural networks represent the most common DL image recognition algorithm and offer a well-described starting point (e.g., Voulodimos et al. 2018) for future research with several sample IS use cases (e.g., Shin et al. 2020; Zhang and Ram 2020).

Semi-supervised learning makes it possible to process partially labeled data sets. Thus, it provides a workaround if large amounts of labeled training data are unavailable or too costly. Hence, we see potential for semi-supervised learning algorithms to advance IS research. For example, many researchers in our SLR investigate data from social media or the internet. Both offer nearly inexhaustible amounts of unlabeled data. Semi-supervised learning may help address research gaps in these fields by processing data with ML while keeping the underlying labeling effort manageable. Researchers may consider examples of semi-supervised learning in our literature population as a starting point for future research (e.g., Ebrahimi et al. 2020; Zhang and Ram 2020).
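To make the idea of learning from partially labeled data concrete, the following minimal sketch shows self-training, one common semi-supervised scheme. It is a generic illustration and not the method of any study in our literature population. The one-dimensional toy data, the nearest-centroid classifier, the `margin` threshold, and all function names are assumptions chosen for brevity.

```python
def compute_centroids(labeled):
    """Mean feature value per class (labels 0 and 1) from (x, label) pairs."""
    centroids = {}
    for label in (0, 1):
        values = [x for x, lab in labeled if lab == label]
        centroids[label] = sum(values) / len(values)
    return centroids


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Self-training with a nearest-centroid classifier.

    In each round, unlabeled points whose distances to the two centroids
    differ by at least `margin` receive a pseudo-label and join the
    labeled set; ambiguous points stay unlabeled.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        centroids = compute_centroids(labeled)
        confident, uncertain = [], []
        for x in remaining:
            d0 = abs(x - centroids[0])
            d1 = abs(x - centroids[1])
            if abs(d0 - d1) >= margin:
                confident.append((x, 0 if d0 < d1 else 1))
            else:
                uncertain.append(x)
        if not confident:  # nothing new to learn from, stop early
            break
        labeled.extend(confident)
        remaining = uncertain
    return compute_centroids(labeled), labeled, remaining


# Two manually labeled points and seven unlabeled ones (toy data).
seed_labels = [(1.0, 0), (9.0, 1)]
unlabeled_points = [0.5, 1.5, 2.0, 8.0, 8.5, 9.5, 5.2]

centroids, pseudo_labeled, still_unlabeled = self_train(seed_labels, unlabeled_points)
print(centroids, still_unlabeled)
```

In this toy run, two hand-labeled points are enough to pseudo-label six further points, while the ambiguous point near the middle is left unlabeled rather than guessed. The labeling effort stays at two items, which mirrors the cost argument made above.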

Instead of learning from historical data, reinforcement learning adapts to real-time feedback from an agent’s environment. As previously noted, we attribute the potential of reinforcement learning to past achievements that prove the capabilities of this methodology. We find many potential use cases for reinforcement learning in management research areas. These offer qualified environments for feedback-based adaptation. For example, many studies in our SLR address research questions in the fields of e-commerce and social media. These may offer opportunities to receive feedback from buyers or users. With the example of Liebman et al. (2019) in our SLR, we encourage researchers to pursue opportunities to leverage the advantages of reinforcement learning in future studies.
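As an illustration of feedback-based adaptation, the sketch below implements an epsilon-greedy multi-armed bandit, one of the simplest reinforcement learning settings. It is not the algorithm of Liebman et al. (2019). The three options, their hidden approval probabilities, and the simulated user feedback are hypothetical and serve only to show how an agent can learn from interaction without labeled historical data.

```python
import random


def epsilon_greedy(true_like_prob, steps=5000, epsilon=0.1, seed=0):
    """Learn which option users prefer purely from simulated feedback.

    true_like_prob holds the hidden probability (assumed for this demo)
    that a user approves of each option; the agent never reads it
    directly and only observes 0/1 rewards.
    """
    rng = random.Random(seed)
    n_options = len(true_like_prob)
    counts = [0] * n_options        # how often each option was shown
    estimates = [0.0] * n_options   # running mean reward per option
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random option.
            option = rng.randrange(n_options)
        else:
            # Exploit: show the option with the best current estimate.
            option = max(range(n_options), key=lambda a: estimates[a])
        # Simulated real-time feedback from the environment.
        reward = 1.0 if rng.random() < true_like_prob[option] else 0.0
        counts[option] += 1
        # Incremental update of the running mean.
        estimates[option] += (reward - estimates[option]) / counts[option]
    return estimates, counts


estimates, counts = epsilon_greedy([0.2, 0.5, 0.8])
best_option = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_option, [round(e, 2) for e in estimates], counts)
```

The agent starts without any knowledge and shifts toward the option that earns the most approval. The `epsilon` parameter controls how often it keeps testing alternatives, which matters when preferences may change over time.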

Lastly, generative algorithms may help researchers explore new use cases and research questions. Generative algorithms embrace creativity within AI and give high-quality outputs if trained accordingly. For management research use cases, generative algorithms may create marketing and social media content or simulate cyber-attacks to test and improve cybersecurity. One of the most renowned generative algorithms is GAN, explicated by Goodfellow et al. (2020), and represented in our literature population (Ebrahimi et al. 2022; Golovianko et al. 2022). Moreover, since language models have gained so much attention as an underlying technique for the generative AI breakthrough in 2023, we expect them to represent many AI use cases in research going forward.

5.2 Contribution

We contribute to the literature in several ways. First, we conduct an SLR to analyze the application areas, methods, and algorithms of applied AI in leading IS journals. Our work extends existing literature and improves the transparency of the opaque landscape of applied AI. For every article we review, we include our coding methodology and results on both an aggregate and a granular level to shed light on the use of AI in IS. In doing so, we aim to provide a knowledge base for the overall dynamics as well as the specific characteristics of applied AI on a more detailed level. For example, our results allow researchers to analyze how the use of AI algorithms relates to certain journals, publication dates, industry sectors, or functional areas. Furthermore, the overview of the applied AI landscape in IS elucidates the convergence of the areas of computer science and management research within IS.

Second, we introduce a conceptual framework of eight dimensions that disaggregates the evolving research domain of applied AI, improving tangibility and providing a practicable structure. We use the framework within the field of IS, but its structure is transferrable to other management research domains. Considering the absence of a concise AI definition and taxonomy in business (Collins et al. 2021; Samoili et al. 2020; Wirtz et al. 2019), our framework may guide future analyses of AI in management research and stimulate theory building (Rowe 2014; Templier and Paré 2015).

Third, we use IS literature to inform the research community about recent developments in the AI landscape and provide insights that may be transferrable to the broader management research domain. We identify topics that have gained increasing attention, e.g., xAI, as well as underrepresented AI algorithms that hold potential for future research, such as generative algorithms and reinforcement learning. Hence, our study enables scholars to identify gaps in the field of applied AI and to further extend and strengthen its use in IS and the broader management research domain.

5.3 Limitations

As with any study, there are some limitations to our review. First, we limit our literature search to a selection of leading IS journals. While we believe that the literature selection suits our purpose, this restriction may exclude relevant work.

Second, we structure the review according to the framework we have introduced here, which increases transparency and tangibility in using AI in IS research. However, we stress that our conceptual framework is not mutually exclusive or collectively exhaustive. Other possibilities for structuring an SLR to address our research question may exist.

Third, the SLR’s spectrum of application areas, methods, and algorithms appears broad. Future studies could focus on specific dimensions of our conceptual framework or specific AI algorithms to evaluate findings even more granularly.

Fourth, we consider AI a highly dynamic and continuously evolving field of research. We see this in the drastic shift of attention towards AI algorithms after the introduction of ChatGPT. We expect significant changes in the kind of algorithms used in research and practice in the future. Therefore, the findings of this SLR must be continuously monitored and updated to allow for a real-time perspective on this research field in flux.

6 Conclusion

This study conducts an SLR to answer the research question “What is the current state of applied AI algorithms in IS, and how can extant research be synthesized?”. To break down the research question for analysis, we propose a conceptual framework to increase transparency about the application areas, methods, and algorithms of applied AI in IS research. We analyze 143 articles from leading IS journals, finding that the number of AI publications per year, and the relative share of articles per year addressing DL and xAI, significantly increase within our SLR’s observation period. The use of AI falls into distinct topic clusters regarding the functional area and industry sector. Supervised algorithms used for predictive tasks on numerical or text data dominate our literature population. Furthermore, our SLR shows that only a few studies use computer vision, semi-supervised learning, reinforcement learning, and generative algorithms. These algorithm types have potential for use cases in IS and management research in general. Therefore, we identify them as underrepresented and suggest that they offer promising fields for research avenues.