1 Introduction

Social coding platforms, such as GitHub, have changed the collaborative nature of open-source software development by integrating mechanisms such as issue reporting and pull requests into distributed version control tools (Dabbish et al. 2012; Gousios et al. 2014). This pull-based development workflow offers new opportunities for community engagement but increases the workload for repository maintainers, who need to communicate, review code, deal with contributor license agreement issues, explain project guidelines, run tests, and merge pull requests (Gousios et al. 2016).

To reduce this intensive workload, developers often rely on automation tools to perform repetitive tasks, such as checking whether the code builds, the tests pass, and the contribution conforms to a defined style guide (Kavaler et al. 2019). For example, GitHub projects adopt tools for Continuous Integration and Continuous Delivery or Deployment (CI/CD) (Zhao et al. 2017; Cassee et al. 2020) and for code review (Kavaler et al. 2019; Wessel et al. 2020). In recent years, development bots have been widely adopted to automate predefined tasks around pull requests (Wessel et al. 2018). By automating part of the workflow, developers expect to increase both productivity and quality (Vasilescu et al. 2015).

To further support automation, GitHub recently introduced GitHub Actions (Footnote 1); the feature was made available to the public in November 2019. GitHub Actions allow the automation of tasks based on various triggers (e.g., commits, pull requests, issues, comments, etc.) and can be easily shared between repositories, automating aspects of how developers build, test, and deploy software projects.

However, little is known about how adopting GitHub Actions affects project activities and what challenges it imposes on the development process. In this paper, we aim to understand how software developers use GitHub Actions to automate their workflows and how the dynamics of pull requests of GitHub projects change following the adoption of GitHub Actions.

To achieve our goal, we address the following research questions:

RQ1: How do OSS projects use GitHub Actions?

We aim to understand how commonly repositories use GitHub Actions and for what purposes. As a result of this analysis, we found that a considerable number of active repositories (1,489 out of 5,000 repositories) adopted GitHub Actions. This is a dramatic change when compared to the early adoption of GitHub Actions (Kinsman et al. 2021), when only 0.7% of the studied repositories had adopted it. Actions are spread across 20 categories, including utilities, continuous integration, code quality, and deployment.

RQ2: How is the use of GitHub Actions discussed by developers?

To gain insight into how developers perceive GitHub Actions, we manually analyzed a set of discussion threads and developer conversations on Discord that mention GitHub Actions. We found distinct categories of discussions related to Actions, including help requests, the potential of using them, issues reproducing output with Actions, and plans to use GitHub Actions.

RQ3: What is the impact of GitHub Actions?

In this research question, we investigate whether project activity indicators, such as the number of pull requests, comments, commits, and time to close pull requests change after GitHub Actions adoption. We used a Regression Discontinuity Design (RDD) (Thistlethwaite and Campbell 1960) to model the effect of Action adoption across 662 projects that had adopted GitHub Actions for at least 12 months. Our findings also suggest that the activity indicators change in opposite directions for accepted and rejected pull requests (PRs). Fewer pull requests are being accepted after adopting GitHub Actions, and these pull requests usually have more comments and fewer commits. In contrast, there are more rejected pull requests, with fewer comments and more commits.

figure d

As Actions are diverse and might perform a diverse range of tasks on GitHub repositories, we also investigated whether the impact of GitHub Actions differs across Action categories. The literature recommends employing a segmented analysis to further explain the general findings from statistical models (Wessel et al. 2022). In this research question, as in RQ3, we used a Regression Discontinuity Design model to measure the impact of adoption in project indicators across the four most popular Action categories: Utilities, Continuous Integration, Code Quality, and Deployment. Results obtained in the segmented analysis were similar to the overall results (from RQ3), except for code quality Actions, which led to fewer rejected pull requests.

The main contributions of this paper are:

  1. Characterization of the usage of GitHub Actions.

  2. An understanding of how developers discuss GitHub Actions.

  3. An understanding of how GitHub Actions’ adoption impacts project activities.

This paper extends our prior work (Kinsman et al. 2021), published at MSR 2021 (The Mining Software Repositories Conference), along two major dimensions: the data used and the analyses performed. The data used in this paper broadens our previous work in three major dimensions: time (24 vs. 12 months after GitHub Actions were introduced), the number of unique Actions (973 vs. 708), and the dataset of projects used (5,000 most popular GitHub projects vs. RepoReapers dataset). In this extension, we also added RQ4 and included new regression discontinuity design analyses split by Action categories.

2 Workflow Automation with GitHub Actions

GitHub Actions is an event-driven API that the GitHub platform provides to automate development workflows. GitHub Actions can run a series of commands after a specified event has occurred. An event is a specific activity that triggers a workflow run, as shown in Fig. 1. For example, a workflow is triggered when a pull request is created for a repository or when a pull request is merged into the main branch. Workflows are defined in the .github/workflows/ directory and use YAML syntax, having either a .yml or .yaml file extension.

Fig. 1 GitHub workflow automation with GitHub Actions (adapted from GitHub)

A workflow can contain one or more Actions. GitHub allows developers to build reusable components, called Actions. Developers create Docker and JavaScript Actions, and both require a metadata file to define the Action’s inputs, outputs, and entry point.

After the successful execution of a workflow, the outputs can be displayed in different ways, such as through a GitHub Action bot. Like many other bots on GitHub, this bot is implemented as a GitHub user that can submit code contributions, interact through comments, and merge or close pull requests (Wessel and Steinmacher 2020).

As an example of GitHub Actions adoption, consider the case of the project Gammapy (Footnote 2), an open-source Python package for gamma-ray astronomy. On 13 November 2019, the Gammapy community adopted a GitHub Action called First Interaction (Footnote 3), which is responsible for identifying and welcoming newcomers when they create their first issue or open their first pull request on a project. As shown in Fig. 2a, Gammapy created a workflow called Greeting that both new pull requests and issues might trigger, as defined by the on keyword. The output of the First Interaction Action is displayed through an issue/pull request comment posted by the GitHub Action bot when a new contributor authors a new pull request or issue. An example of this Action interaction on a GitHub issue is shown in Fig. 2b.

Fig. 2 Example of First Interaction Action on Gammapy project

Development bots and workflows that rely on GitHub Actions are already used in hundreds of thousands of repositories, justifying the need for further studies on these automation mechanisms’ evolution and impact on collaborative software development practices. Recently, developers published GitHub Actions variants for many well-known bots (e.g., Coveralls, Codecov, Snyk), and these Actions are rapidly increasing in popularity (Golzadeh et al. 2020).

3 Related Work

Previous work has investigated a variety of automation tools, including development bots, continuous integration/delivery, and GitHub Actions.

3.1 Development Bots

Development bots have been proposed to automate technical and social aspects of software development activities (Lin et al. 2016), such as communication and decision-making (Storey and Zagalsky 2016). For example, on GitHub, bots are often integrated into the pull request workflow (Erlenhov et al. 2019) to perform a variety of tasks, including repairing bugs (Monperrus 2019), refactoring source code (Wyrich and Bogner 2019), recommending tools to help developers (Brown and Parnin 2019), and updating outdated dependencies (Mirhosseini and Parnin 2017). Wessel et al. (2018) identified 13 categories of development bots. van Tonder and Le Goues (2019) believe development bots are a promising addition to a developer’s toolkit as they bridge the gap between human software development and automated processes.

However, understanding the impact of development bots on human developers’ interactions is a major challenge. Storey and Zagalsky (2016) highlight that the way that development bots interact on pull requests can be disruptive and perceived as unwelcoming. Wessel et al. (2021) identified several challenges caused by bots in pull requests and theorized how human developers perceive annoying bot behaviors as noise on social coding platforms. Wessel et al. (2020, 2022) also found that adopting code review bots changes team dynamics, for example, by increasing the number of monthly merged pull requests and decreasing communication among developers.

3.2 Continuous Integration and Continuous Delivery

Continuous Integration and Continuous Delivery (CI/CD) tools aim to bridge development and operation activities by automating the building, testing, and deployment of applications (Duvall et al. 2007). These tools constantly compile incremental code changes made by developers, build software deliverables, run automated tests and verifications, and deploy applications to servers, improving software quality and productivity (Duvall et al. 2007). Vasilescu et al. (2015) show that using CI leads to more pull requests being processed, and thus more pull requests being accepted or rejected. In the context of Computer Science education, Hu and Gehringer (2019) set up a continuous integration service on GitHub to provide feedback to students about code style and functionality. Prior work has also investigated the impact of CI and code review tools on GitHub projects (Zhao et al. 2017; Kavaler et al. 2019; Cassee et al. 2020) across time. While Zhao et al. (2017) and Cassee et al. (2020) focused on the impact of the Travis CI tool’s introduction in development practices, Kavaler et al. (2019) examined the impact of linters, dependency managers, and coverage reporter tools. A survey by Chen et al. (2001) reports that of the hundreds of billions of dollars spent on developer wages, up to 25% accounts for fixing bugs. Continuous integration and other automation tools thus hold huge potential to further reduce human effort and costs by automatically fixing bugs.

3.3 GitHub Actions

GitHub Actions offer built-in support to automate parts of the software development workflows that exceed what CI/CD tools can achieve. Golzadeh et al. (2022) showed that, in 18 months of existence, GitHub Actions had become the dominant CI service, covering more than half of all repositories with a CI. Software projects are still adjusting GitHub Actions to their dynamics. Valenzuela-Toledo and Bergel (2022) found 11 reasons for changing the GitHub Actions’ workflow. Saroar and Nayebi (2023) conducted a survey to understand the motivations and best practices in using, developing, and debugging GitHub Actions. Calefato et al. (2022) identified a set of practices for using GitHub Actions in projects related to machine learning-enabled systems. In a broader view, Decan et al. (2022) found that the reuse of Actions is a common practice. Researchers are also starting to provide academic tools via GitHub Actions to facilitate the integration with real projects. For example, Cordeiro et al. (2021) offer a GitHub Action for detecting flakiness in time-constrained tests. Finally, in prior work (Kinsman et al. 2021), we investigated how developers use GitHub Actions and how several activity indicators change after their adoption. We explain how this paper extends our prior work in Section 1. Chen et al. (2021) also extended our prior work, finding that 22% of popular projects adopt GitHub Actions and that adoption correlates with project popularity and number of contributors and varies per programming language. They also found that after adopting GitHub Actions, the number of commits, number of pull requests, issue latency, and pull request latency tend to decrease, while the number of issues closed tends to increase.

4 Research Design

This study aims to understand GitHub Actions usage and its effects on GitHub projects. To achieve our goal, we employed a mixed-methods approach combining a time series analysis on a sample of open-source repositories and a qualitative analysis of developers’ impressions about GitHub Actions. We present our study design, data collection, and analysis procedures in the following.

4.1 Selecting Projects

We assembled a dataset of GitHub open-source projects that adopted GitHub Actions at some point in their history. We started by selecting the 5,000 most-starred GitHub repositories. We used stars as a proxy for popularity. We then filtered this dataset to keep open-source software projects that had adopted at least one Action during their lifetime. To identify these projects, we retrieved data from the GitHub API using a Ruby toolkit called Octokit.rb (Footnote 4). To determine whether a project used any Actions, we verified whether its repository contained YAML files in the .github/workflows directory. This filtered dataset comprised 1,489 projects.
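The workflow-file check described above can be illustrated with a minimal Python sketch. It is not the Octokit.rb code used in the study: the function name `uses_github_actions` and the idea of passing in a flat list of repository-relative paths are assumptions made for illustration only.

```python
from pathlib import PurePosixPath

# Location and extensions of workflow files, as described in Section 2.
WORKFLOW_DIR = ".github/workflows"
WORKFLOW_SUFFIXES = {".yml", ".yaml"}


def uses_github_actions(repo_paths):
    """Return True if any path is a YAML file directly inside .github/workflows.

    `repo_paths` is a hypothetical list of repository-relative file paths,
    e.g. as obtained from a repository tree listing.
    """
    for raw in repo_paths:
        path = PurePosixPath(raw)
        in_workflow_dir = str(path.parent) == WORKFLOW_DIR
        is_yaml = path.suffix.lower() in WORKFLOW_SUFFIXES
        if in_workflow_dir and is_yaml:
            return True
    return False
```

Under this sketch, a repository that only contains a `.travis.yml` file at its root would not be counted, since the file lies outside the workflow directory.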

4.2 Analyzing the Use of GitHub Actions

First, we collected and quantitatively analyzed the number of projects using GitHub Actions and the number of GitHub Actions per project (RQ1). We also automatically analyzed the workflow files of the studied projects, searching for the category, description, and whether GitHub verified the Action. We determined the Actions used within a workflow by extracting entries of the form ‘uses: ACTION@VERSION’. For example, in ‘uses: actions/first-interaction@v1’ the First Interaction Action (Footnote 5) was identified and extracted. In the case of multiple Actions in a single workflow, all of them were identified.
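As a rough illustration of this extraction step, the following Python sketch pulls every ‘uses: ACTION@VERSION’ entry out of a workflow file with a regular expression. The pattern, the function name `extract_actions`, and the sample workflow text are assumptions for illustration; the sample is not Gammapy's actual workflow file, and the study's own extraction script may differ.

```python
import re

# Matches a "uses:" key at the start of a line, optionally preceded by a
# YAML list dash, and captures the reference up to whitespace, quote, or "#".
USES_PATTERN = re.compile(
    r"""^[ \t]*(?:-[ \t]*)?uses:[ \t]*['"]?([^\s'"#]+)""",
    re.MULTILINE,
)

# Hypothetical workflow content used only to exercise the function.
SAMPLE_WORKFLOW = """\
name: Greetings
on: [pull_request, issues]
jobs:
  greeting:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/first-interaction@v1
"""


def extract_actions(workflow_text):
    """Return a list of (action, version) pairs found in a workflow file."""
    actions = []
    for reference in USES_PATTERN.findall(workflow_text):
        name, _, version = reference.partition("@")
        actions.append((name, version or None))
    return actions
```

A plain regular expression is used here instead of a YAML parser to keep the sketch dependency-free; a parser would be more robust against unusual formatting.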

4.3 Categorizing GitHub Actions Discussions

To answer RQ2, we manually investigated how GitHub Actions were discussed in project-specific channels, including GitHub Discussions (Hata et al. 2022) and Discord chats (Subash et al. 2022).

Filtering GitHub Discussions and Discord chats. We started by investigating the GitHub Discussions on our selected projects. Out of the 5,000 repositories in our dataset, 897 (18%) had the Discussions feature enabled at the time of data collection, and 830 (17%) contained at least one Discussion thread. These 830 repositories account for 88,443 Discussion threads (minimum: 1, median: 22, maximum: 10,129), containing 326,033 posts. To complement our analysis, we also considered developers’ conversations on Discord, as they may use other communication channels to discuss GitHub Actions. For this analysis, we used the DISCO dataset (Subash et al. 2022). This dataset consists of one year of public conversations on Discord from four software development communities (Python, Go, Clojure, and Racket).

Aiming for high precision rather than recall, we applied a strict filter to these GitHub Discussion posts and chat excerpts and selected only those with the exact string “GitHub Action” (case insensitive). We avoided searching for strings like “.github/workflows/” and “workflow”, which tend to generate many false positives. An exploratory analysis of the DISCO dataset showed that strings like “.github/workflows/” are rarely mentioned, and “workflow” mostly appears in unrelated contexts.
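The strict filter amounts to a case-insensitive substring test, sketched below in Python. The function names and the representation of posts as plain strings are assumptions for illustration.

```python
KEYWORD = "github action"  # exact phrase, compared case-insensitively


def mentions_github_action(text):
    """Return True if the text contains the phrase "GitHub Action" in any casing."""
    return KEYWORD in text.lower()


def filter_posts(posts):
    """Keep only the posts (plain strings here) that mention the phrase."""
    return [post for post in posts if mentions_github_action(post)]
```

Because the test is a substring match, the plural "GitHub Actions" is also retained, while posts that only mention "workflow" or a workflow path are dropped.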

This filtering step resulted in (i) 573 posts originating from 458 threads in 148 different repositories and (ii) 40 excerpts from two distinct communities (34 and 6 excerpts from Python and Go, respectively).

Qualitative analysis. We applied qualitative coding to the 458 threads to understand how developers discuss GitHub Actions. One author developed a preliminary coding schema based on a random sample of 20 threads, which was refined through discussions with all authors. Two authors then independently coded another set of 20 threads and measured inter-rater agreement. Based on achieving an ‘almost perfect’ agreement (Cohen’s \(\kappa = 0.939\) (McHugh 2012)) and resolving disagreements through discussion, the same two authors divided the remaining threads equally among them and completed the annotation of all 458 threads. We also applied qualitative coding to the 40 chat excerpts from the DISCO dataset. Two authors then independently coded all chat excerpts based on the defined code schema and measured inter-rater agreement (Cohen’s \(\kappa = 1\)). Section 5.2 reports the coding schema and the detailed results for both Discussion threads and Discord conversations.
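For readers unfamiliar with the agreement measure, the following Python sketch computes Cohen's kappa for two raters who each assign one label per item. The function name and the example labels in the usage note are hypothetical; the study's own computation may have used a different tool.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Compute Cohen's kappa for two equally long sequences of labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

For instance, with rater A assigning `["help", "help", "plan", "plan"]` and rater B assigning `["help", "plan", "plan", "plan"]`, the observed agreement is 0.75, the chance agreement is 0.5, and kappa is 0.5.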

4.4 Time Series Analysis

We conducted a time series analysis to answer RQ3 and RQ4. We collected longitudinal data for different outcome variables and treated the adoption of GitHub Actions by each project in our dataset as an “intervention”. This way, we could align all the time series of project-level outcome variables on the intervention date and compare their trends before and after adopting GitHub Actions. The following subsections detail the steps involved, from aggregating the project variables to running the statistical models.

4.4.1 Aggregating Project Variables

We gathered Action data within an observation period of 12 months before and 12 months after the Action adoption within each project. Similar to previous work (Zhao et al. 2017; Wessel et al. 2020; Cassee et al. 2020; Kinsman et al. 2021), we exclude 30 days around the Action adoption date to avoid the influence of the instability caused during this period. Afterward, we aggregated individual pull request data into monthly periods, considering 12 months before and after the Action introduction. Next, we checked the activity level of the candidate projects, since many projects on GitHub are inactive (Gousios et al. 2014). Our data set comprises 662 active projects that had been using at least one GitHub Action for 12 months.

We focused on the same pull request-related variables as in previous work (Wessel et al. 2020; Kinsman et al. 2021):

Merged/non-merged pull requests: the number of monthly contributions (pull requests) that have been merged (accepted) or closed but not merged (rejected) into the project, computed over all closed pull requests in each time frame.

Comments on merged/non-merged pull requests: the median number of monthly comments computed over all merged and non-merged pull requests in each time frame.

Commits of merged/non-merged pull requests: the median of monthly commits computed over all merged and non-merged pull requests in each time frame.

Time to merge/time to close pull requests: the median of monthly pull request latency (in hours), computed as the difference between the time when the pull request was closed and the time when it was opened. The median is computed using all merged and non-merged pull requests in each time frame.
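The monthly aggregation can be sketched in Python as follows. Two details are assumptions that the text does not specify: the 30-day exclusion window is split into 15 days on each side of the adoption date, and a "month" is approximated as a 30-day bucket. The function names and the `(opened_at, closed_at)` tuple format are likewise hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median

EXCLUDED_HALF_WINDOW = timedelta(days=15)  # assumption: 15 days on each side
MONTH = timedelta(days=30)                 # assumption: 30-day buckets
MAX_MONTHS = 12                            # 12 months before and after adoption


def month_index(closed_at, adoption):
    """Map a close date to a bucket in -12..-1 or 1..12, or None if excluded."""
    delta = closed_at - adoption
    if abs(delta) < EXCLUDED_HALF_WINDOW:
        return None  # inside the unstable period around adoption
    if delta > timedelta(0):
        index = (delta - EXCLUDED_HALF_WINDOW) // MONTH + 1
    else:
        index = -((-delta - EXCLUDED_HALF_WINDOW) // MONTH + 1)
    return index if abs(index) <= MAX_MONTHS else None


def monthly_median_hours_to_close(pull_requests, adoption):
    """Return {month index: median latency in hours} for closed pull requests.

    `pull_requests` is an iterable of (opened_at, closed_at) datetime pairs.
    """
    buckets = {}
    for opened_at, closed_at in pull_requests:
        index = month_index(closed_at, adoption)
        if index is None:
            continue
        hours = (closed_at - opened_at).total_seconds() / 3600
        buckets.setdefault(index, []).append(hours)
    return {index: median(values) for index, values in sorted(buckets.items())}
```

The same bucketing could be reused for the other outcome variables by replacing the latency with the number of comments or commits per pull request.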

Based on previous work (Cassee et al. 2020; Zhao et al. 2017; Wessel et al. 2020; Kinsman et al. 2021), we also collected six known covariates for each project:

Project name: the name of the project to which the pull request belongs. This name is used to uniquely identify the project on GitHub.

Programming language: the primary project programming language, as automatically provided by GitHub.

Time since the first pull request: in months, computed since the earliest recorded pull request in the project’s history. We use this variable to capture the project’s maturity regarding its use of pull requests.

Total number of pull request authors: we count how many contributors submitted pull requests to the project as a proxy for the community size of a project.

Total number of commits: we compute the total number of commits as a proxy for the activity level of a project.

Number of pull requests opened: the number of monthly contributions (pull requests) received in each time frame. We expect that projects with a high number of contributions also observe a high number of comments, latency, commits, and merged and non-merged contributions.

4.4.2 Statistical Approach

We modeled the effect of GitHub Action adoption over time across GitHub repositories using a Regression Discontinuity Design (RDD) (Thistlethwaite and Campbell 1960; Imbens and Lemieux 2008), following the work of Wessel et al. (2020). RDD is a technique used to model the extent of a discontinuity at the moment of intervention and long after the intervention. The technique assumes that if the intervention does not affect the outcome, there would be no discontinuity, and the outcome would be continuous over time (Cook and Campbell 1979). The statistical model behind RDD is

$$\begin{aligned} y_{i} =&\, \alpha + \beta \cdot \text{time}_{i} + \gamma \cdot \text{intervention}_{i} \\ &+ \delta \cdot \text{time\_after\_intervention}_{i} + \eta \cdot \text{controls}_{i} + \varepsilon_{i} \end{aligned}$$

where i indicates the observations for a given project.

To model the passage of time as well as the GitHub Action introduction, we rely on three variables: time, time after intervention, and intervention. The time variable is measured as months at the time j from the start to the end of our observation period for each project.

The intervention variable is a binary value used to indicate whether the time j occurs before (\({\textit{intervention}}=0\)) or after (\({\textit{intervention}}=1\)) the adoption event. The time_after_intervention variable counts the number of months at time j since the Action adoption, and the variable is set to 0 before adoption. The \({\textit{controls}}_{i}\) variables allow us to attribute changes to Action adoption rather than to confounding factors that influence the dependent variables. For observations before the intervention, holding controls constant, the resulting regression line has a slope of \(\beta \), and after the intervention \(\beta +\delta \). The size of the intervention effect is measured as the difference, equal to \(\gamma \), between the two regression values of \(y_{i}\) at the moment of the intervention.
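To make the three time variables concrete, the Python sketch below builds them for one project observed 12 months before and 12 months after adoption, and evaluates the fixed part of the model without controls. The coefficient values in the usage note are invented, and the choice to start time_after_intervention at 1 in the first post-adoption month is an assumption.

```python
def build_time_variables(months_before=12, months_after=12):
    """Return one row per month with the three RDD time variables."""
    rows = []
    for time in range(1, months_before + months_after + 1):
        after = time > months_before
        rows.append({
            "time": time,                                   # 1..24
            "intervention": 1 if after else 0,              # 0 before, 1 after
            # Assumption: counts 1, 2, ... from the first month after adoption.
            "time_after_intervention": time - months_before if after else 0,
        })
    return rows


def predict(row, alpha, beta, gamma, delta):
    """Evaluate the model's fixed part for one row, ignoring controls and noise."""
    return (alpha
            + beta * row["time"]
            + gamma * row["intervention"]
            + delta * row["time_after_intervention"])
```

With illustrative coefficients alpha = 10, beta = 0.5, gamma = -2, and delta = 0.25, consecutive months before adoption differ by 0.5 (the slope beta), while consecutive months after adoption differ by 0.75 (the slope beta + delta).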

Considering that in RQ3 we are interested in the effects of GitHub Actions on the monthly trend of the number of pull requests, number of comments, number of commits, and time to close for both merged and non-merged pull requests, we fitted eight models (4 variables \(\times \) 2 cases). In RQ4, we measured the impact of adoption for the same variables across the four most popular Action categories in our filtered dataset: utilities, continuous integration, code quality, and deployment. We selected projects that have adopted one or more of the four categories. In cases where a project employs multiple Actions, the project is considered in the analysis of multiple Action categories. Therefore, we fitted thirty-two models (4 variables \(\times \) 2 cases \(\times \) 4 categories).

To balance false positives and false negatives, we report the corrected p-values after applying multiple corrections using the method of Benjamini and Hochberg (1995). We implemented the RDD models as a mixed-effects linear regression using the R package lmerTest (Kuznetsova et al. 2017). We modeled project name and programming language as random effects (Gałecki and Burzykowski 2013) to capture project-to-project and language-to-language variability (Zhao et al. 2017). We evaluate the model fit using marginal \((R^2_m)\) and conditional \((R^2_c)\) scores, as described by Nakagawa and Schielzeth (2013). The \(R^2_m\) can be interpreted as the variance explained by the fixed effects alone, and \(R^2_c\) as the variance explained by the fixed and random effects together.
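The Benjamini and Hochberg adjustment can be sketched in a few lines of Python. This is an illustrative reimplementation, not the R code used in the study, and the p-values in the usage note are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

For the raw p-values `[0.01, 0.04, 0.03, 0.20]`, the adjusted values are approximately `[0.04, 0.0533, 0.0533, 0.20]`, so only the first comparison would remain below the 0.05 threshold.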

In mixed-effects regression, the variables used to model the intervention and the other fixed effects are aggregated across all projects, resulting in coefficients useful for interpretation. The interpretation of these regression coefficients supports the discussion of the intervention and its effects, if any. Thus, we report the significant coefficients (\(p < 0.05\)) in the regression and their variance, obtained using ANOVA. In addition, we log transform the fixed effects and dependent variables that have high variance (Sheather 2009). We also account for multicollinearity, excluding any fixed effects for which the variance inflation factor (VIF) is higher than 5 (Sheather 2009).

5 Results

In the following, we report the results of our study per research question.

5.1 How do OSS Projects use GitHub Actions? (RQ1)

Analyzing the set of 5,000 repositories, we identified 1,489 (29.8%) open-source software projects that had adopted at least one GitHub Action at the time of our data collection. As the box plot in Fig. 3 shows, many of these repositories adopt more than one Action, with a median value of four and a maximum of 46.

Fig. 3 Number of Actions per repository (log scale)

In these repositories, we found 973 distinct predefined GitHub Actions. We collected data from each Action’s repository and the GitHub Marketplace (Footnote 6) page to categorize these GitHub Actions. If published in the marketplace, an Action is classified into 1–2 categories by the publisher. Table 1 presents the categorization of GitHub Actions we found. Note that the percentages do not add up to 100, since about half of the GitHub Actions are assigned to two categories, a primary and a secondary.

Table 1 Categorization of GitHub Actions found in our sample

The five most frequent categories of GitHub Actions are:

Utilities: GitHub Actions created to automate diverse steps of the development workflow on the GitHub platform, often in support of other GitHub Actions. For example, the Read Properties Action inspects Java .properties files looking for predefined properties. Another example of a utility Action is Replace string, which replaces strings that match predefined regular expressions.

Continuous integration: GitHub Actions responsible for running the CI pipeline and notifying contributors of test failures in CI tools (e.g., Retry Step, Chef Delivery).

Deployment: GitHub Actions designed to build and deploy the application upon request. One example is the Action called Jekyll Deploy, responsible for building and deploying the Jekyll site to GitHub Pages.

Publishing: GitHub Actions responsible for automatically publishing packages to the registry. For example, Action For Semantic Release is an Action that leverages semantic-release to fully automate the package release workflow, determining the next version number, generating the release notes, and publishing the package.

Code quality: GitHub Actions that analyze source code (e.g., code style, code coverage, code quality, and smells) submitted through pull requests and give feedback to developers via GitHub checks or comments.

In addition, we found that 42 (5.93%) out of 973 GitHub Actions are verified by GitHub. Creators are verified if they have an existing relationship with GitHub, and GitHub works closely with the creator to generate these GitHub Actions.

Table 2 Most-used GitHub Actions across repositories

Table 2 shows the ten most popular GitHub Actions. The most popular one, actions/checkout, is used by the vast majority (97%) of repositories that have adopted at least one GitHub Action. The five most popular GitHub Actions are the following:

actions/checkout: A verified utility Action that checks out a repository under $GITHUB_WORKSPACE, so that a workflow can access the repository for further workflow tasks.

actions/cache: A verified utility and dependency management Action that allows caching dependencies and building outputs to improve workflow execution time.

actions/setup-node: A verified utility Action that sets up a Node.js environment for use in a workflow, allowing users to specify a Node.js version.

actions/upload-artifact: A verified utility Action that uploads artifacts from a workflow, allowing developers to share data between jobs and store data once a workflow is complete.

actions/setup-python: A verified utility Action that sets up a Python environment for use in a workflow, allowing the use of Python features and commands.

Out of 5,000 GitHub repositories, 1,489 (29.8%) adopted the GitHub Actions feature, with a median of four GitHub Actions used per repository. We found 973 unique predefined GitHub Actions being used within the workflows. These GitHub Actions are spread across 27 categories. The most recurrent ones are utilities, continuous integration, and deployment. Comparison to our previous work: In our previous work, we found that only 0.7% of repositories considered in our analysis had adopted GitHub Actions. This number has changed dramatically, with GitHub Actions now having found much more widespread adoption.


5.2 How is the use of GitHub Actions Discussed by Developers? (RQ2)

We categorized 458 GitHub Discussion threads and 40 developers’ conversation excerpts containing the phrase “GitHub Action”. Table 3 shows an overview of this categorization, indicating how many threads and excerpts we found in each category. We present the categories in the following.

Table 3 Categorization of Discussion threads and developers’ conversations on Discord

Help wanted in the context of GitHub Actions (no error message): The largest group of Discussion threads that mention GitHub Actions concerns requests for help in the context of the feature. We distinguish requests for help that mention a specific error message and are primarily aimed at soliciting help in debugging from those that are less specific. Conversations that do not provide a specific error message might ask for help in configuring a particular Action or mention that automation is not working as intended.

Marginal mention of GitHub Actions: While all threads and chat excerpts in our dataset contain the phrase ‘GitHub Action,’ the feature is not the main topic of all such conversations. In some cases, GitHub Actions is mentioned as part of a long discussion thread announcing a release where GitHub Actions only affected a small number of features. In other cases, GitHub Actions are only mentioned several months after the threads were started, and they are only marginally related to the thread topic.

Error/debug message in the context of GitHub Actions: Complementing the first category discussed above (Help wanted in the context of GitHub Actions), Error/debug message in the context of GitHub Actions contains discussions that start with a specific error or warning message and ask for help. In most cases, the error or warning has been provided verbatim by the developer starting the discussion. Errors can come from the GitHub Actions feature itself or from the various applications, such as linters or code review bots, that are invoked via GitHub Actions.

Potential of using GitHub Actions: Since GitHub Actions is still a relatively new feature, not all developers are aware of it. This category captures discussions in which developers suggest the use of GitHub Actions to address a specific task, e.g., “alternatively, the JIRA issue transitions at both PR creation and merge can be accomplished using GitHub Actions listening to those events” (Footnote 7) or “you could use the Vercel CLI directly as part of a GitHub Action (or similar) to deploy when releasing” (Footnote 8).

Issue reproducing output with GitHub Actions: In many cases, the goal of using a GitHub Action is to automate a process otherwise conducted manually (or using a different tool). Discrepancies can occur when developers struggle to reproduce results they achieved with the help of a GitHub Action, e.g., “This only happens with builds in GitHub Actions and I am unable to reproduce this locally”.Footnote 9

Plan to use GitHub Actions: Compared to the large number of GitHub issues dedicated to discussing projects’ migration plans to GitHub Actions, which we identified in our previous work, we found a smaller number of such discussion threads in this work, likely because the GitHub Actions feature is more established now. An example of such a discussion thread is “Migrating from Azure Pipelines to GitHub Actions”,Footnote 10 a thread that discusses the pros and cons of migration as well as how to implement it for a specific project.

Non-English thread: A small number of discussion threads in our dataset were not in English.

Other: Three of the discussion threads in our dataset did not fit any of the above categories and were assigned to the ‘Other’ category. An example is a discussion thread on GitHub’s docs projectFootnote 11 about how to structure documentation about GitHub Actions.

Discussion threads and chat excerpts that mention GitHub Actions predominantly focus on requests for help in the context of the feature, with or without concrete error messages. A smaller group of discussions concerns plans for using the feature or debating its potential. Comparison to our previous work: A couple of years after the data collection for our previous work, in which we analyzed GitHub issues about GitHub Actions (since GitHub Discussions did not yet exist), we now find fewer discussions about the potential of GitHub Actions and more discussions about specific issues, such as errors and discrepancies.


5.3 What is the Impact of GitHub Actions? (RQ3)

To answer this question, we investigated the effects of GitHub Action adoption on project activities along four dimensions: (i) merged and non-merged pull requests, (ii) human conversation, (iii) efficiency to close pull requests, and (iv) modification effort. We start by investigating how Action adoption impacts the number of merged and non-merged pull requests. We fit two mixed-effect RDD models, as described in Section 4.4.2. For these models, the number of merged/non-merged pull requests per month is the dependent variable. Table 4 summarizes the results of these models. In addition to the model coefficients, the table also shows the sum of squares, with variance explained for each variable. We also highlighted the time series predictors time, time after intervention, and intervention in bold.
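The three time series predictors can be illustrated with a minimal sketch. It assumes the usual RDD encoding for an observation window centered on the adoption date: time counts the months in the window, intervention is a 0/1 indicator of adoption, and time after intervention counts the months elapsed since adoption (0 before). The function name, the row type, and the window lengths are hypothetical and are not taken from the study's scripts.

```python
from typing import List, NamedTuple


class MonthRow(NamedTuple):
    time: int                     # month index within the observation window (1-based)
    intervention: int             # 0 before adoption, 1 after adoption
    time_after_intervention: int  # months elapsed since adoption, 0 before adoption


def rdd_predictors(months_before: int, months_after: int) -> List[MonthRow]:
    """Build the time series predictors for one project (hypothetical helper)."""
    if months_before < 0 or months_after < 0:
        raise ValueError("window lengths must be non-negative")
    rows = []
    for t in range(1, months_before + months_after + 1):
        adopted = t > months_before
        rows.append(
            MonthRow(
                time=t,
                intervention=int(adopted),
                time_after_intervention=t - months_before if adopted else 0,
            )
        )
    return rows


# Example with an assumed window of 3 months before and 2 months after adoption.
for row in rdd_predictors(3, 2):
    print(row)
```

Under this encoding, the coefficient of time captures the trend before adoption, the coefficient of intervention captures the discontinuity at adoption, and the trend after adoption is the sum of the coefficients of time and time after intervention.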

Table 4 The Effects of GitHub Actions on PRs. The response is log(number of merged/non-merged PRs) per month

Analyzing the model for merged pull requests, we found that the fixed-effects part fits the data well (\(R^2_m=0.87\)). However, considering \(R^2_c=0.93\), variability also appears from project-to-project and language-to-language. Among the fixed effects, we note that the number of monthly pull requests explains most of the variability in the model, indicating that projects receiving more contributions tend to have more merged pull requests, with other variables held constant. Regarding the Action effects, there is a discontinuity at adoption time, followed by a statistically significant decrease after the introduction.
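As a reading aid, the two statistics can be written out under the commonly used variance-decomposition definition for mixed-effects models. This is an assumption about the estimator, and the symbols are introduced here only for illustration: \(\sigma^2_f\) is the variance explained by the fixed effects, \(\sigma^2_\alpha\) the variance of the random effects (project and language), and \(\sigma^2_\varepsilon\) the residual variance.

\[
R^2_m = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon},
\qquad
R^2_c = \frac{\sigma^2_f + \sigma^2_\alpha}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon}
\]

Under this definition, the difference \(R^2_c - R^2_m = 0.93 - 0.87 = 0.06\) is the share of variance attributable to the random effects in the merged pull requests model.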

Similar to the previous model, the fixed-effect part of the non-merged pull requests model fits the data well (\(R^2_m=0.71\)), even though a considerable amount of variability is explained by random effects (\(R^2_c=0.82\)). We note similar results on fixed effects: projects receiving more contributions tend to have more non-merged pull requests. In addition, pull requests receiving more comments tend to be rejected. The effect of Action adoption on the non-merged pull requests differs from the previous model. Regarding the time series predictors, the negative trend in the number of non-merged pull requests before the Action adoption is reversed, toward an increase after adoption.
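Because the response in these models is log-transformed, a coefficient is easier to read once converted to a relative change. The sketch below assumes a natural logarithm and a one-unit change in an untransformed predictor, such as the intervention indicator. If an offset was added before taking the log, or if the predictor itself is log-transformed, the conversion is only approximate. The function name and the example value are hypothetical and do not come from the tables.

```python
import math


def percent_change(beta: float) -> float:
    """Percent change in the response implied by a coefficient on a natural-log response."""
    return (math.exp(beta) - 1.0) * 100.0


# Hypothetical coefficient, for illustration only (not a value reported in the paper).
example_beta = -0.05
print(f"{percent_change(example_beta):.2f}% change in the monthly count")
```

A negative coefficient thus corresponds to a multiplicative decrease in the monthly count, and a positive one to a multiplicative increase.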

Table 5 The Effects of GitHub Actions on Pull Request Comments. The response is log(median of comments) per month

To investigate the effects of Action adoption on pull request communication, we fit one model to merged pull requests and another to non-merged ones. The median of pull request comments per month is the dependent variable. Table 5 shows the results of the fitted models. Considering the model of comments on merged pull requests, we found that the combined fixed-and-random effects (\(R^2_c=0.58\)) fit the data better than the fixed effects (\(R^2_m=0.30\)), showing that most of the explained variability in the data is associated with project-to-project and language-to-language variability, rather than the fixed effects. We also observe that the time to close pull requests explains the largest amount of variability in the model, indicating that the communication during the pull request review is strongly associated with the time to merge it. Regarding the Action effects, we note no statistically significant trend before adoption; a discontinuity at the adoption time; and an apparent increase in the time trend after adoption.

Turning to the model of comments on non-merged pull requests, the model fits the data well (\(R^2_m=0.56\)), and variability is explained by the random variables (\(R^2_c=0.69\)). This model also suggests that communication during the pull request review is strongly associated with the time to reject the pull request. Table 5 shows a discontinuity at adoption time, followed by a statistically significant decrease after Action adoption.

Table 6 The effects of GitHub Actions on time to close PRs. The response is log(median of time to close PRs) per month

We fitted two RDD models where the median time to close pull requests per month is the dependent variable. The results are shown in Table 6. Analyzing the results of the effect of GitHub Actions on the latency to merge pull requests, we found that combined fixed-and-random effects fit the data better than the fixed effects. Although several variables affect the trends of pull request latency, communication during the pull requests is responsible for most of the variability in the data. This matches expectations: the more effort contributors expend discussing the contribution, the more time the contribution takes to merge. The number of commits also explains part of the variability, since a project with many changes needs more time to review and merge them. We observe a discontinuity at adoption time, followed by a statistically significant decrease after the introduction of GitHub Actions.

Turning to the model of non-merged pull requests, we note that it fits the data well (\(R^2_m=0.50\)), and variability is explained by the random variables (\(R^2_c=0.61\)). As above, communication during the pull requests is responsible for most of the variability encountered in the results. Similar to the previous model, none of the Action-related predictors have statistically significant effects on the time to reject pull requests. We observe an increasing trend before adoption, followed by a statistically significant discontinuity at adoption. After adoption, however, there is no effect on the time to reject pull requests, since the time after intervention coefficient is not statistically significant.

Table 7 The Effects of GitHub Actions on Pull Request Commits. The response is log(median of commits) per month

Finally, we studied whether Action adoption affects the number of commits made before and during the pull request review. Again, we fitted two models for merged and non-merged pull requests, where the median of pull request commits per month is the dependent variable. The results are shown in Table 7. Analyzing the model of commits on merged pull requests, we found that the combined fixed-and-random effects (\(R^2_c=0.60\)) fit the data better than the fixed effects (\(R^2_m=0.37\)). The statistical significance of all Action-related coefficients indicates that the adoption of GitHub Actions affected the number of commits. We note a statistically significant discontinuity at adoption time, followed by a decreasing trend after adoption. Additionally, we can also observe that the number of pull request comments and the number of contributions per month explains most of the variability in the result. This result suggests that the more comments and pull requests there are, the more commits there will be.

Investigating the results of the non-merged pull request model, we also found that the combined fixed-and-random effects fit the data better than the fixed effects. Similar to the previous model, the number of pull request comments per month explains most of the results’ variability. Regarding the time series predictors, the model did not detect any discontinuity at adoption time. However, the negative trend in the median of commits before the Action adoption is reversed, toward an increase after adoption.

After adopting GitHub Actions, on average, there are fewer accepted pull requests, with more discussion comments and fewer commits, which take more time to merge. On the other hand, there are more rejected pull requests, which contain fewer comments and more commits. Comparison to our previous work: We confirm the results from our previous work. We have already shown that GitHub Actions increase the number of rejected pull requests and decrease the number of commits on merged pull requests.


5.4 How Does the Impact of GitHub Actions Differ Across Action Categories? (RQ4)

To investigate the effects of GitHub Action adoption on project activities across the four most used Action categories in our dataset, we fit thirty-two mixed-effect RDD models, as described in Section 4.4.2. We considered the same activity indicators studied in the previous research question: (i) merged and non-merged pull requests, (ii) human conversation, (iii) efficiency to close pull requests, and (iv) modification effort.
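The count of thirty-two models follows from crossing the four Action categories with the four activity indicators and the two pull request outcomes. The short sketch below enumerates that grid. The category names are those discussed in this section, while the indicator labels and the function name are shorthand introduced here for illustration.

```python
from itertools import product

# Four most used Action categories named in this section.
CATEGORIES = ["utilities", "continuous integration", "deployment", "code quality"]
# Shorthand labels (assumed) for the four activity indicators.
INDICATORS = [
    "number of pull requests",
    "median comments",
    "median time to close",
    "median commits",
]
OUTCOMES = ["merged", "non-merged"]


def model_grid():
    """Return one (category, indicator, outcome) tuple per fitted model."""
    return list(product(CATEGORIES, INDICATORS, OUTCOMES))


print(len(model_grid()))  # 4 categories x 4 indicators x 2 outcomes
```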

Table 8 The Effects of GitHub Actions on Merged Pull Requests. The response is log(number of merged PRs) per month
Table 9 The Effects of GitHub Actions on Non-merged Pull Requests. The response is log(number of non-merged PRs) per month

We fitted four RDD models, one for each of the Action categories, where the number of merged pull requests per month is the dependent variable. The results are shown in Table 8. The statistical significance of the time series predictors for utilities indicates that the adoption of GitHub Actions of this category affected the trend in the number of merged pull requests. In addition, we fitted four RDD models where the number of non-merged pull requests per month is the dependent variable (see Table 9). In the model of code quality GitHub Actions, although the model did not detect any discontinuity at adoption time, the positive trend in the number of rejected pull requests before Action adoption is reversed toward a decrease after adoption. Considering the other categories, the Action-related predictors do not have statistically significant effects, meaning the trend in the number of merged and non-merged pull requests is stationary over time and remains unaffected by the Action adoption.

Table 10 The Effects of GitHub Actions on Comments of Merged Pull Requests. The response is log(number of comments on merged PRs) per month
Table 11 The Effects of GitHub Actions on Comments of Non-merged Pull Requests. The response is log(number of comments on non-merged PRs) per month

Analyzing the models of human discussions where the median of comments per month in merged pull requests is the dependent variable (see Table 10), we found that the introduction of utility GitHub Actions increases the discussions by developers on merged pull requests. There is a discontinuity at adoption time, followed by a statistically significant decrease after the utilities’ introduction. Turning to the models where the median of comments per month in rejected pull requests is the dependent variable (see Table 11), we found that utilities, CI, and deployment GitHub Actions decreased the number of comments on rejected pull requests.

Table 12 The effects of GitHub Actions on the time to merge pull requests. The response is log(median of time to merge PRs) per month
Table 13 The effects of GitHub Actions on the time to close pull requests. The response is log(median of time to close PRs) per month
Table 14 The Effects of GitHub Actions on Commits of Merged Pull Requests. The response is log(number of commits on merged PRs) per month
Table 15 The Effects of GitHub Actions on Commits of Non-merged Pull Requests. The response is log(number of commits on non-merged PRs) per month
Table 16 Segmented analysis comparison (whole sample vs. different categories)

Segmenting the analysis for specific categories, we found that the number of comments in rejected pull requests statistically decreases in 3 out of 4 categories as well as in the whole sample, as can be observed in Tables 11, 12, 13, 14, 15, and 16. Besides this indicator, the Utilities category, which contains the largest number of GitHub Actions, resembles the whole sample and also showed statistical differences in accepted pull requests (decreased), comments in accepted pull requests (increased), commits in accepted pull requests (decreased), and commits in rejected pull requests (increased). In the Code Quality category, the only indicator for which we observed a statistically significant change is the number of rejected pull requests (decreased), which is in the opposite direction of the whole sample. We conjecture that Code Quality GitHub Actions help contributors improve the quality of pull requests that would otherwise be rejected and, thus, the number of rejected pull requests in such repositories tends to decrease after the introduction of the Action.

Analyzing the four most used types of GitHub Actions, we found that the number of comments in rejected pull requests consistently decreased across categories (3 out of 4). Several other indicators also changed after the adoption of GitHub Actions from the Utilities category: accepted pull requests (decreased), comments in accepted pull requests (increased), commits in accepted pull requests (decreased), and commits in rejected pull requests (increased). In the Code Quality category, the only indicator that changed is the number of rejected pull requests (decreased).


6 Discussion

This section discusses our results and the key implications for practitioners, researchers, and educators.

Automation in Software Engineering. The rise of GitHub Actions is evidence of the importance of automation in software engineering. OSS project maintainers, who are often busy with coding and community-building activities, can save a lot of time by using GitHub Actions to automate repetitive tasks such as replacing strings and running the integration pipeline. Automation not only saves time but also avoids human errors and provides consistency in completed tasks (Storey and Zagalsky 2016). The multiple benefits of automation can help explain the widespread adoption of GitHub Actions. Indeed, we have seen an increase from 0.7% to circa 30% in the adoption of GitHub Actions since we conducted our prior work (Kinsman et al. 2021). This result is in line with the studies conducted by Decan et al. (2022) and Chen et al. (2021), who found GitHub Actions in 43.9% and 22% of their samples of projects, respectively. We also found a large number of projects discussing using GitHub Actions. Given this impetus to automation, other software engineering tools and platforms should consider offering automation capabilities or integration endpoints and APIs so that the variety of tools used in software development can be integrated into larger and more complex workflows. Our results show that projects have a median of four GitHub Actions, and we expect this number to grow as more tools are integrated into the workflow pipelines. The power of automating tasks with GitHub Actions can also be explored in other contexts. For example, software engineering educators can use GitHub Actions to build automation tools to better support their assignments, including those related to contributing to OSS (Pinto et al. 2017). GitHub Actions can also automate multiple, still unexplored aspects of code quality checking (Aniche et al. 2016; dos Santos and Gerosa 2018).

Problems may arise from the integration of distinct automation tools. We identified almost 1,000 distinct GitHub Actions in the repositories, and projects often use more than one GitHub Action in their repositories (median number of four and a maximum of 46 in a single repository). Wessel et al. (2021) showed that the use of multiple automation tools may cause noise and inconsistencies. As some GitHub Actions provide limited configuration options and are hard to change, researchers and practitioners should find ways to seamlessly integrate such tools in their repositories. A promising approach is the use of meta-bots to integrate and moderate the interactions of multiple bots (Wessel et al. 2022). Such meta-bots can be responsible for mediating the communication between the tools and the environment. Another approach is the adoption of process execution languages, such as BPEL and BPMN (Ouyang et al. 2006), to allow end users to describe their workflow and how information moves among the activities, which may include manual and automated tasks. Future work can investigate how to facilitate such end-user programming to build complex workflows and automation scenarios. Approaches such as orchestrations and choreographies (Leite et al. 2013) can also be investigated in this context. Future work can also investigate the interplay of GitHub Actions and other automation tools, such as development bots (Wessel et al. 2018). It is still not clear when each platform should be used and how the interoperability problems should be addressed.

CI/CD is one of the most automated parts of the workflow. Our results are in line with Golzadeh et al. (2022), who showed that GitHub Actions are replacing other continuous integration platforms. Almost one-fourth of the GitHub Actions we found are categorized as continuous integration and many other categories of actions are closely related to continuous integration or continuous delivery, including deployment, publishing, testing, etc. The popularity of these types of GitHub Actions can be explained by the popularity of CI/CD automation tools themselves. The literature has shown that these tools streamline the review of external contributions (Cassee et al. 2020). Hilton (2016) showed that projects can process more outside contributions after the adoption of CI without any change in code quality. With less time spent reviewing external pull requests, maintainers can focus on improving other aspects of the development workflow. Given the availability and widespread use of GitHub Actions for CI/CD, projects considering automating this part of the workflow should consider adopting GitHub Actions. Projects that use existing tools should become aware that they may need to migrate to GitHub Actions at some point.

GitHub Actions are still not optimal. When looking for references to GitHub Actions in the projects, the most common type of message we found was requests for help. Developers were soliciting help in configuring a particular GitHub Action or mentioning that automation was not working as intended. Projects should be aware that, as often occurs with novel technologies or features, GitHub Actions can introduce unforeseen problems. Projects should be prepared to assist developers in debugging and configuring GitHub Actions they adopt. Our results also reveal that the use of GitHub Actions sometimes makes debugging more difficult, since developers cannot reproduce locally issues related to GitHub Actions.

Project activity changes with the introduction of GitHub Actions. The adoption of new technology can bring unanticipated consequences to group behavior (Healy 2012). According to Mulder (2013), many effects are not directly caused by the new technology itself but by the changes in human behavior that it provokes. For example, with the automation of repetitive tasks, human developers can focus on other tasks, which may help explain some of the changes we observed after the adoption of GitHub Actions. Our results suggest that the introduction of GitHub Actions causes changes in several activity indicators. In particular, we noted fewer accepted pull requests, with fewer commits and more communication, and more rejected pull requests, with fewer comments and more commits. GitHub Actions can also introduce a secondary evaluation step to the pull request. Especially at the beginning of the adoption, the number of commits may increase due to the need to meet all requirements imposed by the GitHub Actions. Our results may also imply possible negative consequences. GitHub Actions may change the discussion patterns in the project. Utility actions, for example, may lead developers to discuss more. Thus, practitioners, who may already handle a high volume of messages in their repositories, must be aware that introducing some actions may increase the number of messages even more. Additional effort is also necessary to investigate the impact on newcomers, who already face a variety of barriers (Balali et al. 2018; Steinmacher et al. 2015) and may suffer from the disturbance in communication. For newcomers, interacting with GitHub Actions can be inconvenient, leading developers to lose motivation or even abandon their contributions. Similar effects have been observed when newcomers interact with other automation tools, such as development bots, which are often perceived as disruptive and noisy (Wessel et al. 2021).
Therefore, designers should envision automation tools as socio-technical rather than purely technical applications, considering human interaction, developers’ collaboration, and ethical concerns (Storey and Zagalsky 2016). The literature still lacks design strategies that include end-user perspectives to enhance the interplay between automation tools and developers on social coding platforms. Future work can devise guidelines and best practices about how to build GitHub Actions and adopt them in projects to holistically consider the dynamics of the project. Considering different cognitive styles and preferences may also be the subject of future research (Santos et al. 2023).

Distinguishing human and GitHub Actions contributions in empirical studies. To enable large-scale empirical studies on the usage of automation workflows (i.e., bots, GitHub Actions) in social coding platforms, it is necessary to determine which projects rely on this automation and which user accounts work as proxies for automation tools. Several bot detection techniques have been proposed to automatically identify bot contributions in software repositories (Golzadeh et al. 2020; Abdellatif et al. 2022; Dey et al. 2020). These techniques usually rely on profile information, account activity, and comment patterns in issue and pull request comments. One of the biggest challenges with identifying automated contributions made by bots remains the occurrence of mixed accounts used by both humans and bots. Since GitHub Actions can also be implemented to act on behalf of a regular GitHub user account (i.e., a mixed account), the outcome of empirical analyses may be affected if these accounts are not properly identified.

7 Limitations and Threats to Validity

This section discusses the limitations and threats to validity and how we have mitigated them.

Generalizability: Since we selected top-starred software projects, our findings might not be generalized to other or all GitHub projects. In particular, our work focused on open-source repositories. Since the usage of Actions might slightly differ for closed-source projects, our findings might also not be generalized to closed-source, private, or industry repositories on GitHub. One way to overcome this threat is by studying less popular projects hosted on GitHub and also projects that are not open-source. Additionally, even though we considered a large number of projects and our results indicate general trends, we recommend running segmented analyses when applying our results to a given project.

Reliability of Results: To ensure consistency and improve the reliability of our qualitative findings, we calculated the inter-rater agreement. After achieving an ‘almost perfect’ agreement (Cohen’s \(\kappa = 0.939\) (McHugh 2012)), the two researchers who coded the developers’ conversations (Discussion threads and conversations on Discord) discussed their disagreements extensively over multiple meetings until they reached an agreement.
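For reference, Cohen's \(\kappa\) compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies. The sketch below is a minimal illustration of that standard formula and not the script used in the study. The function name and the toy labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy labels, for illustration only.
coder_1 = ["help", "help", "error", "plan"]
coder_2 = ["help", "error", "error", "plan"]
print(round(cohens_kappa(coder_1, coder_2), 3))
```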

Construct Validity: As stated by Kalliamvakou et al. (2014), many merged pull requests appear non-merged. Since we consider the number of merged pull requests, our results may be affected by this threat. Our study can be replicated when automated ways of detecting this issue are developed.

Internal Validity: We applied multiple data filtering steps to the statistical models to reduce internal threats. We varied the data filtering criteria to confirm the robustness of our models. For example, we filtered projects that did not receive pull requests in all months and observed similar phenomena. We also carried out a series of placebo tests (Imbens and Lemieux 2008) using the same model with the adoption artificially set to different dates to confirm the model’s robustness. The assumption of exogeneity of the treatment might be a threat. Another internal limitation of our analysis is that a single project on GitHub might have more than one Action in its workflow and, thus, would be considered twice in our models. Previous research has highlighted that many social and technical aspects affect the pull request acceptance (Tsay et al. 2014; Dey and Mockus 2020). Such aspects might act as potential confounding effects on our models. Following previous work that also considered interventions to pull requests (Cassee et al. 2020; Wessel et al. 2020, 2022; Kinsman et al. 2021), we added a set of six control variables, including the total number of pull request authors (as a proxy to the community size), the total number of commits (as a proxy to the activity level), time since the first pull request (capturing the pull request usage maturity) that might influence the independent variables to reduce confounding factors. However, in addition to the already identified variables, other factors might influence the results, and further research is necessary to establish causal relations.

8 Conclusion

In this paper, we investigate how software developers use GitHub Actions to automate their workflows, how they discuss these GitHub Actions, and the effects of the adoption of GitHub Actions on pull request dynamics. We collected and analyzed data from 5,000 active GitHub repositories. To understand the impact on practice, we statistically analyzed a sample of 662 open-source projects hosted on GitHub.

Firstly, the findings showed that circa 30% of repositories used GitHub Actions. We also found that 973 unique predefined GitHub Actions were used within the workflows. Further, we collected and analyzed GitHub Actions-related discussions and chat excerpts on Discord and found that most of them were related to developers asking for help. These findings indicate that GitHub Actions can introduce additional issues related to debugging and contributing. By modeling the data around the introduction of GitHub Actions, we noticed different results between merged and non-merged pull requests. For merged pull requests, the number of pull requests and commits decreased while comments increased, and for non-merged pull requests the number of pull requests and commits increased while the number of comments decreased.

Practitioners need to make informed decisions about whether to adopt GitHub Actions into their projects and how to use them effectively. GitHub Actions might allow them to automate repetitive tasks in their projects with their own custom Action. GitHub Actions provides hundreds of different GitHub Actions, potentially making it difficult for practitioners to decide which Action to use. Our work provides empirical data on which GitHub Actions are currently used and how they can impact development processes. Learning from those adopters can provide insights to assist the open-source community in deciding whether to use GitHub Actions and how to use them effectively. Future work includes the qualitative investigation of the effects of adopting GitHub Actions and the expansion of our analysis for considering the effects of different types of GitHub Actions and activity indicators.