SecDocker: Hardening the Continuous Integration Workflow

Current Continuous Integration (CI) processes face significant intrinsic cybersecurity challenges. The idea is not only to solve and test formal or regulatory security requirements of source code but also to apply the same principles to the CI pipeline itself. This paper presents an overview of current security issues in the CI workflow. It designs, develops, and deploys a new tool for the secure deployment of a container-based CI pipeline flow without slowing down release cycles. The tool, called SecDocker for its Docker-based approach, is publicly available on GitHub. It implements a transparent application firewall based on a configuration mechanism, avoiding issues in the CI workflow associated with intended or unintended container configurations. Integrated with other tools used by DevOps engineers, it provides feedback only from those scenarios that match specific patterns, addressing future container security issues.


Introduction
There are plenty of tools to analyze and secure the creation of container images. Besides that, several organizations have developed guidelines to assist developers in the creation of such images with a certain degree of security. For instance, focusing on Docker [17], it is possible to find the Docker Benchmark tool released by the Center for Internet Security (CIS) [9] or the Ultimate Benchmark for Container Image Scanning (UBCIS) [2]; both contain guides that analyze every dangerous step involved in the image-building process.
However, there are different local exploitation issues associated with the Continuous Integration (CI) workflow that need to be secured. The problem here is that some containerization solutions, like Docker, have different exploits that allow an attacker to override some of the image specifications, which is done by providing new ones at the very moment the container is created. Furthermore, opening ports is always a security risk if it is controlled by a low-level user in the system (like a developer in a DevOps server).
In this way, this study focuses on developing a tool called SecDocker for enhancing the cybersecurity pipeline when integrating containerization in the CI workflow [23]. SecDocker is a wrapper, specifically an application firewall for Docker, that allows sysadmins to block the capabilities offered by Docker in the run command. By doing so, even if Docker allows the user to perform dangerous activities, the action is blocked before it gets executed.
Below, the research question and main contribution are presented. The remainder of the paper presents the elements for answering the research question. Section 2 overviews the state of the art in containerization and the Continuous Integration workflow. Section 3 presents the developer's and attacker's schemes for the containerized CI layer. Section 4 presents the SecDocker tool: its design, architecture and usage. Section 5 validates SecDocker with the results of two experiments carried out in this research, measuring performance and the operational flow. Section 6 discusses the pros and cons of SecDocker and, finally, Section 7 provides conclusions about the processes and solutions presented in this paper.

Research Question and Contribution
Continuous Integration is a cornerstone methodology that automatically addresses several processes previously faced by software developers. However, the CI workflow also needs to meet the security mechanisms that will guarantee flexibility, productivity, and efficiency during the software development life cycle. Thus, this paper addresses a set of elements framed within the following Research Question (RQ):
RQ: Which are the mechanisms for avoiding and minimizing cybersecurity and misconfiguration issues in a CI container-based deployment system?
This RQ scales to a new level when the containerization tool is Docker. Most parts of current automation servers and processes are supported on Docker containers. However, its engine shows critical points that allow malicious users or an unaware DevOps engineer to crash the CI pipeline. Working with an erroneous configuration would promote bad system behavior, with the associated manpower and economic costs. Issues can arise in three main phases when using Docker containers in the CI workflow: 1) in the image retriever; 2) in the image builder; and 3) when the image is deployed.
This study presents an overview of the last phase, as well as the design and development of a tool called SecDocker for minimizing issues associated with container deployment. Besides, the tool introduces expansion capabilities for addressing the first and second phases using plugins.

Background
CI is one of many software development practices aimed at helping organizations to accelerate their development and delivery of software features without compromising quality [11]. According to Fitzgerald and Stol [8], it can be defined as "a process which is typically automatically triggered and comprises inter-connected steps such as compiling code, running unit and acceptance tests, validating code coverage, checking compliance with coding standards and building deployment packages". For Shahin, Babar, and Zhu [20], alongside Continuous Delivery or CDE (ensuring the package is always in a production-ready state after tests) and Continuous Deployment or CD (deploying the package to production or customer environments), CI is considered part of the continuous software engineering paradigm, which includes the popular term "DevOps" [8].
DevOps is a blend of the words Development and Operations and, although there is no common definition for it, some literature reviews address this point [21,12,5]. For instance, Jabbari et al. [12] define it as "a development methodology aimed at bridging the gap between Development (Dev) and Operations (Ops), emphasizing communication and collaboration, continuous integration, quality assurance and delivery with automated deployment utilizing a set of development practices". To enable such concepts or practices, and thus aid developers in materializing them, DevOps relies on a range of tools [5,15], from source code management to monitoring and logging, as well as configuration management. Together, these tools allow the creation of a pipeline that automates the processes of compiling, building and deploying the source code into a production platform [11].
But since DevOps is a relatively young methodology, integrating and maintaining these tools or managing the infrastructure in which they run automatically may pose a challenge [20], especially for CI and CD. As Leite et al. discuss in their literature review [15], concepts like "infrastructure as code", virtualization, containerization or cloud services are solutions currently known to be used for these types of issues. Among all of them, containerization is perhaps the most popular solution in DevOps environments at the moment. With a platform as a service focus, it is used for delivering software in a portable and streamlined way by providing a platform that allows developing, running and managing applications without worrying about the infrastructure needed [19].
Technically speaking, containerization is a type of lightweight OS-level virtualization technology that allows running multiple isolated systems (in terms of processes, resources, network, etc.) while sharing the same host OS. Such systems, or containers, hold packaged, self-contained applications and, if necessary, the binaries and libraries required to run them [3]. Moreover, they have been around for some time in various forms: from chroot, FreeBSD jails or Solaris zones to Linux-based solutions relying on kernel support like LXC or OpenVZ [3,22,19,7]. But over time, containerization has become a major trend thanks to tools like Docker [18,15].
Docker is an open-source platform that facilitates the management of containers by using a client-server architecture through a CLI tool, a daemon and a REST API [18,22]. It relies on the concept of images to build containers, that is, a specification of the collection of layered file systems, their corresponding execution environment and some metadata, making them portable, shareable and also updatable [19]. Regarding their usage, Docker containers can be used either as a microservice (to host a single service), as a way of shipping complete virtual environments (to reproduce and automate the deployment of applications) or even as a platform as a service (to cope with security and infrastructure integration issues) [7,17].
From a security perspective, Docker provides different levels of isolation, host hardening capabilities and some countermeasures related to network operations [6,7,17]. Nevertheless, it is not exempt from security threats or vulnerabilities such as ARP spoofing, DoS attacks, privilege escalation, etc. This is due to the nature of containerization itself, because an attack on the host OS may expose all containers and their network traffic. To address these cybersecurity risks it is necessary to take similar actions to DevOps, especially where pipeline automation is a requirement (as in CI or CD). Such actions can be understood as best practices or recommendations that aim to establish a Secure Software Development Life Cycle. Examples of this can be found in reports like DevSecOps: How to Seamlessly Integrate Security Into DevOps [16] or DoD Enterprise DevSecOps Reference Design [14], where container hardening is contemplated.

Container Layer in CI
Containers are used in CI processes to isolate and automate the creation of an application into one single self-contained virtual environment. This solution simplifies DevOps manpower, as it allows splitting a large application development project into several smaller work units. Having said that, this section describes the role of CI from the point of view of two actors (DevOps engineers and attackers) and also presents the scenarios likely to be vulnerable.

DevOps Engineers Scheme
From a developer's perspective, CI is used to guarantee quality, consistency and viability across different environments [10]. But as CI systems are vulnerable to security attacks and misconfigurations [20], DevOps engineers frequently rely on containers to create such environments, as they provide isolation without much effort. Generally, this has been achieved by technologies like Docker, which allow them to treat infrastructures as code [13].
Regarding CI, Docker has eased the replication of environments for DevOps engineers building automation pipelines. Particularly, as Boettinger et al. point out in their work [4], it has solved common issues encountered by end users like managing dependencies (through images), imprecise documentation (through scripts to build up such images) or code rot (with image versioning), along with the adoption and re-use of existing workflows (thanks to features like portability, easy integration into local environments or public repositories for sharing and reusing those images).
But despite the benefits that Docker or other containerization technologies may offer to DevOps engineers in CI environments, the latter still face challenges related to its adoption, particularly those associated with introducing any new technology or phenomenon in a given organization [10,20]. According to Shahin et al. [20], the literature shows that, among the common practices for implementing CI workflows, DevOps engineers need to decompose development into smaller units and also plan and document the activities that comprise the automation pipeline. Having said that, it must be noted that there are many ways of approaching the design of such pipelines. But taking into account the use of containers and based on the approach of Bass et al. [1], any CI workflow must include the following 6 components in such a design plan:
1. Automation server. Implements the CI/CD pipeline and creates a local workspace in which its steps take place.
2. Orchestrator. Sequentially triggers each step of the pipeline by communicating with the remaining components. It should be noted that, when using containers, steps may require images to perform their actions. Thus, the same image can be used through the whole pipeline or in specific steps.
3. Code retriever. Pulls source code from the repository to the local workspace.
4. Unit tester. Runs automated unit tests on source code.
5. Artifact builder. Builds deployable artifacts from source code.
6. Image generator. Builds, verifies, stores, and deploys an image to be used within the pipeline.
With this in mind and despite using containers, any standard CI workflow that establishes and defines these components will lower security and increase its functionality risks. To avoid this, different automated continuous tests could be applied to the whole process. However, and particularly for item 6, some of the tools are tied to a specific commercial solution. As a result, there is a need to develop a tool like SecDocker.

Attacker Scheme
As mentioned in Section 2, containers are the target of different security threats or vulnerabilities. Therefore, a containerized environment, like those created with Docker, may have different potential attack vectors [17]: host OS, network or physical systems, source code repositories, image repositories or the containers themselves. Securing these vectors is not a trivial task, but the contributions presented in this paper are framed towards the integrity of container images used by CI (or CD) pipelines.
In such cases, images are frequently used to ship a complete virtual environment where concrete actions from the CI workflow take place (e.g. build, test, run or deploy an application). Such a workflow is scripted and usually automated by triggering a webhook from some version control system. But this approach makes pipelines unreliable, so the image generator component of the CI process (see the previous subsection) is an element that needs to be hardened. Regarding this process and based on the approach of Bass et al. [1], it is possible to distinguish four components involved in it (Figure 1):
1. Builder. Builds a container image according to some specifications. This image comprises the virtual environment or workspace where some or all workflow actions will take place.
2. Verifier. Computes a checksum in order to verify the authenticity of the image that was just built.
3. Archiver. Stores the image in a registry or repository so it can be retrieved later.
4. Deployer. Deploys the image into a testing or production environment in order to execute the CI workflow or some of its scripted actions.
This study considers the last component to be one of the most important. The reason is that a correct configuration will minimize the impact of an issue in the previous three components. A container running without root privileges or with bounded CPU will guarantee minimal resource exploitation of the host machine. Thus, a runtime check for the detection of common security and configuration weaknesses against a compliance configuration pattern defined by DevOps engineers seems to meet the requirements for production environments.
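The Verifier step described above can be sketched in Go as follows. This is only an illustration under stated assumptions: the paper does not name the checksum algorithm, so SHA-256 is assumed here, and the function names digestOf and verify are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// digestOf returns the hex-encoded SHA-256 digest of an image archive.
// SHA-256 is an assumption; the source only speaks of "a checksum".
func digestOf(image []byte) string {
	sum := sha256.Sum256(image)
	return hex.EncodeToString(sum[:])
}

// verify reports whether image still matches the digest recorded at build time.
func verify(image []byte, want string) bool {
	got := digestOf(image)
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

func main() {
	built := []byte("stand-in for the bytes of a built image archive")
	recorded := digestOf(built)
	fmt.Println("untouched image ok:", verify(built, recorded))

	// Flip one byte to simulate tampering between build and deployment.
	tampered := append([]byte(nil), built...)
	tampered[0] ^= 0xff
	fmt.Println("tampered image ok:", verify(tampered, recorded))
}
```

A check of this kind only detects changes made after the digest was recorded; it says nothing about whether the image was configured safely in the first place, which is the gap the deployment-time check targets.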

Proposed Solution
SecDocker is an application firewall for Docker and thus shares the same purpose as any web application firewall: prevent users from performing dangerous or unexpected actions on systems.
Broadly speaking, the solution filters TCP traffic and works by monitoring the Docker run command.Its main goal is to evaluate all the requests meant for the Docker daemon by standing between it and the user (see Figure 2).
Whenever a new HTTP request aiming for the run API endpoint reaches the firewall, the IP packet is opened and the request parameters are loaded. After that, these variables are checked against a security profile (previously configured by the user) to prevent unauthorized actions. On the one hand, when a match is found, that packet is dropped and a new one is created and sent to the end user. It should be noted that this "new" packet contains the original data plus the information that one or more requested options were not allowed. On the other hand, when no matches are found, SecDocker appends or modifies the requested parameters to suit some running restrictions (previously specified by the user). Then, the packet is recreated and sent to the Docker daemon to finally perform the requested action.
SecDocker is written in Go and is publicly available on GitHub. It features a modular and extensible design comprising 5 components at its core:
1. Security. Performs validation against the user-supplied options.
2. Config. Loads the user's information into the firewall in real time.
3. Docker. Performs tasks related to how Docker processes information.
4. HTTPServer. Manages and performs actions against HTTP data (e.g. loading the body of the requests, crafting new requests/responses, etc.).
5. TCPIntercept. Handles packets at the TCP level, so the communication looks transparent to the end user. It also maintains the communications and gathers data for the HTTPServer module. Additionally, it must be noted that this module is based on Trudy, a transparent proxy that can modify and drop TCP traffic.
Its functionality can also be expanded by third-party applications thanks to a dedicated component named Plugins. For its basic workflow, SecDocker delegates some extra functionality to two plugins:
- Anchore. Inspects, analyzes and applies user-defined acceptance policies.
- Notary. Ensures the integrity of a trusted collection of Docker images.
Likewise, an accountability component based on logs is also included with SecDocker. This logging component relies on Logrus, an external logger package for Go that provides structured logs.
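The check-then-rewrite decision described in this section can be illustrated with a small Go sketch. It is not SecDocker's code: the Profile type and its fields are hypothetical, and the only external facts relied upon are that the Docker Engine API container-create request is a JSON body with a top-level User field and a HostConfig.Privileged flag.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Profile is a hypothetical security profile. The field names are
// illustrative and do not mirror SecDocker's actual configuration keys.
type Profile struct {
	ForbidPrivileged bool   // reject requests asking for privileged mode
	ForceUser        string // user enforced on allowed requests ("" keeps the original)
}

// Evaluate inspects the JSON body of a container-create request. It returns
// either the (possibly rewritten) body to forward, or the list of violations.
func Evaluate(body []byte, p Profile) ([]byte, []string, error) {
	var req map[string]any
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, nil, fmt.Errorf("decode request: %w", err)
	}

	var violations []string
	if host, ok := req["HostConfig"].(map[string]any); ok {
		if priv, _ := host["Privileged"].(bool); priv && p.ForbidPrivileged {
			violations = append(violations, "privileged")
		}
	}
	if len(violations) > 0 {
		return nil, violations, nil // the request must be dropped
	}

	// No match: apply the running restrictions before forwarding.
	if p.ForceUser != "" {
		req["User"] = p.ForceUser
	}
	out, err := json.Marshal(req)
	if err != nil {
		return nil, nil, fmt.Errorf("encode request: %w", err)
	}
	return out, nil, nil
}

func main() {
	p := Profile{ForbidPrivileged: true, ForceUser: "1000"}

	_, v, _ := Evaluate([]byte(`{"Image":"alpine","HostConfig":{"Privileged":true}}`), p)
	fmt.Println("violations:", v)

	out, _, _ := Evaluate([]byte(`{"Image":"alpine"}`), p)
	fmt.Println("forwarded:", string(out))
}
```

Returning the violations separately from the error keeps a policy rejection distinct from a malformed request, so the caller can tell the end user which options were not allowed.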

Usage
As mentioned at the beginning of this section, the SecDocker workflow involves routing TCP packets in a similar way to a firewall. Consequently, it should be placed in front of the server responsible for handling requests to Docker. This is done either to maintain the original destination port of the Docker daemon or to perform some alteration to redirect the traffic to the right port. Figure 3 illustrates the process and places SecDocker in the CI workflow.
Furthermore, to filter all this traffic a YAML file is used (see Listing 1 for an example). This file contains a set of configurable features that allow end users to set up some security features related to the Docker image and its execution (see Table 1).
user: "1000"
environment:
  - "MY_ENV=true"
Listing 1 Example of a SecDocker configuration file (excerpt)
Regarding its output, SecDocker starts to listen on port 8999 and logs all packets and their related data to the standard output by default. A separate log file is also created containing all the requests, whether they were allowed or not and why. Furthermore, any external plugins can have their own logs to output their results.

Validation
Two experiments were carried out to both measure SecDocker's performance and check its functionality. The experiments were conducted on two PCs connected to the same LAN. Both systems were configured using Elementary OS 5.1.7 and had different specifications: one with an Intel(R) i5-3570 CPU @ 3.40 GHz and 16.0 GB memory, and the other with an AMD Ryzen 5 3500U CPU @ 3.60 GHz and 8.0 GB memory. The first PC was used as a server for running Docker and SecDocker and the second as a client to connect to the former and execute different Docker commands.

Performance Testing
The first experiment was carried out to evaluate SecDocker's performance in terms of timing behavior, that is, to check whether it runs transparently in front of a running Docker server. The test consisted of running the following command 100 times from the client PC:
# docker run alpine:latest
No extra arguments were provided to ensure that no external factors (like an increase in CPU usage or network speed) affected the measurements. Besides, the standard Unix time tool was used to measure the performance of each command. Table 2 summarizes the results of the experiment with and without SecDocker. The mean time for running commands when SecDocker was enabled on the server PC was 0.509 ± 0.085 seconds, with an interquartile range of 0.06 seconds. Meanwhile, the mean time for commands when SecDocker was disabled on the same server was 0.519 ± 0.041 seconds, with an interquartile range of 0.05 seconds. Since the differences are only 0.01 seconds between the mean times and also between the interquartile ranges, it can be asserted that SecDocker apparently runs transparently with respect to Docker.

Functional Testing
The second experiment was carried out to test SecDocker functionality. This time, the goal was to send the following command from the client's PC to To prevent this potential threat, the server PC used the same configuration file shown in Listing 1, which includes the privileged option set to true in order to drop commands like the one previously mentioned.
Figure 4 shows that running the proposed command for this test fails as expected. From SecDocker's point of view, the command is processed as represented in the sequence diagram shown in Figure 5. When the HTTP request derived from the command arrives at SecDocker, it extracts all parameters and checks them against the loaded security configuration. In the test environment, the privileged option is matched, so a response is sent to the user stating that the request contains a forbidden option.
SecDocker is a tool meant to be used by a wide variety of members of the CI and DevOps communities.

Research Perspective
This work makes the assumption that the RQ proposed in Section 1.1 (see below) is appropriate, meaningful, and purposeful when facing cybersecurity issues during the CI workflow. Previous sections mentioned some of the common strategies used to solve CI issues associated with this RQ, mainly focused on the Image Generator step. Hence, it is possible to present a subset of scenarios for identifying SecDocker's validity. Some of the answers extracted from this work are:
- Even though it is commonly accepted that the CI workflow relies on the experience of DevOps engineers, it is necessary to avoid unaware behaviors using a transparent and automated mechanism such as SecDocker.
- SecDocker, which works as an application firewall, has no measurable performance impact compared with regular Docker use.
- Using a deployment engine based on YAML configuration files minimizes unaware deployments, simplifies repetitive tasks and makes the automated monitoring process more comprehensible.
- SecDocker allows tracking and auditing all commands sent to Docker. Its logging capabilities could be used as a tracking system thanks to the timestamps.
These points summarize the goal of the work presented and, at the same time, provide a concise and clear way to answer the RQ posed.

Software Perspective
SecDocker offers many potential benefits regarding the CI process. Some of these are:
- Publicly available. It is an open source tool released under the MIT license. The tool is presented in a way that makes deployment easier for the DevOps community. It is written in Go, a popular programming language, and offers a middleware solution for Docker, a mainstream containerization solution.
- Flexibility, scalability and security. It should be noticeable to DevOps and CI engineers that the current release of SecDocker brings simplicity to CI and CD processes. Likewise, relying on different configuration files makes it easier to define all the requirements needed for an infrastructure and thus to prevent misconfiguration issues related to last-minute fixes and to reduce performance issues associated with a lack of hardware resources or even software incompatibilities between versions.
- Installation costs. The process of downloading, compiling and deploying is performed with exactly three commands, as indicated in the documentation available in the GitHub repository.
- Assurance. SecDocker users do not need to consider the trade-off between speed and certainty. Results presented in Section 5.1 show similar performance and negligible differences when using Docker with or without SecDocker.
However, SecDocker also has certain shortcomings, including:
- The solution is only applicable to the deployment part of a CI/CD workflow; it does not cover previous steps. However, the SecDocker architecture favors the use of plugins (like Anchore and Notary) in order to support such features.
- SecDocker works as an application proxy. Each time a client makes a Docker request, SecDocker only intercepts it and checks its IP and port (which need to be the ones associated with Docker). Currently, SecDocker does not route these packets; doing so would add a new level of security by hiding connection elements from the user.
- The image provided in SecDocker's configuration file is not validated. More precisely, SecDocker does not check whether the image provided by the Docker server is legitimate.
- Unusual launch parameters (like those related to DNS or Input/Output) are also not checked by SecDocker.
- Once a container is running, SecDocker does not perform additional actions to test whether such a container is executing under the defined specifications or is being used for the intended purpose.
Finally, software metrics are presented in order to provide some assessment of the tool described in this paper. These metrics can be used to characterize its maintainability and code quality, but they can also give details about how easy it is to debug, maintain or integrate new functionalities into it. They were measured against version v0.1-beta of the application, using SonarQube and Golint as code quality tools.
Considering the above, SecDocker has 1834 lines of code (LOC) and a cumulative cyclomatic complexity of 206, distributed among the different functions of four files: tcpintercept (tcpproxy.go), docker (security.go), commandline (command.go) and httpserver_test (server_test.go). In addition, it has a total of 35 test cases, aggregated in 13 test functions that are grouped by table-driven tests, and a test coverage of 87%. Lastly, and regarding code quality, Golint detects 31 issues (28 related to naming and comments and 3 to coding structures), while SonarQube detects only 12 code smells and no bugs, vulnerabilities or security hotspots.

Conclusions
In conclusion, it is important to harden the CI workflow. We knew from previous experience that corporations refuse to deploy new tools given the associated costs (training, deployment, etc.). Thus, the idea of providing a firewall application that allows maintaining the current workflow was key to designing SecDocker.
It is critical for every DevOps engineer to secure their container platforms as much as possible. By developing SecDocker we have learned about the possible threats to a CI system running containers, in particular with the mainstream tool Docker. Performing a close analysis of the user input and hardening the systems minimizes the possible attack surface and the capabilities that users can access.

Fig. 4 SecDocker response when running the proposed Docker command

Fig. 5 Sequence diagram describing how SecDocker blocks a docker run command

SecDocker general overview

Table 1
Configurable features supported by SecDocker

Table 2
Statistics related to time taken to execute 100 Docker commands when SecDocker is both enabled and disabled