1 Introduction

The agile software development methods were originally designed for small co-located teams. However, with the success of agile methods in small teams, organizations started adopting them also in large and distributed environments [1]. To support this, practitioners have proposed different scaling frameworks such as the Scaled Agile Framework (SAFe) [2], Large Scale Scrum (LeSS) [3] and Disciplined Agile Delivery (DAD) [4]. According to the 12th State of Agile Survey [5], SAFe seems to be currently the most popular scaling framework, with 29% of respondent organizations reporting the adoption of SAFe.

Researchers have identified the lack of research, and emphasized the need for scientific studies on the adoption of scaling frameworks [6, 7]. A recent multivocal literature review (MLR) on SAFe identified only six scientific studies published on the framework [8]. Most of the published information related to SAFe consists of the experience reports written by practitioners. These reports are available on the SAFe homepage [2]. The MLR also identified a need for research-based evidence related to the transformation process to SAFe [8].

In this paper, we describe a part of the transformation process in a large traditional organization in the financial sector. We focus on the formation of agile release trains (ARTs), a central construct in the SAFe framework, and the related challenges experienced in the case organization.

The paper is structured as follows: In the next section, we describe how value streams are identified and ARTs formed according to the SAFe framework. Then, we present the previous literature on the formation of ARTs and the related challenges. In Sect. 3, we describe our research methodology and present the case organization. Section 4 provides our results. In Sect. 5, we discuss the results, and finally, in Sect. 6, we conclude the paper and suggest directions for future research.

2 Related Work

2.1 The Scaled Agile Framework: Identifying Value Streams and Agile Release Trains

The Scaled Agile Framework, introduced in 2011 [2], integrates practices from lean and agile to support scaling to the enterprise level. The framework has four different levels: portfolio, large solution, program and team [9]. Each level has a set of activities, roles, and processes to support and build solutions.

One of the critical moves during adoption of SAFe is the identification of value streams, which are defined as “the sequence of steps used to build solutions that generates continuous customer value. They may deliver either direct customer value or may support internal processes” [10].

When the value streams have been identified, teams are grouped into ARTs, which are long-lived organizational structures, composed of agile teams, key stakeholders, and other resources [2]. An ART typically includes 50–125 people, and delivers solutions incrementally using time-boxed Program Increments (PIs) [11]. The Program Increments are typically eight to twelve weeks long, and are preceded by a PI planning. The PI planning meetings, in which all teams in an ART meet face-to-face, typically last two days, and serve as the heartbeat of the ART, helping to align the teams to a common vision [12]. During a program increment, agile teams work on their backlogs using either Scrum or Kanban.

The SAFe implementation road-map [13] gives a detailed description of how to identify the value streams and form the release trains [10]. The SAFe framework defines two types of value streams: operational and development. An operational value stream is a set of steps taken in order to provide services to the customer [10]. A development value stream supports operational value streams by developing new products or services. Initially, the organization starts by identifying the operational value streams. SAFe provides a template to help organizations define them.

After identifying the operational value streams, the next step is to identify the systems that support the value streams and the people who develop these systems. Then, the development value streams are identified. The organization might have one or several development value streams. The development value streams need to be mostly or wholly independent, in order to deliver the value without having many inter-value stream dependencies [10].
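
The independence requirement above can be made concrete with a small sketch. It assumes that each system has been assigned to one development value stream and that dependencies between systems are known as pairs; the system names, stream names and the function `cross_stream_dependencies` are hypothetical and are not taken from SAFe or from the case organization.

```python
def cross_stream_dependencies(stream_of, dependencies):
    """Return the dependencies whose two systems belong to different value streams."""
    return [(a, b) for a, b in dependencies if stream_of[a] != stream_of[b]]


# Hypothetical systems and value streams, invented for illustration only.
stream_of = {
    "portal": "stream A",
    "billing": "stream A",
    "policy-core": "stream B",
    "data-warehouse": "stream B",
}
dependencies = [
    ("portal", "billing"),
    ("portal", "policy-core"),
    ("billing", "policy-core"),
    ("policy-core", "data-warehouse"),
]

crossing = cross_stream_dependencies(stream_of, dependencies)
# Share of all dependencies that cross a value stream boundary.
share = len(crossing) / len(dependencies)
print(crossing, share)
```

A high share of crossing dependencies would suggest that the candidate value streams are not yet "mostly or wholly independent"; what counts as high is a judgment call that the road-map does not quantify.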

When the value streams have been identified, ARTs are formed to realize the identified development value streams. SAFe defines the following set of attributes of an effective ART: (1) consisting of 50–125 people, (2) focus on a complete system or a related set of products or services, (3) long lived and stable teams that deliver value constantly, and (4) deliver independently by having a minimum number of dependencies with other ARTs [13].
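
As a rough illustration, the four attributes can be expressed as a checklist over a candidate train design. This is a minimal sketch under stated assumptions: the class `TrainDesign`, its field names and the threshold `max_dependencies` are hypothetical, since SAFe asks for "a minimum number of dependencies" without giving a number.

```python
from dataclasses import dataclass


# Hypothetical illustration: the class, the field names and the dependency
# threshold are assumptions, not part of SAFe or of the case study.
@dataclass
class TrainDesign:
    name: str
    headcount: int
    scope: str  # system or set of products/services the train focuses on
    long_lived_teams: bool
    dependencies_on_other_arts: int


def attribute_issues(design, max_dependencies=3):
    """Return the ART attributes (1)-(4) that a design does not satisfy."""
    issues = []
    if not 50 <= design.headcount <= 125:
        issues.append("(1) size outside 50-125 people")
    if not design.scope.strip():
        issues.append("(2) no complete system or related product set in focus")
    if not design.long_lived_teams:
        issues.append("(3) teams are not long-lived and stable")
    if design.dependencies_on_other_arts > max_dependencies:
        issues.append("(4) too many dependencies on other ARTs")
    return issues


example = TrainDesign("Example train", 40, "customer portal", True, 7)
print(attribute_issues(example))
```

Such a check only flags candidates for discussion; it cannot replace the workshops in which train boundaries are negotiated.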

Depending on the number of people in the ARTs, different designs are possible: “a single ART delivering a single value stream”, “a single ART delivering multiple value streams”, “multiple ARTs delivering a single large value stream” [10]. When having multiple ARTs delivering a single large value stream, there are typically a lot of dependencies between them. In this case, the ARTs can be designed into feature ARTs or subsystem ARTs. Typically, a large system might require both types of ARTs. When developing a segment inside large value streams, ARTs may not be end-to-end. However, in reality, the beginning and the ending of a value stream are quite relative to each other [10]. The inputs, values and systems may vary for each segment, which in fact creates a logical dividing line for the ARTs [10]. In practice, other factors like geography, spoken language, and cost centers might influence the ART design, but these are considered less desirable [10].

2.2 Overview on Release Train Formation and Its Challenges

The existing literature contains very little information on how organizations define value streams and form ARTs in practice. Below, we summarize the reported information on value streams and release train formation.

During their transformation to SAFe, organizations start to identify the value streams [14,15,16]. Some mapped the existing value streams [17] during different workshops like management [15] and leadership workshops [14]. One organization reported arranging a value stream mapping event by bringing different Scrum teams together [18].

Organizations formed release trains by combining the existing product clusters, Scrum teams and component teams [19], or system teams, development teams and cross-functional roles [20]. Some cases structured release trains around the current products and web portals [21], product streams [22], utilities [21], platforms [21, 23], markets [14] and business programs [24]. In one case [22], the software development domain was divided into eight ARTs, while in another case [25] several domains, “commercial, cargo, flight and ground operations, engineering and maintenance, finance, human resources”, were combined to form the release trains. This setup brought in value for all the domains.

Challenges related to defining and structuring the organization around value streams have been reported by several cases [14, 21, 26]. In [21], it was difficult to figure out the domain of the ART. Struggles with handling cross-team dependencies between ARTs and integrating teams with fewer dependencies into ARTs were reported in [27]. Resistance to being part of the ARTs was also reported in [19].

Even though the above-mentioned literature touches on the topic, in-depth information on the formation of release trains in practice is lacking in both the grey and the scientific literature, although the grey literature provides more information on the topic than the scientific literature.

3 Research Methodology

3.1 Research Goals and Questions

The objective of our research was to investigate the SAFe transformation in a large, traditional financial corporation. In this paper, we focus on the formation of ARTs and related challenges in the case organization, as this emerged as a central theme during the interviews. The case organization was purposefully selected, as it provided an opportunity to perform an information-rich case study [28]. Additionally, it is one of the largest corporations in Denmark that has implemented the SAFe framework.

We approached this case in an exploratory manner, and formulated the following research questions:

  • RQ1: How did the ART formation proceed at the case organization?

  • RQ2: What were the challenges of forming ARTs at the case organization?

3.2 Case Description

The case organization is a financial corporation developing large and complex pension and insurance products. At the time of the study, the organization consisted of 1300 persons, of which 300 people (32 teams) were involved in software development. The development was distributed to two sites, Denmark and Poland. The main part of the development took place in Denmark, while consultants hired from Poland made up approximately 10–15% of the headcount.

Before agile, the organization used the sequential PRINCE2 process model, and was siloed and hierarchical with a command and control leadership style. In 2015, the organization got a new CEO, who brought a modern way of leading to the organization. A strategy to change the traditional mindset at the organization was developed, but people were not ready to embrace the strategy, due to the lack of resources and the right infrastructure. They were struggling with long queues and capacity allocation. The organization started a Kanban initiative at the beginning of 2016, introducing lean projects to optimize the way of running projects. At this time, a group of 20 persons working on front-end development started using agile practices.

At the end of 2016, a new CIO was appointed. He gathered many directors and C-level leaders to start an agile pilot. The organization established an agile pilot with front-end teams. During this time, the organization studied different scaling frameworks and models, including SAFe, the Spotify model, DAD and LeSS, finally settling on SAFe. A significant force behind this decision was the new CIO, who had an ambition to implement SAFe, as he had positive experiences from a SAFe transformation at his previous company. Furthermore, SAFe provided a top-down approach that helped to get management buy-in. A further supporting fact was that SAFe had been taken into use by many other financial organizations, making it easy to recruit coaches with framework experience.

At the time of the interviews, the case organization had four ARTs, see Fig. 1. Along with the trains they also had formed six Centers of Excellence (CoEs): Project and Program CoE, DevOps CoE, Lean and Agile CoE (LACE), SAP CoE, Integration and BPM CoE, and Test CoE.

The organization had approximately 30 projects running at the time of interviews. These projects were running in parallel with the release trains.

In the beginning, the organization had quarterly releases. Besides the quarterly releases, a few small releases happened every week. Finally, the organization moved to monthly releases.

3.3 Data Collection

We collected data by conducting 24 semi-structured interviews, during a 3-month period from February to April 2018. We collected data on different topics, for example: transformation reasons; transformation process; success factors, benefits and challenges of adopting SAFe; lessons learned; recommendations for future adopters; what could have been done differently in the transformation; and future steps at the case organization.

In this paper, we only focus on the formation of ARTs and the challenges faced after forming them. The interview data was complemented by observations of two PI planning meetings, in February and April 2018.

We interviewed a total of 27 people from different roles, including developers (4), Product Managers (2), Project Managers (2), Product Owners (2), people from different Centers of Excellence (5), Project and Program (1), DevOps (1), Integration (1), Test (1), Scrum Masters (2), Release Train Engineers (2), a requirement analyst (1) and a person from Service Oriented Architecture (SOA) (1).

We collected the data from the two longest running trains (DCE and DBI trains), as they were the pioneers in the SAFe journey. The other two trains (IP and DM) were only recently formed. All interviews were conducted face-to-face with two interviewers present, one being the primary interviewer, while the other took detailed notes and asked complementary questions. In one interview three persons were interviewed at the same time, and in another interview two persons. In each of the remaining 22 interviews, only one interviewee was present, which gives the total of 27 interviewees (22 + 3 + 2).

The interviews were semi-structured and conversational to help in adapting to different roles and to understand individual opinions and perceptions. Each interview lasted 1–2 h, with an average of 90 min.

Fig. 1. Organizational structure after transition to SAFe

3.4 Data Analysis and Validation

All interviews were recorded and transcribed, and analyzed using the qualitative coding tool NVivo 12 [29]. We followed the guidelines from [30] for coding. The first author started with open coding, compared the similarities and differences among the open codes, and clustered them together into axial codes. During the process of axial coding, the authors discussed the clustering and naming. Based on the discussions, a few codes were modified or renamed. We identified the following high-level codes from the analysis: opinions on the SAFe framework, transformation reasons, transformation process, success factors of the adoption, challenges of the adoption, future steps for the case organization, recommendations for future adopters, lessons learned, and things that could have been done differently during the transformation.

After the analysis, the results were presented in a feedback session at the case organization in June 2018. All interviewees were invited. Twelve persons attended the session, most of whom were interviewees. At the end of the session, we discussed with the participants the existing challenges and the changes they had made in the organization after the interview period, i.e., after April. Nobody disagreed with our results.

4 Results

4.1 The Formation of ARTs (RQ1)

In this section, we describe how the case organization formed the ARTs, and the different negotiations that took place and compromises that were made.

Piloting and the First Train.

As this SAFe transformation started from the IT management, with the CIO leading the change, and not from the top management, the IT managers had to “sell” the SAFe adoption to the rest of the organization: to developers, business people and higher managers. They decided to do that by starting a pilot through which they could show concrete benefits of SAFe. A front-end development area (the portal) was chosen for the pilot. There were several carefully considered reasons behind this choice: (1) the teams working in this area had already started using agile and lean practices, (2) the people in this area already knew each other, lowering the threshold to join the pilot, and finally (3) in the front-end area it was deemed to be easy to show results and business value with the help of short iterations and frequent deliveries.

The pilot train was called DCE, for “Digital Customer Experience”. The team formation was led by the front-end department leader and a project manager from the digital area. In the pilot phase, the train consisted of four teams. Later on, after some reorganization, a fifth team was added.

The pilot was commenced by a kick-off event. The event program included communicating the reasons behind the agile transformation and the selection of SAFe as a framework, explaining how the transformation will be started, as well as presenting the management ambitions for the release trains. The kick-off was followed by a PI planning session at the end of March 2017.

The pilot organization faced problems when trying to collaborate with the rest of the organization, as the surrounding parts were not ready to support the pilot as quickly as required by the agile way of working. People outside the pilot commented that they would not like to change their ways of working just for the sake of this pilot, as, after all, it was just a pilot that would be over after some time. Thus, our interviewees explained that being called a “pilot” was not purely positive.

Despite these problems, the pilot ended up being successful, and after less than six months the next train was launched, with the pilot train stabilizing its position as the first train.

An Organic Way of Transformation.

After the success of the pilot, management planned to launch new release trains. However, they faced three major challenges: (1) political issues, (2) difficulties in identifying and separating value streams, and (3) avoiding a big restructuring of the organization.

Political Issues: Before moving to SAFe, the organization was siloed, with each director owning a pool of resources. Thus, it was crucial to get buy-in from the directors to allocate their resources in the release trains. Initially, none of the directors wanted to lose power by allocating resources into the release trains, as the power of each director was measured by the number of full-time employees overseen. Therefore, management wanted to create a comfortable set up for directors to willingly allocate their resources into the trains. Thus, they had to make compromises while designing the trains: the trains were almost vertically sliced, instead of horizontal slicing, to retain their old silos. This structure helped getting the business buy-in needed for the formation of the release trains.

Difficulties in Identifying and Separating Value Streams: The organization struggled to identify the relevant value streams, due to the presence of tightly coupled systems with a significant amount of cross-system dependencies. The same specialized persons participated in the development of several systems. For example, there were discussions on splitting up the pension and insurance products into two different value streams. However, splitting them and making them full stack was considered extremely difficult due to the lack of resources and the competence profile of people, i.e., people with very specialized competencies worked on both products. Therefore, these two product groups were finally put into one joint train.

Avoiding a Big Restructuring of the Organization: The management was not ready to radically restructure the entire organization and invest in new resources for getting enough of the currently scarce resources (that were now working with several products) to each value-stream based train. As the SAFe transformation was not initiated by top management, the managers driving the transformation felt that they had to start from somewhere, first making easier changes that provide benefits and show the potential of SAFe. Then, after gaining experience of working with this framework, they would gradually start moving people from one train to another, slowly making the trains end-to-end and based on real value streams. A few interviewees called this plan an organic way of transformation.

Transformation Teams.

The high level design of the trains was led by the main agile coaches. This design included figuring out what would be part of each train and what would be left outside, by discussing details such as the focus of each train, the size of the trains, and which groups of people would be part of each train. After that, the coaches formed transformation teams for each train. Each team was composed of people from business and different departments, and included line managers and specialists. Each transformation team had approximately 10 to 20 people, as the coaches wanted to involve all key stakeholders to make them commit to the train design.

The designing of the trains was carried out iteratively: line managers from the transformation teams presented the designs to the employees by going to each department and talking to them. They collected feedback on the design and afterwards made a few changes to the structure of the trains to achieve the best possible solutions.

Forming the Second Train. Every business area in the organization had their own data department. The organization had a need to centralize the people working on data by aligning different data related initiatives, such as data warehouse solutions or data for artificial intelligence. Thus, people working on data were allocated to a second train, the “Data Train”, officially “Data and Business Intelligence”.

The coaches facilitated the designing of the train. They conducted workshops by bringing together all people who were identified as key stakeholders. Initially, a design workshop was conducted to figure out the purpose of the train and who should be a part of the train. This train had many departments involved and people did not share similar qualifications. Again, full-stack teams were not seen as possible in this train. Thus, they ended up with a component team type of structure.

In another workshop the coaches and the train management described a vision for each team and chose the Product Owners for the teams.

In the next couple of workshops the team members could put their names in teams based on their skill set and interests. Later on, the coaches and the train management made only a few adjustments on the teams. Finally, five Scrum teams were formed and the Data Train started in August 2017.

Forming the Third and Fourth Trains. The last two trains were formed in March 2018. The trains were called “Pension and Insurance Products” and “Digitalization and Management”. The Pension and Insurance Products train included the company’s core products and their further development. The Digitalization and Management train concentrated on future areas, such as the digitalization of different work processes in the company’s business (for example, administrative processes), as well as new directions, such as robotics and artificial intelligence. These trains had eleven and nine Scrum teams when they started.

Again, a series of strategic discussions were held to decide the boundaries between ARTs, in terms of systems, business processes and resources, which resulted in a rough draft on the philosophy of what kinds of ARTs would be designed. This draft was further refined in a workshop between business, IT and team leaders, where it was described what kind of teams these ARTs need. The designing of teams inside the trains was realized by using “Lego-blocks”. Lego blocks of different colours were used for different roles, e.g., blue blocks for core developers. Some of the Lego blocks had names on them to represent the limited resources. Every manager had a certain number of Lego blocks representing a role. They aimed to make the teams as cross-functional as possible.
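
The Lego exercise can be read as a simple allocation check, sketched below under stated assumptions. Each block stands for one person in a role, each team is a list of blocks, and managers hold a limited supply of blocks per role. Only "core developer" is a role named by the interviewees; the other role names, the set `REQUIRED_ROLES` and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical roles; the interviews only name "core developer" as an example.
REQUIRED_ROLES = {"core developer", "tester", "business analyst"}


def missing_roles(team_blocks):
    """Return the required roles for which a team has no block."""
    return REQUIRED_ROLES - set(team_blocks)


def overallocated(supply, teams):
    """Return roles where more blocks are placed in teams than are available."""
    placed = Counter(role for blocks in teams.values() for role in blocks)
    return {
        role: count - supply.get(role, 0)
        for role, count in placed.items()
        if count > supply.get(role, 0)
    }


# Invented numbers, for illustration only.
supply = {"core developer": 3, "tester": 1, "business analyst": 2}
teams = {
    "team 1": ["core developer", "core developer", "tester", "business analyst"],
    "team 2": ["core developer", "tester"],
}

print(missing_roles(teams["team 2"]))
print(overallocated(supply, teams))
```

In this invented example, the second team lacks a business analyst and the design uses one more tester block than is available, which is the kind of scarcity the named blocks were meant to make visible.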

4.2 Challenges of Forming the ARTs (RQ2)

Besides the challenges mentioned in the previous section (political issues, difficulties in identifying and separating the value streams, and the wish to avoid a big restructuring of the organization), we identified several other challenges the case organization faced while adopting SAFe. To answer the second research question, we present two of the most significant challenges faced by the organization while forming the ARTs: (1) project related challenges and (2) challenges due to dependencies.

Project Related Challenges. Complex Projects: Before the transformation, the development work in the organization was purely based on projects that were tightly controlled by the project managers. When the transformation started, the projects still kept running and the project managers kept their role in controlling the projects. Even though the idea was to finally get rid of the projects, this could not be done suddenly. Thus, the projects were running in parallel with the trains, with each project having work items in several trains.

The organization did not implement the portfolio layer with epics, but instead had projects. The projects were mentally transitioned into epics, i.e., product epics. These projects, running in parallel with the release trains, required detailed resource allocation and long term planning. Many interviewees mentioned that projects were not suitable for the release trains, as they had strict deadlines and large tasks, which could not be delivered in small bits. Projects were so complex that they required a detailed analysis phase before being put into the release trains. These project tasks were put into release trains in the form of features and user stories. In many cases, one task in a project required more than one train to realize it. This brought communication and coordination challenges between the four release trains. Moreover, the project managers felt helpless when they were responsible for the projects, but at the same time not able to control the work done in the trains.

Aligning Project and ART Releases: The projects had a different planning horizon than the release trains. The projects employed release management, which required details of the releases two months in advance. The release cycles of the projects were not synchronized with the PI release cycle used by the release trains. Thus, only the release trains were working in an agile way, and the rest of the organization was still waterfall driven. It was difficult to figure out how to align the project release cycle with the PI cycle of release trains.

Prioritization Challenges: The project tasks were distributed between the four trains due to the lack of full stack trains. The priority of the project tasks differed between trains due to the lack of alignment between the trains. The trains had different PI cycles, i.e., they lacked a common PI cadence. They did not have joint PI planning, nor joint prioritization. One of the project managers mentioned that they need to have some kind of planning where they can continue the prioritization, or some common prioritization session. Some other interviewees hoped for a portfolio layer to have continuity in the prioritization and to make sure that related tasks have the same priority in all trains.

For example, suppose the tasks of a project were allocated between two trains, with a task prioritized as one in train one and a related task prioritized as ten in train two. If there is a delay, train two may move its task to the next PI, which delays the delivery of the project, which has a strict deadline. This also created additional coordination overhead between the trains, to ensure that all tasks related to a task with a certain priority, e.g., “one”, have the same priority “one” in all four trains.
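
The prioritization example can be sketched as a check that compares the priorities given to related tasks in different trains. This is only an illustration: the data layout and the function `misaligned_tasks` are hypothetical, and the case organization is not reported to have used such a tool.

```python
def misaligned_tasks(priorities):
    """Return work items whose related tasks have different priorities across trains.

    `priorities` maps a project work item to a dictionary of
    {train name: priority of the related task in that train}.
    The returned value per work item is the gap between its highest
    and lowest priority number.
    """
    result = {}
    for item, per_train in priorities.items():
        values = per_train.values()
        if len(set(values)) > 1:
            result[item] = max(values) - min(values)
    return result


# The first entry mirrors the example in the text; the second is invented.
priorities = {
    "project task X": {"train one": 1, "train two": 10},
    "project task Y": {"train one": 2, "train two": 2},
}
print(misaligned_tasks(priorities))
```

A joint prioritization session or a portfolio layer, as hoped for by the interviewees, would aim to keep this result empty before each PI starts.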

Challenges Due to Dependencies. External Dependencies: The tasks and features done in trains had several dependencies on organizational units external to the trains. Many interviewees reported that every task or feature that was supposed to be delivered by a release train had lots of dependencies outside the trains. Additionally, the organization had separated the operations (Ops) and testing from the release trains by forming separate centers of excellence (CoE) for DevOps and Testing. This created further delays and dependencies between the trains and CoEs.

The organization found it impossible to form full stack trains that would include all the needed competencies, with people from the external units also working full-time for the trains. These dependencies between the trains and external units caused a lot of delays to the deliveries.

Inter-train Dependencies: The train design still had the old silo structure, i.e., the trains were responsible only for their own silo. The front-end and the back-end work was distributed between two different trains. Most tasks required data from the back-end to change something in the front end. This caused a lot of dependencies and coordination needs between the trains and finishing a task during the same period was challenging.

Many people argued these dependencies between the tasks were already present while running the projects, but there were project managers to coordinate the dependencies. After SAFe, the coordination of dependencies was pushed down to the team level. Teams were good at identifying the dependencies, but they were not good at acting on them. The project managers, who were not part of trains, were trying to coordinate between the trains, even though the teams should coordinate themselves according to agile and SAFe.

Some of the interviewees, especially at the team level, expressed the need to bring in the portfolio and large solution layers to deal with the coordination between the ARTs. At the end of our study period, the coaches were planning the portfolio layer.

5 Discussion

5.1 RQ1: How Did the Release Train Formation Proceed at the Case Organization?

The SAFe transformation was initiated by launching a pilot with the teams that already had experience with agile practices. The SAFe implementation road-map [13, 31] does not explicitly mention piloting as a starting point for the SAFe transition. However, it recommends that organizations “pick up one value stream and one ART” and then suggests making a preliminary implementation plan for launching the successive ARTs [31]. The same scenario was observed in our case, as the organization started with a pilot and then launched three more ARTs. Several such instances of starting with a pilot were identified in the literature [32,33,34,35,36,37].

After the success of the pilot, the organization was not ready for a radical and organization-wide restructuring for launching the new release trains that would be based on value streams. Instead, the old silo structure was retained to gain political acceptance for the transformation. Thus, an organic way of forming the release trains was initiated without having “rigid value streams” in the beginning, but planning to change the trains gradually towards real value streams. Likewise, several organizations in the existing literature struggled to identify the right value streams [14, 21, 26].

According to the road-map [13], breezing through or attempting a shortcut when identifying value streams is like “putting your foot on the brake at the same time you are trying to accelerate” [10]. This statement seems to hold in our case, as several challenges arose from compromising on the train structure by designing the trains around silos instead of value streams. However, according to our interviewees, this compromise helped the organization to gain acceptance for the transformation and to get more business resources into the trains, which might not have happened by aiming for rigid value streams. The case organization adopted several innovative approaches for designing the teams for the next three trains, such as design workshops and Lego workshops. We could not find detailed information on experiences of forming ARTs and teams in the existing literature.

5.2 RQ2: What Were the Challenges of Forming Release Trains at the Case Organization?

The case organization retained its old silo structure even after forming the release trains, due to political struggles and the desire to avoid big restructuring. Managing dependencies between the four silo based release trains was a significant challenge, which created coordination overhead. The same was reflected in [27], regarding managing the cross-team dependencies across release trains. Additionally, several other cases of adopting agile at scale reflected challenges with cross-team dependencies [1].

Our case struggled with complex projects that were hamstrung with deadlines and large tasks, as existing projects continued and their tasks were just distributed to different trains. Project managers continued their work, but did not have a say in the prioritization of the tasks distributed to different trains. We did not find similar cases from the literature.

5.3 Limitations

We identified the following threats to validity [28].

Construct Validity: This threat is concerned with how well the case study reflects reality. Jointly with the organization, we carefully selected a rather large number of respondents representing various roles, to facilitate respondent triangulation. Initially, we made a list of potential interviewees during a PI planning session at the case organization. This list was checked by one of the core members of the transformation team, who also suggested other people for getting the desired information for the study. There is a threat that interviewees misunderstand or misinterpret the questions. This was mitigated by conducting the interviews in a conversational manner, which allowed the interviewees to ask for clarification in case of ambiguity. All interviews were conducted by two researchers, who also actively discussed the analysis.

External Validity: The external validity is concerned with the ability to generalize the results to other contexts. While it is difficult to explicate the exact context variables that facilitate generalisation, we compared our results with other SAFe case studies [2], with the SAFe implementation road-map [13], as well as with general studies of large-scale agile adoption [1].

Reliability: This threat is concerned with replication of the study. There is a threat of researcher bias in the interpretation of the data. To mitigate this threat, we collected data from multiple sources to ensure correctness of the data. The results of the coding process were validated by conducting a feedback session at the organization, and by discussing the analysis among the researchers.

6 Conclusions

The number of organizations adopting SAFe is increasing, but despite this, scientific studies on adopting this framework are scarce. Moreover, the published studies contain no in-depth information on the transformation process. This paper makes a contribution by describing the formation of ARTs and the challenges faced while forming them, as part of a SAFe transformation.

SAFe is not a silver bullet for all the scaling problems encountered by large-scale organizations. It can only be a starting point for scaling, and cannot solve all the challenges involved. Several organizations have reported struggles in forming the release trains and identifying suitable value streams, especially those that develop multiple and tightly coupled systems. In this specific case, we could see that turning a silo based traditional organization with projects into a SAFe organization with value stream based agile release trains was not possible overnight. The steps towards the goal required compromises, which caused a lot of challenges.

The current literature lacks in-depth information on how to form release trains and value streams in real complex organizations. Since several organizations have reported such challenges, it is crucial to conduct more in-depth research on how to form release trains in practice and how to mitigate the challenges encountered, in order to provide guidance to practitioners. We welcome case studies, especially from mature organizations that took SAFe into use more than three years ago and that could give detailed information on the mitigation strategies adopted for the challenges faced during their SAFe adoption.