1 Introduction

The COVID-19 pandemic abruptly and unprecedentedly disrupted software developers’ working routines. On short notice, many software developers were asked to switch from their typical office-based working habits to a new working from home (WFH) setting. This change in work setting has had a considerable negative impact on developers’ well-being and productivity (Ralph et al. 2020b), as the pandemic and subsequent restrictions (e.g., lockdowns) limited the satisfaction of basic needs, such as the needs for autonomy, competence, and relatedness (Cantarero et al. 2021). Nevertheless, longitudinal research has also shown that software engineers can successfully adapt over time, suggesting that their well-being and productivity bounce back to pre-pandemic levels (Ford et al. 2021; Forsgren 2020; Bao et al. 2020; Russo et al. 2021b; Smite et al. 2021). This is encouraging, as 89% of professionals would like to work from home at least one day per month after the pandemic (Walton et al. 2020). For this reason, major IT companies (e.g., Twitter, Microsoft, AirBnB, Uber, Facebook) informed their employees that they could work from home indefinitely (e.g., Twitter) or extended their remote work policies and provided specific support (e.g., AirBnB) (Hadden et al. 2020). Thus, research conducted during the pandemic will likely also be of value once current restrictions have been lifted.

Software professionals working remotely for an organization is not a new topic in software engineering. In 1983, Olson defined remote work as “organizational work that is performed outside of the usual organizational confines of space and time” (Olson 1983). This definition implies a high degree of freedom for professionals with regard to scheduling their working hours, their activities, and the location from which they work. With the rise of the internet in the late 1990s, scholars started researching the challenges and opportunities of remote work from home (Pounder 1998). In these cases, professionals usually have a high degree of autonomy in terms of time but not in terms of space, since they have chosen their homes as their primary working space. Generally speaking, researchers investigated specific software development practices, such as processes (Guo 2001; Deshpande et al. 2016) or communication (Higa et al. 2000), to better tailor remote work practices to business needs. Similarly, collaboration and the characteristics of remote and asynchronous projects have been extensively studied by the Global Software Engineering community (Herbsleb 2007; Smite et al. 2010). Such studies typically focus on the interaction among software development teams located in different geographical areas, that is, on teams working together on distributed projects. There is a growing agreement within the practitioners’ community that working from home is different from working remotely on distributed projects (Aten 2020). While working from home is understood as working from the primary address of residence, such as an apartment or house, working remotely typically takes place in co-working spaces or other settings away from one’s residence. The pandemic made many of us realize that some of the fears commonly associated with remote work (such as decreasing productivity) are often unfounded. Hence, anecdotal evidence driving top managerial decisions due to the lack of specific research (Mesaglio 2020) should be supplemented with scholarly evidence.

So far, the authors of this paper have worked on a comprehensive research agenda to understand the effects of the COVID lockdown on software engineers. We started by looking at self-perceived well-being and productivity in the early months of the pandemic (Russo et al. 2021b). Afterward, we tracked what a typical work day looks like, as well as how the distribution of work activities compared to pre-COVID times (Russo et al. 2021). Finally, we performed a two-year longitudinal study with six waves to assess the effects of the entire pandemic on software developers (Russo et al. 2021a). Additionally, the first author also investigated software process-related changes while working from home (Cucolaş and Russo 2023).

This paper studies whether professionals’ needs influence the time they spend on various activities. In their seminal paper, Ryan and Deci (2000) introduce self-determination theory, which describes the three innate psychological needs that motivate us and guide our behavior: the needs for autonomy, competence, and relatedness. The need for autonomy captures whether people feel independent; the need for competence, whether people feel able to complete various (challenging) activities; and the need for relatedness, whether people feel appreciated by others close to them. Self-determination theory has frequently been used in the work context to predict job satisfaction and performance (Gagné and Deci 2005). For example, research established that all three self-determination theory needs (autonomy, relatedness, and competence) positively correlate with job satisfaction and productivity (Van den Broeck et al. 2010). Building on self-determination theory, we study how software engineers’ activities changed during the pandemic using the activity taxonomy of Meyer et al. (2019).

In line with other researchers who started to look at the productivity of software engineers in a more holistic way (Sadowski and Zimmermann 2019), we are particularly interested in understanding whether specific activities contribute to their well-being and productivity in general and which factors contribute to their satisfaction and productivity while working on a particular task. For example, meetings can be resource-draining and perceived as burdensome by employees (Allen et al. 2012). Furthermore, we also take social relations into account as an indicator of the need for relatedness: people who feel that communication with their colleagues and line managers is important might be more inclined to spend time in meetings, helping others, and other social activities, and report higher well-being because their need for relatedness is then more likely to be satisfied. Prior research investigating predictors of well-being and stress in occupational settings (Bhui et al. 2016; Edwards and Cable 2009; McCalister et al. 2006) has not measured the specific activities that might have contributed to higher stress and lower levels of well-being. However, the type of activity someone is doing might contribute to higher stress levels beyond other factors identified by previous research, such as support by coworkers and supervisors (Chyi et al. 2018). Determining which specific activities are associated with higher or lower levels of stress or well-being would therefore provide valuable information for future research investigating predictors of stress. We divided this study into an exploratory and a confirmatory part to investigate all these aspects. Both studies build on self-determination theory (Ryan and Deci 2000).

In the exploratory investigation, we first measured developers’ activities and self-reported well-being and productivity over a two-week period to assess changes throughout the lockdown. We compared wave 1 with wave 2 to assess the test-retest reliability and stability of the data captured. In particular, we found that the time software engineers spent on specific activities at home was comparable to the time spent in the pre-pandemic office. Nevertheless, we also found significant mean differences, such as less time dedicated to meetings and breaks and more time spent on specification and documentation. Interestingly, the time people spent on each activity was unrelated to their general well-being, perceived productivity, and other variables. In hindsight, this is not surprising because many factors affect our well-being and productivity. For example, well-being is impacted by a range of factors such as the quality of our relationships, personality, or situational factors (e.g., weather) (Connolly 2013; Diener 2009; Russo et al. 2021b), which makes it unlikely that spending an hour more or less on a specific activity will significantly impact well-being. However, we believe that activity-specific features are more likely to impact well-being and productivity, which is one of the primary motivations of the confirmatory study (i.e., what factors predict activity-specific well-being and productivity?).

In the confirmatory study, we measured activity-specific well-being and productivity, as well as the activity-specific needs for autonomy, competence, and relatedness (e.g., how productive professionals felt during the activity they spent the most time on during a day). Additionally, we explored whether task-unrelated variables such as resilience or work-life balance act as moderators between activity-specific needs and activity-specific well-being and productivity (see below for a more detailed rationale). Our findings confirm the long-standing intuition that software engineers feel more autonomous while coding than while in meetings or writing emails. Also, software engineers experience less satisfaction while bugfixing, whereas helping others is a satisfaction booster. We further characterized which activities resulted in higher feelings of satisfaction, productivity, autonomy, competence, and connectedness. Moreover, combining the exploratory and confirmatory studies yields a methodological lesson: only asking whether, for example, overall well-being and productivity are associated with time spent on specific activities misses the impact different activities can have on people’s well-being and productivity. Measuring activity-specific well-being and productivity levels overcomes this limitation.

In the remainder of this paper, we describe the related work in Section 2, followed by a description of our research design in Section 3. The analysis and results are described in Section 4. Implications and recommendations for software engineers and organizations are outlined in Section 5. Finally, we conclude this paper by presenting future research directions in Section 6.

2 Related Work

Research on behavioral and emotional aspects within the software engineering community is a relatively new but growing research topic (Sánchez-Gordón and Colomo-Palacios 2019). Developers’ behaviors and emotional states play a substantial role in how they perform their work activities (Graziotin et al. 2015). For this reason, the community started to focus specifically on software engineers’ behaviors (Lenberg et al. 2015), emotions (Graziotin et al. 2014), or personality traits (Cruz et al. 2015; Russo and Stol 2022).

Concerning the pandemic, there is widespread agreement that lockdowns have a negative influence on well-being (Brooks et al. 2020; Lunn et al. 2020). Living in a lockdown during a pandemic has been linked to increased levels of anger, depression, emotional exhaustion, fear of infecting others or becoming infected, insomnia, irritability, loneliness, low mood, and post-traumatic stress disorders (Sprang and Silman 2013; Hawryluck et al. 2004; Lee et al. 2005; Marjanovic et al. 2007; Reynolds et al. 2008; Bai et al. 2004; Tag et al. 2022). Furthermore, anxiety about infection (Kim et al. 2015; Prati et al. 2011), a lack of supplies or not being treated (Wilken et al. 2017), and false or conflicting information (Caleo et al. 2018) can all cause substantial stress and give rise to new approaches to regulating our emotions (Tag et al. 2022). Moreover, the psychological impacts of being quarantined may take years to manifest (Brooks et al. 2020).

Pre-COVID research, on the other hand, indicates that remote working is associated with improved work-life balance, creativity, productivity, reduced stress, and lower carbon emissions due to the absence of commuting (Owl Labs 2019; Anderson et al. 2015; Bloom et al. 2015; Vega et al. 2015; Baruch 2000; Cascio 2000). However, there are several apparent downsides to remote work, such as decreased teamwork and communication, loneliness, the sensation of always being ‘online,’ decreased motivation, and distractions at home (Buffer 2020; Yang et al. 2021). Such downsides aside, estimates indicate that remote work will grow significantly in the coming years (Owl Labs 2019; Gallup 2020).

In the software engineering domain, several large software companies, such as Stack Overflow or Red Hat, had already embraced working from home by designing ad hoc schemes before the start of the COVID-19 pandemic in 2020 (Mazzina 2017; RedHat 2015). Organizations do so to increase their employees’ job satisfaction and productivity while simultaneously reducing their operating expenses, such as office rent (Felstead and Henseke 2017; Pérez et al. 2002). Several aspects of remote and distributed working have been (indirectly) investigated by the Global Software Engineering community well before the pandemic (e.g., Smite et al. 2010; Herbsleb and Moitra 2001; Richardson et al. 2012). To better frame this study theoretically, we looked into peer-reviewed publications in Scopus that explicitly focus on working from home (i.e., not remote and distributed work). We made this choice to narrow down the subject matter and consider only articles whose primary focus is working from home. We identified thirteen relevant papers in total. Considering the vast but recent impact of COVID-19, we also selected non-peer-reviewed pre-prints on arXiv. Table 1 summarizes prior studies of remote working issues related to software engineers.

Table 1 Overview of prior studies about software engineers working from home

Most papers focusing on WFH were published in or after 2019 and are related to the COVID-19 pandemic. From a methodological perspective, most studies have been field studies involving a single company (i.e., Fujitsu (Higa et al. 2000), Baidu (Bao et al. 2020), and Microsoft (Ford et al. 2021; Miller et al. 2021; Butler and Jaffe 2021; Yang et al. 2021)). Such real-world investigations aimed to understand the research phenomena by generating research hypotheses. At the opposite end of the spectrum, six studies were conducted in a neutral setting by asking participants for quantifiable judgments and analyzing such data through statistical techniques. These six sample studies generalize their results to the entire software engineering population (Ralph et al. 2020b; Russo et al. 2021a, 2021b; Machado et al. 2021; Cucolaş and Russo 2023; Smite et al. 2021).

Content-wise, half of the papers are concerned with specific topics related to working from home, such as security (Pounder 1998; James and Griffiths 2014), process (Guo 2001), work productivity (Higa et al. 2000; Lamarche 2020), and inclusion (Ford et al. 2019). The other half mostly investigated well-being and productivity while working from home during the pandemic (Ford et al. 2021; Ralph et al. 2020b; Russo et al. 2021a, 2021b; Butler and Jaffe 2021; Machado et al. 2021; Smite et al. 2021) or productivity related to project characteristics (Bao et al. 2020; Cucolaş and Russo 2023).

Overall, the investigated topic is not new to the community. However, this short review shows that scholars focused on WFH topics in particular due to the COVID-19 pandemic and the subsequent lockdowns. Indeed, future work is needed to support developers working in a lockdown environment or in a reality where pandemic waves are part of our everyday lives. Alternatively, and more optimistically, software organizations will adopt hybrid work on a wide scale. Therefore, we believe that this subject matter is of utmost importance for software professionals’ well-being and productivity in the years to come. This is also important because past research has shown that there are some mean differences between software engineers and the general population (Russo et al. 2022). In other words, we cannot assume that findings from other populations (e.g., employees at Microsoft, the general population) generalize to software engineers.

3 Research Design

Our design was guided by the relevant ACM SIGSOFT Empirical Standards for longitudinal and sample studies (Ralph et al. 2020a). First, we applied an exploratory longitudinal design already described in Russo et al. (2021). Subsequently, to overcome the methodological limitations of the exploratory study while gaining further insights into the associations of activities with activity-specific satisfaction, productivity, and basic needs, we employed a cross-sectional design.

We formulated the following five main research questions, guided by previous research and by self-determination theory (Ryan and Deci 2000):

Research Question 1: Has the distribution of daily working activities of software engineers changed while WFH during the pandemic as compared to pre-pandemic daily working activities?

Research Question 2: Is the distribution of daily working activities related to well-being, productivity, and other variables?

Research Question 3: To what extent does Self-Determination Theory (i.e., the needs for autonomy, competence, and relatedness) predict software engineers’ activity-specific satisfaction and productivity during the COVID-19 pandemic?

Research Question 4: To what extent are the associations between activity satisfaction and productivity moderated by resilience and company support during the COVID-19 pandemic?

Research Question 5: Do software engineers’ work activities while WFH during the pandemic affect their activity-specific well-being, productivity, and psychological needs?

We designed the exploratory study to answer RQ1 and RQ2, whereas the confirmatory research was designed to answer RQ3 to RQ5.

Our first concern was to carefully recruit software professionals for our exploratory study. To do so, we used a multistage selection process, detailed in Section 3.2. We asked participants to complete the same survey on two occasions. Unique randomized IDs were assigned to participants to preserve their anonymity and match their responses across both waves. To address concerns about replicability and increase the reliability of our findings, we asked the same participants to complete all measures twice, two weeks apart. This allowed us to test whether the distribution of daily working activities changed during the lockdown. At the same time, we asked participants to report how much time they spent on 15 activities and compared the responses with a pre-pandemic sample (Meyer et al. 2019), which allowed us to test whether the distribution has changed since the onset of the first lockdown in 2020. To test RQ2 – is the time spent on different activities correlated with well-being, productivity, and other variables – we correlated the time spent on each activity with professionals’ general well-being, productivity, and other variables.

In a subsequent confirmatory study, we asked participants about their well-being, productivity, autonomy, competence, and relatedness to their co-workers while completing specific activities (e.g., “how stressed were you while coding?”). Specifically, to test RQ3 – whether the needs for autonomy, competence, and relatedness predict software engineers’ activity-specific satisfaction and productivity – we asked how satisfied, productive, autonomous, competent, and related to their co-workers participants felt while working on a specific activity (e.g., coding). Our design allowed us to test RQ3 across all activities but also separately for each activity.

Additionally, to investigate RQ4 – whether the associations between autonomy, competence, and relatedness and activity satisfaction and productivity are moderated by resilience and company support – we also included a range of conceptually related variables that measure facets of company support: caring leadership, work-life balance, empowerment, job enablement, soft company support, hard company support, and recognition. We expect that software engineers who are more resilient and receive more company support are less likely to be affected by, for example, reduced autonomy for a specific task. For instance, resilience or recognition might buffer against reduced autonomy because resilient people are more likely to bounce back after stressful events, such as being less able to make autonomous decisions (Smith et al. 2008; Weinstein and Ryan 2011). Further, software engineers who experience low autonomy, competence, or relatedness during their work will only experience lower satisfaction and productivity if their company does not provide adequate support that buffers against the negative impact. In other words, we expect the effect of the three needs on activity satisfaction and productivity to be reduced if resilience and company support are high.

Finally, to test RQ5 – does the activity impact activity-specific satisfaction, productivity, and psychological needs – we tested during which activity professionals felt relatively more or less satisfied, productive, and so on.

3.1 Theoretical Framework

We perform this investigation using the Self-Determination Theory (SDT) framework. This theoretical framework has been used, in particular, to design organizational policies that improve both well-being and high-quality performance (Gagné and Deci 2005). SDT is a macro theory of human motivation that focuses, among other things, on motivation in the workplace (Ryan and Deci 2000).

The general idea of SDT is that if the three basic needs for competence, autonomy, and relatedness are satisfied, this leads to an increase in professionals’ intrinsic motivation, productivity, and well-being. Indeed, employees’ well-being is not only an ethical concern for every business but also a pivotal aspect of enhancing organizational sustainability, which is directly related to customer satisfaction and financial success (Mackey and Sisodia 2014). As a macro theory, SDT includes several factors that lead to employees’ well-being, such as the three basic needs.

The motivation related to specific job activities influences employees’ productivity and well-being. Specifically, according to Deci et al. (2017), need satisfaction mediates the relation between the workplace-specific context, such as developers’ activities, and performance and wellness, as depicted in Fig. 1. In other words, the three basic needs of SDT applied to developers’ activities should be positively associated with well-being and productivity.

Fig. 1 Theoretical framework of Self-Determination Theory (SDT) in the workplace, adapted from Deci et al. (2017), where software engineering activities are the workplace-related independent variables and SDT the mediating variable

3.2 Participants

For the exploratory study, a power analysis using G*Power (Faul et al. 2009) version 3.1 revealed that to detect a small-to-medium effect size of r = .20, using a power of 1 − β = .80 (for a two-sided test), a sample size of at least 190 participants is required.Footnote 1 We assumed an effect size of r = .20 because this is close to the medium effect size in individual difference research (Gignac and Szodorai 2016) from which many of our variables stem (e.g., SDT). We used a power of .80 because it is conventional to keep the false-negative rate (i.e., the β-error) to 1 - .80 = .20 or lower (Cohen 1992). If we had assumed a larger effect size, fewer participants would have been needed to detect such a larger effect with a power of .80.
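As a rough cross-check, the same calculation can be approximated in R with the pwr package (a minimal sketch; the paper used G*Power 3.1, so the exact number differs slightly depending on the approximation used):

```r
# Approximate the required sample size for detecting r = .20 with
# alpha = .05 (two-sided) and power = .80, analogous to the G*Power analysis.
library(pwr)

pwr.r.test(r = 0.20, sig.level = 0.05, power = 0.80,
           alternative = "two.sided")
# yields n of roughly 194; G*Power reports at least 190 for the same inputs
```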

Participants were selected from a broader set of 500 software engineers who were carefully selected through a multistage process in a previous study by Russo and Stol (2022). To select this initial pool of participants, we applied a three-level screening process. First, we pre-screened participants on the Prolific platform. The pre-screening criteria were: knowledge of software development techniques, doing computer programming for a living, using technology at work, and a 100% approval rate in previous studies. This left us with 2,897 candidates. Then, we performed a competence screening. With the help of a questionnaire, we assessed the candidates’ knowledge in a time-boxed fashion with one question about software design and two about programming. After this phase, 514 candidates were included in our sample. Finally, we focused on the candidates’ attention with a quality screening, where we excluded informants who showed a suspicious response pattern or did not pass the attention checks of a 10-minute questionnaire about personality traits. The final set contained 483 fully screened software engineers.

For this study, we only selected professionals (from the Russo and Stol pool) who were working from home during the pandemic and lived in countries with comparable lockdown measures. We used the following criteria: the country had to be in an official lockdown, and the measures had to be rather homogeneous across the country. For example, countries such as Sweden with rather liberal lockdowns were excluded. Similarly, Germany was excluded because individual regions decided whether lockdown measures applied.Footnote 2 Finally, we obtained a sample of 192 software engineers who completed the first survey (Mage = 36.65 years, SD = 10.77, range = 19–63; 154 men, 38 women). Of those, 184 participated in the second wave two weeks later. We provide demographic information on participants’ gender, age, and location in Table 2. We collected our data between 26 and 30 April 2020 (wave 1) and between 10 and 13 May 2020 (wave 2).

Table 2 Demographic information of both samples

To identify participants for the confirmatory study, we also first ran a power analysis, which revealed that a sample size of 77 is sufficient to detect a medium effect size with three predictors (i.e., the needs for autonomy, competence, and relatedness) with a power of .80. However, to keep the length of the survey manageable, participants only selected three activities they had performed during the day and completed a series of questions that expressly referred to each of these three activities. We therefore aimed to recruit around 300 participants to obtain the required sample size of at least 77 participants for multiple activities. To ensure that the participants were software engineers, we ran a pilot study to screen our informants with questions developed by Danilova et al. (2021). The survey design is comparable to the previous, exploratory one. The pre-screening followed the same criteria. What differed is the competence screening, where we asked specific questions developed and validated by Danilova et al. Regarding the quality screening, of the 300 selected participants, 10 failed at least one test item and/or completed the survey in less than 4 minutes and were excluded. The vast majority of participants, 210, worked in ‘Software & IT,’ 20 in ‘Education & Research,’ and 11 in ‘Finance, banking & insurance.’

To ensure high data quality, we recruited participants from the academic data collection platform Prolific Academic and compensated participants above the US minimum wage (Palan and Schitter 2018; Russo 2022). The survey was run using Qualtrics.

3.3 Measurements for the Exploratory Longitudinal Study

For the exploratory study, we derived the variables from a related project. For a complete presentation of the instruments used, we refer directly to Russo et al. (2021b) and the Supplementary Materials. Most of the scales described below have been cited between hundreds and tens of thousands of times and have been used across a wide range of contexts (e.g., organizational, clinical). The longitudinal design also allowed us to compute test-retest reliabilities, rit (i.e., the stability of responses across two or more time points), by correlating responses given by participants at time 1 with those at time 2 (we use time and wave interchangeably), which provides information about a scale’s reliability in addition to the commonly used Cronbach’s alpha (McDonald 2013). Test-retest reliabilities close to 0 are undesirable since they indicate a low association between the two time points, suggesting, among other things, poor data quality. Cronbach’s alpha is a measure of scale reliability. For exploratory research using new measurement scales, values above .60 are desirable, while for confirmatory research the threshold is above .70 (and below .95) (Hair et al. 2013).
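To illustrate how these two reliability indices are obtained (a minimal sketch on simulated data; the variable and item names are hypothetical and not the actual study data):

```r
# Test-retest reliability and Cronbach's alpha on simulated item data.
library(psych)

set.seed(1)
n <- 192
true_score   <- rnorm(n)
wellbeing_t1 <- true_score + rnorm(n, sd = 0.5)   # scale score, wave 1
wellbeing_t2 <- true_score + rnorm(n, sd = 0.5)   # scale score, wave 2

cor.test(wellbeing_t1, wellbeing_t2)     # test-retest reliability r_it

items_t1 <- data.frame(                  # three items of one scale, wave 1
  i1 = true_score + rnorm(n, sd = 0.6),
  i2 = true_score + rnorm(n, sd = 0.6),
  i3 = true_score + rnorm(n, sd = 0.6)
)
psych::alpha(items_t1)                   # Cronbach's alpha (internal consistency)
```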

Activities

We measured the same 15 activities that were measured by Meyer et al. (2019), both because we believe they cover most activities and because this gives us a pre-pandemic comparison group. We asked participants, “During the past week, how much time did you spend on each task percentage-wise (%)?” This was followed by the 15 activities, rated on a 101-point slider scale ranging from 0% to 100%. For activities that might have been more ambiguous, a brief explanation was added in brackets, such as ‘Helping (helping, managing or mentoring people)’ and ‘Networking (maintaining relationships).’ The 15 activities are coding, bugfixing, meetings, testing, email, breaks, code review, specification, learning, helping, administration, interruptions, documentation, various (i.e., other activities not listed above), and networking.

Well-being

We used the Satisfaction with Life Scale (Diener et al. 1985) because it is one of the most validated scales and shows good convergent and discriminant validity (Pavot and Diener 1993). Example items include “The conditions of my life in the past week were excellent” and “I was satisfied with my life in the past week”. Responses were given on a 7-point scale ranging from 1 (strongly disagree) to 7 (strongly agree). The Cronbach’s alpha values measuring internal consistency for both waves were αtime1 = .90 and αtime2 = .90 (rit = .72, p < .001).

Productivity

Measuring productivity in software engineering is a highly debated issue. Some scholars, for example, suggest making the measurement more objective by using function points (Wagner and Ruhe 2018). Ko has criticized this viewpoint as being detrimental in the long run (Ko 2019). Other researchers propose a self-reflection measure in which developers self-report their daily productivity (Meyer et al. 2014). In this work, we adopted a similar approach. We did not use a standard measure (as, e.g., Ralph et al. (2020b) did). Instead, we operationalized productivity as a function of time spent working and efficiency per hour, compared to a typical week. Specifically, we asked respondents three items: “How many hours have you been working approximately in the past week?” (Item 1), “How many hours were you expecting to work over the past week assuming there would be no global pandemic and lockdown?” (Item 2), and “If you rate your productivity (i.e., outcome) per hour, has it been more or less over the past week compared to a normal week?” (Item 3). Item 3 measured perceived efficacy and was answered on a bipolar scale that ranged from “100% less productive” to “100% more productive”, with the scale mid-point being “0%: as productive as normal”. We computed productivity with the following formula: productivity = (Item1/Item2) × ((Item3 + 100)/100). Productivity scores from 0 to .99 reflect lower than normal productivity, scores of 1 the same amount of productivity, and scores above 1 higher levels of productivity.
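A worked example of this formula (hypothetical responses, not taken from our data):

```r
# Hypothetical respondent: worked 36 of the expected 40 hours and felt
# 10% less productive per hour than in a normal week.
item1 <- 36    # hours actually worked in the past week
item2 <- 40    # hours expected without pandemic and lockdown
item3 <- -10   # perceived efficiency per hour (% more/less than normal)

productivity <- (item1 / item2) * ((item3 + 100) / 100)
productivity   # 0.81, i.e., below-normal productivity (scores < 1)
```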

The reason for this choice is that we wanted to investigate the variance in productivity while working remotely as compared to being in the office. We acknowledge that some readers might have concerns with this approach. For example, software engineers might understand productivity differently: one software engineer might feel productive after having been asked to do many high-priority tasks other than their main task for the week, whereas another software engineer might feel less productive in the same situation. However, this is an issue with all our scales (e.g., we do not know whether participants interpret autonomy, competence, or well-being in the same way), and we nevertheless find strong correlations among these variables. This interpretation is supported by psychological research: there is substantial heterogeneity in how people interpret human values (e.g., equality, freedom, security) (Hanel et al. 2018), yet values are still strong predictors of personality and beliefs (Kajonius et al. 2015; Saroglou et al. 2004). As long as there is no systematic bias in how our participants understood productivity – and we do not assume there is – we do not believe this is an issue. Additionally, the test-retest correlation was large, rit = .50, p < .001, and productivity correlated negatively with the number of breaks taken (Table 5).

Stress

We used a 4-item version of the Perceived Stress Scale (Cohen et al. 1983), as it is an often-used and well-validated scale (Lee 2012). Example items include “In the last week, how often have you felt that you were unable to control the important things in your life?” and “In the last week, how often have you felt confident about your ability to handle your personal problems?” The response scale ranged from 1 (Never) to 4 (Very often). α1 = .80, α2 = .77 (rit = .73, p < .001).

Boredom

We used the Boredom Proneness Scale (Farmer and Sundberg 1986; Struk et al. 2017) because it is a well-validated scale (Tam et al. 2021). Example items include “It is easy for me to concentrate on my activities” and “Many things I have to do are repetitive and monotonous”. Items were answered on a 7-point scale ranging from 1 (Strongly disagree) to 7 (Strongly agree). α1 = .87, α2 = .87 (rit = .69, p < .001).

Autonomy, Competence, and Relatedness

To measure the three needs of self-determination theory (Ryan and Deci 2000), we used the balanced measure of psychological needs scale (Sheldon and Hilpert 2012), which is also well validated (Neubauer and Voss 2016). Example items include “I was free to do things my own way” (need for autonomy), “I did well even at the hard things” (need for competence), and “I felt unappreciated by one or more important people” (need for relatedness, recoded). The Cronbach’s alpha levels were, for autonomy: α1 = .72, α2 = .76 (rit = .76, p < .001); for competence: α1 = .77, α2 = .65 (rit = .76, p < .001); and for relatedness: α1 = .79, α2 = .78 (rit = .71, p < .001).

Quality and Quantity of Communication with Colleagues and Line Managers

We used a self-developed three-item instrument to capture how positive and supportive the communication with colleagues and line managers has been. The items are “I feel that my colleagues and line manager have been supporting me over the past week”, “I feel that my colleagues and line manager believed in me over the past week”, and “Overall, I am happy with the interactions with my colleagues and line managers over the past week.” (α1 = .88, α2 = .92; rit = .67, p < .001).

Daily Routines

We developed a five-item scale to capture participants’ daily habits, as having automaticity in one’s life frees cognitive resources for other things such as work (Moors and De Houwer 2006). The items were designed to capture a broad range of daily activities that were possible under the regulations in place in most countries at the time of data collection (spring 2020). The items are “I am planning a daily schedule and follow it”, “I follow certain tasks regularly (such as meditating, going for walks, working in timeslots, etc.)”, “I am getting up and going to bed roughly at the same time every day during the past week”, “I am exercising roughly at the same time (e.g., going for a walk every day at noon)”, and “I am eating roughly at the same time every day” (α1 = .75, α2 = .78; rit = .73, p < .001).

Distractions at Home

We developed a two-item scale to measure perceived distraction in general, as measuring the exact causes of distractions would have been beyond the scope of our study. The items are “I am often distracted from my work (e.g., noisy neighbors, children who need my attention)” and “I am able to focus on my work for longer time periods” (recoded) (α1 = .64, α2 = .63; rit = .63, p < .001).

3.4 Measurements for the Confirmatory Cross-Sectional Study

3.4.1 Measurement of Activity-Specific Variables

After providing informed consent, participants were instructed: “Which of the following tasks have you spent most time with yesterday? For example, when you spent most of your time in two meetings, pick the meeting that went longer. Select three tasks.” Participants selected three of the activities we used in Study 1; breaks, interruptions, and various were excluded, leaving 12 activities: coding (n = 192), bugfixing (111), testing (96), specification (22), reviewing (91), documenting (40), meetings (87), emails (51), helping (33), networking (11), learning (93), and administration (14). Participants then completed 17 items for each task: 8 measuring our two dependent variables, well-being and productivity, and 9 measuring our three independent variables, the needs for autonomy, competence, and relatedness.

Satisfaction

was measured with six items we created for the purpose of the study. The items were created to capture positive and negative aspects of satisfaction (Karademas 2007). In other words, some items were reverse-scored, which might result in lower reliability (e.g., if a participant gives the item only a cursory read) but comes with the advantage of higher validity (Clifton 2019). The wording of the six items is “How stressed were you during the task?” (reverse-scored), “How many positive emotions have you felt during the task?”, “How bored were you during this task?” (reverse-scored), “After completing the task, I felt tired” (reverse-scored), “Performing this task frustrated me” (reverse-scored), and “I felt exhausted after the task” (reverse-scored). The reverse-scored items were recoded so that higher scores indicate higher well-being. Answers were given on a scale ranging from 1 (Not at all) to 7 (Very). A principal component analysis revealed that the 6 items loaded on one component, with good internal consistency (α = .80).
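A minimal scoring sketch for this scale (simulated responses; the item and variable names are hypothetical):

```r
# Score the six-item activity-satisfaction scale: recode the five
# reverse-scored items (1-7 scale), average, and check dimensionality
# and internal consistency.
library(psych)

set.seed(2)
n <- 300
sat <- as.data.frame(matrix(sample(1:7, n * 6, replace = TRUE), ncol = 6))
names(sat) <- c("stressed", "positive", "bored", "tired", "frustrated", "exhausted")

rev_items <- c("stressed", "bored", "tired", "frustrated", "exhausted")
sat[rev_items] <- 8 - sat[rev_items]     # recode so higher = more satisfied

satisfaction <- rowMeans(sat)            # composite score per rated activity

prcomp(sat, scale. = TRUE)               # PCA; the real data showed one component
psych::alpha(sat)                        # Cronbach's alpha; reported as .80
```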

Productivity

was measured with two items we created for the purpose of the study: “How productive have you been during this task?”, which was answered on a scale ranging from 1 (Not at all) to 7 (Very), and “What percentage of your goals have you reached during < task >,” which was answered on a 0-100 scale. We created both items as they measure related, yet slightly different, aspects of productivity. For example, a software engineer can feel productive but not have reached all of their goals because unexpected issues occurred while working on an activity. If the issues were overcome, the software engineer might feel productive but not have fully reached their goals. Both items were standardized before being averaged (α = .50).

To measure the three independent variables, we adapted three items for each of the three needs of the self-determination theory (Ryan and Deci 2000) from the balanced measure of psychological needs scale (Sheldon and Hilpert 2012). The scale measures each of the three needs with six items. We selected those items which we judged as best suitable to be adapted for our purpose. We chose three items to get a good balance between brevity and informativeness: For example, if we had measured each need with only two items, we would have ended up with only one if a participant skipped an item as not applicable; conversely, selecting four items per need would have resulted in nine more items (i.e., 3 needs × 3 activities) for the full survey, thus increasing its length. All items were answered on a 7-point response scale varying from 1 (Not at all) to 7 (Fully) with an 8th option, ‘Not applicable.’

Need for Autonomy

was measured with “I was really doing what interests me,” “I was free to do things my own way,” and “I had a lot of pressures I could do without when working on the task” (recoded). However, as the last item was uncorrelated with the other two, rs = -.00 and -.14, we only combined the first two items (α = .46) into an autonomy factor and included the last item as a single-item predictor.Footnote 3

Need for Relatedness

was measured with “I felt close and connected with people working on the same task as me,” “I felt appreciated by one or more people working on the same task as me,” and “I had disagreements or conflicts with people working on the same task as me” (recoded). However, as the last item was uncorrelated with the other two, rs = .09 and .06, we only combined the first two items (α = .73) into a relatedness factor and included the last item as a single-item predictor.Footnote 4

Need for Competence

was measured with “I was successfully completing the task,” “I did well even at the hard things,” and “I struggled to complete the task” (recoded; α = .64). Thus, instead of the three predictors, we now have five, two of which are single-item predictors. While single-item scales are sometimes considered problematic because of possible low reliability, they are often used in research and – assuming there is evidence that participants paid attention to the items, as evidenced by good internal consistencies of other scales – can produce meaningful findings (Gebauer et al. 2017; Wolf et al. 2021). Indeed, the results for the two single-item measures are in line with expectations (see below).

3.4.2 Measurement of Task-Independent Variables

Additionally, we included variables that the exploratory investigation suggested to be related to our dependent variables.

Resilience

was measured with the 6-item Brief Resilience Scale (Smith et al. 2008). Participants indicated how much they agreed with statements such as “I tend to bounce back quickly after hard times” and “It is hard for me to snap back when something bad happens” (recoded). Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .73).

Caring Leadership

was measured with the 7-item Caring Leadership Scale (Louis et al. 2016). Example items include “My manager develops an atmosphere of caring and trust” and “I feel free to discuss work problems with my manager without fear of having it used against me later.” Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .85).

Work-Life Balance

was measured with a 5-item scale. The items from this and the following five scales were provided by Qualtrics and offered to their users (Qualtrics 2022). After reading the items, we judged them as appropriate measures of the constructs (e.g., work-life balance) they claim to measure. Example items include “My workload is manageable” and “I have the flexibility I need in my work schedule to meet both work and personal needs.” Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .84).

Empowerment

was measured with a 7-item scale. Example items include “I am given the opportunity to be involved in decisions that affect me” and “Employees are encouraged to participate in decisions that affect their work.” Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .83).

Job Enablement

was measured with a 7-item scale. Example items include “My job is challenging and interesting” and “My work-from-home workspace allows me to be productive.” Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .77).

Soft Company Support

was measured with three items, including “My company is providing me with the necessary software tools to work from home” and “My company is providing me with the necessary flexibility so that I can work from home properly.” Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .64).

Hard Company Support

was measured with three items, including “My company is supportive in providing me the necessary work from home setting (e.g., chair, screen, mouse).” and “From the start of the lockdown, my company is taking care also of things it didn’t do before (e.g., internet bill, electricity bill).” Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .76).

Recognition

was measured with a 7-item scale. Example items include “I receive meaningful recognition when I do a good job” and “My manager values my contribution.” Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .89).

4 Analysis & Results

In this section we describe which analyses we used to address our five research questions and the results.

4.1 RQ1: Has the Distribution of Daily Working Activities of Software Engineers Changed while WFH during the Pandemic as Compared to Pre-Pandemic Daily Working Activities?

To test RQ1, we first compared the time participants reported to have spent on each of the 15 activities with the times reported by Meyer et al. (2019). The results are displayed in Fig. 2, as well as Tables 3 and 4. To test whether participants in our sample reported spending more or less of their time on certain activities than the software developers surveyed by Meyer et al. (2019), we performed a series of one-sample t-tests. For example, we tested whether the percentage of time participants in our sample spent coding at time 1 was significantly different from 15%, which is the percentage reported by Meyer et al. (see Table 3, second column). That is, we tested whether participants in our sample spent significantly more time (i.e., > 15%) or less time (< 15%) coding than participants in Meyer et al.’s pre-pandemic study. We performed 15 (activities) × 2 (time points) = 30 t-tests (two-tailed, since we did not have directed hypotheses).Footnote 5
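Each of these 30 tests follows the same pattern; a minimal sketch for the coding example (simulated percentages; the variable name is hypothetical):

```r
# One-sample t-test of the time spent coding at time 1 against the
# pre-pandemic benchmark of 15% reported by Meyer et al. (2019).
set.seed(3)
coding_pct_t1 <- pmin(pmax(rnorm(192, mean = 16, sd = 10), 0), 100)

t.test(coding_pct_t1, mu = 15, alternative = "two.sided")
```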

Fig. 2 Distribution of software engineering work activities during the two waves of our study and in a typical workday of software engineers as reported by Meyer et al. (2019)

Table 3 Comparisons of both waves with time spend on activities as reported by Meyer et al. (2019)
Table 4 Comparisons of activities between time 1 and time 2

Software engineers in our sample reported on average to have spent less time bugfixing, in meetings, getting interrupted (only at time 2), helping (only at time 2), and taking breaks, but more time on testing, specification, writing documentation, networking (only at time 1), learning, and administrative activities than the participants surveyed by Meyer et al. (Table 3). However, what our participants and those of Meyer et al. reported differed by only a few percentage points (see Fig. 2). This visual inspection of the data is supported by a correlation analysis. The percentages of time spent on the 15 activitiesFootnote 6 reported by Meyer et al. correlated with those reported by our participants with r(13) = .84, p < .0001 at time 1 and r(13) = .83, p = .0001 at time 2. To obtain these correlations, we correlated the mean percentages reported in columns 2-4 of Table 3 with each other. That is, we tested whether the average percentages spent on each activity reported by the participants in the Meyer et al. sample align with those reported by the participants in our sample at waves 1 and 2. This suggests that while there are some deviations, the overall order of activities remains stable. It further supports the quality of our data: if our participants had responded carelessly or even randomly, these two correlation coefficients would be around 0.

In the next step, we explored whether participants’ activities changed over time during the lockdown. To do this, we performed a series of paired t-tests (Table 4). The only statistically significant differences were observed for networking and taking breaks. At time 2, participants spent less time networking and taking breaks compared to time 1. Overall, the relative order of the activities remained very stable across time on the group level (i.e., when correlating the group averages for the activities of time 1 and 2), r(13) = .99, p < .0001.
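A sketch of these two steps on simulated data (a paired t-test for one activity and the correlation of the 15 group-average percentages across waves; all values are placeholders):

```r
# Paired t-test: did the time spent on breaks change from wave 1 to wave 2?
set.seed(4)
breaks_t1 <- pmin(pmax(rnorm(184, mean = 8, sd = 5), 0), 100)
breaks_t2 <- pmin(pmax(breaks_t1 - 1 + rnorm(184, sd = 3), 0), 100)
t.test(breaks_t1, breaks_t2, paired = TRUE)

# Stability of the activity distribution at the group level:
# correlate the 15 group means of wave 1 with those of wave 2.
mean_t1 <- runif(15, min = 2, max = 20)
mean_t2 <- mean_t1 + rnorm(15, sd = 1)
cor.test(mean_t1, mean_t2)               # reported in the paper as r(13) = .99
```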

4.2 RQ2: Is the Distribution of Daily Working Activities Related to Well-Being, Productivity, and other Variables?

To test RQ2, we correlated the time participants spent on each activity with the selected variables. This was possible because the activities were mostly uncorrelated at both time points on an individual level. We report Pearson’s correlation coefficients (r) in our tables since most of the data were normally distributed. However, for the sake of completeness, we also ran non-parametric Spearman’s rank correlation tests (reported in the Supplementary Material), which provided very similar results, suggesting the robustness of our findings. In total, we computed at both time points 13 (well-being related variables and productivity) × 15 (activities) = 195 correlations. Given the large number of comparisons, we changed our significance threshold from α = .05 to .0005. A Bonferroni correction would have resulted in an adjusted alpha level of .00017, which is overly conservative and does not consider that some variables are correlated (e.g., distractions and stress). Thus, the adjusted significance threshold of .0005 seemed appropriate to us, neither overly conservative nor liberal. This new threshold implies that only correlation coefficients of |r| ≥ .25 are significant, because the p-value of r = .25 is just below the .0005 threshold for our sample size of 192, p ≈ .00047.
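The threshold can be verified directly from the t-distribution underlying the Pearson correlation test (a quick check, not part of the original analysis scripts):

```r
# Two-sided p-value of r = .25 with n = 192, via the t statistic of
# the Pearson correlation test.
r <- 0.25
n <- 192
t_stat  <- r * sqrt((n - 2) / (1 - r^2))
p_value <- 2 * pt(-abs(t_stat), df = n - 2)
p_value   # approximately .00047, just below the adjusted alpha of .0005
```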

The correlation coefficients are presented in Tables 5 and 6. This analysis did not show results that were significant at α = .0005 across both time points. At time 1, three significant correlations emerged which were no longer significant at time 2. First, productivity was negatively correlated with time spent on breaks, r = −.30, p = .00002, which can be considered a further validation of our productivity measure rather than a meaningful finding in itself. At time 2, the correlation between productivity and time spent on breaks was again negative but did not reach statistical significance, r = −.16, p = .03. Second, relatedness correlated negatively with interruptions at time 1, r = −.27, p = .0002, but not at time 2, r = −.04, p = .58. Third, autonomy correlated negatively with meetings at time 1, r = −.25, p = .00048, but not at time 2, r = −.17, p = .02. Overall, we conclude that the work activities carried out at home are not related to well-being, productivity, and the other variables.

Table 5 Correlations between activities and variables at Time 1
Table 6 Correlations between activities and variables at Time 2

4.3 RQ3: Do the Needs for Autonomy, Competence, and Relatedness Predict Software Engineers’ Activity-Specific Satisfaction and Productivity?

To test the third research question, we first ran two linear mixed models with random intercepts across all activities using the R-package lme4, version 1.1-25 (Bates et al. 2015). A linear mixed model is superior to a standard multiple linear regression here because the responses are not independent, which violates an assumption of regression analysis (Brauer and Curtin 2018): each participant responded to three activities, making their responses dependent. Ignoring such dependencies can result in biases such as an inflated type-I error rate (i.e., false positives) (Judd et al. 2012). Figure 3 displays the results. Across all activities, activity satisfaction was negatively predicted by conflicts and pressure, and positively by autonomy, competence, and relatedness.Footnote 7 In contrast, productivity was only predicted by autonomy, relatedness, and especially competence.
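The model structure looks roughly as follows (a sketch on simulated data; the data frame and variable names are hypothetical and not the study’s actual analysis script):

```r
# Linear mixed model with random intercepts per participant, as each
# participant rated three activities (lme4; the paper used version 1.1-25).
library(lme4)

set.seed(5)
n_p <- 290; k <- 3
id          <- rep(seq_len(n_p), each = k)
autonomy    <- rnorm(n_p * k)
competence  <- rnorm(n_p * k)
relatedness <- rnorm(n_p * k)
conflict    <- rnorm(n_p * k)
pressure    <- rnorm(n_p * k)
satisfaction <- rep(rnorm(n_p, sd = 0.5), each = k) +   # participant-level intercepts
  0.2 * autonomy + 0.3 * competence + rnorm(n_p * k)
d <- data.frame(id, satisfaction, autonomy, competence, relatedness,
                conflict, pressure)

fit <- lmer(satisfaction ~ autonomy + competence + relatedness +
              conflict + pressure + (1 | id), data = d)
summary(fit)
confint(fit)   # 95%-CIs, analogous to those plotted in Fig. 3
```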

Fig. 3 Predictors of activity satisfaction and productivity across all activities. The horizontal lines represent 95%-CIs

In the next step, we tested whether the pattern of our findings would hold within each activity completed by at least 77 participants. This threshold was used because the power analysis reported above revealed that at least 77 participants were needed to detect a medium effect size. As can be seen in Figs. 4 to 6, the pattern of results was mostly consistent across the activities, but some minor deviations occurred. For example, for meetings, competence did not matter for participants’ activity satisfaction and productivity, but autonomy did. In other words, during meetings, it matters more whether people feel autonomous than whether they feel competent.

Fig. 4 Predictors of well-being and productivity across activities with n ≥ 77. The horizontal lines represent 95%-CIs. Plot one of three

Fig. 5 Predictors of well-being and productivity across activities with n ≥ 77. The horizontal lines represent 95%-CIs. Plot two of three

4.4 RQ4: Are the Associations Between Activity Satisfaction and Productivity Moderated by Resilience and Company Support?

We tested the fourth research question by running a series of 2 (DVs: activity satisfaction vs. productivity) × 5 (IVs: the activity-specific variables autonomy, competence, relatedness, conflict, and pressure) × 8 (moderators: resilience, leadership, balance, empowerment, enablement, soft support, hard support, recognition) = 80 moderated regression analyses. Specifically, we multiplied each of the task-dependent variables by each of the task-independent variables. Given the large number of tests, we set our α-level to .001 to reduce the likelihood of false-positive results. However, none of the interactions reached statistical significance, ps > .001. Together, this suggests that only activity-specific variables matter for activity satisfaction and productivity.
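One of these 80 moderation tests, sketched on simulated data (the interaction term carries the test; the variable names are hypothetical):

```r
# Does resilience moderate the association between activity-specific
# autonomy and activity satisfaction? (Random intercepts per participant.)
library(lme4)

set.seed(6)
id           <- rep(1:290, each = 3)
autonomy     <- rnorm(870)
resilience   <- rep(rnorm(290), each = 3)     # participant-level moderator
satisfaction <- rep(rnorm(290, sd = 0.5), each = 3) + 0.2 * autonomy + rnorm(870)
d_mod <- data.frame(id, satisfaction, autonomy, resilience)

fit_mod <- lmer(satisfaction ~ autonomy * resilience + (1 | id), data = d_mod)
summary(fit_mod)   # the autonomy:resilience coefficient is the moderation test
```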

Additionally, we tested whether any of the eight task-independent variables would be associated with activity satisfaction and productivity; we again ran two linear mixed models with random intercepts across all activities. The predictors were resilience, leadership, balance, empowerment, enablement, soft support, hard support, and recognition. None of the predictors reached statistical significance, ps > .16.

4.5 RQ5: Do Software Engineers’ Work Activities while WFH during the Pandemic affect their Activity-Specific Well-Being, Productivity, and Psychological Needs?

Since our design left many empty cells,Footnote 8 a standard approach such as a within-subject ANOVA was not possible (e.g., no participant reported both networking and doing administrative activities). We therefore standardized all seven outcome variables and tested, for each scale, whether activities would lie above or below the mean using a series of one-sample t-tests. This approach allows testing whether doing a specific activity increases or decreases, for example, activity satisfaction compared to the average of all activities. Considering the high number of tests involved in our analysis, we set the new alpha level to .001, which means that we only consider results to be significant if p < .001 or the 99.9%-CI does not include zero. Results are displayed in Figs. 7 and 9 and Tables 7 and 8. Activity satisfaction was on average lower when participants were bugfixing [M = -0.48, SD = 1.02, t(114) = -5.07, p < .0001] and higher when participants were helping others [M = 0.56, SD = 0.77, t(35) = 4.39, p = .0001]. Further, participants experienced higher levels of autonomy when coding and lower levels of autonomy when in meetings and writing emails. Competence was lower when bugfixing and higher when helping people. Relatedness was only higher when helping. Pressure and conflict were not impacted by the type of activity.
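A minimal sketch of this procedure for one outcome and one activity (simulated data; the variable names are hypothetical):

```r
# Standardize activity satisfaction across all responses, then test whether
# its mean during bugfixing deviates from the grand mean of zero.
set.seed(7)
activity     <- sample(c("coding", "bugfixing", "helping"), 500, replace = TRUE)
satisfaction <- rnorm(500) - 0.4 * (activity == "bugfixing")

satisfaction_z <- as.numeric(scale(satisfaction))   # z-standardized outcome

t.test(satisfaction_z[activity == "bugfixing"], mu = 0,
       conf.level = 0.999)   # 99.9%-CI, matching the adjusted alpha of .001
```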

Table 7 Differences between activities
Table 8 Differences between activities (continued)

4.6 Exploratory Analysis

We explored whether there are any gender mean differences in any of our activity-independent and activity-dependent variables, because other studies found that women’s mental health and productivity were more negatively impacted by the COVID-19 pandemic than men’s (Carli 2020). In total, we conducted 8 (activity-independent) + 201 (activity-dependent with more than one woman responding) = 209 independent-samples t-tests. Because of the large number of comparisons, we adjusted our α-threshold to .0005. None of the t-tests reached statistical significance, all ps > .0006. We report descriptive and relevant inferential statistics for each of the 209 t-tests in the Online Supplemental Materials on Zenodo. Additionally, we explored whether the day of the week is associated not only with productivity – previous research found that productivity is higher Tuesdays to Thursdays and lower on Mondays and Fridays (Senney and Dunn 2019) – but also with well-being or needs. However, this was not the case, according to a series of both Pearson’s and Spearman’s rank correlations, rs < .13, ps > .07.

5 Discussion

5.1 Revised Theoretical Framework

Our results partly align with the theoretical framework proposed by Deci et al. (2017) (cf. Fig. 1). Whereas the exploratory study did not find that activities were significantly correlated with needs or the dependent variables, the confirmatory study did find such links. We found that some activities were linked with the activity-specific needs of self-determination theory as well as with activity-specific satisfaction. Additionally, activity-specific needs were associated with activity-specific satisfaction and productivity. However, while our findings are in line with Deci et al.’s (2017) broad framework, we are, to the best of our knowledge, the first to test which activities show stronger links with activity-specific needs, satisfaction, and productivity.

However, our confirmatory study also supports a revised theoretical framework: The links between the three needs and activity satisfaction as well as productivity are moderated by the type of activity (this moderation is represented in Fig. 8 and follows from our findings in Figs. 4 to 6). In other words, the strength of the association between needs and activity satisfaction as well as productivity depends on the type of activity. The model depicted in Fig. 8 does not directly contradict the model shown in Fig. 1; rather, it revises it. The two can co-exist, as our data show. The model from Fig. 1 is more relevant for understanding underlying mechanisms and basic processes, whereas the model from Fig. 8 has more applied value. Indeed, the latter model offers intriguing possibilities for future research, which we discuss in more detail below.

Fig. 6 Predictors of well-being and productivity across activities with n ≥ 77. The horizontal lines represent 95%-CIs. Plot one of three

5.2 Implications for Research and Practice

Our investigation addresses the need for scholarly evidence concerning the effects of WFH during the COVID-19 pandemic on software developers’ work activities, including the impact on professionals’ well-being and productivity. Further, a deeper understanding of the effect of the pandemic on professional working life for the large number of software professionals working remotely provides relevant insights for both research and practice. To this end, this study makes several contributions, as summarized in Table 9.

Table 9 Summary of key findings & implications

First, we ran an exploratory longitudinal study during the COVID-19 lockdown with 192 carefully selected software professionals to address the first and second research questions. We assessed developers’ working activities and their perceived well-being, productivity, and other relevant psychological and social variables. Data quality was supported by high test-retest reliability (at least .50 for each variable) and Cronbach’s alpha values above .60.

Second, we compared the time spent on typical office-based activities with the same activities while working from home. Using the taxonomy and previously collected data of Meyer et al. (2019), we ran 30 one-sample t-tests to assess significant differences. Although we reported several differences, they are relatively small, which indicates that the time spent on different activities is almost identical in both the online and the physical working environment.

Third, we analyzed whether the time spent on each working activity changed during the pandemic. After performing 15 paired t-tests, we conclude that developers did not change how they spent their time over a period of two weeks.

Fourth, we investigated whether well-being-related variables and productivity are associated with the time spent on each activity and whether the findings replicate across both time points. To do so, we ran 195 correlation analyses twice (once per time point). Our results suggest that well-being-related variables and productivity are not associated with the time spent on each activity.
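The analyses summarized in the second, third, and fourth contributions above can be sketched in R as follows; wave1 and wave2 are wide-format data frames with one row per participant, and meyer_benchmark is the published office-based mean for one activity (all names are illustrative placeholders).

# Second contribution: time spent on an activity while WFH vs. the office
# benchmark reported by Meyer et al. (one-sample t-test, repeated per activity).
t.test(wave1$time_coding, mu = meyer_benchmark)

# Third contribution: change in time spent between the two waves
# (paired t-test, repeated for each of the 15 activities).
t.test(wave1$time_coding, wave2$time_coding, paired = TRUE)

# Fourth contribution: association between time spent on an activity and a
# well-being-related variable (repeated 195 times per wave).
cor.test(wave1$time_coding, wave1$wellbeing)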

However, a shortcoming of our exploratory study is that we only measured general well-being, productivity, and needs, along with the amount of time spent on various activities during the past week. The lack of significant findings could suggest either that the type of activity does not impact professionals’ well-being and productivity or that many other factors impact well-being and productivity more strongly (e.g., quality of social contacts (Miller et al. 2021; Russo et al. 2021b)). Our confirmatory study, which used activity-specific measures, allowed us to examine the former possibility directly.

In our confirmatory study, we tested whether activity-specific variables, such as the need for autonomy, competence, relatedness, and activity-independent variables, such as resilience or empowerment, are associated with activity satisfaction and productivity (to address the third research question). Additionally, we tested whether activity-specific and activity-independent variables interact in predicting activity satisfaction and productivity, addressing the fourth research question. Finally, we tested whether specific activities impact professionals’ activity-specific satisfaction and productivity, addressing the fifth and final research question. Here, we summarize and discuss the results of our research questions.

RQ1 : Has the distribution of daily working activities of software engineers changed while WFH during the pandemic as compared to pre-pandemic daily working activities?

On the whole, we did not register significant changes to developers’ work distribution. Further, we highlight that Meyer et al.’s sample covers only one software company (Microsoft) (Meyer et al. 2019), whereas we surveyed developers across many globally distributed companies. Therefore, some deviations were expected. Nevertheless, we still report an overall consistency between our WFH data and Meyer et al.’s analysis of a typical office day at Microsoft. Our results show that neither working from home nor sample type (Microsoft employees vs. a diverse sample) substantially affected how software engineers dedicate their time to specific activities. However, we observed some minor differences. Most notably, software engineers in our sample spend less time on bugfixing, meetings, and breaks. Also, they report less time on e-mail writing (only in wave 1) and fewer interruptions when working from home (only in wave 2). In contrast, they spend more time on specifications, testing, administration, documentation, and learning. It is unclear whether those minor differences emerged because of the pandemic or because our sample differed.

We observe that time spent in meetings is significantly reduced while working remotely. One explanation is that online meetings are, on average, shorter and more time-efficient than those in the office. For example, small talk might be perceived as more challenging during online meetings than in-person meetings. Alternatively, they might be better planned, since setting up online meetings often requires a clear start and end time. Also, our participants invested in improving their skill set, as they spent more time learning. Similarly, developers seem to be more focused on their activities: they reported fewer breaks and interruptions. At the same time, developers remain linked to their organization and their colleagues, since the time they spend on networking remains the same. We did not register any significant change in work activities during our exploratory investigation, with only two exceptions: in the first wave, developers spent more time on breaks and networking than during the second wave. Nevertheless, we report a correlation close to 1 between the group averages, suggesting a very high consistency in the pandemic activity distribution. The fact that software engineers spent less time on breaks and networking at the second measurement point might indicate that they became more accustomed to their new WFH condition. Accordingly, professionals learned to spend their working time more efficiently. Similar conclusions are also supported by the literature (Ford et al. 2021; Russo et al. 2021b).

RQ2 : Is the distribution of daily working activities related to well-being, productivity, and other variables?

We did not find any significant relations between daily working activities and the well-being, productivity, or other investigated variables, with one exception: taking breaks was negatively associated with productivity in our exploratory study. This can be interpreted as a generally positive finding since it shows that software engineers’ well-being and productivity do not depend on the type of activity they are doing, at least concerning the 15 measured activities. The only significant relation was the negative correlation between breaks and productivity in wave 1. Despite being intuitive, we are very cautious about concluding that developers should take fewer breaks to be more productive, since this relation was not significant in wave 2 (although still negative). Further, prior work shows that breaks can increase well-being (Dababneh et al. 2001) and can improve the quality of professionals’ social networks (Waber et al. 2010). Similarly, correlation does not equate to causation: participants might have taken more breaks because they felt less productive for various reasons (e.g., more exhaustion, distractions at home).

Regarding the other activities, we do not find evidence that the time spent on an activity affects productivity or well-being. We did not register any significant effect of the amount of time dedicated to development activities on software engineers’ general well-being, stress, boredom, or distractions while working from home. Previous studies showed that, during the pandemic, having daily routines is essential to improving personal well-being (Russo et al. 2021b). However, routines do not seem to play a significant role when it comes to individual activities. As our findings show, possible distractions that might happen while working from home (e.g., children, noisy neighbors) do not influence the time spent on specific work activities.

The innate psychological needs of self-determination theory (Ryan and Deci 2000), that is, the needs for autonomy, competence, and relatedness, are associated with work motivation in general (Gagné and Deci 2005). To the best of our knowledge, our study is the first in our community to assess whether specific activities are correlated with autonomy, competence, and relatedness. Overall, we found that general psychological needs were unrelated to the amount of time developers spent on specific activities. In hindsight, this might be because the scale we used to measure the three dimensions of self-determination theory captures broad human needs in general (Ryan and Deci 2000) rather than needs experienced while working on specific activities. We addressed this limitation of the exploratory study in the confirmatory analysis.

While working remotely, the quality of communication between team members can be challenging, as face-to-face communication has to pass through a medium (e.g., MS Teams, Zoom). Therefore, not being directly connected to the organization can become a big issue for remote workers. For example, research suggests that lower support from coworkers and supervisors (McCalister et al. 2006), perceiving the values of one’s organization to be different from one’s own values (Edwards and Cable 2009), and unfair treatment and lack of appreciation (Bhui et al. 2016) put the mental health of remote workers at risk. Interestingly, our results suggest that the quality of communication does not relate to individual working activities. This might seem surprising at first glance, as it is plausible to assume that those who find the quality of communication to be poorer might engage less in activities that require more communication (e.g., meetings) and more in activities that require less direct communication (e.g., coding, bugfixing). This might suggest that developers are professional enough not to let their behavior be influenced by their perception of the quality of communication. In other words, the time spent by software engineers on each activity is not detrimental to the relations with their organization. Prior research has mostly ignored whether activity type plays a role in professionals’ psychological and social factors. Typically, scholars have only measured whether people are, for example, overall stressed, as opposed to stressed by specific activities (Bhui et al. 2016; Edwards and Cable 2009; McCalister et al. 2006). Our research suggests that the type of activity is not a confounding variable, which increases our trust in prior research that has typically looked at subjective work experience in general rather than at actual activities. So, our exploratory findings suggest that software engineers’ psychological and social factors do not depend on what work activity they are performing, but rather on how it is done.

RQ3 : Do the needs for autonomy, competence, and relatedness predict software engineers’ activity-specific satisfaction and productivity?

In the confirmatory study, we found, across all activities, that the needs for autonomy, competence, and relatedness were positively associated with activity satisfaction and productivity. Simultaneously, conflict and pressure were negatively associated with activity satisfaction but unrelated to productivity. These associations were mostly consistent across activities, although a few deviations occurred (Figs. 4 and 6). For example, relatedness predicted activity productivity for meetings and reviewing but not for coding, bugfixing, testing, and learning. One possibility is that meetings and reviewing are typically more social (i.e., done with other people), making relatedness more relevant. Overall, our results align with our first findings, even though previous research often used different measures of well-being and/or needs, a variety of statistical tests (e.g., zero-order correlations), and/or relied on different populations such as student samples. For example, a meta-analysis (Yu et al. 2018) found a correlation of r = .49, 95% CI [.39, .57], between need for autonomy and satisfaction with life, which is very much in line with our findings: a linear mixed-effects model with only autonomy as a predictor of activity satisfaction revealed β = .48 (Footnote 9).
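As an illustration of how such a standardized coefficient can be obtained, the following R sketch fits a single-predictor linear mixed-effects model with both variables standardized, so the fixed effect is on a standardized scale; the data frame d and its column names are placeholders.

library(lme4)

# Standardize predictor and outcome so the fixed effect is a standardized coefficient.
d$autonomy_z <- as.numeric(scale(d$autonomy))
d$sat_z      <- as.numeric(scale(d$activity_satisfaction))

# Random intercepts per participant account for repeated activity reports.
m <- lmer(sat_z ~ autonomy_z + (1 | id), data = d)
fixef(m)["autonomy_z"]  # standardized coefficient, comparable in size to a correlation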

This result is of great relevance to understanding developers’ satisfaction and productivity. To improve activity satisfaction and productivity, self-determination theory is a valuable lens. Indeed, more autonomous, competent, and related professionals show a higher degree of satisfaction and productivity. These findings are also valuable for employee recruitment and retention. Companies should keep this aspect in mind when organizing working activities. In particular, micro-management could be detrimental to software engineers’ satisfaction and productivity. In other words, it is advisable to discuss realistic working goals of software projects while leaving it to the teams to self-organize, as a recent investigation of effective Scrum teams highlighted (Verwijs and Russo 2023).

RQ4 : Are the associations between activity satisfaction and productivity moderated by resilience and company support?

None of the eight task-unrelated variables (e.g., resilience, work-life balance) moderated the link between the three needs and activity satisfaction or productivity. Initially, we hypothesized that, for example, resilience might buffer against reduced autonomy because resilient people are more likely to bounce back after stressful events, such as being less able to make autonomous decisions (Smith et al. 2008; Weinstein and Ryan 2011). However, generally measured variables (e.g., general work-life balance) are rarely associated with specific variables (Davidson and Jaccard 1979). The absence of moderation effects might thus be because we measured resilience and work-life balance in a way that is too broad. Future research could measure resilience in a more specific way (e.g., resilience during the day or activity-specific resilience), which would make it more relevant for activity-specific satisfaction, productivity, and basic needs. Alternatively, other personality traits might be more relevant. For example, proactive personality was found to mitigate or moderate the relation between stress and productivity (Hung et al. 2015; Onyemah 2008). Thus, lower levels of activity satisfaction might impact productivity more strongly for those who score low on proactive personality.

Additionally, caring leadership, work-life balance, empowerment, job enablement, soft company support, hard company support, and recognition were unrelated to activity-specific satisfaction and productivity. In hindsight, this is not surprising given that we measured all these variables at a general level. For example, if we had measured activity-specific work-life balance instead of general work-life balance, we would likely have found an effect on activity-specific satisfaction and productivity.

Overall, our results on this question are inconclusive. Although we found no support for moderation effects of resilience and company support, we acknowledge that this outcome might change with more specific measurements.

RQ5 : Do software engineers’ work activities while WFH during the pandemic affect their activity-specific well-being, productivity, and psychological needs?

We found that activity satisfaction was relatively lower when participants were bugfixing and higher when they were helping others. The latter finding is in line with previous research suggesting that helping others increases well-being (Buchanan and Bardi 2010). In contrast, levels of activity productivity were more consistent across activities, while activity satisfaction varied. Our findings on bugfixing have three main practical implications.

First, bugfixing might be viewed as an annoying but necessary activity by many developers: Compared to all activities, 80 participants reported a below-average level of satisfaction when bugfixing, whereas only 35 reported above-average satisfaction (cf. Fig. 7). Pointing out the meaningfulness of bugfixing is therefore essential. The literature supports that meaning is positively associated with satisfaction, autonomy, competence, and relatedness (Martela et al. 2018). Even though most developers are aware that bugfixing is essential, the occasional reminder or nudge can have an impact (Venema and van Gestel 2021). For example, while most people are aware that switching off the light when it is not needed is beneficial for the environment, reminders nevertheless increase the likelihood that the light gets switched off (Byerly et al. 2018). Occasional reminders or nudges are typically very inexpensive and are likely to be cost-effective. However, more research is needed to determine whether, in the context of bugfixing, nudges result in substantially increased satisfaction. Additionally, organizations should support a higher degree of socialization during bugfixing activities. Software engineers appear to be (contrary to stereotypes) social and caring individuals. Consequently, code review practices should be primarily supported by management.

Fig. 7 Differences between activities regarding activity satisfaction. Red lines represent 99.9%-CIs

Second, organizations should facilitate an inclusive working environment in which developers actively help each other across different activities that they can freely choose from. One concrete example might be to establish innersourcing projects (Stol and Fitzgerald 2014). These are similar to open source projects, except that they are closed projects in which only employees can participate. This practice would also support software professionals’ need for autonomy by letting them contribute to projects they find important and feel committed to.

Third, establishing mentorship programs can stimulate senior developers’ desire to help while increasing newcomers’ sense of relatedness. This aspect is even more critical in a WFH setting, where informal networking occasions are typically limited. At the same time, this will increase the onboarding success of new employees. Research has already shown that supporting newly hired employees through, for example, mentoring programs is an essential factor for onboarding success and, eventually, employee retention (Sharma and Stol 2020). Furthermore, an effective bug triaging process is considered pivotal for a software organization’s ability to address quality concerns (Anvik et al. 2006). Picking the right developer to work on a specific bug is crucial to fixing the bug in a timely manner and to reducing bug-tossing length (Yadav et al. 2019). Establishing an effective and transparent process is thus a way to give this activity meaning. Future research along the lines of RQ5 could also investigate whether an activity was self-chosen. If an activity is self-chosen, intrinsic motivation is usually higher, which is linked to higher job satisfaction and performance (Hayati and Caniago 2012).

5.3 Measuring Satisfaction and Productivity

Findings from both studies have not only practical but also methodological implications. The time developers spend on a specific activity was unrelated to their well-being, productivity, needs, or working conditions when the latter were measured in a general (i.e., activity-unrelated) way. Researchers or employers who wish to identify how to increase the satisfaction or productivity of a specific activity need to adapt their measures to become activity-specific. For example, increasing employees’ general resilience or work-life balance will have little impact on how satisfied and productive they are with a specific coding task. In contrast, enhancing autonomy for coding is likely more beneficial. However, it should be noted that we created the activity-related measures specifically for the confirmatory study. While the measures were mostly associated with each other in the expected directions, even when measured with a single item (e.g., conflict and pressure were negatively associated with activity satisfaction), future research could further improve our measures to increase their reliability.

However, this does not imply that general measures of personality and other constructs cannot predict activity-specific variables. Previous research established that, for example, personality variables predict related behavior averaged over a sample of occasions and situations much better than single observations (Epstein 1979; Skimina et al. 2019). For instance, whether someone looks at an exhibition of personal interest at a museum might only weakly be predicted by their need for autonomy. In contrast, an average across multiple instances of autonomy-related behaviors in various situations (e.g., living in a self-chosen city, working in an area that matches personal interests, or listening to the music one likes most as opposed to that preferred by friends, partner, or family) will likely be more strongly associated with the need for autonomy. This is because general measures are broad and trans-situational by definition. For instance, resilience is important in many aspects of a software developer’s life, not only while coding on a specific day. A specific activity, in turn, can also be influenced by many situational variables (e.g., distractions at home, a particular project, working with competent colleagues) that diminish the impact of personality. If researchers are interested in testing whether, for example, resilience predicts activity satisfaction, they might want to measure activity satisfaction across multiple activities (e.g., coding, bugfixing) and/or multiple time points (van Berkel et al. 2019).

Further, our findings cautiously suggest that WFH might be more beneficial for both developers and organizations than working in the office, at least for some groups of professionals (Ford et al. 2019). However, while some studies support our conclusion that WFH increases or does not impact productivity (Bao et al. 2020; Barrero et al. 2021; Deole et al. 2021; Russo et al. 2021b), other studies found that WFH has a negative impact on productivity (Gibbs et al. 2021; Kitagawa et al. 2021; Morikawa 2020). As there are too many potential differences between these studies (e.g., cultural factors, working conditions at home, type of work, measurement of productivity), cross-country and cross-profession studies are needed. Large-sample studies or meta-analyses that synthesize these findings would help to better understand the conflicting evidence on the direction in which WFH shifts productivity during the Covid-19 pandemic. Thus, there is a need for more research to identify factors that help us understand how WFH can be beneficial and whether these factors are transferable to working on-site. Our confirmatory study offers an intriguing explanation for the contradictory studies: the type of activity matters. Certain activities might be less feasible when WFH, which reduces productivity, whereas working on other activities might be easier when WFH and thus increase productivity, similar to what we predict in Fig. 8.

Fig. 8 Revised Theoretical Framework. We found that the strength of the association between Self-Determination Theory needs and the two dependent variables depends on the type of activity performed by developers

5.4 Threats to Validity

To conclude this section, we briefly address the most relevant limitations.

Reliability

We investigated our subject matter using a longitudinal exploratory design combined with a confirmatory cross-sectional one. Participants were identified using a multi-stage selection process to ensure (i) that they were professionally active software engineers, (ii) data quality, and (iii) that they were working from home during the lockdown. Validated scales were used when available or were adapted from previous investigations. In line with most related research, we did not aim to control for response biases because doing so usually has little impact on reliability: some approaches to controlling for response bias improve reliability slightly, but they can also reduce reliability or leave it unchanged (He et al. 2017). Overall, we report high test-retest reliability in the longitudinal study and adequate internal consistencies for all measures.
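As a sketch of the two reliability checks mentioned above, the following R snippet computes a test-retest correlation between waves and Cronbach's alpha for one scale; the data frames (wave1, wave2) and item names are placeholders rather than our exact variable names.

library(psych)

# Test-retest reliability: correlation of the same measure across the two waves.
cor(wave1$wellbeing, wave2$wellbeing, use = "pairwise.complete.obs")

# Internal consistency: Cronbach's alpha for the items of one scale in one wave.
alpha(wave1[, c("wellbeing_item1", "wellbeing_item2", "wellbeing_item3")])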

Construct Validity

To enhance cross-study comparability, we used the taxonomy by Meyer et al. (2019) to define the daily activities of software developers. Similarly, we used their benchmarks to compare a typical office day with the working-from-home setting. However, we did not objectively monitor developers’ effectiveness in executing each activity while working remotely. We opted for this approach to remain consistent with Meyer et al. and because we collected data from a global sample of software professionals working in 190+ different organizations, making the development of objectively comparable measurements nearly impossible. Still, we report some differences from the data collected by Meyer et al., although the differences amount to only a few percentage points.

Conclusion Validity

Our conclusions rely on multiple statistical analyses, such as one-sample t-tests, paired t-tests, Pearson’s correlations, multiple regressions, and linear mixed-effects models. Furthermore, we also ran non-parametric Spearman’s rank correlations to check the consistency of our conclusions, since not all distributions were perfectly normal. To support Open Science, we make reproducible R code alongside our raw data openly available on Zenodo.

Internal Validity

We used self-reported measures for well-being, productivity, and other psychological and social variables, which might be considered a limitation. The data for the exploratory study was collected towards the end of the first lockdown in spring 2020 using a longitudinal design. We expanded our initial data collection one year later, in spring 2021, with a cross-sectional, confirmatory study. This enabled our participants to report a more mature and stable assessment of the new working setting. For the exploratory investigation, we only considered countries with comparable lockdown measures (e.g., we excluded, among others, Denmark, Germany, and Sweden, as these countries did not face a total lockdown or had different measures in place across regions). To this end, we asked participants in both waves about lockdown conditions in their home country and whether they were still working from home. Moreover, the exploratory longitudinal study was performed within a relatively short time frame (around two weeks) due to the ever-changing public health policies in the first months of the pandemic. We do not deem this a significant limitation, since the main goal of this first study was to identify relevant tendencies to follow up on in the confirmatory study. Since all selected informants faced comparable conditions, we did not exclude any of the 192 selected software professionals. For the confirmatory study, we surveyed 300 developers working from home. Since lockdown measures in spring 2021 were comparable across all countries, we did not exclude any country a priori.

External Validity

We designed this study to maximize internal validity. Therefore, we determined our sample size with an a priori power analysis. Thus, we did not work with a representative sample of the software engineering population in mind (unlike Russo and Stol (2022), whose research goal was to generalize results by surveying over 400 software engineers). However, we recognize that we administered our surveys in the middle of a very peculiar period. This makes it unclear whether we can generalize our findings to non-pandemic working-from-home settings. Nonetheless, fast and reliable evidence regarding the ongoing COVID-19 crisis is needed to improve the quality of developers’ daily lives. This study will also enable a better-informed research design for future remote working studies once the pandemic is over. Finally, our sample is almost entirely composed of developers from Western countries. Consequently, the investigated effects could differ in other regions of the world (e.g., Africa or Asia).
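For illustration, an a priori power analysis of the kind mentioned above can be run in R with the pwr package; the effect size, power, and alpha below are placeholder values, not the exact parameters used to plan our studies.

library(pwr)

# Required sample size to detect a correlation of r = .20
# with 80% power at a two-sided alpha of .05 (illustrative values only).
pwr.r.test(r = 0.20, power = 0.80, sig.level = 0.05)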

6 Conclusion

This research focused on software engineers’ activity satisfaction and performance during the COVID-19 pandemic. For the sake of clarity, we did not provide any consideration of the future of work after the COVID-19 lockdowns, as done by, for example, Smite et al. (2023). To address our research questions, we first employed an exploratory longitudinal study design across two waves, followed by a confirmatory cross-sectional study. We found that developers still spend proportionally the same amount of time on their different daily activities. For example, the software engineers in our sample still spent most of their working time on coding, bugfixing, meetings, testing, and e-mails, as previously reported by Meyer et al. (2019). Nevertheless, we found some significant mean differences. Our participants reported having spent less time in meetings and on breaks, suggesting that both were less common, possibly due to developers’ adaptation to working remotely. Similarly, no significant relations were found between working activities and productivity, well-being, or relevant social and psychological variables. In our confirmatory cross-sectional study, we found that the activity-specific needs for autonomy, competence, and relatedness are associated with activity-specific satisfaction and productivity. Furthermore, activity satisfaction was relatively lower when participants were bugfixing and higher when helping others. At the same time, autonomy was perceived as relatively lower while professionals were in meetings or writing e-mails.

Overall, our research suggests that WFH does not per se affect how much time developers spend on various activities. Nevertheless, software engineers are social beings, and their satisfaction increases when they can help others. This paper also provides a number of recommendations for organizations to support their employees’ well-being and productivity. In particular, active company policies that support developers’ needs for autonomy, relatedness, and competence appear to be particularly effective in a WFH context. Also, bugfixing appears to be the activity most detrimental to professionals’ satisfaction. Accordingly, specific processes should be designed for software engineers working from home (e.g., bug triaging and mentorship programs).

As a deductive investigation with a quantitative stance, we can only assess the relations between the independent variables and our two dependent variables. Thus, we miss a number of nuances about the interactions of our variables that should be investigated further. Additionally, future research should aim to provide more tailored recommendations based on developers’ personalities. This would result in a more nuanced understanding of the subject matter. Also, a better understanding of software professionals’ activity satisfaction and productivity is needed to develop reliable measurement instruments and to develop or refine theories.