Satisfaction and performance of software developers during enforced work from home in the COVID-19 pandemic

Following the onset of the COVID-19 pandemic and subsequent lockdowns, the daily lives of software engineers were heavily disrupted as they were abruptly forced to work remotely from home. To better understand and contrast typical working days in this new reality with work in pre-pandemic times, we conducted one exploratory (N = 192) and one confirmatory study (N = 290) with software engineers recruited remotely. Specifically, we build on self-determination theory to evaluate whether and how specific activities are associated with software engineers’ satisfaction and productivity. To explore the subject domain, we first ran a two-wave longitudinal study. We found that the time software engineers spent on specific activities (e.g., coding, bugfixing, helping others) while working from home was similar to pre-pandemic times. Also, the amount of time developers spent on each activity was unrelated to their general well-being, perceived productivity, and other variables such as basic needs. Our confirmatory study found that activity-specific variables (e.g., how much autonomy software engineers had during coding) do predict activity satisfaction and productivity, whereas activity-independent variables such as general resilience or a good work-life balance do not. Interestingly, we found that satisfaction and autonomy were significantly higher when software engineers were helping others and lower when they were bugfixing. Finally, we discuss implications for software engineers, management, and researchers. In particular, active company policies to support developers’ need for autonomy, relatedness, and competence appear especially effective in a WFH context.


Introduction
The COVID-19 pandemic has abruptly and unprecedentedly disrupted software developers' working routines. On short notice, many software developers were asked to switch from their typical office-based working habits to a new working from home (WFH) setting. This change in work setting has had a considerable negative impact on developers' well-being and productivity [107], as the pandemic and subsequent restrictions (e.g., lockdowns) curtailed their basic needs, such as the need for autonomy, competence, or relatedness [22]. Nevertheless, longitudinal research has also shown that software engineers can successfully adapt over time, suggesting that their well-being and productivity bounce back to pre-pandemic levels [46,47,7,114,126]. This is encouraging, as 89% of professionals would like to work from home at least one day per month after the pandemic [140]. For this reason, major IT companies (e.g., Twitter, Microsoft, AirBnB, Uber, Facebook) informed their employees that they could work from home indefinitely (e.g., Twitter) or extended their remote work policies with specific support (e.g., AirBnB) [56]. Thus, research conducted during the pandemic will likely also be of value once current restrictions have been lifted.
Remote work by software professionals is not a new topic in software engineering. In 1983, Olson defined remote work as "organizational work that is performed outside of the usual organizational confines of space and time" [97]. This definition implies that professionals have a high degree of freedom with regard to scheduling their working hours, their activities, and the location from which they work. With the rise of the internet in the late 1990s, scholars started researching the challenges and opportunities of remote work from home [104]. In these cases, professionals usually have a high degree of autonomy in terms of time but not in terms of space, since they have chosen their homes as primary working spaces. Generally speaking, researchers investigated specific software development practices, such as processes [55,37] or communication [66], to better tailor remote work practices to business needs. Similarly, collaboration and characteristics of remote and asynchronous projects have been extensively studied by the Global Software Engineering community [64,127]. Such studies typically focus on the interaction among software development teams located in different geographical areas and working together on distributed projects. There is a growing agreement within the practitioners' community that working from home is different from working remotely on distributed projects [5]. While working from home is understood as working from the primary address of residence, such as an apartment or house, working remotely typically takes place in co-working spaces or in other settings away from where one lives. The pandemic made many of us realize that some of the fears often associated with remote work (such as decreasing productivity) are often unfounded. Hence, anecdotal evidence driving top managerial decisions due to the lack of specific research [90] should be supplemented with scholarly evidence.
So far, the authors of this paper have worked on a comprehensive research agenda to understand the effects of the COVID lockdown on software engineers. We started by looking at self-perceived well-being and productivity in the early months of the pandemic [114]. Afterward, we tracked what a typical work day looks like, as well as the distribution of work activities compared to pre-COVID times [112]. Eventually, we performed a two-year longitudinal study with six waves to assess the effects of the entire pandemic on software developers [113]. Additionally, the first author investigated software process-related changes while working from home [31].
This paper studies whether professionals' needs influence the time they spend on various activities. In their seminal paper, Ryan and Deci [117] introduce self-determination theory, which describes the three innate psychological needs that motivate us and guide our behavior: the needs for autonomy, competence, and relatedness. The need for autonomy concerns whether people feel independent; the need for competence, whether people can complete various (challenging) activities; and the need for relatedness, whether people feel appreciated by others close to them. Self-determination theory has frequently been used in the work context to predict job satisfaction and performance [48]. For example, research established that all self-determination theory-related needs (need for autonomy, relatedness, and competence) positively correlate with job satisfaction and productivity [15]. By building on self-determination theory, we study how software engineers' activities changed during the pandemic using the activity taxonomy of Meyer et al. [91].
In line with other researchers who started to look at productivity of software engineers in a more holistic way [118], we are particularly interested in understanding whether specific activities contribute to their well-being and productivity in general and which factors contribute to their satisfaction and productivity while working on a particular task. For example, meetings can be resource-draining and be felt as burdensome by employees [2]. Furthermore, we also take social relations as an indicator of need for relatedness into account: People who feel that communication with their colleagues and line managers is of importance might be more inclined to spend time in meetings, helping others, and other social activities and report higher well-being because their need for relatedness is then more likely to be satisfied. Prior research which investigated predictors of well-being and stress in occupational settings [12,40,88] has not measured the specific activities that might have contributed to higher stress and lower levels of well-being. However, the type of activity someone is doing might contribute to higher stress levels beyond other factors identified by previous research, such as support by coworkers and supervisors [25]. If we were to determine which specific activities are associated with higher or lower levels of stress or well-being, this would provide valuable information for future research investigating predictors of stress. We divided this study into an exploratory and a confirmatory part to investigate all these aspects. Both studies build on self-determination theory [117].
In the exploratory investigation, we first measured developers' activities and self-reported well-being and productivity to assess changes throughout the lockdown over a two-week period. We compared wave 1 with wave 2 to assess our test-retest reliability and the stability of the data captured. In particular, we found that the time software engineers spent doing specific activities from home was comparable to the time spent on them in the pre-pandemic office. Nevertheless, we also reported significant mean differences, such as less time dedicated to meetings and breaks and more time spent on specification and documentation. Interestingly, the time people spent on each activity was unrelated to their general well-being, perceived productivity, and other variables. In hindsight, this is not surprising because many factors affect our well-being and productivity. For example, well-being is impacted by a range of factors such as the quality of our relationships, personality, or situational factors (e.g., weather) [29,38,114], which makes it unlikely that spending an hour more or less on a specific activity will significantly impact well-being. However, we believe that activity-specific features are more likely to impact well-being and productivity, which is one of the primary motivations of the confirmatory study (i.e., what factors predict activity-specific well-being and productivity?).
In the confirmatory study, we measured activity-specific well-being and productivity, as well as the activity-specific need for autonomy, competence, and relatedness (e.g., how productive professionals felt during the activity they spent the most time on that day). Additionally, we explored whether task-unrelated variables such as resilience or work-life balance act as moderators between activity-specific needs and activity-specific well-being and productivity (see below for a more detailed rationale). Our findings confirm the long-standing intuition that software engineers feel more autonomous while coding than while in meetings or writing emails. Also, software engineers experience less satisfaction with bugfixing, but helping others is a satisfaction booster. We further characterized which activities resulted in higher feelings of satisfaction, productivity, autonomy, competence, and connectedness. Moreover, combining the exploratory and confirmatory studies yields a methodological lesson: asking only whether overall well-being and productivity are associated with time spent on specific activities misses the impact different activities can have on people's well-being and productivity. Measuring activity-specific well-being and productivity levels overcomes this limitation.
In the remainder of this paper, we describe the related work in Section 2, followed by a description of our research design in Section 3. The analysis and its results are described in Section 4. Implications and recommendations for software engineers and organizations are outlined in Section 5. Finally, we conclude this paper by presenting future research directions in Section 6.

Related Work
Research on behavioral and emotional aspects within the software engineering community is a relatively new but rising research topic [119]. Developers' behaviors and emotional states do play a substantial role in how they are going to perform their working activities [54]. For this reason, the community started to focus specifically on software engineers' behaviors [80], emotions [53], or personality traits [30,116].
Concerning the pandemic, there is widespread agreement that lockdowns have a negative influence on well-being [16,82]. Living in a lockdown during a pandemic has been linked to increased levels of anger, depression, emotional exhaustion, fear of infecting others or becoming infected, insomnia, irritability, loneliness, low mood, and post-traumatic stress disorders [129,60,79,85,109,6,132]. Furthermore, anxieties of infection [73,105], a lack of supplies or not being treated [143], and false or conflicting information [21] can all cause substantial stress and give rise to new approaches to regulate our emotions [132]. Furthermore, the psychological impacts of being quarantined may take years to manifest [16].
Pre-COVID research, on the other hand, indicates that remote working is associated with improved work-life balance, creativity, productivity, reduced stress, and low carbon emissions due to the absence of commuting [99,3,13,135,9,24]. However, there are several apparent downsides to remote work, like decreased teamwork and communication, loneliness, the sensation of always being 'online,' decreased motivation, and distractions at home [18,147]. Aside from such factors, estimates indicate that remote work will grow significantly in the next years [99,49].
In the software engineering domain, several large software companies, such as Stack Overflow or Red Hat, had embraced working from home by designing ad hoc schemes even before the start of the COVID-19 pandemic in 2020 [87,108]. Organizations do so to increase their employees' job satisfaction and productivity while simultaneously reducing their operating expenses, such as office rent [44,103]. Several aspects of remote and distributed working have been (indirectly) investigated by the Global Software Engineering community well before the pandemic (e.g., [127,65,110]). To better frame this study theoretically, we looked into peer-reviewed publications in Scopus which explicitly focused on working from home (i.e., not remote and distributed work). We made this choice to narrow down the subject matter and consider only articles whose primary focus is working from home. We identified thirteen relevant papers in total. Considering the vast but recent impact of COVID-19, we also selected non-peer-reviewed pre-prints on arXiv. Table 1 summarizes prior studies of remote working issues related to software engineers.

| Study | Research approach | Main findings |
| --- | --- | --- |
| [113] | Sample study. 14-month, 4-wave longitudinal study with 15 variables associated with developers' well-being and productivity. | Well-being increased over time during the lockdown and productivity remained stable. |
| | | No significant change in productivity, but WFH impacts some more than others. |
| Cucolas & Russo (2021) [31] | Multi-methods study. Qualitative interviews and sample study of Scrum developers. After a theoretical model was induced from qualitative data, a sample study of 200 software engineers validated it with PLS-SEM. | Home-working environment is the most important variable for project success, and to improve WFH conditions, organizations should strengthen the need for developers' autonomy, competence, and relatedness. |
| Miller et al. (2021) [93] | Field study. Mixed-methods investigation of Microsoft developers. Two surveys collected information about working from home and team-related issues. Data were analyzed using different quantitative and qualitative techniques. | Communication and interaction with colleagues is a relevant predictor of developers' satisfaction and team productivity. |
| Butler & Jaffe (2021) [19] | Field study. Diary study of 435 Microsoft developers over 10 weeks during the lockdown. Data were analyzed using different quantitative and qualitative techniques. | The largest identified challenges were meetings, overwork, and physical and mental health. On the other hand, participants appreciated having more family time and work flexibility. |
| Machado et al. (2021) [83] | Sample study. Mixed-methods investigation of 233 Brazilian software professionals. Data were analyzed using different quantitative and qualitative techniques. | The pandemic affected men and women differently. Organizations should accommodate women first when scheduling meetings. Organizing uninterrupted work sessions and supporting childcare are also recommended. |
| Ford et al. (2020) [46] | Field study. Mixed-methods investigation of 3,634 Microsoft developers. Two surveys collected qualitative and quantitative insights about WFH conditions during the COVID-19 lockdown. | Quality of family life and time improved; although WFH might have led to a lack of focus, poor work-life boundaries, communications, and sync issues, developers adapt over time. |
| Ralph et al. (2020) [107] | Sample study. Large-scale cross-sectional study of 2,225 software developers globally working from home during the COVID-19 lockdown, surveying five variables. Data were analyzed using covariance-based structural equation modeling. | Confirmation of a theoretical model. Professionals' well-being and productivity are suffering; well-being and productivity are strongly related to each other; women are disproportionately affected by this peculiar remote working setting. |
| Russo et al. (2020) [114] | Sample study. Longitudinal study involving 192 software engineers living in countries with comparable COVID-19 lockdown measures, surveying 51 variables. Data were analyzed using correlations, multiple linear regressions, and covariance-based structural equation modeling to assess predictive causal relations. | Well-being and productivity are related; professionals adapt to the condition over time, improving their well-being and productivity; introverts are disproportionately affected by the lockdown; no predictor variable was significantly able to causally explain the variance in well-being and productivity. |
| Ford et al. (2019) [45] | Field study. Qualitative study interviewing three transgender software engineers to explore the interplay of gender identity and remote work. | Working from home enables the empowerment and identity disclosure of software professionals from marginalized communities. |
| James & Griffiths (2014) [69] | Experimental simulation. Within an existing project, relevant working from home problems have been identified and addressed by developing and validating a specific solution. | Development of a mobile execution environment to support a secure and portable working from home setting. |
| Guo (2001) [55] | Field study. Report of two qualitative surveys regarding software process improvement related to the distinctive characteristics of teleworking. | Development of the Software Process Improvement approach for Teleworking Environment (SPITE) model. Identification of 25 base practices to improve software processes when working from home. |
| Higa et al. (2000) [66] | Field study. Mixed-methods study at Fujitsu with 44 software engineers to investigate how the use of e-mail influences telework. To test the hypotheses, three hierarchical regression models were used. | An effective use of e-mail by remote workers leads to better work distribution and work productivity. |
| Pounder (1998) [104] | Formal theory. Essay about security problems linked to telework. | This is the first paper that considers "homeworking" as a distinct working setting. It discusses the main security concerns and makes recommendations for organizations. |

Most papers which focused on WFH were published in or after 2019 and are related to the COVID-19 pandemic. From a methodological perspective, most studies have been field studies involving a single company (i.e., Fujitsu [66], Baidu [7], and Microsoft [46,93,19,147]). Such real-world investigations aimed to understand the research phenomena by generating research hypotheses. At the opposite end of the spectrum, six studies were conducted in a neutral setting by asking participants for quantifiable judgments and analyzing these data with statistical techniques. These sample studies generalize their results to the entire software engineering population [107,114,83,31,113,126].
Content-wise, half of the papers are concerned with specific topics related to working from home, such as security [104,69], process [55], work productivity [66,76], and inclusion [45]. The other half mostly investigated well-being and productivity while working from home during the pandemic [46,107,114,19,83,113,126] or productivity related to projects' characteristics [7,31].
Overall, the investigated topic is not new to the community. However, from this short review, we noticed how scholars focused in particular on WFH topics due to the COVID-19 pandemic and the subsequent lockdown. Indeed, future work is needed to support developers working in a lockdown environment or in a reality where pandemic waves are part of our everyday lives. Alternatively, and more optimistically, software organizations will adopt hybrid work on a wide scale. Therefore, we believe that this subject matter is of utmost importance for software professionals' well-being and productivity in the years to come. This is also important because past research has shown that there are some mean differences between software engineers and the general population [115]. In other words, we cannot assume that findings from other population types (e.g., employees at Microsoft, general population) generalize to software engineers.

Research Design
Our design was guided by the relevant ACM SIGSOFT Empirical Standards for longitudinal and sample studies [106]. First, we applied an exploratory longitudinal design already described in Russo et al. [112]. Subsequently, to overcome the methodological limitations of the exploratory study while gaining further insights into the associations of activities with activity-specific satisfaction, productivity, and basic needs, we employed a cross-sectional design.
We formulate the following five main research questions, which were guided by previous research and by self-determination theory [117]:
Research Question 1: Has the distribution of daily working activities of software engineers changed while WFH during the pandemic as compared to pre-pandemic daily working activities?
Research Question 2: Is the distribution of daily working activities related to well-being, productivity, and other variables?
Research Question 3: To what extent does Self-Determination Theory (i.e., the needs for autonomy, competence, and relatedness) predict software engineers' activity-specific satisfaction and productivity during the COVID-19 pandemic?
Research Question 4: To what extent are the associations of the needs for autonomy, competence, and relatedness with activity satisfaction and productivity moderated by resilience and company support during the COVID-19 pandemic?
Research Question 5: Do software engineers' work activities while WFH during the pandemic affect their activity-specific well-being, productivity, and psychological needs?
We designed the exploratory study to answer RQ1 and RQ2, whereas the confirmatory research was designed to answer RQ3 to RQ5.
Our first concern was to carefully recruit software professionals for our exploratory study. To do so, we used a multistage selection process, detailed in Section 3.2. We asked them to complete the same survey on two occasions. Unique randomized IDs were assigned to participants to preserve their anonymity and match their responses from both waves. To address concerns about replicability and increase the reliability of our findings, we asked the same participants to complete all measures twice, two weeks apart. This allowed us to test whether the distribution of daily working activities changed between the two waves. At the same time, we asked participants to report how much time they spent on 15 activities and compared the responses with a pre-pandemic sample [91], which allowed us to test whether the distribution has changed since the onset of the first lockdown in 2020. To test RQ2 (is the time spent on different activities correlated with well-being, productivity, and other variables?), we correlated the time spent on each activity with professionals' general well-being, productivity, and other variables.
In a subsequent confirmatory study, we asked participants about their well-being, productivity, autonomy, competence, and relatedness to their co-workers while completing specific activities (e.g., "how stressed were you while coding?"). Specifically, to test RQ3 (whether the needs for autonomy, competence, and relatedness predict software engineers' activity-specific satisfaction and productivity), we asked how satisfied, productive, autonomous, competent, and related to their co-workers participants felt while working on a specific activity (e.g., coding). Our design allowed us to test RQ3 across all activities but also separately for each activity.
Additionally, to investigate RQ4 (whether the associations of autonomy, competence, and relatedness with activity satisfaction and productivity are moderated by resilience and company support), we also included a range of conceptually related variables that measure facets of company support: caring leadership, work-life balance, empowerment, job enablement, soft company support, hard company support, and recognition. We expect that software engineers who are more resilient and receive higher company support are less likely to be affected by, for example, reduced autonomy for a specific task. For instance, resilience or recognition might buffer against reduced autonomy because resilient people are more likely to bounce back after stressful events such as being less able to make autonomous decisions [128,141]. Further, software engineers who experience low autonomy, competence, or relatedness during their work will only experience lower satisfaction and be less productive if their company does not provide adequate support that helps to buffer against the negative impact. In other words, we expect the effect of the three needs on activity satisfaction and productivity to be reduced if resilience and company support are high.
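The buffering expectation above can be written as a standard moderated regression. The following is only an illustrative sketch: the symbols (Y for activity-specific satisfaction or productivity, N for one of the three needs, M for a moderator such as resilience or a facet of company support) are notation assumed here, and the exact model specification used in the analysis may differ.

```latex
% Illustrative moderation model (assumed notation, not the paper's own):
% Y = activity-specific outcome, N = need satisfaction, M = moderator.
\begin{align}
  Y &= b_0 + b_1 N + b_2 M + b_3 (N \times M) + \varepsilon \\
  \frac{\partial Y}{\partial N} &= b_1 + b_3 M
\end{align}
% A buffering effect corresponds to b_1 > 0 and b_3 < 0:
% the slope of Y on N shrinks as the moderator M increases.
```

Under this reading, a moderator that buffers against low need satisfaction shows up as a negative interaction coefficient, so that the association between the need and the outcome weakens at high levels of resilience or company support.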
Finally, to test RQ5 (does the activity impact activity-specific satisfaction, productivity, and psychological needs?), we tested during which activity professionals felt relatively more or less satisfied, productive, and so on.

Theoretical Framework
We are performing this investigation using the Self-Determination Theory (SDT) framework. In particular, this theoretical framework has been used to design organizational policies to improve both well-being and high-quality performance [48]. SDT is a macro theory of human motivation that focuses, among others, on the motivations in the workplace [117].
The general idea of SDT is that if the three basic needs for competence, autonomy, and relatedness are satisfied, they lead to an increase in professionals' intrinsic motivation, productivity, and well-being. Indeed, employees' well-being is not only an ethical concern for every business but also a pivotal aspect to enhancing organizational sustainability, which is directly related to customers' satisfaction and financial success [84]. As a macro theory, it includes several factors that lead to employees' well-being, such as the three basic needs.
The motivation related to specific job activities influences employees' productivity and well-being. Specifically, according to Deci et al., it mediates workplace-specific context such as developers' activity with performance and wellness [35], as depicted in Figure 1. In other words, the three basic needs of SDT applied to developers' activity should be positively associated with well-being and productivity.

Participants
For the exploratory study, a power analysis using G*Power [43] version 3.1 revealed that to detect a small-to-medium effect size of r = .20, using a power of 1 − β = .80 (for a two-sided test), a sample size of at least 190 participants is required¹. We assumed an effect size of r = .20 because this is close to the medium effect size in individual difference research [52] from which many of our variables stem (e.g., SDT). We used a power of .80 because it is conventional to keep the false-negative rate (i.e., the β-error) to 1 - .80 = .20 or lower [27]. If we had assumed a larger effect size, fewer participants would have been needed to detect such a larger effect with a power of .80.

¹ With r, we mean Pearson's r, which is a measurement of linear association between two variables; its values range between -1 (perfect negative linear association) and +1 (perfect positive linear association). Values around 0 suggest that there is no linear association between two variables. Statistical power is the probability of detecting an effect of at least a given size if it exists (here: .80).

Fig. 1 Theoretical Framework of Self-Determination Theory (SDT) in the workplace adapted from Deci et al. [35], where software engineering activities are the workplace-related independent variables, and SDT the mediating variable.
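The sample-size reasoning for the exploratory study can be approximated with a few lines of code. The sketch below uses the Fisher z approximation for a two-sided test of Pearson's r, which is not the exact routine G*Power applies, and it assumes a significance level of α = .05 (the text does not state α). The function name `n_for_correlation` is hypothetical. Under these assumptions the result lands close to the reported minimum of 190, and a larger assumed effect size yields a smaller required sample, as the paragraph above notes.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect Pearson's r with a two-sided test.

    Uses the Fisher z approximation; alpha = .05 is an assumption,
    since the source only states r = .20 and power = .80.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


n_small_medium = n_for_correlation(0.20)  # effect size assumed in the study
n_larger = n_for_correlation(0.30)        # a larger effect needs fewer people
print(n_small_medium, n_larger)
```

Small differences from the G*Power figure are expected, because the approximation and the exact computation handle the sampling distribution of r differently.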
Participants were selected from a broader set of 500 software engineers who were carefully selected through a multistage process in a previous study by Russo & Stol [116]. To select this initial pool of participants, we applied a three-level screening process. First, we pre-screened the participants on the Prolific platform. The initial pre-screening criteria were knowledge of software development techniques, programming for a living, using technology at work, and having an approval rate of 100% in previous studies. This left us with 2,897 candidates. Then, we performed a competence screening. With the help of a questionnaire, we assessed in a time-boxed fashion the candidates' knowledge with one question about software design and two about programming. After this phase, 514 candidates were included in our sample. Finally, we focused on the candidates' attention with a quality screening, where we excluded informants who had a suspicious response pattern or did not pass the attention checks of a 10-minute questionnaire about personality traits. The final set contained 483 fully screened software engineers.
For this study, we only selected professionals (from the Russo & Stol pool) who were working from home during the pandemic and lived in countries with comparable lockdown measures. We used the following criteria: the country had to be in an official lockdown, and those measures had to be rather homogeneous across the country. For example, countries such as Sweden with rather liberal lockdowns were excluded. Similarly, Germany was excluded because individual regions decided when the lockdown had to be applied². Finally, we obtained a sample of 192 software engineers who completed the first survey (M_age = 36.65 years, SD = 10.77, range = 19-63; 154 men, 38 women). Of those, 184 participated in the second wave two weeks later. We provide demographic information on participants' gender, age, and location in Table 3. We collected our data between 26 and 30 April 2020 (wave 1) and between 10 and 13 May 2020 (wave 2).
To identify participants for the confirmatory study, we also first ran a power analysis, which revealed that a sample size of 77 is sufficient to detect a medium effect size with three predictors (i.e., need for autonomy, competence, and relatedness) with a power of .80. However, to keep the length of the survey manageable, participants only selected three activities they performed during the day. They completed a series of questions that expressly referred to each of the three activities. We therefore aimed to recruit around 300 participants, to obtain the required sample size of at least 77 participants for multiple activities. To ensure that the participants were software engineers, we ran a pilot study to screen our informants with questions developed by Danilova et al. [33]. The survey design is comparable with the previous exploratory one. The pre-screening followed the same criteria. What was different is the competence screening, where we asked specific questions developed and validated by Danilova et al. [33]. To ensure high data quality, we recruited participants from the academic data collection platform Prolific Academic and compensated participants above the US minimum wage [100,111]. The survey was run using Qualtrics.

Measurements for the exploratory longitudinal study
For the exploratory study, we derived the variables from a related project. For a complete presentation of the instruments used, we refer directly to Russo et al. [114] and the Supplementary Materials. Most of the scales described below have been cited between hundreds and tens of thousands of times and have been used across a wide range of contexts (e.g., organizational, clinical). The longitudinal design also allowed us to compute test-retest reliabilities, r_it (i.e., the stability of responses across two or more time points), by correlating responses given by participants at time 1 with those at time 2 (we use time and wave interchangeably), which provides additional information about a scale's reliability beyond the commonly used Cronbach's alpha [89]. Test-retest reliabilities close to 0 are undesirable since they indicate a low association between the two time points, suggesting, among other things, poor data quality. Cronbach's alpha is a measure of scale reliability. For exploratory research using new measurement scales, values above .60 are desirable, while for confirmatory research the threshold is above .70 (and below .95) [57].
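To make the two reliability indices concrete, the following minimal sketch computes Cronbach's alpha from item-level responses and a test-retest correlation from scale scores at two waves. It uses the textbook formulas; the function names and the small data set are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of per-item score lists, one score per respondent,
    all lists of equal length (hypothetical input layout).
    """
    k = len(items)
    n = len(items[0])
    # Total scale score of each respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson correlation, used here as the test-retest reliability."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: three items answered by five respondents.
wave1_items = [[5, 6, 3, 7, 4], [4, 6, 2, 7, 5], [5, 7, 3, 6, 4]]
alpha_wave1 = cronbach_alpha(wave1_items)

# Hypothetical mean scale scores of the same respondents at both waves.
scores_wave1 = [4.7, 6.3, 2.7, 6.7, 4.3]
scores_wave2 = [4.3, 6.0, 3.3, 6.3, 4.7]
test_retest = pearson_r(scores_wave1, scores_wave2)

print(round(alpha_wave1, 2), round(test_retest, 2))
```

Alpha is computed within one wave, whereas the test-retest correlation needs the same respondents at both waves, which is why only the longitudinal design provides it.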
Activities. We measured the same 15 activities that were measured by Meyer et al. [91]. We did this because we believe they cover most activities and because doing so provides a pre-pandemic comparison group. We asked participants, "During the past week, how much time did you spend on each task percentage-wise (%)?" This was followed by the 15 activities, each rated on a 101-point slider scale ranging from 0% to 100%. For the activities that might have been more ambiguous, a brief explanation was added in brackets, such as 'Helping (helping, managing or mentoring people)' and 'Networking (maintaining relationships).' The 15 activities are coding, bugfixing, meetings, testing, email, breaks, code review, specification, learning, helping, administration, interruptions, documentation, various (i.e., other activities not listed above), and networking.
Well-being. We used the Satisfaction with Life Scale [39] because it is one of the most validated scales and because it shows good convergent and discriminant validity [102]. Example items include "The conditions of my life in the past week were excellent" and "I was satisfied with my life in the past week". Responses were given on a 7-point scale ranging from 1 (strongly disagree) to 7 (strongly agree). Cronbach's alpha values, which measure internal consistency, were α_time1 = .90 and α_time2 = .90 (r_it = .72, p < .001).
Productivity. Measuring productivity in software engineering is a highly debated issue. Some scholars, for example, suggest making the measurement more objective by using function points [139]. Ko has criticized this viewpoint as being detrimental in the long run [75]. On the other hand, other researchers propose a self-reflection measure, with developers self-reporting their daily productivity [92]. In this work, we adopted a similar approach. We did not use a standard measure (e.g., as Ralph et al. [107] did). Instead, we operationalized productivity as a function of time spent working and efficiency per hour, compared to a typical week. Specifically, we asked respondents three items: "How many hours have you been working approximately in the past week?" (Item 1), "How many hours were you expecting to work over the past week assuming there would be no global pandemic and lockdown?" (Item 2), and "If you rate your productivity (i.e., outcome) per hour, has it been more or less over the past week compared to a normal week?" (Item 3). Item 3 measured perceived efficiency and was answered on a bipolar scale that ranged from "100% less productive" to "100% more productive", with the scale mid-point being "0%: as productive as normal". We computed productivity with the following formula: productivity = (Item 1 / Item 2) × ((Item 3 + 100) / 100). Productivity scores from 0 to .99 reflect lower than normal productivity, a score of 1 the same amount of productivity, and scores above 1 higher levels of productivity.
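The productivity formula can be illustrated with a short sketch. The function and parameter names are hypothetical, and the input checks are an assumption added for illustration; only the formula itself comes from the text.

```python
def productivity(hours_worked, hours_expected, efficiency_change_pct):
    """Productivity score relative to a normal week.

    hours_worked:          Item 1, hours worked in the past week
    hours_expected:        Item 2, hours expected without pandemic and lockdown
    efficiency_change_pct: Item 3, from -100 (100% less productive)
                           to +100 (100% more productive); 0 = as normal
    """
    if hours_expected <= 0:
        raise ValueError("hours_expected must be positive")
    if not -100 <= efficiency_change_pct <= 100:
        raise ValueError("efficiency_change_pct must lie between -100 and 100")
    return (hours_worked / hours_expected) * ((efficiency_change_pct + 100) / 100)


# Same hours and same efficiency as in a normal week.
print(round(productivity(40, 40, 0), 2))
# Fewer hours than expected, partly offset by higher efficiency per hour.
print(round(productivity(36, 40, 10), 2))
# More hours and higher efficiency than in a normal week.
print(round(productivity(45, 40, 20), 2))
```

Because the two factors are multiplied, a drop in hours can be offset by higher efficiency per hour: in the second call, 10% fewer hours at 10% higher efficiency still yields a score just below 1.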
The reason for this choice is that we wanted to investigate the variance in productivity while working remotely as compared to being in the office. We acknowledge that some readers might have concerns with this approach. For example, software engineers might understand productivity differently. One software engineer might feel productive after being asked to do many high-priority tasks other than their main task for the week, whereas another might feel less productive. However, this is an issue of all our scales (e.g., we do not know whether participants interpret or instantiate autonomy, competence, or well-being in the same way), and we nevertheless find strong correlations among these variables. This interpretation is supported by psychological research: there is substantial heterogeneity in how people interpret human values (e.g., equality, freedom, security) [59]. Nevertheless, values are still strong predictors of personality and beliefs [71,120]. As long as there is no systematic bias in how our participants understood productivity (and we do not assume there is), we do not believe this is an issue. Additionally, the test-retest reliability was large, r_it = .50, p < .001, and productivity correlated negatively with the number of breaks taken (Tab. 6).
Stress. We used a 4-item version of the Perceived Stress Scale [28], as it is an often used and well-validated scale [78]. Example items include "In the last week, how often have you felt that you were unable to control the important things in your life?" and "In the last week, how often have you felt confident about your ability to handle your personal problems?" The response scale ranged from 1 (Never) to 4 (Very often). α_1 = .80, α_2 = .77 (r_it = .73, p < .001).
Boredom. We used the Boredom Proneness Scale [42,131] because it is a well-validated scale [133]. Example items include "It is easy for me to concentrate on my activities" and "Many things I have to do are repetitive and monotonous". Items were answered on a 4-point scale ranging from 1 (Strongly disagree) to 7 (Strongly agree). α_1 = .87, α_2 = .87 (r_it = .69, p < .001).
Quality and quantity of communication with colleagues and line managers. We used a self-developed three-item instrument to capture how positive and supportive the communication has been with colleagues and line managers. The items are "I feel that my colleagues and line manager have been supporting me over the past week", "I feel that my colleagues and line manager believed in me over the past week", and "Overall, I am happy with the interactions with my colleagues and line managers over the past week." (α_1 = .88, α_2 = .92; r_it = .67, p < .001).
Daily Routines. We developed a five-item scale to capture participants' daily habits, as having automaticity in one's life frees cognitive resources for other things such as work [94]. The items were designed to capture a broad range of daily activities that were possible during the regulations in most countries at the time of data collection (spring 2020). The items are "I am planning a daily schedule and follow it", "I follow certain tasks regularly (such as meditating, going for walks, working in timeslots, etc.)", "I am getting up and going to bed roughly at the same time every day during the past week", "I am exercising roughly at the same time (e.g., going for a walk every day at noon)", and "I am eating roughly at the same time every day" (α_1 = .75, α_2 = .78; r_it = .73, p < .001).
Distractions at home. We developed a two-item scale to measure perceived distraction in general, as measuring the exact cause of distractions would have been beyond the scope of our study. The items are "I am often distracted from my work (e.g., noisy neighbors, children who need my attention)" and "I am able to focus on my work for longer time periods" (recoded) (α_1 = .64, α_2 = .63; r_it = .63, p < .001).

Measurement of activity-specific variables
After providing informed consent, participants were instructed: "Which of the following tasks have you spent most time with yesterday? For example, when you spent most of your time in two meetings, pick the meeting that went longer. Select three tasks." Participants selected three of the activities we used in Study 1, except breaks, interruptions, and various, which were excluded, leaving 12 activities: coding (n = 192), bugfixing (111), testing (96), specification (22), reviewing (91), documenting (40), meetings (87), emails (51), helping (33), networking (11), learning (93), and administration (14). Participants then completed 17 items for each task, 8 measuring our two dependent variables, well-being and productivity, and 9 measuring our three independent variables, need for autonomy, competence, and relatedness.
Satisfaction was measured with six items we created for the purpose of the study. The items were created to capture positive and negative aspects of satisfaction [72]. In other words, some items were reverse-scored, which might result in lower reliability (e.g., if a participant gives the item only a cursory read) but comes with the advantage of higher validity [26]. The wording of the six items is "How stressed were you during the task?" (reverse-scored), "How many positive emotions have you felt during the task?", "How bored were you during this task?" (reverse-scored), "After completing the task, I felt tired" (reverse-scored), "Performing this task frustrated me" (reverse-scored), and "I felt exhausted after the task" (reverse-scored). The reverse-scored items were recoded so that higher scores indicated higher well-being. Answers were given on a scale ranging from 1 (Not at all) to 7 (Very). A principal component analysis revealed that the six items loaded on one component, with good internal consistency (α = .80).
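Recoding reverse-scored items on a 1-to-7 scale amounts to mirroring each response around the scale midpoint. The sketch below illustrates this; the short item labels are hypothetical, and averaging the six items into one score is an assumption, since the text does not state how the items were aggregated.

```python
SCALE_MIN, SCALE_MAX = 1, 7

# Hypothetical short labels for the five reverse-scored items.
REVERSED_ITEMS = {"stressed", "bored", "tired", "frustrated", "exhausted"}


def recode(score):
    """Mirror a response so that 1 becomes 7, 2 becomes 6, and so on."""
    return SCALE_MIN + SCALE_MAX - score


def satisfaction_score(responses):
    """Mean of all items after recoding the reverse-scored ones.

    responses: dict mapping an item label to a raw 1-7 answer.
    Taking the mean is an assumption made for this illustration.
    """
    recoded = [
        recode(value) if label in REVERSED_ITEMS else value
        for label, value in responses.items()
    ]
    return sum(recoded) / len(recoded)


# Hypothetical answers of one participant for one task.
answers = {
    "stressed": 2,
    "positive_emotions": 6,
    "bored": 1,
    "tired": 3,
    "frustrated": 2,
    "exhausted": 3,
}
print(round(satisfaction_score(answers), 2))
```

After recoding, a high value on every item points in the same direction, so the items can be combined into a single satisfaction score.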
Productivity was measured with two items we created for the purpose of the study: "How productive have you been during this task?", which was answered on a scale ranging from 1 (Not at all) to 7 (Very), and "What percentage of your goals have you reached during <task>," which was answered on a 0-100 scale. We created both items as they measure related, yet slightly different aspects of productivity. For example, a software engineer can feel productive but not have reached all of their goals because unexpected issues occurred while working on an activity. If the issues were overcome, the software engineer might feel productive but have not fully reached their goals. Both items were standardized before being averaged (α = .50).
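Because the two productivity items use different response scales (1-7 and 0-100), they have to be put on a common scale before averaging. A minimal sketch of this step follows, assuming z-standardization with the sample standard deviation; the function names and data are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_productivity(ratings_1_to_7, goals_pct_0_to_100):
    """Average the two standardized items per response."""
    z_ratings = z_scores(ratings_1_to_7)
    z_goals = z_scores(goals_pct_0_to_100)
    return [(a + b) / 2 for a, b in zip(z_ratings, z_goals)]


# Hypothetical responses of four participants for one task.
ratings = [3, 5, 6, 7]
goals = [40, 80, 70, 100]
print([round(score, 2) for score in composite_productivity(ratings, goals)])
```

Without standardization, the 0-100 item would dominate the average simply because of its larger numeric range.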
To measure the three independent variables, we adapted three items for each of the three needs of self-determination theory [117] from the balanced measure of psychological needs scale [123]. The scale measures each of the three needs with six items. We selected those items which we judged most suitable to be adapted for our purpose. We chose three items to get a good balance between brevity and informativeness: for example, if we had measured each need with only two items, we would have ended up with only one if a participant skipped an item as not applicable; conversely, selecting four items per need would have resulted in nine more items (i.e., 3 needs × 3 activities) for the full survey, thus increasing its length. All items were answered on a 7-point response scale varying from 1 (Not at all) to 7 (Fully) with an 8th option, 'Not applicable.'
Need for autonomy was measured with "I was really doing what interests me," "I was free to do things my own way," and "I had a lot of pressures I could do without when working on the task" (recoded). However, as the last item was uncorrelated with the other two, rs = -.00 and -.14, we only combined the first two items (α = .46) into an autonomy factor and included the last item as a single-item predictor (footnote 3).
Need for relatedness was measured with "I felt close and connected with people working on the same task as me," "I felt appreciated by one or more people working on the same task as me," and "I had disagreements or conflicts with people working on the same task as me" (recoded). However, as the last item was uncorrelated with the other two, rs = .09 and .06, we only combined the first two items (α = .73) into a relatedness factor and included the last item as a single-item predictor (footnote 4).
Need for competence was measured with "I was successfully completing the task," "I did well even at the hard things," and "I struggled to complete the task" (recoded; α = .64).
Thus, instead of the three predictors, we now have five, two of which are single-item predictors. While single-item scales are sometimes considered problematic because of possible low reliability, they are often used in research and, assuming there is evidence that participants paid attention to the items (as evidenced through good internal consistencies of other scales), can produce meaningful findings [50,144]. Indeed, the results of the measures with the two single items are in line with expectations (see below).

Measurement of task-independent variables
Additionally, we included variables that the exploratory investigation suggested to be related to our dependent variables.
Resilience was measured with the 6-item Brief Resilience Scale [128]. Participants indicated how much they agreed with statements such as "I tend to bounce back quickly after hard times" and "It is hard for me to snap back when something bad happens" (recoded). Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .73).
Caring leadership was measured with the 7-item Caring Leadership Scale [81]. Example items include "My manager develops an atmosphere of caring and trust" and "I feel free to discuss work problems with my manager without fear of having it used against me later." Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .85).
Work-life balance was measured with a 5-item scale. The items from this and the following five scales were provided by Qualtrics and offered to their users [1]. After reading the items, we judged them as appropriate measures of the constructs (e.g., work-life balance) they claim to measure. Example items include "My workload is manageable" and "I have the flexibility I need in my work schedule to meet both work and personal needs." Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .84).
Empowerment was measured with a 7-item scale. Example items include "I am given the opportunity to be involved in decisions that affect me" and "Employees are encouraged to participate in decisions that affect their work." Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .83).
Job Enablement was measured with a 7-item scale. Example items include "My job is challenging and interesting" and "My work-from-home workspace allows me to be productive." Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .77).
Soft company support was measured with a 3-item scale, including "My company is providing me with the necessary software tools to work from home" and "My company is providing me with the necessary flexibility so that I can work from home properly." Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .64).
Hard company support was measured with a 3-item scale, including "My company is supportive in providing me the necessary work from home setting (e.g., chair, screen, mouse)" and "From the start of the lockdown, my company is taking care also of things it didn't do before (e.g., internet bill, electricity bill)." Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .76).
Recognition was measured with a 7-item scale. Example items include "I receive meaningful recognition when I do a good job" and "My manager values my contribution." Responses were given on a 5-point scale ranging from 1 (Strongly disagree) to 5 (Strongly agree; α = .89).

Analysis & Results
In this section we describe which analyses we used to address our five research questions and the results.
RQ1: Has the distribution of daily working activities of software engineers changed while WFH during the pandemic as compared to pre-pandemic daily working activities?
To test RQ1, we first compared the time participants reported to have spent on each of the 15 activities with the times reported by Meyer et al. [91]. The results are displayed in Figure 2, as well as Tables 4 and 5. To test whether participants in our sample reported spending more or less of their time on certain activities than the software developers surveyed by Meyer et al. [91], we performed a series of one-sample t-tests. For example, we tested whether the percentage of time that participants in our sample spent coding at time 1 was significantly different from 15%, which is the percentage reported by Meyer et al. (see Table 4, second column). That is, we tested whether participants in our sample spent significantly more time (i.e., > 15%) or less time (< 15%) coding than participants in Meyer et al.'s pre-pandemic study. We performed 15 (activities) × 2 (time points) = 30 t-tests (two-tailed, since we did not have directed hypotheses) (footnote 5).
Software engineers in our sample reported on average to have spent less time bugfixing, in meetings, getting interrupted (only at time 2), helping (only at time 2), and taking breaks, but more time on testing, specification, writing documentation, networking (only at time 1), learning, and administrative activities compared to the participants surveyed by Meyer et al. (Table 4). However, what our participants and those of Meyer et al. reported differed by only a few percentage points (see Figure 2). This visual inspection of the data is supported by correlation analysis. The percentages of time spent on the 15 activities (footnote 6) reported by Meyer et al. correlated with those reported by our participants, r(13) = .84, p < .0001, at time 1 and r(13) = .83, p = .0001, at time 2. To obtain those correlations, we correlated the mean percentages reported in columns 2-4 of Table 4 with each other. That is, we tested whether the average percentages spent on each activity reported by the participants in the Meyer et al. sample would align with those reported by the participants in our sample at waves 1 and 2.
This suggests that while there are some deviations, the overall order of activities remains stable. It further supports the quality of our data. If our participants had responded carelessly or even randomly, those two correlation coefficients would be around 0.
Footnote 5. Because of the large number of comparisons, we adjusted the α-threshold from .05 to .003 to reduce the risk of false-positive results. This means that we considered only p-values of < .003 as statistically significant. This is a standard procedure for studies that involve many variables to ensure reliable results, e.g., [58]. Note that changing the α-threshold changes the critical value of the test statistic (e.g., t-value), as the test statistic and p-value are perfectly associated for any given sample size [62]. For example, for an α-threshold of .003 and a sample size of 192 (time 1) or 184 (time 2), the critical t-values are 3.006 and 3.008. In other words, only if the t-value obtained from a t-test is larger than 3.006 (or 3.008) would the p-value be < .003, and we would consider the test result to be statistically significant. Note that a Bonferroni correction would have resulted in an adjusted alpha level of .05/30 ≈ .0017, which is overly conservative and does not consider that some variables are correlated (e.g., between time 1 and 2). Thus, the adjusted significance threshold of .003 seemed appropriate to us, neither overly conservative nor liberal. Also, we are not interpreting p-values that are just above our threshold. Doing this would be equal to stating that there is a trend towards significance, implying that with a larger sample the effect would have become statistically significant. However, this is not the case [145].
Footnote 6. For the correlations, the degrees of freedom are N − 2 = 13 with N = 15 activities.
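The testing procedure for RQ1 can be sketched as follows: a one-sample t statistic is computed against a pre-pandemic reference percentage and compared with the critical value for the adjusted threshold. The critical value of 3.006 and the 15% reference for coding are the ones stated in the text; the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

N_TESTS = 15 * 2                     # 15 activities x 2 time points
BONFERRONI_ALPHA = 0.05 / N_TESTS    # about .0017, judged overly conservative
ADOPTED_ALPHA = 0.003                # threshold adopted in the text
CRITICAL_T_TIME1 = 3.006             # stated for alpha = .003 and N = 192


def one_sample_t(values, reference):
    """t statistic for testing whether the mean differs from `reference`."""
    n = len(values)
    return (mean(values) - reference) / (stdev(values) / sqrt(n))


def is_significant(t_value, critical_t):
    """Two-tailed decision based on the absolute t statistic."""
    return abs(t_value) > critical_t


# Hypothetical percentages of time spent coding, compared with the
# 15% reported by Meyer et al. A real check needs all 192 responses,
# because the critical value depends on the sample size.
coding_pct = [20, 10, 25, 15, 30, 5, 20, 15]
t_coding = one_sample_t(coding_pct, reference=15)
print(round(t_coding, 2), round(BONFERRONI_ALPHA, 4))
```

Comparing the absolute t statistic with a critical value is equivalent to comparing the p-value with the α-threshold, which is why the adjusted threshold can be expressed either way.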
In the next step, we explored whether participants' activities changed over time during the lockdown. To do this, we performed a series of paired t-tests (Table 5). The only statistically significant differences were observed for networking and taking breaks. At time 2, participants spent less time networking and taking breaks compared to time 1. Overall, the relative order of the activities remained very stable across time on the group level (i.e., when correlating the group averages for the activities of time 1 and 2), r(13) = .99, p < .0001.

RQ2: Is the distribution of daily working activities related to well-being, productivity, and other variables?
To test RQ2, we correlated the time participants spent on each activity with the selected variables. This was possible because the activities were mostly uncorrelated at both time points on an individual level. We report Pearson's correlation coefficients (r) in our tables since most of the data were normally distributed. However, for the sake of completeness, we also ran non-parametric Spearman's rank correlation tests (reported in the Supplementary Material), which provided us with very similar results, suggesting the robustness of our results. In total, we computed at both time points 13 (well-being related variables and productivity) × 15 (activities) = 195 correlations. Given the large number of comparisons, we changed our significance threshold from α = .05 to .0005. Again, a Bonferroni correction would have resulted in an adjusted alpha level of .00017, which is overly conservative and does not consider that some variables are correlated (e.g., distractions and stress). Thus, the adjusted significance threshold of .0005 seemed appropriate to us, neither overly conservative nor liberal. This new threshold implies that only correlation coefficients of r ≥ .25 are significant. This is because the p-value of r = .25 is just below the .0005 threshold for our sample size of 192, p ≈ .00047.
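The link between r = .25 and the adjusted threshold can be traced with the standard conversion of a correlation coefficient into a t statistic with N − 2 degrees of freedom. The worked step below is a sketch using that textbook formula; the p-value itself is the one reported in the text and is not recomputed here.

```latex
\[
t \;=\; r\,\sqrt{\frac{N-2}{1-r^{2}}}
  \;=\; 0.25\,\sqrt{\frac{192-2}{1-0.25^{2}}}
  \;=\; 0.25\,\sqrt{\frac{190}{0.9375}}
  \;\approx\; 3.56,
\qquad df = N - 2 = 190 .
\]
```

A two-tailed test of this t value with 190 degrees of freedom corresponds to the reported p ≈ .00047, so smaller correlations fall above the .0005 threshold at this sample size.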
The correlation coefficients are presented in Table 6 and Table 7. No correlation was significant at both time points at α = .0005. At time 1, three significant correlations emerged which were no longer significant at time 2. First, productivity was negatively correlated with time spent on breaks, r = −.30, p = .00002, which can be considered a further validation of our productivity measure rather than a meaningful finding in itself. At time 2, however, the correlation between productivity and time spent on breaks was again negative but did not reach statistical significance, r = −.16, p = .03. Second, relatedness correlated negatively with interruptions at time 1, r = −.27, p = .0002, but not at time 2, r = −.04, p = .58. Third, autonomy correlated negatively with meetings at time 1, r = −.25, p = .00048, but not at time 2, r = −.17, p = .02. Overall, we conclude that work activities carried out at home are not related to well-being, productivity, and other variables.

RQ3: Do the needs for autonomy, competence, and relatedness predict software engineers' activity-specific satisfaction and productivity?
To test the third research question, we first ran two linear mixed models with random intercepts across all activities using the R package lme4, version 1.1-25 [10]. A linear mixed model is superior to a standard multiple linear regression here because the responses are not independent, whereas independence is an assumption of regression analysis [14]. Each participant responded to three activities, making the responses dependent. Ignoring dependencies can result in biases such as an inflated type-I error rate (i.e., false positives) [70]. Figure 3 displays the results. Across all activities, activity satisfaction was negatively predicted by conflicts and pressure, and positively by autonomy, competence, and relatedness (footnote 7). In contrast, productivity was only predicted by autonomy, relatedness, and especially competence.
In the next step, we tested whether the pattern of our findings would hold within each of the activities completed by at least 77 participants. This threshold was used because the power analysis reported above revealed that at least 77 participants were needed to detect a medium effect size. As can be seen in Figures 4 to 6, the pattern of the results was mostly consistent across the activities, but some minor deviations occurred. For example, for meetings, competence did not matter for participants' activity satisfaction and productivity, but autonomy mattered. In other words, during meetings, it matters more whether people feel autonomous rather than competent.
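A random-intercept model of this kind can be written out as follows. The notation is an assumption made for illustration (i indexes the activity reported by participant j, y is either activity satisfaction or productivity) and is not taken from the original analysis scripts.

```latex
\[
y_{ij} = \beta_0
       + \beta_1\,\mathrm{autonomy}_{ij}
       + \beta_2\,\mathrm{competence}_{ij}
       + \beta_3\,\mathrm{relatedness}_{ij}
       + \beta_4\,\mathrm{conflict}_{ij}
       + \beta_5\,\mathrm{pressure}_{ij}
       + u_j + \varepsilon_{ij},
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^{2}),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2}).
\]
```

The participant-specific intercept u_j absorbs the dependency among the three responses given by the same person, which a standard regression would ignore.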

RQ4: Are the associations between activity satisfaction and productivity moderated by resilience and company support?
We tested the fourth research question by running a series of 2 (DV: activity satisfaction vs. productivity) × 5 (IVs: activity-specific variables autonomy, competence, relatedness, conflict, pressure) × 8 (moderators: resilience, leadership, balance, empowerment, enablement, soft-support, hard-support, recognition) = 80 moderated regression analyses. Specifically, we multiplied each of the task-dependent variables with each of the task-independent variables. Given the large number of tests, we set our α-level to .001 to reduce the likelihood of false-positive results. However, none of the interactions reached statistical significance, ps > .001. Together, this suggests that only activity-specific variables matter for activity satisfaction and productivity.
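The set of 80 moderated regressions can be enumerated mechanically, as the following sketch shows. The variable labels and the R-style formula strings are hypothetical, and the product term is built from the raw values because the text does not say whether the variables were centered first.

```python
from itertools import product

OUTCOMES = ["activity_satisfaction", "productivity"]
PREDICTORS = ["autonomy", "competence", "relatedness", "conflict", "pressure"]
MODERATORS = [
    "resilience", "leadership", "balance", "empowerment",
    "enablement", "soft_support", "hard_support", "recognition",
]


def model_specs():
    """One specification per outcome x predictor x moderator combination."""
    return [
        {
            "outcome": dv,
            "predictor": iv,
            "moderator": mod,
            # Main effects plus their interaction (illustrative formula string).
            "formula": f"{dv} ~ {iv} + {mod} + {iv}:{mod}",
        }
        for dv, iv, mod in product(OUTCOMES, PREDICTORS, MODERATORS)
    ]


def interaction_term(predictor_values, moderator_values):
    """Product of predictor and moderator, computed per observation."""
    return [p * m for p, m in zip(predictor_values, moderator_values)]


specs = model_specs()
print(len(specs))
print(specs[0]["formula"])
```

In each model, moderation is indicated by the coefficient of the product term, which is the quantity tested against the α-level of .001.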
Additionally, we tested whether any of the seven task-independent variables would be associated with activity satisfaction and productivity; we again ran two linear mixed models with random intercepts across all activities. The predictors were resilience, leadership, balance, empowerment, enablement,
Since our design had left many empty cells (footnote 8), a standard approach such as a within-subject ANOVA was not possible (e.g., no participant reported that they were networking and doing administrative activities). We therefore standardized all of our seven outcome variables and tested whether activities would lie above or below the mean for each scale using a series of one-sample t-tests. This approach allows testing whether doing a specific activity increases or decreases, for example, activity satisfaction compared to the average of all activities.
Considering the high number of tests involved in our analysis, we set the alpha level to .001, which means that we only consider results to be significant if p < .001 or the 99.9% CI does not include zero. Results are displayed in Figures 7 and 9 and Tables 8 and 9. Activity satisfaction was on average lower when participants were bugfixing [M = -0.48, SD = 1.02, t(114) = -5.07, p < .0001], and higher when participants were helping others [M = 0.56, SD = 0.77, t(35) = 4.39, p = .0001]. Further, participants experienced higher levels of autonomy when coding and lower levels of autonomy when being in meetings and writing emails. Competence was lower when bugfixing and higher when helping people. Relatedness was only higher when people were helping. Pressure and conflict were not impacted by task.

Exploratory Analysis
We explored whether there are any gender mean differences for any of our activity-independent and activity-dependent variables, because other studies found that women's mental health and productivity were more negatively impacted by the COVID-19 pandemic than men's [23]. In total, we conducted 8 (activity-independent) + 201 (activity-dependent with > 1 woman responding) independent-samples t-tests. Because of the large number of comparisons, we adjusted our α-threshold to .0005. None of the t-tests reached statistical significance, all ps > .0006. We report descriptive and relevant inferential statistics for each of the 209 t-tests in the Online Supplemental Materials on Zenodo. Additionally, we explored whether day of the week is associated not only with productivity (previous research found that productivity is higher Tuesdays to Thursdays and lower on Mondays and Fridays [121]) but also with well-being or needs. However, this was not the case, according to a series of both Pearson's and Spearman's rank correlations, rs < .13, ps > .07.
Note. Each variable was first standardized. We then performed a series of one-sample t-tests to test whether participants score on average above or below 0 (i.e., the average across all activities), separately for each activity and variable.

Revised Theoretical Framework
Our results partly align with the theoretical framework proposed by Deci et al. [35] (cf. Figure 1). Whereas the exploratory study did not find that activities are significantly correlated with needs or the dependent variables, the confirmatory study found support for it. We found that some activities were linked with the activity-specific needs of self-determination theory as well as activity-specific satisfaction. Additionally, activity-specific needs were associated with activity-specific satisfaction and productivity. However, while our findings are in line with Deci et al.'s [35] broad framework, we are, to the best of our knowledge, the first to test which activities show stronger links with activity-specific needs, satisfaction, and productivity. A revised theoretical framework is also supported by our confirmatory study: the links between the three needs and activity satisfaction as well as productivity are moderated by the type of activity (moderation is represented in Figure 8, a consequence of our findings in Figures 4 to 6). In other words, the strength of the association between needs and activity satisfaction as well as productivity depends on the type of activity. The model depicted in Figure 8 does not directly contradict the model shown in Figure 1, but it revises it. They can co-exist, as our data show. The model from Figure 1 is more relevant for understanding underlying mechanisms and basic processes, whereas the model from Figure 8 has more applied value. Indeed, the latter model offers intriguing possibilities for future research, which we discuss in more detail below.
Fig. 8 Revised Theoretical Framework. We found that the strength of the association between Self-Determination Theory needs and the two dependent variables depends on the type of activity performed by developers.

Implications for Research and Practice
Our investigation addresses the need for scholarly evidence concerning the effects of WFH during the COVID-19 pandemic on software developers' work activities, including the impact on professionals' well-being and productivity.Further, a deeper understanding of the effect of the pandemic on professional working life for the large number of software professionals working remotely provides relevant insights for both research and practice.To this end, this study makes several contributions, as summarized in Table 10.WFH does not affect the time spent on working activities by software developers, and the distribution is comparable to a typical office day.One interpretation might be that the significant time reduction of meetings suggests that online meetings are more time-efficient than physical ones.Also, professionals seem to be more focused when working remotely and have fewer interruptions.This allows them, among others, to dedicate more time to developing their own skills.Developers had a very regular work activity distribution during the pandemic, comparable to their office day.Fewer breaks and networking might suggest that professionals adapted to the new situations towards the end of the first lockdown in May 2020 in many countries, and became more timeefficient.
RQ2: Is the distribution of daily working activities related to well-being, productivity, and other variables?
A series of 2 × 195 correlation analyses did not show substantially significant results. Overall, we conclude that work activities carried out at home are not related to well-being, productivity, and other variables such as stress, boredom, or needs.
This can be interpreted as a generally positive finding, as it shows that, while WFH, the time spent on various activities is unrelated to important psychological and social variables when these are measured in the typical, general way (e.g., well-being over the past week).

Findings Implications
RQ3: Do the needs for autonomy, competence, and relatedness predict software engineers' activity-specific satisfaction and productivity?
In the confirmatory study, we found, across all activities, that the needs for autonomy, competence, and relatedness were positively associated with activity satisfaction and productivity, using linear mixed-effects modeling and multiple linear regression analysis. Conflict and pressure were negatively associated with activity-specific satisfaction but unrelated to activity-specific productivity. These associations were mostly consistent across activities, albeit a few deviations occurred (Fig. 4 and 6).
Self-determination theory provides a robust framework to understand and enhance developers' productivity and well-being. A higher degree of autonomy, competence, and relatedness for software professionals can increase their satisfaction and productivity. Rather than relying on control or micro-management, organizations should support employees in tailoring their own working activities and training.
RQ4: Are the associations between activity satisfaction and productivity moderated by resilience and company support?
A series of 80 moderated regression analyses revealed that neither caring leadership, work-life balance, empowerment, job enablement, soft company support, hard company support, nor recognition moderates the link between the three needs and activity satisfaction and productivity. Additionally, all seven task-unrelated variables were unrelated to activity-specific satisfaction and productivity.
Our results are inconclusive. Possibly, with more specific measures (e.g., activity-specific company support), this outcome might change.
As a community, we need better and more nuanced measurements of satisfaction and productivity to identify specific factors that contribute to professionals' satisfaction and productivity compared to overall assessments. Repeated self-reports (e.g., Experience Sampling [77]) can identify the effect of contextual factors (e.g., the current task). This allows for collecting reliable and contextually rich data, as participants assess their current state rather than reflect on an extended period in the past [134].
RQ5: Do software engineers' work activities while WFH during the pandemic affect their activity-specific well-being, productivity, and psychological needs?
We found that activity satisfaction was relatively lower when participants were bugfixing and higher when they were helping others, using a series of 84 one-sample t-tests. Additionally, autonomy was perceived as lower while professionals were in meetings or writing e-mails. Competence was higher when professionals were helping others and lower when bugfixing. Relatedness was higher when professionals were helping others. The findings hold even after controlling for multiple comparisons.
Bugfixing is associated with lower activity satisfaction, while helping others improves it. Code review, innersourcing, mentoring projects, and bug triaging processes support software engineers' desire to help, making them more satisfied and productive. At the same time, junior developers can learn from more experienced ones, which increases employee retention and the helpers' satisfaction.
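The summary above notes that the RQ5 results hold after controlling for multiple comparisons across the 84 one-sample t-tests. The text does not state which correction was applied, so the following is only a minimal sketch of one common choice, the Holm step-down procedure. The function name `holm_adjust` and the p-values are hypothetical and are not taken from the study.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order.

    Assumption: Holm's procedure is used purely for illustration; the
    study does not specify which multiple-comparison control it applied.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four tests (not the study's results).
raw_p = [0.001, 0.04, 0.03, 0.20]
for raw, adj in zip(raw_p, holm_adjust(raw_p)):
    print(f"raw p = {raw:.3f}, Holm-adjusted p = {adj:.3f}")
```

A test would then be considered significant only if its adjusted p-value stays below the chosen alpha level.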
First, we ran an exploratory longitudinal study during the COVID-19 lockdown with 192 carefully selected software professionals to address the first and second research questions. We assessed developers' working activities and their perceived well-being, productivity, and other relevant psychological and social variables. Data quality was supported by the high test-retest reliability of each variable (at least .50) and by Cronbach's alpha values above .60.
Second, we compared the time spent on typical office-based activities with the same activities while working from home. Using the taxonomy and previously collected data of Meyer et al. [91], we ran 30 one-sample t-tests to assess significant differences. Although we reported several differences, they are relatively small, which indicates that the time spent on different activities is almost identical in both the online and the physical working environment.
Third, we analyzed whether the time spent on each working activity changed during the pandemic. After performing 15 paired t-tests, we conclude that developers did not change how they spent their time over a period of two weeks.
Fourth, we investigated whether well-being-related variables and productivity are associated with the time spent on each activity and whether the findings replicate across both time points. To do so, we ran 195 correlation analyses at each of the two time points. Our results suggest that well-being-related variables and productivity are not associated with the time spent on each activity.
However, a shortcoming of our exploratory study is that we only measured general well-being, productivity, and needs and the amount of time spent on various activities during the past week. The lack of significant findings could suggest that either the type of activity does not impact professionals' well-being and productivity or that many other factors impact well-being and productivity more strongly (e.g., quality of social contacts [93,114]). We found evidence for the former in our confirmatory study.
In our confirmatory study, we tested whether activity-specific variables, such as the need for autonomy, competence, relatedness, and activity-independent variables, such as resilience or empowerment, are associated with activity satisfaction and productivity (to address the third research question). Additionally, we tested whether activity-specific and activity-independent variables interact in predicting activity satisfaction and productivity, addressing the fourth research question. Finally, we tested whether specific activities impact professionals' activity-specific satisfaction and productivity, addressing the fifth and final research question. Here, we summarize and discuss the results of our research questions.
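To make the one-sample comparisons of the exploratory study more concrete, the sketch below computes a t statistic and Cohen's d for one activity against a fixed benchmark value. It is an illustration under stated assumptions, not the authors' analysis code: the function name `one_sample_t`, the sample values, and the benchmark are all hypothetical, and the p-value is omitted because it would require a t distribution that the Python standard library does not provide.

```python
import math
import statistics


def one_sample_t(sample, benchmark):
    """Return (t statistic, degrees of freedom, Cohen's d) for H0: mean == benchmark."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t_stat = (mean - benchmark) / (sd / math.sqrt(n))
    cohens_d = (mean - benchmark) / sd  # standardized mean difference
    return t_stat, n - 1, cohens_d


# Hypothetical percentages of the workday spent in meetings while WFH.
meeting_share = [10.0, 12.0, 8.0, 15.0, 9.0, 11.0, 7.0, 13.0]
# Hypothetical office benchmark for the same activity (not a published value).
office_benchmark = 15.0

t_stat, df, cohens_d = one_sample_t(meeting_share, office_benchmark)
print(f"t({df}) = {t_stat:.2f}, d = {cohens_d:.2f}")
```

Reporting an effect size next to the test statistic helps judge whether a statistically significant difference is also practically relevant, which matters when differences are described as small.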
RQ1: Has the distribution of daily working activities of software engineers changed while WFH during the pandemic as compared to pre-pandemic daily working activities?
On the whole, we did not register significant changes to developers' work distribution. Further, we highlight that Meyer et al.'s sample covers only one software company (Microsoft) [91], whereas we surveyed developers across many globally distributed companies. Therefore, some deviations were expected. Nevertheless, we still report an overall consistency between our WFH data and Meyer et al.'s analysis of a typical office day at Microsoft. Our results show that neither working from home nor sample type (Microsoft employees vs. a diverse sample) affected how software engineers dedicate their time to specific activities. However, we observed some minor differences. Most notably, software engineers in our sample spent less time on bugfixing, meetings, and breaks. They also reported less time on e-mail writing (only in wave 1) and fewer interruptions when working from home (only in wave 2). In contrast, they spent more time on specifications, testing, administration, documentation, and learning. It is unclear whether those minor differences emerged because of the pandemic or because our sample differed.
We observe that time spent in meetings is significantly reduced while working remotely. One explanation is that online meetings are, on average, shorter and more time-efficient than meetings in the office. For example, small talk might be perceived as more challenging during online meetings than during in-person meetings. Alternatively, online meetings might be better planned, since setting them up often requires a clear start and end time. Also, our participants invested in improving their skill set, as they spent more time learning. Similarly, developers seem to be more focused on their activities: they reported fewer breaks and interruptions. At the same time, developers remain linked to their organization and their colleagues, since their time spent on networking remains the same. We did not register any significant change in the work activities during our exploratory investigation, with only two exceptions: at the first wave, developers spent more time on breaks and networking than during the second wave. Nevertheless, we report a correlation close to 1 between the group averages, suggesting a very high consistency in the pandemic activity distribution. That software engineers spent less time on breaks and networking at the second measurement point might indicate that they became more accustomed to their new WFH condition. Accordingly, professionals learned to spend their working time more efficiently. Similar conclusions are also supported by the literature [46,114].
RQ2: Is the distribution of daily working activities related to well-being, productivity, and other variables?
We did not find any significant relations between daily working activities and well-being, productivity, or the other investigated variables, with one exception: taking breaks was negatively associated with productivity at wave 1 of our exploratory study. This can be interpreted as a generally positive finding since it shows that software engineers' well-being and productivity do not depend on the type of activity they are doing, at least concerning the 15 measured activities. Although the negative relation between breaks and productivity is intuitive, we are very cautious about concluding that developers should take fewer breaks to be more productive, since the relation was not significant at wave 2 (although still negative). Further, prior work shows that breaks can increase well-being [32] and can improve the quality of professionals' social networks [138]. Moreover, correlation does not equate to causation: participants might have taken more breaks because they felt less productive for various reasons (e.g., more exhaustion, distractions at home).
Regarding the other activities, we do not find evidence that the time spent on an activity affects productivity or well-being. We did not register any significant effect of the amount of time dedicated to development activities on software engineers' general well-being, stress, boredom, or distractions while working from home. Previous studies showed that during the pandemic, it is essential to have daily routines to improve personal well-being [114]. However, routines do not seem to play a significant role when it comes to individual activities. As our findings show, possible distractions that might happen while working from home (e.g., children, noisy neighbors) do not influence the time spent on specific work activities.
The innate psychological needs of self-determination theory [117], with its three dimensions (need for autonomy, competence, and relatedness), are associated with work motivation in general [48]. To the best of our knowledge, our study is the first in our community to assess whether specific activities are correlated with autonomy, competence, and relatedness. Overall, we found that general psychological needs were unrelated to the amount of time developers spent on specific activities. In hindsight, this might be because the scale we used to measure the three dimensions of self-determination theory captures broad human needs in general [117] and not specifically while working on specific activities. We addressed this limitation of the exploratory study in the confirmatory analysis.
While working remotely, the quality of communication between team members can be challenging, as face-to-face communication has to pass through a medium (e.g., MS Teams, Zoom). Therefore, not being directly connected to the organization can become a big issue for remote workers. For example, research suggests that lower support from coworkers and supervisors [88], perceiving the values of one's organization to be different from one's own values [40], and unfair treatment and lack of appreciation [12] put the mental health of remote workers at risk. Interestingly, our results suggest that the quality of communication does not relate to individual working activities. This might seem surprising at first glance, as it is plausible to assume that those who find the quality of communication to be poorer might engage less in activities that require more communication (e.g., meetings) and more in activities that require less direct communication (e.g., coding, bugfixing). This might suggest that developers are professional enough not to let their behavior be influenced by their perception of the quality of communication. In other words, the time spent by software engineers on each activity is not detrimental to their relations with their organization. Prior research has mostly ignored whether activity type plays a role in professionals' psychological and social factors. Typically, scholars only measured whether people are, for example, stressed overall, as opposed to stressed by specific activities [12,40,88]. Our research suggests that the type of activity is not a confounding variable, which increases our trust in prior research, which has typically looked at subjective work experience in general rather than actual activities. So, our exploratory findings suggest that software engineers' psychological and social factors do not depend on which work activity they are performing, but rather on how it is done.
RQ3: Do the needs for autonomy, competence, and relatedness predict software engineers' activity-specific satisfaction and productivity?
In the confirmatory study, we found, across all activities, that the needs for autonomy, competence, and relatedness were positively associated with activity satisfaction and productivity. In contrast, conflict and pressure were negatively associated with activity satisfaction but unrelated to productivity. These associations were mostly consistent across activities, albeit a few deviations occurred (Fig. 4 and 6). For example, relatedness predicted activity productivity for meetings and reviewing but not for coding, bugfixing, testing, and learning. One possibility is that meetings and reviewing are typically more social (i.e., done with other people), making relatedness more relevant. Overall, our results align with our first findings, even though previous research often used different measures of well-being and/or needs, a variety of statistical tests (e.g., zero-order correlations), and/or relied on different populations such as student samples. For example, a meta-analysis [148] found a correlation of r = .49, 95% CI [.39, .57], between need for autonomy and satisfaction with life, which is very much in line with our findings: a linear mixed-effects model with only autonomy as a predictor of activity satisfaction revealed β = .48.
This result is of great relevance to understanding developers' satisfaction and productivity. Self-determination theory is a valuable lens for improving activity satisfaction and productivity. Indeed, more autonomous, competent, and related professionals show a higher degree of satisfaction and productivity. These findings are also valuable for employee recruitment and retention. Companies should keep this aspect in mind when organizing working activities. In particular, micro-management could be detrimental to software engineers' satisfaction and productivity. In other words, it is advisable to discuss realistic working goals for software projects and to leave it to the teams to self-organize, as a recent investigation of effective Scrum teams highlighted [137].
RQ4: Are the associations between activity satisfaction and productivity moderated by resilience and company support?
None of the seven task-unrelated variables (e.g., resilience, work-life balance) moderate the link between the three needs and activity satisfaction and productivity. Initially, we hypothesized that, for example, resilience might buffer against reduced autonomy because resilient people are more likely to bounce back after stressful events such as being less able to make autonomous decisions [128,141]. Generally measured variables (e.g., general work-life balance) are rarely associated with specific variables [34]. This might be because we measured resilience and work-life balance in a way that is too broad. Future research could measure resilience in a more specific way (e.g., resilience during the day or activity-specific resilience), which makes it more relevant for activity-specific satisfaction, productivity, and basic needs. Alternatively, other personality traits might be more relevant. For example, proactive personality was found to mitigate or moderate the effect between stress and productivity [68,98]. Thus, lower levels of activity satisfaction might strongly impact productivity for those who score low on proactive personality.
Additionally, caring leadership, work-life balance, empowerment, job enablement, soft company support, hard company support, and recognition were unrelated to activity-specific satisfaction and productivity. In hindsight, this is not surprising given that we measured all these variables generally. For example, if we had measured activity-life balance instead of general work-life balance, we would have likely found an effect on activity-specific satisfaction and productivity.
Overall, our results are inconclusive on this question. Although the moderation effects of resilience and company support are not supported, we acknowledge that with more specific measurements, this outcome might change.
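For readers less familiar with moderation analysis, the RQ4 tests can be summarized by the generic form of a moderated regression. The equation below is a textbook sketch, not the authors' exact specification: the symbols are illustrative, with Need standing for one of the three activity-specific needs, Mod for one activity-independent variable (e.g., resilience), and Y for activity satisfaction or productivity of participant i.

```latex
\[
Y_i = \beta_0 + \beta_1\,\mathrm{Need}_i + \beta_2\,\mathrm{Mod}_i
      + \beta_3\,(\mathrm{Need}_i \times \mathrm{Mod}_i) + \varepsilon_i
\]
```

Under this formulation, moderation would be supported if the interaction coefficient \(\beta_3\) differed reliably from zero. The absence of such interactions is what the discussion above reports.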
RQ5: Do software engineers' work activities while WFH during the pandemic affect their activity-specific well-being, productivity, and psychological needs?
We found that activity satisfaction was relatively lower when participants were bugfixing and higher when helping others. This finding is in line with previous research suggesting that helping others increases well-being [17]. In contrast, levels of activity productivity were more consistent across activities, while activity satisfaction varied. Our findings on bugfixing have three main practical implications.
First, bugfixing might be viewed as an annoying but necessary activity by many developers: compared to all activities, 80 participants reported a below-average level of satisfaction when bugfixing, whereas only 35 reported above-average satisfaction (cf. Fig. 7). Pointing out the meaningfulness of bugfixing is essential. The literature supports that meaning is positively associated with satisfaction, autonomy, competence, and relatedness [86]. Even though most developers are aware that bugfixing is essential, the odd reminder or nudge can have an impact [136]. For example, while most people are aware that switching off the light when it is not needed is beneficial for the environment, reminders nevertheless increase the likelihood that the light gets switched off [20]. Occasional reminders or nudges are typically very inexpensive and are likely to be cost-effective. However, more research is needed on whether nudges in the context of bugfixing result in substantially increased satisfaction. Additionally, organizations should support a higher degree of socialization during bugfixing activities. Software engineers appear to be (contrary to stereotypes) social and caring individuals. Consequently, code review practices should be primarily supported by management.
Second, organizations should facilitate an inclusive working environment in which developers actively help each other to perform different activities they can freely choose from. One concrete example might be to establish innersourcing projects [130]. They are similar to open source projects, except that they are closed projects in which only employees can participate. This practice would also support software professionals' need for autonomy by letting them contribute to projects they find important and are committed to.
Third, establishing mentorship programs can stimulate senior developers' desire to help by increasing newcomers' sense of relatedness. This aspect is even more critical in a WFH setting, where informal networking occasions are typically limited. At the same time, this will increase the onboarding success of new employees. Research has already shown that supporting newly hired employees through, for example, mentoring projects is an essential factor for onboarding success and, eventually, employee retention [122]. Furthermore, an effective bug triaging process is considered pivotal for a software organization's efficacy in addressing quality concerns [4]. Picking the right developer to work on a specific bug is crucial to fix the bug in a timely manner and to reduce bug tossing length [146]. Establishing an effective and transparent process is, thus, a way to establish meaning for this activity. Future research along the lines of RQ5 could also investigate whether an activity was self-chosen. If an activity is self-chosen, intrinsic motivation is usually higher, which is linked to higher job satisfaction and performance [61].

Measuring satisfaction and productivity
Findings from both studies have not only practical but also methodological implications. The time developers spend on a specific activity was unrelated to their well-being, productivity, needs, or working conditions when the latter were measured in a general (i.e., activity-unrelated) way. Researchers or employers who wish to identify how to increase satisfaction or productivity for a specific activity need to adapt their measures to become activity-specific. For example, increasing employees' general resilience or work-life balance will have little impact on how satisfied and productive they are with a specific coding task.
In contrast, enhancing autonomy for coding is likely more beneficial. However, it should be noted that we created the activity-related measures for the confirmatory study. While the measures were mainly associated with each other in the expected directions, even when measured with a single item (e.g., conflicts and pressure were negatively associated with activity satisfaction), future research could further improve our measures to increase their reliability.
However, this does not imply that general measures of personality and other constructs cannot predict activity-specific variables. Previous research established that, for example, personality variables predict related behavior averaged over a sample of occasions and situations much better than single observations [41,124]. For example, looking at an exhibition that is of personal interest at a museum might only weakly be predicted by need for autonomy. In contrast, averaging across multiple instances of autonomy-related behaviors across various situations (e.g., living in a self-chosen city, working in an area that matches personal interest, or listening to the music one likes most as opposed to the music preferred by friends, partner, or family) will likely be more strongly associated with need for autonomy. This is because general measures are broad and trans-situational by definition. For instance, resilience is important in many aspects of a software developer's life, not only while coding on a specific day. This activity, in turn, can also be influenced by many situational variables (e.g., distractions at home, a particular project, working with competent colleagues) that diminish the impact of personality. If researchers are interested in testing whether, for example, resilience predicts activity satisfaction, they might want to measure activity satisfaction across multiple activities (e.g., coding, bugfixing) and/or multiple time points [11].
Further, our findings cautiously suggest that WFH might be more beneficial for both developers and organizations than working in the office, at least for some groups of professionals [45]. However, while some studies support our conclusion that WFH increases or does not impact productivity [7,8,36,114], other studies found that WFH has a negative impact on productivity [51,74,95]. As there are too many potential differences between the studies (e.g., cultural factors, working conditions at home, type of work, measurement of productivity), cross-country and cross-profession studies are needed. Large samples or meta-analyses that synthesize the findings are needed to better understand the conflicting results on the direction in which WFH shifted productivity during the COVID-19 pandemic. Thus, there is a need for more research to identify factors that help us understand how WFH can be beneficial and whether these factors are transferable to working on-site. Our confirmatory study offers an intriguing possible explanation for the contradictory studies: the type of activity matters. Certain activities might be less feasible when WFH, which reduces productivity, whereas other activities might be more accessible when WFH and thus increase productivity, similar to what we predict in Figure 8.

Threats to validity
To conclude this section, we briefly address the most relevant limitations.
Reliability. We investigated our subject matter using a longitudinal exploratory design combined with a confirmatory cross-sectional one. Participants were identified using a multi-stage selection process to ensure (i) they are professionally active software engineers, (ii) data quality, and (iii) that they were working from home during the lockdown. Validated scales have been used when available or adapted from previous investigations. In line with most related research, we have not aimed to control for response biases because doing so usually has little impact on reliability: some approaches to control for response bias improve the reliability slightly, but can also reduce reliability or leave it unchanged [63]. Overall, we report a high test-retest reliability in the longitudinal study and adequate internal consistencies of all measures.
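Internal consistency is reported in this paper via Cronbach's alpha. As a minimal sketch of how that coefficient is computed from item-level responses, consider the following; the function name `cronbach_alpha` and the scores are hypothetical and do not reproduce any scale used in the study.

```python
import statistics


def cronbach_alpha(item_scores):
    """Compute Cronbach's alpha.

    item_scores: one list per item, each holding the respondents' scores
    in the same respondent order.
    """
    k = len(item_scores)
    # Sample variance of each item across respondents.
    item_variances = [statistics.variance(item) for item in item_scores]
    # Total scale score per respondent (sum over items).
    totals = [sum(responses) for responses in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 3-item scale answered by 5 respondents (not study data).
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```

Alpha approaches 1 when the items covary strongly relative to their individual variances, which is why it is read as an indicator of internal consistency.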
Construct validity. To enhance cross-study comparability, we used the taxonomy by Meyer et al. [91] to define the daily activities of software developers. Similarly, we used their benchmarks for comparison with the working from home setting. However, we did not monitor developers' effectiveness in executing each activity while working remotely. We opted for this to be consistent with Meyer et al. and because we collected data from a global sample of software professionals working in more than 190 different organizations, making the development of objectively comparable measurements nearly impossible. Still, we report some differences from the data collected by Meyer et al., although these differences amount to only a few percentage points.
Conclusion validity. Our conclusions rely on multiple statistical analyses, such as one-sample t-tests, paired t-tests, Pearson's correlations, multiple regressions, and linear mixed-effects models. Furthermore, we also ran nonparametric Spearman's rank correlation tests to check the consistency of our conclusions, since not all variables were perfectly normally distributed. To support Open Science, we make reproducible R code alongside our raw data openly available on Zenodo.
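The robustness check described above compares Pearson's correlation with Spearman's rank correlation. The sketch below illustrates the idea in Python rather than in the authors' R code, which is not reproduced here; all function names and the data are hypothetical. Spearman's coefficient is obtained by applying Pearson's formula to average ranks.

```python
import math
import statistics


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical skewed data: break minutes vs. a productivity rating.
break_minutes = [5, 10, 15, 20, 120]
productivity = [9, 8, 7, 6, 5]
print(f"Pearson r = {pearson(break_minutes, productivity):.2f}")
print(f"Spearman rho = {spearman(break_minutes, productivity):.2f}")
```

When both coefficients point in the same direction with similar strength, conclusions are less likely to be driven by outliers or non-normal distributions.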
Internal validity. We used self-reported measures for well-being, productivity, and other psychological and social variables for this investigation, which might be considered a limitation. The data for the exploratory study was collected towards the end of the first lockdown in spring 2020 with a longitudinal design. We expanded our initial data collection one year later, in spring 2021, with a cross-sectional, confirmatory study. This enabled our participants to report a more mature and stable assessment of the new working setting. For the exploratory investigation, we only considered countries with comparable lockdown measures (e.g., we excluded, among others, Denmark, Germany, and Sweden, as these countries did not face a total lockdown or had different measures in place across the country's regions). Accordingly, in both waves we asked participants about the lockdown conditions in their home country and whether they were still working from home. Moreover, the exploratory longitudinal study was performed in a relatively short time frame (around two weeks) due to the ever-changing public health policies in the first months of the pandemic. We do not deem this a significant limitation, since the main goal of this first study was to identify relevant tendencies to follow up in the confirmatory study. Since all selected informants faced comparable conditions, we did not exclude any of the 192 selected software professionals. For the confirmatory study, we surveyed 300 developers working from home. Since lockdown measures in spring 2021 were comparable across all countries, we did not exclude any country a priori.
External validity. We designed this study to maximize internal validity. Therefore, we determined our sample size with an a priori power analysis. So, we did not work with a representative sample of the software engineering population in mind (as Russo and Stol [116] did, surveying over 400 software engineers with the research goal of generalizing results). However, we recognize that we administered our surveys in the middle of a very peculiar period. This makes it unclear whether we can generalize our findings to non-pandemic working from home settings. Notwithstanding, we also recognize the need for fast and reliable evidence regarding the COVID-19 crisis we are facing right now, in order to improve the quality of developers' daily lives. This study will also enable a better-informed research design for future remote working studies once this pandemic is over. Finally, our sample is almost entirely composed of developers from Western countries. Consequently, the investigated effects could be different in other regions of the world (e.g., Africa or Asia).
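The a priori power analysis mentioned above is not specified further in this section, so the following is only an illustrative sketch of one standard approach: the Fisher z approximation for detecting a bivariate correlation with a two-sided test. The function name `n_for_correlation`, the target effect size, alpha, and power are assumptions chosen for the example and are not the parameters used by the authors.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect correlation r (two-sided test).

    Uses the Fisher z approximation:
    n = ((z_{1 - alpha/2} + z_{power}) / atanh(r))^2 + 3
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    fisher_z = math.atanh(r)  # Fisher transformation of the target effect
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)


# Hypothetical planning values: a medium and a smaller correlation.
for effect in (0.3, 0.2):
    print(f"r = {effect}: n >= {n_for_correlation(effect)}")
```

Smaller expected effects require considerably larger samples, which is the trade-off such an analysis makes explicit before data collection.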

Conclusion
This research focused on software engineers' activity satisfaction and performance during the COVID-19 pandemic.For the sake of clarity, we did not provide any consideration regarding the Future of Work after the COVID-19 lockdowns, such as Smite et al. [125].To do so, we first employed an exploratory longitudinal study design across two waves and a confirmatory cross-sectional study.We found that developers still spend proportionally the same amount of time on their different daily activities.For example, the software engineers in our sample still spent most of their working time on coding, bugfixing, meetings, testing, and e-mails, as previously reported by Meyer et al. [91].Nevertheless, we found some significant mean differences.Our participants reported having spent less time in meetings and breaks, suggesting that both were less common, possibly due to developers' adaption of working remotely.Similarly, no significant relations have been found between productivity, wellbeing, and relevant social and psychological variables with working activities.In our confirmatory cross-sectional study, we found that activity-specific needs for autonomy, competence, and relatedness are associated with activity-specific satisfaction and productivity.Furthermore, activity satisfaction was relatively lower when participants were bugfixing and higher when helping others.At the same time, autonomy was perceived as relatively lower while professionals were in meetings or writing e-mails.
Overall, our research suggests that WFH does not per se affect how much time developers spend working on various activities.Nevertheless, software engineers are social beings, and their satisfaction increase when they can help others.This paper also suggests a number of recommendations for organizations to support their employees' well-being and productivity.In particular, active company policies to support developers' need for autonomy, relatedness, and competence appear to be particularly effective in a WFH context.Also, bugfixing is the most detrimental activity for professionals' satisfaction.Accordingly, specific processes should be designed for software engineers working from home (e.g., bug triaging and mentorship programs).
As a deductive investigation using a quantitative approach, we can only assess the relations between the independent variables and our two dependent variables. Thus, we miss a number of nuances about the interactions of our variables that should be investigated further. Additionally, future research should aim to provide more tailored recommendations based on developers' personalities. This would result in a more nuanced understanding of the subject matter. Also, a better understanding of software professionals' activity satisfaction and productivity is needed to develop reliable measurement instruments and to develop or refine theories.
Regarding the quality screening, of the 300 selected participants, 10 failed at least one test item and/or completed the survey in less than 4 minutes and were excluded. The vast majority of participants (210) worked in 'Software & IT,' 20 in 'Education & Research,' and 11 in 'Finance, banking & insurance.'

Fig. 6
Predictors of well-being and productivity across activities with n ≥ 77. The horizontal lines represent 95% CIs. Plot one of three.

Table 1
Overview of prior studies about software engineers working from home

Table 3
Demographic information of both samples.For brevity, we list the most common countries among our sample.

Table 4
Comparisons of both waves with time spent on activities as reported by Meyer et al. [91]
Note. Activity percentages as per 'typical workday' following Meyer et al. [91]. M t1: mean at time 1 (see also Table 5); t-value 1: t-value of one-sample t-test of time 1 vs. the value reported by Meyer et al.; p1: p-value of one-sample t-test from time 1. ↑ and ↓ indicate a significant increase or decrease in time spent on an activity as compared to [91].

Table 5
Comparisons of activities between time 1 and time 2
Note. t: t-value of a dependent sample t-test; Cohen's d: standardized mean difference; Increase: Participants who spent more time on an activity at time 2 compared to time 1; Decrease: Participants who spent less time on an activity; Equal: Number of participants whose score has not changed. ↓ indicates a significant decrease between time 1 and 2.

Table 6
Correlations between activities and variables at Time 1

Table 7
Correlations between activities and variables at Time 2

Table 8
Differences between activities

Table 9
Differences between activities (continued)

Table 10
Summary of key findings & implications (continued)