1 Introduction

Through the increased digitalization of our society, working habits are gradually changing. These changes especially affect the working conditions of so-called knowledge workers, such as software engineers, architects, research scientists, accountants, lawyers and project managers, whose job primarily involves handling or using information. For these professionals, it is now possible to work anywhere and at any time: at the office, in transit or at home, using mobile devices and cloud services. It is not yet clear whether or how these new work habits affect workers’ well-being, since the reduction of time for rest and/or physical activity, in combination with the increased demands that workers face, can potentially lead to burn-out or physical problems.

Well-being is a key concept in the WHO definition of health: “Health is a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity.” (WHO 1948). In the Netherlands in 2015, one out of seven employees reported monthly burn-out symptoms at some point in their career (Hooftman et al. 2016). In 2017, the incidence increased to one out of six employees (16%). Improving the well-being of knowledge workers (and controlling the cost of sick leave) is of high importance for employers and society. At the same time, workers themselves have a personal interest in, and motivation for, preventing the negative consequences of imbalanced working habits, but they often need guidance and support to achieve this balance.

Personal tracking devices and wireless networking technology, including smartphones and wrist devices, have been regarded as a highly promising platform for supporting health behavior change (IJsselsteijn et al. 2006). Indeed, the sensing technology for recording personal behaviour and vital-sign data, using e.g. accelerometry or photoplethysmography, has rapidly matured. However, so far, effective use of this abundant data has been challenging. Critical elements are:

  • processing the raw sensor data into higher-level, interpretable labels, related to the personal context;

  • dealing with noise and artefacts and the large dynamic range of signals;

  • taking into account intra- and inter-personal differences and variance for interpreting the data;

  • designing a data storage and processing architecture that allows for flexible aggregation of information and data, while providing personalized privacy protection strategies;

  • providing personalized coaching strategies, taking into account context and personal preferences;

  • optimizing coaching by learning from historical data of the individual coachee and/or similar other coachees.

The SWELL project

The SWELL project (see Footnote 1) [2011–2017] had the objective of developing m-Health applications that help to improve the well-being of knowledge workers, by applying human-centered design techniques and involving sensing and reasoning techniques that are grounded in theory. SWELL was a large collaborative research project between several universities, companies and research institutes in the Netherlands, with each individual study contributing to the project goal at large. This paper aims to give an integrated overview of what has been achieved and provides recommendations for follow-up research.

A multidisciplinary perspective

The SWELL project brought together multiple disciplines: social sciences, computer science, human-computer interaction, and data governance. The consortium’s goal was to develop applications that use data captured by a combination of sensors to (a) reason on the gathered data in a smart and context-aware way, (b) provide users with insight into their behaviour and the determinants of well-being at work, and (c) develop personalized feedback and coaching strategies to change behavior. The project was structured as a set of individual studies, linked to the particular needs and expertise of the partners in this public-private collaboration:

  • a method to capture work-context in an unsupervised fashion, by analyzing raw keyboard and mouse interaction (Sappelli et al. 2016);

  • methods to apply the work-context to provide context-sensitive (personalized) search engine functionality (Verberne et al. 2014);

  • a design method for context aware systems, that takes into account non-functional requirements such as privacy concerns and user autonomy (Koldijk et al. 2014a, 2016);

  • analysis of the efficacy of posture and facial expression analysis for stress assessment (Koldijk et al. 2016);

  • a method for classifying activities using accelerometers in smartphones (Shoaib et al. 2014b);

  • a protocol-free method for measuring cardiovascular fitness with a wristband (Rospo et al. 2016);

  • a method for designing user-friendly privacy controls (Bokhove et al. 2012);

  • strategies to support people in increasing their self-efficacy for changing behavior (Achterkamp et al. 2018);

  • a method to design the context sensitive reasoning component of mHealth coaching apps (Bosems and van Sinderen 2015);

  • a mixed method evaluation of an m-Health app developed in SWELL (de Korte et al. 2018b).

This paper places the results of the individual studies in the larger context of developing personalized adaptive systems driven by sensor data, and provides an overview of what has been achieved and which new questions have been identified. We think that a multidisciplinary view on the challenges outlined above is important, since most related work has a more restricted scope. The lessons learned from our project have implications for the development of mHealth and pervasive support technology, which has gained considerable momentum in recent years.

Research questions

This paper is a reflection on the high-level research questions of the SWELL project, bringing together insights from the various studies that have been carried out in the project and specifying directions for future research. The high-level research questions that motivated the SWELL project were:

  1. RQ 1

    To what extent is it possible to recognize (sense and classify) physical human activities and work behavior of knowledge workers using a combination of non-obtrusive, affordable and easy-to-obtain sensors?

  2. RQ 2

    How can we design personalized feedback and coaching provided by mHealth applications to knowledge workers using theories of behavioral change and make these tools more context aware?

Clearly, continuous monitoring of work behaviour and health status results in a rather sensitive dataset. It is therefore important to engage users in the design of m-Health applications and data storage in order to gain and maintain trust. In SWELL, we followed a value-sensitive design approach to deal with privacy aspects.

Contributions

The main contributions of the SWELL project are: (1) to our knowledge, the SWELL project was the first project to address mHealth for well-being at work from a multi-disciplinary view, addressing both mental and physical well-being aspects; (2) we have shown the potential of affordable and easily available sensors for sensing and classifying user activity data; (3) our methodology is user-centered, with an important role for context-sensitivity and personalization; (4) in the design of our technology, privacy and user autonomy are integrated in the design process as important aspects of mHealth applications.

In the following sections, we first specify our motivation and key concepts (Sect. 2). Next, we describe prior and related work (Sect. 3). In Sect. 4, we present SWELL technologies and prototypes as instantiations, according to the sense–reason–act framework. This is followed by a discussion of the lessons learned in Sect. 5 and our conclusions in Sect. 6.

2 Motivation and key concepts

In this section we first provide further background on the factors that determine well-being at work, linking our work to the literature in occupational health (Sect. 2.1). Subsequently, we introduce our approach to provide workers with tools to empower them in self-regulating their well-being based on the sense–reason–act framework (Sect. 2.2). Finally, we summarize our research on privacy impact assessment and privacy aware design of m-Health tools (Sect. 2.3).

2.1 Well-being at work, an occupational health perspective

This paper is centered around two causes of work-related health risk factors in knowledge workers: (A) stress and (B) low physical activity. We summarize relevant studies from an occupational health perspective that provided a point of departure for our project:

  1. (A)

    Knowledge workers often experience stress caused by high work demands. They experience a high workload that can be difficult to manage and may impact mental well-being (Michie 2002; Kalimo 1999). Not all stress is harmful: Selye (1975) distinguishes positive and negative stress (eustress and distress). An environmental stressor causes a particular perception of the stressor in the individual. This can lead to acute physiological stress responses and, in the long run, to long-term physical, cognitive, emotional and behavioral stress consequences. The relation between job demands and resources determines how stressors influence the experience of stress. This is formalized in the Job Demands–Resources model (Demerouti et al. 2001). How the experience of stress can lead to long-term stress consequences is determined by the amount of recovery. This is formalized in the Effort–Recovery model (Meijman et al. 1998).

    In SWELL, we study (a) the measurement of stress, (b) innovative technology to decrease information overload [an important stress inducing factor (Edmunds and Morris 2000)] as well as (c) personalized and group interventions that help to reduce or control stress levels, e.g. by providing feedback on the correlation between tasks and stress levels.

  2. (B)

    The second cause of work-related health risks that we address is low physical activity. The majority of office workers do not meet the WHO guidelines for physical activity and fitness (see Footnote 2). Furthermore, desk workers spend 65–75% of their working hours sitting (Buckley et al. 2015). Recent literature shows that such sedentary behavior may lead to metabolic syndrome, a cluster of conditions associated with an elevated risk of cardiovascular disease and type 2 diabetes (Lee et al. 2010; Greer et al. 2015).

    To prevent these harmful effects, office workers should engage in sufficient physical activity at moderate and high intensity, and avoid long sedentary periods. The WHO recommendation for physical activity is to perform at least 30 min of moderately intense activity per day. The Dutch Fitnorm (NationaalKompas 2015) recommends 20 min of vigorous exercise on 3 days a week to improve cardio-respiratory fitness. Although there are no official guidelines yet with respect to sedentary behavior, initial recommendations were drawn up by an expert panel (Buckley et al. 2015). They recommend that desk workers perform at least 2 h of standing or light walking during work hours, eventually progressing to a total of 4 h per day. Furthermore, prolonged periods of sitting should be avoided. An additional argument for exercising more is that substantial evidence exists that moderate physical exercise has a positive influence on mental well-being (Fox 1999).

    In SWELL, we investigated methods to measure cardiovascular fitness without requiring the user to perform a protocol, as well as feedback methods aimed at increasing physical activity at work.
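As a rough illustration of how the recommendations cited above could be compared with measured activity, the following sketch checks one day's totals against two of the thresholds mentioned in the text (30 min of moderately intense activity per day, and an initial target of 2 h of standing or light walking during work hours). The class, function and field names are hypothetical and are not part of any SWELL tool; how the daily totals are derived from sensor data is left open.

```python
from dataclasses import dataclass

# Thresholds taken from the recommendations cited in the text.
MODERATE_MIN_PER_DAY = 30        # WHO: moderately intense activity
STANDING_MIN_PER_WORKDAY = 120   # Buckley et al. (2015): initial target


@dataclass
class DaySummary:
    """Hypothetical daily totals, in minutes, derived from sensor data."""
    moderate_activity_min: float
    standing_or_light_walking_min: float


def check_guidelines(day: DaySummary) -> dict:
    """Return, per guideline, how many minutes are still missing (0 if met)."""
    return {
        "moderate_activity": max(
            0.0, MODERATE_MIN_PER_DAY - day.moderate_activity_min
        ),
        "standing": max(
            0.0, STANDING_MIN_PER_WORKDAY - day.standing_or_light_walking_min
        ),
    }


if __name__ == "__main__":
    print(check_guidelines(DaySummary(20, 150)))
```

The remaining minutes per guideline could then serve as one input for feedback, rather than a binary pass or fail.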

2.2 The SWELL approach: Sense–Reason–Act

The goal of the SWELL project was to design mHealth applications supporting people in maintaining physical and mental well-being at work. The following hypothesis was taken as a starting point for the project:

Self-monitoring (recording a person’s activities and mental/physical state over a long period of time) can be used to improve the well-being of knowledge workers, by improving self-efficacy and self-knowledge as well as by providing support for behavioral change.

The approach that we took throughout the project is user-centered and context-aware. By user-centered we refer to three elements: (i) we used a user-centered design method (ii) to develop a tailored or personalized intervention (iii) in which the user is in full control of the collected personal data. Personalization is adaptation at the level of the individual; tailoring is adaptation to a subgroup that shares common characteristics and is more narrowly defined than the general population. By context-aware, we mean that detecting context with sensors and contextual reasoning are central elements in the design, in order to improve the usability and effectiveness of the intervention. We infer mental states, physical activity and health behavior from sensor data, collected with affordable, non-intrusive sensors. Our solutions are grounded in models from work stress psychology and theories of health behavior.

Fig. 1 Conceptual view of SWELL following the sense–reason–act framework

Pervasive technology (cheap networked sensors and mobile platforms) offers several opportunities for learning patterns and behaviour from personal data. Sensors can be used for comprehensive self-monitoring of activities and health states, with low effort and in an objective way. Sensors can also be used to contextualize physiological state data (think of physical and social context). The difference between personal goals and actual measured activities can subsequently function as an input for personalized coaching. Here, context data can be used to improve the relevance of suggestions and the timing of coaching messages.

In developing the SWELL prototypes, we followed the sense–reason–act framework (Kokar and Reveliotis 1993). This is illustrated in Fig. 1. The central component in this diagram is a human. The human is tracked by multiple sensors at a high frequency. In addition, subjective self-assessments regarding experienced energy level are added to the data stream.

  • The ‘sense’ research focuses on interpreting human activity and mental and physical condition on the basis of a combination of various types of low-cost, unobtrusive sensors. This challenge includes designing a multi-aspect user model, filled with the personal analytics derived from the raw sensor data.

  • The ‘reason’ challenge is to learn personalized predictive models relating outcomes to specific activities. This is the type of data that can be the starting point for providing users with feedback to reach their behavioural goals.

  • The reasoning engine can now compare the state of the user with his goals and generate a tailored recommendation ‘act’, taking into account the context.
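The three steps above can be sketched as a minimal processing loop. This is only an illustration under stated assumptions: the step-rate threshold, the goal expressed as a fraction of active time, the `in_meeting` context flag and the message text are all hypothetical, and the sketch does not reproduce any SWELL component.

```python
from statistics import mean


def sense(raw_samples):
    """'Sense': turn raw sensor samples into an interpretable label.

    Hypothetical rule: a mean step rate above 60 steps/min counts as 'active'.
    """
    return "active" if mean(raw_samples) > 60 else "sedentary"


def reason(history, goal_active_fraction):
    """'Reason': compare observed behaviour with the user's goal.

    Returns the gap between the goal and the observed fraction of
    'active' labels; a positive value means the user is below the goal.
    """
    active_fraction = history.count("active") / len(history)
    return goal_active_fraction - active_fraction


def act(gap, in_meeting):
    """'Act': produce a context-aware suggestion, or stay silent."""
    if gap <= 0:
        return None  # goal reached, no message needed
    if in_meeting:
        return None  # respect the context: do not interrupt
    return "Consider a short walk."


if __name__ == "__main__":
    labels = [sense(w) for w in ([0, 10, 5], [80, 100], [2, 3], [0, 0])]
    print(act(reason(labels, 0.5), in_meeting=False))
```

Keeping the three functions separate mirrors the framework: the labels produced by `sense` form the user model, `reason` relates them to goals, and `act` decides whether and when to intervene given the context.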

2.3 Privacy user study and value-sensitive design

Proper handling of sensitive personal data is crucial for the adoption of the type of technology that is envisioned by the SWELL project. This includes secure data storage and mechanisms for sharing (aggregated) analytics with health professionals or peers. To address this, we propose to develop context-aware systems with situated Cognitive Engineering (sCE) (Neerincx and Lindenberg 2008; Neerincx 2011). With sCE it is possible to combine theory on work stress with technological possibilities, taking into account input on the situated user needs.

Context-aware systems collect personal data in order to provide contextualized and personalized support. Tension may arise between technological possibilities of building rich user models and concerns that users might have, in particular regarding privacy. Current development methods do not (yet) address this tension, i.e., there is a lack of a compact and coherent method that balances functionality and privacy needs. Therefore, we refined sCE on two aspects: (1) defining the functional architecture with context and user-modeling components, and (2) integrating non-functional requirements into the (functional) requirements baseline. Our refined sCE has five complementary design guidelines, which we applied to the design of the SWELL applications (Koldijk et al. 2016).

Fig. 2 Overview of data flow, user model and controls for SWELL prototypes. The user model holds information on the user’s work context and the user’s well-being; information on the private context outside of work may also be included. The most important aspect of the design is user control (see Sect. 2.3)

An important goal in the first phase of the SWELL project was to collect user demands. We focused on privacy as an important aspect of user concerns. During a workshop with potential users we first identified 15 separate requirements on desired functionality. Once a first full system design was specified, users were specifically asked about hurdles to using the system. This yielded additional (non-functional) system, usability, quality, privacy and security requirements. The privacy impact assessment by Wright (2012) was then used to specifically formulate privacy requirements. We also identified several privacy requirements that were not mentioned by users themselves. These privacy requirements were addressed by eight privacy design strategies (Hoepman 2014). We integrated the outcome into the high-level architecture (Fig. 2).

As a last step, we evaluated in a user study whether information on privacy by design has a positive impact on users’ trust and attitude towards using the system (Koldijk et al. 2014a). We found that information about implemented privacy by design and data protection methods had a positive effect on perceived privacy and trust in our system. We also found that the attitude towards using our system was mainly related to personal motivation, not to perceived privacy and trust.

3 Prior work

In this section, we review related work that provided a point of departure at the start of the project (2011). We found inspiration in insights from the psychology of learning and argue that pervasive technology has a natural role in personalized support (Sect. 3.1). In Sect. 3.2 we review some early works on agent-based support of knowledge workers and derive some lessons learned. These early works provide guidance on how to situate pervasive technology in a complex human-agent setting. Sections 3.3 and 3.4 discuss a selection of related work that was taken as a starting point for the work on RQ1 and RQ2 respectively. Relevant related work that was published in parallel to the project (in the period 2011–2017) and beyond is referenced in the discussion section of this paper (Sect. 5).

3.1 Reflective practice: learning based on reflection

In SWELL, we set out to design agent technology that would be accepted by knowledge workers interested in improving their well-being. The SWELL vision has partly been inspired by the theory of reflective practice (Schön 1983). Schön advocates reflection on experience as the basis for learning. Important (pre-digital era) recommendations for reflective practice are:

  1. keeping a journal;

  2. seeking feedback;

  3. viewing experiences objectively;

  4. taking time at the end of each day, meeting etc. to reflect on actions and consequences.

We believe that some of these recommended activities could be delegated to a digital personal assistant, especially activities that are difficult for humans:

  1. record activities systematically and objectively, with contextual cues;

  2. provide insight into the actual time spent on activities (in order to improve estimates of required time as an input for planning);

  3. balance short-term and long-term optimization (quantifying the effect of an activity on long-term health goals).

In SWELL we have experimented with logging computer interaction in context in combination with physiological measurements in order to get a better grip on activities and their interaction with mental well-being.

3.2 Agents for supporting office work tasks and behavioral change

The idea of intelligent digital agents supporting humans was developed during the last few decades of the twentieth century. Maes (1994) reviews these early works. She makes a distinction between rule-based agents (which have to be created and activated by the user), knowledge-based agents containing a user model and a domain model, and agents based on machine learning techniques. The rule-based and knowledge-based agents have their roots in classical knowledge-based AI, and rely on a significant amount of work by a knowledge engineer. The machine learning based agent approach relies on the idea that an agent can learn from observing the user and perhaps from some form of feedback. This approach might be effective for simple tasks that have to be done frequently (such as sorting and prioritizing email). The metaphor for such an approach is a digital personal assistant that learns from observation and instructions. The presumed advantage is that such agents can learn and adapt over time.

Maes describes several prototypes based on this approach: news filtering and memory-based action prediction in email handling. While these prototypes at MIT have only been tested by a small set of users, many of us have experience with intelligent agents that were developed in the 1990s for use in the office. Examples are ‘Clippy’, the digital assistant for Microsoft Office, and Workpace, the digital assistant helping to avoid RSI. Both applications were received rather critically. Several reasons can be pointed out: Workpace did not take into account the work context of the user when timing wrist exercises, overruling the individual’s autonomy. Clippy did trigger on particular contexts and offered “help”. Horvitz et al. (1998) describe the research project ‘Lumiere’, in which personal agent prototypes for Clippy were developed that learn user intents and needs by observing interaction. The actual agent implementation in MS Office 1997 was far simpler and lost its ability to learn from feedback.

The disappointing experience with these early digital support agents in the office has demonstrated the importance of (a) respecting the autonomy of the user, (b) capturing context, (c) learning from user feedback, and (d) the potential of agent technology based on machine learning. The latter technology has advanced rapidly in recent years, driven by breakthroughs in deep learning and reinforcement learning.

3.3 Interpreting behaviour at work using non-intrusive sensors

Knowledge workers rely on software for communication, information gathering, document creation and work planning, so a vast collection of digital traces is left behind on their computer. These are available in the form of mouse motion, click events, key presses and active window changes. We hypothesized that these traces can be used to automatically infer what task a user is currently performing. In this way we automatically create a real-time overview of tasks for the user in an unobtrusive way.
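As a hedged illustration of how such digital traces could be turned into input for task inference, the sketch below counts event types per fixed-length time window. The event labels, the 60 s window and the function name are assumptions made for this example; they do not describe the feature set actually used in SWELL.

```python
from collections import Counter, defaultdict


def window_features(events, window_s=60):
    """Count event types per fixed-length time window.

    `events` is an iterable of (timestamp_in_seconds, event_type) pairs,
    where event_type is a hypothetical label such as 'key', 'click',
    'mouse_move' or 'window_change'.
    Returns {window_index: {event_type: count}}.
    """
    windows = defaultdict(Counter)
    for timestamp, event_type in events:
        windows[int(timestamp // window_s)][event_type] += 1
    return {index: dict(counts) for index, counts in sorted(windows.items())}


if __name__ == "__main__":
    trace = [(1.0, "key"), (2.5, "key"), (30, "click"),
             (61, "window_change"), (65, "key")]
    print(window_features(trace))
```

Each window's counts form a simple feature vector that a classifier could map to a task label.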

In the field of activity recognition, various computer activities can be automatically classified, for example activities in an adventure game (Albrecht et al. 1997) or filling in a form or planning a meeting (Rath et al. 2009). These activities have rather clear structures, involving predefined steps (Natarajan et al. 2008). Therefore, model-based classification is often applied, with logical models assuming a plan library (e.g. Goldman et al. 1999) or Markov models, modelling the sequence of actions in time (e.g. Albrecht et al. 1997). Moreover, most models have been applied to simple problems in a controlled environment. SWELL models have been tested in a less constrained environment.
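To make the idea of Markov models over action sequences concrete, the following generic sketch trains one first-order Markov chain per activity and assigns a new sequence to the activity whose chain gives it the highest likelihood. It is a textbook-style illustration, not a reconstruction of the models in the cited studies; the action symbols, activity labels and add-one smoothing are assumptions.

```python
import math
from collections import defaultdict


def train_markov(sequences, alphabet, smoothing=1.0):
    """Estimate first-order transition probabilities with additive smoothing.

    All symbols in `sequences` must occur in `alphabet`.
    """
    counts = {symbol: defaultdict(float) for symbol in alphabet}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + smoothing * len(alphabet)
        model[prev] = {
            nxt: (counts[prev][nxt] + smoothing) / total for nxt in alphabet
        }
    return model


def log_likelihood(model, seq):
    """Log-probability of the transitions in `seq` under `model`."""
    return sum(math.log(model[prev][nxt]) for prev, nxt in zip(seq, seq[1:]))


def classify(models, seq):
    """Return the label of the model that best explains `seq`."""
    return max(models, key=lambda label: log_likelihood(models[label], seq))


if __name__ == "__main__":
    alphabet = ["type", "click", "switch"]
    models = {
        "writing": train_markov(
            [["type", "type", "type", "click", "type"]], alphabet),
        "browsing": train_markov(
            [["click", "switch", "click", "switch", "click"]], alphabet),
    }
    print(classify(models, ["type", "type", "click"]))
```

Such models work well when activities follow predefined steps, which is exactly the assumption that becomes fragile in less constrained office work.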

In addition to the interaction with the computer, a worker’s activities can also be monitored using wearables. Within SWELL, a literature review of approaches to physical activity recognition was performed (Shoaib et al. 2015b).

Finally, several types of sensors can be used to estimate stress or arousal. Most often, body sensors are used to measure the physiological stress response directly. For a general overview of psycho-physiological sensor techniques we refer to Matthews et al. (2005). Most studies report on experimental environments, since this new technology is hardly deployed yet in work environments. For example, it is possible to perform MRI measurements, or invasive microdialysis to obtain a continuous measure of cortisol. However, we are interested in estimating stress in the workplace. Given the availability of cheap wearable sensors, ECG and GSR measurements are interesting candidates. Mokhayeri et al. (2011), for example, collected such data in the context of the Stroop color-word test. They state that pupil diameter and ECG have great potential for stress detection. Bakker et al. (2012) measured the skin conductance of 5 employees during working hours.

3.4 Personalized feedback and coaching

When the actual behaviour and health status as monitored by sensors differ from the desired situation as specified by a human, m-Health applications can serve as a motivational tool to achieve behavioural change. People may know that a particular behavior is good for them, and yet persist in their old behavior. Fogg (2002) identified three main hurdles that prevent people from performing the right or healthy behavior: lack of ability, lack of motivation and lack of a well-timed trigger. Interventions should be designed to address these hurdles. However, so far, the vast majority of work has focused on accurate objective monitoring of physical activity, and it remains unclear how to provide the most effective feedback based on these accurate measurements.

Op den Akker et al. (2014) categorize the ways of providing information to individual users of real-time, technology-supported physical activity applications into seven types of tailoring (feedback, inter-human interaction, adaptation, user targeting, goal setting, context awareness, and self-learning) and show that adaptation, i.e. tailoring of feedback based on individuals’ scores on constructs from behavioral sciences, is rarely applied in modern-day mobile physical activity applications. Physical activity interventions that are not technology-supported, however, frequently apply theories and models from the behavioral sciences that describe the constructs thought to underlie behavioral change, in order to determine the content of feedback and other information (Conner and Norman 2005). Based on this, it was concluded that much more insight is needed into strategies that mobile, technology-supported physical activity applications can use to support individuals effectively and in a personalized way in achieving behavioral change.

4 Methods and results: personalized well-being support tools

In this section we present the tools that we developed and evaluated in SWELL. First, in Sect. 4.1, we present the SWELL knowledge work dataset with multimodal registrations of work behaviour in a lab setting. In the remainder of the section we present a number of studies carried out in SWELL, including the results obtained in those studies. From the sense–reason–act framework, our focus is on reason and act, with special attention for personalization aspects. In Sect. 4.2 (‘reason’) we describe the methods that we developed for data analysis and the results obtained. In Sect. 4.3 (‘act’), we present three SWELL tools that provide feedback to the user based on recognized activities. In Sect. 4.4, we present the results of studies that particularly addressed personalization in mHealth applications.

4.1 The SWELL knowledge work dataset

For research on stress and user modeling, and for evaluating our methods, we collected a new multimodal dataset, the SWELL knowledge work (SWELL-KW) dataset. The data were collected in a lab setting, in which 25 people performed typical knowledge work (writing reports, making presentations, reading e-mail, searching for information). We manipulated working conditions with a number of stressors: email interruptions and time pressure. A heterogeneous set of data was recorded: computer interaction logging, facial expression from camera recordings (using Noldus Facereader software), body postures from a Kinect 3D sensor, and heart rate (variability) and skin conductance from body sensors. The users’ interactions with their computer were logged using the uLog software developed by Noldus (see Footnote 3). uLog is a key-logging tool that records the active application, window title, URL or file that is open, caption information, mouse movements, mouse clicks and keyboard activity with timestamps.
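To illustrate how timestamped interaction logs of this kind could be aggregated, the sketch below defines a hypothetical log record and sums the time attributed to each active application. The field names and the record layout are assumptions for the purpose of the example; they do not reflect the actual uLog output format or the structure of the SWELL-KW files.

```python
from dataclasses import dataclass


@dataclass
class LogEvent:
    """Hypothetical record; field names are assumptions, not the uLog format."""
    timestamp: float      # seconds since the start of the session
    application: str      # active application
    window_title: str
    event_type: str       # e.g. 'key', 'click', 'mouse_move'


def seconds_per_application(events, session_end):
    """Attribute the time between consecutive events to the active application.

    The interval after the last event runs until `session_end`.
    """
    ordered = sorted(events, key=lambda event: event.timestamp)
    next_times = [event.timestamp for event in ordered[1:]] + [session_end]
    totals = {}
    for event, next_time in zip(ordered, next_times):
        duration = next_time - event.timestamp
        totals[event.application] = totals.get(event.application, 0.0) + duration
    return totals


if __name__ == "__main__":
    log = [
        LogEvent(0, "Word", "report.docx", "key"),
        LogEvent(10, "Word", "report.docx", "key"),
        LogEvent(25, "Outlook", "Inbox", "click"),
    ]
    print(seconds_per_application(log, session_end=40))
```

A real pipeline would also need to handle idle periods, which this simple attribution rule counts towards the last active application.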

The participants’ subjective experience of task load, mental effort, emotion and perceived stress was assessed with validated questionnaires, and the responses are distributed together with the behavioral data. The resulting dataset is available to other researchers (Koldijk et al. 2014b; Sappelli et al. 2014; see Footnote 4). A recent study based on the SWELL-KW dataset confirmed the importance of posture for predicting self-assessed stress levels, and also showed that computer use patterns are an important predictor of frustration (arousal) (Alberdi et al. 2018a).

4.2 Reason: classifying work behaviour and activities

In this section, we discuss three methods for data analysis that we implemented in the SWELL project: visual analytics of work behavior data (Sect. 4.2.1), work behaviour classification (Sect. 4.2.2), and classification of physical activities (Sect. 4.2.3).

Fig. 3 A heat map of facial activity data of 2 participants. Different facial regions show changes in activity

4.2.1 Visual analytics of work behavior data

We want to provide knowledge workers with insight into their work behavior and how this relates to their well-being. Therefore, we explored visual analytics techniques to investigate how sensor data can be used to gain insight into work behavior, specifically related to stress at work. In an iterative approach, we combined automatic data analysis procedures with visualization techniques, to gain deeper insight into the data (Koldijk et al. 2015).

We found that of all the features we investigated, facial expression features were most closely related to mental effort. There are, however, many individual differences. By means of a heat map we were able to visualize meaningful patterns in facial activity for an individual user (see Fig. 3). The visualization was made more insightful by rendering facial expressions on an avatar (see Fig. 4), by-passing the issue of dealing with privacy of subjects. Finally, we identified several facial expressions that are typically related to a low or high mental effort (Koldijk et al. 2015). We conclude that facial expressions may be a promising measurable outward characteristic that can be visualized to indicate mental state patterns during work.

Fig. 4 HapFACS avatar displaying different facial expressions of one participant while working

The benefit of applying visual analytics to our problem, instead of a black-box machine learning approach, was that it gave us a deeper understanding of the structures in our data and insights from individual users’ data. The final aim is to develop a visualization system for individual users that gives insight into a large amount of their own behavioral data recorded with sensors. Visualizing such behavioral patterns may give users insight and actionable information.

4.2.2 Work behaviour classification

With the increasing amount of information knowledge workers have to handle, they can get overwhelmed easily: a phenomenon referred to as ‘information overload’ (Bawden and Robinson 2009). It has been shown in previous work that ‘working in context’ is beneficial for knowledge workers (Gomez-Perez et al. 2009; Warren 2013). Ardissono and Bosio (2012) have found that task-based and context-based filtering reduce the user’s workload. Thus, by recommending and highlighting information that is relevant to the context, while blocking information that is out-of-context, we help the user to stay focused on his current task.

For the purpose of modeling the user’s work activities in context, we have developed a novel model for context recognition and identification, based on the interactive activation and competition model (IA model) by McClelland and Rumelhart (1981). Our Contextual Interactive Activation model (CIA) has two main applications (Sappelli et al. 2016). In the first application the network recognizes and interprets what the user is doing, by interpreting keyboard and mouse actions, classifying the user’s activities into the projects the user is involved in. The knowledge worker can get an overview of when he worked on what project and for how long. This can support him in his time management and can simplify the task of hour tracking. Evaluation on the SWELL-KW dataset showed that our model can correctly identify a worker’s context (out of 8 different contexts) for 65% of the work time, which is significantly better than traditional classification methods using the same input data.
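To make the recognition step more concrete, the following is a minimal sketch of an interactive activation and competition network in the spirit of McClelland and Rumelhart (1981). It is not the CIA implementation: the node names, the link structure, and all parameter values (`A_MAX`, `A_MIN`, `REST`, `DECAY`, `STEP`) are assumptions chosen for illustration. Observed evidence nodes (for example an active application or a typed keyword) excite the project contexts they are linked to, while contexts inhibit each other.

```python
# Illustrative activation bounds and rates (assumed values, not from SWELL).
A_MAX, A_MIN, REST, DECAY, STEP = 1.0, -0.2, -0.1, 0.1, 0.1


def build_weights(evidence_links, contexts, excite=1.0, inhibit=-1.0):
    """Excitatory links between evidence and context, inhibition among contexts."""
    weights = {}
    for evidence, context in evidence_links:
        weights[(evidence, context)] = excite
        weights[(context, evidence)] = excite
    for a in contexts:
        for b in contexts:
            if a != b:
                weights[(a, b)] = inhibit
    return weights


def iac_step(act, weights, external):
    """One synchronous update of all node activations."""
    new_act = {}
    for node, a in act.items():
        net = external.get(node, 0.0)
        for (src, dst), w in weights.items():
            if dst == node:
                # Only positively activated nodes send output.
                net += w * max(act[src], 0.0)
        if net > 0:
            delta = (A_MAX - a) * net - DECAY * (a - REST)
        else:
            delta = (a - A_MIN) * net - DECAY * (a - REST)
        new_act[node] = min(A_MAX, max(A_MIN, a + STEP * delta))
    return new_act


def recognize(observed, evidence_links, contexts, steps=50):
    """Return the most active context and all activations after `steps` updates."""
    nodes = set(contexts) | {evidence for evidence, _ in evidence_links}
    act = {node: REST for node in nodes}
    weights = build_weights(evidence_links, contexts)
    external = {node: 1.0 for node in observed if node in nodes}
    for _ in range(steps):
        act = iac_step(act, weights, external)
    return max(contexts, key=lambda c: act[c]), act
```

For example, with hypothetical links such as `("app:ide", "project_code")` and `("app:word", "project_report")`, observing `"app:ide"` drives the `project_code` context up and suppresses `project_report` through the inhibitory links.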

4.2.3 Classification of physical activities

Apart from recognizing work contexts, we also developed models for recognizing physical activities. This is a first step in advising users on their physical activity patterns.

To recognize physical activities, we utilized mobile phone sensors. As a first step, we investigated the classification of seven human activities at five body positions (Shoaib et al. 2013, 2014b). These activities are walking, biking, jogging, walking upstairs, walking downstairs, sitting and standing. In (Shoaib et al. 2013, 2014b), we show that these activities can be recognized in a reliable way using mobile phone motion sensors.

Moreover, we evaluated the effect of combining various sensors, their position, sampling rate, data features, and classification algorithms on the recognition performance for these activities. Based on this analysis, we recommend guidelines on how and when to use various mobile phone sensors for better activity recognition. We also conducted an extensive survey on online activity recognition using mobile phones, in which we discuss various aspects of real-time recognition systems that are implemented on mobile devices (Shoaib et al. 2015a, b).
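The general pipeline behind such recognizers (segment the signal into windows, compute features per window, classify each window) can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of Shoaib et al.: the window length, the two features (mean and standard deviation of the acceleration magnitude), and the nearest-centroid classifier are all choices made here for brevity.

```python
import math
from statistics import mean, pstdev


def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def window_features(samples, window=50, step=25):
    """Mean and standard deviation of the magnitude per sliding window."""
    mags = [magnitude(s) for s in samples]
    features = []
    for start in range(0, len(mags) - window + 1, step):
        chunk = mags[start:start + window]
        features.append((mean(chunk), pstdev(chunk)))
    return features


def train_centroids(labelled_features):
    """labelled_features maps an activity label to a list of feature tuples."""
    return {
        label: tuple(mean(column) for column in zip(*features))
        for label, features in labelled_features.items()
    }


def classify_window(feature, centroids):
    """Assign the label of the closest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))
```

Sampling rate enters through `window` and `step` (50 samples correspond to one second at 50 Hz), which is one reason why the choice of sampling rate and features affects recognition performance.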

As follow-up on the study with mobile phone sensors, we combined motion sensors from mobile phones and wrist-worn devices to recognize relatively complex human activities such as smoking, eating, working on computer, writing, giving a talk, and having a cup of coffee (Shoaib et al. 2015c). We show that thirteen human activities including these complex ones can be recognized with a reasonable accuracy, ranging from 84% accuracy for talking to 99% accuracy for writing. These results contribute to detecting bad habits in our daily life such as smoking or lack of physical activity.

For reproducibility purposes, we have made our tool and collected datasets publicly available (Shoaib et al. 2014a).

Summary of SWELL methods for classifying work behaviour and activities

In this section we showed how we can recognize human activities and work behavior using a combination of non-obtrusive, affordable and easy-to-obtain sensors. We can distinguish thirteen different types of physical activities with reasonable accuracy using mobile phone sensors and wrist-worn devices; we can recognize mental state patterns from facial expressions recorded with a webcam; and we can recognize the tasks and topics the user is involved in using the user’s computer activities such as keystrokes and active applications.

4.3 Act: case studies on behavioral feedback and coaching through mHealth applications

In this section we discuss three types of tools that advise users on their behaviour. We present a smartphone application guiding the user with a novel heart-rate based training method (Sect. 4.3.1); we evaluate a personal e-coach giving advice on work behavior (Sect. 4.3.2); and we present a group feedback panel that mirrors work behavior on the department level (Sect. 4.3.3).

4.3.1 mBeats: Improving cardio-respiratory fitness through a smartphone app

Cardio-respiratory fitness is an important factor in the prevention of cardiovascular diseases. In order to improve cardio-respiratory fitness, one needs to engage in sufficient physical activity at an adequate intensity. Currently, only 24% of the Dutch adult population meets the recommended amount of physical activity required for a healthy cardio-respiratory fitness. Many people decide to enroll in the gym or take up sports to get into shape or stay fit. However, for various reasons, not everybody is able to maintain this commitment (DellaVigna and Malmendier 2006).

For those who want to become or stay fit, but fail to attend a gym regularly, connected fitness applications could be a solution. We tested the effectiveness of a novel heart-rate based training method, implemented in a smartphone application, mBeats. This method determines a personalized heart rate training zone for each individual and advises users to accumulate a certain number of target heart beats on a daily and weekly basis. The training method was implemented in a smartphone app, connected to a Mio Alpha heart rate watch. The mBeats app provides users with a personalized training program and continuous feedback based on their performance. For a detailed description of the mBeats app, we refer to Thomassen (2013).
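The idea of accumulating heart beats inside a personal training zone can be illustrated with a short sketch. The actual mBeats algorithm is described in Thomassen (2013) and is not reproduced here; the zone below is derived with the widely used heart rate reserve (Karvonen) formula and the rule-of-thumb maximum heart rate of 220 minus age, both of which are assumptions, as are the intensity bounds and the function names.

```python
def training_zone(age, resting_hr, low=0.6, high=0.8):
    """Heart rate zone in beats per minute (assumed Karvonen-style formula)."""
    max_hr = 220 - age  # rule-of-thumb estimate, not a measured value
    reserve = max_hr - resting_hr
    return resting_hr + low * reserve, resting_hr + high * reserve


def beats_in_zone(hr_samples, zone, sample_seconds=1.0):
    """Heart beats accumulated while the heart rate lies inside the zone.

    hr_samples holds heart rate readings (bpm) taken every sample_seconds.
    """
    lower, upper = zone
    return sum(
        hr * sample_seconds / 60.0
        for hr in hr_samples
        if lower <= hr <= upper
    )


def daily_progress(hr_samples, zone, daily_target, sample_seconds=1.0):
    """Fraction of the daily target reached, capped at 1.0."""
    return min(1.0, beats_in_zone(hr_samples, zone, sample_seconds) / daily_target)
```

Under these assumptions, a 40-year-old with a resting heart rate of 60 bpm gets a zone of 132 to 156 bpm, and one minute at 140 bpm contributes 140 beats towards the target, whether it comes from a gym session or from taking the stairs.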

To evaluate the effectiveness of the app, we compared three groups of participants who followed different training methods for two weeks. The first group used our method implemented in the mBeats smartphone application (app group). In the second group participants followed the same training method as offered by the app, but were supervised by a qualified trainer in a gym (gym group). Participants in the third group followed a program offered by Fitbit aimed at taking 10,000 steps per day (Fitbit group). In all three groups, we observed improvements in cardio-respiratory fitness and related health markers, such as blood pressure and heart rate (Rospo et al. 2016).

Thus, the mBeats app seems a promising and easy-to-use method to improve cardio-respiratory fitness, delivering results similar to those of existing methods. Moreover, in the app group these results were reached with a lower number of steps than in the gym group, suggesting that our training was more efficient. The main benefit of our novel method is that it is easily incorporated in daily life. Rather than having to go to the gym after work, office workers can accumulate target heart beats during their work day, by taking the stairs, taking a lunch walk or cycling to work. In the long run, this will likely lead to better adherence and better results than gym-based training.

4.3.2 Evaluating the Brightr app for health promotion at work

The Brightr app is an mHealth application developed for workers at a high tech company to improve their vitality. We evaluated version 1.0 of the Brightr app with two groups of users (de Korte et al. 2018a, b): a focus group of workers and a focus group of experts: behavioral scientists, sociology experts, ergonomists, designers and HCI researchers.

The objectives of this study were to gain insight into (1) the opinions and experiences of employees and experts on drivers and barriers using an mHealth app in the working context and (2) the added value of three different qualitative methods that are available to evaluate mHealth apps in a working context: interviews with employees, focus groups with employees, and a focus group with experts.

Employees of a high-tech company and experts were asked to use the Brightr app for at least 3 weeks before participating in a qualitative evaluation. Twenty-two employees participated in interviews, 15 employees participated in three focus groups, and 6 experts participated in one focus group. Two researchers independently coded, categorized, and analyzed all quotes yielded from these evaluation methods with a codebook using constructs from user satisfaction and technology acceptance theories (Vosbergen et al. 2014; Wixom and Todd 2005; Bailey and Pearson 1983).

Interviewing employees yielded 785 quotes, focus groups with employees yielded 266 quotes, and the focus group with experts yielded 132 quotes. Overall, participants expressed muted enthusiasm about the app. Combined results from the three evaluation methods showed drivers and barriers for technology, user characteristics, context, privacy, and autonomy. A comparison between the three qualitative methods showed that issues revealed by experts only slightly overlapped with those expressed by employees. In addition, the different types of evaluation yielded different results.

Findings from this study provide the following recommendations for organizations that are planning to provide mHealth apps to their workers and for developers of mHealth apps: (1) system performance influences adoption and adherence, (2) relevancy and benefits of the mHealth app should be clear to the user and should address users’ characteristics, (3) the app should take into account the work context, and (4) employees should be alerted to their right to privacy and use of personal data. Furthermore, a qualitative evaluation of mHealth apps in a work setting might benefit from combining more than one method. Factors to consider when selecting a qualitative research method are the design, development stage, and implementation of the app; the working context in which it is being used; employees’ mental models; practicability; resources; and skills required of experts and users (de Korte et al. 2018b).

4.3.3 Fishualization: group feedback at work

Fishualization (Schavemaker et al. 2014) is a novel intervention that maps human-computer activity data to a group feedback display on the basis of a combination of various types of unobtrusive, low-level sensors (Fig. 5).

Fig. 5

Fishualization display (including legend)

The goal of Fishualization is to give employees insight into their working habits and to stimulate social interactions, group awareness and openness, which are all beneficial for well-being at work. The Fishualization feedback display shows visualizations of the state or mood of an entire team of knowledge workers. One avatar (fish) is shown for each employee. Each avatar has three degrees of freedom that are controlled by aggregated sensor measurements of an individual team member:

  • The movements (fast or slow) are controlled by activity level;

  • The swimming direction changes upon a change of application focus;

  • The position (high or low) is controlled by mental energy.

‘Plants’ at the bottom of the screen represent a group of tasks of the same type, for example, writing e-mail, editing document, browsing, or preparing presentation. Activity level and focus are derived from logging keystrokes, mouse movement and application usage on user devices (see Sect. 4.2.2). Mental energy can be derived directly by asking users or from facial expression analysis (see Sect. 4.2.1) (Koldijk et al. 2013).
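The mapping from aggregated sensor measurements to the three degrees of freedom of an avatar can be sketched as below. This is a hypothetical illustration, not the Fishualization code: the `FishState` type, the normalization constant `max_events`, and the assumption that mental energy is already scaled to the range 0 to 1 are all introduced here.

```python
from dataclasses import dataclass


@dataclass
class FishState:
    speed: float      # 0.0 (slow) to 1.0 (fast), from activity level
    direction: int    # +1 swims right, -1 swims left
    depth: float      # 0.0 (low) to 1.0 (high), from mental energy


def clamp01(value):
    """Limit a value to the range [0, 1]."""
    return max(0.0, min(1.0, value))


def update_fish(previous, events_per_minute, focus_changed, mental_energy,
                max_events=300.0):
    """Derive the next avatar state from one employee's aggregated measurements.

    events_per_minute: keystrokes plus mouse events (assumed aggregate).
    focus_changed: True if the active application changed since the last update.
    mental_energy: self-reported or estimated value, assumed to lie in [0, 1].
    """
    speed = clamp01(events_per_minute / max_events)
    direction = -previous.direction if focus_changed else previous.direction
    depth = clamp01(mental_energy)
    return FishState(speed=speed, direction=direction, depth=depth)
```

Because only these three aggregated values reach the display, raw keystrokes and application names do not need to leave the user's device, which fits the privacy considerations discussed for this intervention.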

The Fishualization system has been evaluated in two trials at two different research labs (TNO, Thales) (Schavemaker et al. 2014). Both trials were conducted with subjects in small research groups (about 20 employees), with the Fishualization display placed centrally at the coffee corner. Different test criteria were measured and evaluated using pre- and post-test questionnaires and additional audio/video capture with microphones and cameras.

We found that the Fishualization stimulates more social interaction at the coffee corner, although not all new discussion is related to well-being at work. Furthermore, people are more aware of their working patterns and energy levels as well as those of others in the group. Most people find the system fun to use and have no objections against sharing aggregated data in such a way (privacy being a key factor for acceptance by employees).

Summary of case studies in behavioral feedback and coaching

In this section we investigated three case studies of SWELL applications that provide behavioral feedback and coaching. We presented the easy-to-use mBeats app that helps to improve cardio-respiratory fitness by effectively stimulating regular physical activity in the daily life of knowledge workers. We showed the importance of combining different evaluation strategies to elicit rich feedback for mHealth app development. And we presented a group feedback display to foster interactions between co-workers and awareness of work behaviour.

4.4 Personalization

In this section, we follow up on the personalization of mHealth and knowledge worker applications. We present the results of three studies addressing personalized coaching: an e-coach for tailored coaching during the work day (Sect. 4.4.1); a personalized mHealth intervention for improving cardiovascular health (Sect. 4.4.2); and personalized coaching aimed at improved self-efficacy (Sect. 4.4.3).

4.4.1 NiceWork: tailored lifestyle coaching during the workday

The e-coaching app NiceWork provides knowledge workers with tailored tips that support well-being at work (Wabeke et al. 2014). We used a hybrid recommender system to suggest tailored tips. The suggested tips are concrete actions that are expected to improve well-being at work.

In a study evaluating the effectiveness and usability of the NiceWork e-coach, users automatically received three tips per working day on their smartphone. By recommending tailored tips, the e-coach tried to motivate knowledge workers to enhance their coping abilities and improve their recovery from coping with high demands. Users and tips were modeled in an 8-dimensional real-valued vector space. Features categorize the goal type (prevention or awareness) and the type of the tip content (e.g. relaxation, time management, work conditions and food). The user profile vectors were adapted using feedback from the users while they used the app. In total, 35 knowledge workers participated in the study. They used the e-coach in their normal work environment for 2 weeks.
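A vector-space recommender of this kind can be sketched as follows. The sketch covers only a content-based part and is not the hybrid NiceWork recommender: the dimension names beyond those mentioned in the text are placeholders, and the cosine similarity ranking, the update rule, and the learning rate are assumptions made for illustration.

```python
import math

# Eight dimensions; the last two names are placeholders (assumed).
DIMENSIONS = [
    "prevention", "awareness", "relaxation", "time_management",
    "work_conditions", "food", "placeholder_7", "placeholder_8",
]


def cosine(u, v):
    """Cosine similarity of two equally long vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, tips, k=3):
    """Return the names of the k tips most similar to the user profile."""
    ranked = sorted(tips, key=lambda name: cosine(user, tips[name]), reverse=True)
    return ranked[:k]


def update_profile(user, tip_vector, followed, rate=0.2):
    """Move the profile towards a followed tip, away from a rejected one."""
    sign = 1.0 if followed else -1.0
    return [u + sign * rate * (t - u) for u, t in zip(user, tip_vector)]
```

With `k=3` this yields three tips per working day; a non-tailored baseline, as used in the comparison described next, would instead draw tips at random regardless of the profile.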

The study mainly focused on the effect of tailoring (see footnote 5). More specifically, we investigated whether tailored tips, generated by our recommendation method, had a higher chance of being followed up by a user than tips that are not adapted to the user’s preferences. It was hypothesized that tailored recommendations would be followed up more often than randomized recommendations.

The results of the study were promising, as they suggested that knowledge workers have a positive attitude towards the implemented e-coach. On the other hand, we did not find strong evidence for the effectiveness of our recommendation method, since tailored tips were followed up only slightly more often than tips that were not adapted to the user’s preferences (Wabeke et al. 2014).

4.4.2 Personalizing activity intervention programs

Information on fitness may provide useful input to stimulate behavioral change in knowledge workers, and to design activity intervention programs tailored to the needs of each individual user. This highlights the importance of repeated and unobtrusive assessment of the individual fitness level (Sartor et al. 2013).

Current methods to determine cardio-respiratory fitness are not suitable for everyday use. The gold standard technique to determine individual fitness level is the maximal oxygen uptake (VO2max) test. Due to the need for special equipment and supervision, this is not convenient for serial assessments. Other methods, which estimate VO2max based on heart rate using sub-maximal protocols are less accurate and impractical to be carried out independently by users (Sartor et al. 2013).

Our research focused on developing a protocol-free method to determine VO2max by context-aware data fusion using wearable sensors. We explored whether information on heart rate (HR) and acceleration from the wrist could be used to estimate fitness level without having to carry out any standard protocol. Firstly, we developed algorithms to discriminate between activity types and sport modalities to generate the proper contextual information for further interpreting motion and physiological data and assess fitness (Margarito et al. 2015). This challenge included the discrimination of wrist motion patterns for different types of activity and sports routines. Template matching strategies were applied to assess the correspondence between the wrist acceleration signal and a template base including examples for popular sports activities such as cycling, weight lifting, cross trainer exercise, rowing, running and walking.
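Template matching on a wrist acceleration signal can be illustrated with a sliding normalized correlation. The sketch below is an assumption-laden simplification, not the procedure of Margarito et al. (2015): it uses a single acceleration channel, one template per activity, and the Pearson correlation of the best-aligned window as the match score.

```python
import math
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equally long sequences (0.0 if degenerate)."""
    mean_a, mean_b = mean(a), mean(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return numerator / denominator if denominator else 0.0


def best_match_score(signal, template):
    """Highest correlation between the template and any window of the signal."""
    n = len(template)
    if len(signal) < n:
        return 0.0
    return max(
        pearson(signal[i:i + n], template)
        for i in range(len(signal) - n + 1)
    )


def match_activity(signal, templates):
    """Return the best-matching activity label and the score of every template."""
    scores = {
        label: best_match_score(signal, template)
        for label, template in templates.items()
    }
    return max(scores, key=scores.get), scores
```

Because the correlation is normalized, the score is insensitive to offset and amplitude, so the same template can match vigorous and gentle executions of a movement. Taking the maximum over windows favors short templates, which a practical system would have to compensate for.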

We evaluated to what extent the classification performance of our models was robust when moving from the cross-validation setting to a natural daily life setting. The classification performance was tested on data collected in daily life using a multiple-site accelerometer augmented with an activity diary for 20 healthy subjects. Leave-one-subject-out cross-validation of the training data gave a classification accuracy of \(95.1 \pm 4.3\%\). This decreased significantly in the daily life setting to \(75.6 \pm 10.4\%\) (\(p < 0.05\)). Thus, we found that cross-validation of training data overestimates the accuracy of the classification algorithms in daily life (Bonomi 2013; Gyllensten and Bonomi 2011).
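Leave-one-subject-out cross-validation holds out all data of one subject in turn, trains on the remaining subjects, and reports the mean and standard deviation of the per-subject accuracies. A minimal sketch is given below; the nearest-mean classifier on a single feature is a stand-in chosen for brevity (an assumption), and any `train`/`predict` pair with the same interface could be plugged in.

```python
from statistics import mean, stdev


def train_nearest_mean(examples):
    """examples: list of (feature, label); returns the mean feature per label."""
    by_label = {}
    for feature, label in examples:
        by_label.setdefault(label, []).append(feature)
    return {label: mean(values) for label, values in by_label.items()}


def predict_nearest_mean(model, feature):
    """Label whose mean feature value is closest to the given feature."""
    return min(model, key=lambda label: abs(feature - model[label]))


def leave_one_subject_out(data, train, predict):
    """data maps a subject id to a list of (feature, label) examples."""
    accuracies = {}
    for held_out, test_examples in data.items():
        if not test_examples:
            continue  # nothing to evaluate for this subject
        training = [
            example
            for subject, examples in data.items()
            if subject != held_out
            for example in examples
        ]
        model = train(training)
        correct = sum(predict(model, x) == y for x, y in test_examples)
        accuracies[held_out] = correct / len(test_examples)
    return accuracies


def summarize(accuracies):
    """Mean and sample standard deviation of the per-subject accuracies."""
    values = list(accuracies.values())
    return mean(values), (stdev(values) if len(values) > 1 else 0.0)
```

Note that this procedure still draws training and test data from the same recording conditions. Estimating daily-life performance requires applying the trained model to a separately collected free-living dataset, which is where the drop in accuracy reported above becomes visible.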

Our research thus addressed a classical limitation of typical machine learning approaches for activity classification, namely the poor reproducibility of laboratory accuracy in the free-living context. Once the contextual background is determined, physiological data as well as specific motion features are used to determine the fitness level of the user with an accuracy level adequate to personalize activity intervention programs and motivate users by following changes in cardio-respiratory fitness over time.

4.4.3 Personalization based on constructs from theories on behavioral change

Mobile health behavior change applications rarely personalize information/feedback based on constructs from behavioral change theories. Yet this is known to greatly contribute to the effectiveness of and adherence to traditional, non-mobile health behavior interventions (Free et al. 2013).

Our research into personalized behavioral change is grounded in the transtheoretical model (TTM). The TTM assumes that changing behaviour requires progress through five stages (Precontemplation, Contemplation, Preparation, Action, Maintenance) and that different cognitions may be of importance at different stages. We investigated the relations between self-efficacy, stage of change, and physical activity.

A total of 325 healthy control participants and 82 patients wore an activity monitor. Participants completed a self-efficacy or stage of change questionnaire. We found that higher self-efficacy is related to higher activity levels. Patients are less active than healthy controls and show a larger drop in physical activity over the day. Patients in the ‘maintenance’ stage of change are more active than patients in lower stages of change, but show an equally large drop in level of physical activity. Findings suggest that coaching should at least be tailored to level of self-efficacy, stage of change, and physical activity pattern (Achterkamp et al. 2018).

Based on these findings, a number of tailored coaching strategies were developed. These strategies were tested in a lab study to investigate how to increase self-efficacy in mobile technology-supported physical activity interventions (Achterkamp et al. 2015). Subjects were asked to walk from A to B in exactly 14, 16 or 18 s, wearing scuba fins and a blindfold. The task guaranteed an equal level of task experience among all subjects at the start of the experiment and made it difficult for subjects to estimate their performance accurately. This allowed us to manipulate feedback and success experience through technology-supported feedback.

We found that subjects’ self-efficacy regarding the task decreases when experiencing little success and that self-efficacy regarding the task increases when experiencing success. This effect did not transfer to level of self-efficacy regarding physical activity in general. The study shows that experiencing success is a promising strategy to use in technology-supported interventions that aim at changing behavior.

Findings on personalization and tailoring

Personalization of health interventions (e.g. lifestyle or workstyle coaching) is a vast area of study and we believe that personal, context-aware adaptive coaching is a key element of a successful strategy targeting behavioral change or more efficient and effective knowledge work. In this section, we showed three examples of personalization derived from studies in the SWELL project: (i) a recommender system for health tips at work, self-learning from interaction with the user; (ii) recognizing the fitness level of the user and following changes in his cardio-respiratory fitness in order to personalize the interventions on physical activity; and (iii) having the user experience success as a strategy to make activity interventions more effective.

These examples show that personalization and tailoring are important means to maximize effectiveness of an intervention, given the differences between individuals in a target group. We also want to stress that personalization and tailoring are important to maintain effectiveness and well-being of an individual, who is changing along the way, or whose mood and motivation are influenced by contextual and personal factors. We believe that the main impact of the SWELL research line on personalization is its capacity to measure mental and physical status in context. This can be achieved by measuring working behaviour and activity using non-obtrusive sensors, as discussed in Sect. 4.2.

5 Discussion

SWELL has been one of the early large research projects that aimed to take ideas from the ‘quantified self’ movement to the workplace. Its aim was to improve well-being at work by non-intrusive activity and health status monitoring as a basis for feedback and coaching. In this section we discuss the lessons learned and limitations of the project (Sect. 5.2). We first review a selection of relevant studies that were published during and after the SWELL project period 2011–2017 (Sect. 5.1).

5.1 Contemporary related work

A 2018 study showed that the economic costs of work-related stress to society are high, summing up to 187 billion in Europe in a rather conservative estimation (Hassard et al. 2018). The major part of the costs relate to productivity losses. This shows the continued relevance of stress reduction for society at large. The domain of mHealth apps for self-monitoring and behavioural change therefore is a growing area, for app builders as well as researchers.

The Interstress project

Similar to SWELL and contemporary with the SWELL project work, the EU FP7 project Interstress aimed at reducing stress at work. Where SWELL focused on knowledge workers as main target group, the Interstress project notably had teachers and nurses as focus groups. Like SWELL, the project experimented with unobtrusive stress sensing methods, such as Kinect 3D (Giakoumis et al. 2012). One novelty compared to SWELL is that in Interstress a stress intervention method based on virtual reality was developed and evaluated. It was investigated whether behavior in the physical world influences the experience in the virtual world and vice versa (Gaggioli et al. 2014).

Stress in knowledge workers

The work by Alberdi et al. (2018b) evaluates predictive models for mental workload and stress, based on unobtrusively and ubiquitously gathered smart office data. They use the SWELL Knowledge Work Dataset for Stress and User Modeling Research (SWELL-KW) (see Sect. 4.1) in their study. The results show that behavioral changes for stress and mental workload levels, as well as for change in workload conditions can be well predicted, which is in line with the SWELL findings. The authors stress the importance of self-reported scores’ standardization and the suitability of the NASA Task Load Index test for workload assessment. In addition, Alberdi et al. (2018b) state that methods and models are needed that “address the use of time-series statistics describing physiological and behavioral change over time” (p. 15), an aspect that had less emphasis in SWELL.

Sensing and predicting stress

Relevant related work on the topic of sensing stress is the Autosense project by Plarre et al. (2011). The project results indicate that respiration measured by a belt is a strong predictor for measurements of physical stress (as validated in the public speaking test).

In a more recent publication, Alharthi et al. (2019) propose an acute stress prediction system. A user’s stress status is predicted on the basis of current contextual data (ECG signals) gathered with a smartphone. The stress prediction algorithm is adaptive and personalized, and has 78% accuracy when evaluated on ground truth data. The system is evaluated in a real-life experiment with 5 subjects who used the application between 3 and 5 days. The personalized and adaptive prediction of one stressor class, namely acute stress, is more advanced than what has been done in SWELL. On the other hand, the focus in the work by Alharthi et al. (2019) is on the specific algorithm and its evaluation, and it does not have a system perspective as SWELL does. The scale of the experiment is small, so one should be careful with general conclusions.

Another small-scale study is presented by Betti et al. (2018). They implement a wearable physiological sensors system, based on ECG, EDA, and EEG, to capture human stress. The study investigates to what extent the detected changes in physiological signals correlate with changes in salivary cortisol level, which is a reliable, objective biomarker of stress. The system is evaluated with 15 participants using the Maastricht Acute Stress Test. The results of the study indicate that some physiological features clearly correlate with salivary cortisol levels, and could be used for predicting stress with 86% accuracy. The work by Betti et al. (2018) covers more detailed data analysis than has been done in SWELL, but the scope is more limited.

The work by Sarker et al. (2017) concerns a validation study to detect stress episodes in order to deliver just-in-time stress interventions. For the stress detection step, the signals of multiple wearable sensors are combined. Respiration bands are used as a carrier for the sensors. The stress detection method is trained and evaluated on ground truth data collected in a lab study with validated stress-inducing protocols such as public speaking and submerging a hand in ice-cold water. Compared to the work by Sarker et al. (2017), the SWELL sensors are less intrusive than respiration bands. Our results indicate that posture and facial expression are important biomarkers; these are however only applicable in stationary office settings. Our study also differs in the stress-inducing protocol used, which is perhaps more indicative of the type of stress at work. Sarker’s study and our SWELL study are both challenged by the differences in time scale of EMA measurements and continuous stress measurements. Recording exact timings for stress moments is informative for analysis, but these moments may not be suitable moments for just-in-time interventions.

In line with the SWELL studies on activity recognition and stress measurement using unobtrusive wearable sensors, the paper by Smets et al. (2018) presents the results of a large-scale study in which physiological and contextual measurements were collected through wearable devices and smartphones in a free-living context. 1002 subjects participated during a period of five consecutive days. The subjects also completed questionnaires on their perceived stress, measured on the Perceived Stress Scale (PSS). The results of the study show associations between wearable physiological signals and self-reported daily-life stress. Further analysis of the collected data indicated that physiological responses to stress strongly differ among subjects, distinguishing groups with small and large dynamic ranges of the physiological features. These results confirm the need for personalized models, as expressed in the SWELL project.

Behavioral intervention studies

Jaimes and Steele (2018) introduce an architecture for pervasive stress management and just-in-time intervention using smartphones and mobile sensing. They follow the same concept as in SWELL: the elements sense, reason and act are the same, but they go a step further in that the reasoning part is focused on predicting a future event. The main idea is to deliver the right intervention, or the right combination of interventions, at the right time, in order to maximize effectiveness while reducing the frequency of interventions. For this part they present a probabilistic model.

Slovak et al. (2017) argue that self-tracking is only a first step in the behavioral change process for reducing stress. In a transformative learning setting, a mentor should play an active role to bring about actual transformation. Choe et al. (2017) argue that purely manual or purely automatic tracking for self-monitoring is not effective. They recommend a combination of manual and automatic tracking, to diminish the burden of consistent self-tracking on the one hand and maintain a certain level of engagement and awareness on the other hand.

5.2 Lessons learned

Looking back at the original sense–reason–act motivation of the project, we must conclude that most of the project focus has been allocated to (i) reason, (ii) act (including personalized interventions) and (iii) non-functional requirements such as addressing privacy concerns. The reasoning part focused almost entirely on the interpretation of continuous heterogeneous multi-sensor data at task level (minutes–hours), without an analysis of patterns on a longer time scale.

Sense: Using low-cost non-intrusive sensors to inform about health/activity status

In SWELL, several studies were carried out comparing the predictive power of different modalities to monitor work activities (providing context), affect (workload, arousal), and physical activities. Keyboard and mouse interaction proved to be informative features for detecting changes in work task. Camera observation of workers in stationary settings proved useful for the detection of stress and increased workload. Especially facial expression and posture changes were informative. Finally, accelerometry and photoplethysmography embedded in wrist-worn devices and smartphones proved useful for activity recognition and VO2max estimation. We thus conclude that non-intrusive measurement of status and activities by wearable and stationary commodity sensors is possible. However, adequate reasoning techniques are necessary to be able to interpret the data.

Reason: mapping continuous raw sensor data to interpretable high level actions

We have gained initial evidence that self-lifelogging has the potential to improve the well-being of knowledge workers. Our experiments with innovative ways of monitoring stress and mental effort have revealed the potential of facial expression and posture. Although imaging technologies pose additional privacy challenges, we consider this an important result, since it is a departure from the more invasive methods based on measuring cortisol secretion or accumulation. We also found that the physical expression of stress differed across subjects. This means that interpreting raw sensor signals (such as 3D images and facial camera views) is more complex than assumed. These results also suggest that individual user models/profiles may help to improve the interpretation of raw sensor signals.

In addition, we have shown the potential of using the smartphone array of sensors for the classification of activities. We also showed that smart watch technology can be the basis for valid measurement of cardiovascular fitness. Finally, the CIA model for context recognition and identification has shown that computer activity can be linked to higher level projects and tasks automatically. The next step is to combine all these sensing technologies to automatically record a rich semantic summary of a work day, for the purpose of individual reflective practice. We have also gained some initial evidence that digital apps can in fact improve cardiovascular health, by focusing on the opportunities throughout the day to perform mild exercise and providing feedback about actual performance using wearable technology.

Act: Personalized feedback for improving mental and physical well-being

Personalization is an important field in Digital Health research roadmaps as well as in digital services. Looking at the sense–reason–act framework, personalization will probably have the largest impact on the reason and act layers. From a health perspective, personalization enables a more precise diagnosis, prognosis and treatment. Tracking personal analytics through advanced sensing ideally results in a personal digital model of the person involved, which is continuously updated with actual measurements. The actual reasoning (e.g. diagnosis, intervention planning, etc.) will most probably rely on established protocols. Personalization also has a large potential for the act/feedback/coaching step. Using behavioral change models to personalize coaching is important for making it more effective. In particular, stage of change (varying from pre-contemplation to maintenance) and self-efficacy are important constructs to take into account. Tailoring this step to a specific person is a recognized theme in health communication. SWELL research showed that tailoring coaching to the level of self-efficacy and cardiovascular fitness is a promising strategy.

Limitations

The project was scoped by several design choices:

  • Focus on empowerment, so taking the perspective of employees, not employers.

  • Focus on interdisciplinary socio-technical approach, respecting user autonomy and privacy.

  • Restrict data collection to ‘office hours’ and workplace settings (motivated by privacy).

  • Focus on component technologies; large scale evaluation of integrated prototypes was out of scope.

  • Focus on method/technology development, not on large scale intervention or observational study.

Especially the third decision probably restricted the data collection too severely. During the project several pilots were done with 24/7 data gathering. A large scale data collection across a longer period of time was beyond the scope of the plan, which prohibited a proper extrinsic validation of integrated apps. Still, controlled lab studies did provide initial evidence and feasibility of several of the ideas driving the project. An important factor impeding longitudinal studies was that collecting tracking data on a large scale in a real life office setting proved to be a big challenge given privacy concerns, worker regulations and security policies. A more longitudinal SWELL study (ten whole working days) was eventually conducted in the lab, without a stress inducing protocol (Brouwer et al. 2018). We found associations between experienced emotion and heart rate, facial expression, keylogging and task switches. The fine-grained EMA measurement protocol had strong adherence, showing the potential for future studies.

The fact that the project was focused on a use case of individual empowerment through reflective practice (mirroring) is probably limiting the impact of the intervention. In order to interpret self measurements, it is almost imperative to compare indicators to a larger peer population. This implies that users of the digital e-coach should eventually also contribute their (aggregated and anonymized) data in order to compute peer group averages. The topic of privacy respecting analysis is further developed in the Personal Health Train initiative (Footnote 6).
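To make the idea of computing peer group averages without exposing individual measurements more concrete, the following is a minimal sketch of additive secret sharing, one common building block of privacy respecting analysis. It is an illustration only and is not the method used in SWELL or in the Personal Health Train; the function names, the number of parties, the modulus and the example values are all assumptions made for this sketch.

```python
import secrets

# Public modulus (an assumption of this sketch); all share arithmetic is done
# modulo this value, so the true total must stay below it.
MODULUS = 2**61 - 1


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split a non-negative integer into n additive shares summing to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def peer_group_mean(values: list[int], n_parties: int = 3) -> float:
    """Compute the mean of the users' values from shares only."""
    # Each user splits their own value and sends one share to each party,
    # so a single party only ever sees uniformly random numbers.
    per_party: list[list[int]] = [[] for _ in range(n_parties)]
    for value in values:
        for party, share in zip(per_party, make_shares(value, n_parties)):
            party.append(share)
    # Each party publishes only the sum of the shares it received.
    partial_sums = [sum(party) % MODULUS for party in per_party]
    # Combining the partial sums reveals the group total, not individual values.
    total = sum(partial_sums) % MODULUS
    return total / len(values)


if __name__ == "__main__":
    daily_steps = [4200, 7800, 10150, 5600]  # made-up example values
    print(peer_group_mean(daily_steps))
```

In this sketch the parties could be, for instance, independent servers; as long as not all of them collude, only the group average becomes known. A real deployment would additionally need secure channels, authentication and protection against dishonest participants.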

It is clear that longitudinal field studies with a larger test population are needed to get better insight into the effectiveness of pervasive well-being e-coaches. However, the regulations concerning data protection and medical ethical approval committees make it rather costly to perform such comprehensive studies. Also, from a methodological point of view, it will be rather difficult to run large scale double blind RCTs with assistive technology. Another promising line of research that aligns well with the SWELL vision is to use self monitoring in a self-experimentation framework, using single case designs with randomization tests (Karkar et al. 2016).
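As an illustration of what a randomization test in a single case design can look like, the sketch below compares an outcome measured on days randomly assigned to a baseline condition ("A") or an intervention condition ("B"). It is a generic Monte Carlo randomization test under the assumption of a completely randomized design with a fixed number of days per condition; it is not claimed to be the procedure of Karkar et al. (2016), and the function name and data are hypothetical.

```python
import random
from statistics import mean


def randomization_test(outcomes, conditions, n_permutations=10000, seed=0):
    """Monte Carlo randomization test for the difference in means (B minus A).

    outcomes:   one measurement per day for a single person
    conditions: "A" or "B" per day, as assigned by the randomization scheme
    Returns the observed difference and a two-sided p-value.
    """
    def mean_difference(labels):
        a = [y for y, c in zip(outcomes, labels) if c == "A"]
        b = [y for y, c in zip(outcomes, labels) if c == "B"]
        return mean(b) - mean(a)

    observed = mean_difference(conditions)
    rng = random.Random(seed)
    labels = list(conditions)
    extreme = 0
    for _ in range(n_permutations):
        # Re-draw an assignment that could have occurred under the same design
        # (shuffling keeps the number of A and B days fixed).
        rng.shuffle(labels)
        if abs(mean_difference(labels)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical self-reported energy scores over eight working days.
    scores = [5.0, 6.0, 5.5, 5.0, 7.0, 7.5, 6.5, 7.0]
    assignment = ["A", "B", "A", "A", "B", "B", "A", "B"]
    diff, p = randomization_test(scores, assignment)
    print(f"observed difference: {diff:.2f}, p-value: {p:.3f}")
```

The key design point is that the reference distribution is generated from the randomization scheme itself, so no assumptions about the distribution of the outcome are needed; designs with restricted assignment (for example, a limit on consecutive days in the same condition) would require sampling only from the admissible assignments.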

6 Conclusion and future work

The SWELL project addressed the challenge of developing models and applications to improve physical and mental well-being at work and reduce information overload in a multidisciplinary setting. In this section we will revisit the research questions from the introduction.

  1. RQ1

    To what extent is it possible to recognize (sense and classify) physical human activities and work behavior using a combination of non-obtrusive, affordable and easy-to-obtain sensors?

We showed how we can recognize human activities and work behavior using a combination of non-obtrusive, affordable and easy-to-obtain sensors. We can distinguish thirteen different types of physical activities with reasonable accuracy using mobile phone sensors; we can recognize mental state patterns from facial expressions recorded with a webcam; and we can recognize the tasks and topics the user is involved in using the user’s computer activities such as keystrokes and active applications.

  2. RQ2

    How can we design personalized feedback and coaching provided by mHealth applications to knowledge workers using theories of behavioral change and make these tools more context aware?

We presented an easy-to-use app that helps to improve cardio-respiratory fitness by effectively stimulating regular physical activity in the daily life of knowledge workers, and a group feedback board that successfully gives an entire team insight in their working patterns. For lifestyle coaching during the workday, we could not prove that tailoring to user preferences was more effective than generic advice, but all presented mHealth applications were judged as fun and easy to use by the subjects.

We showed three examples of how the feedback provided by mHealth applications and information systems can effectively be personalized: recognizing the fitness level of the user and following changes in their cardio-respiratory fitness in order to personalize the interventions on physical activity; having the user experience success as a strategy to make activity interventions more effective; and personalizing query suggestions to make professional information search more efficient.

We think the SWELL project has been a timely contribution to the multidisciplinary field of digital health apps. There is a clear tendency towards increased reliance on self-management for health issues such as chronic diseases. Digital tools can help to empower individual employees to increase their insight into risk factors for reduced physical and mental health. The high prevalence levels of obesity and burn-out underline that there is a societal challenge. Even if it is clear that digital health apps cannot solve those issues completely, they might be a suitable tool embedded in a larger intervention.

During the course of the project, several generations of smart watches and thousands of health apps have been introduced on the market. Large technology vendors and their CEOs have announced that big data approaches may have a profound impact on the health sector. It is clear that there are business opportunities; however, we believe that digital health technology should serve and empower individual citizens, workers and patients in the first place.

Future research

Of course, each individual study concisely presented in this paper gives rise to follow-up questions. In this section we list several larger open research avenues that we have identified.

  • Longitudinal study. The full value of many of the described technologies can only be quantified and validated through a large scale user trial over a longer period, where interventions can have their effect and the well-known effect of declining adherence needs to be faced.

  • Physical/mental/social. Although SWELL may be unique in addressing mental and physical well-being, the interaction of both dimensions has been kept outside of consideration. We believe that a study on the interaction may reveal rather interesting personal mechanisms. It might also be good to measure and model social factors.

  • Privacy respecting analysis. We did find that potential users are sometimes rather reluctant to use the tracking functionality of mHealth apps, since it is unclear where the personal data is stored and who has access to the data. It is also becoming more common that data is stored in different repositories, making it difficult to learn from the combined data. Secure multiparty computation or distributed learning techniques may be of help, supporting analysis of distributed data without providing access to the original data (Veeningen et al. 2018).

  • Aggregation/peer groups. If the technology permits aggregation of user behavior data at a population level and users give consent, there may very well be a larger potential for delivering the best advice. An accurate personalized prediction of a future health state for different types of lifestyles may very well be a compelling instrument for behavioural change. At the same time, people increasingly request better transparency and explainability of such predictive models.

  • Learning from feedback. Feedback from the user may also be an effective basis for learning a coaching strategy.

  • Organisation versus individual. It is clear that employee and employer both have a keen interest in improving the health of employees at risk. However, since there is an asymmetrical relation, it is important that collected information is only shared from employee to employer in a manner that respects legal and ethical constraints. Designing an infrastructure that both meets these constraints and supports a viable business model is a business as well as a technical challenge.

It is also clear that society and technology are moving on. Digital voice or chat assistants are becoming increasingly popular and might have potential applications for mHealth and E-health goals such as studied in SWELL. We expect that the increasing powers of AI based assistants can strengthen the reasoning component of SWELL, which needs a strong ramp up in order to realize the original vision.