“What we measure affects what we do; and if our measurements are flawed, decisions may be distorted.”

- Stiglitz, Sen and Fitoussi (2009, p. 7)

In light of changes in the conditions and nature of work, along with wider appreciation of the importance of social responsibility, organizations and consultancy firms have taken a serious interest in worker well-being (Scott and Spievack 2019). Indeed, an article in Forbes magazine on the human resources (HR) trends of 2020 suggests that worker well-being should be HR’s top priority, explaining, “Many companies concerned about the future of work focus on the massive disruption of jobs, automation, and workforce demographics. All of this is important but as HR leaders we need to start with making worker wellbeing a priority in 2020!” (Meister 2020). The current workplace wellness market is worth more than $45 billion and is projected to grow in the decades to come (Allied Market Research 2020; Global Wellness Institute 2016). A lot of buzz surrounds worker well-being.

Numerous good reasons support widespread interest in worker well-being. The Forbes article highlights the purported role of worker well-being in workforce resilience and healthy organizational culture. Indeed, worker well-being may be an indicator of organizational ethics (Giacalone and Promislo 2010), and it has been found to predict other key indicators of organizational performance (Salas et al. 2017; Taris and Schaufeli 2015), such as productivity (Bellet et al. 2019; Oswald et al. 2015), absenteeism (Kuoppala et al. 2008), job performance (Judge et al. 2001) and voluntary turnover (Judge 1993; Wright and Bonett 2007; Wright and Cropanzano 1998). In addition to all of these ways in which worker well-being may be instrumentally valuable for advancing organizational objectives, worker well-being has great intrinsic value. Among the many things that might be thought to be good in themselves, human well-being is perhaps the one object most highly regarded as such (Aristotle, 350 B.C.E.; Mill 1859; Raz 1986; Sidgwick 1874). In sum, for many different reasons, the well-being of workers (and anyone else) is well worth pursuing.

Interest in worker well-being is not limited to practitioners in organizations; academic researchers have also paid considerable attention to the subject (Chen and Cooper 2014; Zheng et al. 2015). Over many decades, a rich and mature field of research has emerged, with thousands of psychological studies that conceptually and empirically study worker well-being constructs such as job satisfaction (Judge et al. 2017) and engagement (Macey and Schneider 2008; Purcell 2014). More recently, researchers from fields outside the psychological sciences have started to embrace the topic, including economics (Bryson et al. 2013; Golden and Wiens-Tuers 2006; Oswald et al. 2015), information systems (Gelbard et al. 2018; Jung and Suh 2019) and machine learning (Lawanot et al. 2019; LiKamWa et al. 2013). However, buzz about worker well-being, enthusiasm for new programs to promote it and interest in researching it have not been accompanied by universal enthusiasm for scientific measurement on the work floor. Hence, there remains a gap between the buzz surrounding worker well-being and the science needed to support it. Pushes to research and influence worker well-being without careful scientific measurement may be ineffective (Bartels et al. 2019). Even worse, these endeavors may be genuinely problematic: If researchers conceptualize or measure worker well-being inadequately, a scientific study may impede rather than advance the science that surrounds it (Podsakoff et al. 2016). If an organization touts purported improvements in well-being when, in fact, there has been no real improvement, it amounts to a case of “ethics washing” (Bietti 2020; Wagner 2018), and may hide the need for actual meaningful improvement.

We believe that the gap between the burgeoning psychological science of worker well-being and the buzz around it in other domains is caused by the complexity of worker well-being itself and the vast array of approaches to measuring it, combined with the variety of goals stakeholders may have for studying it. For many, it can be difficult to choose, let alone confidently justify, the selection of a particular research strategy for studying worker well-being. The primary goal of this paper is to help close the gap by offering a conceptual overview of the science of worker well-being and practical guidance for leveraging it in light of the particular objectives motivating the study of worker well-being.

This work will be useful for researchers of various stripes. First and foremost, this work will be relevant for research practitioners in organizations and academics outside psychological sciences. After all, it is not straightforward to move from intuitions about the need to pay more attention to worker well-being to adequate conceptualization and rigorous measurement. Insufficient scientific rigor prevents policy and research initiatives from being as relevant as they could be. In addition, even experienced psychological researchers who have been administering well-being surveys – currently still the preferred instrument for measuring well-being (Nave et al. 2008) – for years may benefit from a synthesis of conceptual approaches and an enlargement of their inventory of approaches to measurement. As most psychologists are trained primarily in classic psychological methods (Aiken et al. 2008), a foray outside their comfort zone that updates them on the methodological developments across other fields may prove useful. Inspiration to use new, innovative measures helps researchers to address calls for increased attention to the construction of better well-being measures (Brulé and Maggino 2017; Diener 2012; Schneider and Schimmack 2009) and facilitates collaborative interdisciplinary research.

We build on prior work that offers direction through “the conceptual jungle that currently characterizes the employee wellbeing literature” (Mäkikangas et al. 2016, p. 62). For example, Johnson et al. (2018) and Zheng et al. (2015) offered conceptual overviews on employee well-being and provided a handful of examples of validated survey instruments that can be readily used. Focusing on particular well-being constructs, other academics have reviewed existing traditional survey measures (Cooke et al. 2016; Roscoe 2009; Schaufeli and Bakker 2010; Van Saane et al. 2003; Veenhoven 2017), non-survey measures (Luhmann 2017; Rossouw and Greyling 2020), or both (Diener 1994, 2012). Going beyond both disciplinary and construct borders, other academics have concentrated on the promise of certain devices (e.g., wearable devices, Chaffin et al. 2017; Eatough et al. 2016), and measure categories for measurement of psychological constructs in general (Ganster et al. 2017; Luciano et al. 2017). A commonality among these works is that each focuses on specific instruments or constructs. Such specificity is both a blessing and a curse. It is helpful for researchers wanting an overview of the state-of-the-science of a particular instrument (e.g., the use of physiological measures in organizational science) or construct (e.g., survey measures of job satisfaction), but of limited use for readers interested in the bigger picture. In our work, we therefore offer a comprehensive field guide, which we hope will have broad appeal. Notably, despite its broad scope, our work is not meant as an exhaustive overview, but rather as an illustrative synthesis that maps the lay of the land and directs researchers to more specialized research. We structure our synthesis around three research questions:

  1. What is worker well-being?

  2. How can worker well-being be measured?

  3. How should a worker well-being measure be selected?

We will address the first question by offering a rationale about how to think about the concept of worker well-being and proposing a construct taxonomy that researchers can draw from to operationalize the concept of worker well-being. In doing so, we intend to disentangle the conceptual jungle that we find in the current literature. The second question will be addressed by creating an illustrative overview of measures for ten constructs that fall under the conceptual umbrella of worker well-being: life satisfaction, dispositional affect, moods, emotions, psychological well-being, job satisfaction, dispositional job affect, job moods, job emotions and work engagement. Looking beyond disciplinary borders, we will show that innovative, non-survey measures show promise for measuring worker well-being and, thereby, hopefully inspire researchers to enrich their methodological toolboxes. The third question will be answered by reviewing different conceptual, methodological, practical and ethical considerations for selecting a measure and doing so in ways that are responsive to the motivations driving researchers and practitioners to take an interest in worker well-being. These considerations are summarized into a checklist.

What Is Worker Well-Being?

Worker Well-Being and Related Concepts

We assume that worker well-being, at the most inclusive level, comes down to the general well-being of working people. To ensure clear conceptual boundaries, it is useful to differentiate worker well-being from concepts that relate to it. Worker well-being differs from employee well-being, as not all working people are employed by organizations, e.g., volunteers, independent contractors, executives and business owners. Even though most well-being constructs are relevant for both employees and non-employed working people, there may be some exceptions. For instance, the construct of satisfaction with pay will be inapplicable to volunteers. Satisfaction with co-workers and satisfaction with supervisor will likely be irrelevant concepts for independent contractors. Worker well-being differs from work-specific well-being, as constructs falling under that conceptual umbrella have their origin and application distinctively within the work context. For example, the construct of satisfaction with colleagues has its origin in the work context. Work-specific well-being can nonetheless manifest both within and outside the work context, e.g., a worker can also feel contented about social relationships at work while at the dinner table or before going to bed, which can impact other parts of worker well-being. Worker well-being also differs from well-being at work, as this concept merely concerns the experience or state of well-being in the work setting or when working. Notably, the source of well-being at work can be unrelated to work. Workers could, for instance, be contemplating fights with their spouses or reliving a fun weekend while being at work. Finally, worker well-being differs from general individual-level well-being, as, in contrast to general individual-level well-being, it pertains specifically to the lives and experiences of working people.

A Taxonomy of Worker Well-Being Constructs

Many constructs have been proposed to operationalize the concept of worker well-being. We propose a theory-driven construct taxonomy that can be used to categorize constructs and map construct boundaries. We have drawn on eight other conceptual works on worker well-being (i.e., C. D. Fisher 2014; Ilies et al. 2007; Johnson et al. 2018; Page and Vella-Brodrick 2009; Taris and Schaufeli 2015; Warr 2012; Warr and Nielsen 2018; Zheng et al. 2015) to do this (Footnote 1). We constructed our taxonomy along four dimensions: (i) philosophical foundation, (ii) temporal stability, (iii) scope and (iv) valence (Footnote 2).

First, researchers have been adopting different philosophical foundations for conceptualizing well-being (Forgeard et al. 2011; Kashdan et al. 2008) and worker well-being (Taris and Schaufeli 2015). Among the most prevalent are the philosophical traditions of hedonia and eudaimonia (Linley et al. 2009; Ryan and Deci 2001). The hedonic approach regards well-being as subjective experience of happiness (Diener et al. 1999; Veenhoven 2000); the eudaimonic approach focuses on the realization of human potential (Ryff 1989a; Ryff and Keyes 1995). The classification of constructs on the hedonic and eudaimonic continuum is not an easy task, because the different philosophical traditions are partially overlapping (C. D. Fisher 2014; Waterman 2008) and also empirically related (Linley et al. 2009; Pancheva et al. 2020). We categorize a construct as eudaimonic if intrinsic motivation, activation, purpose and meaningfulness are at its core (Ryan and Deci 2001). However, it is important that researchers acknowledge that a eudaimonic construct often contains a hedonic component.

Second, a classification can be made based on constructs’ temporal stability (Johnson et al. 2018; Mäkikangas et al. 2016). Well-being researchers have developed state-like and trait-like well-being constructs (C. D. Fisher 2014). State-like constructs are characterized by high variability over time due to high state variance, whereas trait-like constructs are characterized by greater stability over time (Schimmack et al. 2010). Some state-like constructs are truly momentary and last for a few minutes at most, while others remain somewhat stable (Kashdan et al. 2008). Some traits are inherited and are unlikely to change over a lifetime, while others are subject to some change over months or years (Johnson et al. 2018).

Third, two levels of scope of worker well-being constructs can be distinguished: context-free and domain-specific constructs (Ilies et al. 2007). Context-free constructs concern the worker’s life and experience in general, whereas domain-specific well-being constructs concern well-being within particular life domains (e.g., work, leisure, health, finance). Context-free and domain-specific (especially work-specific) constructs capture the bigger picture and subtleties of worker well-being, respectively (Page and Vella-Brodrick 2009).

Fourth, the valence of a construct can be considered. Some constructs are indicators of ill-being or the absence of well-being (e.g., burnout, stress, workaholism, negative affect), whereas others are indicators of well-being (e.g., work engagement, flow, job satisfaction, positive affect). Intuitively, the realization of constructs with positive valence is desirable, while the realization of those with negative valence is undesirable.

To illustrate, we describe eight worker well-being constructs that together span the breadth of the taxonomy (Footnote 3). In light of its broad scope and alignment with our understanding of worker well-being, we build on Page and Vella-Brodrick’s (2009) Framework of Employee Mental Health. It revolves around three concepts: subjective well-being (SWB), psychological well-being (PWB) and workplace well-being (WWB). As made explicit by Page and Vella-Brodrick, the model does not include eudaimonic WWB constructs. However, we have included work engagement as a eudaimonic WWB construct. The constructs and their categorization are summarized in Table 1. Table 1 also contains a brief characterization based on the academic literature surrounding the individual constructs.

Table 1 Worker well-being constructs and their categorization

Subjective Well-Being

SWB encompasses diverse aspects of people’s evaluations of how their lives are going (Diener et al. 1999). Life satisfaction, the cognitive evaluation of satisfaction with life circumstances, is a trait-like, context-free, positive well-being construct (Diener et al. 1999). Affect, “people’s on-line evaluations of the events that occur in their lives” (Diener et al. 1999, p. 277), is constituted by both trait-like and state-like components, which can vary in their valence as well as their degree of arousal (active vs. passive, Barrett and Russell 1999). Some aspects of a person’s affect are relatively stable over time. Accordingly, dispositional affect is a trait-like construct and has been defined as “durable dispositions or long-term, stable individual differences that reflect a person’s general tendency to experience a particular affective state” (Gray and Watson 2007, p. 172). Other affect-related constructs within SWB follow a fluctuating course and classify as state-like (Gray and Watson 2007). For instance, moods are emotional states that can last days or even a week, occur relatively frequently, have nonspecific triggers and manifestations (e.g., positive mood), and are primarily manifested in behavior and subjective experiences of people. Emotions can last seconds to, at most, a few minutes, are intense, occur infrequently, have specific triggers and manifestations (e.g., anger, joy), and are manifested in different forms, e.g., behavior, subjective experiences, brain activity, and physiological response (Gray and Watson 2007).

Psychological Well-Being

Although its various aspects can be studied individually, we treat PWB as a single construct concerning the “formulations of human development and existential challenges of life” (Keyes et al. 2002, p. 1007). PWB is often represented by Ryff’s (1989b) six-factor model, including self-acceptance, personal growth, purpose in life, positive relations with others, environmental mastery, and autonomy. PWB is grounded in the eudaimonic well-being tradition, and is a trait-like, context-free, positive well-being construct (Page and Vella-Brodrick 2009; Ryan and Deci 2001).

Workplace Well-Being

Within WWB, we consider the constructs of job satisfaction, dispositional job affect, job emotions, job moods and work engagement. Job satisfaction can be defined as “a positive (or negative) evaluative judgment one makes about one’s job or job situation” (H. M. Weiss 2002, p. 175). Job satisfaction is a domain-specific, hedonic and trait-like construct (Bowling et al. 2005, 2010; C. D. Fisher 2014). As such, job satisfaction is the work-specific counterpart to the context-free life satisfaction construct we described above (Footnote 4). Dispositional job affect, job moods and job emotions are equivalent to context-free conceptions of dispositional affect, moods and emotions, except for their narrower, work-specific focus. For example, we could be narrowly interested in a worker’s general affect while working (dispositional job affect) or more broadly interested in the worker’s general affect across life domains (dispositional affect, Ilies and Judge 2004). In contrast to these hedonic constructs, work engagement is a eudaimonic construct (C. D. Fisher 2014) concerned with how workers experience the exercise of their capacities at work. Work engagement has been defined in various ways, but is generally described as a domain-specific construct characterized by high levels of identification with work, positive affect, enthusiasm and energy (Bakker et al. 2008) and is theoretically distinct from other constructs, such as job satisfaction and organizational commitment (Schaufeli and Bakker 2010). Work engagement could be defined as “harnessing of organization members’ selves to their work roles: in engagement, people employ and express themselves physically, cognitively, emotionally and mentally during role performances” (Kahn 1990, p. 694) and “a positive, fulfilling, work-related state of mind that is characterized by vigor, dedication, and absorption” (Schaufeli et al. 2002, p. 74). Work engagement turns out to be relatively stable over time (Seppälä et al. 2015), hence its classification as trait-like (Footnote 5).
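For readers who want to work with the taxonomy programmatically, the categorization described in the three subsections above can be written down as a small data structure. The following is a minimal sketch under stated assumptions, not a reproduction of Table 1: the class and function names are hypothetical, the assignments are read off the prose, and the hedonic label for the four SWB constructs is our reading of the hedonic tradition discussed earlier. Valence is left out because the affect constructs can take either valence.

```python
from dataclasses import dataclass
from enum import Enum


class Foundation(Enum):
    HEDONIC = "hedonic"
    EUDAIMONIC = "eudaimonic"


class Stability(Enum):
    STATE_LIKE = "state-like"
    TRAIT_LIKE = "trait-like"


class Scope(Enum):
    CONTEXT_FREE = "context-free"
    DOMAIN_SPECIFIC = "domain-specific"


@dataclass(frozen=True)
class Construct:
    """One worker well-being construct placed on three taxonomy dimensions."""
    name: str
    foundation: Foundation
    stability: Stability
    scope: Scope


H, E = Foundation.HEDONIC, Foundation.EUDAIMONIC
STATE, TRAIT = Stability.STATE_LIKE, Stability.TRAIT_LIKE
FREE, WORK = Scope.CONTEXT_FREE, Scope.DOMAIN_SPECIFIC

# Assignments follow the prose of this section (an assumption, not Table 1 itself).
CONSTRUCTS = [
    Construct("life satisfaction", H, TRAIT, FREE),
    Construct("dispositional affect", H, TRAIT, FREE),
    Construct("moods", H, STATE, FREE),
    Construct("emotions", H, STATE, FREE),
    Construct("psychological well-being", E, TRAIT, FREE),
    Construct("job satisfaction", H, TRAIT, WORK),
    Construct("dispositional job affect", H, TRAIT, WORK),
    Construct("job moods", H, STATE, WORK),
    Construct("job emotions", H, STATE, WORK),
    Construct("work engagement", E, TRAIT, WORK),
]


def select(constructs, **criteria):
    """Return the names of constructs whose fields equal every given criterion."""
    return [
        c.name
        for c in constructs
        if all(getattr(c, field) == value for field, value in criteria.items())
    ]


if __name__ == "__main__":
    print(select(CONSTRUCTS, foundation=E))
    print(select(CONSTRUCTS, stability=STATE, scope=WORK))
```

Filtering on several dimensions at once, as in the second call, is one way to shortlist candidate constructs for a given research question.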

How Can Worker Well-Being Constructs Be Measured?

Measure Classification

Constructs, like each of those just discussed, are put together to study real phenomena that cannot be observed directly and perfectly (Edwards and Bagozzi 2000). A measure, “an observed score gathered through self-report, interview, observation or some other means” (Edwards and Bagozzi 2000, p. 156), can therefore be regarded as the empirical equivalent of a construct. A measure thus does not necessarily perfectly reflect the well-being construct it is intended to measure; rather it provides an instrument-dependent representation of it. In this article, we introduce two classifications that will prove important for selecting the most appropriate measure for a given construct. The first classification concerns the extent to which obtaining a measure interferes with the workers’ affairs and experience, and the second considers the different types of data a researcher can obtain.

Measure Obtrusiveness

Regarding the extent of interference with a worker’s affairs and experience, we distinguish between three measurement approaches for worker well-being: unobtrusive measurement, reaction-based obtrusive measurement and observation-based obtrusive measurement. Unobtrusive measures are methods that allow researchers to gain insights about subjects without the researcher, the subject, or others intruding into the research context and draw their data from naturally occurring circumstances and events (Hill et al. 2014; Webb et al. 1966). Obtrusive measures, methods characterized by active cooperation of subjects (Hill et al. 2014; Webb et al. 1966), come in two forms. Reaction-based obtrusive measures are based on instruments that ask subjects for conscious, subjective input, whereas observation-based obtrusive measures are based on instruments that collect data automatically but require subjects to operate them. In other words, observation-based measures rely solely on the practical cooperation of subjects, and reaction-based measures rely both on practical cooperation and subjects’ effort to offer responses.

Measure Types

We distinguish between four types of measures: closed survey question measures, word measures, behavioral measures and physiological measures (Luciano et al. 2017). We will describe both the general characteristics of these types, as well as their relations to the obtrusiveness classifications just discussed.

Closed survey question measures are obtained from workers’ responses to one or more survey questions or statements with a finite number of answer categories, as with multiple-choice questions and discrete number scales. Most often, self-report closed survey question measures are used, which are inherently reaction-based obtrusive. In light of common method biases associated with self-report measures, well-being researchers have used other-report (e.g., spouses, friends, children, colleagues) well-being measures to validate self-report measures (Schneider and Schimmack 2010). Other-report measures are observation-based obtrusive, because, even though subjects do not have to exert cognitive effort, they must cooperate with a researcher to identify and contact relevant others who can fill out a survey.

Two classes of survey measures are distinguished: attitudinal and experience-based measures (Grube et al. 2008). Attitudinal measures are designed to uncover a person’s overall, usually retrospective assessment of trait-like attitudes, such as life and job satisfaction. Experience-based measures are designed to measure a person’s momentary state, e.g., moods and emotions. Typical experience-based survey instruments prompt questions about whereabouts, events, company, activity and feelings of the respondent for several days, either multiple times during the day (i.e., experience sampling method) or at the end of the day (i.e., day reconstruction method; Kahneman et al. 2004).

Word measures are derived from spoken or written text, and can represent the relevant semantic content of the speech or writing (i.e., meaning), or the pattern of speech (Luciano et al. 2017). Word data can be manually analyzed by independent coders or processed automatically by computer software and can be collected either obtrusively (e.g., administering open-ended survey questions) or unobtrusively (e.g., scraping social media data).

Behavioral measures consist of observations of individual behavior, and come in many forms, e.g., data on movement, position, body posture, facial expression, online behavior and substance abuse (Luciano et al. 2017). Behavioral measures can be either unobtrusive (e.g., publicly available video data) or observation-based obtrusive (e.g., video data obtained from a lab experiment).

Physiological measures are markers that reveal the state of a person’s body or its subsystems (Luciano et al. 2017). Building on the work of Akinola (2010) on the most widely used physiological measures in organizational sciences, we distinguish four prominent subcategories: endocrine activity (e.g., cortisol, testosterone, oxytocin, dopamine and serotonin), electrodermal activity (e.g., skin conductance response, skin conductance level), cardiovascular activity (e.g., blood pressure, heart rate, cardiac efficiency) and neurological activity (e.g., frontal lobe activation). These markers reflect changes in the autonomic nervous system, a part of the peripheral nervous system that serves regulatory functions by helping the human body adapt to internal and external demands (Akinola 2010).

Because physiological data is not recorded naturally, researchers typically rely on observation-based obtrusive measures. The obtrusiveness of these instruments varies substantially (Eatough et al. 2016; Ilies et al. 2016). Devices such as arm-cuff digital blood pressure monitors, fingertip pulse oximeters and cotton swab saliva sampling require substantial effort from subjects (e.g., attaching a device to the body) and can be uncomfortable in use (e.g., some activities could be inhibited by the device), while devices such as wearable bracelets and smartphone applications are almost completely hassle-free.
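The two classifications can be combined into a simple decision rule. The sketch below is illustrative only: it encodes the obtrusiveness definitions as two yes/no questions (does the instrument need the subject's cooperation, and does it ask for conscious, subjective input?) and applies them to the examples mentioned in this section. All names are hypothetical, and treating open-ended survey questions as reaction-based is our reading, since the text only calls them obtrusive.

```python
from enum import Enum


class Obtrusiveness(Enum):
    UNOBTRUSIVE = "unobtrusive"
    OBSERVATION_BASED = "observation-based obtrusive"
    REACTION_BASED = "reaction-based obtrusive"


def classify(needs_cooperation: bool, asks_conscious_input: bool) -> Obtrusiveness:
    """Map the two defining questions onto the three obtrusiveness classes."""
    if asks_conscious_input and not needs_cooperation:
        raise ValueError("conscious input implies the subject's cooperation")
    if not needs_cooperation:
        return Obtrusiveness.UNOBTRUSIVE
    if asks_conscious_input:
        return Obtrusiveness.REACTION_BASED
    return Obtrusiveness.OBSERVATION_BASED


# (measure type, instrument, needs_cooperation, asks_conscious_input)
# Instruments are the examples named in the text; the boolean flags are assumptions.
EXAMPLES = [
    ("closed survey question", "self-report survey", True, True),
    ("closed survey question", "other-report survey", True, False),
    ("word", "open-ended survey question", True, True),
    ("word", "scraped social media text", False, False),
    ("behavioral", "publicly available video", False, False),
    ("behavioral", "video from a lab experiment", True, False),
    ("physiological", "wearable bracelet", True, False),
]


def options_by_type(examples):
    """Collect, per measure type, the obtrusiveness classes seen in the examples."""
    table = {}
    for measure_type, _instrument, cooperation, conscious in examples:
        table.setdefault(measure_type, set()).add(classify(cooperation, conscious))
    return table


if __name__ == "__main__":
    for measure_type, classes in options_by_type(EXAMPLES).items():
        print(measure_type, sorted(c.value for c in classes))
```

The resulting table only reflects the listed examples; a measure type may admit other obtrusiveness classes with instruments not covered here.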

Illustrations of Measures

In Table 2, we provide illustrations of measures for constructs falling into the framework that we used for illustrating our construct taxonomy. More information on these measures can be found in Appendix A. We echo our previous disclaimer that the list of measurement options is non-exhaustive and will not cover all potential conceptual nuances. In addition, we want to note that the different measures vary in the degree to which they are valid for the constructs they are purported to measure. For example, evaluative constructs such as job satisfaction and life satisfaction are likely better measured using subjective measures, while affective constructs such as emotions and moods can be gauged with both subjective and objective measures (Brulé and Maggino 2017). We will discuss the validity of measures in the next section. Finally, several rows in Table 2 contain blank cells. These are areas where, as of yet, there has been little or no work applying the relevant measurement approach to the type of construct in question. As such, these blank areas signify current opportunities in the study of worker well-being.

Table 2 Illustrations of measures

How Should a Worker Well-Being Measure Be Selected?

With such a wide assortment of measures for worker well-being constructs, the next question is how to choose one in your research. In this section, we will show why demonstrating measurement fit, “the degree of alignment between how a construct is conceptualized and measured” (Luciano et al. 2017, p. 593), is a challenging task. Luciano et al.’s (2017) framework of measurement fit illustrates that researchers have to go through various (iterative) steps to make well-reasoned measurement decisions: researchers must explicate the construct thoroughly (e.g., map a construct’s content, dimensionality, stability and hypothesized manifestation), determine measurement features (e.g., identify a measure’s content, source and aggregation strategy), and consider the research context (e.g., state-of-the-science and research purpose), the ethics of a proposed research plan (e.g., privacy, discrimination, paternalism) and the feasibility, accuracy and completeness of a measure. Considering space concerns, we cannot follow Luciano et al.’s full model for each worker well-being construct. Instead, we sketch a high-level picture of the various relevant considerations for choosing a measure and refer the reader to dedicated works for more elaborate discussion. We summarize this overview in the form of a checklist in Table 3.

Table 3 Checklist for selecting a worker well-being measure

Conceptualization

One must decide on the construct or constructs of study before a measure can be selected. This decision is driven by many factors, e.g., the objective of the research, the employment situation of the workers you study, the research context and the research question(s). For example, when researchers are interested in evaluating the well-being enhancing potential of a new coffee machine, they are well-advised to select a very narrow, domain-specific construct, such as satisfaction with facility management, rather than a broader construct, such as job satisfaction. As another example, when researchers are tasked with evaluating the well-being enhancing potential of receiving a compliment, they may want to consider a more dynamic, state-like well-being construct, such as job emotions, rather than a stable, trait-like well-being construct, such as job satisfaction, because the effects of compliments will likely be only temporary.

For the selection of appropriate worker well-being constructs, we recommend researchers measure as many well-being constructs as possible and maximize diversity. As the measures on different constructs are not easily aggregated, we urge researchers to report well-being measures individually, in the spirit of a dashboard (Forgeard et al. 2011). Such broad measurement of worker well-being is relevant for several reasons.

First, since most researchers’ goals for studying worker well-being will be largely motivated by moral considerations and general goodwill, it is important to ensure sufficient breadth of measurement. The reason for this is that constructs vary in their intrinsic value (Footnote 6). Most context-free well-being constructs reflect theoretically and philosophically grounded conceptions of human value, e.g., PWB (Aristotle, 350 B.C.E.; Zagzebski 1996), life satisfaction (Sumner 1996) and dispositional affect (Bentham 1789; de Lazari-Radek and Singer 2014; Feldman 2004). For domain-specific constructs such as job satisfaction and work engagement, the moral case favoring attention to these constructs is slightly harder to make, as they do not necessarily and inherently contribute to worker well-being. Work engagement, for instance, could have a dark side (Bakker et al. 2011; Dolan et al. 2012), as illustrated by research showing that it, in some cases, may instigate work-family conflict (Halbesleben 2011; Halbesleben et al. 2009). None of this is to deny that varieties of domain-specific well-being may frequently, or even usually, drive general well-being, and thus are valuable. It is just that the value of domain-specific well-being constructs depends on the contingencies of their causal interplay with context-free well-being constructs, which better reflect a worker’s overall well-being.

Researchers can attend to such well-being trade-offs by measuring a diverse set of constructs. To illustrate, it may be necessary to study constructs with negative valence, such as burnout or work addiction, to uncover downsides of policies driven by the goal of increasing positive affect at work. An organization’s increasing focus on social responsibility may increase engagement, but with the unintended effect of enticing some workers to be too engaged in their work, giving rise to work addiction (Brieger et al. 2019). A dashboard covering a variety of domain-specific and context-free constructs allows researchers to keep all possible trade-offs in view. However, if the selection of constructs must be constrained, researchers may prioritize constructs that are most likely to uncover those trade-offs.
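The dashboard idea can be illustrated with a short sketch. It assumes hypothetical mean scores per construct before and after an intervention (the numbers are invented for illustration and are not empirical results), keeps each construct separate instead of aggregating, and flags changes that are unfavorable given the construct's valence. The function names and the set of negative-valence constructs are assumptions made for this example.

```python
# Constructs for which a higher score signals ill-being (assumed for this sketch).
NEGATIVE_VALENCE = {"burnout", "work addiction"}


def change_per_construct(before, after):
    """Report the change for each construct separately, without aggregating."""
    return {name: round(after[name] - before[name], 2) for name in before}


def unfavorable_changes(deltas, negative_valence=NEGATIVE_VALENCE):
    """List constructs that moved in the undesirable direction for their valence."""
    flagged = []
    for name, delta in deltas.items():
        worsened = delta > 0 if name in negative_valence else delta < 0
        if worsened:
            flagged.append(name)
    return flagged


# Hypothetical mean scores on a 1-7 scale; illustrative values only.
before = {"work engagement": 4.8, "job satisfaction": 5.1, "work addiction": 2.4}
after = {"work engagement": 5.6, "job satisfaction": 5.1, "work addiction": 3.1}

deltas = change_per_construct(before, after)

if __name__ == "__main__":
    print(deltas)
    print(unfavorable_changes(deltas))
```

In this invented example, a single aggregate score would hide the rise in work addiction that accompanies the rise in engagement, which is the kind of trade-off a dashboard is meant to keep visible.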

Second, for researchers who are motivated to study worker well-being in the service of other objectives, keeping an open mind to the measurement of multiple worker well-being constructs will likely pay off. This holds for researchers with various research objectives, e.g., academics interested in testing theory or practitioners aiming at advancing organizational performance through the enhancement of worker well-being. The reason is that worker well-being constructs can be related to other constructs and factors in unexpected ways. To illustrate, concerning antecedents of worker well-being, a meta-analysis by Steffens et al. (2017) showed that social identification processes relate more strongly to positive well-being constructs than to negative well-being constructs. Regarding outcomes, a meta-analysis by Erdogan et al. (2012) demonstrated that life satisfaction correlates significantly more strongly with organizational commitment and turnover intention than with job performance. In conclusion, having a sufficiently broad measurement scope will enable researchers to uncover the most interesting and important relationships among variables.

For researchers interested in making an academic contribution, there is an additional impetus for measuring multiple constructs. Like many research fields in the social sciences, the field of worker well-being is burdened with the problem of construct proliferation: “research streams are built around ostensibly new constructs that are theoretically or empirically indistinguishable from existing constructs” (Shaffer et al. 2016, p. 81). For example, research has suggested that employee engagement is not distinct from constructs like job burnout (Cole et al. 2012) and job satisfaction (Christian et al. 2011). Measuring multiple, ostensibly distinct constructs will help researchers to demonstrate or refute the theoretical and empirical distinctiveness of well-being constructs and thereby advance the science of worker well-being.

Once one or more constructs have been chosen, researchers are well-advised to turn to established literature to carefully define each construct and understand its conceptual nuances. Articles covering best practices for construct definition (Podsakoff et al. 2016) and conceptual works on the conceptualization and categorization of worker well-being (e.g., our current work, Johnson et al. 2018; Page and Vella-Brodrick 2009; Zheng et al. 2015) could be helpful. When constructs have been selected and adequately conceptualized, researchers can move on to the constructs’ ideal measurement strategy.

Measurement

One of the most important considerations in choosing a suitable measure is a measure’s validity. Validity can be described as “the degree to which scores on an appropriately administered instrument support inferences about variation in the characteristic that the instrument was developed to measure” (Cizek 2012, p. 35). A measure must be the causal outcome of a construct (Borsboom et al. 2004), which means that it has to satisfy the following four conditions for causality: (i) definition of a construct must be chosen and articulated independently and prior to the measure, so that the relationship between the two is not merely tautological, (ii) substantial association (or covariation) between the construct and the measure, (iii) realization of the construct temporally prior to the measurement, and (iv) elimination of rival explanations that could explain the relationship between a construct and a measure, such as history and instrumentation (Edwards and Bagozzi 2000). In summary, for a measure to be valid for a hypothesized construct, it must be the hypothesized construct – and only the hypothesized construct – that causes the measure.

Proving that a measure is valid requires a process of theoretical and empirical validation (Borsboom et al. 2004), “the ongoing process of gathering, summarizing, and evaluating relevant evidence concerning the degree to which that evidence supports the intended meaning of scores yielded by an instrument and inferences about standing on the characteristic it was designed to measure” (Cizek 2012, p. 35). Researchers interested in using a previously developed measure are therefore advised to understand how that measure has been validated and assess the adequacy of the validation process. Researchers aiming to innovate in the development of a new measure must accept the responsibility of performing, or otherwise ensuring, a proper process of measure validation. Either way, understanding the validation process is essential to avoid relying on misleading indicators of the relevant constructs and drawing specious conclusions.

Theoretical (or content) validation starts with a logical analysis of measure-construct fit, often performed by academic and/or practitioner subject matter experts (Bornstein 2011; Luciano et al. 2017). This is where the preparatory work from the conceptualization phase comes into play: a high-quality conceptual definition and deep understanding of conceptual nuances are useful for making methodological decisions. For instance, as the definition of life satisfaction suggests that a valid measure of this construct should be based on a cognitive evaluation and will typically remain stable over time (Diener 1994; Shin and Johnson 1978), one can safely forgo dynamic, unobtrusive or observation-based obtrusive word, behavioral or physiological measures, and narrow the methodological scope to reaction-based obtrusive, subjective measures, such as surveys and interviews. In sharp contrast, one is well advised to consider more objective behavioral and physiological measures when the measurement of affective states or other state-like constructs is of interest, as their conceptual definition permits it (Mauss and Robinson 2009). If the research context necessitates survey measurement of affect, one would need to accommodate the state-like nature of affect by focusing on experience-based measures instead of attitudinal measures (C. D. Fisher 2000).

After theoretical validation, a measure must be empirically validated. This is traditionally done by demonstrating adequate reliability of a measure and demonstrating appropriate statistical associations between a new measure and measures of related or unrelated constructs (Bornstein 2011; for early examples, see Campbell and Fiske 1959; Cronbach and Meehl 1955). More specifically, one can examine a new measure’s convergent validity, discriminant validity, predictive validity and incremental validity, in relation to other validated measures, or design experiments to uncover biases in measures and to unravel the underlying mechanisms causing the measurements observed (Bornstein 2011; Edwards 2003). Often, one can draw on existing validation research to substantiate the empirical validity of a measure and pick appropriate validation tests (e.g., confirmatory factor analysis, internal consistency analysis, Edwards 2003). For example, in the development of new closed question job satisfaction measures, Ironson et al. (1989), Thompson and Phua (2012) and Bowling et al. (2018) all followed common practice (e.g., Clark and Watson 1995; Edwards 2003; Hinkin 1998) by examining the new measures’ convergent validity (i.e., alignment with existing job satisfaction scales) and their discriminant validity (i.e., departure from measures of related but distinct constructs).
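As a rough illustration of the checks named above, the following Python sketch computes an internal consistency estimate (Cronbach's alpha) and simple convergent and discriminant correlations. It is a minimal sketch on synthetic data, not the procedure of any cited study: the sample size, the number of items, the noise levels and all variable names are assumptions made for illustration, and a full validation would add steps such as confirmatory factor analysis.

```python
import numpy as np


def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)


def pearson_r(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic data (assumption): a latent job satisfaction score and a
# related but distinct latent construct that shares part of its variance.
rng = np.random.default_rng(0)
n = 500
job_sat = rng.normal(size=n)
related = 0.4 * job_sat + np.sqrt(1 - 0.4**2) * rng.normal(size=n)

# Hypothetical new 4-item measure, an established scale for the same
# construct, and a scale for the related but distinct construct.
new_items = job_sat[:, None] + rng.normal(scale=0.7, size=(n, 4))
established_scale = job_sat + rng.normal(scale=0.5, size=n)
distinct_scale = related + rng.normal(scale=0.5, size=n)

new_scale = new_items.mean(axis=1)
alpha = cronbach_alpha(new_items)
convergent_r = pearson_r(new_scale, established_scale)
discriminant_r = pearson_r(new_scale, distinct_scale)

print(f"alpha={alpha:.2f}, convergent r={convergent_r:.2f}, "
      f"discriminant r={discriminant_r:.2f}")
```

In this setup, evidence for the new measure would consist of a high alpha, a strong correlation with the established scale and a clearly weaker correlation with the scale of the distinct construct. What counts as high or weak enough is a judgment that depends on the construct and the literature consulted.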

During empirical validation, one should pay serious attention to the various kinds of measurement error that measures are susceptible to. For instance, closed question survey measures, word measures based on social media and physiological measures obtained from wearable sensors are all vulnerable to selection biases: subjects self-select into participating in a survey, using social media or wearing a sensor (Ganster et al. 2017; Kern et al. 2016; Landers and Behrend 2015). Closed question survey measures and word-based social media measures are both susceptible to social desirability biases (Marwick and Boyd 2011; Podsakoff et al. 2003; Wang et al. 2014), while physiological measures are not. Other sources of measurement error are relevant for specific measurement instruments. Surveys are vulnerable to careless responding, the tendency to respond to questions without regard to the content of items (Meade and Craig 2012; e.g., an intense experience sampling study, Beal 2015; lengthy batteries of job satisfaction questions, Kam and Meyer 2015). Word measures obtained through computer-aided textual analyses will be vulnerable to algorithm error, the pattern of error observed when multiple computer-aided textual analysis techniques produce different measures using the same methods and texts (McKenny et al. 2018; Short et al. 2010). Instruments collecting physiological data will inescapably introduce noise (Chaffin et al. 2017; Ganster et al. 2017). Researchers should ensure that they have the appropriate expertise to catch and mitigate the relevant sorts of errors.
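Of the error sources above, careless responding lends itself to a simple screening check. The Python sketch below computes a longstring index, i.e., the longest run of identical consecutive answers per respondent, and flags respondents above a cutoff. The example responses and the cutoff of six are hypothetical and chosen only for illustration; an appropriate cutoff depends on the number of items and response options, and such an index is best combined with other indicators.

```python
from itertools import groupby


def longstring(responses):
    """Length of the longest run of identical consecutive answers."""
    return max((sum(1 for _ in run) for _, run in groupby(responses)), default=0)


def flag_careless(response_matrix, max_run):
    """Indices of respondents whose longest run exceeds max_run."""
    return [i for i, row in enumerate(response_matrix) if longstring(row) > max_run]


# Hypothetical answers of three respondents to ten 5-point items.
survey = [
    [4, 2, 5, 3, 4, 1, 2, 4, 3, 5],  # varied answers
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # straight-lining
    [5, 4, 4, 4, 2, 1, 3, 3, 2, 4],  # short runs only
]

# Cutoff of 6 is an assumption for this illustration, not a recommended value.
flagged = flag_careless(survey, max_run=6)
print(flagged)
```

Flagged cases are candidates for closer inspection rather than automatic exclusion, since a long run of identical answers can also reflect a genuine response pattern.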

We conclude with a note on the varying complexity of theoretically and empirically validating measures. As previously indicated, obtrusive measures such as closed questions, open questions and interviews are relatively straightforward to validate. For theoretical validation, this mainly is due to the deliberate alignment of the measure with the construct definition (e.g., during item pool generation and item purification, Brod et al. 2009; Hinkin 1998). By maximizing the semantic equivalence of the questions and the construct definition, researchers are able to eliminate alternative explanations prior to the collection of data. The theoretical validation of an unobtrusive measure is much less straightforward, because one has little to no influence over the way data is collected. With an unobtrusive measure we have much less guarantee that the cause of the measurements is limited to factors relevant to the construct to be measured. Because of inherent differences between the instrument and the intended construct, one is forced to rely heavily on theory to make a case for why the content of a measure best resembles the construct of interest rather than related, but distinct constructs (Hill et al. 2014). The same pattern of difficulty holds for empirical validation. Empirical validation of obtrusive measures is relatively convenient, as a multitude of validation guidelines and validated measures have accumulated over time. Empirically validating an unobtrusive measure is much more challenging, as it is often impossible to find a well-validated unobtrusive measure for comparison, and introducing a validated obtrusive (e.g., survey) measure into an unobtrusive measurement design takes away the valuable unobtrusive nature of the data (Hill et al. 2014).

Practicality

After conceptualization and measurement, researchers must consider the practicality of a measurement strategy in a given research context. In some way, all researchers must accommodate the preferences and demands of stakeholders, e.g., organizations, employees and institutions. At the same time, they must safeguard their scientific and ethical integrity. Finally, they must always remain mindful of their own resource limitations.

Organizations

Organizations may use their position as facilitator of worker well-being research to put pressure on researchers to do research as cheaply and efficiently as possible (Lapierre et al. 2018). For example, organizations may be hesitant to facilitate physiological measurement, as purchasing and distributing wearable devices are still much more costly than administering questionnaires (Akinola 2010; Ganster et al. 2017). Relatedly, organizations may prefer single-item measures over their psychometrically superior multi-item counterparts, as the opportunity costs associated with filling out multi-item measures are expected to be too high (G. G. Fisher et al. 2016; Gardner et al. 1998).

Beyond the need to deal with unequal power relations, it is important for researchers to be wary of the values and leadership in an organization. In particular, for well-being research to have an effect on the well-being of workers, an organization’s leadership has to value both research and well-being (Nielsen et al. 2006; Nielsen and Noblet 2018). Without commitment from senior management, worker well-being research, regardless of its rigor, will be of limited value, as any resulting policy recommendations will not be implemented. Hence, it is advisable to start well-being research only if the topic is a strategic topic in the organization and there is a culture of receptivity to research and evidence-based practices. On the other hand, organizational change must always begin somewhere, and we should not lose hope that well-presented, well-timed research on a topic of moral importance may occasionally prove pivotal.

Workers

Research on worker well-being is, of course, typically motivated by a moral interest in the lives and experiences of workers. However, when striving to obtain valid measurements of worker well-being, researchers must not lose sight of the impact of measurement on those very workers whose well-being is to be measured. For choosing a well-being measurement strategy, the rights and interests of the research subjects matter for both practical and ethical reasons. Practically, without satisfactory buy-in from them, measures will be subject to substantial non-response or validity issues (Rogelberg et al. 2000). It is therefore advisable to accommodate workers’ tendency to dislike lengthy batteries of questions or long interviews, as participation can be unpleasant and distracting. Further ethical considerations emerge in light of the inherent moral significance of well-being research and the increasing convenience of collecting (big) data (see Israel and Hay 2006 for an extensive overview of research ethics for social science; Metcalf and Crawford 2016). Here we briefly touch upon important ethical considerations and direct readers to referenced works for more information.

First of all, there is an obligation that will be obvious to academic researchers but perhaps less familiar to professionals in organizations: In order to ensure that research does not harm the workers who are the research subjects, researchers must adhere to the principles of research ethics (e.g., American Psychological Association 2017). In most instances, a review by an independent ethics committee is highly advisable (Wassenaar and Mamotte 2012), as any research conducted by an organization on employees of that same organization presents special problems, due to pressure employees may feel to “volunteer” for the research (Kim 1996). In cooperation with the ethics committee, researchers must be prepared to justify any measurement choices for which a less obtrusive, invasive or burdensome alternative might have been available.

Second, although it is sometimes neglected with novel forms of social research (Flick 2016), the informed consent of research subjects is of paramount importance. This requires that researchers adequately inform workers about the study, thereby taking into account their expectations and social norms (Brody et al. 2000; Manson and O’Neill 2007), and ensuring that their participation is voluntary, not coerced (Faden and Beauchamp 1986). The imperative of informed consent has implications for measurement strategies. When practical, it is advisable to use measures that have a clear and intuitive connection to the constructs to be measured (high face-validity), as is the case with most survey measures. This makes it more straightforward to fully inform workers about the connection between the research and their well-being, and hence reduces barriers to consent and willing participation. Where experimental design precludes full transparency in advance, a thorough debriefing session after the experiment becomes critically important (Brody et al. 2000; S. S. Smith and Richardson 1983; Sommers and Miller 2013). Novel or esoteric measurement techniques require extra care with regard to informing and debriefing, simply because these techniques may run contrary to workers’ expectations of the research process.

Third, it is essential to mind ethical considerations regarding autonomy and privacy, as respectively associated with obtrusive and unobtrusive measurement. Since obtrusive measures, by their nature, interfere with workers’ work and other experience, use of such measures implicates the autonomy of workers. Significant interference with their lives should be limited as much as possible and explained clearly. This ensures that workers’ abilities to make their own choices are not unduly diminished, and not affected beyond the participation to which they have consented (Faden and Beauchamp 1986). On the flip side, unobtrusive measures raise special concerns about the privacy of workers, because, by the very design of the measurement methods, the subjects may not be aware of the information collected about them (Motro et al. 2020). Hence, it is incumbent upon researchers to ensure that workers are not monitored beyond what is relevant to the study, or beyond that to which they have consented. In general, pitfalls of both obtrusive and unobtrusive measures can be largely mitigated by diligent procedures for informed consent.

Institutions

Researchers must also navigate institutional pressure and legal requirements. The relevant regulations are very much dependent on the type of study and the location where the study is conducted. For instance, the General Data Protection Regulation (GDPR, European Parliament and Council 2016) has outlined strict rules on the analysis, collection, sharing and storage of individual-level data and, in particular, health data (e.g., biometric data, survey data on mental health, Guzzo et al. 2015). If the analysis of health data is of interest, researchers within an organization may want to consider a collaboration with external researchers who specialize in managing such data securely and responsibly. Finally, if the workers are unionized, proactive communication with union representatives is advisable. Although unions support initiatives to advance worker well-being, they may well be wary of measurement procedures that appear to diminish worker autonomy or privacy.

Researchers

In light of the many, often divergent preferences and demands of various stakeholders, researchers are forced to be pragmatic and accommodating. Making concessions, however, does not mean that the researchers’ own objectives should be discounted. The responsibility falls to researchers themselves to ensure that well-being is measured in a valid way and that, therefore, research questions are answered adequately. In addition, as researchers’ time, skills, and resources are finite, certain well-being measures will be infeasible in certain contexts. For instance, if an organization wants to evaluate a company-wide vitality promotion program using wearable devices and dynamic surveying, researchers must be certain to have enough time and resources available to prepare data collection (e.g., selecting vendors, customizing instruments, training subjects) and to analyze the data (e.g., collaborating with researchers in other fields, learning new analytical techniques, Chaffin et al. 2017; Eatough et al. 2016). Being pragmatic and minding resource limitations does not have to undermine the validity of measures. Researchers can draw from extant literature to select validated alternatives to the more time-consuming and costly measures. If one wants to measure job affect using the experience sampling method, and an organization suggests a cross-sectional survey to do this, researchers can suggest the day reconstruction method as a valid alternative (Dockray et al. 2010; Kahneman et al. 2004). If one wants to use well-established multi-item scales to measure well-being constructs, and an organization rejects this idea, researchers might want to suggest validated single-item measures (e.g., G. G. Fisher et al. 2016; Wanous et al. 1997) or shortened scales (e.g., Russell et al. 2004; Schaufeli et al. 2006, 2017). This may allow investigation of several constructs with satisfactory precision instead of a single construct with higher precision, which should be a desirable trade-off in many contexts, for the reasons noted above.

In the process of managing stakeholders, good communication is key. Organizations, in particular, are not easily convinced by the presentation of statistical or theoretical evidence (Hodgkinson 2012). For this reason, it is key to communicate about topics such as instrument validity, research design and construct choice in an understandable and persuasive manner (Lapierre et al. 2018). We refer the reader to research on the communication of evidence-based practice (Baughman et al. 2011; Highhouse et al. 2017; Hodgkinson 2012; Lapierre et al. 2018; Zhang 2018) and bridging the academia-practice gap (Banks et al. 2016; Rynes 2012) for best practices.

Conclusion

Our work aimed at answering three questions that are relevant for the study of worker well-being. We addressed the first question, What is worker well-being?, by proposing a construct taxonomy based on four dimensions: philosophical foundation, scope, stability and valence. We illustrated the taxonomy by classifying the ten worker well-being constructs. By synthesizing the many conceptual models of worker well-being, the taxonomy helps researchers to make sense of the burgeoning but messy field of worker well-being.

To answer the question, How can worker well-being constructs be measured?, we offered a multi-disciplinary overview of traditional (e.g., surveys and interviews) and novel data sources (e.g., wearable sensors) that can be leveraged to measure worker well-being. Therein, we distinguished four broad types of data sources: closed question survey, word, behavioral and physiological measures, and further classified them as either unobtrusive, reaction-based obtrusive or observation-based obtrusive.

Taken together, our construct taxonomy and our overview of existing measurement approaches uncovered some notable gaps in the current science of worker well-being. In particular, we showed that several of the most important work-specific well-being constructs have been measured primarily using closed question surveys. In light of the fact that the context-free counterparts of these constructs have undergone innovation in measurement methodology, we encourage researchers to draw from other research strands to develop new measures of these important work-specific constructs. More generally, we hope that our overview inspires researchers to think outside their current methodological toolboxes and to foster collaborations outside the social sciences to leverage new measurement and data collection techniques.

To address the final question, How should a worker well-being construct measure be selected?, we described the importance of good conceptualization, rigorous operationalization and pragmatic stakeholder management. Because of its broad scope, this discussion was not intended to be exhaustive. Instead, we hope that the discussion provides a useful map of the most important considerations and guidance to detailed references on particular topics (e.g., construct definition, validation, ethics, and communication).

In conclusion, with our work, we intended to bridge the gap between the popular buzz about worker well-being and the extant scientific research about it. Our work has provided guidelines to go beyond the ad-hoc study of worker well-being and conduct rigorous, responsible research. It is our hope that researchers, whether working in organizations, in academia or both, will feel more competent to take the well-being of workers into account, eventually permitting them to better understand what drives worker well-being and design policies to promote it accordingly.