Baumeister et al.’s (2001) claim is that bad is stronger than good.Footnote 3 In one summary of this titular hypothesis, they write “When equal measures of good and bad are present… the psychological effects of bad ones [events] outweigh those of the good ones [events]” (p. 323). Variations on this claim have come to be known as the negativity effect or the negativity bias and, as noted in the introduction, the claim is now widely accepted.Footnote 4 Unfortunately, the hypothesis is never adequately clarified, creating a host of hitherto unrecognized problems.
The authors seem to assume that their key terms (‘bad’, ‘good’, and ‘strong’) require no clear definition, and indeed that none can be given, because they are “universal and fundamental”. In their introduction they nevertheless attempt to assuage worries about defining these terms, writing (p. 325):
Definition implies rendering one concept in terms of others, and the most fundamental ones therefore will resist satisfactory definition. Good, bad, and strength are among the most universal and fundamental terms… and it could be argued that they refer to concepts that are understood even by creatures with minimal linguistic capacity (such as small children and even animals). By good we understand desirable, beneficial, or pleasant outcomes including states or consequences. Bad is the opposite: undesirable, harmful, or unpleasant. Strength refers to causal impact. To say that bad is stronger than good is thus to say that bad things will produce larger, more consistent, more multifaceted, or more lasting effects than good things.
For clarifying their hypothesis, however, this will not do. First, ‘good’, ‘bad’, and ‘strong’ remain unclear in ways that matter for evaluating the hypothesis. Second, the problems arising from this unclarity are exacerbated by the lack of clarity concerning the subjects to which the key terms are being applied, i.e. what the hypothesis is supposed to be about. Consider each in turn.
Predicates
‘Good’, ‘bad’, and ‘strong’ are all unclear in ways that matter for evaluating the hypothesis.
First consider ‘strength.’ In the passage quoted above, the authors say that “strength refers to causal impact.” What kind of causal impact and how is its strength to be evaluated? They seem to acknowledge that further clarification is required by initially specifying a stronger effect as one that is “larger, more consistent, more multifaceted, or more lasting” than another.
Unfortunately, however, these specifications yield distinct hypotheses. More multifaceted and more lasting, for instance, are distinct and dissociable measures: some effects may last longer while being less multifaceted than briefer ones, and conversely some effects may be short-lived while being highly multifaceted. The same problematic dissociations arise for consistency. There is no reason to think these three specifications of strength are measures of the same thing and even brief consideration suggests that they are not. One may, of course, operationalize ‘strength’ in a given context in any way one likes, but consistency across the supposed evidence offered in support of claims about strength is required.
The problematic conflation across notions of strength is seen in the supposed evidence offered by the authors throughout the text, as the meaning of ‘strength’ becomes increasingly stretched. A wide range of measures are blithely offered. In addition to the initial three specifications, a thing is taken by Baumeister et al. to be stronger than another as a matter of the degree to which: one is motivated by it (p. 351); it produces emotion (p. 328); it affects adjustment measures (p. 328); it predicts marital longevity (p. 328); it is “pronounced” (p. 330); people agree about its application (p. 330); it influences opinion (p. 331); it is “important” (p. 332); it is avoided by a wide range of techniques (p. 332); it takes time to process (p. 334); it is processed with elaboration (p. 340); one makes decisions concerning it (p. 334); it facilitates learning about other things (p. 335); it is itself learned about (p. 336); it causes a “response in the brain” (p. 336); it is remembered (p. 337); and it predicts distress (p. 340). This list is not nearly exhaustive, but includes more than enough to be perplexing.Footnote 5
Many of these senses of ‘strength’ are regrettably vague, but insofar as we can specify them, they are again distinct and dissociable. Counterexamples come easily for almost any two of the above criteria. I may use only one technique to deal with my fear of spiders (e.g. try to get away), but that fear may nonetheless be a good predictor of my distress and a poor predictor of my marital longevity. In considering the multitudinous criteria in Baumeister et al., to say nothing of those in the wider literature, it begins to look as if ‘strength’ is allowed to mean almost anything that we can measure. But if that is right, then it is not clear what the hypothesis amounts to. Again, the problem is not that we need a once-and-for-all operationalization of strength for all hypotheses that we might want to test, but to know whether some good X is stronger than some bad Y, we’ve got to know what ‘stronger’ means as it occurs in the negativity bias. Without this, we can neither confirm nor disconfirm the hypothesis. Worse still, with conflicting measures we could both confirm and disconfirm the hypothesis with the same data. The lack of clarity concerning ‘strength’ is thus deeply problematic.
The laxity concerning ‘strength’ and how strength is measured appears to be what leads contradictory results to be offered in supposed support of the hypothesis. Consider just one example of this from Baumeister et al.Footnote 6 On the one hand, the authors claim that bad information (most perspicuously: information about something the receiver takes to be bad) and bad moods take longer to process and involve further cognitive elaboration, and they offer results in support of this claim (e.g. p. 334). Increases in response time, cognitive processing, and elaborated responses are thus all taken as measures of strength. On the other hand, however, the authors claim that negative information is processed faster because it is more important and they offer results in support of this claim (e.g. p. 346). Decreases in response time, cognitive processing, and elaborated responses are thus also taken as measures of strength. Because ‘strength’ is unclear, it is understood in opposite ways, and conflicting results are both taken to provide support.
More generally then, without restrictions on what ‘strength’ means and how it is supposed to be measured, any measurable asymmetries will seem to both confirm and disconfirm the hypothesis. Any difference between the effects of some bad X and good Y, that is, can be interpreted as evidencing either strength or a lack of strength. The authors, of course, opt to interpret all of these results as evidence of strength, and the negativity bias literature, built upon this foundation, has followed suit. You remember X longer? That’s taken to be evidence of X’s strength; it is more important, so you remember it for the future. You remember Y longer? That is taken to be evidence of X’s strength; X is threatening, so you forget it as quickly as possible. You learn X more easily? That is taken to be evidence of X’s strength; you have evolved to pay more attention to it, facilitating learning. You learn Y more easily? That is taken to be evidence of X’s strength; it is painful to concentrate on X, so you allow yourself to become easily distracted. Clarification restricting the measures for strength is needed before any (subset of) these results can be legitimately accepted as evidence.
The lack of clarity concerning ‘good’ and ‘bad’ only compounds these problems. Baumeister et al. claim that “‘[g]ood’ and ‘bad’ are among the first words and concepts learned by children (and even by house pets), and most people can readily characterize almost any experience, emotion, or outcome as good or bad” (p. 323). The authors seem to think it is simply obvious whether an experience, an emotion, or an outcome is good or bad. This, I submit, is simply untrue.
One problem is that experiences, emotions, and outcomes are often good in some ways but bad in others. The authors claim that bad is stronger than good. Good or bad in what way? As with ‘strength,’ while we do not require that the authors settle some once-and-for-all meaning of ‘good’ and ‘bad,’ evaluation of the hypothesis does require consistency in the meaning of these terms as they there occur. Presumably, the required categorization is good or bad all things considered. But this all-things-considered evaluation is not straightforward, especially if, as it seems, the relevant all-things-considered good encompasses moral, prudential, and aesthetic goods. The brief specifications the authors give (undesirable/desirable, harmful/beneficial, unpleasant/pleasant) are not enough taken individually. Taken together, as any ethicist knows, these specifications will often conflict.
As a single example, imagine that you send a drafted piece of work to a respected colleague for feedback. The colleague generously takes the time to send you extensive feedback: along with identifying points that they believe that you have made well, they point out mistakes in your reasoning, grammatical infelicities, and gaps in your scholarship. Consider the experience of reading their feedback. Is this a good experience or a bad experience all things considered? Since it is both beneficial and unpleasant, it is hard to say. One might object that this example actually involves many experiences, emotions, and outcomes, each of which is obviously good or bad. In response then, consider the experience of reading one particular comment identifying a problem in your argument. That single experience is likewise both beneficial and unpleasant. None of this is to deny that we regularly make all things considered judgments. It is to deny that these are easy and universal, and that they require no theorizing. Most important for present purposes is that when these judgements are difficult, it is not any easier to make them by using any one of the authors’ criteria and, moreover, those criteria conflict.
One might think hedonically complex experiences like this one are relatively rare, allowing the authors to maintain that “almost any” experience, emotion, or outcome is easily categorized, but this seems simply not to be so. Hedonic complexity is commonplace. My lunch is delicious tasting, but artery clogging, and I think about my arteries as I eat. My session on the elliptical, to work off my fattening lunch, makes me feel healthy and proud, but also tired and sweaty, and it involves an annoying pain in my ankle. Are these commonplace experiences good or bad? Whatever the answers, they are not obvious. Similar considerations apply to both emotions and outcomes. I am happy about something harmful, I am relieved to lose a hated job, I am remorseful for ending an unhealthy friendship: are these good or bad emotions and outcomes, all things considered? The hedonic complexity of many (if not most) experiences, outcomes, and emotions is such that it is simply not obvious whether they are good or bad. This is not to argue that there is no answer, but it is to say that the hypothesis cannot be evaluated without further clarification. And again, notice that the critique, if sound, extends beyond Baumeister et al.’s foundational work; while I am focused on the canonical text, the use of ‘good’ and ‘bad’, and the range of measures taken to support their presence and degree, become increasingly stretched and ambiguous the more of the literature we consider. Adding yet further senses of ‘good’ and ‘bad’ exacerbates the problem.
The intended sense of ‘good’ and ‘bad’ for stating and evaluating the hypothesis is plagued by a number of further problems arising from variation. Consider that what is beneficial for one person can be harmful for another. So too, things that are pleasant for one person—to use another, conflicting criterion—can be unpleasant for another. I might also evaluate some experience type—say, a roller coaster ride or a horror movie—as good, while you evaluate it as bad. There is rampant hedonic variation and the authors give no indication of how the hypothesis is meant to apply in the face of it. Insofar as this variation is not taken into account, the applications and explanations made available by the negativity bias flounder. We need some further specifications to deal with the differences in what is good or bad—in these distinct ways—for distinct creatures and persons, and for the same person across times.
A difficulty evaluating the hypothesis that is acknowledged in the contemporary literature is that the compared things must be good or bad to the same degree. As Rozin and Royzman (2001) note (p. 300): “The logic or argument for negativity bias is complex, largely because of the difficulty of equating negative and positive events.” No one thinks that anything good to any degree is weaker than anything bad to any degree. Instead, the bad and good being compared must have the same hedonic magnitude. Comparing hedonic magnitudes is difficult enough when the senses of ‘good’ and ‘bad’ are clarified,Footnote 7 but it does not seem to have been appreciated that without clarification of these terms, any attempts to engage in this difficult task remain unprincipled. With conflicting criteria for good and bad, controlling for hedonic magnitude becomes a mug’s game.Footnote 8
Moreover and finally, the problematic laxity with which ‘strength’ is measured fatally exacerbates the problems of hedonic magnitude. As Peeters and Czapinski (1990) note (p. 34), “If the greater impact of a negative stimulus is due to the greater intensity of that stimulus, we do not have a genuine negativity effect but simply a trivial intensity effect.” But without further clarification, greater impact may always be interpreted as evidencing greater intensity. Because the measures of strength and hedonic magnitude are unrestricted, there is nothing to stop their conflation. Again, the problem is not the lack of a once and for all meaning of ‘good’ and ‘bad’—requiring that of affective scientists would be inappropriate. But we do require a single and consistent meaning of ‘good’, ‘bad’, and ‘strength’ as these occur in the hypothesis in order to evaluate it. The failure to consistently clarify the hypothesis’ key terms undermines legitimately interpreting any results of empirical inquiries as evidence or support for the negativity bias: any apparent difference in strength discovered might always, instead, be as legitimately taken to evidence a difference in hedonic magnitude. The hypothesis, then, could never be confirmed or disconfirmed. As such, it should be rejected as ill-formed. Unless the intended senses of ‘strength’, ‘good’ and ‘bad’ are clarified in a more restricted way, it is hard to see how this wholesale rejection of the hypothesis can be avoided. Notice that this problem is not a problem with Baumeister et al.’s formulation in particular; rather, the problem will arise insofar as the wide range of measures of ‘strength’, ‘good’, and ‘bad’ are all taken to support some unified, increasingly stretched, hypothesis.
Subjects
Not only are the key terms of the hypothesis thus problematically unclear, but its subject matter is not adequately identified. What things of equal hedonic magnitude are being compared for strength? Candidates throughout the text include emotions, information, outcomes, interactions, personality traits, and more besides.
I think that the most charitable interpretation is to understand the authors not as confusing the many types of things they discuss, but as taking the hypothesis to hold equally well for all of them. There is good reason to think this is indeed what they mean. Baumeister et al. conclude their article by saying (p. 362):
In our review, we have found bad to be stronger than good in a disappointingly relentless pattern. We hope that this article may stimulate researchers to search for and identify exceptions…Given the large number of patterns in which bad outweighs the good, however, any reversals are likely to remain mere exceptions. The lack of exceptions suggests how basic and powerful is the greater power of bad.
They likewise tend to infer from the general claim to any particular subject, for instance: “If bad is generally stronger than good, then information pertaining to bad events should receive more thorough processing than information pertaining to good events” (p.340). They later note (p. 355) that they were “…unable to locate any significant spheres in which good was consistently stronger than bad.” It seems that the negativity bias is intended to hold for any types of things whose tokens may be good or bad. We do best to interpret the hypothesis as the claim that bad events, experiences, outcomes, information, and so on are all (respectively) stronger than good events, experiences, outcomes, information, and so on of corresponding hedonic magnitude.
One qualification, however, appears to be that the negativity bias must be some psychological phenomenon or other. And, indeed, the negativity bias has been taken to be a hypothesis that has been established as useful for explanation and prediction in psychology in particular. Thus Baumeister et al. write (p. 323) that the hypothesis “…may in fact be a general principle or law of psychological phenomenon.” This psychological qualification may be interpreted in at least two ways. First, it may mean that the bad is psychologically stronger than the good. In this case, the psychological entities to which the law applies are the effects of the good or bad thing. Second, it may mean that the psychologically bad is stronger than the psychologically good. In this case, the psychological entities to which the law applies are the psychological states which are themselves good or bad, and which are causes of asymmetrically strong effects. Again, the authors appear to endorse both interpretations: the bad has a stronger psychological impact than the good, and the psychologically bad is stronger than the psychologically good. And again, the subsequent literature has unquestioningly followed suit.
It is important, however, to keep clear whether the subjects of the hypothesis are inputs to psychological states, e.g. events, or are instead psychological states themselves, e.g. emotions.Footnote 9 Conflating these creates problems.
One problem is an intensification of those already seen, because the intelligible senses of ‘good’, ‘bad’ and ‘strong’ are limited by their subjects. A cup of coffee may be good, bad, or strong in different ways than a shot of whisky, and in virtue of different features. Similarly, the features in virtue of which a mental episode is good are distinct from the features in virtue of which an external event is good. The conflation of the subjects to which the hypothesis is supposed to apply is, I suspect, one source of the problematic lack of clarity concerning the intended predicates. Unless we are clear on the subjects, it will be hard to specify the predicates as needed to evaluate the hypothesis.
Another problem is that mental episodes, in particular the emotions, are sometimes taken to be effects by which to evaluate causes, while at other times they are taken to be causes which are to be evaluated by their effects. When they are taken as the effect of a valenced cause, they are taken to serve as a measure of the strength of the causes being evaluated. When they are taken as the valenced cause being evaluated, they are instead measured for strength by their distinct effects. So, on the one hand, when evaluating the evidence concerning the way that people react to events, Baumeister et al. take emotions to be the effect of valenced causes, summarizing (p. 328): “…most findings indicate that people react more strongly to bad events than good events. …. Bad events produce more emotion, have bigger effects on adjustment measures, and have longer lasting effects.” Later, however, the authors take the emotions themselves to be the valenced causes being evaluated, writing (331): “The prediction [of the negativity bias] is that negative affect and emotional distress will have stronger effects than positive affect and pleasant emotions...”. There is nothing illegitimate about evaluating both what causes emotions and the effects of emotions, but whether the emotions are being understood as the hedonic cause of an effect or instead as the effect of a hedonic cause matters for the different predictions and explanations the negativity bias is interpreted as offering. These remain conflated across the literature.
As an example, consider the way that Baumeister et al. (2001) draw on Baumeister and Leary (1995) to support the negativity bias. In summarizing this support, they say (p. 331):
…when Baumeister and Leary (1995) reviewed the evidence in support of a need to belong, they concluded that that need was for nonnegative interactions, rather than positive ones as they had originally theorized. The reason was that neutral interactions seemed adequate to satisfy the need to belong in many cases. This too confirms the greater power of bad: The effects of positive, good interactions were not consistently different from the effects of neutral interactions, whereas bad ones were clearly different from the neutral.
But the “neutral interactions” here are ones that seem likely to be categorized as involving neutral happenings, but non-neutral mental episodes. At least, in my own case, I would categorize many of the interactions seemingly relevant to my feeling of belonging in this way. Someone saying hello, my neighbour hanging their laundry, mail in my slot, the same man being in my corner store, the familiar smells and sights on my way to the office—these are all part of the humdrum of my life. If asked, I would categorize these as neutral events. Nonetheless, and indeed perhaps partly because of the humdrum neutrality of these events, they also involve positive feelings of belonging. This hedonic complexity is hidden in the conflation of events and emotions.
Note that none of this is to deny that there is an important connection between the goodness and badness of things in the world and the pleasantness and unpleasantness of one’s mental episodes. Any theory of hedonics needs, ultimately, to be complemented by a plausible theory of value. The problem is the conflation of these connected things.
This problem may seem easily avoided by determining the valence of the events by the hedonics of the states they cause. This, however, spawns other difficulties. The variation across conditions, persons, and times mentioned in the previous section would again wreak havoc. So too, as we will see in section three, many of the results which have been taken to support the negativity bias involve stimuli that are also taken to have some independently determined valence. These results are offered even in cases where that independently determined valence is known to correlate poorly with the hedonics of the states the stimuli cause. For one early instance, in Baeyens et al. (1990), an ingested sugary substance is taken to be a ‘good’ to be compared to an ingested non-sugary ‘bad,’ despite the fact that this particular sugary substance is reported by subjects as tasting unpleasant.
Further specification of the subject to which the hypothesis is intended to apply is thus needed. In particular, the claim that bad psychological states are stronger than good psychological states is distinct from the claim that bad inputs to psychological states are stronger than good inputs to psychological states. These hypotheses mean different things, make different predictions, and would be explained by different mechanisms. This ambiguity would remain even after ‘good’, ‘bad’, and ‘strong’ were specified—though the intelligible specification of these predicates is not independent of the needed specification of the subjects. Notice again that though I have focused discussion on Baumeister et al.’s formulation, the offered criticisms concerning ambiguity and contradictions apply across the literature and intensify when extended to it.