What work on urban life and mental disorder comes closer to Galea’s causal architecture by seeking to identify mechanisms? We can review seven strategies.
Strategy 1: Include everything: ‘NEM III R’
How can we link the huge scale differences between urban political economy and molecular biology that might be involved in urban mental health? The NEM III R (Network Episode Model) of the causes of mental disorder has been developed by sociologist Pescosolido over the last 25 years. Starting from a fairly simple model of patient networks, demographics and support systems (1991), she has extended it in NEM II to include the treatment system, and in NEM III to include the biological system (Pescosolido 2006, 2011; Perry and Pescosolido 2015).
This approach was developed explicitly to integrate the biological and social, through the specification of four basic requirements:
“Consider and articulate the full set of contextual levels documented to have impact.
Offer an underlying mechanism or ‘engine of action’ that connects levels.
Employ a metaphor and analytic language familiar to both social and natural science.
Understand the need for and use the full range of methodological tools” (Pescosolido 2006, p. 194).
The model broadly includes a time dimension covering an individual’s life course, and a number of analytic levels, connected by networks, ranging across the biological, personal, social and system levels. In this sense, it is widely inclusive. However, although the NEM is strong in sensitising us to all the factors and levels which might be important, it is less clear about how to determine which mechanisms are the key ones. This is not just an empirical issue, as it is still not clear where to look. Pescosolido’s answer is to follow the actors and their networks:
the entire process is dynamic, constituted, and embedded in individuals’ social networks. What individuals know, how they evaluate the potential efficacy and suitability of a range of options and providers, and what they do (in what order and under which “tone”) are fundamentally tied to, negotiated in, and given meaning through social interactions. (Pescosolido 2011, p. 45)
However, it should be noted that ‘social interactions’ is still biology-light. Actors here do not include the materialities of molecules, cells, membranes, and biological systems and structures. Despite her claim to the contrary, Pescosolido’s approach is really quite thin on biology, which occupies relatively little space in her work. She does not follow the actors ‘all the way down’ in the way that is so notable in actor network theory (Latour 2005, p. 52).
Strategy 2: Cut through on a clear path: SES as a ‘fundamental cause’
NEM excels at bringing all possible factors to bear, but how do we find the network connections between levels that are active mechanisms? An alternative strategy is to cut a clear path through the myriad of possible factors with one major explanatory mechanism. This was proposed by Link and Phelan with the idea of ‘fundamental cause’.
Urban life is a major multiplier of inequality: just as the world population has now become majority urban, inequalities grow all around the world (Amin 2007; Scambler and Scambler 2015). Urban life accumulates both the richest and the poorest. Even a cursory glance at the list of recent epidemiological studies of the social (Table 1) shows a common theme of disadvantage, deprivation, isolation, fragmentation, and discrimination. These are overlapping aspects of inequalities of income, status and power, manifest in a variety of ways. Link and Phelan (1995) have suggested that (unequal) socio-economic status (SES) could be conceptualised as a fundamental cause (mechanism) of ill health (and of course, by definition, good health). They do not suggest that SES stands in for a number of known and unknown specific mechanisms, but that it is in itself a dominant mechanism for understanding the experience of ill health (including of course mental ill health). Even if specific diseases and therapies come and go, they argue that SES will continue to exercise dominance, as those with more resources gain disproportionate access to new healthcare and disproportionately avoid health risks. Resources, they predict, can
help individuals avoid diseases and their negative consequences through a variety of mechanisms. Thus, even if one effectively modifies intervening mechanisms or eradicates some diseases, an association between a fundamental cause and disease will re-emerge. As such, fundamental causes can defy efforts to eliminate their effects when attempts to do so focus solely on the mechanisms that happen to link them to disease in a particular situation. (Link and Phelan 1995, p. 81).
Over the subsequent 20 years, this prediction seems to have held up. Assessing the evidence that has accumulated in four areas, they suggest (Phelan et al. 2010) that SES does indeed shape disease outcomes via various risk factors, through the differential deployment of resources, a pattern which is very stable and reproduced over time via the emergence and decay of intervening mechanisms. This clearly leads in exactly the opposite direction from contemporary epidemiological sophistication—there is little point in identifying more and more (and smaller) risk and protective factors with larger and larger samples, if there is an underlying mechanism generated by SES that over time continues to reproduce ill health.
As originally defined and subsequently used, the term ‘resources’ is ambiguous. While Link and Phelan continue to emphasise resources, since wealth and money income will unequally confer potential and actual agency, it should be noted that other classical elements of SES, such as power, status, geographical clustering, and cultural habitus (including language) are likely parts of this mechanism, and indeed have been partially identified as such in a detailed ethnography exploring a fundamental cause analysis of diabetes in the USA (Lutfey and Freese 2005—discussed further below). In this respect, Link and Phelan’s strategy is a case of “emergent simplification” mentioned at the beginning of this paper, where Schaffner (2008, p. 1018) discusses how to “transcend the specific workings of the molecular details”.
Strategy 3: Social psychology of small groups
Another strategy has been to try to identify mediators between the social and the biological levels. Not as rigid as the ‘fundamental cause’ approach, these are the very small scale interactions that shape capacities and emotions. An example is research developed from the work of Henry Tajfel (1981) in the 1950s, aiming to understand stereotyping of Jewish people in WWII through the concept of ‘everyday’ or mundane categorisation, rather than the personal prejudices of the powerful. He focussed on the process of prejudice and stereotyping arising from the social judgements that people are typically hard-wired to make in small groups. He showed that people’s sense of identity is strongly influenced by the group in which they find themselves, and the way in which groups are categorised. There are similarities here to Goffman’s (1986) work on stigma, and Scheff’s (1974) work on labelling. In all three approaches, the individual both creates and receives categorical judgements about themselves and others in terms of the group context in which they find themselves (Phelan et al. 2008).
Indeed it is argued in this ‘self-categorization theory’ (Turner 1987) that there is a strong desire in people to ask ‘who am I?’ and to answer this through their perception of the groups they are in. While those making these judgments may strengthen their own identity in this way, the consequence for people who are so judged can be considerable. This may range from material exclusion (for example, from employment, housing), to ‘social defeat’ and shame—a particular experience for people with mental disorder, migrant status, and ethnic minority status (Reader et al. 2015). While much work in this area is based on animal studies, Reader et al. explore a number of ways in which the experience of social defeat ‘gets under the skin’ and into the brain via the hypothalamic–pituitary–adrenal (HPA) system. The link to urban mental health would be the way in which cities both destabilise the fixed social groups of rural life, and in addition multiply the opportunities, divisions and expectations for those who move to the city.
Strategy 4: Social capital
The expanded conceptualisation of resources to include power, status, geographical clustering, and cultural habitus (including language), suggested by Lutfey and Freese (2005) above, is very close to another recent and increasingly common conceptualisation of ‘the social’—social capital. While fundamental cause research resonates with the gloomy list of disadvantages and deprivations in Table 1, there is another rapidly growing literature on the mechanisms of social capital that might protect against mental disorder.
In a parallel development to the fundamental cause literature since the 1990s, social capital has come to embrace a bewildering variety of aspects of the social (Moore and Kawachi 2016). In a systematic review of the effects of social capital on mental health, De Silva et al. (2005) observe that social capital might include ‘cognitive, structural, bridging, bonding, and linking’ types, drawn from the very different theoretical traditions of Pierre Bourdieu, James Coleman and Robert Putnam. However, in the tradition of epidemiological evidence synthesis these are happily combined into a review as if they (more or less) extracted elements of a common underlying reality. Ten years later a further review (focussed on young people) illustrates the wide range of ideas that have been drawn under the idea of social capital (Table 2):
McPherson concludes not unreasonably that “It is, therefore, important that future research seeks to uncover the mechanisms through which social capital may exert different influences on the mental health” (p. 13).
Strategy 5: Interaction ritual chains
While Pescosolido emphasises what people know and what they do as the core aspects of social interactions, Link and Phelan emphasise the deployment of resources, and social capital stresses the social context, there is a further aspect of social life which hardly appears in these approaches. This is the point that there are important mental health consequences of the emotions generated by the way people live their lives in close and intimate proximity to one another. For example, stress is generated by high levels of negative emotional expression in families and, by contrast, trauma can result from low levels of emotional expression (e.g. neglect or isolation). There is evidence that both are bad for mental health (Koutra et al. 2014; Norman et al. 2012).
Sociological work on the mechanisms through which small, intimate groups such as families affect emotions has been developed by Randall Collins and others, in studies of violence (2009) and of ‘interaction ritual chains’ (2005), combining the traditions of Darwin, Durkheim and Goffman. Collins argues that we are biologically predisposed to emotional contagion in small group settings through physical and psychological rhythmic entrainment (resonance), which generates positive (or negative) emotional states, creating or releasing emotional stress (Fig. 1).
The biological roots of this mechanism, he suggests, are shared with non-human animals:
We have things of considerable importance to incorporate from Darwin and ethology. First and foremost is the proposition that society is subhuman. Many animals live in groups just as we do. The distinctively human forms of cognition and communication are built on top of the pre-existing capacity for social bonds; they are not the basis of it. (Collins 1975, p. 92, italics original)
Goffman, similarly, recognised that sociality is “located in a physical, biological, and social world” (Goffman 1974, p. 247). However, Collins does not make clear the biological pathways from bodies to rhythmic entrainment. One interesting possibility has been suggested by Heinskou and Liebst (2016), through the polyvagal parasympathetic part of the autonomic nervous system. This, they observe, is the neural pathway that regulates a number of internal organs, including the heart. It is phylogenetically the most recent subsystem and is connected to social communication, in contrast to the second oldest (fight-flight) and the oldest (freezing or death feigning). In short, they suggest that the polyvagal system might supply the biological pathway missing from the Collins model, genuinely bringing the biological and social together.
Strategy 6: Biology led mechanism
Biological mechanisms have so far only been touched on. To move further in that direction, we can consider mechanisms that place biology more centrally. Pescosolido (2006, p. 194) suggested earlier that in seeking mechanisms it might be possible to ‘employ a metaphor and analytic language familiar to both social and natural science, so as to facilitate synergy’. Arroyo-Santos (2011) suggests that in biology ‘the metaphor is a tool used to fulfill three major goals: 1) as an epistemic device seeking to provide a satisfactory explanation, 2) as an inferential device that can help infer new aspects of the phenomenon under investigation, 3) as an important theory-construction device’. (p. 89). It has already been noted from Bechtel (2013a, b) that the mechanisms that are a core part of explanation in biology are often accompanied by detailed graphical/metaphorical representation—indeed diagrams can be argued to be part of a full biological explanation. We should start therefore with a comment on metaphors and images in biology.
Biology has been particularly rich in the use of vivid metaphors and diagrams as a way of realising possible and actual mechanisms in various fields of its work. Hodgkin and Huxley represented their mechanism through a mathematical formula and electrical circuit used as a proxy mechanism for the squid neuron. The vagal nerve discussed earlier in relation to interaction ritual chains is frequently embedded in diagrams that link brain function to external stress, the HPA system and the gut microbiome. There are a number of common biological metaphors found in the literature, of which five are prominent. Table 3 suggests how they might be related to social and biological mechanisms.
The most common mechanism/metaphor to be found in the biology-led literature on urban life is the proposal that external stressors (trauma, poverty, etc.) are invasive, or at least signal the need for biological changes. The use of this concept of stress is now almost universal. However, it has not always been so. Indeed, stress would be an excellent example of JS Mill’s celebrated warning not “to believe that whatever received a name must be an entity or being, having an independent existence of its own” (Mill 1869, footnote 2). In a recent Wellcome project on the history of stress, Mark Jackson (2012, 2013, 2014, 2015) suggests that although the concept of stress was used in the late nineteenth century, it became widely adopted in the UK only after the Second World War. Used initially to explain an upsurge in ulcers and other stomach disorders, it was taken up in psychiatry as wartime migration and the newly established health and social services revealed the kind of lives that poorer people led.
The classic paper underpinning the modern era of stress research is McEwen and Stellar (1993), in which stress is described in the first paragraph as “physical stressors such as exertion, heat, cold, trauma, infection, and inflammation; psychologic stressors such as fear and anxiety, social defeat and humiliation, disappointment, and sometimes even intense joy” (p. 2093). Significantly, stress is not described in any further detail, but summarised as a physical or psychological stimulus which threatens the individual. ‘Allostatic load’ is further defined as the accumulated ‘wear and tear’ of adjusting the natural balance (allostasis), such as heart rate, to threats over time.
However, Jackson’s history raises the question: does stress really exist, and if so, what is it? The conceptualisation and measurement of stress, and its impact on allostatic load, is central. However, there is little clear idea of how this accumulation mechanism actually works, or of the units in which it might be measured. Allostatic load is now conventionally elaborated to include environment, life events, and early trauma, such that childhood trauma from many years earlier is added together with concurrent experiences, such as travelling through the city as measured by moment-by-moment mobile technology, as illustrated by McEwen here (Fig. 2):
Three approaches have been used to elaborate the evidence for such stress mechanisms. One is to measure external social stressors and allostatic load in population surveys. Two comprehensive reviews suggest that this evidence has yet to be found. Dowd et al. (2009) conclude on the basis of the 26 studies they found that “Current empirical evidence linking SES [socio-economic status] to cortisol and AL (allostatic load) is weak. Future work should standardise approaches to measuring SES, chronic stress and cortisol to better understand these relationships” (p. 1297). (Cortisol level is the standard biomarker used to measure stress response.) In a further recent review of studies since 2009, Johnson et al. (2017) state that “the scope of this review was limited to the biological internalization of SEP [socio-economic position] and the effects of this stressor on AL, highlighting AL as a mechanism on the causal pathway between SEP and health outcomes” (p. 1). The authors report that a total of 59 different biomarkers were used in one or more of the 26 studies they found. The number of biomarkers used to create an AL index ranged between 6 and 25. There were 20 different biomarker combinations observed across the 26 studies included in the literature review. They conclude by observing how struck they were by the substantial inconsistency in the biomarkers used to operationalise AL, and also by the lack of fidelity to its original conception as an index of biological response to stress. They argue that it is difficult to know what is being measured by AL, or to interpret findings about the association between SEP and AL.
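The measurement problem raised by these reviews can be made concrete with a minimal sketch. It assumes, purely for illustration, that an AL index is built as a simple count of biomarkers falling at or above a ‘high-risk’ cut-off; this is one possible operationalisation and is not claimed to be the method of any study reviewed here. The biomarker names, cut-off values, panels and readings are all hypothetical, in arbitrary units, and the function name `allostatic_load` is our own.

```python
# Illustrative sketch only: biomarker names, cut-offs, panels and readings are
# hypothetical and are NOT taken from Dowd et al. (2009) or Johnson et al. (2017).
from typing import Dict, Iterable

# Hypothetical 'high-risk' cut-offs in arbitrary units.
HIGH_RISK_CUTOFFS: Dict[str, float] = {
    "systolic_bp": 140.0,
    "cortisol": 20.0,
    "crp": 3.0,
    "hba1c": 6.5,
    "bmi": 30.0,
    "total_cholesterol": 240.0,
}


def allostatic_load(readings: Dict[str, float], panel: Iterable[str]) -> int:
    """Count the biomarkers in `panel` whose reading is at or above its cut-off."""
    return sum(
        1 for marker in panel if readings[marker] >= HIGH_RISK_CUTOFFS[marker]
    )


# One hypothetical person, measured once.
person = {
    "systolic_bp": 150.0,
    "cortisol": 12.0,
    "crp": 4.2,
    "hba1c": 5.4,
    "bmi": 31.0,
    "total_cholesterol": 200.0,
}

# Two hypothetical studies choose different three-marker panels.
panel_a = ["systolic_bp", "crp", "bmi"]
panel_b = ["cortisol", "hba1c", "total_cholesterol"]

score_a = allostatic_load(person, panel_a)
score_b = allostatic_load(person, panel_b)
print(f"Panel A score: {score_a} of {len(panel_a)}")
print(f"Panel B score: {score_b} of {len(panel_b)}")
```

Under these assumed values, the same person scores at the maximum on one panel and at the minimum on the other. The sketch does not show how any real index behaves, but it illustrates why the choice of biomarker combination makes it difficult to know what an AL score is measuring.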
The second approach is to induce stress in experimental events, as a proxy for the stress of ‘urbanicity’, and correlate this with experience of past or current urban living. A widely used approach is to subject experimental subjects to a situation of unfamiliarity and perceived social judgement, such as public speaking. This is associated with changes in cortisol levels in the blood, and seems robust. Using cortisol levels as a measure of stress in studies of mental disorder has become common—they are fairly easy to evaluate from saliva. However, the use of cortisol to measure stress raises a very real danger of circularity, in which stress is itself defined in terms of cortisol, since there is little reflection in the literature on whether public speaking is the ‘same’ stress as adverse life events, or childhood trauma, or urban living and whether those get ‘under the skin’ in the same way.
An example which has recently become widely influential is a study of how urban living and upbringing affect ‘neural social stress processing’ (Lederbogen et al. 2011). Two stress measures are used: the “Montreal Imaging Stress Task (MIST), a social stress paradigm where participants solve arithmetic tasks under time pressure” (p. 498) and a similar test in which “subjects performed two cognitive tasks (arithmetic and mental rotation) while being continuously visually exposed to disapproving investigator feedback through video” (p. 499). This is assumed to be a model of urban social stress, such that the authors conclude with the “parsimonious proposal that social stress contributes causally to the impact of urbanicity on the neural circuits identified” (p. 500). In a final flourish, they suggest that
Beyond mental illness, our data are of general interest in showing a link between cities and social stress sensitivity. This indicates that an experimental approach … could be used to characterize further … the effects of finer-grained quantifiers of individuals’ social networks or individual social experience in urban contexts. One such potential component is unstable hierarchical position. (p. 500, italics added).
These findings were not replicated in a subsequent study explicitly linked to the Lederbogen study, which used two different stress tests (Steinheuser et al. 2014). The Trier Social Stress Test asked subjects to “recall a list of 50 previously learned items in front of an audience” (p. 679). The Socially Evaluated Cold Pressor Test asked subjects to “immerse their right hand up to and including the wrist for 3 min (or until they could not tolerate it any more) into ice water” (p. 679).
It is a long extrapolation from such tests to an argument that the same stress is produced by an ‘unstable hierarchical position’, and it remains unclear how such psychosocial components create stress in the field, and how it is to be measured.
A third approach has been the development of a model of ‘the social brain’ by Cacioppo in social neuroscience, looking at the consequences of social experience on the brain. The key area for his work has been the consequence of loneliness, i.e. the absence of the social, on brain function. Moreover, the key studies here have been with animals rather than humans, and the definition of the social is very limited. In a long review of ‘Social Neuroscience: Progress and Implications for Mental Health’, Cacioppo et al. (2007, p. 106), for example, define the social as ‘social behaviour’, which is classified into four subcategories of self-perception, self-regulation, interpersonal perception, and group processes. However, the fourth type, group processes, is dealt with in merely one sentence in this long and detailed review: “The stigma of physical and mental illness operates at the group or collective level of analysis” (p. 109). This is a substantial failure to grasp or work with the reality of human society, and a misleading use of the term social in ‘social neuroscience’ and the ‘social brain’.
In relation to stress, this lopsided approach continues in a recent comprehensive review of stress and the ‘social brain’. Sandi and Haller (2015) open as follows: “Stress (defined in this Review as the activation of the neurophysiological stress response) helps organisms to cope” (p. 290). Stress as a concept is thus collapsed into neurophysiological change, rather than separately theorised and measured. Stress becomes the neurophysiological response, rather than the social experience. Thus, the conceptualisation and measurement of stress arising from social life continues to prove elusive in practice. They comment that most of the studies they review are animal studies (rodents), and thus stress is read off from small games and situations created for rodents to react to (p. 292). This is almost exclusively an animal-based review, with equivocal findings. Yet, as in the Lederbogen study (above), this has not inhibited the reviewers from concluding with an unsubstantiated and massive projection into human social life that
If stress that is caused by social disputes — such as war, physical abuse or aberrant socioeconomic inequalities — exacerbates antisocial dispositions in individuals, it may be instrumental in the development of spirals of violence (p. 300) (italics added)