Biology & Philosophy, Volume 26, Issue 2, pp 281–293

Wimsatt and the robustness family: Review of Wimsatt’s Re-engineering Philosophy for Limited Beings


Philosophy Program, RSSS and Centre for Macroevolution & Macroecology, Australian National University
Review Essay

DOI: 10.1007/s10539-010-9202-x

Cite this article as:
Calcott, B. Biol Philos (2011) 26: 281. doi:10.1007/s10539-010-9202-x


This review of Wimsatt’s book “Re-engineering Philosophy for Limited Beings” focuses on analysing his use of robustness, a central theme in the book. I outline a family of three distinct conceptions of robustness that appear in the book, and look at the different roles they play. I briefly examine what underwrites robustness, and suggest that further work is needed to clarify both the structure of robustness and the relation between its various conceptions.




A fundamental method for discovering the structure of many proteins, salts, and metals is X-ray crystallography. The procedure involves bombarding a crystal of the substance under examination with X-rays to produce a diffraction pattern, called a reflection. A single reflection is not enough to discover the substance’s structure though; the whole process requires rotating the crystal, bombarding it again and again, and recording a number of these reflections. The two-dimensional reflections are then assembled via complex mathematical transformations, and combined with known chemical data about the substance, to finally produce a three-dimensional model. The whole process proceeds in several iterations, with information from the model being used to refine the initial parameter estimates. Information from similar structures or estimates from different techniques might play a part in refining these parameters too. Wimsatt’s vision of how science is done, and how philosophy should be done, resembles this process writ large. Multiple methods, multiple data-sets, iterative procedures, and perhaps most importantly, an awareness of the limitations of tools being used and when and how they break down, all play a part in establishing our best stab at an approximate vision of the current subject matter. Wimsatt’s focus on the pragmatic engineering aspects of this process highlights a very different set of issues than that traditionally studied in twentieth century philosophy of science. This collection of annotated and extended papers by Wimsatt provides an excellent overview of this unorthodox, but ultimately visionary, take on how we go about understanding the world about us.

Wimsatt’s exposition is like its subject matter: complex, many-leveled, and deeply interwoven. This is unavoidable, for Wimsatt eschews easy over-simplifications—or when he does simplify, he is apt to tell us ten different ways that these simplifications might mislead. This approach can, however, make an initial foray into Wimsatt’s world rather overwhelming; everything is connected to everything else, and it is only slowly that the bigger picture begins to be revealed. If the reader is momentarily swamped by ideas, I recommend heading to the Epilogue: On the softening of the Hard Sciences. This chapter is largely a narrative, recounting and reflecting on important events that shaped the development of Wimsatt’s ideas, and gives a marvelous insight into how the Wimsatt mind works.

This book is, however, the best way to read Wimsatt. Besides overcoming the problem of actually getting hold of copies of his most well-cited pieces (which seem to be published in collections not available at my library), having an entire collection of Wimsatt before you means you can trace many of the connections, both explicit and implied, that run throughout the book. And, I should add, reading Wimsatt is a thoroughly worthwhile project: his work prefigures many ideas now mainstream in philosophy of biology, and there is plenty of fertile ground here yet.

The rest of this review concentrates on a single recurrent concept in Wimsatt’s book: robustness. Robustness might well be the duct-tape in Wimsatt’s heuristic engineering approach to doing both science and philosophy—binding together many other of his ideas. Its role is central because of Wimsatt’s express commitment to our limits, and the necessity to proceed in piecewise approximations. Though we can only get a limited picture at a time, it is by assembling a number of these limited pictures that we get some measure of reliability about our judgments, assertions, and theories about a bigger picture (like the X-ray crystallography described above). Robustness is the right kind of relationship between these various snapshots—the one that gives us confidence, but not certainty, that we’re on the right track.

Before reading the book, I felt I had a clear idea of what robustness was. The more I encountered the idea in Wimsatt’s book, however, the less sure I was of exactly what it meant. It wasn’t that I disagreed with Wimsatt about the importance of the concept. It was simply that robustness seemed to appear in a number of guises, each time playing a slightly different role, and how these roles differed, and how they were connected, was not obvious. What follows is an attempt to clarify some of these problems, and suggest some solutions.

An analysis of robustness

Imagine I have a theory of megafaunal extinction in Australia, telling us why all the big marsupials, such as the massive Diprotodon, died out. My theory relies on a subtle interaction between climate change and the spread of certain invasive species. I construct a number of different models; some are simple mathematical models with analytic solutions, others are complex simulations that take hours to run. In all cases, I get a robust [R1] result: an extinction event that follows from a particular constellation of climatic and ecological parameters. I publish these results in a prestigious journal. A short while later, a reply is published in the same journal. The authors of the reply point out that, while the theorem itself might well be correct, the theory is unlikely to be true, for the model requires such a precise constellation of events that it is unlikely they would ever have co-occurred. In other words, the phenomenon itself is not very robust [R2], for any slight variation from the precise details indicated by the theorem would not lead to extinction. In true scientific spirit, I deny these objections and set out to prove my theory. I contact my friends in the field, and piece together a number of lines of evidence from different parts of Australia, each using different dating techniques. To my delight (and everyone’s surprise) each line of evidence points to the exact constellation of parameters required by my model to produce the extinction event. I feel confident in this result, for the evidence is robust [R3], derived from a number of independent sources and different experimental procedures.

The example is made-up, but gives a plausible account of the complex developments that might occur when trying to introduce and confirm a new theory. In the passage, the word ‘robust’ appears three times and, I shall suggest, each time a different kind of robustness is being referred to. These three kinds of robustness each occur in Wimsatt’s book, in various places. I’ll name and describe each kind of robustness, and then examine each in more detail.
R1. Robust Theorems (derived from Robustness Analysis). A robust theorem is one whose derivation can be supported in multiple ways. This kind of robustness is mostly discussed in the context of modelling and robustness analysis. To study a complex world, we often construct models: idealised representations of the features of the world we want to study. Given they are idealised, we need to be sure the results derived from these models tell us about the world, rather than simply reflecting the particular idealisations of the models. One answer, following Levins (1966) (and more recently defended by Weisberg 2006), is to construct a series of related models, which make different idealisations, and then subject these models to robustness analysis. This analysis identifies, if possible, a common structure in all the models; one that consistently produces some static or dynamic property. The relation between common structure and the production of the stable property constitutes a robust theorem, for it holds regardless of the specific idealisations made in each model. This robustness lends reliability to the claim that a formal structure represents some real-world phenomenon.

R2. Robust Phenomena (also Robust Engineering, Homeostasis). A phenomenon, or kind of phenomenon, is robust when it is reliably present in many different contexts. Often, but not always, the phenomenon in question can be thought of as some kind of mechanism, either designed or evolved. In this case, the mechanism continues to function reliably, despite perturbations or interventions. These perturbations may be internal, to the parts of the mechanism, or external, in the environment in which the mechanism operates. This robustness is ontological, identifying some stable entity or property in the world.

R3. Robust Detection (also Triangulation, Multiple Lines of Evidence). A claim about the world is robust when there are multiple, independent ways it can be detected or verified. So, for example, different sensory modalities may deliver consistent information about the world, or different experimental procedures may produce the same results. The robustness is epistemic, as it delivers a reliable connection between a claim and some fact about the world.


Each kind of robustness describes a situation where one thing remains stable despite changes to something else that, in principle, could affect it. So, (1) multiple models produce the same results despite different idealisations, (2) a mechanism continues to function despite changes to its parts or environment, and (3) various lines of evidence point to the same conclusion despite different modalities or experimental procedures. In each case, the results could have been otherwise: different idealisations can change results, there are many interventions that might cause mechanisms to fail, and different experimental procedures often turn up different results. When things are robust, however, we get constancy, despite change being possible.

This gives us an initial characterisation of robustness: a relation between something that remains constant (call it the R-target), and several other factors (call them R-variants) that could affect the target, but do not. I’ll say that the R-target is robust because it produces the same thing in the presence of the R-variants (see Fig. 1).
Fig. 1

Robustness. An R-target remains stable under various R-variants

Although each kind of robustness has the same general structure (containing an R-target and R-variants), they are not equivalent. I’ll go into more depth for each of these kinds of robustness, look at a number of ways that they differ, and say why it may matter to keep them distinct. I’ll begin with robust detection, perhaps the simplest of the three.

Robust detection

Here’s a story of Robust Detection in marital affairs. Sue is married to Bob, and suspects him of having an affair. Sue thinks she has good reason to believe this, as she was independently told by three of her friends (John, Mary, and Penelope) that they saw Bob at a bar with another woman, at a time when he claimed to be working late. Sue thinks it is very improbable that all three of her friends misidentified Bob, so she plans to confront him as soon as he gets home.

The first thing to notice in this example is that our initial characterisation is missing something. We have the R-target: the claim that Bob was at a bar with another woman (and is therefore being unfaithful). We have the R-variants: the reports of her friends, each of whom has independent faculties of observation, yet all deliver the same result. What we’re missing is the thing in the world that the claim is about: the fact of the matter about Bob being with another woman (and, by extension, his unfaithfulness). Let’s call this fact of the matter the R-source. Now, if all goes well, our claim about the world is reliable because there are a number of independent paths (the R-variants) from the R-source to the R-target (see Figs. 2 and 3).
Fig. 2

Robust detection: the stable R-target is about something in the world, an R-source
Fig. 3

An example of robust detection

Of course, things don’t always go well. Let’s say Sue chats a little more with her friends and discovers that, although all three of her friends were at the bar, it was only Penelope who actually saw Bob. Penelope then told John and Mary, who subsequently reported Bob’s presence to Sue, though they had not actually seen him. Sue is now much less sure that Bob was actually at the bar, for what looked like three independent lines of evidence have now collapsed to one (see Fig. 4). Lack of independence in the R-variants undermines the robustness of the R-target.
Fig. 4

Robust detection breaks down when the paths are not independent
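
As a rough illustration of why this collapse matters, suppose (purely as an assumption, since the story gives no numbers) that each eyewitness independently misidentifies a stranger as Bob with probability q = 0.1. With three genuinely independent sightings, all three reports are mistaken with probability

$$ p({\text{all three mistaken}}) = q^{3} = 0.1^{3} = 0.001 $$

If John and Mary merely relay Penelope’s sighting (and we assume they relay it faithfully), the three reports stand or fall together, so

$$ p({\text{all three mistaken}}) = q = 0.1 $$

On these assumed figures, the loss of independence makes a shared error a hundred times more likely.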

Let’s say Sue gathers further evidence. She finds lipstick on Bob’s collar, and a large expense from a local jeweler appears on their Visa bill that she knows nothing about. These sources of evidence all point to Bob’s infidelity, and they all seem independent. It turns out, however, that Bob’s mum dropped in to see him at work (hence the lipstick on the collar), and his watch needed repairing (hence the jeweler’s bill). Independence wasn’t the problem this time. What mattered was that a number of lines of evidence appeared to point to one thing, but in fact were all evidence for different things. There were a number of R-sources instead of just one (see Fig. 5), so we still don’t have a robust result.
Fig. 5

Robust detection breaks down when there are different R-sources

Robust detection can fail to be robust in two ways: the R-variants lack independence, or the R-variants refer to divergent sources. In order for our R-target to be truly robust, there must be independent R-variants that refer to a single R-source.

Robust theorems

Robust Theorems have a similar structure to robust detection: the R-variants are different assumptions or idealisations, and the R-source is some kind of causal structure in the world (see Fig. 6). There is, however, an important difference in what the arrows mean. With robust detection, the arrows from R-source to R-variants are causal. If Sue’s original assessment of Bob was correct, then Bob’s being at the bar would have caused her friends to see him, and they then relayed that information. With robust theorems, the R-variants are not caused by the R-source, but have been constructed to represent the R-source. So the connection between R-source and R-variants is not one of causality, but of resemblance. Each model, though containing different assumptions, resembles the phenomena of interest in the world.
Fig. 6

Robust theorems have a similar structure to robust detection

The difference is subtle, but important. With robust detection, the R-target is a claim directly about the world; with robust theorems, the R-target is a claim about models, and only indirectly about the world, via resemblance. Models tell us something about the world in virtue of how well they represent it (Weisberg 2003; Godfrey-Smith 2006). Of course, models are (for the most part) designed to tell us something about the world. So having a robust theorem may, indirectly, make us more confident of a claim about the world, because, for the reasons given above, a robust theorem is unlikely to simply be an artefact of any particular idealisations made in any models.

If both are eventually about the world, then why bother keeping them distinct? Patrick Forber has provided some insight into why it is important to pay attention to this distinction (Forber Forthcoming). Forber’s discussion is about evolutionary biology, but his analysis extends more generally. What concerns us here is Forber’s objection to the claim that robustness analysis be treated as some kind of “low-level” confirmation. These claims have been made by Levins, and recently defended by Weisberg (Weisberg 2006). According to Forber, however, “confirmation-theoretic support comes from connecting models to the world, rather than assessing the models themselves”. What Forber is getting at here is the indirectness I mentioned above. A series of models might be judged to be robust, but by itself this is not enough to confirm anything. The models must be connected to the world, and this relies on making good on the resemblance they are meant to have with the phenomena in question. Note that this step is not required for robust detection—multiple lines of causally relevant evidence are just what would be useful to confirm a hypothesis.

If robustness analysis is not providing confirmation, then what does it do? The answer, according to Forber, is that formal enquiry, such as robustness analysis, provides an important way of narrowing the possibility space of candidate solutions. These candidates are then tested against the evidence to see what, if any, solution actually explains the real-world situation. The task of narrowing down possibilities shouldn’t be underestimated. Confirmation requires testing some set of possibilities against the evidence, but choosing the rival hypotheses to begin with is not an obvious task; we can’t just include all possibilities (like some Bayesian confirmation theorists naively assume we should do).

According to this analysis, both robust theorems and robust detection are important, but they do different jobs at different stages in the explanatory process. This is reflected in the megafaunal extinction story I told above. The initial theory got onto the table because formal exploration, via modelling, suggested there was a robust relationship between a particular constellation of properties and an extinction event. So the robustness analysis provided a credible candidate for explaining the extinction events. But confirming this theory required testing the theorem against some real-world data. Gathering such data can be problematic, particularly when reconstructing historical events from proxies, rather than using a direct experimental approach. In the story, multiple lines of evidence are used to provide a robust, reliable result. So different kinds of robustness are used at different stages in the process.

Robust phenomena

Robust phenomena differ from both Robust Theorems and Robust Detection in an important respect: they have no equivalent of an R-source. Robust theorems and robust detection both give us reason to think that a theory or statement about something in the world (the R-source) is reliable. So the R-target in Robust Theorems and Robust Detection tells us about something in the world, but Robust phenomena simply are something in the world, such as a biological mechanism (see Fig. 7).
Fig. 7

Robust phenomena have no R-source

The difference is also apparent when we consider how something might fail to be robust. For theorems and detection, something may appear to be robust, but on examination, we find that it is not. We saw this in the case of Bob’s apparent infidelity, when we discovered that what we thought were independent lines of evidence collapsed together. If a particular phenomenon (some kind of mechanism, say) remains constant in several different environments, however, then it simply is robust; there is no further question.

Robustness for theorems and detection depends on how the R-target connects to the world; for phenomena, however, robustness is something about the R-target itself. To see what makes some phenomenon more robust, let’s look at a simple example of a robust mechanism. A mechanism can be made robust by building redundancy into it. Parts are duplicated, so that when one fails, another part takes its place. In Fig. 8, we see a mechanism with duplicated parts, each connected to some internal part that is responsible for producing the final function of the mechanism, the thing that remains stable. Interventions that destroy the function of one of the duplicate parts do not affect the final function of the mechanism, for the remaining duplicates compensate for the loss. The explanation for robustness, in this case, lies in the structure internal to the R-target (the phenomenon or mechanism), rather than in how the R-target connects to an R-source.
Fig. 8

Robust phenomena may still share some abstract structural resemblance to the other types of robustness

What underwrites the robustness of the R-target in the case of detection and theorems is a particular kind of relation with something in the world: the R-source. When we look at Robust phenomena, however, what underwrites their robustness is not a relationship to an R-source; instead, it is something to do with the structure of the phenomenon or mechanism itself.

Although what underwrites robustness is different across these cases, there are some structural similarities. The redundancy of parts inside the robust mechanism in Fig. 8 resembles the structure of robust detection. For the parts to be truly redundant, the interventions must be independent, affecting only one part. The mechanism would not be robust to an intervention that caused all of the duplicate parts to fail simultaneously. One way to think of this is that there are independent pathways upstream of the mechanism’s function, and each of these is sufficient to feed into the downstream components, ensuring the mechanism’s function remains constant.

So despite the differences between these kinds of robustness, perhaps there are similarities (at some level of abstraction) in the structure that underwrites the robustness. I take it we intuitively think this to be true—that it is not just the appearance of robustness that leads us to group these different kinds together, but also something about the structure underlying the production of robustness. In the next section I examine this structure more closely.

The structure of robustness

Something is robust when it remains constant in various situations, despite the fact that it could have changed. This describes what it is to be robust, but it does not tell us what underwrites or explains robustness. What makes it the case that our target remains the same, rather than changes?

We saw one general property that underwrites robustness in the last section, when we looked at independence. In the case of theorems and detection, the independence required was in how the R-source was connected to the R-target. In the case of phenomena, duplicated parts enabled parts of the system to fail independently, leaving the system’s function intact. What more can be said about the structure underlying robustness?

Wimsatt’s model

In an early chapter on robustness, Wimsatt discusses the structure of robustness in detail, focussing on robust theorems. He distinguishes between serial and parallel strategies for constructing theories about the world. A serial strategy is one which attempts to reduce the core assumptions to a few powerful ones, from which the rest of the theory can be derived. A parallel strategy is one which uses a large number of interconnected and redundant assumptions, with few principles singled out as axiomatic, and with many different ways to derive results. The parallel strategy is an example of robust theorising, and it should be obvious which of these Wimsatt prefers. A strategy with multiple independent paths for deriving results will be robust, for the same result can be produced, despite it being derived in different ways.

Wimsatt then produces simple models for contrasting these two strategies, showing how each performs when attempting to produce a “real (error-prone) theory constructed and manipulated by real (fallible) operators” (Wimsatt 2007, p. 49). His models assume that each theory is produced via a series of moves, such as making assumptions or applying a rule, and that each of these moves has a small, independent probability of error, e. Given this, how error-prone are these different strategies? In the case of a serial strategy, success requires every step to be successful. If there are n steps, this gives:
$$ p({\text{success}}) = (1 - e)^{n} $$
Even with a very small chance of error, the probability of success decreases quickly with increasing n.
With a parallel strategy, the situation is very different. Success requires only that a single move succeeds, and increasing the number of independent moves increases the chance of success. If there are n independent paths, we have
$$ p({\text{success}}) = 1 - e^{n} $$

As Wimsatt emphasises, in the parallel case, adding more alternatives always increases reliability. He concludes: “Fallible thinkers should avoid long serial chains of thinking.” (Wimsatt 2007, p. 50).
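
The contrast between the two formulas is easy to explore numerically. The following Python sketch is an illustration only and is not taken from Wimsatt’s text; the function names and the sample error rates are assumptions chosen to show the contrast.

```python
def serial_success(e: float, n: int) -> float:
    """Probability that every one of n serial steps succeeds.

    e is the (assumed identical, independent) error probability per step.
    """
    return (1 - e) ** n


def parallel_success(e: float, n: int) -> float:
    """Probability that at least one of n independent paths succeeds.

    e is the (assumed identical, independent) error probability per path.
    """
    return 1 - e ** n


if __name__ == "__main__":
    # Illustrative values only: a small error rate for the serial chain,
    # a very large one for the parallel paths.
    for n in (1, 10, 100):
        print(n, round(serial_success(0.01, n), 3), round(parallel_success(0.9, n), 3))
```

On these assumed figures, a serial chain of 100 steps with e = 0.01 succeeds only about 37% of the time, whereas ten parallel paths succeed about 65% of the time even when each path fails with e = 0.9.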

As with any simple model, one could object that the situation is far more complex than this. But the model does clearly show how the robustness of the parallel strategy is delivered: it results from multiple independent paths. Wimsatt notes, as we did above, that such a parallel strategy is similar to what engineers do when they build redundancy into a system. The failure of one part does not result in the break-down of the system, for there are multiple ways that the system can function. Redundancy, however, is not the only way of getting robustness.

Condorcet’s model

Consider a situation that, at first glance, resembles engineering. Instead of building a mechanism that functions despite the fragility of its parts, our task is to assemble a panel to ascertain the truth of some statement, despite the fallibility of its members’ individual judgments. Assume that the group makes its decision by majority vote and that, as before, each decision-maker has an independent probability of being wrong. Just as we might want a mechanism to function correctly despite the fragility of its parts, we want the group to arrive at the right decision despite the fallibility of the individuals.

What is the relationship between the size of the group and the probability of arriving at a correct decision? The answer was worked out by the Marquis de Condorcet in 1785, and is given by his famous jury theorem. As with Wimsatt’s model above, the probability that the group gets the right answer increases with the number of jurors. But not always. If the individuals in the group have a more than 50% chance of making an error, then adding more individuals actually makes the situation worse: it becomes less and less likely that a majority vote will produce the right answer.

A greater than 50% probability of error is large, so you might think it is not worth worrying about, but that would be to miss the point. In Wimsatt’s model, increasing the number of parts always made the system more reliable—no matter the size of the error. Even if there was a 90% chance of failure of each part in a redundant system, adding parts always made it more reliable. The same is not true with a majority group decision. Adding individuals who make errors 90% of the time would decrease the reliability of the group decision-making process. These models are clearly not capturing the same kind of thing.
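
The behaviour of majority voting can be checked with a short calculation. The sketch below is an illustration under simplifying assumptions that go beyond the text: every juror has the same error probability, jurors err independently, and the panel size is odd so that ties cannot occur. The function name `majority_correct` is hypothetical.

```python
from math import comb


def majority_correct(p_error: float, n: int) -> float:
    """Probability that a strict majority of n independent jurors is correct.

    Assumes n is odd (no ties) and that every juror errs with the same
    probability p_error. Sums the binomial probabilities of having
    k correct jurors for every k that constitutes a majority.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that ties cannot occur")
    p_right = 1 - p_error
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p_right ** k * p_error ** (n - k)
        for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    # Illustrative error rates: one below 50%, one well above it.
    for n in (1, 3, 11, 51):
        print(n, round(majority_correct(0.4, n), 3), round(majority_correct(0.9, n), 3))
```

With a 40% individual error rate, a panel of three is right about 65% of the time, and larger panels do better still. With a 90% error rate, a panel of three is right less than 3% of the time, which is worse than the 10% achieved by a single juror.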

The two models differ in the way that the outputs of the parts (the individual decisions) are combined to produce the collective result. One way to make this more concrete is to ask what kind of function is applied to the set of individual contributions in order to produce the final result. Let’s assume that a correctly informed individual or correctly functioning part produces a 1, and a misinformed individual or malfunctioning part produces a 0. Wimsatt’s function simply picks the maximum of all outputs, so even if only a single 1 is being output, the collective output is 1. Condorcet’s function picks the mode of the set of outputs: the one which is most commonly produced. Clearly, these two functions, max and mode, are not equivalent. Yet both redundancy (Wimsatt’s parallel strategy) and group decision-making with majority voting are ways of increasing the reliability of a process, making it more robust to the fragility of the parts. So there is more than one way of making something robust, and the particular method used has implications for whether or not adding more parts will increase the robustness.
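
The difference between the two aggregation rules can be made concrete in a few lines. This is a minimal sketch, assuming outputs are coded as 1 (correct) and 0 (incorrect) as above and that the number of outputs is odd, so the mode is unambiguous; the function names are hypothetical.

```python
from statistics import mode


def redundancy_output(outputs: list[int]) -> int:
    """Aggregation by max: one working part (a single 1) is enough."""
    return max(outputs)


def majority_output(outputs: list[int]) -> int:
    """Aggregation by mode: the most common output wins.

    Assumes an odd number of 0/1 outputs so that there is a unique mode.
    """
    return mode(outputs)


if __name__ == "__main__":
    outputs = [1, 0, 0]  # one part works, two fail
    print(redundancy_output(outputs))  # the single success carries the day
    print(majority_output(outputs))    # the two failures outvote it
```

For the same set of individual outputs, one success and two failures, the redundant system succeeds while the majority vote fails.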

This case also highlights that we should be careful interpreting what is meant by “independence” in these cases. The probabilities in both Wimsatt’s parallel strategy and Condorcet’s voting aggregation are independent, in the sense that they take on their values independently of one another. But with the parallel strategy, it is also the case that each path is independently sufficient for the strategy to succeed. With this difference in mind, I think it likely that many cases of robustness resemble Condorcet’s model more than Wimsatt’s model, especially when we think of robust detection. It is often the weight of this evidence, all pointing to the same thing, but derived in different ways, that sways us to believing some conclusion. Each line of evidence may be independent, but none is independently sufficient to warrant the conclusion by itself. In many situations then, the engineering analogy of redundancy is inappropriate; robustness will rely on something more like Condorcet’s model.

Alternative structures

This discussion is not meant to (nor does it) show that robustness is problematic. It does suggest, however, that there are distinct ways that robustness might be underwritten, and these differences have implications for our assumptions about robustness. In this case, one of Wimsatt’s assertions about robustness—that increasing the number of individual lines of evidence is always good—breaks down when the result we are looking for depends on aggregating the results from the individual parts, rather than assuming that each was independently sufficient to produce a decision.

The idea that robustness may be underwritten in different ways also arises in discussion of robust phenomena. Redundancy is sometimes meant as the simple duplication of parts—a straightforward way of ensuring that a system is robust is to simply have multiple copies of the same parts. This kind of redundancy can occur in biological systems, for example, where duplicated genes can stand in for one another. But Andreas Wagner argues that this method of making things robust may well be less important in biological systems than something he calls “distributed robustness”:

In distributed robustness, many parts of a system contribute to system function, but all of these parts have different roles. When one part fails or is changed through mutations, the system can compensate for this failure, but not because a “back-up” redundant part takes over the failed part’s role. (Wagner 2007).

Wagner provides a number of biological examples of complex metabolic and genetic networks, which don’t have duplicate parts, yet still manage to be very robust. A model from Von Dassow and his collaborators is particularly impressive, for it shows that the genetic network being modelled continues to function normally even when many of the concentrations of chemicals important to the network are increased by an order of magnitude (von Dassow et al. 2000).

Although Wagner looks closely at trying to distinguish between these ways that robustness can be produced, he has little to say about how distributed robustness might actually work. One possibility is that the process is implemented by a web of negative and positive feedback that functions as an internal monitoring system. The knock-out of a functional pathway is detected by the system, and compensated for elsewhere. The key to robustness in this case is some kind of homeostatic feedback device, a more complex version of the familiar thermostat. Another possibility is that the result is the accumulation of many smaller sub-operations, and is robust in the same way that many artificial neural networks are: functionality degrades gracefully, rather than simply breaking down, for the burden of functionality is shared across a large number of operational parts, each contributing in some small part. Both these options are different from the straightforward duplication of parts suggested by Wimsatt’s engineering analogy.


If Wimsatt is right about the importance of robustness for both doing and understanding philosophy and science, then in order for it to take a more central role we need a clear picture of just what robustness is and how it works, when it applies, and what good it does. What I’ve tried to show in the previous sections is that there is no single answer to these questions. Robustness is a family of related concepts; which one is in play depends on the task at hand.

More importantly, the brief exploration of the concept given here suggests that the structure of robustness still requires much clarification. How, for example, might the structure of distributed robustness found in biology relate to the structure of robustness for theorems or detection? We can also ask what interactions there are between the various roles that these different kinds of robustness play. For example, what interactions are there between robust phenomena and their detection? This question is relevant to Wimsatt’s work on levels of organisation, where the questions of robust detection and robust phenomena are not well separated. More work needs to be done to clarify this relationship.

This analysis of robustness is, I hope, very much in the spirit of Wimsatt’s own vision of how to conduct philosophy. For according to Wimsatt, an important part of robustness analysis itself is to determine the scope and failures of invariance (Wimsatt 2007, p. 44). Much of this review has been spent trying to ascertain when and how robustness applies, and what commonalities are shared when it appears in its various guises. The results are not simple. I suspect this would come as no surprise to Wimsatt.

Copyright information

© Springer Science+Business Media B.V. 2010