Another Way to Detect Design? Part 1
1. Overview of the Design Inference
Darwin began his Origin of Species with the commonsense recognition that artificial selection in animal and plant breeding experiments is capable of directing organismal variation. A design-theoretic research program likewise begins with a commonsense recognition, in this case that humans draw design inferences routinely in ordinary life, explaining some things in terms of blind, undirected causes and other things in terms of intelligence or design. For instance, archeologists attribute rock formations in one case to erosion and in another to design, as with the megaliths at Stonehenge.
Having massaged our intuitions, Darwin next formalized and extended our commonsense understanding of artificial selection. Specifically, Darwin proposed natural selection as a general mechanism to account for how organismal variation gets channeled in the production of new species. Similarly, a design-theoretic research program next formalizes and extends our commonsense understanding of design inferences so that they can be rigorously applied in scientific investigation. My codification of design inferences as an extension of Fisherian hypothesis testing provides one such formalization. At the heart of my codification is the notion of specified complexity, which is a statistical and complexity-theoretic concept. An object, event, or structure exhibits specified complexity if it matches a highly improbable, independently given pattern.
Think of the signal that convinced the radio astronomers in the movie Contact that they had found an extraterrestrial intelligence. The signal was a long sequence of prime numbers. On account of its length the signal was complex and could not be assimilated to any natural regularity. And yet on account of its arithmetic properties it matched an objective, independently given pattern. The signal was thus both complex and specified. What’s more, the combination of complexity and specification convincingly pointed those astronomers to an extraterrestrial intelligence. Design theorists contend that specified complexity is a reliable indicator of design, is instantiated in certain (though by no means all) biological structures, and lies beyond the remit of nature to generate it.
To say that specified complexity lies beyond the remit of nature to generate it is not to say that naturally occurring systems cannot exhibit specified complexity or that natural processes cannot serve as a conduit for specified complexity. Naturally occurring systems can exhibit specified complexity and nature operating unassisted can take preexisting specified complexity and shuffle it around. But that is not the point. The point is whether nature (conceived as a closed system of blind, unbroken natural causes) can generate specified complexity in the sense of originating it when previously there was none. Take, for instance, a Dürer woodcut. It arose by mechanically impressing an inked woodblock on paper. The Dürer woodcut exhibits specified complexity. But the mechanical application of ink to paper via a woodblock does not account for that specified complexity in the woodcut. The specified complexity in the woodcut must be referred back to the specified complexity in the woodblock which in turn must be referred back to the designing activity of Dürer himself (in this case deliberately chiseling the woodblock). Specified complexity’s causal chains end not with nature but with a designing intelligence.
To employ specified complexity as a criterion for detecting design remains controversial. The philosophy of science community, wedded as it is to a Bayesian (or more generally likelihood) approach to probabilities, is still not convinced that my account of specified complexity is even coherent. The Darwinian community, convinced that the Darwinian mechanism can do all the design work in biology, regards specified complexity as an unexpected vindication of Darwinism — that’s just what the Darwinian mechanism does, we are told, to wit, generate specified complexity. On the other hand, mathematicians and statisticians have tended to be more generous toward my work and to regard it as an interesting contribution to the study of randomness. Perhaps the best reception of my work has come from engineers and the defense industry looking for ways to apply specified complexity in pattern matching (for instance, I’ve even been approached on behalf of DARPA to assist in developing techniques for tracking terrorists). The final verdict is not in. Indeed, the discussion has barely begun. In this talk, I will address the concern that specified complexity is not even a coherent probabilistic notion. In my next talk, I will show that the Darwinian mechanism is incapable of doing the design work that biologists routinely attribute to it.
Detecting design by means of specified complexity constitutes a straightforward extension of Fisherian significance testing. In Fisher’s approach to significance testing, a chance hypothesis is eliminated provided an event falls within a pre-specified rejection region and provided that rejection region has small probability with respect to the chance hypothesis under consideration. The picture here is of an arrow landing in the target. Provided the target is small enough, chance cannot plausibly explain the arrow hitting the target. Of course, the target must be given independently of the arrow’s trajectory. Movable targets that can be adjusted after the arrow has landed will not do (one can’t, for instance, paint a target around the arrow after it has landed).
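The logic of a pre-specified rejection region can be illustrated with a minimal sketch. It assumes a chance hypothesis of n tosses of a fair coin and a rejection region of the form "at least k heads." The function names, the threshold k, and the significance level alpha are illustrative choices made here, not values taken from Fisher or from the argument above.

```python
from fractions import Fraction
from math import comb


def tail_probability(n: int, k: int) -> Fraction:
    """Probability of at least k heads in n tosses of a fair coin."""
    favorable = sum(comb(n, j) for j in range(k, n + 1))
    return Fraction(favorable, 2 ** n)


def eliminates_chance(n: int, k: int, observed_heads: int, alpha: Fraction) -> bool:
    """Fisher-style test with the region {heads >= k} fixed before the data.

    Chance is eliminated only if the region is small (probability <= alpha)
    and the observed outcome falls inside it.
    """
    region_is_small = tail_probability(n, k) <= alpha
    outcome_in_region = observed_heads >= k
    return region_is_small and outcome_in_region


if __name__ == "__main__":
    alpha = Fraction(1, 100)  # assumed significance level
    n, k = 20, 17             # assumed region: at least 17 heads in 20 tosses
    print(float(tail_probability(n, k)))
    print(eliminates_chance(n, k, observed_heads=18, alpha=alpha))
    print(eliminates_chance(n, k, observed_heads=12, alpha=alpha))
```

The target (the region) is fixed by n, k, and alpha before any outcome is examined. The observed count enters only in the final comparison, which is the sense in which the target is given independently of the arrow.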
In extending Fisher’s approach to hypothesis testing, the design inference generalizes the types of rejection regions capable of eliminating chance. In Fisher’s approach, to eliminate chance because an event falls within a rejection region, that rejection region must be identified prior to the occurrence of the event. This is to avoid the familiar problem known among statisticians as “data snooping” or “cherry picking,” in which a pattern is imposed on an event after the fact. Requiring the rejection region to be set prior to the occurrence of an event safeguards against attributing patterns to the event that are factitious and that do not properly preclude its occurrence by chance.
This safeguard, however, is unduly restrictive. In cryptography, for instance, a pattern that breaks a cryptosystem (known as a cryptographic key) is identified after the fact (i.e., after one has listened in and recorded an enemy communication). Nonetheless, once the key is discovered, there is no doubt that the intercepted communication was not random but rather a message with semantic content and therefore designed. In contrast to statistics, which always identifies its patterns before an experiment is performed, cryptanalysis must discover its patterns after the fact. In both instances, however, the patterns are suitable for inferring design. Patterns suitable for inferring design I call specifications. A full account of specifications can be found in my book The Design Inference as well as in my forthcoming book No Free Lunch. Suffice it to say, specifications constitute a straightforward generalization of Fisher’s rejection regions.
2. Design by Comparison
By employing rejection regions, the design inference takes an eliminative approach to detecting design. But design detection can also be approached comparatively. In their review for Philosophy of Science of my book The Design Inference, Branden Fitelson, Christopher Stephens, and Elliott Sober argue that my eliminative approach to detecting design is defective and offer an alternative comparative approach that they regard as superior. According to them, if design is to pass scientific muster, it must be able to generate predictions about observables. By contrast, they see my approach as establishing the plausibility of design “merely by criticizing alternatives.”
Sober’s critique of my work on design did not end with this review. In his 1999 presidential address to the American Philosophical Association, Sober presented a paper titled “Testability.” In the first half of that paper, he laid out what he regards as the proper approach for testing scientific hypotheses, namely, a comparative, likelihood approach in which hypotheses are confirmed to the degree that they render observations probable. In the second half of that paper he showed how the approach I develop for detecting design in The Design Inference diverges from this likelihood approach. Sober concluded that my approach to detecting design renders design untestable and therefore unscientific.
The likelihood approach that Sober and his colleagues advocate was familiar to me before I wrote The Design Inference. I found that approach to detecting design inadequate then and I still do. Sober’s likelihood approach is a comparative approach to detecting design. In that approach, all hypotheses are treated as chance hypotheses in the sense that they confer probabilities on states of affairs. Thus, in a competition between a design hypothesis and other hypotheses, design is confirmed by determining whether and the degree to which the design hypothesis confers greater probability than the others on a given state of affairs.
The likelihood approach of Sober and colleagues has a crucial place in Bayesian decision theory. The likelihood approach is concerned with the relative degree to which states of affairs confirm hypotheses, as measured by likelihood ratios. Bayesian decision theory, in addition, focuses on how prior probabilities attached to those hypotheses recalibrate the likelihood ratios and thereby reapportion our belief in those hypotheses. Briefly, likelihoods measure strength of evidence irrespective of background knowledge. Bayesianism also factors in our background knowledge and thus characterizes not just strength of evidence but also degree of conviction.
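In symbols, the relation between likelihoods and priors can be sketched with the standard odds form of Bayes's theorem. The notation is an assumption introduced here for illustration: H_1 and H_2 stand for two competing hypotheses and E for the state of affairs to be explained.

```latex
\[
\underbrace{\frac{P(H_1 \mid E)}{P(H_2 \mid E)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(E \mid H_1)}{P(E \mid H_2)}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}}
\]
```

The likelihood approach attends only to the likelihood ratio. Bayesian decision theory multiplies in the prior odds as well, which is how background knowledge enters.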
Sober’s likelihood approach parallels his preferred model of scientific explanation, known as inference to the best explanation (IBE), in which a “best explanation” always presupposes at least two competing explanations. Inference to the best explanation eliminates hypotheses not by eliminating them individually but by setting them against each other and determining which comes out on top. But why should eliminating a chance hypothesis always require additional chance hypotheses that compete with it? Certainly, this is not a requirement for eliminating hypotheses generally. Consider the following hypothesis: “The moon is made of cheese.” One does not need additional hypotheses (e.g., “The moon is a great ball of nylon”) to eliminate the moon-is-made-of-cheese hypothesis. There are plenty of hypotheses that we eliminate in isolation, and for which additional competing hypotheses do nothing to assist in eliminating them. Indeed, often with scientific problems we are fortunate if we can offer even a single hypothesis as a proposed solution (How many alternatives were there to Newtonian mechanics when Newton proposed it?). What’s more, a proposed solution may be so poor and unacceptable that it can rightly be eliminated without proposing an alternative. It is not a requirement of logic that eliminating a hypothesis means superseding it.
But is there something special about chance hypotheses? Sober advocates a likelihood approach to evaluating chance hypotheses according to which a hypothesis is confirmed to the degree that it confers increasing probability on a known state of affairs. Unlike Fisher’s approach, the likelihood approach has no need for significance levels or small probabilities. What matters is the relative assignment of probabilities, not their absolute value. Also, unlike Fisher’s approach (which is purely eliminative, eliminating a chance hypothesis without accepting another), the likelihood approach focuses on finding a hypothesis that confers maximum probability, thus making the elimination of hypotheses always a by-product of finding a better hypothesis. But there are problems with the likelihood approach, problems that severely limit its scope and prevent it from becoming the universal instrument for adjudicating among chance hypotheses that Sober intends. Indeed, the likelihood approach is necessarily parasitic on Fisher’s approach and can properly adjudicate only among hypotheses that Fisher’s approach has thus far failed to eliminate.
To see this, consider the supposed improvement that a likelihood analysis brings to one of my key examples for motivating the design inference, namely, the “Caputo case.” Nicholas Caputo, a county clerk from New Jersey, was charged with cheating (before the New Jersey Supreme Court, no less) because he gave the preferred ballot line to Democrats over Republicans 40 out of 41 times (the improbability here is that of flipping a fair coin 41 times and getting 40 heads). I give a detailed analysis of the Caputo case in The Design Inference and extend the analysis in my forthcoming book No Free Lunch. For now, however, I want to focus on Sober’s analysis. Here it is, in full detail:
There is a straightforward reason for thinking that the observed outcomes favor Design over Chance. If Caputo had allowed his political allegiance to guide his arrangement of ballots, you would expect Democrats to be listed first on all or almost all of the ballots. However, if Caputo did the equivalent of tossing a fair coin, the outcome he obtained would be very surprising. 
Such an analysis does not go far enough. To see this, take Caputo’s actual ballot line selection and call that event E. E consists of 41 selections of Democrats and Republicans in some particular sequence with Democrats outnumbering Republicans 40 to 1 (say twenty-two Democrats, followed by one Republican, followed by eighteen Democrats). If Democrats and Republicans are equally likely, this event has probability 2^(-41) or approximately 1 in 2 trillion. Improbable, yes, but by itself not enough to implicate Caputo in cheating. What, then, additionally do we need to confirm cheating (and thereby design)? To implicate Caputo in cheating it’s not enough merely to note a preponderance of Democrats over Republicans in some sequence of ballot line selections. Rather, one must also note that a preponderance as extreme as this is highly unlikely. A crucial distinction needs to be made here. Probabilists distinguish between outcomes or elementary events on the one hand and composite events on the other. To roll a six with a single die is an outcome or elementary event. On the other hand, to roll an even number with a single die is a composite event that includes (or subsumes) the outcome of rolling a six, but also includes rolling a four or two.
In the Caputo case, it’s not the event E (Caputo’s actual ballot line selections) whose improbability the likelihood theorist needs to compute but the composite event E* consisting of all possible ballot line selections that exhibit at least as many Democrats as Caputo selected. This composite event, E*, consists of 42 possible ballot line selections and has an improbability of roughly 1 in 50 billion. It’s this event and this improbability on which the New Jersey Supreme Court rightly focused. Moreover, it’s this event that the likelihood theorist needs to identify and whose probability the likelihood theorist needs to compute to perform a likelihood analysis (Sober and colleagues concede this point when they admit that the event of interest is one in which Democrats are “listed first on all or almost all of the ballots”; in other words, what matters is not the exact sequence of Caputo’s ballot line selections but all sequences that are as extreme as his).
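The two probabilities in the Caputo case can be checked with a short calculation. The sketch below assumes, as the discussion does, that each of the 41 selections behaves like an independent toss of a fair coin; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

SELECTIONS = 41        # ballot line drawings
DEMOCRATS_FIRST = 40   # drawings in which Democrats received the top line

# Elementary event E: one particular sequence of 41 selections.
p_elementary = Fraction(1, 2 ** SELECTIONS)

# Composite event E*: every sequence with at least 40 Democrats.
sequences_in_composite = sum(
    comb(SELECTIONS, d) for d in range(DEMOCRATS_FIRST, SELECTIONS + 1)
)
p_composite = Fraction(sequences_in_composite, 2 ** SELECTIONS)

print(sequences_in_composite)     # number of sequences in E*
print(round(1 / p_elementary))    # "1 in N" for E, roughly 2.2 trillion
print(round(1 / p_composite))     # "1 in N" for E*, roughly 52 billion
```

Under these assumptions E* contains 42 sequences: the 41 with exactly one Republican plus the one with none. Its probability is 42/2^41, which is consistent with the rounded figure of 1 in 50 billion.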
But how does the likelihood theorist identify this event? Let’s be clear that observation never hands us composite events like E* but only elementary events like E (i.e., Caputo’s actual ballot line selection and not the ensemble of ballot line selections as extreme as Caputo’s). But whence this composite event? Within the Fisherian framework the answer is clear: E* is a pre-specified rejection region, in this case given by a test statistic that counts the number of Democrats selected over Republicans. That’s what the court used and that’s what likelihood theorists use. Likelihood theorists, however, offer no account of how they identify the composite events to which they assign probabilities. If the only events they ever considered were elementary events, there would be no problem. But that’s not the case. Likelihood theorists and Bayesians routinely consider composite events. In the case of Bayesian design inferences, those composite events are given by specifications. But how do Bayesian probabilists and likelihood theorists more generally individuate the events they employ to discriminate among chance hypotheses? This question never gets answered in Bayesian or likelihood terms. Yet it is absolutely crucial to their project.
Branden Fitelson, Christopher Stephens, and Elliott Sober, “How Not to Detect Design — Critical Notice: William A. Dembski, The Design Inference,” Philosophy of Science 66 (1999): 472-488.
Ibid., 487.
Elliott Sober, “Testability,” Proceedings and Addresses of the American Philosophical Association 73(2) (1999): 47-76.
In Sober’s account, probabilities are conferred on observations. Because states of affairs constitute a broader category than observations, in the interest of generality I prefer to characterize the likelihood approach in terms of probabilities conferred on states of affairs.
Elliott Sober, Philosophy of Biology (Boulder, Colo.: Westview, 1993), 30-36.
Fitelson et al., “How Not to Detect Design,” 475.