Sunday, December 20, 2009

Sober's causation

I have almost finished reading Elliott Sober's The Nature of Selection. It is a complex book, which inspired many marginal notes and a number of journal entries, and which I am sure I will need to come back to more than once to fully appreciate. With some temerity, perhaps, I have decided to address a couple of issues related to this book in my next couple of posts on this blog. This week, I am going to reflect on an idea about causality that Sober puts forth. Next week, hopefully, I will revisit the evolution of altruism, which I have discussed before (9/28/09).

Sober's causality claim relates specifically to what he calls "population level" causality, as distinguished from "individual level" causality. An example he uses to illustrate the difference: suppose a golfer is trying to sink a putt, and a squirrel runs by after he hits the ball and kicks it. Improbably, the ball deflects off some obstruction but sinks in the hole anyway. At the individual level, we would want to say that the squirrel's kick caused the ball to sink, because it started the chain of events that ended with the ball in the hole. But at the population level, we would not want to say that "squirrel kicks sink balls," because we are convinced that, usually, this would not happen.

On the population level, Sober first points out, non-controversially, that causality is not implied by correlation. My own favorite example illustrating this truism is the theory that fire fighters cause damage at fires. It is observed that damage at fires is positively correlated with the number of fire fighters that arrive at the scene. If correlation is taken as proof of causation, the conclusion is that fire fighters cause damage. In fact, of course, the number of fire fighters and the amount of damage are correlated because they have a common background cause - the intensity of the fire.
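This common-cause structure is easy to reproduce in a quick simulation. The sketch below is my own toy model, not anything from Sober, and the numbers are arbitrary: fire intensity drives both the number of firefighters dispatched and the damage done, and the two come out strongly correlated even though neither causes the other.

```python
import random

random.seed(0)

# Toy common-cause model (invented numbers): fire intensity is the hidden
# cause; firefighters dispatched and damage done each scale with it, plus
# independent noise. Neither variable causally influences the other.
n = 10_000
intensity = [random.uniform(0, 10) for _ in range(n)]
firefighters = [i + random.gauss(0, 1) for i in intensity]
damage = [2 * i + random.gauss(0, 2) for i in intensity]

def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Strongly positive, despite there being no causal arrow between them.
print(corr(firefighters, damage))
```

Conditioning on the common cause (comparing only fires of similar intensity) would make the correlation vanish, which is exactly the intuition behind the "background conditions" in what follows.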

Sober believes that by strengthening the criteria, it is possible to derive a probability-based definition of population-level causation. The rule he argues for is this: an event x is a (positive) causal factor of an event y if the probability of y given x is greater than or equal to the probability of y given (not-x) under all possible background conditions, and the inequality is a strict inequality in at least one condition. In other words (or, rather, in symbols):

x CF y <=> ∀z (P(y|x ⋅ z) ≥ P(y|!x ⋅ z)) ⋅ ∃z (P(y|x ⋅ z) > P(y|!x ⋅ z))

Here I am defining the relation "CF", read "x is a causal factor of y". I will be using the dot operator both for the logical "and" between propositions and for the probabilistic conjunction of events, and ! both for logical "not" and for negating an event (the event !x is the event that x does not occur). This should be contextually unambiguous. Hopefully your browser will display the symbols properly - if not, you may need to change your character set to "UTF-8". I've tested it with both Internet Explorer 7 and Firefox 3.5.6. (Firefox, annoyingly, sticks extra space above and below all equations, leading to very ugly page renderings.)

The reason for the universal quantifier in the first part of the above expression is interesting. Sober argues that it is not enough for a cause to raise the probability of its effect in most circumstances while lowering it in others (even in a minority of cases). If you allow any negative cases, he argues, the causality claim reduces to P(y|x) > P(y|!x), which simply represents correlation. So to have a definition of causality stronger than mere correlation, the first event must raise the probability of the second event, or be neutral, in all circumstances, and must positively raise it (strict inequality) in at least one.
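For a finite set of background conditions, the rule is easy to transcribe into code. The sketch below is my own, not anything from Sober's book, and both toy distributions are invented: the first satisfies the definition (x raises P(y) in one background condition and is neutral in the other), while the second fails it (x lowers P(y) in one background condition, even though it raises P(y) overall).

```python
# Sober's rule over a finite universe: the joint distribution is a dict
# mapping (x, z, y) boolean triples to probabilities, where z ranges over
# the background conditions.

def cond_prob(joint, y_val, x_val, z_val):
    """P(Y=y_val | X=x_val, Z=z_val), or None if the condition has probability 0."""
    denom = sum(p for (x, z, y), p in joint.items()
                if x == x_val and z == z_val)
    if denom == 0:
        return None
    num = sum(p for (x, z, y), p in joint.items()
              if x == x_val and z == z_val and y == y_val)
    return num / denom

def is_causal_factor(joint, z_values):
    """x CF y: P(y|x.z) >= P(y|!x.z) for every z, strictly for at least one."""
    weak, strict = True, False
    for z in z_values:
        p_with = cond_prob(joint, True, True, z)
        p_without = cond_prob(joint, True, False, z)
        if p_with is None or p_without is None:
            continue  # background condition never co-occurs with x (or !x)
        if p_with < p_without:
            weak = False
        if p_with > p_without:
            strict = True
    return weak and strict

def make_joint(p_y_given):
    """Assume X and Z independent and fair; p_y_given maps (x, z) -> P(Y=True)."""
    joint = {}
    for x in (True, False):
        for z in (True, False):
            p = p_y_given[(x, z)]
            joint[(x, z, True)] = 0.25 * p
            joint[(x, z, False)] = 0.25 * (1 - p)
    return joint

# x raises P(y) in one background condition, is neutral in the other: CF holds.
sober_yes = make_joint({(True, True): 0.8, (False, True): 0.3,
                        (True, False): 0.5, (False, False): 0.5})
# x raises P(y) in one background condition but lowers it in the other:
# CF fails, even though x raises P(y) overall (mere correlation).
sober_no = make_joint({(True, True): 0.9, (False, True): 0.3,
                       (True, False): 0.4, (False, False): 0.5})

print(is_causal_factor(sober_yes, [True, False]))  # True
print(is_causal_factor(sober_no, [True, False]))   # False
```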

Sober raises some issues with the above formulation regarding the causal independence of the background conditions from the factor being examined. Specifically, he requires that the background conditions (z) not be "causally relevant" (either positively or negatively) to the proposed cause being investigated (x). If they are, this leads to undefined conditional probabilities of the form P(y|x ⋅ !x). Conceptually, this represents a form of double counting. Recasting this requirement in the quantificational form gives something like:

x CF y <=> ∀z (z ∈ B => P(y|x ⋅ z) ≥ P(y|!x ⋅ z)) ⋅ ∃z (z ∈ B ⋅ (P(y|x ⋅ z) > P(y|!x ⋅ z)))

B = {z: !(x CF z) ⋅ !(z CF x) ⋅ !(x CF !z) ⋅ !(z CF !x) ⋅ (z ≠ x) ⋅ (z ≠ y)}

The last two inequalities are necessary to avoid undefined conditional probabilities in P(y|!x ⋅ x), and because the strict inequality P(y|x ⋅ y) > P(y|!x ⋅ y) is always false.

The set notation above is kind of nasty, because it carries the "free variables" x and y outside of the expression. But we can eliminate this notation by expanding the independence criterion in place (although it gets a little unwieldy):

x CF y <=> ∀z ((!(x CF z) ⋅ !(z CF x) ⋅ !(x CF !z) ⋅ !(z CF !x) ⋅ (z ≠ x) ⋅ (z ≠ y)) => P(y|x ⋅ z) ≥ P(y|!x ⋅ z))
⋅ ∃z ((!(x CF z) ⋅ !(z CF x) ⋅ !(x CF !z) ⋅ !(z CF !x) ⋅ (z ≠ x) ⋅ (z ≠ y)) ⋅ (P(y|x ⋅ z) > P(y|!x ⋅ z)))

Now that certainly looks circular. I hesitate only because I'm not 100% certain that it is impossible to iteratively expand the "CF" terms, at least in a finite universe. I took a stab at it in a universe containing only 4 events {x, y, B1, B2}, but the problem quickly exceeded my limited powers of symbolic manipulation. But, anyway, I think the definition is circular.

Actually, I don't know why I bother to fret over it, since Sober himself admits that it is a circular definition - but he argues that it is a useful definition anyway. He references a 1979 Noûs article by N. Cartwright, which supposedly goes into this in more depth. It would be interesting to read that, but I have no easy way of tracking it down.

I am not going to dispute that a definition can be conceptually useful, even if circular, outside certain strictly formal contexts. But I think we need to ask, in each case, where the circularity comes from and why it is necessary or useful. In this case, I have a suspicion that it is because we have an underlying, intuitive definition of causation that has nothing to do with the definition being attempted here. This is also reflected in my sense that this idea is only useful if we "prune" it somehow, as suggested by my reference to a "finite universe" above. For another example of pruning, I think we are only interested in background conditions that have some causal effect themselves - totally neutral conditions are not interesting. In other words:

B = {z: !(x CF z) ⋅ !(z CF x) ⋅ !(x CF !z) ⋅ !(z CF !x) ⋅ (z ≠ x) ⋅ (z ≠ y) ⋅ (z CF y)}

But how do we prune the universe of possible events, other than by applying some other, a priori, theory of causation? And in that case, how is Sober's causation test any different than just using correlation as an empirical test of the a priori theory?

I'd be the first to admit that my reasoning above is a little mushy. But a specific example, I think, shows that Sober's definition doesn't quite jibe with our intuitive ideas about causation, and may not ultimately be satisfactory as a definition of population-level causation.

Imagine a rectangular pool table, with the long axis oriented north-south. From time to time, billiard balls are introduced approximately on the 1/3rd line (the imaginary line dividing the southern 1/3 of the table from the northern 2/3). Some of these balls will be struck with a cue. The horizontal angle at which the cue strikes the ball is normally distributed such that 90% of the variation is within ± 70 degrees of the mean, which points north. There is friction in the table, and spin (variation in the incident angle of the cue with respect to the radial angles from the center of the ball to the point of impact), and the impulse imparted by the cue is finite, so a ball may strike a side wall, or another obstruction, and come to rest before hitting the north wall. As balls accumulate, they may strike, or be struck by, other balls. Impacts are approximately elastic (with frictional/damping losses).

Additionally, a number of bumpers are introduced between the 1/3rd line and the north wall. The precise position of the bumpers is varied from time to time. Usually, when a ball strikes a bumper, it will be a glancing blow, and the ball will continue in a generally northerly direction; occasionally, however, the impact will be square enough that the ball rebounds to the south. This rebound may be sufficient to carry the ball south of the 1/3rd line (e.g., if the bumper is close to the line).

Finally, the entire table will be lifted and tilted, from time to time (but infrequently), either to north or to south, with the conditional probability of a southward tilt, given that a tilt occurs, equal to 50%. The tilt is of finite (temporal) duration - i.e., there is a probability greater than zero but less than 1 that a ball on the table will reach the south wall during a south tilt. Note that any ball which has ended up south of the 1/3rd line due to a rebound has a greater probability of touching the south wall during a southward tilt than it did when it was first introduced into the game.

When a ball strikes either the north or south wall, it is removed from the game. Its probability of striking the other wall, at that point, is zero.

It is hard to deny that, in this game, the impact of the cue is a "causal factor" in increasing the percentage of the ball population that touches the north wall, even though there are some members of "B" (combinations of bumper location, ball position, cue angle, and other factors) under which the impact will, in fact, reduce the probability of reaching the north wall. But by Sober's definition of causality, we would have to deny it.
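To make the structure of the counterexample concrete, here is a deliberately stripped-down toy model of the game. All the probabilities are invented placeholders of mine, not derived from any real table physics; the point is only that a single background condition - whether a bumper sits square in the ball's path near the line - is enough to break the universally-quantified clause while leaving the overall effect strongly positive.

```python
import random

random.seed(1)

# Invented placeholder probabilities, chosen only to reproduce the
# qualitative structure of the example:
P_SQUARE_BUMPER = 0.05        # background condition z: bumper square in the path
P_NORTH_UNSTRUCK = 0.30       # an unstruck ball drifts north only via tilts
P_NORTH_STRUCK_CLEAR = 0.90   # struck ball, no square bumper: usually reaches north
P_NORTH_STRUCK_SQUARE = 0.10  # struck ball rebounds south of the line; tilts
                              # then make the south wall more likely

def reaches_north(struck):
    """One ball's fate: does it end by touching the north wall?"""
    square = random.random() < P_SQUARE_BUMPER
    if not struck:
        p = P_NORTH_UNSTRUCK
    elif square:
        p = P_NORTH_STRUCK_SQUARE
    else:
        p = P_NORTH_STRUCK_CLEAR
    return random.random() < p

n = 100_000
struck_rate = sum(reaches_north(True) for _ in range(n)) / n
unstruck_rate = sum(reaches_north(False) for _ in range(n)) / n
print(struck_rate, unstruck_rate)  # the strike raises the rate overall...
# ...yet under the square-bumper condition, P(north | struck) = 0.10 is
# below P(north | unstruck) = 0.30, so Sober's "for all z" clause fails.
```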

I find myself inclined to throw out Sober's definition, or rather, to view it only as a sort of statistical test of some underlying idea of causality. I'm inclined, further, to view population-level causation as just an aggregation of individual causation-events (including those that have actually occurred, plus hypotheticals). "Squirrel kicks cause balls to sink sometimes, but most of the time they don't." So squirrel kicks are not considered a "cause" of successful putts, at the population level.

I don't think this has a negative effect on Sober's substantive arguments about the nature of selection. His arguments about "units of selection", for example, depend on the distinction between "selection of" some kind of entity and "selection for" some specific quality or trait. The question is, at what level does the cause of the selection operate? One (admittedly artificial) example he uses for illustration is to postulate several groups of otherwise similar organisms which are homogeneous within each group with respect to some quality - say tallness - but vary between groups. Suppose some predator differentially picks off the shorter organisms. Is this an example of group selection (the predator is selecting organisms from groups of short organisms), or individual selection (the predator selects shorter organisms)? The question cannot be resolved strictly by looking at results, because in either case there is selection of the same organisms. One needs to look at the cause - what trait is being selected for? Does the predator simply favor shorter animals? Or does it avoid tall groups of animals? A test of the causal assumptions would be to examine what would happen if a shorter organism happened to be found in a taller group. Would it be subject to the same level of predation as if it were in a group of small organisms? In that case, the individual selection model would be supported. Or would it have the same security as its taller group members? In that case, a group selection mechanism is indicated. Of course, this test might be empirically impossible if this were a real-life example, but it illustrates the role that a concept of causality plays in determining the unit of selection. I see no way, though, in which this argument depends on Sober's specific formulation of his law for population-level causality.
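The transplant test can be put in code form. The sketch below is my own toy model, not Sober's; the heights, the 1.0 threshold, and the 0.6/0.1 predation probabilities are all invented. The two selection hypotheses agree when short organisms live in short groups, and come apart only when a short organism is transplanted into a tall group - which is exactly what makes the test diagnostic.

```python
# Toy version of the transplant test (all parameters invented for illustration).

def predation_prob(own_height, group_mean_height, mode):
    """Probability the predator takes this organism, under two hypotheses."""
    if mode == "individual":
        # Individual selection: only the organism's own height matters.
        return 0.6 if own_height < 1.0 else 0.1
    else:
        # Group selection: only the mean height of its group matters.
        return 0.6 if group_mean_height < 1.0 else 0.1

# The test: transplant a short organism (height 0.5) from a short group
# (mean 0.5) into a tall group (mean 1.5) and see whether its risk changes.
for mode in ("individual", "group"):
    in_short_group = predation_prob(0.5, 0.5, mode)
    in_tall_group = predation_prob(0.5, 1.5, mode)
    print(mode, in_short_group, in_tall_group)
# Individual selection: risk stays at 0.6 in both groups.
# Group selection: risk drops to 0.1 in the tall group.
```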


P.S. I am not 100% certain how I feel about the explanatory necessity of the concept of "causation". For instance, if I say "an object subject to a given force F will experience an acceleration proportional to its mass", does it add anything useful to the explanation to say "the force causes the object to accelerate"? The idea of cause is important to Sober, and he bases a lot of his "units of selection" arguments on the concept. I have no dogmatic objection to this, but I can't help but wonder if the concept of causation isn't somehow reducible.
