Probability Kinematics and Probability Dynamics
Richard Jeffrey developed the formula for probability kinematics with the intent that it would show that strong foundations are epistemologically unnecessary. But the reasons that support strong foundationalism are considerations of dynamics rather than kinematics. The strong foundationalist is concerned with the origin of epistemic force; showing how epistemic force is propagated therefore cannot undermine his position. The weakness of personalism is evident in the difficulty the personalist has in giving a principled answer to the question of when the conditions for the application of the kinematic formula--the rigidity of the posteriors--are fulfilled, a problem made intractable by the personalist commitment to treating changes in intermediate probability as unexplained surds. Because the strong foundationalist admits changes in the intermediate probability of propositions only when there is some change in the foundations, he can avail himself of an answer to the problem of the rigidity of the posteriors which the personalist cannot regard as complete. While probability kinematics does not make certain foundations unnecessary, the possession of certain foundations also does not make the probability kinematics formula superfluous. The formula allows us to model the indirect routes by which the foundations influence various non-foundational propositions in the probability distribution.
I. KINEMATICS AND DYNAMICS
In the course of his famous exchange with Isaac Levi, Richard Jeffrey makes explicit the physics analogy behind the coinage "probability kinematics."
In Physics, Dynamics is a contrary of Kinematics as well as of Statics: it is the first contrariety that I had in mind when I called Chapter 11 of The Logic of Decision, 'Probability Kinematics'. Take a see-saw, with fulcrum 2/3 of the way toward your end. If you push your end down two feet, the other end will go up three. That is kinematics: You talk about the propagation of motions throughout a system in terms of such constraints as rigidity and manner of linkage... When you talk about forces--causes of accelerations--you are in the realm of dynamics. (Jeffrey 1970: 172)
Jeffrey's point is that his formula for updating on a change in the probability of uncertain evidence is merely kinematic; it tells about the propagation of evidential force but nothing about the "causes of accelerations," i.e., about where such changes in intermediate probability come from or about whether they are rational.
This, I think, is quite correct, and it can be seen to be correct by the very nature of Jeffrey’s formula. Where prob is one's old probability distribution and PROB is one's new distribution, the formula for Jeffrey Conditioning (JC) is as follows:
PROB (H) = PROB (E) prob (H|E) + PROB (~E) prob (H|~E)
The formula is applicable if and only if PROB (H|±E) = prob (H|±E), that is, if and only if the posterior probabilities of H given E and of H given ~E are the same in both distributions. This is known as the rigidity condition, and when it is satisfied it guarantees that the formula is simply a substitution instance of the Theorem on Total Probability. But nothing here tells us how E gets its new probability or whether that new probability is rational. The formula tells us only how the new probability of some uncertain E can be propagated to H. Hence the formula, while entirely unobjectionable from a logical point of view, is in Jeffrey's terms kinematic rather than dynamic.
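The kinematic character of the formula can be seen in a small numerical sketch. The function name jeffrey_update and all of the numbers are illustrative assumptions of mine, not anything found in Jeffrey. The sketch takes PROB (E) as a bare input, takes rigidity for granted, and returns PROB (H).

```python
def jeffrey_update(new_p_e, old_h_given_e, old_h_given_not_e):
    """Return PROB(H) from PROB(E) and the old posteriors.

    Assumes rigidity: the old posteriors prob(H|E) and prob(H|~E)
    are taken to be unchanged in the new distribution.
    """
    if not 0.0 <= new_p_e <= 1.0:
        raise ValueError("PROB(E) must lie in [0, 1]")
    return new_p_e * old_h_given_e + (1.0 - new_p_e) * old_h_given_not_e


# Hypothetical values, chosen only for illustration.
old_h_given_e = 0.8      # prob(H|E)
old_h_given_not_e = 0.2  # prob(H|~E)
old_p_e = 0.3            # prob(E)
new_p_e = 0.7            # PROB(E), supplied from outside the formula

# With the old value of E the formula is just the Theorem on Total Probability.
old_p_h = jeffrey_update(old_p_e, old_h_given_e, old_h_given_not_e)

# With the new value of E it propagates the shift to H.
new_p_h = jeffrey_update(new_p_e, old_h_given_e, old_h_given_not_e)

# When PROB(E) = 1 the formula reduces to ordinary conditioning on E.
certain_p_h = jeffrey_update(1.0, old_h_given_e, old_h_given_not_e)

print(old_p_h, new_p_h, certain_p_h)
```

Nothing in the function says where the value 0.7 comes from or whether it is rational to hold it. The shift in E enters as an argument, and the formula only carries it through to H.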
But Jeffrey was apparently not satisfied with according this modest status to his formula. From the outset, he presented it as having dynamic relevance and, specifically, as counting "against" the strong foundationalism of C. I. Lewis. Jeffrey states this motivation for probability kinematics explicitly when he discusses the history of the idea:
Probability kinematics was first introduced for an in-house philosophical purpose: to show how, in principle, all knowledge might be merely probable, in the face of a priori arguments to the contrary, e.g., those of C. I. Lewis..., who saw conditioning as the only reasonable way to modify judgmental probabilities by experience.... Using probability kinematics, I aimed to show...how the familiar language of objective statement needed no supplementation by what C. I. Lewis...called "the expressive use of language..." (Jeffrey 1992: 135-6)
In other words, Jeffrey takes it that if probability kinematics is sound, we do not need (and we certainly need not restrict ourselves to) certain evidence. (See also Jeffrey 1972: 97; 2004: 55-60.) Conversely, he seems to have assumed (1965: 165) that if we do have certain evidence, probability kinematics is unnecessary. On this view, probability kinematics and Lewis-style foundations are in a competition where each would render the other unnecessary.
Both of these competitive assumptions are false. The fact that Jeffrey's Rule is probabilistically unimpeachable does nothing to remove the sorts of considerations that motivated Lewis. And the availability of foundational certainties as evidence does not render JC superfluous. I shall focus most of all on the first of these claims, and I shall do so by way of considering a puzzle for JC--the difficulty in ascertaining when the rigidity condition is met.
II. THE PROBLEM: NORMATIVITY AND RIGIDITY
JC can be used to update the probability of H when the uncertain probability value for some evidence E has changed, but only when the probabilities of H given E and given ~E are the same in the old and new distributions. But, as Judea Pearl (1988: 64) mildly points out, "[T]his condition...is not easy to test." Jeffrey himself (1965: 168) implied that the posteriors are rigid when H is not one of the propositions "directly" affected by some "passage of experience," but Pearl (1988: 66-7) shows that there can be situations where the posteriors are not rigid but where it does not seem correct to say that H is "directly" affected by the passage of experience.
Consider the following example of a case in which it is difficult to tell at first examination that the posterior probabilities are not rigid. Suppose that I am sitting in my kitchen one evening and have a "passage of experience" (to use Jeffrey's phrase) that raises the probability for me that the wind chimes just outside the door are ringing wildly in the wind. The experience, of course, is an auditory experience that sounds to me like the wild ringing of the chimes. Let E be "The chimes are ringing in a high wind." Let H be "A storm is coming." Let p refer to the auditory experiential evidence. Now, this would seem at first blush to be a straightforward case in which Jeffrey's rule applies. For I know by my background knowledge that high winds are for obvious meteorological reasons correlated with an on-coming storm. There should be no problem with using JC to update H using the new, higher probability for E.
But on some sets of additional background knowledge, the posteriors would not be rigid. Suppose, for example, that I also know that I have an unusual hearing condition that just occasionally causes me to have ringing in the ears when there is a drop in atmospheric pressure, and suppose I also know that this ringing in the ears, when it occurs, mimics the sound of my wind chimes. Since this condition affects me only occasionally and I still often really do hear the chimes ringing in the wind, I will reasonably have a higher probability for E than I had before the experience, despite my strange hearing condition. But prob (H|±E) on that background does not equal PROB (H|±E), for PROB (H) is higher than prob (H) even given that E is false--that is, even if the chimes are for some reason not ringing. Nor is it the case that H is "directly affected" by the passage of experience. Rather, it is affected by a different belief E', "The atmospheric pressure is dropping," which is itself affected by p and, for that reason, has a higher probability in PROB than it had in prob. In terms of Jeffrey's own philosophical commitments and his own description of the applicability of the JC formula, it is not possible to give a clear explanation for the fact that the posteriors are rigid if we have normal background information but cease to be rigid given the unusual background information about hearing.
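The contrast between the two backgrounds can be made concrete in a toy model. The sketch below departs from Jeffrey's own view in treating the auditory experience p as a proposition that can be conditioned on, so that PROB is prob conditional on p. Every number, the variable d (standing for E'), and the simplifying assumption that E and E' each depend only on H are hypothetical choices made for illustration.

```python
from itertools import product

# Worlds are tuples (h, e, d, p), all values hypothetical:
#   h: a storm is coming            e: the chimes are ringing in a high wind
#   d: atmospheric pressure drops   p: I have the chime-like auditory experience
# Simplifying assumption: e and d each depend only on h.


def build_joint(p_experience):
    """Build prob over (h, e, d, p); p_experience(e, d) gives P(p | e, d)."""
    joint = {}
    for h, e, d, p in product([True, False], repeat=4):
        weight = 0.3 if h else 0.7          # P(h)
        p_e = 0.8 if h else 0.2             # P(e | h)
        p_d = 0.9 if h else 0.1             # P(d | h)
        p_p = p_experience(e, d)            # P(p | e, d)
        weight *= p_e if e else 1 - p_e
        weight *= p_d if d else 1 - p_d
        weight *= p_p if p else 1 - p_p
        joint[(h, e, d, p)] = weight
    return joint


def prob(joint, event, given=lambda w: True):
    """Conditional probability of event given a condition, by enumeration."""
    total = sum(x for w, x in joint.items() if given(w))
    return sum(x for w, x in joint.items() if given(w) and event(w)) / total


def rigidity_gaps(joint):
    """Return (PROB(H|E) - prob(H|E), PROB(H|~E) - prob(H|~E)), PROB = prob(.|p)."""
    gaps = []
    for e_value in (True, False):
        old = prob(joint, lambda w: w[0], lambda w: w[1] == e_value)
        new = prob(joint, lambda w: w[0], lambda w: w[1] == e_value and w[3])
        gaps.append(new - old)
    return tuple(gaps)


# Normal background: the experience depends only on whether the chimes ring.
normal = build_joint(lambda e, d: 0.9 if e else 0.05)

# Unusual background: a pressure drop can also produce the chime-like ringing.
unusual = build_joint(lambda e, d: 0.9 if e else (0.3 if d else 0.05))

print("normal gaps: ", rigidity_gaps(normal))
print("unusual gaps:", rigidity_gaps(unusual))
```

On these assumed numbers both gaps vanish (up to rounding) for the normal background. On the unusual background the posterior of H given ~E rises once p is taken into account, while the probability of E itself still goes up.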
The trouble here lies not with JC itself but rather with the attempt to use JC to render strong foundations unnecessary. Jeffrey (1972: 99-100; 1965: 184-5) is adamant that the change in the probability of E is simply caused by the passage of experience and that the experience cannot be cashed out in propositional terms nor its relation to E modeled in probabilistic terms. This aspect of radical probabilism lies at the heart of the difficulty in stating clearly when the posteriors are rigid. For if the passage of experience bears a causal relationship to one's beliefs rather than an evidential relation, the change it induces in the distribution is a surd. It is neither predictable nor criticizable. So, in the example case, my auditory experience simply causes the changes in probability for E and E'; those changes are not the result of an evidential relationship between the experience and those uncertain propositions or between the experience and H.
The unpredictability point has been pressed by Mary Hesse (1974: 122-3), who argues from a pragmatic perspective that a confirmation theory is of no use unless it is possible to answer questions about "the effect of possible future evidence upon the probability of hypotheses." But, she points out, "...the physical observation...has an unpredictable causal effect upon the whole distribution."
In other words, if Jeffrey were willing for the new experience to bear an evidential relation to the rest of the distribution, this would leave open the possibility that the relation could be modeled in probabilistic terms and hence that the changes induced in the distribution, including the conditions for JC itself--the change in E and the rigidity of P(H|E)--could be predicted. But this possibility is ruled out by the purely causal and subjective nature of the change. Not surprisingly, dynamic considerations limit the relevance of kinematics.
Rudolf Carnap was concerned early on about the non-evidential nature of the new probability of uncertain evidence in Jeffrey's system and emphasized the apparent absence of rational constraints upon changes in probability in his 1957-58 correspondence with Jeffrey (quoted in Jeffrey 1975: 44): "You emphasize correctly that your ai [the new probability for some uncertain proposition] is behavioristically determinable. But this concerns only the factual question of the actual belief of A in ei. But A desires to have a rule which tells him what is the rational degree of belief."
Raising a concern similar to Carnap's, Isaac Levi (1970: 140-2) argues at some length that, given Jeffrey's personalism, JC has no normative force. Since a personalist holds that S could have chosen one of many coherent distributions at t1 and can choose one of many coherent distributions at t2, there is no reason to require him to update his probabilities from t1 to t2 by following Jeffrey's Rule, by asking himself (for example) which propositions are directly affected by some experience and which are not. S might just as well choose some new coherent distribution or other bearing no regular relation whatsoever to the former one.
Levi (1967: 200-5) goes so far as to argue that a "passage of experience" is unnecessary and that Jeffrey has no principled way to distinguish a probability shift induced by a relevant "passage of experience" from one induced by a change in blood chemistry. Since there is no genuine normativity in personalism to begin with, no dynamic norms as to what sorts of things ought to count as evidence and drive one's other probabilities, normativity cannot be created ex nihilo by the introduction of a kinematic rule for updating on a change in the probability of uncertain evidence.
Levi does not explicitly mention the issue of the rigidity of the posteriors, but that issue is related to his criticism. For if we restrict ourselves to considering Jeffrey's own examples involving some passage of experience which may affect the distribution in various ways, we may fool ourselves into thinking that this is a normative process. And if this were so, we might think that we could tell by unanalyzed intuition how the experience would, for the rational subject, affect the relevant posteriors and the other propositions in the distribution. In the example case, we imagine a subject who is rationally taking into account the auditory chiming experience in giving a new and higher probability to E, E', and H. This impression, in turn, lends some appearance of credibility to the claim that JC removes the motive for Lewis's evidential foundations, as experience still seems (despite Jeffrey's own insistence to the contrary) to be playing a quasi-evidential role. But when we keep clearly in mind that entirely arational factors can also cause a Jeffrey shift and that even experience causes such a shift rather than justifying it, all bets are off. Anything could happen to the posteriors. It is thus no wonder that we have, given personalism, no principled explanation for the fact that the posteriors are rigid in the case of ordinary background information and the auditory experience but are not rigid in the case of the unusual background information. We must simply accept that it is so, without any explanation in terms of underlying evidential structure. Thus, again, the problem is not with JC itself but with the attempt to wring blood from a stone--to get dynamic value out of a kinematic rule.
III. THE SOLUTION: CONDITIONING AND PRINCIPLED RIGIDITY
A solution to the problem of when the probability of E changes but the posteriors remain rigid is to be found, ironically, in the very foundationalism Jeffrey rejects. Pearl (1988: 64) points out that, if the "passage of experience" is treated as a piece of evidence e and if A is some hypothesis and Bi some intermediate-valued proposition in the distribution, then the probability of A given Bi will be rigid from the old to the new distribution only when A and e are conditionally independent modulo Bi, i.e. when
prob (A|Bi, e) = prob (A|Bi).
This relationship is sometimes described by saying that Bi screens off e from A.
It can be shown, further, that if the only change in evidence is the addition of the new observational evidence, then the screening off (hereafter SO) relation is sufficient as well as necessary for rigidity. The argument is as follows:
Take the definition of screening off according to which E SO p from H on some background k iff P(H|E & k) = P(H|E & k & p).
Suppose that k- is some body of given background evidence that does not include p. Now assume that, in the old evidence situation, p is not present. k- consists of S's given background information but does not include p. Then,
prob (H|E) =def. prob (H|E & k-)
(Note that this does not mean that E SO k- from H, as the k- is merely suppressed on the left side and expressed on the right.)
Assume that the only difference between prob and PROB is the addition of some new certain evidence p at probability 1. Hence
PROB (H|E) = prob (H|E & k- & p)
Suppose that E and H are intermediate-valued propositions in both probability distributions and that we are wondering whether we can use the JC formula to calculate PROB (H) using PROB (E).
Now, on the one hand, suppose that E SO p from H on k-. Then,
prob (H|E & k-) = prob (H|E & k- & p)
Therefore, by definition of SO, rigidity holds for E, i.e. prob (H|E) = PROB (H|E), since the two terms in the above statement equal, respectively, prob (H|E) and PROB (H|E). So, under these conditions, E SO p from H is a sufficient condition for the rigidity of the posterior probability of H given E from prob to PROB.
Under these circumstances, it is also a necessary condition. For suppose, on the other hand, that E does not SO p from H on k-. Then,
prob (H|E & k-) ≠ prob (H|E & k- & p).
Hence prob (H|E) ≠ PROB (H|E). (For the JC formula to be applicable for updating H given some new probability for E, both posteriors must be rigid, so the SO condition would have to apply with regard to ~E as well.)
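The addition result can be checked numerically. The sketch below builds two toy distributions over H, E, and p, one in which the probability of p depends only on E (so that both E and ~E screen off p from H) and one in which it depends on H as well. It then compares PROB (H) obtained by conditioning directly on p with the value the JC formula returns when fed PROB (E) and the old posteriors. The helper names and all numbers are hypothetical.

```python
from itertools import product


def build_joint(p_h, p_e_given_h, p_p_given_eh):
    """Joint distribution over worlds (h, e, p) from the stated conditionals."""
    joint = {}
    for h, e, p in product([True, False], repeat=3):
        weight = p_h if h else 1 - p_h
        weight *= p_e_given_h[h] if e else 1 - p_e_given_h[h]
        weight *= p_p_given_eh[(e, h)] if p else 1 - p_p_given_eh[(e, h)]
        joint[(h, e, p)] = weight
    return joint


def prob(joint, event, given=lambda w: True):
    """Conditional probability of event given a condition, by enumeration."""
    total = sum(x for w, x in joint.items() if given(w))
    return sum(x for w, x in joint.items() if given(w) and event(w)) / total


def is_h(w):
    return w[0]


def is_e(w):
    return w[1]


def is_p(w):
    return w[2]


def compare(joint):
    """Return (PROB(H) by conditioning on p, PROB(H) by JC with old posteriors)."""
    exact = prob(joint, is_h, is_p)
    new_p_e = prob(joint, is_e, is_p)                     # PROB(E)
    old_h_given_e = prob(joint, is_h, is_e)               # prob(H|E)
    old_h_given_not_e = prob(joint, is_h, lambda w: not w[1])  # prob(H|~E)
    by_jc = new_p_e * old_h_given_e + (1 - new_p_e) * old_h_given_not_e
    return exact, by_jc


e_given_h = {True: 0.8, False: 0.2}

# P(p | e, h) depends on e alone: E and ~E each screen off p from H.
screened = build_joint(0.3, e_given_h, {
    (True, True): 0.9, (True, False): 0.9,
    (False, True): 0.1, (False, False): 0.1,
})

# P(p | ~e, h) differs with h: ~E no longer screens off p from H.
unscreened = build_joint(0.3, e_given_h, {
    (True, True): 0.9, (True, False): 0.9,
    (False, True): 0.4, (False, False): 0.1,
})

print("screened:  ", compare(screened))
print("unscreened:", compare(unscreened))
```

Where screening off holds for both E and ~E, the two routes to PROB (H) agree. Where it fails for ~E alone, the JC formula applied with the old posteriors gives a different value from conditioning on p, which is why both posteriors must be rigid.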
We can make a similar argument when p is certain in prob and is deleted--that is, ceases to be certain--in PROB. Suppose that the subject has some certain evidence p available to him at time t but loses his direct access to this evidence at t+1. Hence, p is no longer part of his certain foundations in the new distribution. As before, let k- be a body of given background evidence that does not include p. For deletion, assume that p has probability 1 in prob but has probability < 1 in PROB. In PROB p has merely its intermediate value conditional on all the subject's other evidence, i.e. on k-. Assume that the only difference between prob and PROB is that p is no longer given evidence in PROB. Then,
prob (H|E) = PROB (H|E & k- & p)
PROB (H|E) =def. PROB (H|E & k-).
Since p is given in prob, screening holds trivially in prob, because there p is always part of the given evidence even if not explicitly written to the right of the solidus. So it is trivially true that
prob (H|E & k- & p) = prob (H|E & k-).
To obtain the relevant result, then, assume that screening holds in the distribution where it does not hold trivially, in PROB where p is not part of the given evidence. Then
PROB (H|E & k- & p) = PROB (H|E & k-).
In that case,
prob (H|E) = PROB (H|E).
On the other hand, suppose screening does not hold in PROB. Then
PROB (H|E & k- & p) ≠ PROB (H|E & k-).
In that case,
prob (H|E) ≠ PROB (H|E).
Hence the screening condition is also a necessary and a sufficient condition for the rigidity of the posterior probability of H given E from prob to PROB when some evidence p is given in prob but not in PROB, and the same is true, mutatis mutandis, for ~E.
IV. EPISTEMOLOGICAL CONNECTIONS
The discussion here not only takes more literally than have earlier treatments the notion that Jeffrey Conditioning should be placed in the context of Bayesian conditioning on certain foundations, it also makes it possible to see fairly clearly the implications of the results above for various epistemological positions. Strong foundationalists will admit as rational a change from one rational distribution to another only if there has been some change in the foundations. Intermediate-valued propositions, to a strong foundationalist, cannot rationally change their credibility merely because of some non-evidential cause. Their probabilities are derivative from their evidential relations to the given evidence. Since two perfectly rational subjects at any given time with the very same foundational evidence will have the same credibilities for all non-foundational propositions, they will move from this static state to another only if the foundations change. To allow a change in propositions of intermediate probability without a change in the foundations would be, implicitly, to allow that two people with the same foundations can have different rational credibilities for non-foundational propositions.
It follows that a screening-off condition involving changes in given evidence provides a rule for the strong foundationalist as to the applicability of JC. Suppose that E changes its probability for S from one rational intermediate value to another. The strong foundationalist must look for a difference in S's given evidence pertinent to E to justify this change. If the change is (only) the addition of p, then the posteriors prob (H|±E) are rigid iff ±E SO p from H in prob. If the change is (only) that p ceases to be given evidence, then the posteriors prob (H|±E) are rigid iff ±E SO p from H in PROB. But no such rule will be generally applicable either for moderate foundationalists or for personalists, both of whom allow propositions with probabilities of less than 1 to be evidentially basic. On either of these views, E could change from one intermediate probability to another without a change in anything more fundamental; hence, if we reject strong foundationalism, we must allow that there might be no certain evidence p that can be examined in its relation to E and H to see if the SO condition holds in the relevant distribution.
It cannot be stressed too strongly that this understanding of the conditions for JC does not commit the strong foundationalist to the claim that probabilities must change from moment to moment only by way of conditioning. This is true if for no other reason than that I am allowing for deletion of given evidence, so conditioning will, obviously, not always model the change from one distribution to another. This is a place where Jeffrey (2004: 60) seems to have saddled the strong foundationalist with a simplistic and incorrect position. He says that on C.I. Lewis's view, one's present probabilities should encode all the information from "the conjunction of all your hardcore data sentences up to that time," not allowing the foundationalist to admit that propositions that are known with certainty at one moment may cease to be certainties at another. Yet it is obvious from the nature of experience itself that one might have an experience at one moment but cease having it later on. It is therefore in principle possible that a subject should be given a set of foundational evidence at t+1 that bears no relation by way of Bayesian conditioning to the set available to him at t. This does not mean that it is impossible for his credences at t+1 to be rational.
Moreover, the connection among the concepts of screening off, Jeffrey Conditioning, and rigidity is important for analysis regardless of the actual diachronic order in which given evidence is added or deleted. The strong foundationalist is interested in synchronic rationality and in analyzing synchronic support relations, and he insists that differences in intermediate probabilities between rational distributions be explicable in a principled way in terms of differences in the strong foundations. Hence the addition or deletion involved in applying the above result may be hypothetical; the foundationalist may be considering analytically, by way of a hypothetical diachronic model which starts without a piece of given evidence and then adds that evidence, what the evidence is doing epistemologically in the present distribution.
V. SOME OBJECTIONS AND REPLIES
Many objections will immediately spring to mind for those who do reject strong foundationalism. I propose for the most part to set aside here the objection on which, perhaps, the most ink has been spilled already--whether in fact there is such a given element to experience and whether it can or should be treated propositionally and evidentially. This was one of Richard Jeffrey's most constant objections. He argued against foundationalism of any kind, including strong foundationalism, along these lines in his discussion of Carnap, Quine, and C. I. Lewis in Probability and the Art of Judgment (1992: 3-12, esp. 6, 10). He revisits the objection at greater length in "Probabilizing Pathology" (1989: 218-220), emphasizing there the concern that, if there is such a thing as given experience, it cannot properly be regarded as propositional in nature or as having a defined probability of 1 in the new distribution and some other probability in the old. (See also Jeffrey 1992: 78-79.) Obviously, if strong foundations are strictly unavailable, there is no point in discussing what the probabilistic situation would be if only we had them. If they are not the sort of thing that can be treated probabilistically, they cannot be used as the strong foundationalist wants to use them. And if they cannot be conditioned on (perhaps because they never or seldom have a probability in the old probability distribution), then they cannot be used in the way that p is used in the first of the above results. The discussion of deletion below has some implications for the contention that propositions expressing immediate experiences have no probability when one is not having the experience. But the entire debate between strong foundationalists and their critics cannot be reenacted here.
Several other objections, however, can be answered without such a rehearsal. First, one might understandably wonder whether the screening off relation discussed here is any more accessible to the subject--even to a subject who is himself an epistemologist--than the rigidity of the posteriors themselves. May we not have solved the puzzle of when the posteriors are rigid only to introduce a new and equally difficult puzzle of when screening-off holds?
In fact, the SO relation is, plausibly, at least as psychologically accessible as the rigidity condition. It could even be argued that when we can tell easily and (seemingly) directly that rigidity holds, we are implicitly accessing screening off. The emphasis on screening helps us to bring into clearer psychological focus what we are really groping for when we think about rigidity. This contention is supported by Jeffrey's own attempt to keep the discussion to situations where a "passage of experience" occurs and "directly" affects some proposition or propositions. So natural is it to link rigidity with screening off that it would be easy to think that this sort of screening off just is the same thing as the rigidity condition. But this is true only if, unlike Jeffrey himself, we allow the experience in question to be given evidential import.
But beyond this psychological fact of the matter, it is simply philosophically better if our degrees of confidence are explicable in a principled fashion, and this is true regardless of whether we access the underlying evidential grounding of those probabilities more or less easily than the intermediate probabilities themselves. And, as Levi points out, it is only by having some principled stance regarding synchronic rationality that we can take any rule for probability change to be normative. By thinking in terms of foundational evidence and the ways in which it affects non-foundational propositions--sometimes directly, sometimes indirectly, sometimes by this or that route--we are enabled to think of both synchronic distributions and diachronic changes in a normative fashion, and this is all to the good.
A related tu quoque objection involves the argument that, just as conditions for the rigidity of posteriors are difficult to access, so too the proper proposition to express the relevant foundational evidence is difficult to access. It is true that it can be difficult to isolate and express accurately the proposition that represents some given piece of foundational evidence. But this fact does not amount to admitting that the personalist has a tu quoque against the strong foundationalist, for the problems with personalism indicated here go much deeper than a mere difficulty in discovering when rigidity is satisfied or in finding the proper evidence E for which the posteriors are rigid. The deeper problem lies in the fact that, given personalism, the shift in the probability of any intermediate E is of necessity a surd. The question is whether JC, taken in its original personalist context, requires any experience at all or, when there happens to be a new experience, any principled relation between that experience and probability change. Personalism does not require an experiential reason for intermediate probability change, and it does not assume that there is some right way to model the connection between experiential change and new intermediate probabilities. The difficulty in finding conditions for rigidity draws attention to the underlying lack of rational moorings for JC--or, indeed, for synchronic intermediate probabilities--given personalism itself. The strong foundationalist, whatever his difficulties may be in stating a foundational proposition that expresses the evidentially relevant aspect of experience in some given case, has to his credit a theoretical commitment to the existence of such aspects of experience in all cases, at least where 'experience' is taken broadly enough to include not only sensory experience but also experiences of introspection, memory-like experiences, and the like. 
Thus his search for the relevant aspects of experience in some specific case is never an ad hoc and possibly pointless attempt to justify a change in an intermediate probability in terms of something more fundamental.
Issues of accessibility lead to specific questions about the applicability of the result for deletion. Where given evidence is added, the above discussion would suggest that various new probabilities could in principle be calculated by simple conditioning. But in the case of deletion, there is no "unconditioning" formula we can use to represent the fact that p has been removed as given evidence and thereby to calculate PROB (E) by a formula from within prob. Does this make PROB (E) arational in the deletion case, and does it therefore mean that the strong foundationalist cannot claim any advantage over the personalist?
Here again it is important to stress that the point was never to recommend that all diachronic changes be calculated either by simple conditioning or by some other formula that models the change mechanically. The point, rather, was that if uncertain E does change from one rational intermediate probability to another, that change should be traceable to a change in the foundations. The new rational probability of E when p is deleted will, on the strong foundationalist's view, be a function of E's connection to the remaining pertinent given evidence, and this different probability would be accessible to a perfectly rational and probability-theoretically omniscient subject who could see the evidential impact of the remaining given evidence. So the new probability of E is principled despite the absence of a deletion formula similar to the formula for simple conditioning.
It also makes sense, on the strong foundationalist view, to consider p itself to have a principled probability when it is deleted--that is, when it is no longer certain. This idea of a probability for p in a distribution where p is not certain is important for the addition case as well and relates to the oft-repeated objection that propositions expressing immediate experience simply have no defined probability at all before they occur. My suggestion is that, when some experience is not directly present and hence the proposition describing one's having that experience has less than probability 1, it can have what we might call a "theoretical likelihood" based on its semantic and logical relations to relevant present foundations and to any relevant intermediate-valued hypotheses that receive probability from them. A probability-theoretically omniscient subject would be able to see this probability even for experiences he was not presently having, and this would be true whether the experience had not yet happened or had happened before and then had ceased to be immediately present to consciousness.
The greatest difficulty in modeling deletion cases is the ceteris paribus condition. The result above applies only to cases where the deletion of the piece of given data is the only difference between prob and PROB. But both in real-life circumstances and in analysis, this is rarely the case. The simplest case where given evidence ceases to be given is one in which the subject ceases to have some occurrent sensory experience. But almost always that cessation is accompanied by the occurrence of a new memory-type experience as if one has just had that sensory experience. The new memory-type experience is itself added evidence, so ceteris paribus is violated; the deletion of the sensory experience is not the only change.
Similarly in analysis, one often wants to ask what one's evidential situation would be if S had not gained some particular evidence--e.g. if S had not had a conversation with Jones in the hallway about Smith's tenure prospects. But do we then for purposes of analysis want to imagine that S's present cognitive state includes a memory of having suffered a mental blackout on the way to the water cooler? What indirect ramifications might such an attempt to delete cleanly (without memory trace) the conversation with Jones have for S's assessment of Smith's tenure prospects? For example, if S has reason to fear that he is developing mental problems because he is having blackouts, will he then reasonably give lesser weight to the memories he still has of conversations with other colleagues about Smith or to his memories of the quality of Smith's work? If, on the other hand, we analytically "fill in" the time he actually spent talking with Jones with some other set of experiences, it is possible that the hypothetically added experiences will be pertinent to the target proposition about Smith's getting tenure. Again, it can be difficult to construct a realistic deletion case that maintains ceteris paribus.
The result and discussion above suggest that if the question is one of the rigidity of posteriors, we should analyze both the added and the deleted information in terms of screening off. And this is helpful to some extent. For example, suppose that E is something like "Jones said in a conversation that he would vote against Smith's tenure," H is "Smith receives tenure," and p is something like "I have a memory experience as of Jones saying that he will vote against Smith's tenure." If then we remove S's current direct access to p so that, at most, he can predict having such a memory experience on the basis of other evidence, and if we add instead q, something like "I seem to remember blacking out on my way to the water cooler," it might plausibly be argued that p is screened from H by ±E but that q is not, for the reasons given above about introducing doubts in S about the proper operation of his mental faculties. So we might try introducing instead q', something like, "I seem to recall going to the water cooler and getting a drink without speaking to anyone," which would plausibly and on most ordinary backgrounds be irrelevant both to E and to H and hence screened from H by ±E. We would then be left with the lowering of the probability of E and the raising of the probability of H by the deletion of p and the filling in of q' in its place. But again, in this case we are able to see the irrelevance to H of q' on the background, an epistemic intuition for which there is no computational substitute.
In all of this, what becomes of Jeffrey Conditioning? Here we return to the second competitive assumption mentioned at the outset--that the presence of certain foundations makes JC superfluous. Jeffrey always introduces JC after saying that given evidence is not always available, and Pearl (1988: 70) explicitly conjectures that we need not bother with JC if we have certain evidence or if we can legitimately model our experience as certain evidence.
This opposition between JC and foundationalism goes hand in hand with the assumption that, if we have certain foundations, other propositions will be based on them directly. At one time Jeffrey (1972: 97) apparently believed that strong foundationalists require everything else to be based directly on the foundations. C. I. Lewis, Jeffrey implies, was motivated in his search for certain evidence by "an inability to see how uncertain evidence can be used." Jeffrey then gives an interesting misquotation of Lewis as saying, "If anything is to be probable, then something must be certain. The data which themselves support a genuine probability, must themselves be certainties." But Lewis (1946: 186) actually said, "The data which eventually support a genuine probability, must themselves be certainties." [emphasis added] Of course, Jeffrey had other reasons for rejecting strong foundationalism, but the misquotation is interesting for the light it sheds on the assumed conflict between JC and strong foundationalism.
In a similar vein, Peter Vranas (personal communication) has argued that JC is not necessary for the strong foundationalist, since on the strong foundationalist's view, posteriors can be calculated by simple conditioning when new certain evidence is added. As Vranas sees it, the addition result given here is available to anyone for what it is worth (it is, after all, a provable result and does not crucially depend for its correctness on any particular epistemological position) but is not particularly helpful to anyone who needs it. It cannot be viewed as helpful to the personalist, who would not be a personalist anymore if he became a strong foundationalist so as to be able to apply the result widely. And the problem it aims to solve--the determination of the rigidity of the posteriors for JC--does not arise for the strong foundationalist in the first place, since he can use Bayesian conditioning instead of JC.
If strong foundationalism does render JC superfluous, this is a challenge, albeit not a devastating one, to strong foundationalism. Jeffrey Conditioning does not seem superfluous; once it is understood, it seems enormously useful. We rarely think explicitly of our strictly certain, directly accessible foundational evidence. More often we are likely to ask ourselves how the probability of some higher-order hypothesis changes as the probability of some everyday, but still uncertain, proposition changes. If we must make a stark choice between conditioning directly on certain evidence and using JC, some (especially those not independently convinced of the necessity for strong foundations) might be tempted to stick with uncertain evidence and to hope for the best as far as justifying intermediate probabilities by way of foundations with probability 1.
The response to this objection involves stating outright what has been more or less implicit heretofore: The epistemic value of JC derives from the notion of epistemic routing, and the contemplation of screening off and posterior rigidity helps to make epistemic routing explicit. The idea is that new evidence may be relevant to the hypothesis H "through" or "by way of" its relevance to E. My experience of the wind chime sound is relevant, in the normal case though not in the case where I have the special hearing problem, to the proposition about the coming storm by way of the proposition about the chimes ringing in the high wind. (See also Pearl 1988: 66-7.)
Such routing facts are everywhere in a rational subject's epistemic economy, and they are both objectively real and epistemically crucial. We do not go directly from qualia to quantum theory. In fact, if all one had were an isolated experience like the chiming sound without any background knowledge about wind and storms--as, for example, in the mind of an infant who has sensory experiences but lacks a context by which to interpret them--no evidential connection between the experience and the proposition about a coming storm could exist at all. The importance of vast amounts of background knowledge--grounded, to be sure, in a host of occurrent memory-like experiences, but bearing on uncertain empirical facts--makes it clear that the concept of routing is of overwhelming importance to the understanding of evidence.
This is why JC is important. When there is a change in the foundations, or even when we hypothetically imagine a change in the foundations for purposes of analysis, the new evidential situation affects some propositions more directly than others and affects some by way of its effect on others. This fine structure of a rational evidential corpus, involving propositions that route the force of foundational evidence to higher-level hypotheses, must not be ignored, and it cannot be understood without a consideration of intermediate-valued propositions and screening off relations.
But simple Bayesian conditioning on the foundations, even in cases of addition where it can be applied to calculate PROB (H) as prob (H|p), tells us nothing by itself about this intermediate evidential structure. Suppose that the proposition about the wind chimes ringing in the wind (E) and its negation do, on my background, screen off the experiential evidence (p) from the proposition about the coming storm (H). If p is added to my certain evidence, and if I get the new probability of the proposition about the storm by Bayesian conditioning directly on p, this by itself does not indicate the evidential role of E, the proposition about the actual ringing of the chimes in the high wind. To be sure, if I am probabilistically acute enough I can also condition on p to get a new and higher probability for E, but that calculation, too, does not show how the experience, the proposition about the chimes ringing in the wind, and the proposition about the storm are all related evidentially to one another. Those three items are like pieces of a puzzle. It is possible to have access to all three of them and even to note some of their evidential relations without seeing clearly that the evidential force of the experience of the sound is routed to the proposition that a storm is coming by way of the proposition that a high wind is blowing the chimes. The Jeffrey formula forces us to focus on the change in intermediate E and on the question of the rigidity of the posteriors. The rigidity of the posteriors, in turn, can be analyzed by the strong foundationalist in terms of whether ±E SO H from p in prob. And once these screening facts come to light, both the importance of the foundational evidence and the role of E in routing the force of that evidence to H are clarified.
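The agreement between the two calculations, and the routing structure that only the second makes visible, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the probability values are hypothetical, and the joint distribution is built so that ±E screens off p from H by construction. Under that assumption, conditioning directly on p and applying the Jeffrey formula with PROB(E) = prob(E|p) give the same value for H, but only the Jeffrey route displays E as the conduit.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration (not from the text).
# E: the chimes are ringing in a high wind; p: I have the chime-sound
# experience; H: a storm is coming.
P_E = 0.3
P_p_given_E = {True: 0.9, False: 0.2}   # prob(p | E), prob(p | not-E)
P_H_given_E = {True: 0.8, False: 0.1}   # prob(H | E), prob(H | not-E)

def bern(prob_true, value):
    """Probability that a binary proposition takes the given truth value."""
    return prob_true if value else 1.0 - prob_true

# Joint distribution over (E, p, H). H depends on p only through E,
# so +/-E screens off p from H by construction.
joint = {
    (e, p, h): bern(P_E, e) * bern(P_p_given_E[e], p) * bern(P_H_given_E[e], h)
    for e, p, h in product([True, False], repeat=3)
}

def prob(event, given=lambda e, p, h: True):
    """Conditional probability of `event` given `given` in the joint table."""
    num = sum(v for k, v in joint.items() if event(*k) and given(*k))
    den = sum(v for k, v in joint.items() if given(*k))
    return num / den

is_E = lambda e, p, h: e
is_not_E = lambda e, p, h: not e
is_p = lambda e, p, h: p
is_H = lambda e, p, h: h

# Route 1: simple Bayesian conditioning on the new certain evidence p.
direct = prob(is_H, given=is_p)

# Route 2: Jeffrey Conditioning. First the shift in E, then propagation
# to H through the (rigid) posteriors prob(H | E) and prob(H | not-E).
new_E = prob(is_E, given=is_p)
jeffrey = prob(is_H, given=is_E) * new_E + prob(is_H, given=is_not_E) * (1.0 - new_E)

# Rigidity check: adding p leaves the posterior of H on E unchanged.
rigid_on_E = prob(is_H, given=lambda e, p, h: e and p)

print(direct, jeffrey, new_E, rigid_on_E)
```

If the screening-off assumption is dropped (for instance, by letting the probability of H depend on p as well as on E when the table is built), the two routes will in general come apart, which is the failure of rigidity discussed above.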
Bayesians are not exempt from the need to address arguments for strong foundations, because manipulations of intermediate probabilities cannot, in the nature of the case, replace the epistemic foundations that make intermediate probabilities non-arbitrary. This is why the formula for probability kinematics does not render strong foundations unnecessary; it cannot do dynamic work. To an epistemologist who sees the value of Bayes's Theorem in epistemology, strong foundationalism can be helpful in that it avoids the difficulties of personalism, only one of which is the difficulty in giving a principled explanation of posterior rigidity.
But if strong foundations are indispensable, so too is the complex set of evidential relations by which they influence higher-level propositions. An appreciation of the role of Jeffrey Conditioning within a strong foundationalist framework can help us to follow an otherwise tangled evidential thread backwards through the propositions that are intermediate--both in the sense of having non-extremal probabilities and in the sense of standing between the foundations and other propositions--to its origin in the foundations.
Diaconis, Persi and Zabell, Sandy. 1982. "Updating Subjective Probability." Journal of the American Statistical Association 77: 822-830.
Earman, John. 2002. "Bayes, Hume, Price, and Miracles." In Richard Swinburne (ed.), Bayes's Theorem. Oxford: Oxford University Press for the British Academy, 91-109.
Hawthorne, James. 2004. "Three Models of Sequential Belief Updating on Uncertain Evidence." Journal of Philosophical Logic 33: 89-123.
Hesse, Mary. 1974. The Structure of Scientific Inference. Berkeley: University of California Press.
Jeffrey, Richard. 1965. The Logic of Decision. Chicago: University of Chicago Press. (Page numbers are from the second edition.)
Jeffrey, Richard. 1970. "Dracula meets Wolfman: Acceptance vs. Partial Belief." In Marshall Swain (ed.), Induction, Acceptance, and Rational Belief. Dordrecht: D. Reidel, 157-185.
Jeffrey, Richard. 1972. "Probable Knowledge." In Sidney A. Luckenbach (ed.), Probabilities, Problems, and Paradoxes: Readings in Inductive Logic. Encino, CA: Dickenson, 92-103.
Jeffrey, Richard. 1975. "Carnap's Empiricism." In Grover Maxwell and Robert M. Anderson, Jr. (eds.), Induction, Probability, and Confirmation (Minnesota Studies in the Philosophy of Science, Volume VI), Minneapolis, MN: University of Minnesota Press, 37-49.
Jeffrey, Richard. 1989. "Probabilizing Pathology." Proceedings of the Aristotelian Society 89: 211-25.
Jeffrey, Richard. 1992. Probability and the Art of Judgment. Cambridge: Cambridge University Press.
Jeffrey, Richard. 2004. Subjective Probability: The Real Thing. Cambridge: Cambridge University Press.
Levi, Isaac. 1967. "Probability Kinematics." British Journal for the Philosophy of Science 18: 197-209.
Levi, Isaac. 1970. "Probability and Evidence." In Marshall Swain (ed.), Induction, Acceptance, and Rational Belief. Dordrecht: D. Reidel, 134-156.
Lewis, C.I. 1946. An Analysis of Knowledge and Valuation. LaSalle: Open Court.
McGrew, Timothy. 1995. The Foundations of Knowledge. Lanham, MD: Littlefield Adams.
McGrew, Timothy and McGrew, Lydia. 2008. "Foundationalism, Probability, and Mutual Support." Erkenntnis 68: 55-77.
Pearl, Judea. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann Publishers.
Wagner, Carl. 2001. "Old Evidence and New Explanation III." Philosophy of Science 68: S165-S175.
Wagner, Carl. 2002. "Probability Kinematics and Commutativity." Philosophy of Science 69: 266-78.
.JC applies for updating the probability of H using any exclusive and exhaustive set of sentences for which rigidity holds. An exclusive and exhaustive set, known as a partition, can take the place of ±E in the JC formula. Cases of this type are discussed in McGrew and McGrew 2008.
.In Pearl's example, H is affected by the new sensory evidence via a different indirect route, not via E. Pearl's point is that one might not think of this route and might try to apply the JC formula, believing erroneously that P(H|E) remains rigid because of the irrelevance to H of E and the fact that E is itself directly affected by the new experience. (I am here changing Pearl's notation to my own.) Pearl does make a slight misstep in explaining this example when he apparently conflates the color a piece of cloth appears to be with the color it actually is.
.I should stress that Levi himself is by no means a strong foundationalist. He explicitly rejects Lewis-style foundations in favor of fallible foundations to which one nonetheless gives probability 1. See Levi 1967: 206-9.
.I developed this argument before I was aware of Pearl's work on the subject. Pearl does not mention that the SO condition is sufficient as well as necessary for rigidity.
.I am indebted to Peter Vranas for pointing out a serious technical error in an earlier version of the result for deletion. That error is corrected here.
.Diaconis and Zabell (1982: 824) had suggested that JC could be thought of as arising from ordinary conditioning taking place within a richer evidential probability space. Carl Wagner refers briefly (2002: endnote 1) to the notion of an experience as a “fictional ‘phenomenological event’” on which we can think of ourselves as conditioning, and Pearl (1988: 64, 68) implies that we should act as if the experience could be treated propositionally and hence analyzed in terms of such concepts as screening off.
.This point relates to the result discussed by Diaconis and Zabell and by Carl Wagner showing that there can be distributions P and Q such that Q is not accessible from P by way of conditioning. See Wagner 2001: 174, Diaconis and Zabell 1982: 824, Theorem 2.1, and Jeffrey 1992: 127. I am indebted to Carl Wagner for bringing this issue to my attention. That result does not undermine the epistemic application I wish to make of Pearl's result, as I am not requiring that diachronic belief change occur only by conditioning but rather stressing that there must be differences in strong foundational evidence as a principled explication of differences in intermediate probabilities.
.In line with Pearl's criticism, it is important to stress that the new experience may not affect all the altered posteriors directly. Relatedly, JC can be useful for modeling the effect on H of a change in E even if there are other propositions in the distribution for which the posterior probabilities on E do not remain rigid, so long as the posteriors P (H|±E) are rigid. I am grateful to James Hawthorne for stressing this point in correspondence, for explaining in detail why it is right, and for drawing my attention to Jeffrey's own assumption (1965: 168-9) that one will use JC for propagating the change in the probability of E through the entire distribution.
.I owe this criticism to an anonymous reviewer.
.A loosely-described example of such a “theoretical likelihood” would be the non-certain probability that you will have a tiger-like experience given other foundational evidence bearing on the proposition that you are presently at the zoo.
.Jeffrey gives this same misquotation of Lewis repeatedly, e.g. Jeffrey 1965: 167 and 2004: 60. But in Jeffrey 2004: 59-60 he also gives a quotation from Lewis's Mind and the World Order in which Lewis clearly rejects a requirement that everything be based directly on the foundations. At that point Jeffrey does not appear to have had this particular misconception about strong foundationalist requirements.
.The misquotation is pointed out by McGrew (1995: 71), who discusses at more length the charge that foundationalists are unable to assimilate uncertain evidence.
.This is a brief summary of some comments by Vranas on an earlier version of this paper presented at FEW 2006.
.Similarly, John Earman (2002: 103) refers to evidence as bearing on one proposition "only through" another.
.For a much longer discussion of epistemic routing, the use of the JC formula as a modeling tool, and the concept of intermediate-valued propositions as conduits of the evidential force of foundational evidence, see McGrew and McGrew 2008.
.I wish to thank Timothy McGrew, James Hawthorne, Peter Vranas, and Carl Wagner for helpful comments and discussion. James Hawthorne provided generous bibliographic help both early and late in my research without which it would not have been possible to complete this project.