1992 Response to Kyburg's "Believing on the Basis of the Evidence," a Computational Intelligence discussion

KYBURG AND VOLKSWAGENS

My understanding of probability is Kyburg's. I believe the reference class problem is the most interesting problem in probabilistic reasoning (machine learning take note!). I believe that probability should mediate the fit of scientific theory to observational data when predictive power and error must be traded against each other (plan recognition beware!). I believe the Bayesians are corrupt, or bankrupt, or in any case, up to no good; I believe that we accept statements as firmly as I believe anything.

My problem with Kyburg has to do with computational practice. What are the implications of his philosophy for people who are building programs? What difference will it make when uncertain and inductive reasoning is just a component? Inexpressive language, resource-limited computation, and poor modeling of preference are surely larger concerns for the knowledge engineer. Is Kyburg suggesting something that will pay new dividends in artificial intelligence, or just a better metaphor?

I think Kyburg's essay is merely an arcane description of current practice; it is an apology for how we actually do things in our epistemological lives, unbeknownst to ourselves. It may be too painful for Bayesians and others to acknowledge, but it is what we already do, at least when we are rational. Elsewhere, Harman has been more successful at arguing why acceptance might make computational sense.

A Bayesian knowledge engineer will condition on statements that are contingent, hence not knowable with certainty, hence accepted. This is because we choose the input to the programs, and we usually choose a high level of abstraction: we choose not to model uncertainty at the level of robotic sensor inputs (and most Bayesian inference engines are connected to diagnosis programs, not to robots).
Perhaps robots and other image interpretation programs can be pure Bayesians and avoid acceptance altogether. The computational effort is immense when there is unwillingness to build abstractions. This explains why Cheeseman can remain a pure Bayesian: he is happy to crunch huge data sets. The rest of the Bayesians, for example Pearl, admit that they accept contingent sentences and will continue to do so whether or not Kyburg notices.

Identifying reference classes is something that knowledge engineers don't much worry about since they do not often work directly with sample data. But consider training connectionist networks on data sets, then using forward propagation to predict some property for new inputs. Shallow applications of connectionism of this kind abound: predicting bad weather, component of speech, control of motion, handwritten characters. A criterion of similarity must be determined, whether by the magic of back-propagation or by the light of reasoned methods. The criterion must trade similarity against the desire to bring as much of the past to bear as possible. Consider, too, case-based reasoning in a legal or problem-solving domain. Ashley and Rissland's "Waiting on weighting: a symbolic approach to least commitment" perfectly describes the non-Bayesian alternative Kyburg advocates: prediction need not be the result of weighting all past experience; some past experience might be excluded and "participate" only with a weight of zero. This is the same question raised by reference classes. Kyburg's own methods are worth studying, implementing, and applying (I have done all three), but they do not seem to lead to major advances over what is already being done.

Finally, Kyburg details how theory, measurement, and error interact. This is the jewel in Kyburg's philosophical crown; it is the legacy of a life lived with probabilistic honesty.
Only Kyburg can give a good explanation of how one comes to adopt some universal generalizations and not others, i.e., how one theory or set of axioms is chosen over another. Langley, Kelly-Glymour, Kautz-Pollack, McCarty, Mitchell, Quinlan, and many, many more would do well to understand why Kyburg has so much leverage when fitting rules to observations. Kyburg does not use familiar examples, such as curve-fitting, when he could. But even if these researchers were to understand, they probably would not change their practice. Each has provided for the right tradeoff of error against predictive power in his or her own way, or else has chosen to avoid the issue. To be enlightened by Kyburgian methods would only be to become aware of an intriguing method, and then to dismiss it as too expensive computationally.

So what are we to make of Kyburg's exotic machinery? Ferdinand Porsche said that if he were an automobile, he would want to be a Porsche. There will be programs that perform inductive tasks, and some programs will want to be Kyburgian. But they will always be a specialty breed, expensive to manufacture and inappropriate for the masses. Porsche's doom is Kyburg's as well: most of the time, a simple Volkswagen is good enough.