Friday, January 5, 2007

Awesome, With Class

So, just delivered an A++ lecture and thought I would share. This may get hyper-nerdy; you have been warned.

So we've been talking about hypothesis testing in Statistics these past few days back from break. On a fundamental level, understanding what goes on in hypothesis testing requires at least a vague, dreamlike memory of pre-break concepts. Ahem. So we did our best despite serious gaps in our groupwide knowledge of probability curves, z-scores and such, and hopefully the whole thing sank in more or less well.

Things got very interesting today when we talked about the implications of the implicit structure of hypothesis testing. When a claim is made (e.g., 4 out of 5 dentists prefer Colgate), in order to test that claim you set up two hypotheses. The first is called the "null hypothesis," the assertion that nothing fishy is going on and the claim is actually true. The second is the alternative hypothesis, that the opposite of the null hypothesis is true, and as such these two hypotheses have to be mutually exclusive.

By its very nature, nothing in inferential statistics is ever 100% certain: you are taking samples to represent wholes, and as such there is always the possibility, however remote, that your sample is not representative. So it turns out that when you test a null hypothesis, you can never absolutely accept it; really what you're doing is making a decision between rejecting the null hypothesis (if it is blatantly incorrect) or failing to reject it (if you cannot prove it to be false). This may sound like a subtle distinction, but it is not.
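For the hyper-nerdy, here is a minimal sketch of that reject / fail-to-reject decision applied to the dentist claim. It is my own illustration, not something we ran in class: it assumes a two-sided one-sample z-test for a proportion (normal approximation), a 0.05 significance level, and a made-up survey of 500 dentists. The function name proportion_z_test and all the numbers are hypothetical.

```python
import math

def proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided one-sample z-test for a proportion (normal approximation).

    H0: the true proportion equals p0.  H1: it does not.
    Returns the z-score, the p-value, and the decision as a string.
    """
    p_hat = successes / n                    # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)        # standard error assuming H0 is true
    z = (p_hat - p0) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Note there is no "accept H0" branch: the only outcomes are
    # rejecting the null or failing to reject it.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

# Hypothetical surveys of 500 dentists, testing the claim p0 = 0.8 (4 out of 5).
for preferred in (380, 395):
    z, p_value, decision = proportion_z_test(preferred, 500, 0.8)
    print(f"{preferred}/500: z = {z:.2f}, p-value = {p_value:.3f} -> {decision}")
```

With these invented numbers, 380 out of 500 is far enough from 80% to reject the null, while 395 out of 500 is not. In the second case the code says only "fail to reject": the data are consistent with the claim, which is not the same as showing the claim to be true.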

Our justice system, for example, relies entirely on this method of hypothesis testing. When a person is indicted, they are assumed innocent until proven guilty. The null hypothesis becomes "they are innocent" and the alternative becomes "they are guilty." The trial becomes a sampling process of the evidence (surely, being sub-omniscient, we cannot ever present ALL the evidence), and we take that evidence and compare it to our null hypothesis. If the null hypothesis is shown by evidence beyond reasonable doubt to be false, we reject the null hypothesis and accept that the alternative hypothesis, that the person is guilty, must be true. However, if the sample of evidence does not compel us to reject the null hypothesis, then we are left with only the other option: we do not reject the alternative hypothesis; we simply fail to reject the null hypothesis. The language is glaringly consistent with this idea: the accused is not declared "innocent," he is declared "not guilty." Meaning that he may be innocent, he may even probably be innocent, but the most that we can say is that we are not rejecting that possibility, NOT that we are accepting it.

Null hypothesis testing, then, has a very noticeable trait: you can never prove anything with it, you can only disprove things. Take the supposition that "there is no such thing as a naturally purple polar bear." In this case the null hypothesis is that there are no purple polar bears; the alternative is that there is at least one purple polar bear. You will be extraordinarily hard-pressed to prove that there are no purple polar bears; you would have to have seen every polar bear in the known universe and confirmed it to be non-purple, plus you would have to have investigated every nook and cranny that serves as a potential purple polar bear hiding place. You would also have to do this for an infinite amount of time, on the chance that the purple polar bear is just constantly evading you. Etc. The point is that the null in this case is near impossible to prove, but exceedingly easy to disprove: all someone has to do is show up with one purple polar bear, and you would be forced to reject the null hypothesis. The absurdity of the situation aside, it is vastly easier to disprove things than to prove things; any ad that claims that "Studies prove" probably means "there is an exceedingly small probability that what we're saying is false, and therefore we're assuming we're probably right."

So the reason class was so cool today is that we were discussing this idea, that you can basically never know anything for sure; all you can do is make reasonable guesses and assume they are correct until something shows you otherwise. There are an unlimited number of examples of this from science, but we touched on a few of the big ones: Copernicanism, Relativity, practical physics. And we slowly got to the point that no one ever asserts that what they're saying is undeniably true; good science revolves around hypotheses that are capable of being shown to be false. What we do is draw conclusions based on the evidence at hand, and so long as our conclusions continue to be consistent and useful, we continue to use them.

So this was the big AHA of the class today - that science couldn't actually care less whether things are true or not. Science (a lot of it, anyways) is entirely rooted in stats; no one would ever be able to show or "prove" anything without P-values. And statistics implicitly admits that nothing is certain, that everything has a chance of being incorrect, and the only reason we work with assumptions that cannot be shown to be true is that it would be entirely impractical not to do so. Assuming that things that are probably true are actually true is at base an exercise in pragmatics: we use science and the models and theories it provides not because of their truth, but because of their utility. Science is far and away the best thing we've come up with for accomplishing things, and it is this that drives science (not to mention the dollars associated with that utility), not some abstract notion of the quest for truth. If the tools science uses fundamentally can't tell you absolute truths, how could science ever hope to?

So pragmatism was the big word of the day; you have to shift your thinking in stats and science away from "Truth" and towards that which is useful. I have a couple of fairly devout religious kids in the classroom, and they really enjoyed the subsequent thought: that the two sides of the Creation v. Evolution debate are by definition talking about different things, so all of the clash and emotional conflict is not really a conflict at all. Creation deals in truth; it is a truth that may or may not be actually *True*, but it does not really have a beef with Evolution, because evolutionary theory will only be considered "probably true" insofar as it continues to be useful. Science does not pretend to "know" anything, nor was it ever meant to. Religion, on the other hand, professes to know a lot of things. On the plus side for the reli kids, this means that there is no real conflict between the two doctrines unless you incorrectly assume that science is revealing truth. The main problem, I think, is that pinning down "the objective truth" is such a desirable endeavor that scientists tend to overstep their own self-imposed bounds. Occam's razor is generally taken to mean "All things equal, the model that is simplest is best." The word "truth" appears nowhere in there. The minus for the reli kids, of course, is that things like Creationism, by virtue of their assertion of Absolute Truth, are by definition NOT SCIENCE, and as such belong nowhere near a school science curriculum.

The distinction between science as something useful in the world and science as an objective description of the true world is a painfully abstract and subtle one. I.e., I don't think USA Today will be heading off the argument with this entry as its headline anytime soon. And that's not to say that I think it's only the reli-heads who don't get it; I don't think the science-heads get it, either. Of course, there are outstanding and overlapping components of this problem; practical decisions are made by virtue of religiously held ideas, and concepts of truth are derived from scientific findings. So I would be hard-pressed to argue that the conversation does not need to take place. I would just argue that the vitriol and fire of the debate could effectively be dropped; any argument that claims these two areas of study are in conflict and veers near fisticuffs is letting irrational heat cloud the fact that on the rational level, these two views are really not at odds. The irrational side is an embarrassing and vital component of people, but tempering it in intellectual debate would be much appreciated.

I ended the class today with the aphorism: "Science is pragmatics; leave truth to religion and philosophy." I'm not sure where that rates on the cheesiness scale. But the point was made loud and clear, understood and discussed by my students, and so it feels a bit like I* got through today.

* - I also successfully negated the entire thing with "this is coming through the filter of me, so take it with a grain of salt." That's what's known in show biz as "a disclaimer." A devilish one be I, arrrrrr.
