Submitted by nksingh on Tue, 05/12/2009 - 15:09.
The key points of Statistical Thinking are as follows:
- All work occurs in a system of interconnected processes.
- Variation exists in all processes.
- Process output varies because of special variation (due to special causes) and random variation (due to random causes). Understanding and reducing variation is the key to success.
These concepts come from the context of quality control. To give Statistical Thinking a broader meaning, so that the concept can be generalized, it has been reshaped using the following concepts (see Wild and Pfannkuch):
- Statistics is the science of variation.
- Statistical tools work between the context area and the data.
- The ultimate aim of using a statistical tool is to enrich the context area.
On this site, the concept of Statistical Thinking is used in a broader sense while keeping its original spirit intact. For our purposes (using statistical thinking in areas other than engineering), Statistical Thinking is based on the following components:
- Study of variation
- Data
- Enriching the context area
- Simulation
The diagram shows how "Statistical Thinking" may be perceived so as to accommodate current needs.
A Unified Overview of Statistical Thinking
The brief “goal oriented” history of statistics I offered in my earlier post of today (3 October 2009) described three purposes for which statistical analysis uses data (x): measuring the probability or plausibility of scientific hypotheses; identifying the likely or expected consequences of decisions (allocations of resources); & measuring the evidence, that is, the change x induces in the odds of one hypothesis versus another.
One fundamental mathematical fact limits how these three purposes may be worked out: According to the most commonly accepted laws of probability (especially Kolmogorov’s definition of conditional probability), observed data alone cannot identify the probability of any hypothesis, or the expected consequences of any decision. Rather, x can only change the probabilities & expected consequences that existed before the data were observed.
In other words, as a consequence of these laws of probability, these three purposes of statistics follow a certain necessary sequence: Evidence, Inference and Decision Analysis. That is, the data x constitute evidence regarding hypotheses, which we may measure and report; we may combine the evidence with other knowledge, to infer probabilities about hypotheses; and we may further combine those inferences with the utilities, or consequences, of our candidate decisions, to identify the decisions’ expected utilities.
I will illustrate this sequence using Royall’s (2000) medical diagnostic testing example, in which he identified three questions the physician wanted to answer on the basis of a positive test result (+T):
There, the probability Pr(+T|+D) of +T given a positive disease status (also called the “sensitivity” of the test) was 0.94, whereas Pr(+T|-D) was 0.02 (meaning the test’s “specificity” was 0.98). Now, due to the aforementioned definition of conditional probability, the pre-test odds of the disease, Pr(+D)/Pr(-D), is related to the post-test odds, Pr(+D|+T)/Pr(-D|+T), through the equation
Pr(+D|+T)/Pr(-D|+T) = Pr(+T|+D)/Pr(+T|-D) · Pr(+D)/Pr(-D).
Substituting the test’s sensitivity & its false-positive rate (1 - specificity) into this equation yields
Pr(+D|+T)/Pr(-D|+T) = 0.94/0.02 · Pr(+D)/Pr(-D) = 47 · Pr(+D)/Pr(-D).
Thus, whatever the pre-test odds of +D were, the post-test odds are 47 times as great; in other words, the evidence contained in the test result, measured using the so-called “likelihood ratio” Pr(+T|+D)/Pr(+T|-D), is 47. The evidential interpretation of the observation +T is that +T has increased the odds of +D by a factor of 47.
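The evidential step can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function names are hypothetical and the pre-test odds in the loop are arbitrary values chosen to show that the factor of 47 does not depend on them.

```python
# Illustrative sketch; function names are hypothetical, not from Royall (2000).

def likelihood_ratio(sensitivity, specificity):
    """Likelihood ratio of a positive result: Pr(+T|+D) / Pr(+T|-D)."""
    false_positive_rate = 1.0 - specificity  # Pr(+T|-D)
    return sensitivity / false_positive_rate


def post_test_odds(pre_test_odds, lr):
    """Post-test odds = likelihood ratio * pre-test odds."""
    return lr * pre_test_odds


lr = likelihood_ratio(sensitivity=0.94, specificity=0.98)
print("likelihood ratio:", round(lr, 6))

# Arbitrary pre-test odds (assumed values): the ratio post/pre is always the LR.
for pre in (0.001, 0.1, 1.0):
    post = post_test_odds(pre, lr)
    print(f"pre-test odds {pre}: post-test odds {post:.4f}, ratio {post / pre:.1f}")
```

Whatever pre-test odds are supplied, the printed ratio stays at the likelihood ratio, which is what makes it a measure of the evidence alone.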
Once we specify a pre-test probability of +D, a statistical inference about +D is available. Say that, among all patients having risk factors, demographic characteristics and so on, similar to those of the patient being observed (but absent any test results), the prevalence of the disease being studied is 0.001. Then the pre-test odds Pr(+D)/Pr(-D) is 0.001/0.999 ≈ 0.001001, so that the post-test odds is 47·0.001001 ≈ 0.047047. This statistical inference about +D is equivalent to the inference that Pr(+D|+T) = 0.047047/1.047047 ≈ 0.045, and exemplifies how evidential statistics is a step along the path towards inferential statistics.
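The conversion between probabilities and odds used in this inference can be checked with a short Python sketch. The helper names are hypothetical; the prevalence (0.001) and likelihood ratio (47) are the values from the example.

```python
# Illustrative sketch; helper names are hypothetical.

def odds_from_probability(p):
    """Convert a probability p into odds p / (1 - p)."""
    return p / (1.0 - p)


def probability_from_odds(odds):
    """Convert odds back into a probability odds / (1 + odds)."""
    return odds / (1.0 + odds)


prevalence = 0.001        # pre-test probability Pr(+D), from the example
likelihood_ratio = 47.0   # evidence in a positive test, from the example

pre_odds = odds_from_probability(prevalence)      # about 0.001001
post_odds = likelihood_ratio * pre_odds           # about 0.047047
post_probability = probability_from_odds(post_odds)  # Pr(+D|+T), about 0.045

print(f"pre-test odds:         {pre_odds:.6f}")
print(f"post-test odds:        {post_odds:.6f}")
print(f"post-test probability: {post_probability:.3f}")
```

Even with evidence 47 times in favor of +D, the post-test probability Pr(+D|+T) stays below 5% because the disease is so rare to begin with.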
Likewise, we can use Royall’s example to show how Statistical Inference forms a stage in the process of Decision Analysis. Say, simplistically, that the physician’s only two possible decisions are A. treat the patient with a certain drug, or B. do nothing. If +D is true, then Decision A would restore the patient’s health completely (a “zero” end state), but Decision B would lead to particular adverse consequences C1. If -D is true, on the other hand, then Decision A would lead to adverse consequences C2 whereas Decision B would allow the patient to simply heal spontaneously & quickly (another “zero” end state). Thus, given +T, the expected consequences associated with Decision A are
0·Pr(+D|+T) + C2·Pr(-D|+T)= C2·Pr(-D|+T).
The expected consequences, given +T, of Decision B are, similarly,
C1·Pr(+D|+T) + 0·Pr(-D|+T)= C1·Pr(+D|+T).
If one has identified C1, C2, Pr(-D|+T) & Pr(+D|+T), then one may easily determine the decision associated with the better expected consequences. Alternatively (and rather elegantly), treating C1 & C2 as positive costs, Decision A is advantageous iff C2·Pr(-D|+T) < C1·Pr(+D|+T), or equivalently, iff
Pr(+D|+T)/Pr(-D|+T) · C1/C2 > 1,
which is easier to use if the ratio C1/C2 can be identified even if the values C1 & C2 cannot be identified. This inequality illustrates how Statistical Inference and utilities (consequences) combine to determine optimal decisions. Alternatively, Decision A is advantageous iff
Pr(+T|+D)/Pr(+T|-D) · Pr(+D)/Pr(-D) · C1/C2 > 1.
This inequality neatly lays out the 3 components of decision analysis: Evidence (the likelihood ratio), Pre-test beliefs (probabilities or odds), and utilities (consequences).
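As a rough sketch, the decision rule can be written out in Python and checked against the expected consequences computed directly. The function names and the two cost ratios C1/C2 (30 and 10) are assumptions made purely for illustration; only the sensitivity, false-positive rate and prevalence come from the example.

```python
# Illustrative sketch; function names and cost ratios are hypothetical.

def expected_cost_treat(p_disease, c2):
    """Decision A: zero cost if +D, cost C2 if -D."""
    return c2 * (1.0 - p_disease)


def expected_cost_do_nothing(p_disease, c1):
    """Decision B: cost C1 if +D, zero cost if -D."""
    return c1 * p_disease


def treat_is_better(likelihood_ratio, pre_test_odds, cost_ratio):
    """Decision A wins iff evidence * pre-test odds * C1/C2 exceeds 1."""
    return likelihood_ratio * pre_test_odds * cost_ratio > 1.0


lr = 0.94 / 0.02              # evidence, from the example
pre_odds = 0.001 / 0.999      # pre-test odds, from the example
post_odds = lr * pre_odds
p_post = post_odds / (1.0 + post_odds)  # Pr(+D|+T)

# Assumed cost ratios C1/C2; C2 is fixed at 1 so that C1 equals the ratio.
for cost_ratio in (30.0, 10.0):
    by_rule = treat_is_better(lr, pre_odds, cost_ratio)
    by_costs = expected_cost_treat(p_post, 1.0) < expected_cost_do_nothing(p_post, cost_ratio)
    print(f"C1/C2 = {cost_ratio}: treat? {by_rule} (direct comparison agrees: {by_rule == by_costs})")
```

With these numbers the break-even cost ratio is 1/0.047047, roughly 21: treatment is preferred only when leaving the disease untreated is more than about 21 times as costly as treating a healthy patient.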