
Exploratory data analysis

Statistics courses generally include a short module on descriptive statistics, but how to analyze data with these tools is ignored in most cases. Now that vast computational power is available, some additional tools should be included in the descriptive statistics course. Since the tools under descriptive statistics are not only statistics, it is better to call the subject Exploratory Data Analysis (EDA). I do not know why Wikipedia fails to identify the relation between the two terms (Descriptive Statistics and Exploratory Data Analysis). Both techniques provide tools to draw conclusions based on a sample. These tools are very important for initiating a statistical analysis. Statisticians generally ignore them because their outcomes cannot be generalised to the entire population.
On nomenclature: the term exploratory data analysis (EDA) was used by Tukey alongside confirmatory data analysis (CDA). In the former the data are important, while in the latter the model is important. In EDA the principal aim is to see what the data are “saying”; it is used to look for unexpected patterns in data. In CDA one tries to disconfirm a previously identified indication, preferably on fresh data; it is used to decide whether the data confirm the hypotheses the study was designed to test. Fisher described EDA as cross-examining the data, i.e. questioning the data and eliciting the answers. Slides by Prof. C.R. Rao help in understanding how to use EDA (along with CDA). Chatfield (Chatfield, C. (1985). "The Initial Examination of Data," JRSS-A, 148, 214-253) has discussed these issues in detail.
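As a small illustration of letting the data "speak", here is a minimal sketch of a five-number summary with outlier fences, a tool usually associated with Tukey's style of EDA. The function names, the fence multiplier of 1.5, the "inclusive" quantile method, and the sample values are all assumptions made for this example; they do not come from any of the sources cited above.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    ordered = sorted(values)
    # "inclusive" treats the data as the whole population of interest;
    # other quantile conventions give slightly different quartiles.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def tukey_outliers(values, k=1.5):
    """Return the values lying outside Q1 - k*IQR and Q3 + k*IQR."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample, for illustration only.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 21, 48]
print(five_number_summary(sample))
print(tukey_outliers(sample))
```

A value flagged this way is only an indication found by exploration. In the spirit of CDA, it would still need to be checked against fresh data before anything is concluded from it.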
In a course on Descriptive Statistics (or EDA), most people ignore “effect size”, which is especially important in biostatistics. The following links may help in understanding what people generally overlook about effect size. I am sending these links on EDA and effect size for your comments (on the basis of your teaching experience in this field).

  1. EDA as part of data mining
  2. Book on effect size
  3. What effect size is and why it is important?
  4. A Scale of Magnitudes for Effect Statistics
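
To make the idea of effect size concrete, here is a minimal sketch of Cohen's d, one common standardized effect size for the difference between two group means. It is chosen here only as an example and is not necessarily the measure the links above use. The function name and the two made-up groups are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Made-up measurements, for illustration only.
treated = [5.1, 5.5, 5.9, 6.0, 6.5]
control = [4.8, 5.0, 5.2, 5.4, 5.6]
print(round(cohens_d(treated, control), 2))
```

Unlike a p-value, this number does not shrink or grow merely because the sample size changes. It describes how large the difference is in units of standard deviation, which is why it is worth reporting alongside the usual descriptive summaries.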