Challenges before statisticians
Submitted by nksingh on Sun, 08/09/2009 - 19:28.
As a member of the statisticians group at Yahoo, I have seen many of my friends searching for statistical tools without posing the problem clearly. I feel we need some fundamental changes in our orientation toward using statistical tools in analysis.
People associated with statistics can be divided into two groups: developers and users. The challenges for each are different. Developers of statistical methods work on assumptions, while users work with ground reality. In the initial phase of the development of statistics this gap was narrow, but now it is widening.
As users of statistics, our primary aim should be to enrich the domain concerned. One can do this in the following steps:
- Check which abstract ideas and beliefs prevail in the domain concerned.
- Think about how these beliefs and abstract ideas (based on intuition) may be represented through data. Four things are important here: (1) which characteristics (such as caste, land, welfare) should be measured on which unit (household, community, etc.); (2) how these characteristics should be represented through data; (3) which characteristics are dependent and which are independent; (4) how the independent characteristics are related: additively, interactively, etc. This is a very crucial step. It is pertinent to mention here that there may be more than one way to represent an abstract idea (and its characteristics). For example, the welfare (a characteristic) of a household may be represented through data in many ways. Similarly, there may be different theories (sets of independent variables) to explain production. So the basic model comes from domain expertise. Statistical tools should be used to estimate (and test) the parameters of the model so that comparisons can be made. A statistician can also help in searching for a better model by including more suitable characteristics or by taking a different function of the characteristics.
- Collect the data that seem relevant, following statistical methods as far as possible.
- Use statistical tools to explore the data and to estimate and test the parameters of the model.
- Revise the initial model so that it is better supported by the data.
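The question of whether independent characteristics act additively or interactively can be sketched in a few lines of code. What follows is only an illustration under stated assumptions: the eight households, the variable names (`land`, `schooling`, `welfare`) and every number are invented, and ordinary least squares is one possible estimation tool, not something the steps above prescribe. It fits an additive model and a model with a product term, then compares their residual sums of squares.

```python
# Hypothetical data for 8 households: land owned (hectares), schooling of the
# household head (years), and a made-up welfare index. None of it is real.
land = [0, 0, 2, 2, 0, 0, 2, 2]
schooling = [0, 0, 0, 0, 4, 4, 4, 4]
welfare = [1.1, 0.9, 5.2, 4.8, 2.9, 3.1, 9.5, 9.3]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit via the normal equations; returns (coefficients, RSS)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss


# Additive model: welfare = b0 + b1*land + b2*schooling
additive_rows = [[1, ha, yrs] for ha, yrs in zip(land, schooling)]
# Interactive model adds the product term b3*land*schooling
interactive_rows = [[1, ha, yrs, ha * yrs] for ha, yrs in zip(land, schooling)]

beta_add, rss_add = fit_ols(additive_rows, welfare)
beta_int, rss_int = fit_ols(interactive_rows, welfare)

# F statistic for the single extra parameter (compare with an F(1, n - 4) table)
n = len(welfare)
f_stat = (rss_add - rss_int) / (rss_int / (n - 4))

print("additive coefficients:   ", [round(b, 3) for b in beta_add])
print("interactive coefficients:", [round(b, 3) for b in beta_int])
print("RSS additive:", round(rss_add, 3), "RSS interactive:", round(rss_int, 3))
print("F statistic for the interaction term:", round(f_stat, 2))
```

A large F statistic only says that the product term fits these particular numbers better. Whether land and schooling should interact at all is still a question for the domain expert.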
It is a ground reality that there may be limitations on the use of various statistical methods. What we can do is state all our limitations in the report. For example, if we are using secondary data that do not come from a random sample, we should mention the possible sources of bias. See “How to Lie with Statistics”.
The problem for the pure statistician is that socio-economic data are generally not suitable for advanced statistical tools. For the applied statistician, creativity lies in using the tools of one domain in another. For example, the life table method is generally used by demographers, but it may also be used to understand dropout in education. Similarly, hazard-based models used in medicine may be used in economics. The real challenge before the pure statistician is to gain sufficient expertise in different domains quickly, explore whatever the data can say, and publish it with its limitations.
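To make the life table example concrete, here is a minimal sketch of how a demographer's cohort life table might be borrowed for school dropout. Everything in it is an assumption made for illustration: the function name `dropout_life_table`, the cohort of 1000 entrants and the dropout counts are invented, and the sketch ignores transfers, grade repetition and censoring, which a real analysis would have to handle.

```python
def dropout_life_table(entrants, dropouts_by_grade):
    """Build a simple cohort life table for school dropout.

    entrants: number of pupils who entered grade 1 (hypothetical cohort).
    dropouts_by_grade: number who dropped out during grade 1, 2, 3, ...
    Each row gives, for one grade, the pupils at risk, the conditional
    dropout probability q, and the share of the original cohort remaining.
    """
    if entrants <= 0:
        raise ValueError("entrants must be positive")
    table = []
    at_risk = entrants
    for grade, dropped in enumerate(dropouts_by_grade, start=1):
        if at_risk == 0:
            break  # nobody left to observe
        if dropped < 0 or dropped > at_risk:
            raise ValueError(f"impossible dropout count in grade {grade}")
        remaining = at_risk - dropped
        table.append({
            "grade": grade,
            "at_risk": at_risk,
            "dropped": dropped,
            "q": dropped / at_risk,             # dropout probability in this grade
            "survival": remaining / entrants,   # share of cohort still enrolled
        })
        at_risk = remaining
    return table


# Invented cohort: 1000 entrants followed through four grades.
rows = dropout_life_table(1000, [50, 95, 171, 171])
for row in rows:
    print(f"grade {row['grade']}: at risk {row['at_risk']}, "
          f"q = {row['q']:.3f}, still enrolled = {row['survival']:.3f}")
```

The column `q` plays the role of the mortality rate in a demographic life table, and `survival` plays the role of the survivorship column; only the domain meaning has changed.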
Generally, we want to apply statistical tools in the absolute, without acquiring domain expertise (or the collaboration of a domain expert). One reason is that we are taught by experts (tool developers, not users) who never emphasize the role of context. We are taught in terms of random variables. That is why we think statistical tools may be applied in the absolute.
The steps mentioned above are closest to causal modeling, which covers a large proportion of human thinking. There may be other types of modeling (such as those used to explain queues and networks). The steps used for such models will be different.
In conclusion, I want to say: search for methods and data according to the needs of the problem, instead of searching for a problem and methods that suit the data. Do not exercise to change your body to fit an already made (sometimes second-hand) shirt (data and method). It is better to make a shirt that fits your body. I know this philosophy will not suit many applied statisticians who are under pressure to produce more research papers. Applied statistics in the socio-economic area is a long road, which starts from case studies and participatory (qualitative) surveys and leads to the use of data from large-scale quantitative surveys collected by others.