This is a link to an editorial in Basic and Applied Social Psychology. It announces that authors may no longer use inferential statistics in the journal.

“What?”, you ask. Does that have anything to do with evaluation? Yes and no. Most of my readers will not publish here. They will publish in evaluation journals (of which there are many) or, if they are Extension professionals, in the Journal of Extension. And as far as I know, BASP is the only journal that has established an outright ban on inferential statistics. So evaluation journals and JoE still accept them.

Still, if one journal can ban the practice, can others?

What exactly does that mean, no inferential statistics? The journal editors define the ban this way: “…the null hypothesis significance testing procedure is invalid and thus authors would not be required to perform it.” That means that authors must remove all references to p-values, t-values, F-values, or any statements about significant differences (or the lack thereof) prior to publication. The editors go on to discuss the use of confidence intervals (no) and Bayesian methods (case-by-case), and what statistical procedures the journal does require instead.

This ban reminds me of a valuable lesson from my original statistics course (maybe not the original one, maybe several): the difference between practical significance and statistical significance. Something can be statistically significant (all it has to do is clear the p < .05 bar) and not be practically significant. To demonstrate this point, I offer the example of a three-point gain per semester. It was statistically significant at p < .05, but did it actually show that the students learned something over the semester? Does three points make that much difference? Three points is three questions on a 100-question test, or 1.5 questions on a 50-question test. How much would they have to learn to make a difference in their lives? You do the math.
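A quick sketch makes the point concrete. The numbers here are made up for illustration (I am assuming a 15-point standard deviation on the gain scores and a class of 400 students); the three-point gain comes out "significant" by the usual test, yet the effect size is small by Cohen's benchmarks:

```python
import math

# Hypothetical numbers for illustration: a 3-point mean gain on a
# 100-point test, standard deviation of 15, and 400 students.
gain_mean = 3.0
gain_sd = 15.0
n = 400

# One-sample t statistic testing the mean gain against zero.
t = gain_mean / (gain_sd / math.sqrt(n))  # 3 / (15/20) = 4.0

# Cohen's d: the gain in standard-deviation units, independent of n.
d = gain_mean / gain_sd                   # 0.2, "small" in Cohen's terms

print(f"t = {t:.1f} (easily past the p < .05 bar with n = {n})")
print(f"d = {d:.1f} (a small effect no matter the sample size)")
```

With a big enough n, almost any nonzero difference clears the significance bar; the effect size is what stays honest.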

The journal is requiring strong descriptive statistics INCLUDING EFFECT SIZE (go read Cohen) because effect size, unlike significance tests, is independent of sample size. Cohen's benchmarks for effect size are small (0.2), medium (0.5), and large (0.8). Descriptive statistics are those numbers which describe the SAMPLE (not the population from which the sample was drawn) and typically include measures of central tendency (mean, median, mode) and variability (range, standard deviation, kurtosis, and skewness), as well as frequency and percentage (i.e., distributional data).
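Here is a minimal sketch of what that kind of reporting looks like, using Python's standard library and two made-up groups of test scores (the data are invented for illustration only). It reports the descriptive statistics named above plus Cohen's d computed with a pooled standard deviation:

```python
import statistics as st

# Hypothetical test scores for two groups (invented for illustration).
control = [62, 65, 70, 70, 74, 78, 81]
treatment = [68, 71, 75, 75, 80, 84, 88]

for name, scores in (("control", control), ("treatment", treatment)):
    print(name,
          "mean:", round(st.mean(scores), 1),
          "median:", st.median(scores),
          "mode:", st.mode(scores),
          "range:", max(scores) - min(scores),
          "sd:", round(st.stdev(scores), 1))

# Cohen's d with a pooled standard deviation, so the group difference
# is expressed in standard-deviation units rather than raw points.
n1, n2 = len(control), len(treatment)
s1, s2 = st.stdev(control), st.stdev(treatment)
pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
d = (st.mean(treatment) - st.mean(control)) / pooled_sd
print("Cohen's d:", round(d, 2))
```

With these invented numbers the d lands in the "large" range by Cohen's benchmarks, and a reader can judge the practical meaning of the raw difference directly.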

By using a larger sample size, the “descriptive statistics become increasingly stable and sampling error is less of a problem.” The journal stops “…short of requiring particular sample sizes…”, stating, however, that “…it is possible to imagine circumstances where more typical sample sizes might be justifiable.” I do remember a voice advocating for reporting effect size; no one ever went so far as to talk down inferential statistics.
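The stability claim is easy to see in a small simulation. This sketch assumes a hypothetical population of test scores (normal, mean 75, sd 10) and shows how much the sample mean wobbles from draw to draw at different sample sizes:

```python
import random
import statistics as st

random.seed(1)

# Hypothetical population of test scores: Normal(mean=75, sd=10).
# At each sample size, draw 200 samples and see how much the
# sample mean varies across draws (the sampling error).
for n in (10, 100, 1000):
    means = [st.mean(random.gauss(75, 10) for _ in range(n))
             for _ in range(200)]
    spread = st.stdev(means)  # run-to-run variability of the sample mean
    print(f"n = {n:4d}: sample means vary with sd of about {spread:.2f}")
```

The wobble shrinks roughly with the square root of the sample size, which is exactly why small Extension samples make any single descriptive statistic a shaky estimate.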

What does that say for the small sample sizes Extension professionals typically achieve? I would suggest Extension professionals look at effect size.

my two cents.

molly.

1. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.



2 Comments

William Pate on 4 March, 2015 at 9:45 am

I agree that inferential statistics should be taken with a grain of salt. That was Cohen’s argument for considering effect sizes as a better measure of what’s going on (see his article, “The earth is round (p < .05)”). However, conducting an inferential test and finding no significant differences (provided all assumptions have been met) is telling in its own way: there is likely no effect of treatment. Although null hypothesis testing has been relied on too heavily by many researchers and journal editors (ditching good research because it failed to find significance), and it may not amount to much when considering other approaches, to ban it outright seems a bit much.


Eli Sagor on 4 March, 2015 at 10:48 am

Thanks Molly for a great post. Banning inferential statistics seems a bit rash. With a small sample, even a large effect can be statistically nonsignificant. Your recommendation to focus more attention on effect size is a good one. But banning a whole class of statistical procedures seems unwise.

