A colleague asks, “What is the appropriate statistical analysis test when comparing means of two groups?”
I’m assuming (yes, I know what assuming does) that parametric tests are appropriate for what the colleague is doing. Parametric tests (e.g., the t-test and ANOVA) are appropriate when the parameters of the population are known. If that is the case (and non-parametric tests are not being considered), I need to clarify the assumptions underlying the use of parametric tests, which are more stringent than those of non-parametric tests. Those assumptions are the following:
The sample is
- randomized (either by assignment or selection).
- drawn from a population which has specified parameters.
- normally distributed.
- demonstrating equality of variance in each variable.
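A quick way to get a feel for the equality-of-variance assumption is to compare the sample variances of the two groups directly. The sketch below is only a rough screen, not a formal test (Levene's test is one formal option). The function name describe_groups and the scores are hypothetical, made up for illustration, and how large a variance ratio is "too large" is a judgment call that varies by textbook.

```python
from statistics import mean, variance


def describe_groups(group_a, group_b):
    """Return each group's mean and sample variance, plus the variance ratio."""
    var_a = variance(group_a)  # sample variance (n - 1 in the denominator)
    var_b = variance(group_b)
    # Larger variance over smaller, so the ratio is always >= 1.
    # Assumes neither group has zero variance.
    ratio = max(var_a, var_b) / min(var_a, var_b)
    return {
        "mean_a": mean(group_a),
        "mean_b": mean(group_b),
        "var_a": var_a,
        "var_b": var_b,
        "variance_ratio": ratio,
    }


# Hypothetical scores, invented for this example only.
group_a = [12, 15, 14, 10, 13]
group_b = [11, 18, 9, 16, 12]

summary = describe_groups(group_a, group_b)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A ratio close to 1 suggests the two groups vary about equally; a ratio far above 1 is a signal to look more closely before relying on a test that assumes equal variances.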
If those assumptions are met, part of the answer is, “It all depends.” (I know you have heard that before today.)
I will ask the following questions:
- Do you know the parameters (measures of central tendency and variability) for the data?
- Are they dependent or independent samples?
- Are they intact populations?
Once I know the answers to these questions I can suggest a test.
My current favorite statistics book, Statistics for People Who (Think They) Hate Statistics, by Neil J. Salkind (4th ed.) has a flow chart that helps you by asking if you are looking at differences between the sample and the population and relationships or differences between one or more groups. The flow chart ends with the name of a statistical test. The caveat is that you are working with a sample from a larger population that meets the above stated assumptions.
Which test you can use also depends on how you answer the questions above. If you do not know the parameters, you will NOT use a parametric test. If you are using an intact population (and many Extension professionals use intact populations), you will NOT use inferential statistics, as you will not be inferring to anything bigger than what you have at hand. If you have two groups and the groups are related (like a pre-post test or a post-pre test), you will use a parametric or non-parametric test for dependent samples. If you have two groups and they are unrelated (like boys and girls), you will use a parametric or non-parametric test for independent samples. If you have more than two groups, you will use a different test yet.
Extension professionals are rigorous in their content material; they need to be just as rigorous in their analysis of the data collected from the content material. Understanding what analyses to use when is a good skill to have.
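As a rough sketch, the decision logic in the paragraph above can be written out as a small function. The function name suggest_test and its parameters are hypothetical. The specific test names it returns (paired t-test, Wilcoxon signed-rank, Mann-Whitney U, Kruskal-Wallis, and so on) are common textbook choices that I am assuming here; they are not taken from the flow chart in the book.

```python
def suggest_test(n_groups, related, parametric, intact_population=False):
    """Suggest a test for comparing groups, following the questions above.

    n_groups: number of groups being compared (2 or more)
    related: True for dependent samples (e.g., pre-post), False for independent
    parametric: True if the parametric assumptions are met
    intact_population: True if the data cover everyone of interest
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")

    # With an intact population there is nothing larger to infer to.
    if intact_population:
        return "descriptive statistics only (no inferential test)"

    if n_groups > 2:
        # Assumed common choices for independent groups; the post only says
        # "a different test". Related groups would need yet another test.
        if related:
            return "a repeated-measures test (not covered here)"
        return "one-way ANOVA" if parametric else "Kruskal-Wallis test"

    if related:
        if parametric:
            return "paired (dependent-samples) t-test"
        return "Wilcoxon signed-rank test"

    if parametric:
        return "independent-samples t-test"
    return "Mann-Whitney U test"


# Example: a pre-post comparison that meets the parametric assumptions.
print(suggest_test(2, related=True, parametric=True))
# Example: boys versus girls, assumptions not met.
print(suggest_test(2, related=False, parametric=False))
```

The order of the checks mirrors the reasoning: the intact-population question comes first because it rules out inferential statistics entirely, and only then do the number of groups, dependence, and the parametric assumptions matter.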
What is an intact population? I am guessing that it means that your data comes from a 100% sample of the entire population (e.g., participants in a workshop – you give a survey to everyone in a workshop, not a random sample of participants).
Amy, the term I’ve always used is “intact population,” meaning all the possible participants who are included in the variable of interest. Perhaps a more accurate phrase would be finite population or inclusive population. The example you give about participants in a workshop clearly defines the population–all the folks who could and did attend as opposed to all the possible folks who are eligible to attend. You only evaluate those who attended–intact.
Thanks for posting this. The book you reference looks like such a great resource to supplement all the other quant books we amass over the years in this profession. Besides being an evaluator, I also work for AEA, and they are holding a 6-hour workshop that covers similar information. I wonder if your readers might have an interest? http://comm.eval.org/dem/Resources/