The question of the week is:

What statistical test do I use when I have pre/post reflective questions?

First, what is a reflective question?

Ask says: “A reflective question is a question that requires an individual to think about their knowledge or information, before giving a response. A reflective question is mostly used to gain knowledge about an individual’s personal life.”

I assume (and we have talked about assumptions before) that these items were scaled to some hierarchy, like a lot to a little, with a number assigned to each. Since the questions are pre/post, they are “matched” and can be compared using a test for dependent samples, like a paired t-test or a Wilcoxon signed-rank test. However, if the questions are truly nominal (i.e., “know” and “not know”), are in response to some prompt, and DO NOT have a keyed response (like specific knowledge questions), then even though the same person answered the pre questions and the post questions, there really isn’t established dependence.
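For the scaled, matched case, the paired comparison can be sketched in a few lines. This is only an illustration: the five respondents and their 1-to-5 scores are made up, and the function name is my own. It computes the paired t statistic from the pre/post differences; you would still look up the p-value (or let SPSS or another package report it).

```python
import math


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for matched pre/post scores."""
    if len(pre) != len(post):
        raise ValueError("pre and post must hold one matched pair per respondent")
    n = len(pre)
    if n < 2:
        raise ValueError("need at least two respondents")
    # The test works on each respondent's change, not on the raw scores.
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = math.sqrt(variance / n)
    return mean_diff / standard_error, n - 1


# Hypothetical 1-to-5 ratings from the same five respondents.
pre_scores = [2, 3, 2, 4, 3]
post_scores = [3, 4, 4, 4, 5]

t_stat, df = paired_t_statistic(pre_scores, post_scores)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

With small samples or clearly ordinal ratings, the Wilcoxon signed-rank test is the usual alternative; it is not shown here.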

If the data are nominal, then using a chi-square test would be the best approach because it will tell you if there is a difference between what was expected and what was actually observed (responded). On a pre/post reflective question, one would expect that some respondents would “know” the information before the intervention, say 50-50, and that after the intervention the split would shift to, say, 80 “know” to 20 “not know”. A chi-square test gives you the probability that a shift that large in the post distribution occurred by chance. SPSS will run this test; find it under the non-parametric tests.
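To make the 50-50 versus 80-20 example concrete, here is a minimal sketch of the chi-square arithmetic. It assumes 100 respondents (my assumption, so the percentages can be read as counts) and treats the 50-50 pre split as the expected distribution. The function name is hypothetical. With two categories there is one degree of freedom, and for that case the p-value can be computed from the complementary error function in the standard library.

```python
import math


def chi_square_two_categories(observed, expected):
    """Chi-square goodness-of-fit for exactly two categories (1 df).

    Returns (chi2, p_value).
    """
    if len(observed) != 2 or len(expected) != 2:
        raise ValueError("this sketch handles exactly two categories")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    # Sum of (observed - expected)^2 / expected over the categories.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts: 100 respondents, "know" listed first, "not know" second.
expected_counts = [50, 50]  # the 50-50 split expected before the intervention
observed_counts = [80, 20]  # the 80-20 split observed after the intervention

chi2, p_value = chi_square_two_categories(observed_counts, expected_counts)
print(f"chi-square = {chi2:.1f}, p = {p_value:.2g}")
```

Here the statistic works out to (30² / 50) + (30² / 50) = 36, far beyond what chance alone would produce with one degree of freedom.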

Miscellaneous thought 1.

Yesterday, I had a conversation with a longtime friend of mine. When we stopped and calculated (which we don’t do very often), we realized that we have known each other since 1981. We met at the first AEA (only it wasn’t AEA then) conference in Austin, TX. I was a graduate student; my friend was a practicing professional/academic. Although we were initially talking about other things in evaluation, I asked my friend to look at an evaluation form I was developing. I truly believe that having other eyes (a pilot, if you will) view the document helps. It certainly did in this case. I feel really good about the form. In the course of the conversation, my friend advocated strongly for odd-numbered scales. My friend had good reasons, specifically:

1) It tends to force more comparisons on the respondents; and

2) if you haven’t given me a neutral point, I tend to mess up the scale on purpose because you are limiting my ability to tell you what I am thinking.

I, of course, had an opposing view (rule number 8–question authority). I said, “My personal preference is an even-numbered scale to avoid a mid-point. This is important because I want to know if the framework (of the program in question) I provided worked well with the group, and a mid-point would provide the respondent with a neutral point of view, not a working or not working opinion. An even number (in my case four points) can be divided into working and not working halves. When I’m offered a middle point, I tend to circle that because folks really don’t want to know what I’m thinking. By giving me an opt out/neutral/neither for or against option, they are not asking my opinion or viewpoint.”

Recently, I came across an aea365 post on just this topic. Although this specific post was talking about Likert scales, it applies to all scaling that uses a range of numbers (as my friend pointed out). The authors sum up their views with this comment, “There isn’t a simple rule regarding when to use odd or even, ultimately that decision should be informed by (a) your survey topic, (b) what you know about your respondents, (c) how you plan to administer the survey, and (d) your purpose. Take time to consider these four elements coupled with the advantages and disadvantages of odd/even, and you will likely reach a decision that works best for you.” (Certainly knowing my friend like I do, I would be suspicious of responses that my friend submitted.) Although they list advantages and disadvantages for odd and even responses, I think there are other advantages and disadvantages that they did not mention yet are summed up in their concluding sentence.

Miscellaneous thought 2.

I’m reading the new (third) edition of Qualitative Data Analysis (QDA). This has always been my go-to book for QDA, and I was very sad when I learned that both of the original authors had died. The new author, Johnny Saldana (who is also the author of The Coding Manual for Qualitative Researchers), talks (in the third person plural, active voice) about being a pragmatic realist. That is an interesting concept. They (because the new author includes the previous authors in his statement) say “that social phenomena exist not only in the mind but also in the world–and that some reasonably stable relationships can be found among the idiosyncratic messiness of life.” Although I had never used those exact words before, I agree. It is nice to know the label that applies to my world view. Life is full of idiosyncratic messiness; that is probably why I think systems thinking is so important. I’m reading this volume because I’ve been asked to write the review of one of my favorite books. We will see if I can get through it between now and July 1, when the draft of the review is due. I probably ought to pair it with Saldana’s other book; that won’t happen between now and July 1.

I have a few thoughts about causation, which I will get to in a bit…first, though, I want to give my answers to the post last week.

I had listed the following and wondered if you thought they were a design, a method, or an approach. (I had also asked which of the 5Cs was being addressed–clarity or consistency.)  Here is what I think about the other question.

Case study is a method used when gathering qualitative data, that is, words as opposed to numbers.  Bob Stake, Robert Brinkerhoff, Robert Yin, and others have written extensively on this method.

Pretest-Posttest Control Group is (according to Campbell and Stanley, 1963) an example of a true experimental design if a control group is used (pg. 8 and 13). NOTE: if only one group is used (according to Campbell and Stanley, 1963), pretest-posttest is considered a pre-experimental design (pg. 7 and 8); still, it is a design.

Ethnography is a method used when gathering qualitative data, often used in evaluation by those with training in anthropology. David Fetterman is one such person who has written on this topic.

Interpretive is an adjective used to describe the approach one uses in an inquiry (whether that inquiry is as an evaluator or a researcher) and can be traced back to the sociologists Max Weber and Wilhelm Dilthey in the latter part of the 19th century.

Naturalistic is an adjective used to describe an approach with a diversity of constructions and is a function of “…what the investigator does…” (Lincoln and Guba, 1985, pg. 8).

Random Control Trials (RCT) is the “gold standard” of clinical trials, now being touted as the be all and end all of experimental design; its proponents advocate the use of RCT in all inquiry as it provides the investigator with evidence that X (not Y) caused Z.

Quasi-Experimental is a term used by Campbell and Stanley (1963) to denote a design where random assignment cannot be accomplished for ethical or practical reasons; this is often contrasted with random selection for survey purposes.

Qualitative is an adjective to describe an approach (as in qualitative inquiry), a type of data (as in qualitative data), or methods (as in qualitative methods). I think of qualitative as an approach which includes many methods.

Focus Group is a method of gathering qualitative data through the use of specific, structured interviews in the form of questions; it is also an adjective for defining the type of interviews or the type of study being conducted (Krueger & Casey, 2009, pg. 2).

Needs Assessment is a method for determining priorities for the allocation of resources and actions to reduce the gap between the existing and the desired.

I’m sure there are other answers to the terms listed above; these are mine. I’ve gotten one response (from Simon Hearn at BetterEvaluation). If I get others, I’ll aggregate them and share them with you. (Simon can check his answers against this post.)

Now causation, and I pose another question: If evaluation (remember the root word here is value) is determining if a program (intervention, policy, product, etc.) made a difference, and determining the merit or worth (i.e., value) of that program (intervention, policy, product, etc.), how certain are you that your program (intervention, policy, product, etc.) caused the outcome? Chris Lysy and Jane Davidson have developed several cartoons that address this topic. They are worth the time to read.

When I teach scientific writing (and all evaluators need to be able to communicate clearly verbally and in writing), I focus on the 5Cs: Clarity, Coherence, Conciseness, Consistency, and Correctness. I’ve written about the 5Cs in a previous blog post, so I won’t belabor them here. Suffice it to say that when I read a document that violates one (or more) of these 5Cs, I have to wonder.

Recently, I was reading a document where the author used design (first), then method, then approach. In reading the context, I think (not being able to clarify) that the author was referring to the same thing, a method, and used these different words in an effort to make the reading more entertaining, where all it did was cause obfuscation, violating Clarity, one of the 5Cs.

So I’ll ask you, reader. Are these different? What makes them different? Should they have been used interchangeably in the document? I went to my favorite thesaurus of evaluation terms (Scriven, published by Sage) to see what he had to say, if anything. Only “design” was listed and the definition said, “…process of stipulating the investigatory procedures to be followed in doing a certain evaluation…” OK–investigatory procedure.

So, I’m going to list several terms used commonly in evaluation and research.  Think about what each is–design, method, approach.  I’ll provide my answers next week.  Let me know what you think each of the following is:

Case Study

Pretest-Posttest Control Group

Ethnography

Interpretive

Naturalistic

Random Control Trials (RCT)

Quasi-Experimental

Qualitative

Focus Group

Needs Assessment