A reader commented that I need to be attending to my analytics not just reading my comments. Hmmm…
My question is: what do analytics tell me about making a difference? By providing an educational forum that changes people, am I making a difference? Keep in mind that I am an evaluator and that the root of the word evaluation is “value”. So I wonder: do the analytics tell me about the merit, worth, and value of this educational intervention?
What will the analytics really tell me about the readers? What will the comments tell me that the analytics don’t? Will the analytics tell me what difference this blog has made in the readers? Will analytics tell me about intention to change? How will analytics help me write posts to which more people will respond, or make me more of an authority in my posting?
I DO NOT KNOW.
If someone, anyone out in cyberspace, knows the answers (readers?), I’d love to hear from you. I blog weekly; sometimes more than weekly (like this week because, although I had the post written, I didn’t get it posted before I left the office, so I posted it when I came back). I check my blog regularly for comments. I approve those which provide thoughtful, meaningful responses for other readers as well as for me.
Another reader suggests that I look at the number of readers who have established an RSS feed or established a subscription. Hmmm…Not sure what that will tell me. I’ll talk to the IT folks for an answer to that question.
I would certainly appreciate any thoughts from readers.
After last week’s post on random sampling, I received a comment from a friend. She recommended a tool that might help calculate sample size, especially when the population is different from the list Dillman offers. It is called Macorr. It has fields for the same variables that Dillman lists AND the population size can vary. Very important, since Extension programs that repeat may not have a nice round number in the population.
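For readers who want to do the arithmetic themselves, the calculation behind tables like Dillman’s (and tools like Macorr) is the standard finite-population sample size formula. Here is a minimal sketch in Python; the function name is mine, but the formula reproduces the numbers Dillman tabulates:

```python
def dillman_sample_size(N, margin=0.03, p=0.5, z=1.96):
    """Finite-population sample size (the formula behind Dillman's table).

    N      -- population size
    margin -- acceptable margin of error (0.03 means plus or minus 3%)
    p      -- expected proportion split (0.5 = 50/50, the conservative choice)
    z      -- critical value for the confidence level (1.96 for 95%)
    """
    numerator = N * p * (1 - p)
    denominator = (N - 1) * (margin / z) ** 2 + p * (1 - p)
    return round(numerator / denominator)

# A population of 100 at 95% confidence:
print(dillman_sample_size(100))               # 92 (plus or minus 3%, 50/50 split)
print(dillman_sample_size(100, p=0.8))        # 87 (plus or minus 3%, 80/20 split)
print(dillman_sample_size(100, margin=0.10))  # 49 (plus or minus 10%, 50/50 split)
```

Because the formula corrects for the finite population, any population size works, not just the round numbers a printed table happens to list.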
She also says that you can calculate a random sample in Excel. She has yet to send me the directions on how to do that. I offer that to those of you who are much more adept at Excel than I am.
When she sends it to me, I will post it. In the meantime, we wait or use the resources already available.
A comment was made: How important is it to have geographical representation in your random sample?
Theoretically, the random sample allows all individuals in a population to have the same chance of being in the sample. Because of that chance, there is also an excellent likelihood that the geographic representation will be distributed to represent the population. Of course, you have to decide, before you sample, to what questions you want answers. If geographic areas may affect the outcome, then I would suggest the following. If you want to make sure that a particular area is represented (i.e., mixed metropolitan and rural areas), you can stratify on the type of representation you want. I’m doing this in an evaluation I will be undertaking this summer and fall. We hypothesize that the metropolitan areas are different from the mixed metropolitan/rural areas and that both are different from the rural areas. The evaluation team stratified on the density question and is randomly selecting in the three areas. I’ll let you know how the stratification worked.
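For those who would rather not stratify by hand, the idea can be sketched in a few lines of Python. The sampling frame and stratum sizes below are made up for illustration; only the technique (random selection within each density stratum) is the point:

```python
import random

random.seed(7)  # fixed seed so the example draw is reproducible

# Hypothetical sampling frame: (participant_id, density_stratum)
frame = ([(i, "metro") for i in range(1, 41)]
         + [(i, "mixed") for i in range(41, 71)]
         + [(i, "rural") for i in range(71, 101)])

def stratified_sample(frame, per_stratum):
    """Randomly select per_stratum participants within each stratum."""
    sample = []
    for stratum in sorted({s for _, s in frame}):
        members = [pid for pid, s in frame if s == stratum]
        sample.extend(random.sample(members, per_stratum))
    return sample

picked = stratified_sample(frame, 10)
print(len(picked))  # 30: ten from each of the three areas
```

Every participant still has a chance of selection; stratifying just guarantees that each area shows up in the sample in the proportion you chose, rather than leaving it to chance.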
A reader asked how to choose a sample for a survey. Good question.
My daughters are both taking statistics (one in college, one in high school) and this question has been mentioned more than once. So I’ll give you my take on sampling. There are a lot of resources out there (you know, references and other sources). My favorite is in Dillman 3rd edition, page 57.
Sampling is easier than most folks make it out to be. Most of the time you are dealing with an entire population. What, you ask, how can that be?
You are dealing with an entire population when you survey the audience of a workshop (population 20, or 30, or 50). You are dealing with a population when you deal with a series of workshops (anything under 100). Typically, workshops involve a small number of people; only happen once or twice; and rarely include participants who are there because they have to be there. If you have under 100, you have an entire population. They can all be surveyed.
Now if your workshop is a repeating event with different folks over the offerings, then you will have the opportunity to sample your population because it is over 100 (see Dillman, 3rd edition, page 57). If you have over 100 people to survey AND you have contact information for them, then you want to randomly sample from that population. Random selection (another name for random sampling) is very different from random assignment; I’m talking about random sampling.
Random sampling is a process where everyone gets an identification number (and an equal chance to be selected), sequentially; so 1–100. Then find a random number table, usually found in the back of statistics books. Close your eyes and let your hand drop onto a number. Let’s say that number is 56997. You know you need numbers between 1 and 100, and you will need (according to Dillman) at least 92 cases (participants) for a 95% confidence level with a plus or minus 3% margin of error and a 50/50 split, OR 87 cases (participants) if you expect an 80/20 split. So you look at the number and decide which two-digit number you will select (56, 69, 99, or 97). That is your first number.

Let us say you chose 99; that is the third two-digit number found in the above random number (56 and 69 being the first two). So participant 99 will be on the randomly selected (random sampling) list. Now you can go down the list, up the list, to the left or the right of the list and identify the next two-digit number in the same position. For this example, using the random numbers table from my old Minium stat book (for which I couldn’t find a picture since it is OLD; the table was copied from the Rand Corporation, A million random digits with 100,000 normal deviates, Glencoe, IL: The Free Press, 1955), the number going right is 41534, so I would choose participant number 53. Continuing right, with the number 01953, I would choose participant number 95, etc. If you come across a number that you have already chosen, go to the next number.

Do this process until you get the required number of cases (either 92 or 87). You can select fewer if you will accept a plus or minus 10% margin of error (49 or 38 cases) or a plus or minus 5% margin of error (80 or 71 cases). (I always go for the least margin of error, though.) Once you have identified the required number, drafted the survey, and secured IRB approval, you can send out the survey. We will talk about response rates next week.
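The random number table is the classic approach; if you have the population list in a script or spreadsheet, the same random selection can be done in one call. A minimal sketch in Python (the 92 comes from the Dillman figures above; the seed is only there so the draw can be reproduced):

```python
import random

random.seed(42)  # fixed seed so this example draw is reproducible

participants = list(range(1, 101))        # IDs 1-100, just like the example above
sample = random.sample(participants, 92)  # 92 cases: 95% confidence, +/-3%, 50/50 split

print(sorted(sample))  # each ID appears at most once, all between 1 and 100
```

`random.sample` selects without replacement, so it automatically handles the “if you come across a number that you have already chosen, go to the next number” rule.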
The question of surveys came up the other day. Again.
I got a query from a fellow faculty member and a query from the readership. (No, not a comment; just a query–although I now may be able to figure out why the comments don’t work.)
After getting a copy of Dillman for your desk, this is what I suggest: start with what you want to know.
This may be in the form of statements or questions. If the result is complicated, see if you can simplify it by breaking it into more than one statement or question. Recently, I got a “what we want to know” in the form of complicated research questions. I’m not sure that the resulting survey questions answered the research questions because of the complexity. (I’ll have to look at the research questions and the survey questions side by side to see.) Multiple simple statements/questions are easier to match to your survey questions, easier to see if you have survey questions that answer what you want to know. Remember: if you will not use the answer (data), don’t ask the question. Less can actually be more, in this case, and just because it would be interesting to know doesn’t mean the data will answer your “what you want to know” question.
Evaluators strive for evaluation use. (See: Patton, M. Q. (2008). Utilization-Focused Evaluation (4th ed.). Thousand Oaks, CA: Sage Publications, Inc.; and/or Patton, M. Q. (2011). Essentials of Utilization-Focused Evaluation. Thousand Oaks, CA: Sage Publications, Inc.) See also The Program Evaluation Standards, which lists utility (use) as the first attribute and standard for evaluators. (Yarbrough, D. B., Shulha, L. M., Hopson, R. K., & Caruthers, F. A. (2011). The Program Evaluation Standards: A guide for evaluators and evaluation users (3rd ed.). Thousand Oaks, CA: Sage Publications, Inc.)
Evaluation use is related to stated intention to change about which I’ve previously written. If your statements/questions of what you want to know will lead you to using the evaluation findings, then stating the question in such a way as to promote use will foster use, i.e., intention to change. Don’t do the evaluation for the sake of doing an evaluation. If you want to improve the program, evaluate. If you want to know about the program’s value, merit, and worth, evaluate. Then use. One way to make sure that you will follow-through is to frame your initial statements/questions in a way that will facilitate use. Ask simply.
I’ve just read Ernie House’s book, Regression to the Mean. It is a NOVEL about evaluation politics. A publisher’s review says, “Evaluation politics is one of the most critical, yet least understood aspects of evaluation. To succeed, evaluators must grasp the politics of their situation, lest their work be derailed. This engrossing novel illuminates the politics and ethics of evaluation, even as it entertains. Paul Reeder, an experienced (and all too human) evaluator, must unravel political, ethical, and technical puzzles in a mysterious world he does not fully comprehend. The book captures the complexities of evaluation politics in ways other works do not. Written expressly for learning and teaching, the evaluation novel is an unconventional foray into vital topics rarely explored.”
Many luminaries (Patton, Lincoln, Scriven, Weiss) made pre-publication comments. Although I found the book fascinating, I found the quote that is included attributed to Freud compelling. That quote is, “The voice of the intellect is a soft one, but it does not rest until it has gained a hearing. Ultimately, after endless rebuffs, it succeeds. This is one of the few points in which we can be optimistic about the future of mankind (sic).” Although Freud wasn’t speaking about evaluation, House contends that this statement applies, and goes on to say, “Sometimes you have to persist against your emotions as well as the emotions of others. None of us are rational.”
So how does rationality fit into evaluation? I would contend that it doesn’t. Although the intent of evaluation is to be objective, none of us can be because of what I have called personal and situational bias; what is known in the literature as cognitive bias. I contend that if one has cognitive bias (and everyone does), then that bias prevents us from being rational, try as we might. Our emotions get in the way. House’s comment (above) seems fitting to evaluation: evaluators must persist against personal emotions as well as the emotions of others. I would add persisting against personal and situational bias. I believe it is important to make explicit the personal and situational bias prior to commencing an evaluation. By clarifying the assumptions held by the stakeholders and the evaluator, surprises are minimized, and the evaluation may be more useful to program people.
Intention to change
I’ve talked about intention to change and how stating that intention out loud and to others makes a difference. This piece of advice is showing up in some unexpected places. If you state your goal, there is a higher likelihood that you will be successful. That makes sense. If you confess publicly (or even to a priest), you are more likely to do the penance/make the change. What I find interesting is that this is so evaluation. What difference did the intervention make? How does that difference relate to the merit, worth, and value of the program?
Lent started March 5. That is 40 days of discipline–giving up or taking on. That is a program. What difference will it make? Can you go 40 days without chocolate?
I got my last comment in November 2013. I miss comments. Sure, most of them were of the “check out this other web site” variety. Still, there were some substantive comments, and I’ve read those and archived them. My IT person doesn’t know what the impetus for this sudden stop was. Perhaps Google changed its search engine optimization code and my key words no longer rank near the top. So I don’t know if what I write is meaningful, is worthwhile, or is resonating with you, the reader, in any way. I have been blogging now for over four years…this is no easy task. Comments and/or questions would be helpful and give me some direction.
Chris Lysy cartoons in his blog. This week he blogged about logic models. He only included logic models that are drawn with boxes. What if the logic model is circular? How would it be different? Can it still lead to outcomes? Non-linear thinkers/cultures would say so. How would you draw it? Given that mind mapping may also be a model, how do they relate?
I’ve been reading about models lately; models that have been developed, models that are being used today, models that may be used tomorrow.
Webster (Seventh New Collegiate) Dictionary has almost two inches about models–I think my favorite definition is the fifth one: an example for imitation or emulation. It seems to be most relevant to evaluation. What do evaluators do if not imitate or emulate others?
To that end, I went looking for evaluation models. Jim Popham’s book has a chapter (Chapter 2, Alternative approaches to educational evaluation) on models. Fitzpatrick, Sanders, and Worthen have numerous chapters on “approaches” (what Popham calls models). (I wonder if this is just semantics?)
Models have appeared in other blogs (not called models, though). In Life in Perpetual Beta, Harold Jarche provides this view of how organizations have evolved and calls them forms. (The image below is credited to David Ronfeldt.)
(Looks like a model to me. I wonder what evaluators could make of this.)
The reading is interesting because it is flexible. It approaches the “if it works, use it” paradigm; the one I use regularly.
I’ll just list the models Popham uses and discuss them over the next several weeks. (FYI, both Popham and Fitzpatrick et al. talk about the overlap of models.) Why is a discussion of models important, you may ask? I’ll quote Stufflebeam: “The study of alternative evaluation approaches is important for professionalizing program evaluation and for its scientific advancement and operation” (2001, p. 9).
Popham lists the following models:
Popham does say that the model classification could have been done a different way. You will see that in the Fitzpatrick, Sanders, and Worthen volume where they talk about the following approaches:
They have a nice table that does a comparative analysis of alternative approaches (Table 10.1, pp. 249–251).
Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2011). Program evaluation: Alternative approaches and practical guidelines (4th ed.). Boston, MA: Pearson.
Popham, W. J. (1993). Educational Evaluation (3rd ed.). Boston, MA: Allyn and Bacon.
Stufflebeam, D. L. (2001). Evaluation models. New Directions for Evaluation (89). San Francisco, CA: Jossey-Bass.
People often say one thing and do another.
This came home clearly to me with a nutrition project conducted with fifth and sixth grade students over the course of two consecutive semesters. We taught them various nutrition and fitness concepts (nutrient density, empty calories, food groups, energy requirements, etc.). We asked them at the beginning to identify which snack they would choose if they were with their friends (apple, carrots, peanut butter crackers, chocolate chip cookie, potato chips). We asked them the same question at the end of the project. They said they would choose an apple both pre and post. On the pretest, in descending order, the students would then choose carrots, potato chips, chocolate chip cookies, and peanut butter crackers. On the post test, in descending order, the students would choose chocolate chip cookies, carrots, potato chips, and peanut butter crackers. (Although the sample sizes were reasonable [i.e., greater than 30], I’m not sure that the difference between 13.0% [potato chips] and 12.7% [peanut butter crackers] was significant. I do not have those data.) Then we also asked them to choose one real snack. What they said and what they did were not the same, even at the end of the project. Cookies won, hands down, in both the treatment and control groups. Discouraging to say the least; disappointing to be sure. What they said they would do and what they actually did were different.
Although this program ran from September through April, and is much longer than the typical professional development conference of a half day (or even a day), what the students said was different from what the students did. We attempted to measure knowledge, attitude, and behavior. We did not measure intention to change.
That experience reminded me of a finding of Paul Mazmanian. (I know I’ve talked about him and his work before; his work bears repeating.) He did a randomized controlled trial involving continuing medical education and commitment to change. After all, any program worth its salt will result in behavior change, right? So Paul Mazmanian set up this experiment involving doctors, the world’s worst folks with whom to try to change behavior.
He found that “…physicians in both the study and the control groups were significantly more likely to change (47% vs 7%, p<0.001) IF they indicated an INTENT (emphasis added in both cases) to change immediately following the lecture” (i.e., the continuing education program). He did a further study and found that a signature stating that they would change didn’t increase the likelihood that they would change.
Bottom line, measure intention to change in evaluating your programs.
Mazmanian, P. E., Daffron, S. R., Johnson, R. E., Davis, D. A., & Kantrowitz, M. P. (August 1998). Information about barriers to planned change: A randomized controlled trial involving continuing medical education lectures and commitment to change. Academic Medicine, 73(8), 882-886.
Mazmanian, P. E., Johnson, R. E., Zhang, A., Boothby, J. & Yeatts, E. J. (June, 2001). Effects of a signature on rates of change: A randomized controlled trial involving continuing education and the commitment-to-change model. Academic Medicine, 76(6), 642-646.
I may have mentioned naturalistic models before; if not, I should have labeled them as such.
Today, I’ll talk some more about those models.
These models are often described as qualitative. Egon Guba (who died in 2008) and Yvonna Lincoln (distinguished professor of higher education at Texas A&M University) talk about qualitative inquiry in their 1981 book, Effective Evaluation (it has a long subtitle). They indicate that there are two factors on which constraints can be imposed: 1) antecedent variables and 2) possible outcomes, with the first impinging on the evaluation at its outset and the second referring to the possible consequences of the program. They propose a 2×2 figure to contrast naturalistic inquiry with scientific inquiry depending on the constraints.
Besides Eisner’s model, Robert Stake and David Fetterman have developed models that fit this naturalistic approach. Stake’s model is called responsive evaluation, and Fetterman talks about ethnographic evaluation. Stake’s work is described in Standards-Based & Responsive Evaluation (2004). Fetterman has a volume called Ethnography: Step-by-Step (2010).
Stake contended that evaluators needed to be more responsive to the issues associated with the program, and that in being responsive, measurement precision would be decreased. He argued that an evaluation (and he is talking about educational program evaluation) would be responsive if it “orients more directly to program activities than to program intents; responds to audience requirements for information and if the different value perspectives present are referred to in reporting the success and failure of the program” (as cited in Popham, 1993, p. 42). He indicates that human instruments (observers and judges) will be the data gathering approaches. Stake views responsive evaluation as “informal, flexible, subjective, and based on evolving audience concerns” (Popham, 1993, p. 43). He indicates that this approach is based on anthropology as opposed to psychology.
More on Fetterman’s ethnography model later.
Fetterman, D. M. (2010). Ethnography step-by-step. Applied Social Research Methods Series, 17. Los Angeles, CA: Sage Publications.
Popham, W. J. (1993). Educational Evaluation (3rd ed.). Boston, MA: Allyn and Bacon.
Stake, R. E. (1975). Evaluating the arts in education: a responsive approach. Columbus, OH: Charles E. Merrill.
Stake, R. E. (2004). Standards-based & responsive evaluation. Thousand Oaks, CA: Sage Publications.
Warning: This post may contain information that is controversial.
Schools (local public schools) were closed (still are).
The University (which never closes) was closed for four days (now open).
The snow kept falling and falling and falling. (Thank you Sandra Thiesen for the photo.)
Eighteen inches. Then freezing rain. It is a mess (although as I write this, the sun is shining, and it is 39F and supposed to get to 45F by this afternoon).
This is a complex messy system (thank you Dave Bella). It isn’t getting better. This is the second snow Corvallis has experienced in the same number of months, with increasing amounts.
It rains in the valley in Oregon; IT DOES NOT SNOW.
Another example of a complex messy system is what is happening in the UK.
These are examples extreme events; examples of climate chaos.
Evaluating complex messy systems is not easy. There are many parts. If you hold constant one part, what happens to the others? If you don’t hold constant one part, what happens to the rest of the system? Systems thinking and systems evaluation have come of age with the 21st century; there were always people who viewed the world as a system, one part linked to another, indivisible. Soft systems theory dates back at least to von Bertalanffy, who developed general systems theory and published the book by the same name in 1968 (ISBN 0-8076-0453-4).
Evaluating systems is complicated and complex.
Bob Williams, along with Iraj Imam, edited the volume Systems Concepts in Evaluation (2007), and along with Richard Hummelbrunner, wrote the volume Systems Concepts in Action: A Practitioner’s Toolkit (2010). He is a leader in systems and evaluation.
These two books relate to my political statement at the beginning and complex messy systems. According to Amazon, the second book “explores the application of systems ideas to investigate, evaluate, and intervene in complex and messy situations”.
If you think your program works in isolation, think again. If you think your program doesn’t influence other programs, individuals, stakeholders, think again. You work in a complex messy system. Because you work in a complex messy system, you might want to simplify the situation (I know I do); only you can’t. You have to work within the system.
Might be worthwhile to get von Bertalanffy’s book; might be worthwhile to get Williams’s books; might be worthwhile to get a copy of Gunderson and Holling’s book Panarchy: Understanding Transformations in Systems of Humans and Nature.
After all, nature is a complex messy system.