A colleague asked, “How do you design an evaluation that can identify unintended consequences?” This was based on a statement about methodologies that “only measure the extent to which intended results have been achieved and are not able to capture unintended outcomes” (see AEA365). (The cartoon accompanying this post is attributed to Rob Cottingham.)

Really good question. Unintended consequences are just that: outcomes which are not what you think will happen with the program you are implementing. This is where program theory comes into play. When you model the program, you think of what you want to happen. What you want to happen is usually supported by the literature, not your gut (intuition may be useful for the unintended, however). A logic model lists the “intended” outcomes (consequences). So you run your program and you get something else, not necessarily bad, just not what you expected; the outcome is unintended.

Program theory can advise you that other outcomes could happen. How do you design your evaluation so that you can capture those? Mazmanian, in his 1998 study on intention to change, had an unintended outcome, one that has applications to any adult learning experience (1). So what method do you use to get at these? A general, open-ended question? Perhaps. Many (most?) people won’t respond to open-ended questions; they take too much time. OK. I can live with that. So what do you do instead? What does the literature say could happen, even if you didn’t design the program for that outcome? Ask that question, along with the questions about what you expect to happen.

How would you represent this in your logic model? By the ubiquitous “other”? Perhaps. Certainly easy that way. Again, look at program theory. What does it say? Then use what is said there. Or use “other”; then you are getting back to open-ended questions and run the risk of not getting a response. And if you only model “other,” do you really know what that “other” is?

I know that I won’t be able to get to world peace, so I look for what I can evaluate, and since I doubt I’ll have enough money to actually go and observe behaviors (certainly the ideal), I have to ask a question. In your question asking, you want a response, right? Then ask the specific question. Ask it in a way that elicits program influence: how confident is the respondent that X happened? How confident is the respondent that they can do X? How confident is the respondent that this outcome could have happened? You could ask if X happened (yes/no) and then ask the confidence questions (confidence questions are also known as self-efficacy questions). Bandura will be proud. See Bandura’s work on social cognitive theory, social learning theory, and self-efficacy for discussions of self-efficacy and social learning.
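If it helps to see what that pairing might look like in practice, here is a minimal sketch; the wording, scale, and item names are mine, purely for illustration, not from any validated instrument.

```python
# Hypothetical sketch: pair a yes/no outcome item with confidence
# (self-efficacy) items so program influence can be probed even for
# outcomes the program was not designed for.
items = [
    {"id": "q1", "type": "yes_no",
     "text": "Did you change how you plan your programs after the workshop?"},
    {"id": "q1a", "type": "confidence_0_10",
     "text": "How confident are you that this change happened because of the workshop?"},
    {"id": "q1b", "type": "confidence_0_10",
     "text": "How confident are you that you can sustain this change?"},
]

for item in items:
    print(f"{item['id']:>4} [{item['type']}] {item['text']}")
```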

my two cents

molly.

1. Mazmanian, P. E., Daffron, S. R., Johnson, R. E., Davis, D. A., & Kantrowitz, M. P. (1998). Information about barriers to planned change: A randomized controlled trial involving continuing medical education lectures and commitment to change. Academic Medicine, 73(8), 882-886.

On May 9, 2014, Dr. Don Kirkpatrick died at the age of 90. His approach (called a model) to evaluation was developed in 1954 and has served the training and development arena well since then; it continues to do so.

For those of you who are not familiar with the Kirkpatrick model, here is a primer, albeit short. (There are extensive training programs for getting certified in this model, if you want to know more.)

Don Kirkpatrick, Ph.D., developed the Kirkpatrick model when he was a doctoral student; it was the subject of his dissertation, which he defended in 1954. There are four levels (they are color coded on the Kirkpatrick website), and I quote:

Level 1: Reaction
To what degree participants react favorably to the training

Level 2: Learning
To what degree participants acquire the intended knowledge, skills, attitudes, confidence and commitment based on their participation in a training event

Level 3: Behavior
To what degree participants apply what they learned during training when they are back on the job

Level 4: Results
To what degree targeted outcomes occur as a result of the training event and subsequent reinforcement

Sounds simple, right? (Reminiscent of a logic model’s short-, medium-, and long-term outcomes.) He was the first to admit that it is difficult to get to level four (no world peace for this guy, unfortunately). We all know that behavior can be observed and reported, although self-report is fraught with problems (self-selection, desired response, other cognitive biases, etc.).

In a recent post, I said that 30 was the rule of thumb, i.e., 30 cases is the minimum needed in a group to be able to run inferential statistics and get meaningful results. How do I know? A colleague asked (specifically, “Would you say more about how it takes approximately 30 cases to get meaningful results, or a good place to find out more about that?”). When I was in graduate school, a classmate (who was into theoretical mathematics) showed me the mathematical formula behind this rule of thumb. Of course I don’t remember the formula, only the result. So I went looking for the explanation and found this site. Although my classmate did go into the details of the chi-square distribution and the formula computations, this article doesn’t do that. It even provides an Excel demo for calculating sample size and verifying this rule of thumb. I am so relieved that there is another source besides my memory.
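For readers who want to see the rule of thumb at work without the algebra, here is a minimal simulation sketch; it is mine, not the Excel demo from the linked article, and the skewed population and sample sizes are arbitrary choices.

```python
# Rough illustration of the "rule of 30": means of samples drawn from a
# skewed population look increasingly normal as the group size grows.
import numpy as np

rng = np.random.default_rng(42)
population = rng.exponential(scale=2.0, size=100_000)  # deliberately skewed

for n in (5, 10, 30, 100):
    means = np.array([rng.choice(population, size=n).mean()
                      for _ in range(2_000)])
    skew = float(np.mean(((means - means.mean()) / means.std()) ** 3))
    print(f"n={n:>3}  skewness of the sample means = {skew:.2f}")
# The skewness shrinks toward 0 (a normal distribution) as n grows; by
# about n = 30 it is usually small enough for ordinary inferential tests.
```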

 


My friend, Susan, who promised instructions on how to use Excel to select a sample, wrote a post on that very topic: Excel and random sampling.

She tells me that using screen shots made providing instructions easier, so she posted it as a tutorial. Thank you, Susan, for adding this information. Now there is no reason for not selecting a random sample from your large population. Whether the sample will respond or not is out of your control.
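If you would rather not use Excel, the same kind of random selection can be sketched in a few lines of Python; the file name and column below are made up for illustration only.

```python
# Randomly select survey recipients from a roster (hypothetical CSV with
# an "email" column); roughly what an Excel-based draw accomplishes.
import csv
import random

with open("participants.csv", newline="") as f:   # hypothetical roster file
    participants = [row["email"] for row in csv.DictReader(f)]

random.seed(2014)                # fixed seed so the selection is repeatable
sample = random.sample(participants, k=min(92, len(participants)))
print(len(sample), "participants randomly selected")
```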

Response rates are another thing, to be covered later.

A reader asked how to choose a sample for a survey. Good question.

My daughters are both taking statistics (one in college, one in high school), and this question has come up more than once. So I’ll give you my take on sampling. There are a lot of resources out there (you know, references and other sources). My favorite is in Dillman, 3rd edition, page 57.

Sampling is easier than most folks make it out to be. Most of the time you are dealing with an entire population. What, you ask, how can that be?

You are dealing with an entire population when you survey the audience of a workshop (population 20, or 30, or 50). You are dealing with a population when you deal with a series of workshops (anything under 100). Typically, workshops involve a small number of people, only happen once or twice, and rarely include participants who are there because they have to be there. If you have under 100, you have an entire population. They can all be surveyed.

Now if your workshop is a repeating event with different folks over the offerings, then you will have the opportunity to sample your population because it is over 100 (see Dillman, 3rd edition, page 57). If you have over 100 people to survey AND you have contact information for them, then you want to randomly sample from that population. Random selection (another name for random sampling) is very different from random assignment; I’m talking about random sampling.

Random sampling is a process where everyone gets an identification number (and an equal chance to be selected), sequentially, so 1-100. Then find a random number table, usually found in the back of statistics books. Close your eyes and let your hand drop onto a number. Let’s say that number is 56997. You know you need numbers between 1 and 100, and you will need (according to Dillman), for a 95% confidence level with a plus or minus 3% margin of error and a 50/50 split, at least 92 cases (participants), OR if you expect an 80/20 split, you will need 87 cases (participants). So you look at the number and decide which two-digit number you will select (56, 69, 99, or 97). That is your first number. Let us say you chose 99, the third two-digit number found in the above random number (56 and 69 being the first two). So participant 99 will be on the randomly selected (random sampling) list.

Now you can go down the list, up the list, to the left or the right of the list and identify the next two-digit number in the same position. For this example, using the random numbers table from my old Minium stat book (for which I couldn’t find a picture since it is OLD; the table was copied from the Rand Corporation, A million random digits with 100,000 normal deviates, Glencoe, IL: The Free Press, 1955), the number going right is 41534, so I would choose participant number 53. Continuing right, with the number 01953, I would choose participant number 95, etc. If you come across a number that you have already chosen, go to the next number. Do this until you get the required number of cases (either 92 or 87). You can select fewer if you want a 10% plus or minus margin of error (49, 38) or a 5% plus or minus margin of error (80, 71). (I always go for the smallest margin of error, though.) Once you have identified the required number, drafted the survey, and secured IRB approval, you can send out the survey. We will talk about response rates next week.
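If you would rather compute the required number of cases than read it from Dillman’s table, the finite-population formula behind those numbers can be sketched like this; the function name is mine, and rounding may differ from the published table by a case.

```python
# Finite-population sample size:
#   Ns = Np * p * (1 - p) / ((Np - 1) * (B / C)**2 + p * (1 - p))
# Np = population size, p = expected split (e.g., 0.5 or 0.8),
# B = margin of error as a proportion, C = z-score for the confidence level.

def sample_size(population, split=0.5, margin=0.03, z=1.96):
    variance = split * (1 - split)
    return round(population * variance /
                 ((population - 1) * (margin / z) ** 2 + variance))

for margin in (0.03, 0.05, 0.10):
    print(f"+/-{margin:.0%}: 50/50 split -> {sample_size(100, 0.5, margin)}, "
          f"80/20 split -> {sample_size(100, 0.8, margin)}")
# For a population of 100 this reproduces the figures above:
# 92/87 at +/-3%, 80/71 at +/-5%, and 49/38 at +/-10%.
```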

The question of surveys came up the other day. Again.

I got a query from a fellow faculty member and a query from the readership. (No, not a comment; just a query, although I now may be able to figure out why the comments don’t work.)

So, surveys: a major part of evaluation work. (My go-to book on surveys is Dillman’s 3rd edition; I understand there is a 4th edition coming later this year.)

After getting a copy of Dillman for your desk, here is what I suggest: start with what you want to know.

This may be in the form of statements or questions. If the result is complicated, see if you can simplify it by breaking it into more than one statement or question. Recently, I got a “what we want to know” in the form of complicated research questions. I’m not sure that the resulting survey questions answered the research questions because of the complexity. (I’ll have to look at the research questions and the survey questions side by side to see.) Multiple simple statements/questions are easier to match to your survey questions and make it easier to see whether you have survey questions that answer what you want to know. Remember: if you will not use the answer (data), don’t ask the question. Less can actually be more in this case, and just because something would be interesting to know doesn’t mean the data will answer your “what you want to know” question.

Evaluators strive for evaluation use. (See Patton, M. Q. (2008). Utilization-Focused Evaluation, 4th ed. Thousand Oaks, CA: Sage Publications, Inc., and/or Patton, M. Q. (2011). Essentials of Utilization-Focused Evaluation. Thousand Oaks, CA: Sage Publications, Inc.) See also The Program Evaluation Standards, which lists utility (use) as the first attribute and standard for evaluators. (Yarbrough, D. B., Shulha, L. M., Hopson, R. K., & Caruthers, F. A. (2011). The Program Evaluation Standards: A guide for evaluators and evaluation users, 3rd ed. Thousand Oaks, CA: Sage Publications, Inc.)

Evaluation use is related to stated intention to change, about which I’ve previously written. If your statements/questions of what you want to know will lead you to using the evaluation findings, then stating the question in a way that promotes use will foster use, i.e., intention to change. Don’t do the evaluation for the sake of doing an evaluation. If you want to improve the program, evaluate. If you want to know about the program’s value, merit, and worth, evaluate. Then use. One way to make sure that you will follow through is to frame your initial statements/questions in a way that will facilitate use. Ask simply.

People often say one thing and do another.

This came home clearly to me with a nutrition project conducted with fifth and sixth grade students over the course of two consecutive semesters. We taught them various nutrition and fitness concepts (nutrient density, empty calories, food groups, energy requirements, etc.). We asked them at the beginning to identify which snack they would choose if they were with their friends (apple, carrots, peanut butter crackers, chocolate chip cookie, potato chips). We asked them the same question at the end of the project. They said they would choose an apple both pre and post. On the pretest, in descending order, the students would then choose carrots, potato chips, chocolate chip cookies, and peanut butter crackers. On the posttest, in descending order, the students would then choose chocolate chip cookies, carrots, potato chips, and peanut butter crackers. (Although the sample sizes were reasonable [i.e., greater than 30], I’m not sure that the difference between 13.0% [potato chips] and 12.7% [peanut butter crackers] was significant. I do not have those data.) Then we also asked them to choose one real snack. What they said and what they did were not the same, even at the end of the project. Cookies won, hands down, in both the treatment and control groups. Discouraging to say the least; disappointing to be sure. What they said they would do and what they actually did were different.
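For anyone who does have the counts, a two-proportion z-test is one common way to check a difference like 13.0% versus 12.7%; the counts below are made up purely to show the mechanics.

```python
# Two-proportion z-test sketch (hypothetical counts; the real data were
# not available). With percentages this close and samples this size,
# the difference is nowhere near significant.
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value

z, p = two_proportion_z(26, 200, 25, 197)          # ~13.0% vs ~12.7%
print(f"z = {z:.2f}, p = {p:.2f}")
```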

Although this program ran from September through April, much longer than the typical professional development conference of a half day (or even a day), what the students said was different from what the students did. We attempted to measure knowledge, attitude, and behavior. We did not measure intention to change.

That experience reminded me of a finding of Paul Mazmanian’s. (I know I’ve talked about him and his work before; his work bears repeating.) He did a randomized controlled trial involving continuing medical education and commitment to change. After all, any program worth its salt will result in behavior change, right? So Paul Mazmanian set up this experiment involving doctors, the world’s worst folks with whom to try to change behavior.

He found that “…physicians in both the study and the control groups were significantly more likely to change (47% vs 7%, p<0.001) IF they indicated an INTENT (emphasis added in both cases) to change immediately following the lecture” (i.e., the continuing education program). He did a further study and found that a signature stating that they would change didn’t increase the likelihood that they would change.

Bottom line, measure intention to change in evaluating your programs.

References:

Mazmanian, P. E., Daffron, S. R., Johnson, R. E., Davis, D. A., & Kantrowitz, M. P. (1998). Information about barriers to planned change: A randomized controlled trial involving continuing medical education lectures and commitment to change. Academic Medicine, 73(8), 882-886.

Mazmanian, P. E., Johnson, R. E., Zhang, A., Boothby, J., & Yeatts, E. J. (2001). Effects of a signature on rates of change: A randomized controlled trial involving continuing education and the commitment-to-change model. Academic Medicine, 76(6), 642-646.

 

When Elliot Eisner died in January, I wrote a post on his work as I understood it.

I may have mentioned naturalistic models; if not, I should have labeled them as such.

Today, I’ll talk some more about those models.

These models are often described as qualitative. Egon Guba (who died in 2008) and Yvonna Lincoln (distinguished professor of higher education at Texas A&M University) talk about qualitative inquiry in their 1981 book, Effective Evaluation (it has a long subtitle). They indicate that there are two factors on which constraints can be imposed: 1) antecedent variables and 2) possible outcomes, with the first impinging on the evaluation at its outset and the second referring to the possible consequences of the program. They propose a 2×2 figure contrasting naturalistic inquiry and scientific inquiry depending on the constraints.

Besides Eisner’s model, Robert Stake and David Fetterman have developed models that fit this category. Stake’s model is called responsive evaluation, and Fetterman talks about ethnographic evaluation. Stake’s work is described in Standards-Based & Responsive Evaluation (2004). Fetterman has a volume called Ethnography: Step-by-Step (2010).

Stake contended that evaluators needed to be more responsive to the issues associated with the program and that, in being responsive, measurement precision would be decreased. He argued that an evaluation (and he is talking about educational program evaluation) would be responsive if it “orients more directly to program activities than to program intents; responds to audience requirements for information and if the different value perspectives present are referred to in reporting the success and failure of the program” (as cited in Popham, 1993, p. 42). He indicates that human instruments (observers and judges) will be the data gathering approaches. Stake views responsive evaluation as “informal, flexible, subjective, and based on evolving audience concerns” (Popham, 1993, p. 43). He indicates that this approach is based on anthropology as opposed to psychology.

More on Fetterman’s ethnography model later.

References:

Fetterman, D. M. (2010). Ethnography: Step-by-step. Applied Social Research Methods Series, 17. Los Angeles, CA: Sage Publications.

Popham, W. J. (1993). Educational Evaluation (3rd ed.). Boston, MA: Allyn and Bacon.

Stake, R. E. (1975). Evaluating the arts in education: A responsive approach. Columbus, OH: Charles E. Merrill.

Stake, R. E. (2004). Standards-based & responsive evaluation. Thousand Oaks, CA: Sage Publications.

 

 

 

 

On February 1 at 12:00 pm PT, I will be holding my annual virtual tea party. This is something I’ve been doing since February of 1993. I was in Minnesota, and the winter was very cold; although not as bleak as winter in Oregon, I was missing my friends who did not live near me. I had a tea party for the folks who were local and wanted to think that those who were not local were enjoying the tea party as well. So I created a virtual tea party. At that time, the internet was not available; all this was done in hard copy (to this day, I have one or two friends who do not have internet…sigh…). Today, the internet makes the tea party truly virtual; well, the invitation is. You have to have a real cup of tea wherever you are.
Virtual Tea Time 2014

 

How is this evaluative?  Gandhi says that only you can be the change you want to see…this is one way you can make a difference.  How will you know?

I know because my list of invitees has grown exponentially, and some of them share the invitation. They pass it on. I started with a dozen or so friends. Now my address list is over three pages long, including my daughters and the daughters of my friends (and maybe sons, too, for that matter…).

Other ways:  Design an evaluation plan; develop a logic model; create a metric/rubric.  Report the difference.  This might be a good place for using an approach other than a survey or Likert scale.  Think about it.

Evaluation models abound.

A model is a set of plans.

Educational evaluation models are plans that could “lead to more effective evaluations” (Popham, 1993, p. 23). Popham (1993) goes on to say that there was little or no thought given to making a new evaluation model distinct from other models, so that in sorting models into categories, the categories “fail to satisfy…without overlap” (p. 24). Popham employs five categories:

  1. Goal-attainment models;
  2. Judgmental models emphasizing inputs;
  3. Judgmental models emphasizing outputs;
  4. Decision-facilitation models; and
  5. Naturalistic models.

I want to acquaint you with one of the naturalistic models, the connoisseurship model. (I hope y’all recognize the work of Guba and Lincoln in the evolution of naturalistic models; if not, I have listed several sources below.) Elliot Eisner drew upon his experience as an art educator and used art criticism as the basis for this model. His approach relies on educational connoisseurship and educational criticism. Connoisseurship focuses on complex entities (think art, wine, chocolate); criticism is a form which “discerns the qualities of an event or object” (Popham, 1993, p. 43) and puts into words that which has been experienced. This verbal presentation allows those of us who do not possess the critic’s expertise to understand what was perceived. Eisner advocated that design is all about relationships and that relationships are necessary for the creative process and for thinking about the creative process. He proposed “that experienced experts, like critics of the arts, bring their expertise to bear on evaluating the quality of programs…” (Fitzpatrick, Sanders, & Worthen, 2004). He proposed an artistic paradigm (rather than a scientific one) as a supplement to other forms of inquiry. It is from this view that connoisseurship derives: connoisseurship is the art of appreciation, attending to the relationships between/among the qualities of the evaluand.

Elliot Eisner died January 10, 2014; he was 81. He was the Lee Jacks Professor of Education at Stanford Graduate School of Education.  He advanced the role of arts in education and used arts as models for improving educational practice in other fields.  His contribution to evaluation was significant.

Resources:

Eisner, E. W. (1975). The perceptive eye:  Toward the reformation of educational evaluation.  Occasional Papers of the Stanford Evaluation Consortium.  Stanford, CA: Stanford University Press.

Eisner, E. W. (1991a). Taking a second look: Educational connoisseurship revisited.  In Evaluation and education: At quarter century, ed. M. W. McLaughlin & D. C. Phillips.  Chicago: University of Chicago Press.

Eisner, E. W. (1991b). The enlightened eye: Qualitative inquiry and the enhancement of educational practice. New York: Macmillan.

Eisner, E. W., & Peshkin, A. (Eds.) (1990). Qualitative inquiry in education. New York: Teachers College Press.

Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2004). Program evaluation: Alternative approaches and practical guidelines (3rd ed.). Boston, MA: Pearson.

Guba, E. G., & Lincoln, Y. S. (1981). Effective evaluation: Improving the usefulness of evaluation results through responsive and naturalistic approaches.  San Francisco: Jossey-Bass.

Lincoln, Y. S. & Guba, E. G. (1985). Naturalistic Inquiry. Newbury Park, CA: Sage Publications.

Patton, M. Q. (2002).  Qualitative research & evaluation methods. 3rd ed. Thousand Oaks, CA: Sage Publications.

Popham, W. J. (1993). Educational evaluation. 3rd ed. Boston, MA: Allyn and Bacon.

 
