Evaluation is an Everyday Activity

“In reality, winning begins with accountability. You cannot sustain success without accountability. It is an absolute requirement!” (from walkthetalk.com.)

I’m quoting here. I wish I had thought of this before I read it. It is important in everyone’s life, and especially when evaluating.

Webster’s defines accountability as “…the quality or state of being accountable; an obligation (emphasis added) or willingness to accept responsibility for one’s actions.” The business dictionary goes a little further and defines accountability as “…The obligation of an individual (or organization) (parentheses added) to account for its activities, accept responsibility for them, and to disclose the results in a transparent manner.”

It’s that last part to which evaluators need to pay special attention: the “disclose results in a transparent manner” part. There is no one looking over your shoulder to make sure you do “the right thing”; that you read the appropriate document; that you report the findings you found, not what you know the client wants to hear. If you maintain accountability, you are successful; you will win.

AEA has adopted a set of Guiding Principles for the organization and its members. The principles are 1) Systematic inquiry; 2) Competence; 3) Integrity/Honesty; 4) Respect for people; and 5) Responsibilities for the General and Public Welfare. I can see where accountability lies within each principle. Can you?

AEA has also endorsed the Program Evaluation Standards, of which there are five as well. They are: 1) Utility, 2) Feasibility, 3) Propriety, 4) Accuracy, and 5) Evaluation accountability. Here, the developers made accountability a category of its own. The Standard states, “The evaluation accountability standards encourage adequate documentation of evaluations and a metaevaluative perspective focused on improvement and accountability for evaluation processes and products.”

You may be wondering about the impetus for this discussion of accountability (or, not…). I have been reminded recently that only the individual can be accountable. No outside person can do it for him or her. If there is an assignment, it is the individual’s responsibility to complete the assignment in the time required. If there is a task to be completed, it is the individual’s responsibility (and Webster’s would say obligation) to meet that responsibility. It is the evaluator’s responsibility to report the results in a transparent manner–even if it is not what was expected or wanted. As evaluators, we are adults (yes, some evaluation is completed by youth; they are still accountable) and, therefore, responsible, obligated, accountable. We are each one responsible–not the leader, the organizer, the boss. Each of us. Individually. When you are in doubt about your responsibility, it is your RESPONSIBILITY to clarify that responsibility however works best for you. (My rule to live by number 2: Ask. If you don’t ask, you won’t get; if you do, you might not get.)

Remember, only you are accountable for your behavior–No. One. Else. Even in an evaluation; especially in an evaluation.

You implement a program. You think it is effective; that it makes a difference; that it has merit and worth. You develop a survey to determine the merit and worth of the program. You send the survey out to the target audience, which is an intact population–that is, all of the participants are in the target audience for the survey. You get a response rate of less than 40%. What does that mean? Can you use the results to say that the participants saw merit in the program? Do the results indicate that the program has value, that it made a difference, if only 40% let you know what they thought?
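To make the question concrete, here is a minimal Python sketch of the arithmetic. The numbers (100 participants, 40 responses, 32 of them positive) are made up for illustration, and the function names are hypothetical. It shows one simple way to think about it, not a method from Dillman: treat every non-responder first as if they saw no merit, then as if they all did, and see how wide the gap is.

```python
def response_rate(n_responses: int, n_surveyed: int) -> float:
    """Share of the surveyed population that returned the survey."""
    if n_surveyed <= 0:
        raise ValueError("n_surveyed must be positive")
    return n_responses / n_surveyed


def worst_case_bounds(n_positive: int, n_responses: int, n_surveyed: int) -> tuple[float, float]:
    """Lowest and highest possible share of ALL participants who saw merit.

    Low end: assume every non-responder would have answered negatively.
    High end: assume every non-responder would have answered positively.
    """
    n_missing = n_surveyed - n_responses
    low = n_positive / n_surveyed
    high = (n_positive + n_missing) / n_surveyed
    return low, high


# Hypothetical numbers, chosen only for illustration.
rate = response_rate(n_responses=40, n_surveyed=100)
low, high = worst_case_bounds(n_positive=32, n_responses=40, n_surveyed=100)

print(f"Response rate: {rate:.0%}")
print(f"Participants who saw merit: somewhere between {low:.0%} and {high:.0%}")
```

With a range that wide, the responses alone cannot tell you whether most participants saw merit in the program, which is why reducing non-response matters.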

I went looking for some insights on non-responses and non-responders. Of course, I turned to Dillman (my go-to book for surveys…). His bottom line: “…sending reminders is an integral part of minimizing non-response error” (pg. 360).

Dillman (of course) has a few words of advice. For example, on page 360, he says, “Actively seek means of using follow-up reminders in order to reduce non-response error.” How do you not burden the target audience with reminders, which are “…the most powerful way of improving response rate…” (Dillman, pg. 360)? When reminders are sent, they need to be carefully worded and relate to the survey being sent. Reminders stress the importance of the survey and the need for responding.

Dillman also says (on page 361) to “…provide all selected respondents with similar amounts and types of encouragement to respond.” Since most of the time incentives are not an option for you, the program person, you have to encourage the participants in other ways. So we are back to reminders again.

To explore the topic of non-response further, there is a book (Groves, Robert M., Don A. Dillman, John Eltinge, and Roderick J. A. Little (eds.). 2002. Survey Nonresponse. Wiley-Interscience: New York) that deals with the topic. I don’t have it on my shelf, so I can’t speak to it. I found it while I was looking for information on this topic.

I also went on line to EVALTALK and found this comment which is relevant to evaluators attempting to determine if the program made a difference: “Ideally you want your non-response percents to be small and relatively even-handed across items. If the number of nonresponds is large enough, it does raise questions as to what is going for that particular item, for example, ambiguous wording or a controversial topic. Or, sometimes a respondent would rather not answer a question than respond negatively to it. What you do with such data depends on issues specific to your individual study.” This comment was from Kathy Race of Race & Associates, Ltd., September 9, 2003.

A bottom line I would draw from all this is: respond. If it was important to you to participate in the program, then it is important for you to provide feedback to the program implementation team/person.

This Thursday, the U.S. celebrates THE national holiday. I am reminded of all that comprises that holiday. No, not barbeque and parades; fireworks and leisure. Rather all the work that has gone on to assure that we as citizens CAN celebrate this Independence Day. The founding fathers (and yes, they were old [or not so old] white men) took great risks to stand up for what they believed. They did what I advocate: determined (through a variety of methods) the merit/worth/value of the program, and took a stand. To me, it is a great example of evaluation as an everyday activity. We now live under that banner of the freedoms for which they stood.

Oh, we may not agree with everything that has come down the pike over the years; some of us are quite vocal about the loss of freedoms because of events that have happened through no real fault of our own. We just happened to be citizens of the U.S. Could we have gotten to this place where we have the freedoms, obligations, responsibilities, and limitations without folks leading us? I doubt it. Anarchy is rarely, if ever, fruitful. Because we believe in leaders (even if we don’t agree with who is leading), we have to recognize that as citizens we are interdependent; we can’t do it alone (little red hen notwithstanding). Yes, the U.S. is known for the strength that is fostered in the individual (independence). Yet, if we really look at what a day looks like, we are interdependent on so many others for all that we do, see, hear, smell, feel, taste. We need to take a moment and thank our farmer, our leaders, our children (if we have them as they will be tomorrow’s leaders), our parents (if we are so lucky to still have parents), and our neighbors for being part of our lives. For fostering the interdependence that makes the U.S. unique. Evaluation is an everyday activity; when was the last time you recognized that you can’t do anything alone?

Happy Fourth of July–enjoy your blueberry pie!

I have a few thoughts about causation, which I will get to in a bit…first, though, I want to give my answers to the post last week.

I had listed the following and wondered if you thought they were a design, a method, or an approach. (I had also asked which of the 5Cs was being addressed–clarity or consistency.) Here is what I think about the other question.

Case study is a method used when gathering qualitative data, that is, words as opposed to numbers. Bob Stake, Robert Brinkerhoff, Robert Yin, and others have written extensively on this method.

Pretest-Posttest Control Group is (according to Campbell and Stanley, 1963) an example of a true experimental design if a control group is used (pg. 8 and 13). NOTE: if only one group is used (according to Campbell and Stanley, 1963), pretest-posttest is considered a pre-experimental design (pg. 7 and 8); still it is a design.

Ethnography is a method used when gathering qualitative data, often used in evaluation by those with training in anthropology. David Fetterman is one such person who has written on this topic.

Interpretive is an adjective used to describe the approach one uses in an inquiry (whether that inquiry is as an evaluator or a researcher) and can be traced back to the sociologists Max Weber and Wilhelm Dilthey in the latter part of the 19th century.

Naturalistic is an adjective used to describe an approach with a diversity of constructions and is a function of “…what the investigator does…” (Lincoln and Guba, 1985, pg. 8).

Randomized Controlled Trials (RCTs) are the “gold standard” of clinical trials, now being touted as the be all and end all of experimental design; proponents advocate the use of RCTs in all inquiry as they provide the investigator with evidence that X (not Y) caused Z.

Quasi-Experimental is a term used by Campbell and Stanley (1963) to denote a design where random assignment cannot be accomplished for ethical or practical reasons; random assignment is often contrasted with random selection for survey purposes.

Qualitative is an adjective to describe an approach (as in qualitative inquiry), a type of data (as in qualitative data) or methods (as in qualitative methods). I think of qualitative as an approach which includes many methods.

Focus Group is a method of gathering qualitative data through the use of specific, structured interviews in the form of questions; it is also an adjective for defining the type of interviews or the type of study being conducted (Krueger & Casey, 2009, pg. 2).

Needs Assessment is a method for determining priorities for the allocation of resources and actions to reduce the gap between the existing and the desired.

I’m sure there are other answers to the terms listed above; these are mine. I’ve gotten one response (from Simon Hearn at BetterEvaluation). If I get others, I’ll aggregate them and share them with you. (Simon can check his answers against this post.)

Now causation, and I pose another question: If evaluation (remember the root word here is value) is determining if a program (intervention, policy, product, etc.) made a difference, and determining the merit or worth (i.e., value) of that program (intervention, policy, product, etc.), how certain are you that your program (intervention, policy, product, etc.) caused the outcome? Chris Lysy and Jane Davidson have developed several cartoons that address this topic. They are worth the time to read.

Recently, I was privileged to see the recommendations of William (Bill) Tierney on the top education blogs. (Tierney is the Co-director of the Pullias Center for Higher Education at the University of Southern California.) He (among others) writes the blog, 21st scholar. The blogs are actually the recommendation of his research assistant Daniel Almeida. These are the recommendations:

What criteria were used? What criteria would you use? Some criteria that come to mind are interest, readability, length, frequency. But I’m assuming that they would be your criteria (and you know what assuming does…)

If I’ve learned anything in my years as an evaluator, it is to make assumptions explicit. Everyone comes to the table with built-in biases (called cognitive biases). I call them personal and situational biases (I did my dissertation on those biases). So by making your assumptions explicit (and thereby avoiding personal and situational biases), you are building a rubric because a rubric is developed from criteria for a particular product, program, policy, etc.

How would you build your rubric? Many rubrics are in chart format, that is, columns and rows with the criteria detailed in those cross boxes. That isn’t cast in stone. Given the different ways people view the world–linear, circular, webbed–there may be others. I would set yours up in the format that works best for you. The only thing to keep in mind is to be specific.

Now, perhaps you are wondering how this relates to evaluation in the way I’ve been using evaluation. Keep in mind evaluation is an everyday activity. And every day, all day, you perform evaluations. Rubrics formalize the evaluations you conduct–by making the criteria explicit. Sometimes you internalize them; sometimes you write them down. If you need to remember what you did the last time you were in a similar situation, I would suggest you write them down. No, you won’t end up with lots of little sticky notes posted all over. Use your computer. Create a file. Develop criteria that are important to you. Typically, the criteria are in a table format; an x by x form. If you are assigning numbers, you might want to have the rows be the numbers (for example, 1-10) and the columns be words that describe those numbers (for example, 1 boring; 10 stimulating and engaging). Rubrics are used in reviewing manuscripts, student papers, assigning grades to activities as well as programs. Your format might look like this:

Or it might not. In what other configurations have you seen rubrics? How would you develop your rubric? Or would you–perhaps you prefer a bunch of sticky notes. Let me know.
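If you take the "use your computer, create a file" route, a rubric can be as simple as a small table kept in code. The Python sketch below is only one possible layout. The criteria names come from the ones I listed above (interest, readability, length, frequency), and the 1-10 scale and the "boring" and "stimulating and engaging" anchors come from the example. Every other descriptor, and the names `RUBRIC` and `score_item`, are placeholders I made up.

```python
# Each criterion maps to (what a 1 looks like, what a 10 looks like).
# Only the "interest" anchors come from the post; the rest are placeholders.
RUBRIC = {
    "interest": ("boring", "stimulating and engaging"),
    "readability": ("hard to follow", "clear and easy to read"),
    "length": ("far too long or too short", "about right"),
    "frequency": ("rarely updated", "updated regularly"),
}

LOWEST, HIGHEST = 1, 10


def score_item(ratings: dict[str, int]) -> float:
    """Check that every criterion was rated on the 1-10 scale, then average."""
    missing = set(RUBRIC) - set(ratings)
    if missing:
        raise ValueError(f"no rating given for: {sorted(missing)}")
    for criterion, value in ratings.items():
        if criterion not in RUBRIC:
            raise ValueError(f"'{criterion}' is not a criterion in this rubric")
        if not LOWEST <= value <= HIGHEST:
            raise ValueError(f"rating for '{criterion}' must be {LOWEST}-{HIGHEST}")
    return sum(ratings.values()) / len(ratings)


# Hypothetical ratings for one blog.
my_ratings = {"interest": 8, "readability": 7, "length": 5, "frequency": 9}
print(f"Overall: {score_item(my_ratings):.2f} out of {HIGHEST}")
```

A plain average treats every criterion as equally important. If interest matters more to you than length, that is one more assumption to make explicit.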

We are four months into 2013 and I keep asking the question “Is this blog making a difference?” I’ve asked for an analytic report to give me some answers. I’ve asked you readers for your stories.

Let’s hear it for SEOs and how they pick up that title–I credit that with the number of comments I’ve gotten. I AM surprised at the number of comments I have gotten since January (hundreds, literally). Most say things like, “of course it is making a difference.” Some compliment me on my writing style. Some are in a foreign language which I cannot read (I am illiterate when it comes to Cyrillic, Arabic, Greek, Chinese, and other non-English alphabets). Some are marketing–wanting ping backs to their recently started blogs for some product. Some have commented specifically on the content (sample size and confidence intervals); some have commented on the time of year (vernal equinox). Occasionally, I get a comment like the comment below and I keep writing.

The questions of all questions… Do I make a difference? I like how you write and let me answer your question. Personally I was supposed to be dead ages ago because someone tried to kill me for the h… of it … Since then (I barely survived) I have asked myself the same question several times and every single time I answer with YES. Why? Because I noticed that whatever you do, there is always someone using what you say or do to improve their own life. So, I can answer the question for you: Do you make a difference? Yes, you do, because there will always be someone who uses your writings to do something positive with it. So, I hope I just made your day! 🙂 And needless to say, keep the blog posts coming!

Enough update. New topic: I just got a copy of the third edition of Miles and Huberman (my go-to reference for qualitative data analysis). Wait you say–Miles and Huberman are dead–yes, they are. Johnny Saldana (there needs to be a ~ above the “n” in his name, only I don’t know how to do that with this keyboard) was approached by Sage to be the third author and revise and update the book. A good thing, I think. Miles and Huberman’s second edition was published in 1994. That is almost 20 years. I’m eager to see if it will hold as a classic given that there are many other books on qualitative coding in press currently. (The spring research flyer from Guilford lists several on qualitative inquiry and analysis from some established authors.)

I also recently sat in on a research presentation of a candidate for a tenure track position here at OSU who talked about how the analysis of qualitative data was accomplished. Took me back to when I was learning–index cards and sticky notes. Yes, there are marvelous software programs out there (NVivo, Ethnograph, NUD*IST); I will support the argument that the best way to learn about your qualitative data is to immerse yourself in it with color-coded index cards and sticky notes. Then you can use the software to check your results. Keep in mind, though, that you are the PI and you will bring many biases to the analysis of your data.

A rubric is a way to make criteria (or standards) explicit and it does that in writing so that there can be no misunderstanding. It is found in many evaluative activities especially assessment of classroom work. (Misunderstanding is still possible because the English language is often not clear–something I won’t get into today; suffice it to say that a wise woman said words are important–keep that in mind when crafting a rubric.)

This week there were many events that required rubrics. Rubrics may have been implicit; they certainly were not explicit. Explicit rubrics were needed.

I’ll start with apologies for the political nature of today’s post.

Yesterday’s activity of the US Senate is an example where a rubric would be valuable. Gabby Giffords said it best:

Certainly, an implicit rubric for this event can be found in this statement:

Only it was not used. When there are clear examples of inappropriate behavior, behavior that my daughters’ kindergarten teacher said was mean and not nice, a rubric exists. Simple rubrics are understood by five-year-olds (was that behavior mean OR was that behavior nice?). Obviously 46 senators could only hear the NRA; they didn’t hear that the behavior (school shootings) was mean.

Boston provided us with another example of the mean vs. nice rubric. Bernstein got the concept of mean vs. nice.

Music is nice; violence is mean.

Helpers are nice; bullying is mean.

There were lots of rubrics, however implicit, for that event. The NY Times reported that helpers (my word) ran TOWARD those in need not away from the site of the explosion (violence). There were many helpers. A rubric existed, however implicit.

I want to close with another example of a rubric:

I’m no longer worked up–just determined and for that I need a rubric. This image may not give me the answer; it does however give me pause.

For more information on assessment and rubrics see: Walvoord, B. E. (2004). Assessment clear and simple. San Francisco: Jossey-Bass.

In a conversation with a colleague on the need for IRB when what was being conducted was evaluation not research, I was struck by two things:

1. I needed to discuss the protections provided by IRB (the next timely topic??) and
2. the difference between evaluation and research needed to be made clear.

Leaving number 1 for another time, number 2 is the topic of the day.

A while back, AEA 365 did a post on the difference between evaluation and research (some of which is included below) from a graduate student’s perspective. Perhaps providing other resources would be valuable.

To have evaluation grouped with research is at worst a travesty; at best unfair. Yes, evaluation uses research tools and techniques. Yes, evaluation contributes to a larger body of knowledge (and in that sense seeks truth, albeit contextual). Yes, evaluation needs to have institutional review board documentation. So in many cases, people could be justified in saying evaluation and research are the same.

NOT.

Carol Weiss (1927-2013, she died in January) has written extensively on this difference and makes the distinction clearly. Weiss’s first edition of Evaluation Research was published in 1972. She revised this volume in 1998 and issued it under the title of Evaluation. (Both have subtitles.)

She says that evaluation applies social science research methods and makes the case that it is the intent of the study which makes the difference between evaluation and research. She lists the following differences (pp 15 – 17, 2nd ed.):

Utility;
Program-driven questions;
Judgmental quality;
Action setting;
Role Conflicts;
Publication; and
Allegiance.

(For those of you who are still skeptical, she also lists similarities.) Understanding and knowing the difference between evaluation and research matters. I recommend her books.

Gisele Tchamba who wrote the AEA365 post says the following:

Know the difference. I came to realize that practicing evaluation does not preclude doing pure research. On the contrary, the methods are interconnected but the aim is different (I think this mirrors Weiss’s concept of intent).
The burden of explaining. Many people in academia vaguely know the meaning of evaluation. Those who think they do mistake evaluation for assessment in education. Whenever I meet with people whose understanding of evaluation is limited to educational assessment, I use Scriven’s definition and emphasis words like “value, merit, and worth”.
Distinguishing between evaluation and social science research. Theoretical and practical experiences are helpful ways to distinguish between the two disciplines. Extensive reading of evaluation literature helps to see the difference.

She also cites a Trochim definition that is worth keeping in mind as it captures the various unique qualities of evaluation. Carol Weiss mentioned them all in her list (above):

“Evaluation is a profession that uses formal methodologies to provide useful empirical evidence about public entities (such as programs, products, performance) in decision making contexts that are inherently political and involve multiple often conflicting stakeholders, where resources are seldom sufficient, and where time-pressures are salient”.

Resources:

§ Sandra Mathison’s chapter – What is the difference between evaluation and research and why do we care?
§ Trochim’s website: Social Research Methods
§ Harvard Family Research Project website: Evaluation Exchange
§ Weiss, C. H. (1972). Evaluation research: Methods of assessing program effectiveness. Englewood Cliffs, NJ: Prentice-Hall.
§ Weiss, C. H. (1998). Evaluation: Methods for studying programs and policies (2nd ed.). Upper Saddle River, NJ: Prentice-Hall.

What have you listed as your goal(s) for 2013?

How is that goal related to evaluation?

One study suggests that you’re 10 times more likely to alter a behavior successfully (i.e., get rid of a “bad” behavior; adopt a “good” behavior) if you make a resolution than if you don’t. That statement is evaluative; a good place to start. 10 times! Wow. Yet, even that isn’t a guarantee you will be successful.

How can you increase the likelihood that you will be successful?

1. Set specific goals. Break the big goal into small steps; tie those small steps to a time line. You want to read how many pages by when? Write it down. Keep track.
2. Make it public. Just like other intentions, if you tell someone, there is an increased likelihood you will complete them. I put it in my quarterly reports to my supervisors.
3. Substitute “good” for “less than desirable”. I know how hard it is to write (for example). I have in the past, and will again this year, schedule and protect a specified time to write those three articles that are sitting partly complete. I’ve substituted “10:00 on Wednesdays and Fridays” for the vague “when I have a block of time I’ll get it done”. The block of time never materializes.
4. Keep track of progress. I mentioned it in number 1; I’ll say it again: Keep track; make a chart. I’m going to get those manuscripts done by X date…my chart will reflect that.

So are you going to

Read something new to you (even if it is not new)?
Write that manuscript from that presentation you made?
Finish that manuscript you have started AND submit it for publication?
Register for and watch a webinar on a topic you know little about?
Explore a topic you find interesting?
Something else?

Let me hear from you as to your resolutions; I’ll periodically give you an update.

And be grateful for the opportunity…gratitude is a powerful way to reinforce you and your goal setting.

“What do I know that they don’t know?
What do they know that I don’t know?
What do all of us need to know that few of us knows?”

These three questions have buzzed around my head for a while in various formats.

When I attend a conference, I wonder.

When I conduct a program, I wonder, again.

When I explore something new, I am reminded that perhaps someone else has been here and wonder, yet again.

Thinking about these questions, I had these ideas:

I see the first statement relating to capacity building;
The second statement relating to engagement; and
The third statement (relating to statements one and two) relating to cultural competence.

After all, aren’t both of these statements (capacity building and engagement) relating to a “foreign country” and a different culture?

How does all this relate to evaluation? Read on…

Premise: Evaluation is an everyday activity. You evaluate every day; all the time; you call it making decisions. Every time you make a decision, you are building capacity in your ability to evaluate. Sure, some of those decisions may need to be revised. Sure, some of those decisions may just yield “negative” results. Even so, you are building capacity. AND you share that knowledge–with your children (if you have them), with your friends, with your colleagues, with the random shopper in the (grocery) store. That is building capacity. Building capacity can be systematic, organized, sequential. Sometimes formal, scheduled, deliberate. It is sharing “What do I know that they don’t know?” (in the hope that they too will know it and use it).

Premise: Everyone knows something. In knowing something, evaluation happens–because people made decisions about what is important and what is not. To really engage (not just outreach, which much of Extension does), one needs to “do as” the group that is being engaged. To do anything else (“doing to” or “doing with”) is simply outreach and little or no knowledge is exchanged. Doesn’t mean that knowledge isn’t distributed; Extension has been doing that for years. Just means that the assumption (and you know what assumptions do) is that only the expert can distribute knowledge. Who is to say that the group (target audience, participants) aren’t expert in at least part of what is being communicated? Probably are. It is the idea that … they know something that I don’t know (and I would benefit from knowing).

Premise: Everything, everyone is connected. Being prepared is the best way to learn something. Being prepared by understanding culture (I’m not talking only about the intersection of race and gender; I’m talking about all the stereotypes you carry with you all the time) reinforces connections. Learning about other cultures (something everyone can do) helps dispel stereotypes and mitigate stereotype threats. And that is an evaluative task. Think about it. I think it captures the “What do all of us need to know that few of us knows?” question.

Program Evaluation Discussions

Tag Archives: value

Accountability

Response rates and non-responders

Independence or Interdependent?

Causation?

More on rubrics and evaluation

Free Technology for Teachers

MindShift

Joanne Jacobs

Teaching Tolerance

Brian McCall’s Economics of Education Blog

April update on making a difference

What is a rubric?

Evaluation or research? Is the distinction important?

New Year’s Resolutions

Capacity building, engagement, cultural competence