summative « Evaluation is an Everyday Activity

Historically, April 15 is tax day (although in 2011 it is April 18), the day taxes are due to the revenue departments.

State legislatures are dealing with budgets and Congress is trying to balance a Federal budget.

Everywhere one looks, money is the issue–this is especially true in these recession-ridden times. How does all this relate to evaluation, you ask? This is the topic for today’s blog: how does money figure into evaluation?

Let’s start with the simple and move to the complex. Everything costs–and although I’m talking about money, time, personnel, and resources (like paper, staples, electricity, etc.) must also be taken into consideration.

When we talk about evaluation, four terms typically come to mind: efficacy, effectiveness, efficiency, and fidelity.

Efficiency is the term that addresses money or costs. Was the program efficient in its use of resources? That is the question to ask when addressing efficiency.

At least three approaches are used to answer that question:

Cost or cost analysis;
Cost effectiveness analysis; and
Cost-benefit analysis.

Simply then:

Cost analysis is the number of dollars it takes to deliver the program, including salary of the individual(s) planning the program.
Cost effectiveness analysis is a computation of the target outcomes in an appropriate unit in ratio to the costs.
Cost-benefit analysis is also a ratio: the costs of the program to the benefits of the program, with both measured in the same units, usually money.

How are these computed?

Cost can be measured by how much the consumer is willing to pay. Costs can be the value of each resource that is consumed in the implementation of the program. Or cost analysis can be “measuring costs so they can be related to procedures and outcomes” (Yates, 1996, p. 25). So you list the money spent to implement the program, including salaries, and that is a cost analysis. Simple.
Cost effectiveness analysis says that there is some metric in which the outcomes are measured (number of times hands are washed during the day, for example) and that the total costs of the program are put in ratio to it. So movement from washing hands only once a day (a bare minimum) to washing hands at least six times a day would have the costs of the program (including salaries) divided by the change in the number of times hands are washed a day (i.e., 5). The resulting value is the cost-effectiveness ratio. Complex.
Cost-benefit analysis puts the outcomes in the same metric as the costs–in this case dollars. The costs (in dollars) of the program (including salaries) are put in ratio to the outcomes (usually benefits) measured in dollars. The challenge here is assigning a dollar amount to the outcomes. How much is frequent hand washing worth? It is often measured in days saved from communicable, chronic, or acute illnesses. Computations of health days (reduction in days affected by chronic illness) are often difficult to value in dollars. There is a whole body of literature in health economics for this topic, if you’re interested. Complicated and complex.
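To make the three computations concrete, here is a minimal sketch using the hand-washing example. Every dollar figure, and the dollar value placed on the benefits, is a made-up assumption for illustration; the function names are mine, not from Yates or any library.

```python
# Hypothetical program costs in dollars (illustrative assumptions only).
costs = {"salaries": 4000.0, "materials": 750.0, "facility": 250.0}


def total_cost(cost_items):
    """Cost analysis: add up the money spent to deliver the program."""
    return sum(cost_items.values())


def cost_effectiveness(total, outcome_before, outcome_after):
    """Cost effectiveness: total cost divided by the change in the outcome."""
    change = outcome_after - outcome_before
    if change == 0:
        raise ValueError("No change in the outcome; the ratio is undefined.")
    return total / change


def cost_benefit_ratio(total, benefits_in_dollars):
    """Cost-benefit: total cost in ratio to benefits valued in dollars."""
    if benefits_in_dollars == 0:
        raise ValueError("Benefits are zero; the ratio is undefined.")
    return total / benefits_in_dollars


total = total_cost(costs)

# Hand washing moves from once a day to six times a day (a change of 5).
ce = cost_effectiveness(total, outcome_before=1, outcome_after=6)

# Assumed dollar value of the illness days avoided (hypothetical figure).
cb = cost_benefit_ratio(total, benefits_in_dollars=10000.0)

print(f"Total cost: ${total:,.2f}")
print(f"Cost per additional daily hand wash: ${ce:,.2f}")
print(f"Cost-benefit ratio: {cb:.2f}")
```

With these assumed numbers, a ratio below 1 would mean the valued benefits exceed the costs. The hard part, as noted above, is not the division but defending the dollar value assigned to the benefits.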

Yates, B. T. (1996). Analyzing costs, procedures, processes, and outcomes in human services. Thousand Oaks, CA: Sage.

Although I have been learning about and doing evaluation for a long time, this week I’ve been searching for a topic to talk about. A student recently asked me about the politics of evaluation–there is a lot that can be said on that topic, which I will save for another day. Another student asked me about when to do an impact study and how to bound that study. Certainly a good topic, too, though one that can wait for another post. Something I read in another blog got me thinking about today’s post. So, today I want to talk about gathering demographics.

Last week, I mentioned the AEA Guiding Principles in my TIMELY TOPIC post. Those Principles, along with the Program Evaluation Standards, make significant contributions in assisting evaluators in making ethical decisions. Evaluators make ethical decisions with every evaluation. They are guided by these professional standards of conduct. There are five Guiding Principles and five Evaluation Standards. And although these are not prescriptive, they go a long way toward ensuring ethical evaluations. That is a long introduction into gathering demographics.

The guiding principle Integrity/Honesty states that “Evaluators display honesty and integrity in their own behavior, and attempt to ensure the honesty and integrity of the entire evaluation process.” When we look at the entire evaluation process, as evaluators, we must strive constantly to maintain both personal and professional integrity in our decision making. One decision we must make involves deciding what we need/want to know about our respondents. As I’ve mentioned before, knowing what your sample looks like is important to reviewers, readers, and other stakeholders. Yet, if we gather these data in a manner that is intrusive, are we being ethical?

Joe Heimlich, in a recent AEA365 post, says that asking demographic questions “…all carry with them ethical questions about use, need, confidentiality…” He goes on to say that there are “…two major conditions shaping the decision to include – or to omit intentionally – questions on sexual or gender identity…”:

When such data would further our understanding of the effect or the impact of a program, treatment, or event.
When asking for such data would benefit the individual and/or their engagement in the evaluation process.

The first point relates to gender role issues–for example are gay men more like or more different from other gender categories? And what gender categories did you include in your survey? The second point relates to allowing an individual’s voice to be heard clearly and completely and have categories on our forms reflect their full participation in the evaluation. For example, does marital status ask for domestic partnerships as well as traditional categories and are all those traditional categories necessary to hear your participants?

The next time you develop a questionnaire that includes demographic questions, take a second look at the wording–in an ethical manner.

Three weeks ago, I promised you a series of posts on related topics–Program planning, Evaluation implementation, monitoring and delivering, and Evaluation utilization. This is the third one–using the findings of evaluation.

Michael Patton’s book is my reference.

I’ll try to condense the 400+ page book down to 500+ words for today’s post. Fortunately, I have the Reader’s Digest version as well (look for Chapter 23 [Utilization-Focused Evaluation] in the following citation: Stufflebeam, D. L., Madaus, G. F., & Kellaghan, T. (2000). Evaluation Models: Viewpoints on educational and human services evaluation, 2nd ed. Boston, MA: Kluwer Academic Publishers). Patton’s chapter is a good summary–still, it is 14 pages.

To start, it is important to understand exactly how the word “evaluation” is used in the context of utilization. In the Stufflebeam, Madaus, & Kellaghan publication cited above, Patton (2000, p. 426) describes evaluation as, “the systematic collection of information about the activities, characteristics, and outcomes of programs to make judgments about the program, improve program effectiveness and/or inform decisions about future programming. Utilization-focused evaluation (as opposed to program evaluation in general) is evaluation done for and with specific intended primary users for specific, intended uses (emphasis added).”

There are four different types of use–instrumental, conceptual, persuasive, and process. The interests of potential stakeholders cannot be served well unless the stakeholder(s) whose interests are being served are made explicit.

To understand the types of use, I will quote from a document titled, “Non-formal Educator Use of Evaluation Findings: Factors of Influence” by Sarah Baughman.

“Instrumental use occurs when decision makers use the findings to change or modify the program in some way (Fleischer & Christie, 2009; McCormick, 1997; Shulha & Cousins, 1997). The information gathered is used in a direct, concrete way or applied to a specific decision (McCormick, 1997).

Conceptual use occurs when the evaluation findings help the program staff or key stakeholders understand the program in a new way (Fleischer & Christie, 2009).

Persuasive use has also been called political use and is not always viewed as a positive type of use (McCormick, 1997). Examples of negative persuasive use include using evaluation results to justify or legitimize a decision that is already made or to prove to stakeholders or other administrative decision makers that the organization values accountability (Fleischer & Christie, 2009). It is sometimes considered a political use of findings with no intention to take the actual findings or the evaluation process seriously (Patton, 2008). Recently persuasive use has not been viewed as negatively as it once was.

Process use is “the cognitive, behavioral, program, and organizational changes resulting, either directly or indirectly, from engagement in the evaluation process and learning to think evaluatively” (Patton, 2008, p. 109). Process use results not from the evaluation findings but from the evaluation activities or process.”

Before beginning the evaluation, the question, “Who is the primary intended user of the evaluation?” must not only be asked; it also must be answered. What stakeholders need to be at the table? Those are the people who have a stake in the evaluation findings, and those stakeholders may be different for each evaluation. They are probably the primary intended users who will determine the evaluation’s use.

Citations mentioned in the Baughman quotation include:

Fleischer, D. N. & Christie, C. A. (2009). Evaluation use: Results from a survey of U.S. American Evaluation Association members. American Journal of Evaluation, 30(2), 158-175.
McCormick, E. R. (1997). Factors influencing the use of evaluation results. Dissertation Abstracts International: Section A: The Humanities and Social Sciences, 58, 4187 (UMI 9815051).
Shulha, L. M. & Cousins, J. B. (1997). Evaluation use: Theory, research and practice since 1986. Evaluation Practice, 18, 195-208.
Patton, M. Q. (2008). Utilization Focused Evaluation (4th ed.). Thousand Oaks: Sage Publications.

I’ve been reminded recently about Kirkpatrick’s evaluation model.

Donald L. Kirkpatrick (1959) developed a four-level model used primarily for evaluating training. This model is still used extensively in the training field and is espoused by ASTD, the American Society for Training and Development.

It also occurred to me that Extension conducts a lot of training from pesticide handling to logic model use and that Kirkpatrick’s model is one that isn’t talked about a lot in Extension–at least I don’t use it as a reference. And that may not be a good thing, given that Extension professionals are conducting training a lot of the time.

Kirkpatrick’s four levels are these:

Reaction: To what degree participants react favorably to the training
Learning: To what degree participants acquire intended knowledge, skills, and attitudes based on their participation in the learning event
Application: To what degree participants apply what they learned during training on the job
Impact: To what degree targeted outcomes occur, as a result of the learning event(s) and subsequent reinforcement

Sometimes it is important to know what affective reaction our participants are having during and at the end of the training. I would call this a formative evaluation, and formative evaluation is often used for program improvement. Reactions are a way that participants can tell the Extension professional how things are going–i.e., what their reaction is–using a continuous feedback mechanism. Extension professionals can use this to change the program, revise their approach, adjust the pace, etc. The feedback mechanism doesn’t have to be constant, which is often the interpretation of “continuous”. Soliciting feedback at natural breaks, using a show of hands, is often enough for on-the-spot adjustments. It is a form of formative evaluation as it is an “in-process” evaluation. Kirkpatrick’s level one (reaction) doesn’t provide a measure of outcomes or impacts. I might call it a “happiness” evaluation or a satisfaction evaluation; it tells me only what the participants’ reaction is. Outcome evaluation, which determines a measure of effectiveness, happens in a later level and is another approach to evaluation, which I would call summative. Michael Patton, though, might call it developmental in a training situation where the outcome is always moving, changing, developing.

Kirkpatrick, D. L. (1959) Evaluating Training Programs, 2nd ed., Berrett Koehler, San Francisco.

Kirkpatrick, D. L. (comp.) (1998) Another Look at Evaluating Training Programs, ASTD, Alexandria, USA.

For more information about the Kirkpatrick model, see their site, Kirkpatrick Partners.

Last Wednesday, I had the privilege to attend the OPEN (Oregon Program Evaluators Network) annual meeting.

Michael Quinn Patton, the keynote speaker, talked about developmental evaluation and utilization focused evaluation. Utilization Focused Evaluation makes sense–use by intended users.

Developmental Evaluation, on the other hand, needs some discussion.

The way Michael tells the story (he teaches a lot through story) is this:

“I had a standard 5-year contract with a community leadership program that specified 2 1/2 years of formative evaluation for program improvement to be followed by 2 1/2 years of summative evaluation that would lead to an overall decision about whether the program was effective.” After 2 1/2 years, Michael called for the summative evaluation to begin. The director was adamant: “We can’t stand still for 2 years. Let’s keep doing formative evaluation. We want to keep improving the program… (I) Never (want to do a summative evaluation)…if it means standardizing the program. We want to keep developing and changing.” He looked at Michael sternly, challengingly. “Formative evaluation! Summative evaluation! Is that all you evaluators have to offer?” Michael hemmed and hawed and said, “I suppose we could do…ummm…we could do…ummm…well, we might do, you know…we could try developmental evaluation!” Not knowing what that was, the director asked, “What’s that?” Michael responded, “It’s where you, ummm, keep developing.” Developmental evaluation was born.

The evaluation field offered, until now, two global approaches to evaluation: formative for program improvement and summative to make an overall judgment of merit and worth. Now, developmental evaluation (DE) offers another approach, one which is relevant to social innovators looking to bring about major social change. It takes into consideration systems theory, complexity concepts, uncertainty principles, nonlinearity, and emergence. DE acknowledges that resistance and push back are likely when change happens. Developmental evaluation recognizes that change brings turbulence and offers an approach that “adapts to the realities of complex nonlinear dynamics rather than trying to impose order and certainty on a disorderly and uncertain world” (Patton, 2011). Social innovators recognize that outcomes will emerge as the program moves forward and that predefining outcomes limits the vision.

Michael has used the art of Mark M. Rogers to illustrate the point. The cartoon has two early humans, one with what I would call a wheel, albeit primitive, who is saying, “No go. The evaluation committee said it doesn’t meet utility specs. They want something linear, stable, controllable, and targeted to reach a pre-set destination. They couldn’t see any use for this (the wheel).”

For Extension professionals who are delivering programs designed to lead to a specific change, DE may not be useful. For those Extension professionals who envision something different, DE may be the answer. I think DE is worth a look.

Look for my next post after October 14; I’ll be out of the office until then.

Patton, M. Q. (2011) Developmental Evaluation. NY: Guilford Press.

Last week, I talked about formative and summative evaluations. Formative and summative evaluation roles can help you prioritize what evaluations you do when. I was then reminded of another approach to viewing evaluation that relates to prioritizing evaluations that might also be useful.

When I first started in this work, I realized that I could view evaluation in three parts–process, progress, product. Each part could be conducted or the total approach could be used. This approach provides insights to different aspects of a program. It can also provide a picture of the whole program. Deciding on which part to focus is another way to prioritize an evaluation.

Process evaluation captures the HOW of a program. Process evaluation has been defined as the evaluation that assesses the delivery of the program (Scheirer, 1994). Process evaluation identifies what the program is and if it is delivered as intended both to the “right audience” and in the “right amount”. The following questions (according to Scheirer) can guide a process evaluation:

Why is the program expected to produce its results?
For what types of people may it be effective?
In what circumstances may it be effective?
What are the day-to-day aspects of program delivery?

Progress evaluation captures the FIDELITY of a program–that is, did the program do what the planners said would be done in the time allotted? Progress evaluation has been very useful when I have grant activities and need to be accountable for the time-line.

Product evaluation captures a measure of the program’s products or OUTCOMES. Sometimes outputs are also captured and this is fine. Just keep in mind that outputs may be (and often are) necessary; they are not sufficient for demonstrating the impact of the program. A product evaluation is often summative. However, it can also be formative, especially if the program planners want to gather information to improve the program rather than to determine the ultimate effectiveness of the program.

This framework may be useful in helping Extension professionals decide what to evaluate and when. It may help determine which program needs a process, progress, or product evaluation. Trying to evaluate all your programs at once often defeats being purposeful in your evaluation efforts and often leads to results that are confusing, invalid, and/or useless. It makes sense to choose carefully what evaluation to do when–that is, prioritize.

A question was raised in a meeting this week about evaluation priorities and how to determine them. This reminded me that perhaps a discussion of formative and summative was needed as knowing about these roles of evaluation will help you answer your questions about priorities.

Michael Scriven coined the terms formative and summative evaluation in the late 1960s. Applying these terms to the role evaluation plays in a program has been and continues to be a useful distinction for investigators. Simply put, formative evaluation provides information for program improvement. Summative evaluation provides information to assist decision makers in making judgments about a program, typically for adoption, continuation, or expansion. Both are important.

When Extension professionals evaluate a program at the end of a training or other program, typically, they are gathering information for program improvement. The data gathered after a program are for use by the program designers to help improve it. Sometimes, Extension professionals gather outcome data at the end of a training or other program. Here, information is gathered to help determine the effectiveness of the program. These data are typically short term outcome data, and although they are impact data of a sort, they do not reflect the long term effectiveness of a program. These data gathered to determine outcomes are summative. In many cases, formative and summative data are gathered at the same time.

Summative data are also gathered to reflect the intermediate and long term outcomes. As Ellen Taylor-Powell points out when she talks about logic models, impacts are the social, economic, civic, and/or environmental consequences of a program and tend to be longer term. I find calling these outcomes condition changes helps me keep in mind that they are the consequences or impacts of a program and are gathered using a summative form of evaluation.

So how do you know which to use when? Ask yourself the following questions:

What is the purpose of the evaluation? Do you want to know if the program works or if the participants were satisfied?
What are the circumstances surrounding the program? Is the program in its early development or late development? Are the politics surrounding the program challenging?
What resources are available for the evaluation? Do you have a lot of time or only a few weeks? Do you have access to people to help you or are you on your own?
What accountability is required? Do you have to report about the effectiveness of a program or do you just have to offer it?
What knowledge generation is expected or desired? Do you need to generate scholarship or support for promotion and tenure?

Think of these questions as a decision tree: the answers will help you prioritize your evaluation. They will help you decide if you are going to conduct a formative evaluation, a summative evaluation, or include components of both in your evaluation.


TIMELY TOPIC: IS IT ALL ABOUT MONEY?

Demographics

Evaluation Use

Kirkpatrick’s evaluation model

Summative, formative, developmental

Process, progress, product?

Formative–Summative
