We recently held Professional Development Days for the Division of Outreach and Engagement.  This is an annual opportunity for faculty and staff in the Division to build capacity in a variety of topics.  The question this training posed was evaluative:

How do we provide meaningful feedback?

Evaluating a conference or a multi-day, multi-session training is no easy task.  Gathering meaningful data is a challenge.  What can you do?  Before you hold the conference (I’m using the word conference to mean any multi-day, multi-session training), decide on the following:

  • Are you going to evaluate the conference?
  • What is the focus of the evaluation?
  • How are you going to use the results?

The answer to the first question is easy: YES. If the conference is an annual (or otherwise regular) event, you will want participants’ feedback on their experience, so, yes, you will evaluate the conference. Look at Penn State Tip Sheet 16 for some suggestions. (If this is a one-time event, you may not; though as an evaluator, I wouldn’t recommend ignoring evaluation.)

The second question is more critical. I’ve mentioned in previous blogs the need to prioritize your evaluation. Evaluating a conference can be all-consuming and result in useless data UNLESS the evaluation is FOCUSED. Sit down with the planners and ask them what they expect to happen as a result of the conference. Ask them if there is one particular aspect of the conference that is new this year. Ask them if feedback in previous years has given them any ideas about what is important to evaluate this year.

This year, the planners wanted to provide specific feedback to the instructors, who had asked for it in previous years. That is problematic if evaluative activities for individual sessions are not planned before the conference. Nancy Ellen Kiernan, a colleague at Penn State, suggests a qualitative approach called a Listening Post. This approach elicits feedback from participants at the time of the conference. It relies on volunteers who attended the sessions and may require more people than a survey. To use a Listening Post, you must plan ahead of time to gather these data. Otherwise, you will need to do a survey after the conference is over, and that raises other problems.

The third question is also very important. If the results are just given to the supervisor, the likelihood of their being used by individuals for session improvement or by organizers for overall change is slim. Making the data usable for instructors means summarizing the data in a meaningful way, often visually. There are several ways to visually present survey data, including graphs, tables, or charts. More on that another time. Words often get lost, especially if words dominate the report.
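
In the meantime, here is a minimal sketch of the kind of visual summary I mean, assuming Python with pandas and matplotlib is available; the session names and ratings below are hypothetical placeholders, not data from any actual conference.

```python
# A minimal sketch (not a prescription): mean rating per session as a bar chart.
# The "session" and "rating" values are hypothetical placeholders.
import pandas as pd
import matplotlib.pyplot as plt

responses = pd.DataFrame({
    "session": ["A", "A", "B", "B", "C", "C"],
    "rating":  [4, 5, 3, 2, 5, 4],   # e.g., 1-5 satisfaction ratings
})

# One bar per session is easier for an instructor to scan than a page of raw numbers.
means = responses.groupby("session")["rating"].mean()
ax = means.plot(kind="bar", title="Mean rating by session")
ax.set_ylabel("Mean rating (1-5)")
plt.tight_layout()
plt.show()
```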

There is a lot of information in the training and development literature that might also be helpful. Kirkpatrick has done a lot of work in this area; I’ve mentioned that work in previous blogs.

There is no one best way to gather feedback from conference participants.  My advice:  KISS–keep it simple and straightforward.

I’ve talked about how each phase of a logic model has evaluative activities. I’ve probably even alluded to the fact that needs assessment is the evaluative activity for the phase called situation (see the turquoise area on the left end of the image below).

What I haven’t done is talk about the why, what, and how of needs assessment (NA). I also haven’t talked about the utilization of the findings of a needs assessment–what makes the needs assessment meaningful.

OK.  So why is a NA conducted?  And what is a NA?

Jim Altschuld is my go-to person when it comes to questions about needs assessment.  He recently edited a series of books on the topic.

Although Jim is my go-to person, Belle Ruth Witkin (a colleague, friend, and collaborator of Jim Altschuld) says in the preface to their co-authored volume (Witkin & Altschuld, 1995–see below) that the most effective way to decide how to divide (often scarce) resources among competing demands (read: programs) is to conduct a needs assessment when planning for the use of those resources begins.

Book 1 of the kit provides an overview. In that volume, Jim defines needs assessment: “Needs assessment is the process of identifying needs, prioritizing them, making needs-based decisions, allocating resources, and implementing actions in organizations to resolve problems underlying important needs” (p. 20). Altschuld states that there are many models for assessing needs and provides citations for those models. I think the most important aspect of this first volume is the presentation of the phased model developed by Belle Ruth Witkin in 1984 and revised by Altschuld and Witkin in their 1995 and 2000 volumes. Those phases are preassessment, assessment, and postassessment. They divide those three phases into three levels–primary, secondary, and tertiary–each level targeting a different group of stakeholders. This volume also discusses the why and the how. Subsequent volumes go into more detail: volume 2 discusses phase I (getting started); volume 3 discusses phase II (collecting data); volume 4 discusses analysis and priorities; and volume 5 discusses phase III (taking action).

Laurie Stevahn and Jean A. King are the authors of that final volume (taking action). In chapter 3, they discuss strategies for the action plan, using facilitation procedures that promote positive relationships, develop shared understanding, prioritize decisions, and assess progress. They warn of interpersonal conflict and caution against roadblocks that impede change efforts. They also promote developing evaluation activities at the onset of the NA because that helps ensure the use of the findings.

Needs assessment is a political experience. Someone (or some group) will feel disenfranchised, lose resources, or have programs ended. These activities create hard feelings and resentments. These considerations need to be identified and discussed at the beginning of the process. It is like the elephant and the blind people–everyone has an image of what the creature is, and there may or may not be consensus; yet for the NA to be successful, consensus is important. Without it, the data will sit on someone’s shelf or in someone’s computer. Not useful.

…that there is a difference between a Likert item and a Likert scale?**

Did you know that a Likert item was developed by Rensis Likert, a psychometrician and an educator? 

And that the item was developed to have the individual respond to the level of agreement or disagreement with a specific phenomenon?

And did you know that most of the studies on Likert items use five or seven points on the item? (Although sometimes a four- or six-point scale is used; that is called a forced-choice approach–because you really want an opinion, not a middle ground, also called a neutral ground.)

And that the choices on an odd-numbered item usually include some variation on the following theme: “Strongly disagree”, “Disagree”, “Neither agree nor disagree”, “Agree”, “Strongly agree”?

And if you did, why do you still write scales, call them Likert, and ask for information using a scale that goes from “Not at all” to “A little extent” to “Some extent” to “Great extent”? Those responses are not even remotely equidistant from each other (that is, they do not have equal intervals between response options)–a key property of a Likert item.

And why aren’t you using a visual analog scale to get at the degree of whatever phenomenon is being measured, instead of an item for which the points on the scale are NOT equidistant? (For more information on visual analog scales, see a brief description here or Dillman’s book.)

I sure hope Rensis Likert isn’t rolling over in his grave (he died in 1981 at the age of 78).

Extension professionals use surveys as the primary method for data gathering. The choice of a survey is a defensible one. However, the format of the survey, the question content, and the question construction must also be defensible. Even though psychometric properties (including internal consistency, validity, and other statistics) may have been computed, if the basic underlying assumptions are violated, no psychometric properties will compensate for a poorly designed instrument–an instrument that is not defensible.

All Extension professionals who choose to use surveys to evaluate their target audiences need to have scale development as a personal competency. So take it upon yourself to learn the guidelines for scale development (yes, there are books written on the subject!).

<><><><><>

**A Likert scale is the SUM of responses on several Likert items. A Likert item is just one 4-, 5-, 6-, or 7-point single statement asking for an opinion.
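
A minimal sketch of that distinction, with hypothetical item names and responses: several Likert items, each coded 1–5, summed into a single Likert scale score.

```python
# One respondent's answers to four Likert ITEMS, coded
# 1 = Strongly disagree ... 5 = Strongly agree (hypothetical data).
items = {"item_1": 4, "item_2": 5, "item_3": 3, "item_4": 4}

# The Likert SCALE score is the sum of the item responses.
scale_score = sum(items.values())
print(scale_score)   # 16, out of a possible range of 4-20
```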

Reference: DeVellis, R. F. (1991). Scale development: Theory and applications. Newbury Park, CA: Sage Publications. Note: there is a newer edition.

Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009). Internet, mail, and mixed-mode surveys: The tailored design method. (3rd ed.). Hoboken, NJ: John Wiley & Sons, Inc.

Hi everyone–it is the third week in April and time for a TIMELY TOPIC!  (I was out of town last week.)

Recently, I was asked: Why should I plan my evaluation strategy in the program planning stage? Isn’t it good enough to just ask participants if they are satisfied with the program?

Good question.  This is the usual scenario:  You have something to say to your community.  The topic has research support and is timely.  You think it would make a really good new program (or a revision of a current program).  So you plan the program. 

Do you plan the evaluation at the same time? The keyed response is YES.  The usual response is something like, “Are you kidding?”  No, not kidding.  When you plan your program is the time to plan your evaluation.

Unfortunately, my experience is that many (most) faculty, when planning or revising a program, fail to think about evaluating that program at the planning stage. Yet it is at the planning stage that you can clearly and effectively identify what you think will happen and what will indicate that your program has made a difference. Remember, the evaluative question isn’t, “Did the participants like the program?” The evaluative question is, “What difference did my program make in the lives of the participants–and, if possible, in the economic, environmental, and social conditions in which they live?” That is the question you need to ask yourself when you plan your program. It also happens to be the evaluative question for the long-term outcomes in a logic model.

If you ask this question before you implement your program, you may find that you cannot gather data to answer it. That allows you to look at what change (or changes) you can measure. Can you measure changes in behavior? That answers the question, “What difference did this program make in the way the participants act in the context presented in the program?” Or perhaps, “What change occurred in what the participants know about the program topic?” These are the evaluative questions for the short- and intermediate-term outcomes in a logic model. (As an aside, there are evaluative questions that can be asked at every stage of a logic model.)

By thinking about and planning for evaluation at the PROGRAM PLANNING STAGE, you avoid an evaluation that gives you data that cannot be used to support your program. A program you can defend with good evaluation data is a program that has staying power. You also avoid having to retrofit your evaluation to your program. Retrofits, though often possible, may miss important data that could only be gathered by thinking of your outcomes ahead of the implementation.

Years ago (back when we beat on hollow logs), evaluations typically asked questions that measured participant satisfaction.  You probably still want to know if participants are satisfied with your program.  Satisfaction questionnaires may be necessary; they are no longer sufficient.  They do not answer the evaluative question, “What difference did this program make?”

Last week, I spoke about how-to questions and applying them to program planning, evaluation design, evaluation implementation, data gathering, data analysis, report writing, and dissemination. I only covered the first four of those topics. This week, I’ll give you my favorite resources for data analysis.

This list is more difficult to assemble. This is typically where the knowledge links break down and interest is lost. The thinking goes something like this: I’ve conducted my program, I’ve implemented the evaluation, now what do I do? I know my program is a good program, so why do I need to do anything else?

YOU  need to understand your findings.  YOU need to be able to look at the data and be able to rigorously defend your program to stakeholders.  Stakeholders need to get the story of your success in short clear messages.  And YOU need to be able to use the findings in ways that will benefit your program in the long run.

Remember the list from last week?  The RESOURCES for EVALUATION list?  The one that says:

1.  Contact your evaluation specialist.

2.  Listen to stakeholders–that means including them in the planning.

3.  Read

Good.  This list still applies, especially the read part.  Here are the readings for data analysis.

First, it is important to know that there are two kinds of data–qualitative (words) and quantitative (numbers).  (As an aside, many folks think words that describe are quantitative data–they are still words even if you give them numbers for coding purposes, so treat them like words, not numbers).

  • Qualitative data analysis. When I needed to learn about what to do with qualitative data, I was given Miles and Huberman’s book.  (Sadly, both authors are deceased so there will not be a forthcoming revision of their 2nd edition, although the book is still available.)

Citation: Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook. (2nd ed.). Thousand Oaks, CA: Sage Publications.

Fortunately, there are newer options, which may be as good.  I will confess, I haven’t read them cover to cover at this point (although they are on my to-be-read pile).

Citation: Saldaña, J. (2009). The coding manual for qualitative researchers. Los Angeles, CA: Sage.

Bernard, H. R. & Ryan, G. W. (2010).  Analyzing qualitative data. Los Angeles, CA: Sage.

If you don’t feel like tackling one of these resources, Ellen Taylor-Powell has written a short piece  (12 pages in PDF format) on qualitative data analysis.

There are software programs for qualitative data analysis that may be helpful (Ethnograph, NUD*IST, others). Most people I know prefer to code manually; even if you use a software program, you will need to do a lot of coding manually first.
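
If it helps to see what treating coded words as words looks like in practice, here is a minimal sketch; the excerpts and theme codes are hypothetical. The point is that codes get tallied and compared as categories, not averaged as numbers.

```python
# Hypothetical coded excerpts: each comment has been assigned a theme code by hand.
from collections import Counter

coded_comments = [
    ("needs more hands-on time", "practice"),
    ("speaker was hard to hear", "logistics"),
    ("loved the small-group work", "practice"),
    ("room was too cold", "logistics"),
]

# Tally the codes: frequencies of themes, not arithmetic on the words themselves.
theme_counts = Counter(code for _, code in coded_comments)
print(theme_counts)   # Counter({'practice': 2, 'logistics': 2})
```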

  • Quantitative data analysis. Quantitative data analysis is just as complicated as qualitative data analysis.  There are numerous statistical books which explain what analyses need to be conducted.  My current favorite is a book by Neil Salkind.

Citation: Salkind, N. J. (2004).  Statistics for people who (think they) hate statistics. (2nd ed. ). Thousand Oaks, CA: Sage Publications.

NOTE:  there is a 4th ed.  with a 2011 copyright available. He also has a version of this text that features Excel 2007.  I like Chapter 20 (The Ten Commandments of Data Collection) a lot.  He doesn’t talk about the methodology, he talks about logistics.  Considering the logistics of data collection is really important.

Also, you need to become familiar with a quantitative data analysis software program–like SPSS, SAS, or even Excel.  One copy goes a long way–you can share the cost and share the program–as long as only one person is using it at a time.  Excel is a program that comes with Microsoft Office.  Each of these has tutorials to help you.
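
As one small illustration of what those programs do, here is a minimal sketch using Python with SciPy (one of many options; SPSS, SAS, and Excel offer the same kind of test through menus). The pre/post scores below are made up for the example.

```python
# A paired comparison of hypothetical pre/post knowledge scores--the kind of
# analysis a statistics text such as Salkind's walks you through.
from scipy import stats

pre  = [3, 4, 2, 5, 3, 4]   # scores before the program (made-up numbers)
post = [4, 5, 4, 5, 4, 5]   # the same participants afterward

t, p = stats.ttest_rel(post, pre)
print(f"t = {t:.2f}, p = {p:.3f}")   # a small p-value suggests a real pre/post change
```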

A part of my position is to build evaluation capacity.  This has many facets–individual, team, institutional.

One way I’ve always seen of building capacity is knowing where to find the answers to the how-to questions. Those how-to questions apply to program planning, evaluation design, evaluation implementation, data gathering, data analysis, report writing, and dissemination. Today I want to give you resources to build your tool box. These resources build capacity only if you use them.

RESOURCES for EVALUATION

1.  Contact your evaluation specialist.

2.  Listen to stakeholders–that means including them in the planning.

3.  Read.

If you don’t know what to read to give you information about a particular part of your evaluation, see resource Number 1 above.  For those of you who do not have the luxury of an evaluation specialist, I’m providing some reading resources below (some of which I’ve mentioned in previous blogs).

1.  For program planning (aka program development):  Ellen Taylor-Powell’s web site at the University of Wisconsin Extension.  Her web site is rich with information about program planning, program development, and logic models.

2.  For evaluation design and implementation:  Jody Fitzpatrick’s book.

Citation:  Fitzpatrick, J. L., Sanders, J. R., & Worthen, B. R. (2004). Program evaluation: Alternative approaches and practical guidelines.  (3rd ed.).  Boston: Pearson Education, Inc.

3.  For evaluation methods: that depends on the method you want to use for data gathering (these resources cover data gathering, not evaluation design).

  • For needs assessment, the books by Altschuld and Witkin (there are two).

(Yes, needs assessment is an evaluation activity).

Citation:  Witkin, B. R. & Altschuld, J. W. (1995).  Planning and conducting needs assessments: A practical guide. Thousand Oaks, CA:  Sage Publications.

Citation:  Altschuld, J. W. & Witkin B. R. (2000).  From needs assessment to action: Transforming needs into solution strategies. Thousand Oaks, CA:  Sage Publications, Inc.

  • For survey design:     Don Dillman’s book.

Citation:  Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009).  Internet, mail, and mixed-mode surveys:  The tailored design method.  (3rd. ed.).  Hoboken, New Jersey: John Wiley & Son, Inc.

  • For focus groups:  Dick Krueger’s book.

Citation:  Krueger, R. A. & Casey, M. A. (2000).  Focus groups:  A practical guide for applied research. (3rd. ed.).  Thousand Oaks, CA: Sage Publications, Inc.

  • For case study:  Robert Yin’s classic OR

Bob Brinkerhoff’s book. 

Citation:  Yin, R. K. (2009). Case study research: Design and methods. (4th ed.). Thousand Oaks, CA: Sage, Inc.

Citation:  Brinkerhoff, R. O. (2003).  The success case method:  Find out quickly what’s working and what’s not. San Francisco:  Berrett-Koehler Publishers, Inc.

  • For multiple case studies:  Bob Stake’s book.

Citation:  Stake, R. E. (2006).  Multiple case study analysis. New York: The Guilford Press.

Since this post is about capacity building, a resource for evaluation capacity building:

Hallie Preskill and Darlene Russ-Eft’s book.

Citation:  Preskill, H. & Russ-Eft, D. (2005).  Building evaluation capacity: 72 activities for teaching and training. Thousand Oaks, CA: Sage Publications.

I’ll cover reading resources for data analysis, report writing, and dissemination another time.

I’ve mentioned language use before.

I’ll talk about it today and probably again.

What the word–any word–means is the key to a successful evaluation.

Do you know what it means? Or do you think you know what it means? 

How do you find out if what you think it means is what your key funder (a stakeholder) thinks it means? Or what the participants (target audience) think it means? Or what any other stakeholder (partners, for example) thinks it means…

You ask them.

You ask them BEFORE the evaluation begins.  You ask them BEFORE you have implemented the program.  You ask them when you plan the program.

During program planning, I bring to the table relevant stakeholders–folks similar to and different from those who will be the recipients of the program.  I ask them this evaluative question: “If you participated in this program, how will you know that the program is successful?  What has to happen/change to know that a difference has been made?”

Try it–the answers are often revealing, informative, and enlightening.  They are not often the answers you thought.  Listen to those stakeholders.  They have valuable insights.  They actually know something.

Once you have those answers, clarify any and all terminology so that everyone is on the same page. What something means to you may mean something completely different to someone else.

Impact is one of those words–it is both a noun and a verb.  Be careful how you use it and how it is used.  Go to a less loaded word–like results or effects.  Talk about measurable results that occur within a certain time frame–immediately after the program; several months after the program; several years after the program–depending on your program.  (If you are a forester, you may not see results for 40 years…)

Historically, April 15 is tax day (although in 2011, it is April 18)–the day taxes are due to the revenue departments.

State legislatures are dealing with budgets and Congress is trying to balance a  Federal budget.

Everywhere one looks, money is the issue–this is especially true in these recession-ridden times. How does all this relate to evaluation, you ask? That is the topic for today’s blog: how does money figure into evaluation?

Let’s start with the simple and move to the complex. Everything costs–and although I’m talking about money here, time, personnel, and other resources (like paper, staples, electricity, etc.) must also be taken into consideration.

When we talk about evaluation, four terms typically come to mind:  efficacy, effectiveness, efficiency, and fidelity.

Efficiency is the term that addresses money or costs. Was the program efficient in its use of resources? That is the question asked when addressing efficiency.

There are (at least) three approaches used to address that question:

  1. Cost or cost analysis;
  2. Cost effectiveness analysis; and
  3. Cost-benefit analysis.

Simply then:

  1. Cost analysis is the number of dollars it takes to deliver the program, including salary of the individual(s) planning the program.
  2. Cost effectiveness analysis is a computation of the target outcomes in an appropriate unit in ratio to the costs.
  3. Cost-benefit analysis is also a ratio, of the costs of the program to its benefits (outcomes), measured in the same units, usually money.

How are these computed?

  1. Cost can be measured by how much the consumer is willing to pay.  Costs can be the value of each resource that is consumed in the implementation of the program.  Or cost analysis can be “measuring costs so they can be related to procedures and outcomes” (Yates, 1996, p. 25).   So you list the money spent to implement the program, including salaries, and that is a cost analysis.  Simple.
  2. Cost effectiveness analysis says that there is some metric in which the outcomes are measured (number of times hands are washed during the day, for example) and that is put in ratio to the total costs of the program.  So movement from washing hands only once a day (a bare minimum) to washing hands at least six times a day would have the costs of the program (including salaries) divided by the change in the number of times hands are washed a day (i.e., 5).  The resulting value is the cost-effectiveness ratio.  Complex.
  3. Cost-benefit analysis puts the outcomes in the same metric as the costs–in this case, dollars.  The costs (in dollars) of the program (including salaries) are put in ratio to the outcomes (usually benefits) measured in dollars.  The challenge here is assigning a dollar amount to the outcomes.  How much is frequent hand washing worth?  It is often measured in days saved from communicable/chronic/acute illnesses.  The reduction in days affected by chronic illness is often difficult to value in dollars.  There is a whole body of literature in health economics on this topic, if you’re interested.  Complicated and complex.  (A minimal worked sketch of these computations follows this list.)
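
Here is that minimal worked sketch of the three analyses, using made-up numbers; the dollar figures and the hand-washing outcome are hypothetical, chosen only to show the arithmetic.

```python
# Cost analysis: total dollars to deliver the program, salaries included (hypothetical).
program_cost = 12_000.0

# Cost-effectiveness analysis: cost per unit of outcome.
handwashings_before = 1          # times per day before the program
handwashings_after = 6           # times per day after the program
outcome_gain = handwashings_after - handwashings_before      # 5 added washings/day
cost_per_unit = program_cost / outcome_gain                   # dollars per added washing
print(f"${cost_per_unit:,.0f} per additional daily hand-washing")

# Cost-benefit analysis: outcomes valued in dollars (e.g., illness days avoided),
# compared with costs in the same units.
benefit_in_dollars = 30_000.0                                 # hypothetical valuation
benefit_cost_ratio = benefit_in_dollars / program_cost
print(f"Benefit-cost ratio: {benefit_cost_ratio:.1f}")        # > 1 means benefits exceed costs
```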

Yates, B. T. (1996).  Analyzing costs, procedures, processes, and outcomes in human services.  Thousand Oaks, CA: Sage.

There has been a lot of buzz recently about the usefulness of the Kirkpatrick model.

I’ve been talking about it (in two previous posts) and so have others. This model has been around a long time and has continued to be useful in the training field. Extension does a lot of training. Does that mean this model should be used exclusively when training is the focus? I don’t think so. Does this model have merits? I think so. Could it be improved upon? That depends on the objective of your program and your evaluation, so probably.

If you want to know about whether your participants react favorably to the training, then this model is probably useful.

If you want to know about the change in knowledge, skills, or attitudes, then this model may be useful. You would need to be careful, because knowledge is a slippery concept to measure.

If you want to know about the change in behavior, probably not. On the website, Kirkpatrick says that application of learning is what is measured at the behavior level. How do you observe behavior change at a training? Observation is the obvious answer, and one does not necessarily observe behavior change at a training. Intention to change is not mentioned in this level.

If you want to know what difference you made in the social, economic, and/or environmental conditions in which your participants live, work, and practice, then the Kirkpatrick model won’t take you there. The 4th level (which is where evaluation starts for this model, according to Kirkpatrick) asks: “To what degree targeted outcomes occur as a result of the training event and subsequent reinforcement.” I do not see this as condition change, or what I call impact.

A faculty member asked me for specific help in assessing impact. First, one needs to define what is meant by impact. I use the word to mean change in social, environmental, and/or economic conditions over the long run. This means changes in social institutions like family, school, or employment (social conditions). It means changes in the environment, which may be clean water or clean air, OR it may mean removing the snack food vending machine from the school (environmental conditions). It means changes in some economic indicator, up or down, like return on investment, a change in employment status, or increased revenue (economic conditions). This doesn’t necessarily mean the targeted outcomes of the training event.

I hope that any training event will move participants to a different place in their thinking and acting that will manifest in the LONG RUN in changes in one of the three conditions mentioned above.  To get there, one needs to be specific in what one is asking the participants.  Intention to change doesn’t necessarily get to impact.  You could anticipate impact if participants follow through with their intention.  The only way to know that for sure  is to observe it.  We approximate that by asking good questions.

What questions are you asking about condition change to get at impacts of your training and educational programs?

Next week:  TIMELY TOPIC.  Any suggestions?

You’ve developed your program. You think you’ve met a need. You conduct an evaluation. Lo and behold! Some of your respondents give you such negative feedback you wonder what program they attended. Could it really have been your program?

This is the phenomenon I call “all of the people all of the time”, which occurs regularly in evaluating training programs. And it has to do with use–what you do with the results of this evaluation. And you can’t do it–please all of the people all of the time, that is. There will always be some sour grapes. In fact, you will probably have more negative comments than positive comments. People who are upset want you to know; people who are happy are just happy.

Now, I’m sure you are really confused.  Good.  At least I’ve got your attention and maybe you’ll read to the end of today’s post.

You have seen this scenario: You ask the participants for formative data so that you can begin planning the next event or program. You ask about the venue, the time of year, the length of the conference, the concurrent offerings, the plenary speakers. Although some of these data are satisfaction data (the first level, called Reaction, in Don Kirkpatrick’s training model, and the Reaction category in Claude Bennett’s TOPs Hierarchy [see diagram]),

they are an important part of formative evaluation and an important part of program planning. You are using the evaluation report. That is important. You are not asking if the participants learned something. You are not asking if they intend to change their behavior. You are not asking about what conditions have changed. You only want to know about their experience in the program.

What do you do with the sour grapes? You could make vinegar, only that won’t be very useful, and use is what you are after. Instead, sort the data into those topics over which you have some control and those topics over which you have no control. For example, you have control over who is invited to be a plenary speaker, whether there will be a plenary speaker, how many concurrent sessions there are, and who will teach those concurrent sessions; you have no control over the air handling at the venue, the chairs at the venue, and, probably, the temperature of the venue.
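
If it helps to see that sorting concretely, here is a minimal sketch; the topic lists are hypothetical examples drawn from the paragraph above.

```python
# Topics the planners can actually change (hypothetical list).
controllable = {"plenary speaker", "concurrent sessions", "session instructors"}

feedback_topics = ["plenary speaker", "room temperature", "concurrent sessions", "chairs"]

act_on    = [t for t in feedback_topics if t in controllable]
note_only = [t for t in feedback_topics if t not in controllable]

print("Act on:", act_on)        # ['plenary speaker', 'concurrent sessions']
print("Note only:", note_only)  # ['room temperature', 'chairs']
```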

You can CHANGE those topics over which you have control.  Comments say the plenary speaker was terrible.  Do not invite that person to speak again.  Feedback says that the concurrent sessions didn’t provide options for classified staff, only faculty.  Decide the focus of your program and be explicit in the program promotional materials–advertise it explicitly to your target audience.  You get complaints about the venue–perhaps there is another venue; perhaps not.

You can also let your audience know what you decided based on your feedback.  One organization for which I volunteered sent out a white paper with all the concerns and how the organization was addressing them–or not.  It helped the grumblers see that the organization takes their feedback seriously.

And if none of this works…ask yourself: Is it a case of all of the people all of the time?