According to the counter on this blog, I’ve published 49 times.  Last week was the one-year anniversary of the inception of “Evaluation is an Everyday Activity”; that makes 52 weeks, so I missed a few.  Not surprising, with vacations, professional development, and writer’s block.  Today is a writer’s block day…so I thought I’d do something about program theory.  I’m sure you are asking what program theory has to do with evaluating your program.   Let me explain…

An evaluation that is theory-driven uses program theory as a tool to (according to Jody Fitzpatrick):

  1. understand the program to be evaluated
  2. guide the evaluation.

Pretty important contributions.  Faculty have often told me, “I know my program’s good; everyone likes it.”  But–

Can you describe the program theory that supports your program?

Huey Chen (1) defines program theory as “a specification of what must be done to achieve the desired goals, what other important impacts may be anticipated, and how these goals and impacts would be generated.”  There are two parts of program theory:  normative theory and causative theory.  Normative theory (quoting Fitzpatrick) “…describes the program as it should be, its goals and outcomes, its interventions and the rationale for these, from the perspectives of various stakeholders.”  Causative theory, according to Fitzpatrick, “…makes use of existing research to describe the potential outcomes of the program based on characteristics of the clients (read, target audience) and the program actions.”  Using both normative and causative theories, one can develop a “plausible program model” or logic model.

Keep in mind that a “plausible program model” is only one of the possible models, and the model developed before implementation may need to change before the final evaluation.  Anticipated outcomes are the ones you think will happen as a result of the program; yet Jonny Morell (2) provides a long list of programs where unanticipated outcomes happened before, during, and after program implementation.  It is a good idea to think of all potential outcomes, not just the ones you think will happen.  This is why program theory is important…it helps you focus on the potential outcomes.
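To make the structure concrete, here is a minimal sketch of what one “plausible program model” might look like when written down.  The program, activities, and outcomes below are entirely hypothetical, offered only to show the shape of a logic model, not a template to copy.

```python
# A hypothetical logic model; every entry is illustrative, not a real program.
logic_model = {
    "inputs":     ["Extension educator time", "curriculum", "grant funds"],
    "activities": ["six watershed-stewardship workshops", "two field days"],
    "outputs":    ["120 participants trained", "300 feet of stream bank planted"],
    "outcomes": {
        "short_term":  ["participants can identify invasive plants"],
        "medium_term": ["participants adopt riparian buffer practices"],
        "long_term":   ["improved stream habitat (the anticipated impact)"],
    },
    # Program theory reminds us to leave room for outcomes we did not anticipate.
    "unanticipated": [],
}

for stage, entries in logic_model.items():
    print(stage, "->", entries)
```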

1.  Chen, H. (1990). Theory-Driven Evaluations. Newbury Park, CA: Sage.

2.  Morell, J. A. (2010). Evaluation in the Face of Uncertainty. New York, NY: Guilford Press.

There is an ongoing discussion about the difference between impact and outcome.  I think this is an important discussion because Extension professionals are asked regularly to demonstrate  the impact of their program.

There is no consensus on how these terms are defined, and they are often used interchangeably.  Yet most evaluators agree they are not the same.  When Extension professionals plan an evaluation, it is important to keep these terms separate; their meanings are distinct.

So what exactly is IMPACT?

And what is an OUTCOME?

What points do we need to keep in mind when deciding whether the report we are making is a report of OUTCOMES or a report of IMPACTS?  Making the meaning of these words explicit before beginning the program is important.  If there is no difference in your mind, then that needs to be stated.  If there is a difference from your perspective, that needs to be stated as well.  It may all depend on who the audience is for the report.  Have you asked your supervisor (Staff Chair, Department Head, Administrator) what they mean by these terms?

One way to look at this issue is to go to simpler language:

  • What is the result (effect) of the intervention (read ‘program’)–that is, SO WHAT?  This is impact.
  • What is the intervention’s influence (affect) on the target audience–that is, WHAT HAPPENED?  This is outcome.

I would contend that impact is the effect (i.e., the result) and outcome is the affect (i.e., the influence).

Now to complicate this discussion a bit–where do OUTPUTS fit?

OUTPUTS are necessary and NOT sufficient to determine the influence (affect) or results (effect) of an intervention.  Outputs count things that were done: number of people trained; feet of stream bed reclaimed; number of curricula written; number of…(fill in the blank).  Outputs tell you neither the affect nor the effect of the intervention.

The distinction I draw may be moot if you do not draw it.  If you don’t, that is OK.  Just make sure that you are explicit about what you mean by these terms:  OUTCOMES and IMPACT.

I’ve been writing for almost a year, some 50 columns.  This week, before the Thanksgiving holiday, I want to share evaluation resources I’ve found useful and for which I am thankful.  There are probably others I am not familiar with; these are the ones I rely on.


My colleagues Ellen Taylor-Powell, at the University of Wisconsin Extension Service (UWEX), and Nancy Ellen Kiernan, at Penn State Extension Service, both have resources that are very useful, easily accessed, and clearly written.  Ellen’s can be found at the Quick Tips site and Nancy Ellen’s at her Tipsheets index.  Both have other links that may be useful as well; access their sites through the links above.

Last week, I mentioned the American Evaluation Association (AEA).  One of the important structures in AEA is the Topical Interest Group (TIG).  Extension has a TIG, Extension Education Evaluation, which helps organize Extension professionals who are interested or involved in evaluation.  There is a wealth of information on the AEA web site: information about the evaluation profession, access to the AEA eLibrary, and links to AEA on Facebook, Twitter, and LinkedIn.  You do NOT have to be a member to subscribe to the blog AEA365, which, as the name suggests, is posted daily by a different evaluator.  Susan Kistler, AEA’s executive director, posts every Saturday; the November 20 post talks about the eLibrary.  Check it out.

Many states and regions have local AEA affiliates.  For example, OPEN, the Oregon Program Evaluators Network, serves Oregon and southern Washington.  It has an all-volunteer staff who live mostly in Portland and Vancouver, WA.  The AEA site lists over 20 affiliates across the country, many with their own websites; those websites have information about connecting with local evaluators.

In addition to these valuable resources, National eXtension (say e-eXtension) has developed a community of practice devoted to evaluation; Mike Lambur, eXtension Evaluation and Research Leader, can be reached at mike.lambur@extension.org. According to the web site, National eXtension “…is an interactive learning environment delivering the best, most researched knowledge from the smartest land-grant university minds across America. eXtension connects knowledge consumers with knowledge providers—experts like you who know their subject matter inside out.”

Happy Thanksgiving.  Be safe.

Recently, I attended the American Evaluation Association (AEA) annual conference in San Antonio, TX.  Although the photo that accompanied this post was a stock photo, the weather (until Sunday) was much as it appears there.  The Alamo was crowded: curious adults, tired children, friendly dogs, etc.  What I learned was that San Antonio is the only site in the US where there are five Spanish missions within 10 miles of each other.  Starting with the Alamo (its formal name is San Antonio de Valero) and going south out of San Antonio, the visitor will experience Missions Concepcion, San Juan, San Jose, and Espada, all of which will, at some point in the future, be on the Mission River Walk (as opposed to the Museum River Walk).  The missions (except the Alamo) are National Historic Sites.  For those of you who have the National Park Service Passport, site stamps are available.

AEA is the professional home for evaluators.  AEA has approximately 6,000 members, and about 2,500 of them attended this year’s conference, Evaluation 2010.  This year’s president, Leslie Cooksy, identified “Evaluation Quality” as the theme for the conference.  Leslie says in her welcome letter, “Evaluation quality is an umbrella theme, with room underneath for all kinds of ideas–quality from the perspective of different evaluation approaches, the role of certification in quality assurance, metaevaluation and the standards used to judge quality…”  Listening to the plenary sessions, attending the concurrent sessions, and networking with longtime colleagues, I got to hear many different perspectives on quality.

In the closing plenary, Hallie Preskill, 2007 AEA president, was asked to comment on the themes she heard throughout the conference.  She used mind mapping (a systems tool) to quickly and (I think) effectively organize the value of AEA.  She listed seven main themes:

  1. Truth
  2. Perspectives
  3. Context
  4. Design and methods
  5. Representation
  6. Intersections
  7. Relationships

Although she lists context as a separate theme, I wonder if evaluation quality is really contextual first, with these other themes following.

Hallie listed sub themes under each of these topics:

  1. What is (truth)?  Whose (truth)?  How much data is enough?
  2. Whose (perspectives)?  Cultural (perspectives).
  3. Cultural (context). Location (context).  Systems (context).
  4. Multiple and mixed (methods).  Multiple case studies.  Stories.  Credible.
  5. Diverse (representation).  Stakeholder (representation).
  6. Linking (intersections).  Interdisciplinary (intersections).
  7. (Relationships) help make meaning.  (Relationships) facilitate quality.   (Relationships) support use.  (Relationships) keep evaluation alive.

Being a member of AEA is all this and more.  Membership is affordable ($80.00 regular; $60.00 for joint membership with the Canadian Evaluation Society; $30.00 for full-time students), and the benefits are worth that and more.  The conference brings together evaluators from all over.  AEA is quality.

I’ve been reminded recently about Kirkpatrick’s evaluation model.

Donald L. Kirkpatrick (1959) developed a four-level model used primarily for evaluating training.  This model is still used extensively in the training field and is espoused by ASTD, the American Society for Training and Development.

It also occurred to me that Extension conducts a lot of training, from pesticide handling to logic model use, and that Kirkpatrick’s model isn’t talked about much in Extension–at least I don’t use it as a reference.  That may not be a good thing, given that Extension professionals are conducting training much of the time.

Kirkpatrick’s four levels are these:

  1. Reaction:  the degree to which participants react favorably to the training
  2. Learning:  the degree to which participants acquire the intended knowledge, skills, and attitudes from their participation in the learning event
  3. Application:  the degree to which participants apply what they learned during training on the job
  4. Impact:  the degree to which targeted outcomes occur as a result of the learning event(s) and subsequent reinforcement
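To see how the four levels might shape an evaluation instrument, here is a minimal sketch with hypothetical item wording; the pesticide-handling example and every item below are mine, not Kirkpatrick’s.

```python
# Hypothetical end-of-training items grouped by Kirkpatrick level; illustrative only.
kirkpatrick_items = {
    "1. Reaction":    ["The session was a good use of my time.",
                       "The pace of the training was about right."],
    "2. Learning":    ["I can list the required personal protective equipment.",
                       "I know how to calculate a dilution rate."],
    "3. Application": ["In the past month, I checked the label before mixing.",
                       "I have used the calibration steps taught in the session."],
    "4. Impact":      ["Pesticide exposure incidents reported on my operation this season."],
}

for level, items in kirkpatrick_items.items():
    print(level)
    for item in items:
        print("  -", item)
```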

Sometimes it is important to know what affective reaction our participants are having during and at the end of the training.  I would call this formative evaluation, and formative evaluation is often used for program improvement.  Reactions are a way participants can tell the Extension professional how things are going–i.e., what their reaction is–through a continuous feedback mechanism.  Extension professionals can use this to change the program, revise their approach, adjust the pace, etc.  The feedback mechanism doesn’t have to be constant, which is often the interpretation of “continuous”; soliciting feedback at natural breaks, using a show of hands, is often enough for on-the-spot adjustments.  It is a form of formative evaluation because it is an “in-process” evaluation.  Kirkpatrick’s level one (reaction) doesn’t provide a measure of outcomes or impacts; I might call it a “happiness” or satisfaction evaluation, because it tells me only what the participants’ reaction is.  Outcome evaluation–determining a measure of effectiveness–happens at a later level and is another approach to evaluation, one I would call summative, although Michael Patton might call it developmental in a training situation where the outcome is always moving, changing, developing.

Kirkpatrick, D. L. (1959). Evaluating Training Programs (2nd ed.). San Francisco: Berrett-Koehler.

Kirkpatrick, D. L. (Comp.). (1998). Another Look at Evaluating Training Programs. Alexandria, VA: ASTD.

For more information about the Kirkpatrick model, see their site, Kirkpatrick Partners.

Last Wednesday, I had the privilege to attend the OPEN (Oregon Program Evaluators Network) annual meeting.

Michael Quinn Patton, the keynote speaker, talked about developmental evaluation and utilization-focused evaluation.  Utilization-focused evaluation makes sense: evaluation designed for use by intended users.

Developmental evaluation, on the other hand, needs some discussion.

The way Michael tells the story (he teaches a lot through story) is this:

“I had a standard 5-year contract with a community leadership program that specified 2 1/2 years of formative evaluation for program improvement, to be followed by 2 1/2 years of summative evaluation that would lead to an overall decision about whether the program was effective.”  After 2 1/2 years, Michael called for the summative evaluation to begin.  The director was adamant: “We can’t stand still for 2 years.  Let’s keep doing formative evaluation.  We want to keep improving the program…I never (want to do a summative evaluation) if it means standardizing the program.  We want to keep developing and changing.”  He looked at Michael sternly, challengingly.  “Formative evaluation!  Summative evaluation!  Is that all you evaluators have to offer?”  Michael hemmed and hawed and said, “I suppose we could do…ummm…we could do…ummm…well, we might do, you know…we could try developmental evaluation!”  Not knowing what that was, the director asked, “What’s that?”  Michael responded, “It’s where you, ummm, keep developing.”  Developmental evaluation was born.

Until now, the evaluation field has offered two global approaches to evaluation: formative, for program improvement, and summative, to make an overall judgment of merit and worth.  Developmental evaluation (DE) offers another approach, one relevant to social innovators looking to bring about major social change.  It takes into consideration systems theory, complexity concepts, uncertainty principles, nonlinearity, and emergence.  DE acknowledges that resistance and pushback are likely when change happens.  Developmental evaluation recognizes that change brings turbulence and offers an approach that “adapts to the realities of complex nonlinear dynamics rather than trying to impose order and certainty on a disorderly and uncertain world” (Patton, 2011).  Social innovators recognize that outcomes will emerge as the program moves forward and that predefining outcomes limits the vision.

Michael has used the art of Mark M. Rogers to illustrate the point.  The cartoon has two early humans, one with what I would call a wheel, albeit primitive, who is saying, “No go.  The evaluation committee said it doesn’t meet utility specs.  They want something linear, stable, controllable, and targeted to reach a pre-set destination.  They couldn’t see any use for this (the wheel).”

For Extension professionals who are delivering programs designed to lead to a specific change, DE may not be useful.  For those Extension professionals who envision something different, DE may be the answer.  I think DE is worth a look.

Look for my next post after October 14; I’ll be out of the office until then.

Patton, M. Q. (2011). Developmental Evaluation. New York, NY: Guilford Press.

Ryan asks a good question: “Are youth-serving programs required to have an IRB for applications, beginning and end-of-year surveys, and program evaluations?”  His question leads me to today’s topic.

The IRB is concerned with “research on human subjects.”  So you ask, when is evaluation a form of research?

It all depends.

Although evaluation methods have evolved from  social science research, there are important distinctions between the two.

Fitzpatrick, Sanders, and Worthen list five differences between the two, and it is in those differences that one must consider IRB assurances.

These five differences are:

  1. purpose,
  2. who sets the agenda,
  3. generalizability of results,
  4. criteria, and
  5. preparation.

Although these criteria differ for evaluation and research, there are times when the two overlap.  If the evaluation study adds to knowledge in a discipline, or research informs our judgments about a program, then the distinctions blur, a broader view of the inquiry is needed, and IRB approval may be required.

The IRB considers children a vulnerable population, and vulnerable populations require IRB protection, so evaluations with vulnerable populations may need IRB assurances.  IF you have a program that involves children AND you plan to use the program activities as the basis of an effectiveness evaluation (as opposed to program improvement) AND you intend to use that evaluation as scholarship, you will need IRB approval.

Ryan also asks, “What does publish mean?”  That question takes us to what scholarship is.  One definition of scholarship is creative work that is validated by peers and communicated.  Published means communicating to peers in a peer-reviewed journal or at a professional meeting, not, for example, in a press release.

How do you decide if your evaluation needs IRB review?  How do you decide if your evaluation is research or not?  Start with the purpose of your inquiry.  Do you want to add knowledge to the field?  Do you want to see if what you are doing is applicable in other settings?  Do you want others to know what you’ve done and why?  Then you want to communicate this.  In academia, that means publishing it in a peer-reviewed journal or presenting it at a professional meeting.  And to do that, using the information provided to you by your participants, who are human subjects, you will need IRB assurance that they are protected.

Every IRB is different; check with your institution.  Most work done by Extension professionals falls under the category of “exempt from full board review,” the shortest and least restrictive review.  Work involving vulnerable populations, audio or video recording, or sensitive questions is typically categorized as expedited, a more stringent review than the “exempt” category, which takes a little longer.  IF you are working with vulnerable populations and asking for sensitive information, doing an invasive procedure, or involving participants in something that could be viewed as coercive, then the inquiry will probably need full board review (which takes the longest turnaround time).
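As a rough illustration of the screening logic described above (and only that; the categories and thresholds below are simplified assumptions, and your own IRB’s rules govern), a sketch might look like this:

```python
# A simplified, hypothetical screen; it does not reflect any particular IRB's policy.
def likely_review_category(vulnerable_population: bool,
                           sensitive_questions: bool,
                           recorded: bool,
                           invasive_or_coercive: bool) -> str:
    """Guess which review category an evaluation plan might fall under."""
    if invasive_or_coercive or (vulnerable_population and sensitive_questions):
        return "full board review (longest turnaround)"
    if vulnerable_population or sensitive_questions or recorded:
        return "expedited review"
    return "exempt from full board review (shortest, least restrictive)"

# Example: a taped focus group with adult volunteers, no sensitive questions.
print(likely_review_category(vulnerable_population=False,
                             sensitive_questions=False,
                             recorded=True,
                             invasive_or_coercive=False))
```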

September 25 – October 2 is Banned Books Week, and the American Library Association has once again published a list of banned or challenged books.  The September issue of the AARP Bulletin listed 50 banned books.  The Merriam-Webster Dictionary was banned in a California elementary school in January 2010.

Yes, you say, so what?  How does that relate to program evaluation?

Remember that the root of the word “evaluation” is value.  Someplace in the United States, some group used some criteria to “value” (or not) a book: to lodge a protest, successfully (or not), to remove a book from a library, school, or other source.  Establishing criteria means that evaluation was taking place.  In this case, those criteria included being “too political,” having “too much sex,” being “irreligious,” or being “socially offensive,” among others.  Someone, someplace, somewhere decided that the freedom to think for yourself, the freedom to read, the importance of the First Amendment, and the importance of free and open access to information are not important parts of our rights, and they used evaluation to make that decision.

I don’t agree with censorship; I do agree with a person’s right to express her or his opinion, as guaranteed by the First Amendment.  Yet in expressing an opinion, especially an evaluative opinion, an individual has a responsibility to express that opinion without hurting other people or property: to evaluate responsibly.

To help evaluators evaluate responsibly, the American Evaluation Association has developed a set of five guiding principles for evaluators.  Even if you do not consider yourself a professional evaluator, considering these principles when conducting your evaluations is important and responsible.  The Guiding Principles are:

A. Systematic Inquiry: Evaluators conduct systematic, data-based inquiries;

B. Competence: Evaluators provide competent performance to stakeholders;

C. Integrity/Honesty: Evaluators display honesty and integrity in their own behavior, and attempt to ensure the honesty and integrity of the entire evaluation process;

D.  Respect for People:  Evaluators respect the security, dignity, and self-worth of respondents, program participants, clients, and other evaluation stakeholders; and

E. Responsibilities for General and Public Welfare: Evaluators articulate and take into account the diversity of general and public interests and values that may be related to the evaluation.

I think free and open access to information is covered by principles D and E.  You may or may not agree with the people who challenged a book and, in doing so, used evaluation.  Yet, as someone who conducts evaluation, you have a responsibility to consider these principles, making sure that your evaluations respect people and attend to the general and public welfare (in addition to employing systematic inquiry, competence, and integrity/honesty).  Now–go read a good (banned) book!

A faculty member asked me how one determines impact from qualitative data.  And in my mailbox today was a publication from Sage inviting me to “explore these new and best selling qualitative methods titles from Sage.”

Many Extension professionals are leery of gathering data using qualitative methods.  “There is just too much data to make sense of it,” is one complaint I often hear.  Yes, one characteristic of qualitative data is the rich detail that usually results.  (Of course, if you are only asking closed-ended yes/no questions, the richness is missing.)  Other complaints include “What do I do with the data?”  “How do I draw conclusions?”  “How do I report the findings?”  As a result, many Extension professionals default to what is familiar: a survey.  Surveys, as we have discussed previously, are easy to code, easy to report (frequencies and percentages), and difficult to write well.

The Sage brochure provides resources to answer some of these questions.

Michael Patton’s 3rd edition of Qualitative Research and Evaluation Methods “…contains hundreds of examples and stories illuminating all aspects of qualitative inquiry…it offers strategies for enhancing quality and credibility of qualitative findings…and providing detailed analytical guidelines.”  Michael is the keynote speaker for the Oregon Program Evaluators Network (OPEN) fall conference, where he will be talking about his new book, Developmental Evaluation.  If you are in Portland, I encourage you to attend.  (For more information, see http://www.oregoneval.org/program/.)

Another reference I just purchased is Bernard and Ryan’s volume, Analyzing Qualitative Data.  This book offers a systematic approach to making sense out of words.  It, too, is available from Sage.

What does all this have to do with analyzing a conversation?  A conversation is qualitative data.  It is made up of words.  Knowing what to do with those words will provide evaluation data that is powerful.  My director is forever saying that the story is what legislators want to hear.  Stories are qualitative data.

One of the most common forms of conversation that Extension professionals use is the focus group: a guided, structured, and focused conversation.  It can yield a wealth of information if the questions are well crafted, if those questions have been pilot tested, and if the data are analyzed in a meaningful way.  There are numerous ways to analyze qualitative data (cultural domain analysis, KWIC analysis, discourse analysis, narrative analysis, grounded theory, content analysis, schema analysis, analytic induction and qualitative comparative analysis, and ethnographic decision models), all of which are discussed in the reference mentioned above.  Deciding which will work best with the gathered qualitative data is a decision only the principal investigator can make; comfort and experience will enter into that decision.  Keep in mind that qualitative data can be reduced to numbers; numbers cannot be exploded to recapture the words from which they came.
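For readers who like to see what even the simplest of these analyses involves, here is a minimal keyword-in-context (KWIC) sketch.  The transcript excerpt and keyword are hypothetical, and real transcripts, and most of the methods listed above, require far more care than this.

```python
import re

def kwic(text, keyword, window=4):
    """Return each occurrence of `keyword` with `window` words of context on either side."""
    words = re.findall(r"[\w']+", text.lower())
    hits = []
    for i, w in enumerate(words):
        if w == keyword.lower():
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append(f"...{left} [{w}] {right}...")
    return hits

# Hypothetical focus group excerpt
transcript = ("The training helped me talk to my neighbors about stream restoration. "
              "Before the training I did not know where to start.")
for line in kwic(transcript, "training"):
    print(line)
```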

One response I got to last week’s query was about on-line survey services.  Are they reliable?  Are they economical?  What are the design limitations?  What are the question format limitations?

Yes.  Depends.  Some.  Not many.

Let me take the easy question first:  Are they economical?

Depends.  Weigh the cost of postage for a paper survey (both out and back) against the time it takes to enter questions into the system, and the cost of the system against the length of the survey.  These are things to consider.
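As a back-of-the-envelope illustration of that trade-off, here is a sketch.  Every figure below is a made-up assumption (apart from the $200/year subscription mentioned later in this post), not a quote from any vendor or the postal service.

```python
# Hypothetical break-even comparison: paper versus online survey costs.
n_respondents  = 300
postage_out    = 0.44   # assumed cost to mail one survey
postage_back   = 0.44   # assumed business-reply postage per return
printing       = 0.25   # assumed printing cost per survey
data_entry_hrs = 10     # assumed hours to key in paper responses
hourly_rate    = 15.00  # assumed staff rate
subscription   = 200.00 # annual online-service fee (see below)
setup_hrs      = 2      # assumed time to enter questions online

paper_cost  = n_respondents * (postage_out + postage_back + printing) + data_entry_hrs * hourly_rate
online_cost = subscription + setup_hrs * hourly_rate

print(f"Paper survey:  ${paper_cost:,.2f}")
print(f"Online survey: ${online_cost:,.2f}")
```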

Because most people have access to email today, using an on-line survey service is often the easiest and most economical way to distribute an evaluation survey.  Most institutional review boards view an on-line survey like a mail survey and typically grant a waiver of documentation of informed consent.  The consent document is the entry screen, and an agree-to-participate question is often included on that screen.

Are they valid and reliable?

Yes, but…the old adage “garbage in, garbage out” applies here.  Like a paper survey, an internet survey is only as good as the survey questions.  Don Dillman, in the third edition of “Internet, Mail, and Mixed-Mode Surveys” (co-authored with Jolene D. Smyth and Leah Melani Christian), talks about question development.  Since he wrote the book (literally), I use this resource a lot!

What are the design limitations?

Some limitations apply…each online survey service is different.  The most common service is Survey Monkey (www.surveymonkey.com).  The introduction to Survey Monkey says, “Create and publish online surveys in minutes, and view results graphically and in real time.”  The basic account with Survey Monkey is free; it has limitations (number of questions [10], number of question formats [15], number of responses [100]).  You can upgrade to the Pro or Unlimited plan for a subscription fee ($19.95/month or $200/year, respectively).  There are others: a search using “survey services” returns many options, such as Zoomerang or InstantSurvey.

What are the question format limitations?

Not many–both open-ended and closed-ended questions can be asked.  Survey Monkey has 15 different formats from which to choose (see below).  There may be others elsewhere; this list covers most formats.

  • Multiple Choice (Only one Answer)
  • Multiple Choice (Multiple Answers)
  • Matrix of Choices (Only one Answer per Row)
  • Matrix of Choices (Multiple Answers per Row)
  • Matrix of Drop-down Menus
  • Rating Scale
  • Single Textbox
  • Multiple Textboxes
  • Comment/Essay Box
  • Numerical Textboxes
  • Demographic Information (US)
  • Demographic Information (International)
  • Date and/or Time
  • Image
  • Descriptive Text

Oregon State University has an in-house service sponsored by the College of Business (BSG–Business Survey Group).  OSU also has an institutional account with Student Voice, an on-line service designed initially for learning assessment, which I have found useful for evaluations.  Check your institution for available options.  For your next evaluation that involves a survey, think electronically.