Sure, you want to know the outcomes resulting from your program.  Sure, you want to know if your program is effective.  Perhaps you will even attempt to answer the question, “So what?” when your program is effective on some previously identified outcome.  All that is important.

My topic today is something that is often overlooked when developing an evaluation–the participant and program characteristics.

Do you know what your participants look like?

Do you know what your program looks like?

Knowing these characteristics may seem unimportant at the outset of the implementation.  As you get to the end, questions will arise–How many females?  How many Asians?  How many over 60?

Evaluators typically include demographic questions as part of the data collection.

Those questions often include the following categories:

  • Gender
  • Age
  • Race/ethnicity
  • Marital status
  • Household income
  • Educational level

Some of those may not be relevant to your program, and you may want to include other general characteristic questions instead.  For example, in a long-term evaluation of a forestry program where the target audience was individuals with woodlots, asking how many acres were owned was important and marital status did not seem relevant.

Sometimes asking some questions may seem intrusive–for example, household income or age.  In all such cases, giving the participant the option not to respond is appropriate.  When these data are reported, report the number of participants who chose not to respond.
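
When tallying responses, the decline-to-answer count can be reported right alongside the substantive categories.  A minimal Python sketch, with made-up income categories and a hypothetical “Prefer not to respond” label:

```python
from collections import Counter

def summarize_demographic(responses, no_answer="Prefer not to respond"):
    """Tally one demographic item, keeping a separate count of decliners."""
    counts = Counter(responses)
    declined = counts.pop(no_answer, 0)  # remove decliners from the categories
    return {"categories": dict(counts), "declined": declined, "total": len(responses)}

# Hypothetical responses to a household-income question:
income = ["<$25k", "$25k-$50k", "Prefer not to respond", "<$25k", "Prefer not to respond"]
summary = summarize_demographic(income)
# summary["declined"] == 2, summary["total"] == 5
```

Popping the decliners out of the tally keeps them from being mistaken for a substantive category while still preserving the count for reporting.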

When characterizing your program, it is sometimes important to know characteristics of the geographic area where the program is being implemented–is it rural, suburban, urban, or something else?  This is especially true when the program is a multisite program.  Locale introduces an unanticipated variable that is often not recognized or remembered.

Any variation in the implementation is worth documenting–the number of contact hours, for example, or the number of training modules.  The type of intervention is important as well–was the program delivered as a group intervention or individually?  The time of year that the program is implemented may also be important to document.  The time of year may inadvertently introduce a history bias into the study–what is happening in September is different from what is happening in December.

Documenting these characteristics and then defining them when reporting the findings helps readers understand the circumstances surrounding the program implementation.  If the target audience is large, documenting these characteristics can also provide comparison groups–did males do something differently than females?  Did participants over 50 do something differently than participants 49 or under?
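
Splitting outcomes by a documented characteristic is then a simple grouping operation.  A sketch with hypothetical field names (`age_band`, `score`):

```python
from statistics import mean

def subgroup_means(records, group_key, outcome_key):
    """Average an outcome within each demographic subgroup."""
    groups = {}
    for r in records:
        groups.setdefault(r[group_key], []).append(r[outcome_key])
    return {group: mean(values) for group, values in groups.items()}

# Hypothetical participant records:
participants = [
    {"age_band": "50+", "score": 4},
    {"age_band": "49 or under", "score": 3},
    {"age_band": "50+", "score": 5},
]
by_age = subgroup_means(participants, "age_band", "score")
# by_age == {"50+": 4.5, "49 or under": 3}
```

The same function works for any characteristic you collected–gender, acreage owned, site location–so long as it was documented at the start.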

Keep in mind when collecting participant and program characteristic data, that these data help you and the audience to whom you disseminate the findings understand your outcomes and the effect of your program.

Last week I suggested a few evaluation-related resolutions…one I didn’t mention, which is easily accomplished, is reading and/or contributing to AEA365.  AEA365 is a daily evaluation blog sponsored by the American Evaluation Association.  AEA’s Newsletter says: “The aea365 Tip-a-Day Alerts are dedicated to highlighting Hot Tips, Cool Tricks, Rad Resources, and Lessons Learned by and for evaluators (see the aea365 site here). Begun on January 1, 2010, we’re kicking off our second year and hoping to expand the diversity of voices, perspectives, and content shared during the coming year. We’re seeking colleagues to write one-time contributions of 250-400 words from their own experience. No online writing experience is necessary – you simply review examples on the aea365 Tip-a-Day Alerts site, craft your entry according to the contributions guidelines, and send it to Michelle Baron, our blog coordinator. She’ll do a final edit and upload. If you have questions, or want to learn more, please review the site and then contact Michelle at aea365@eval.org. (updated December 2011)”

AEA365 is a valuable site.  I commend it to you.

Now the topic for today: Data sources–the why and the why not (or advantages and disadvantages for the source of information).

Ellen Taylor-Powell, Evaluation Specialist at UWEX, has a handout that identifies sources of evaluation data.  These sources are existing information, people, and pictorial records and observations.  Each source has advantages and disadvantages.

The source for the information below is the United Way publication, Measuring Program Outcomes (p. 86).

1.  Existing information such as Program Records are

  • Available
  • Accessible
  • Known sources and methods of data collection

Program records can also

  • Be corrupt because of data collection methods
  • Have missing data
  • Omit post-intervention impact data

2. Another form of existing information is Other Agency Records, which

  • Offer a different perspective
  • May contain impact data

Other agency records may also

  • Be corrupt because of data collection methods
  • Have missing data
  • Be unavailable as a data source
  • Have inconsistent time frames
  • Have case identification difficulties

3.  People are often the main data source and include Individuals and the General Public, who

  • Have unique perspective on experience
  • Are an original source of data
  • General public can provide information when individuals are not accessible
  • Can serve geographic areas or specific population segments

Individuals and the general public may also

  • Introduce a self-report bias
  • Not be accessible
  • Have limited overall experience

4.  Observations and pictorial records include Trained Observers and Mechanical Measurements, which

  • Can provide information on behavioral skills and practices
  • Supplement self-reports
  • Can be easily quantified and standardized

These sources of data also

  • Are only relevant to physical observation
  • Need data collectors who must be reliably trained
  • Often result in inconsistent data with multiple observers
  • Are affected by the accuracy of testing devices
  • Have limited applicability to outcome measurement

My older daughter (I have two–Morgan, the older, and Mersedes, the younger) suggested I talk about the evaluative activities around the holidays…hmmm.

Since I’m experiencing serious writers block this week, I thought I’d revisit evaluation as an everyday activity, with a holiday twist.

Keep in mind that the root of evaluation, which comes to English from the French and before that the Latin, is value (the Oxford English Dictionary online says:  [a. Fr. évaluation, f. évaluer, f. é- = es- (L. ex) out + value VALUE.]).


Perhaps this is a good time to mention that the theme for Evaluation 2011 put forth by incoming AEA President, Jennifer Greene, is Values and Valuing in Evaluation.  I want to quote from her invitation letter, “…evaluation is inherently imbued with values.  Our work as evaluators intrinsically involves the process of valuing, as our charge is to make judgments (emphasis original) about the “goodness” or the quality, merit, or worth of a program.”

Let us consider the holidays “a program”. The Winter Holiday season starts (at least in the US and the northern hemisphere) with the Thanksgiving holiday followed shortly thereafter by the first Sunday in Advent.  Typically this period of time includes at least the following holidays:  St. Nicholas Day, Hanukkah, Winter Solstice, Christmas, Kwanzaa, Boxing Day, New Year’s, and Epiphany (I’m sure there are ones I didn’t list that are relevant).  This list typically takes us through January 6.  (I’m getting to the value part–stay with me…)

When I was a child, I remember the eager anticipation of Christmas–none of the other holidays was even on my radar screen.  (For those of you who know me, you know how long ago that was…)  Then with great expectation (thank you, Charles), I would go to bed and, as patiently as possible, await the moment when my father would turn on the tree lights, signaling that we children could descend to the living room.  Then poof!  That was Christmas.  In 10 minutes it was done.  The emotional bath I always took greatly diminished the value of this all-important holiday.

Vowing that my children would grow up without the emotional bath of great expectations and dashed hopes, I chose to Celebrate the Season.  In doing so, I found value in the waiting of Advent, the magic of Hanukkah, the sharing of Kwanzaa, the mystery of Christmas, and the traditions that come with all of these holidays.  There are other traditions that we revisit yearly, and we find delight in remembering what the Winter Holiday traditions are and mean, remembering the foods we eat and the times we’ve shared.  From all this we find value in our program.  Do I still experience the emotional bath of childhood during this Holiday Season?  Not any more–and my children tell me that they like spreading the holidays out over the six-week period.

I think this is the time of the year when we can take a second look at our programs (whether they are the holidays, youth development, watershed stewardship, nutrition education, or something else) and look for value in our programs–the part of the program that matters.  Evaluation is the work of capturing that value.  How we do that is what evaluation is all about.

One response I got for last week’s query was about on-line survey services.  Are they reliable?  Are they economical?  What are the design limitations?  What are the question format limitations?

Yes.  Depends.  Some.  Not many.

Let me take the easy question first:  Are they economical?

Depends.  Weigh the cost of postage for a paper survey (both out and back) against the time it takes to enter questions into the system, and the cost of the system against the length of the survey.  These are things to consider.
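
The trade-off is ordinary arithmetic.  A back-of-the-envelope sketch–all figures below are illustrative assumptions, not actual rates:

```python
def paper_survey_cost(n, postage_out, postage_back, printing):
    """Total cost to mail a paper survey to n people, with return postage."""
    return n * (postage_out + postage_back + printing)

def online_survey_cost(monthly_fee, months, setup_hours, hourly_rate):
    """Subscription fee plus the staff time to enter questions into the system."""
    return monthly_fee * months + setup_hours * hourly_rate

# Illustrative: 200 paper surveys at $0.44 postage each way plus $0.10 printing,
# vs. two months of a $19.95/month subscription and 3 hours of setup at $25/hour.
paper = paper_survey_cost(200, 0.44, 0.44, 0.10)   # about $196
online = online_survey_cost(19.95, 2, 3, 25)       # about $115
```

With these made-up numbers the on-line option wins, but a very short survey sent to a handful of people could easily tip the other way.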

Because most people have access to email today, using an on-line survey service is often the easiest and most economical way to distribute an evaluation survey.  Most institutional review boards view an on-line survey like a mail survey and typically grant a waiver of documentation of informed consent.  The consenting document is the entry screen, and often an agree-to-participate question is included on that screen.

Are they valid and reliable?

Yes, but…the old adage “Garbage in, garbage out” applies here.  Like a paper survey, an internet survey is only as good as the survey questions.  Don Dillman, in the third edition of “Internet, Mail, and Mixed-Mode Surveys” (co-authored with Jolene D. Smyth and Leah Melani Christian), talks about question development.  Since he wrote the book (literally), I use this resource a lot!

What are the design limitations?

Some limitations apply…Each online survey service is different.  The most common service is Survey Monkey (www.surveymonkey.com).  The introduction to Survey Monkey says, “Create and publish online surveys in minutes, and view results graphically and in real time.”  The basic account with Survey Monkey is free.  It has limitations (number of questions [10]; limited number of question formats [15]; number of responses [100]), and you can upgrade to the Pro or Unlimited level for a subscription fee ($19.95/month or $200/year, respectively).  There are others.  A search using “survey services” returns many options, such as Zoomerang or InstantSurvey.

What are the question format limitations?

Not many–both open-ended and closed-ended questions can be asked.  Survey Monkey has 15 different formats from which to choose (see below).  I’m sure there may be others; this list covers most formats.

  • Multiple Choice (Only one Answer)
  • Multiple Choice (Multiple Answers)
  • Matrix of Choices (Only one Answer per Row)
  • Matrix of Choices (Multiple Answers per Row)
  • Matrix of Drop-down Menus
  • Rating Scale
  • Single Textbox
  • Multiple Textboxes
  • Comment/Essay Box
  • Numerical Textboxes
  • Demographic Information (US)
  • Demographic Information (International)
  • Date and/or Time
  • Image
  • Descriptive Text

Oregon State University has an in-house service sponsored by the College of Business (BSG–Business Survey Groups).  OSU also has an institutional account with Student Voice, an on-line service designed initially for learning assessment, which I have found useful for evaluations.  Check your institution for the options available.  For your next evaluation that involves a survey, think electronically.

A good friend of mine asked me today if I knew of any attributes (which I interpreted to be criteria) of qualitative data (NOT qualitative research).  My friend likened the quest for attributes for qualitative data to the psychometric properties of a measurement instrument–validity and reliability–that could be applied to the data derived from those instruments.

Good question.  How does this relate to program evaluation, you may ask.  That question takes us to an understanding of paradigm.

A paradigm (according to Scriven in Evaluation Thesaurus) is a general concept or model for a discipline that may be influential in shaping the development of that discipline.  Paradigms do not (again according to Scriven) define truth; rather, they define prima facie truth (i.e., truth on first appearance), which is not the same as truth.  Scriven goes on to say, “…eventually, paradigms are rejected as too far from reality and they are always governed by that possibility [i.e., that they will be rejected]” (page 253).

So why is it important to understand paradigms?  They frame the inquiry.  And evaluators are asking a question; that is, they are inquiring.

How inquiry is framed is based on the components of paradigm:

  • ontology–what is the nature of reality?
  • epistemology–what is the relationship between the known and the knower?
  • methodology–what is done to gain knowledge of reality, i.e., the world?

These beliefs shape how the evaluator sees the world and then guide the evaluator in the use of data, whether those data are derived from records, observations, or interviews (i.e., qualitative data) or from measurements, scales, or instruments (i.e., quantitative data).  Each paradigm guides the questions asked and the interpretations brought to the answers to those questions.  This is their importance to evaluation.

Denzin and Lincoln (2005), in the 3rd edition of the Handbook of Qualitative Research, list what they call interpretive paradigms.  They are described in Chapters 8 – 14 of that volume.  The paradigms are:

  1. Positivist/post positivist
  2. Constructivist
  3. Feminist
  4. Ethnic
  5. Marxist
  6. Cultural studies
  7. Queer theory

They indicate that each of these paradigms has criteria, a form of theory, and a specific type of narration or report.  If paradigms have criteria, then it makes sense to me that the data derived in the inquiry framed by those paradigms would have criteria.  Certainly, the psychometric properties of validity and reliability (stemming from the positivist paradigm) relate to data, usually quantitative.  It would make sense to me that the parallel, though different, concepts in the constructivist paradigm, trustworthiness and credibility, would apply to data derived from that paradigm–often qualitative data.

If that is the case, then evaluators need to be at least knowledgeable about paradigms.

In 1963, Campbell and Stanley (in their classic book, Experimental and Quasi-Experimental Designs for Research) discussed the retrospective pretest.  This is the method whereby the participants’ attitudes, knowledge, skills, behaviors, etc., existing prior to and after the program are assessed together AFTER the program–a novel approach to capturing what participants knew, felt, and did before they experienced the program.

Does it work?  Yes…and no (according to the folks in the know).

Campbell and Stanley mention the use of the retrospective pretest in measuring the attitudes toward Blacks (they use the term Negro) of soldiers assigned to racially mixed vs. all-white combat infantry units (1947), and in measuring housing project occupants’ attitudes toward being in integrated vs. segregated housing units when there was a housing shortage (1951).  Both tests showed no difference between the two groups in remembering prior attitudes toward the idea of interest.  Campbell and Stanley argue that with only posttest measures, any difference found may have been attributable to selection bias.  They caution readers to “…be careful to note that the probable direction of memory bias is to distort the past…into agreement with (the) present…or has come to believe to be socially desirable…”

This brings up several biases that the Extension professional needs to be concerned with in planning and conducting an evaluation: selection bias, desired-response bias, and response-shift bias.  All of these can have serious implications for the evaluation.

These are technical terms for several limitations that can affect any evaluation.  Selection bias is the preference to put some participants into one group rather than the other; Campbell and Stanley call this bias a threat to validity.  Desired-response bias occurs when participants try to answer the way they think the evaluator wants them to answer.  Response-shift bias happens when participants’ frame of reference or understanding changes during the program, often due to misunderstanding or preconceived ideas.

So those are the potential problems.  Are there any advantages or strengths to using the retrospective pretest?  There are at least two.  First, there is only one administration, at the end of the program.  This is advantageous when the program is short and when participants do not like to fill out forms (that is, it minimizes paper burden).  Second, it avoids response-shift bias by not introducing information that may not be understood prior to the program.

Theodore Lamb (2005) tested the two methods and concluded that the two approaches appeared similar and recommended the retrospective pretest if conducting a pretest/posttest is  difficult or impossible.  He cautions, however, that supplementing the data from the retrospective pretest with other data is necessary to demonstrate the effectiveness of the program.
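
Scoring a retrospective pretest comes down to summarizing the paired “then” and “now” ratings collected on the single posttest form.  A minimal sketch with hypothetical 1-to-5 self-ratings:

```python
from statistics import mean

def retrospective_change(records):
    """Mean gain from the retrospective 'before' rating to the 'after' rating,
    both reported by each participant on the same posttest form."""
    return mean(r["after"] - r["before"] for r in records)

# Hypothetical self-ratings from three participants:
ratings = [
    {"before": 2, "after": 4},
    {"before": 3, "after": 4},
    {"before": 1, "after": 3},
]
gain = retrospective_change(ratings)   # average gain of about 1.67 points
```

Because each participant supplies both ratings at once, the pairs are never broken by attrition between a pretest and a posttest–but, as Lamb cautions, a mean gain like this should be supplemented with other data before claiming program effectiveness.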

There is a vast array of information about this evaluation method.  If you would like to know more, let me know.

I was reading another evaluation blog (the American Evaluation Association’s blog, AEA365), which talked about database design.  I was reminded that over the years, almost every Extension professional with whom I have worked has asked me the following question: “What do I do with my data now that I have all my surveys back?”

As Leigh Wang points out in her AEA365 comments, “Most training programs and publication venues focus on the research design, data collection, and data analysis phases, but largely leave the database design phase out of the research cycle.”  The questions that this statement raises are:

  1. How do/did you learn what to do with data once you have it?
  2. How do/did you decide to organize it?
  3. What software do/did you use?
  4. How important is it to make the data accessible to colleagues in the same field?

I want to know the answers to those questions.  I have some ideas.  Before I talk about what I do, I want to know what you do.  Email me, or comment on this blog.

A colleague of mine trying to explain observation to a student said, “Count the number of legs you see on the playground and divide by two. You have observed the number of students on the playground.” That is certainly one way to look at the topic.

I’d like to be a bit more precise than that, though.  Observation is collecting information through the use of the senses–seeing, hearing, tasting, smelling, feeling.  To gather observations, the evaluator must have a clearly specified protocol–a step-by-step approach to what data are to be collected and how.  The evaluator typically gets the first exposure to collecting information by observation at a very young age–learning to talk (hearing), learning to feed oneself (feeling); I’m sure you can think of other examples.  When the evaluator starts school and studies science, and the teacher asks the student to “OBSERVE” a phenomenon and record what is seen, the evaluator is exposed to another approach to the method of observation.

As the process becomes more sophisticated, all manner of instruments may assist the evaluator–thermometers, chronometers, GIS, etc. And for that process to be able to be replicated (for validity), the steps become more and more precise.

Does that mean that looking at the playground, counting the legs, and dividing by two has no place?  Those who decry data manipulation would agree that this form of observation yields information of questionable usefulness.  Those who approach observation as an unstructured activity would disagree and say that exploratory observation could result in an emerging premise.

You will see observation as the basis for ethnographic inquiry.  David Fetterman has a small volume (Ethnography: Step by step) published by Sage that explains how ethnography is used in field work.  Take simple ethnography a step up and one can read about meta-ethnography by George W. Noblit and R. Dwight Hare. I think my anthropology friends would say that observation is a tool used extensively by anthropologists. It is a tool that can be used by evaluators as well.