I had a conversation today about how to measure whether I am making a difference in what I do.  Although the conversation referred to working with differences, I am conscious that the work I do, and the work of working with differences, transcends most disciplines and positions.  How does it relate to evaluation?

Perspective and voice.

These are two sides of the same coin.  Individuals come to evaluation with a history, a perspective.  Individuals voice their views in the development of evaluation plans.  If individuals are not invited, or do not come to the table for the discussion, a voice is missing.

This conversation went on–the message was that voice and perspective matter more in evaluations that employ a qualitative approach than in those that employ a quantitative approach.  Yes, and no.

Certainly, words have perspective and provide a vehicle for voice.  And words are the basis for qualitative methods.  So this is the “Yes”.  Is this still an issue when the target audience is homogeneous?  Is it still an issue when the evaluator is “different” on some criterion from the target audience?  Or, as one mental health worker once stated, only an addict can provide effective therapy to another addict.  Is that really the case?  Or do voice and perspective always overlay an evaluation?

Let’s look at quantitative methods.  Some would argue that numbers aren’t affected by perspective and voice.  I will argue that the basis for those numbers is words.  If words are turned into numbers, are voice and perspective still an issue?  This is the “Yes and no”.
I am reminded of the story of a brook and a Native American child.  The standardized test asked which of the following is similar to a brook.  The possible responses were (for the sake of this conversation) river, meadow, lake, and inlet.  The Native American child, growing up in the desert Southwest, had never heard the word “brook”, and consequently got the item wrong.  This was one of many questions where perspective affected the response.  The wrong answers were totaled, that total was subtracted from the possible total, and a score (a number) resulted.  That individual number was grouped with other individual numbers and compared to the numbers from another group using a statistical test (for the sake of conversation, a t-test).  Is the resulting statistic of significance valid?  I would say not.  So this is the “No”.  Here the voice and perspective have been obfuscated.

The statistical significance between those groups is clear according to the computation; clear, that is, until one looks at the words behind the numbers.  It is in the words behind the numbers that perspective and voice affect the outcomes.
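To see how mechanical that computation is, here is a small sketch of a two-sample t statistic using only Python’s standard library.  The scores are made-up numbers for illustration, not data from any real study:

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical test scores for two groups (invented for illustration).
group_a = [80, 85, 90, 95, 100]
group_b = [60, 65, 70, 75, 80]

def pooled_t(a, b):
    """Two-sample t statistic using a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

t = pooled_t(group_a, group_b)
print(round(t, 2))  # prints 4.0 -- a large t, "significant" by the arithmetic
# But the arithmetic says nothing about whether the test items behind
# these scores meant the same thing to both groups.
```

The point of the sketch is that the formula happily produces a tidy number from whatever it is fed; the validity question lives upstream, in the words that became the numbers.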

Statistics is not the dragon you think it is.

For many people, the field of statistics is a dragon in disguise, and as with dragons, most people shy away from it.

I have found Neil Salkind’s book “Statistics for People Who (Think They) Hate Statistics” to be a good reference for understanding the basics of statistics.  The 4th edition is due out in September 2010.  This book isn’t intimidating; it is easy to understand; it isn’t heavy on math or formulas; and it has a lot of tips.  I’m using it for this column.  I keep it on my desk along with Dillman.

Faculty who come to me with questions about analyzing their data typically want to know how to determine statistical significance.  But before I can talk to faculty about statistical significance, there are a few questions that need to be answered.

  • What type of measurement scale have you used?
  • How many groups do you have on which you have data?
  • How many variables do you have for those groups?
  • Are you examining relationships or differences?
  • What question(s) do you want to answer?

Most people immediately jump to what test to use.  Don’t go there.  Start with the measurement scale you have.  Then answer the other questions.
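Once those questions are answered, they narrow the choice of test considerably.  The sketch below captures the usual textbook pairings (the kind Salkind walks through); it is a simplification, and real data often call for more careful judgment:

```python
# A simplified sketch of how answers to the questions above narrow the
# choice of test. These are the standard introductory-textbook pairings,
# not a substitute for statistical judgment.
def suggest_test(scale, groups, question):
    if question == "relationships":
        return {"nominal": "chi-square test of independence",
                "ordinal": "Spearman correlation",
                "interval": "Pearson correlation"}[scale]
    # question == "differences"
    if scale == "nominal":
        return "chi-square"
    if scale == "ordinal":
        return "Mann-Whitney U" if groups == 2 else "Kruskal-Wallis"
    # interval or ratio data
    return "t-test" if groups == 2 else "one-way ANOVA"

print(suggest_test("interval", 2, "differences"))  # t-test
print(suggest_test("ordinal", 3, "differences"))   # Kruskal-Wallis
```

Notice that the measurement scale is the first thing the lookup needs, which is exactly why it is the first question to answer.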

So let’s talk about scales of measurement.  Not all data are created equal.  Some data are easier to analyze than others.  Scale of measurement makes that difference.

There are four scales of measurement, and most data fall into one of the four. Data are either categorical (even if they have been converted to numbers) or numerical (originally numbers).  The scales are:

  • nominal
  • ordinal
  • interval
  • ratio

Scales of measurement are rules that determine the levels at which outcomes are measured.  When you decide on an answer to a question, you are deciding on the scale of measurement; you are agreeing to the particular set of characteristics of that measurement.

Nominal scales name something. For example, gender is either male, female, or unknown/not stated; ethnicity is one of several names of groups.  When you gather demographic data, such as gender, ethnicity, or race, you are employing a nominal scale.  The data that result from nominal scales are categorical data–that is, data resulting from categories which are mutually exclusive of each other.  The respondent is either male or female, not both.

An ordinal scale orders something; it puts the thing being measured in order–high to low, or low to high.  Salkind gives the example of ranking candidates for a job.  Extension professionals (and many, if not most, survey professionals) use ordinal scales in surveys (strongly agree to strongly disagree; don’t like to like a lot).  We do not know how much difference there is between “don’t like” and “like a lot”.  The data that result from ordinal scales are categorical data.

An interval scale is based on a continuum of equally spaced intervals along that continuum. Think of a thermometer or a test score.  We know that the intervals along the scale are equal to one another.  The data that result from interval scales are numerical data.

A ratio scale is a scale with an absolute zero–a point where the characteristic of interest is absent, like zero light or no molecular movement.  This rarely happens in social or behavioral science, the work that most Extension professionals do.  The data that result from ratio scales are numerical data.

Why do we care?

  • Scales are ordered from the least precise (nominal) to the most precise (ratio).
  • The scale used determines the detail provided by the data collected; more precision means more information.
  • A more precise scale contains all the qualities of the less precise scales (interval has the qualities of both ordinal and nominal).

Using an inappropriate scale will invalidate your data and provide you with spurious outcomes which yield spurious impacts.
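One practical consequence of that ordering: each scale supports only certain summary statistics, and each scale inherits what the less precise scales allow.  A rough sketch of the usual rule of thumb:

```python
# The usual rule of thumb for which summary statistics each scale of
# measurement supports, from least to most precise. Each scale inherits
# everything the less precise scales allow.
SCALES = ["nominal", "ordinal", "interval", "ratio"]
NEW_AT = {
    "nominal": ["mode"],                          # you can only count categories
    "ordinal": ["median"],                        # order lets you find a middle
    "interval": ["mean", "standard deviation"],   # equal intervals allow arithmetic
    "ratio": ["ratios of values"],                # absolute zero allows "twice as much"
}

def allowed_statistics(scale):
    """Statistics appropriate for a scale, accumulated from less precise scales."""
    stats = []
    for s in SCALES[: SCALES.index(scale) + 1]:
        stats.extend(NEW_AT[s])
    return stats

print(allowed_statistics("ordinal"))  # ['mode', 'median']
```

Taking the mean of ordinal survey ratings, for example, quietly treats a categorical scale as if it were interval, which is one way an inappropriate scale produces those spurious outcomes.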

A colleague of mine, trying to explain observation to a student, said, “Count the number of legs you see on the playground and divide by two. You have observed the number of students on the playground.” That is certainly one way to look at the topic.

I’d like to be a bit more precise than that, though.  Observation is collecting information through the use of the senses–seeing, hearing, tasting, smelling, feeling.  To gather observations, the evaluator must have a clearly specified protocol–a step-by-step approach to what data are to be collected and how. The evaluator typically gets the first exposure to collecting information by observation at a very young age–learning to talk (hearing); learning to feed oneself (feeling); I’m sure you can think of other examples.  When the evaluator starts school and studies science, and the teacher asks the student to “OBSERVE” a phenomenon and record what is seen, the evaluator is exposed to another approach to the method of observation.

As the process becomes more sophisticated, all manner of instruments may assist the evaluator–thermometers, chronometers, GIS, etc. And for the process to be replicable (for reliability), the steps become more and more precise.

Does that mean that looking at the playground, counting the legs, and dividing by two has no place? Those who decry data manipulation would agree that this form of observation yields information of questionable usefulness.  Those who approach observation as an unstructured activity would disagree, saying that exploratory observation can result in an emerging premise.

You will see observation as the basis for ethnographic inquiry.  David Fetterman has a small volume (Ethnography: Step by Step), published by Sage, that explains how ethnography is used in field work.  Take simple ethnography a step up and one can read about meta-ethnography by George W. Noblit and R. Dwight Hare. I think my anthropology friends would say that observation is a tool used extensively by anthropologists. It is a tool that can be used by evaluators as well.

Extension has consistently used the survey as a method for collecting information.

A survey collects information through structured questionnaires, resulting in quantitative data.  Don Dillman wrote the book on the method, Internet, Mail and Mixed-Mode Surveys: The Tailored Design Method.  Although mail surveys and individual interviews were once the norm, internet survey software has changed that.

Other methods are often more expedient, less costly, and less resource intensive than a survey. When you need to collect information, consider some of these other ways:

  • Case study
  • Interviews
  • Observation
  • Group Assessment
  • Expert or peer review
  • Portfolio reviews
  • Testimonials
  • Tests
  • Photographs, slides, videos
  • Diaries, journals
  • Logs
  • Document analysis
  • Simulations
  • Stories
  • Unobtrusive measures

I’ll talk about these in later posts and provide resources for each of these.

When deciding what information collection method (or methods) to use, remember there are three primary sources of evaluation information. Those sources often dictate the methods of information collection. The three sources are:

  1. Existing information
  2. People
  3. Pictorial records and observation

When using existing information, what is important is developing a systematic approach to LOOKING at the information source.

When gathering information from people, ASKING them is the approach to use–and how that asking is structured matters.

When using pictorial records and observations, determine what you are looking for before you collect information.

I know it is Monday, not Tuesday or Wednesday. I will not have internet access Tuesday or Wednesday, and I wanted to answer a question posed to me by a colleague and long-time friend who has just begun her evaluation career.

Her question is:

What are the best methods to collect outcome evaluation data?

Good question.

The answer:  It all depends.

On what does the collection depend?

  • Your question.
  • Your use.
  • Your resources.

If your resources are endless (yeah, right…), then you can hire people, use all the time you need, and collect a wealth of data. Most folks aren’t this lucky.

If you plan to use your findings to convince someone, you need to think about what will be most convincing. Legislators like the STORY that tugs at the heart strings.

Administrators like “Just the FACTS, ma’am,” typically presented in a one-page format with bullets.

Program developers may want a little of both.

The question you want answered will determine how you collect the answer.

My friend, Ellen Taylor-Powell, at the University of Wisconsin Extension Service has developed a handout of data collection methods (see: Methods for Collecting Information).  The handout is in PDF form and can be downloaded. It is a comprehensive list of different data collection methods that can be adapted to answer your question within your available resources.

She also has a companion handout called Sources of Evaluation Information. I like this handout because it is clear and straightforward. I have found both very useful in the work I do.

Whole books have been written on individual methods. I can recommend some I like–let me know.