Oct
12
Filed Under (Data Analysis, Methodology, program evaluation) by englem on 12-10-2012

The topic of survey development seems to be  popping up everywhere–AEA365, Kirkpatrick Partners, eXtension Evaluation Community of Practice, among others.  Because survey development is so important to Extension faculty, I’m providing links and summaries.

 

 AEA365 says:

“… it is critical that you pre-test it with a small sample first.”  Real-time testing helps eliminate confusion, improves clarity, and ensures that you are asking a question that will give you an answer to what you want to know.  This is so important today when many surveys are electronic.

It is also important to “Train your data collection staff…Data collection staff are the front line in the research process.”  Since they are the people who will be collecting the data, they need to understand the protocols, the rationales, and the purposes of the survey.

Kirkpatrick Partners say:

“Survey questions are frequently impossible to answer accurately because they actually ask more than one question.”  This is the biggest problem in constructing survey questions.  They provide some examples of asking more than one question.

 

Michael W. Duttweiler, Assistant Director for Program Development and Accountability at Cornell Cooperative Extension stresses the four phases of survey construction:

  1. Developing a Precise Evaluation Purpose Statement and Evaluation Questions
  2. Identifying and Refining Survey Questions
  3. Applying Golden Rules for Instrument Design
  4. Testing, Monitoring and Revising

He then indicates that the next three blog posts will cover points 2, 3, and 4.

Probably my favorite recent post on surveys was one that Jane Davidson did back in August 2012 about survey response scales.  Her “boxers or briefs” example captures so many issues related to survey development.

Writing survey questions that give you usable data that answer your questions about your program is a challenge; it is not impossible.  Dillman wrote the book about surveys; it should be on your desk.

Here is the Dillman citation:
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009).  Internet, mail, and mixed-mode surveys: The tailored design method.  Hoboken, NJ: John Wiley & Sons, Inc.

Sep
28
Filed Under (Data Analysis, program evaluation) by englem on 28-09-2012

What is the difference between need to know and nice to know?  How does this affect evaluation?  I got a post this week on a blog I follow (Kirkpatrick) that asks how much data a trainer really needs.  (Remember that Don Kirkpatrick developed and established an evaluation model for professional training back in 1954 that still holds today.)

Most Extension faculty don’t do training programs per se, although there are training elements in Extension programs.  Extension faculty are typically looking for program impacts in their program evaluations.  Program improvement evaluations, although necessary, are not sufficient.  Yes, they provide important information to the program planner; they don’t necessarily give you information about how effective your program has been (i.e., outcome information). (You will note that I will use the term “impacts” interchangeably with “outcomes” because most Extension faculty parrot the language of reporting impacts.)

OK.  So how much data do you really need?  How do you determine what is nice to have and what is necessary (need) to have?  How do you know?

  1. Look at your logic model.  Do you have questions that reflect what you expect to have happen as a result of your program?
  2. Review your goals.  Review your stated goals, not the goals you think will happen because you “know you have a good program”.
  3. Ask yourself, How will I USE these data?  If the data will not be used to defend your program, you don’t need it.
  4. Does the question describe your target audience?  Although not demonstrating impact, knowing what your target audience looks like is important.  Journal articles and professional presentations want to know this.
  5. Finally, ask yourself, Do I really need to know the answer to this question or will it burden the participant?  If it is a burden, your participants will tend to not answer, and then you have a low response rate; not something you want.

Kirkpatrick also advises avoiding redundant questions.  That means questions asked in a number of ways that give you the same answer, such as questions written in both positive and negative forms.  One question that I always include, because it gives me a way to determine how my program is making a difference, is a question on intention that includes a time frame.  For example, “In the next six months do you intend to try any of the skills you learned today?  If so, which one?”  Mazmanian has identified that the best predictor of behavior change (a measure of making a difference) is stated intention to change.  Telling someone else makes the participant accountable.  That seems to make the difference.

 

Reference:

Mazmanian, P. E., Daffron, S. R., Johnson, R. E., Davis, D. A., & Kantrowitz, M. P. (1998).  Information about barriers to planned change: A randomized controlled trial involving continuing medical education lectures and commitment to change.  Academic Medicine, 73(8), 882-886.

 

P.S.  No blog next week; away on business.

 

 

 

Sep
21
Filed Under (Data Analysis, program evaluation) by englem on 21-09-2012

Quantitative data analysis is typically what happens to data that are numbers (although qualitative data can be reduced to numbers, I’m talking here about data that start as numbers).  Recently, a library colleague sent me an article that was relevant to what evaluators often do–analyze numbers.

So why, you ask, am I talking about an article that is directed to librarians?  Although that article is directed at librarians, it has relevance to Extension.  Extension faculty (like librarians), more often than not, use surveys to determine the effectiveness of their programs.  Extension faculty are always looking to present the most powerful survey conclusions (yes, I lifted that from the article title), and no, you don’t need to have a doctorate in statistics to understand these analyses.  The other good thing about this article is that it provides you with a link to an online survey-specific calculator (Raosoft’s calculator at http://www.raosoft.com/samplesize.html).

This article refers specifically to three metrics that are often overlooked by Extension faculty:  margin of error (MoE), confidence level (CL), and cross-tabulation analysis.   These are three statistics which will help you in your work. The article also does a nice job of listing the eight recommended best practices which I’ve appended here with only some of the explanatory text.

 

Complete List of Best Practices for Analyzing Multiple Choice Surveys

1. Inferential statistical tests. To be more certain of the conclusions drawn from survey data, use inferential statistical tests.

2. Confidence Level (CL). Choose your desired confidence level (typically 90%, 95%, or 99%) based upon the purpose of your survey and how confident you need to be of the results. Once chosen, don’t change it unless the purpose of your survey changes. Because the chosen confidence level is part of the formula that determines the margin of error, it’s also important to document the CL in your report or article where you document the margin of error (MoE).

3. Estimate your ideal sample size before you survey. Before you conduct your survey use a sample size calculator specifically designed for surveys to determine how many responses you will need to meet your desired confidence level with your hypothetical (ideal) margin of error (usually 5%).

4. Determine your actual margin of error after you survey. Use a margin of error calculator specifically designed for surveys (you can use the same Raosoft online calculator recommended above).

5. Use your real margin of error to validate your survey conclusions for your larger population.

6. Apply the chi-square test to your crosstab tables to see if there are relationships among the variables that are not likely to have occurred by chance.

7. Reading and reporting chi-square tests of cross-tab tables.

  • Use the .05 threshold for your chi-square p-value results in cross-tab table analysis.
  • If the chi-square p-value is larger than the threshold value, no relationship between the variables is detected. If the p-value is smaller than the threshold value, there is a statistically valid relationship present, but you need to look more closely to determine what that relationship is. Chi-square tests do not indicate the strength or the cause of the relationship.
  • Always report the p-value somewhere close to the conclusion it supports (in parentheses after the conclusion statement, or in a footnote, or in the caption of the table or graph).

8. Document any known sources of bias or error in your sampling methodology and in your survey design in your report, including but not limited to how your survey sample was obtained.
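
For readers who like to see the arithmetic, here is a minimal sketch of practices 3, 4, and 6 in Python.  It is my own illustration, not code from the article, and it assumes the standard normal-approximation formulas with a finite population correction; I am assuming, not confirming, that online calculators of this kind work the same way.  The population of 20,000, the crosstab counts, and all of the function names are hypothetical, and the chi-square helper handles only a 2x2 table without a continuity correction.

```python
import math
from statistics import NormalDist


def z_score(confidence_level=0.95):
    """Two-sided critical value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)


def sample_size(population, margin_of_error=0.05, confidence_level=0.95, p=0.5):
    """Responses needed; p=0.5 is the most conservative assumed proportion."""
    z = z_score(confidence_level)
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2  # infinite-population size
    if population is None:
        return math.ceil(n0)
    # The finite population correction shrinks n for smaller populations.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def margin_of_error(responses, population, confidence_level=0.95, p=0.5):
    """Actual margin of error given the number of responses received."""
    z = z_score(confidence_level)
    fpc = math.sqrt((population - responses) / (population - 1))
    return z * math.sqrt(p * (1 - p) / responses) * fpc


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 crosstab."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail p-value is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical numbers: a population of 20,000 and a made-up 2x2 crosstab.
needed = sample_size(population=20000)
actual_moe = margin_of_error(responses=needed, population=20000)
statistic, p_value = chi_square_2x2([[30, 10], [20, 40]])
print(needed, round(actual_moe, 4), round(statistic, 2), p_value < 0.05)
```

For tables larger than 2x2 the p-value calculation changes, so a statistics package is the safer route there; the sketch is only meant to show what the calculators are doing behind the scenes.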

 

Bottom line:  read the article.

Hightower, C. & Kelly, S. (2012, Spring).  Infer more, describe less: More powerful survey conclusions through easy inferential tests.  Issues in Science and Technology Librarianship. DOI:10.5062/F45H7D64. [Online]. Available at: http://www.istl.org/12-spring/article1.html

Aug
14

Evaluation costs:  A few weeks ago, I posted a summary about evaluation costs. A recent AEA LinkedIn discussion was on the same topic (see this link).  If you have not linked to other evaluators, there are other groups besides AEA that have LinkedIn groups.  You might want to join one that is relevant.

New topic:  The video on surveys posted last week generated a flurry of comments (though not on this blog).  I think it is probably appropriate to revisit the topic of surveys.  As I decided to revisit this topic,  an AEA 365 post from the Wilder Research group talked about data coding related to longitudinal data.

Now, many surveys, especially Extension surveys, focus on cross sectional data not on longitudinal data.  They may, however, involve a large number of participants and the hot tips that are provided apply to coding surveys.  Whether the surveys Extension professionals develop involve 30, 300, or 3000 participants, these tips are important especially if the participants are divided into groups on some variable.  Although the hot tips in the Wilder post talk about coding, not surveys specifically, they are relevant to surveys and I’m repeating them here.   (I’ve also adapted the original tip to Extension use).

  • Anticipate different groups.  If you do this ahead of time, and write it down in a data dictionary or coding guide, your coding will be easier.  If the raw data are dropped, or for some other reason scrambled (like a flood, hurricane, or a sleepy night), you will be able to make sense out of the data quicker.
  • Sometimes there is preexisting identifying information (like the location of the program) that has a logical code.  Use that code.
  • Precoding by the location sites helps keep the raw data organized and enables coding.
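
The tips above can be sketched as a small coding guide kept in code.  This is only an illustration in Python under my own assumptions: the site names, group labels, numeric codes, and the missing-value code of -9 are all hypothetical, and your own data dictionary would use whatever codes already make sense for your program.

```python
# Hypothetical coding guide: every name and code below is illustrative only.
CODING_GUIDE = {
    "site": {"County A": 1, "County B": 2, "County C": 3},
    "group": {"first-time participant": 1, "returning participant": 2},
}
MISSING = -9  # assumed code for blank or unrecognized answers


def code_response(raw):
    """Translate one raw survey record into numeric codes."""
    coded = {}
    for variable, codes in CODING_GUIDE.items():
        value = (raw.get(variable) or "").strip()
        coded[variable] = codes.get(value, MISSING)
    return coded


print(code_response({"site": "County B", "group": "returning participant"}))
```

Because the guide is written down in one place, anyone who picks up the raw data later can rebuild the coded file the same way.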

Over the rest of the year, I’ll be revisiting surveys on a regular basis.  Surveys are often used by Extension.  Developing a survey that provides you with information you want, can use, and makes sense is a useful goal.

New topic:  I’m thinking of varying the format of the blog or offering alternative formats with evaluation information.  I’m curious as to what would help you do your work better.  Below are a few options.  Let me know what you’d like.

  • Videos in blogs
  • Short concise (i.e., 10-15 minute) webinars
  • Guest writers/speakers/etc.
  • Other ideas
Aug
09
Filed Under (Methodology, program evaluation) by englem on 09-08-2012

A few weeks ago I mentioned that a colleague of mine shared with me some insights she had about survey development.  She had an Aha! moment.   We had a conversation about that Aha! moment and videotaped the conversation.  To see the video, click here.

 

In thinking about what Linda learned, I realized that Aha! Moments could be a continuing series…so watch for more.

Let me know what you think.  Feedback is always welcome.

Oh–I want to remind you about an excellent resource for surveys.  Dillman’s current book, Internet, mail, and mixed-mode surveys:  The tailored design method.  It is a Wiley publication by Don A. Dillman, Jolene D. Smyth, and Leah Melani Christian.  Needs to be on your desk if you do any kind of survey work.

Jun
20
Filed Under (Data Analysis, Methodology, program evaluation) by englem on 20-06-2012

I started this post back in April.  I had an idea that needed to be remembered…it had to do with the unit of analysis, a question which often occurs in evaluation.  To increase sample size and, therefore, power, evaluators often choose to run analyses on the larger number when the aggregate, i.e., the smaller number, is probably the “true” unit of analysis.  Let me give you an example.

A program is randomly assigned to fifth grade classrooms in three different schools.  School A has three classrooms; school B has two classrooms; and school C has one classroom.  All together, there are approximately 180 students, six classrooms, three schools.  What is the appropriate unit of analysis?  Many people use students, because of the sample size issue.  Some people will use classroom because each got a different treatment.  Occasionally, some evaluators will use schools because that is the unit of randomization.  This issue elicits much discussion.  Some folks say that because students are in the school, they are really the unit of analysis because they are embedded in the randomization unit.  Some folks say that students are the best unit of analysis because there are more of them.  That certainly is the convention.  What you need to decide is what the unit is, and be able to defend that choice.  Even though I would lose power, I think I would go with the unit of randomization.  Which leads me to my next point–truth.
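
To make the trade-off concrete, here is a minimal Python sketch of the same data summarized at each candidate unit of analysis.  The scores are made up, I have assumed exactly 30 students per classroom to reach 180, and the classroom labels and the aggregate helper are hypothetical names for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical layout matching the example: 3 schools, 6 classrooms.
classrooms = {"A": ["A1", "A2", "A3"], "B": ["B1", "B2"], "C": ["C1"]}

# Hypothetical records: (school, classroom, student score).
records = []
for school, rooms in classrooms.items():
    for room_index, room in enumerate(rooms):
        for student in range(30):  # assumed 30 students per classroom
            score = 60 + 5 * room_index + student % 10  # made-up scores
            records.append((school, room, score))


def aggregate(records, key_index):
    """Average the scores within each school (0) or classroom (1)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: mean(scores) for key, scores in groups.items()}


student_n = len(records)
classroom_means = aggregate(records, 1)
school_means = aggregate(records, 0)
print(student_n, len(classroom_means), len(school_means))
```

The same data give you an n of 180, 6, or 3 depending on the unit you choose, which is exactly why the choice has to be made on purpose and defended.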

At the end of the first paragraph, I use the word “true” in quotation marks. The Kirkpatricks in their most recent blog opened with a quote from the US CIA headquarters in Langley, Virginia, “And ye shall know the truth, and the truth shall make you free”.   (We won’t talk about the fiction in the official discourse, today…)   (Don Kirkpatrick developed the four levels of evaluation specifically in the training and development field.)  Jim Kirkpatrick, Don’s son, posits that, “Applied to training evaluation, this statement means that the focus should be on discovering and uncovering the truth along the four levels path.”  I will argue that the truth is how you (the principal investigator, program director, etc.) see the answer to the question.  Is that truth with an upper case “T” or is that truth with a lower case “t”?  What do you want it to mean?

Like history (history is what is written, usually by the winners, not what happened), truth becomes what do you want the answer to mean.  Jim Kirkpatrick offers an addendum (also from the CIA), that of “actionable intelligence”.  He goes on to say that, “Asking the right questions will provide data that gives (sic) us information we need (intelligent) upon which we can make good decisions (actionable).”  I agree that asking the right question is important–probably the foundation on which an evaluation is based.  Making “good decisions”  is in the eyes of the beholder–what do you want it to mean.

Mar
20
Filed Under (program evaluation) by englem on 20-03-2012

An important question that evaluators ask is, “What difference is this program making?”  Followed quickly with, “How do you know?”

Recently, I happened on a blog called {grow} and the author, Mark Schaefer,  had a post called, “Did this blog make a difference?”  Since this is a question as an evaluator I am always asking, I jumped on the page.  Mr. Schaefer is in marketing and as a marketing expert he says the following, “You’re in marketing for one reason: Grow. Grow your company, reputation, customers, impact, profits. Grow yourself. This is a community that will help. It will stretch your mind, connect you to fascinating people, and provide some fun along the way.”  So I wondered how relevant this blog would be to me and other evaluators whether they blogged or not.

Mr. Schaefer is taking stock of his blog–a good thing to do for a blog that has been posted for a while.  So although he lists four innovations, he asks the reader to “…be the judge if it made a difference in your life, your outlook, and your business.”  The four innovations are

  1. Paid contributing columnists.  He actually paid the folks who contributed to his blog; not something those of us in Extension can do.
  2. {growtoons}. Cartoons designed specifically for the blog that “…adds an element of fun and unique social media commentary.”  Hmmm…
  3. New perspectives. He showcased fresh deserving voices; some that he agreed with and some that he did not.  A possibility.
  4. Video. He did many video blogs and that gave him the opportunity to “…shine the light on some incredible people…”  He interviews folks and posts the short video.  Yet another possibility.

His approach seems really different to what I do.  Maybe it is the content; maybe it is the cohort; maybe it is something else.  Maybe there is something to be learned from what he does.  Maybe this blog is making a difference.  Only I don’t know.  So, I take a clue from Mr. Schaefer and ask you to judge if it has made a difference in what you do–then let me know.  I’ve embedded a link to a quick survey that will NOT link to you nor in any way identify you.  I will only be using the findings for program improvement.  Please let me know.  Click here to link to the survey.

 

Oh, and I won’t be posting next week–spring break and I’ll be gone.

 

Mar
02
Filed Under (program evaluation) by englem on 02-03-2012

Last weekend, I was in Florida visiting my daughter at Eckerd College.  The College was offering an Environmental Film Festival and I had the good fortune to see Green Fire, a film about Aldo Leopold and the land ethic.   I had seen it at OSU and was impressed because it was not all doom and gloom; rather it celebrated Aldo Leopold as one of the three leading early conservationists (the other two are John Muir and Henry David Thoreau).  Dr. Curt Meine, who narrates the film and is a conservation biologist, was leading the discussion again; I had heard him at OSU.  Arriving at the showing early, I was able to chat with him about the film and its effects.  I asked him how he knew he was being effective.  His response was to tell me about the new memberships in the Foundation, the number of showings, and the size of the audience seeing the film.  Appropriate responses for my question.  What I really wanted to know was how did he know he was making a difference.  That is a different question; one which talks about change.  Change is what programs like Green Fire are all about.  It is what Aldo Leopold was all about (read Sand County Almanac to understand Leopold’s position).

 

Change is what evaluation is all about.  But did I ask the right question?  How could I have phrased it differently to get at what change had occurred in the viewers of the film?  Did new memberships in the Foundation demonstrate change?  Knowing what question to ask is important for program planners as well as evaluators.  There are often multiple levels of questions that could be asked–individual, programmatic, organizational, regional, national, global.  Are they all equally important?  Do they provide a means for gathering pertinent data?  How are you going to use these data once you’ve gathered them?  How carefully do you think about the questions you ask when you craft your logic model?  When you draft a survey?  When you construct questions for focus groups?  Asking the right question will yield relevant answers.  It will show you what difference you’ve made in the lives of your target audience.

 

Oh, and if you haven’t seen the film, Green Fire, or read the book, Sand County Almanac–I highly recommend them.

Aug
24
Filed Under (program evaluation) by englem on 24-08-2011

 

I started this post the third week in July.  Technical difficulties prevented me from completing the post.  Hopefully, those difficulties are now in the past.

A colleague asked me what can we do when we can’t measure actual behavior change in our evaluations.  Most evaluations can capture knowledge change (short term outcomes); some evaluations can capture behavior change (intermediate or medium term outcomes); very few can capture condition change (long term outcomes, often called impacts–though not by me).  I thought about that.  Intention to change behavior can be measured.  Confidence (self-efficacy) to change behavior can be measured.  For me, all evaluations need to address those two points.

Paul Mazmanian, Associate Dean for Continuing Professional Development and Evaluation Studies at Virginia Commonwealth University, has studied changing practice patterns for several years.  One study, conducted in 1998, reported that “…physicians in both study and control groups were significantly more likely to change (47% vs. 7%, p < .001) if they indicated intent to change immediately following the lecture” (Academic Medicine. 1998; 73:882-886).   Mazmanian and his co-authors say in their conclusions that “successful change in practice may depend less on clinical and barriers information than on other factors that influence physicians’ performance.  To further develop the commitment-to-change strategy in measuring effects of planned change, it is important to isolate and learn the powers of individual components of the strategy as well as their collective influence on physicians’ clinical behavior.”

 

What are the implications for Extension and other complex organizations?   It makes sense to extrapolate from this information in the continuing medical education literature.  Physicians are adults; most of Extension’s audience are adults.  If behavior change is highly predictable from intention to change stated “immediately following the lecture” (i.e., continuing education program), then stated intention to change solicited from participants in Extension programs immediately following the program delivery should likewise signal a greater likelihood of behavior change.  One of the outcomes Extension wants to see is change in behavior (medium term outcomes).  Measuring those behavior changes directly (through observation, or some other method) is often outside the resources available.  Measuring those intended behavior changes is within the scope of Extension resources.  Using a time frame (such as 6 months) helps bound the anticipated behavior change.  In addition, intention to change can be coupled with confidence to implement the behavior change to provide the evaluator with information about the effect of the program.  The desired effect is high confidence to change and willingness to implement the change within the specified time frame.  If Extension professionals find that result, then it would be safe to say that the program is successful.
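
Coupling intention with confidence can be tallied in a few lines.  This is a minimal Python sketch under my own assumptions, not a procedure from the Mazmanian studies: the responses are made up, the field names are hypothetical, confidence is assumed to be on a 1 to 5 scale, and treating 4 or higher as “high confidence” is an arbitrary cutoff you would set yourself.

```python
# Hypothetical end-of-program responses; all field names are illustrative.
responses = [
    {"intends_change": True, "confidence": 5},
    {"intends_change": True, "confidence": 4},
    {"intends_change": True, "confidence": 2},
    {"intends_change": False, "confidence": 3},
    {"intends_change": False, "confidence": 5},
]


def summarize(responses, high_confidence=4):
    """Percent who intend to change, and percent who intend AND feel confident."""
    total = len(responses)
    intend = [r for r in responses if r["intends_change"]]
    ready = [r for r in intend if r["confidence"] >= high_confidence]
    return {
        "n": total,
        "pct_intend": 100 * len(intend) / total,
        "pct_intend_and_confident": 100 * len(ready) / total,
    }


summary = summarize(responses)
print(summary)
```

The second percentage is the one that speaks to the desired effect described above: participants who both intend to change within the time frame and feel confident that they can.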

REFERENCES

1.  Mazmanian, P. E., Daffron, S. R., Johnson, R. E., Davis, D. A., & Kantrowitz, M. P.  (1998).  Information about barriers to planned change:  A randomized controlled trial involving continuing medical education lectures and commitment to change.  Academic Medicine, 73 (8), 882-886.

2.  Mazmanian, P. E. & Mazmanian, P. M.  (1999).  Commitment to change: Theoretical foundations, methods, and outcomes.  The Journal of Continuing Education in the Health Professions, 19, 200 – 207.

3.  Mazmanian, P. E., Johnson, R. E., Zhang, A., Boothby, J., & Yeatts, E. J. (2001).  Effects of a signature on rates of change: A randomized controlled trial involving continuing medical education and the commitment-to-change model.  Academic Medicine, 76 (6), 642-646.

 

May
12
Filed Under (Methodology, program evaluation) by englem on 12-05-2011

We recently held Professional Development Days for the Division of Outreach and Engagement.  This is an annual opportunity for faculty and staff in the Division to build capacity in a variety of topics.  The question this training posed was evaluative:

How do we provide meaningful feedback?

Evaluating a conference or a multi-day, multi-session training is no easy task.  Gathering meaningful data is a challenge.  What can you do?  Before you hold the conference (I’m using the word conference to mean any multi-day, multi-session training), decide on the following:

  • Are you going to evaluate the conference?
  • What is the focus of the evaluation?
  • How are you going to use the results?

The answer to the first question is easy:  YES.  If the conference is an annual event (or a regular event), you will want to have participants’ feedback on their experience, so, yes, you will evaluate the conference. Look at Penn State Tip Sheet 16 for some suggestions.  (If this is a one-time event, you may not; though as an evaluator, I wouldn’t recommend ignoring evaluation.)

The second question is more critical.  I’ve mentioned in previous blogs the need to prioritize your evaluation.  Evaluating a conference can be all consuming and result in useless data UNLESS the evaluation is FOCUSED.  Sit down with the planners and ask them what they expect to happen as a result of the conference.  Ask them if there is one particular aspect of the conference that is new this year.  Ask them if feedback in previous years has given them any ideas about what is important to evaluate this year.

This year, the planners wanted to provide specific feedback to the instructors.  The instructors had asked for feedback in previous years.  This is problematic if planning evaluative activities for individual sessions is not done before the conference.  Nancy Ellen Kiernan, a colleague at Penn State, suggests a qualitative approach called a Listening Post.  This approach will elicit feedback from participants at the time of the conference.  This method involves volunteers who attended the sessions and may take more persons than a survey.  To use the Listening Post, you must plan ahead of time to gather these data.  Otherwise, you will need to do a survey after the conference is over and this raises other problems.

The third question is also very important.  If the results are just given to the supervisor, the likelihood of them being used by individuals for session improvement or by organizers for overall change is slim.  Making the data usable for instructors means summarizing the data in a meaningful way, often visually.  There are several ways to visually present survey data including graphs, tables, or charts.  More on that another time.  Words often get lost, especially if words dominate the report.

There is a lot of information in the training and development literature that might also be helpful.  Kirkpatrick has done a lot of work in this area.  I’ve mentioned their work in previous blogs.

There is no one best way to gather feedback from conference participants.  My advice:  KISS–keep it simple and straightforward.