Dec
02
Filed Under (criteria, program evaluation) by englem on 02-12-2013

I was reminded about the age of this blog (see comment below).  Then it occurred to me:  I’ve been writing this blog since December 2009.  That is four years of almost weekly posts.  And even though evaluation is my primary focus, I occasionally get on my soapbox and do something different (White Christmas Pie, anyone?).  My other passion besides evaluation is food and cooking.  I gave a latke party on Saturday and the food was pretty–and it even tasted good.  I was more impressed by the visual appeal of my table; my guests were more impressed by the array of tastes, flavors, and textures.  I’d say the evening was a success.  This blog is a metaphor for that table.  Sometimes I’m impressed with the visual appeal; sometimes I’m impressed with the content.  Today is an anniversary.  Four years.  I find that amazing (visual appeal).  The quote below, a comment a reader offered on “Is this blog making a difference?” (a post from a long time ago), is about content.

“Judging just from the age of your blog I must speculate that you’ve done something right. If not then I doubt you’d still be writing regularly. Evaluation of your progress is important but pales in comparison to the importance of writing fresh new content on a regular basis. Content that can be found no place else is what makes a blog truly useful and indeed helps it make a difference.”

Audit or evaluation?

I’m an evaluator; I want to know what difference the “program” is making in the lives of the participants.  The local school district where I live, work, and send my children to school has provided middle school children with iPads.  They want to “audit” their use.  I commend the school district for that initiative (both giving the iPads and wanting to determine their effectiveness).  I wonder if they really want to know what difference the electronics are making in the lives of the students.  I guess I need to go re-read Tom Schwandt’s 1988 book, “Linking Auditing and Metaevaluation,” which he wrote with Ed Halpern, as well as see what has happened in the last 25 years (and it is NOT that I do not have anything else to read).  I think it is important to note the sentence (taken from the foreword), “Nontraditional studies are found not only in education, but also in…diverse fields…” (and the list they provide is a who’s who in social science).  The problem of such studies is “establishing their merit.”  That is always a problem with evaluation–establishing the merit, worth, value of a program (study).

We could spend a lot of time debating the merit, worth, value of using electronics in the pursuit of learning.  (In fact, Jeffrey Selingo writes about the need to personalize instruction using electronics in his 2013 book “College (Un)bound”–very readable, recommended.)  I do not think counting the number of apps or the number of page views is going to answer the question posed.  I do not think counting the number of iPads returned in working condition will either.  This is an interesting experiment.  How, reader, would you evaluate the merit, worth, value of giving iPads to middle school children?  All ideas are welcome–let me know, because I do not have an answer, only an idea.

Nov
21
Filed Under (criteria, program evaluation) by englem on 21-11-2013

For the first time in my lifetime, the first day of Hanukkah is also Thanksgiving.  The pundits are sagely calling the event Thanksgivukkah.  According to this referenced source, the overlap will not happen again for over 70,000 years.  However, according to another source, it could happen again in 2070 and 2165.  Although I do not think I’ll be around in 2070, my children could be (they are 17 and 20 as of this writing).  I find this phenomenon really interesting–Thanksgiving usually starts the US holiday season and Hanukkah falls later, during Advent.  Not so this year.  I wonder how people combine latkes and Thanksgiving (even without the turkey).  Loaded latkes?  Thanksgivukkah latkes?  (My appreciation to Kia.)

So I’m sure you are wondering, HOW EXACTLY DOES THIS RELATE TO EVALUATION?

I decided that it was time to revisit my blog title, Evaluation is an Everyday Activity.  Every day you evaluate something.  Although you do not necessarily articulate out loud the criteria against which you are determining merit, worth, and value, you have those criteria.  I have them for latkes AND Thanksgiving.  Our latkes must be crispy and made of winter vegetables, potatoes included.  This allows me to use a variety of winter vegetables I may have gotten in my CSA.  (Beet latkes?  Sweet potato latkes?  Celeriac latkes?  You bet!)  Our Thanksgiving is to have foods for which we are truly thankful.  That allows us to think about gratitude.  Each year our menu is different because each year we are thankful for different things.  (I must confess, however, we always have pie–pumpkin, which I make from home-grown pumpkin/squash, and chocolate pecan, which is an original old family recipe.)  One year when we put all the food on the table, all the food was green.  We didn’t plan it that way; it just happened because those were the foods for which we were thankful.  This year, we will have mashed potatoes (by the Queen of mashed potatoes), Celebration Filo, in both the gluten-free (made with rice wrappers and no onion, garlic, or dairy) and glutened versions (the version we renamed; it is in the link above), and something else that will probably be green.  This year I’m thankful for my gluten-free, dairy-free friend who will join us for Thanksgiving, and I’m working up alternatives to accommodate her and still satisfy the rest of us.

So you see, even when I’m thinking about Thanksgiving, latkes, and gratitude, I’m thinking about evaluation.  What merit does the “program” have?  What is its worth?  What is its value?  Those are all evaluative questions that apply to Thanksgiving (and latkes and gratitude).

So you see, Evaluation is an Everyday Activity.

I won’t be blogging next week.  Enjoy.  Be grateful.

Sep
17

I know–how does this relate to evaluation?  Although I think it is obvious, perhaps it isn’t.

I’ll start with a little background.  In 1994, M. Scott Peck published A World Waiting To Be Born: Civility Rediscovered.  In that book he defined a problem (and there are many) facing the then 20th-century person (I think it applies to the 21st-century person as well).  That problem was incivility, or the “…morally destructive patterns of self-absorption, callousness, manipulativeness, and materialism so ingrained in our routine behavior that we do not even recognize them.”  He wrote this in 1994–well before the advent of the technology that has enabled humans to disconnect from fellow humans while being connected.  Look about you and count the folks with smart phones.  Now, I’ll be the first to agree that technology has enabled a myriad of activities that 20 years ago (when Peck was writing this book) were not even conceived of by ordinary folks.  Then technology took off…and as a result, civility, community, and, yes, even compassion went by the way.

Self-absorption, callousness, manipulativeness, and materialism are all characteristics not only of a lack of civility (as Peck writes) but also of a loss of community and a lack of compassion.  If those three (civility, community, compassion) are lost–where is there comfort?  Seems to me that these three are interrelated.

To expand–how many times have you used your smart phone to text someone across the room?  (Was it so important you couldn’t wait until you could talk to him/her in person–face-to-face?)  How often have you thought to yourself how awful an event was and didn’t bother to tell the other person?  How often did you say the good word?  The right thing?  That is evaluation–in the everyday sense.  Those of us who call ourselves evaluators are only slightly different from those of you who don’t.  Although evaluators do evaluation for a living, everyone does it, because evaluation is part of what gets us all through the day.

Ask yourself, as an evaluative task–was I nice or was I mean?  That reflects civility, compassion, and even community–even very young children know the difference.  Civility and compassion can be taught to kindergarteners–ask the next five-year-old you see: was it nice or was it mean?  They will tell you.  They don’t lie.  Lying is a learned behavior–that, too, is evaluative.

You can ask yourself guiding questions about community, about compassion, about comfort.  They are all evaluative questions because you are trying to determine if you have made a difference.  You CAN be the change you want to see in the world; you can be the change you want to be.  That, too, is evaluative.  Civility.  Compassion.  Community.  Comfort.

You implement a program.  You think it is effective; that it makes a difference; that it has merit and worth.  You develop a survey to determine the merit and worth of the program.  You send the survey out to the target audience, which is an intact population–that is, all of the participants are in the target audience for the survey.  You get less than a 40% response rate.  What does that mean?  Can you use the results to say that the participants saw merit in the program?  Do the results indicate that the program has value, that it made a difference, if only 40% let you know what they thought?
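One way to see why that question is hard: with a 40% response rate, the silent 60% can swamp whatever the respondents said.  Here is a minimal sketch in Python, with invented numbers, of the worst-case bounds:

```python
# Hypothetical numbers: 500 participants surveyed, 200 responded (40%),
# and 150 of those 200 rated the program as having merit.
population = 500
respondents = 200
favorable = 150

observed = favorable / respondents  # 75% favorable among respondents

# Worst-case bounds: first assume every non-respondent was unfavorable,
# then assume every non-respondent was favorable.
nonrespondents = population - respondents
lower = favorable / population                      # 30%
upper = (favorable + nonrespondents) / population   # 90%

print(f"Among respondents: {observed:.0%} favorable")
print(f"Across all participants: anywhere from {lower:.0%} to {upper:.0%}")
```

A spread that wide is why a response rate alone cannot establish merit–and why the reminders discussed below matter so much.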

I went looking for some insights on non-responses and non-responders.  Of course, I turned to Dillman (my go-to book for surveys).  His bottom line: “…sending reminders is an integral part of minimizing non-response error” (pg. 360).

Dillman (of course) has a few words of advice.  For example, on page 360, he says, “Actively seek means of using follow-up reminders in order to reduce non-response error.”  But how do you avoid burdening the target audience with reminders, which are “…the most powerful way of improving response rate…” (Dillman, pg. 360)?  When reminders are sent, they need to be carefully worded and relate to the survey being sent.  Reminders stress the importance of the survey and the need for responding.

Dillman also says (on page 361) to “…provide all selected respondents with similar amounts and types of encouragement to respond.”  Since most of the time incentives are not an option for you, the program person, you have to encourage the participants in other ways.  So we are back to reminders again.  One way to keep that encouragement even-handed is sketched below.
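Scripting the whole contact sequence in advance means every participant gets the same encouragement on the same timetable.  A sketch with invented timings (an illustration, not Dillman’s prescription):

```python
from datetime import date, timedelta

# Hypothetical launch date and contact schedule; the timings are invented.
launch = date(2013, 9, 17)
schedule = [
    (0,  "invitation with survey link"),
    (7,  "thank-you / reminder postcard"),
    (14, "replacement survey to non-respondents only"),
    (21, "final reminder stressing why responses matter"),
]

for offset, contact in schedule:
    print(f"{launch + timedelta(days=offset)}: {contact}")
```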

To explore the topic of non-response further, there is a book (Groves, Robert M., Don A. Dillman, John Eltinge, and Roderick J. A. Little (eds.). 2002. Survey Nonresponse. New York: Wiley-Interscience) that deals with the topic.  I don’t have it on my shelf, so I can’t speak to it; I found it while I was looking for information on this topic.

I also went online to EVALTALK and found this comment, which is relevant to evaluators attempting to determine if the program made a difference:  “Ideally you want your non-response percents to be small and relatively even-handed across items. If the number of nonresponds is large enough, it does raise questions as to what is going for that particular item, for example, ambiguous wording or a controversial topic. Or, sometimes a respondent would rather not answer a question than respond negatively to it. What you do with such data depends on issues specific to your individual study.”  This comment was from Kathy Race of Race & Associates, Ltd., September 9, 2003.
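Race’s per-item point is easy to operationalize: tally how often each item was skipped and flag the outliers for a closer look at the wording.  A toy sketch, assuming returned surveys are recorded as dicts with None marking a skipped item (the item names and data are invented):

```python
# Tally per-item non-response across returned surveys.
# None marks a skipped item; item names and data are invented.
responses = [
    {"q1_useful": 4, "q2_recommend": 5,    "q3_instructor": None},
    {"q1_useful": 3, "q2_recommend": None, "q3_instructor": None},
    {"q1_useful": 5, "q2_recommend": 4,    "q3_instructor": 2},
]

for item in responses[0]:
    skipped = sum(1 for r in responses if r[item] is None)
    pct = skipped / len(responses)
    flag = "  <- ambiguous wording? controversial?" if pct > 0.25 else ""
    print(f"{item}: {pct:.0%} non-response{flag}")
```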

A bottom line I would draw from all this is: respond.  If it was important to you to participate in the program, then it is important for you to provide feedback to the program implementation team/person.

May
24
Filed Under (criteria, program evaluation) by englem on 24-05-2013

Recently, I was privileged to see the recommendations of William (Bill) Tierney on the top education blogs.  (Tierney is the co-director of the Pullias Center for Higher Education at the University of Southern California.)  He (among others) writes the blog 21st scholar.  The blogs are actually the recommendations of his research assistant, Daniel Almeida.  These are the recommendations:

  1. Free Technology for Teachers

  2. MindShift

  3. Joanne Jacobs

  4. Teaching Tolerance

  5. Brian McCall’s Economics of Education Blog

What criteria were used?  What criteria would you use?  Some criteria that come to mind are interest, readability, length, and frequency.  But I’m assuming that those would be your criteria (and you know what assuming does…).

If I’ve learned anything in my years as an evaluator, it is to make assumptions explicit.  Everyone comes to the table with built-in biases (called cognitive biases).  I call them personal and situational biases (I did my dissertation on those biases).  So by making your assumptions explicit (and thereby avoiding personal and situational biases), you are building a rubric, because a rubric is developed from criteria for a particular product, program, policy, etc.

How would you build your rubric?  Many rubrics are in chart format, that is, columns and rows with the criteria detailed in the cells.  That isn’t cast in stone.  Given the different ways people view the world–linear, circular, webbed–there may be other formats; set yours up in the format that works best for you.  The only thing to keep in mind is to be specific.

Now, perhaps you are wondering how this relates to evaluation in the way I’ve been using evaluation.  Keep in mind evaluation is an everyday activity.  And every day, all day, you perform evaluations.  Rubrics formalize the evaluations you conduct–by making the criteria explicit.  Sometimes you internalize them; sometimes you write them down.  If you need to remember what you did the last time you were in a similar situation, I would suggest you write them down.  No, you won’t end up with lots of little sticky notes posted all over.  Use your computer.  Create a file.  Develop criteria that are important to you.  Typically, the criteria are in a table format, a rows-by-columns grid.  If you are assigning numbers, you might want to have the rows be the numbers (for example, 1-10) and the columns be words that describe those numbers (for example, 1 boring; 10 stimulating and engaging).  Rubrics are used in reviewing manuscripts and student papers, and in assigning grades to activities as well as programs.  Your format might look like this: [generic rubric image]

Or it might not.  What other configurations have you seen for rubrics?  How would you develop your rubric?  Or would you–perhaps you prefer a bunch of sticky notes.  Let me know.  (One more alternative is sketched below.)
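If the computer file appeals more than sticky notes, the rubric can even live in code.  A minimal sketch, with made-up criteria and the 1-10 anchors described above:

```python
# A rubric as explicit criteria with anchored scale points (criteria invented).
rubric = {
    "interest":    {1: "boring",            10: "stimulating and engaging"},
    "readability": {1: "dense and jargony", 10: "clear at first read"},
    "frequency":   {1: "dormant",           10: "fresh posts weekly"},
}

def score(ratings):
    """Average the 1-10 ratings across criteria."""
    return sum(ratings.values()) / len(ratings)

# Rating a hypothetical blog against the rubric:
ratings = {"interest": 7, "readability": 8, "frequency": 9}
print(f"Overall: {score(ratings):.1f} / 10")
```

The point is not the code; it is that the criteria and their anchors are written down where you can find them the next time a similar decision comes around.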

May
10
Filed Under (criteria, Methodology, program evaluation) by englem on 10-05-2013

Recently, I came across a blog post by Daniel Green, who is the head of strategic media partnerships at the Bill and Melinda Gates Foundation.  He coauthored the post with Mayur Patel, vice president of strategy and assessment at the Knight Foundation.  I mention this because those two foundations have contributed $3.25 million in seed funding “…to advance a better understanding of audience engagement and media impact…”.  They are undertaking an ambitious project to develop a rubric (of sorts) to determine “…how media influences the ways people think and act, and contributes to broader societal changes…”.  Although it doesn’t specifically say, I include social media in the broad use of “media”.  The blog post talks about a broader agenda–that of informed and engaged communities.  These foundations tie informed and engaged communities to everything from strengthening “…democracy and civil society to helping address some of the world’s most challenging social problems.”

Or, in other words, what difference is being made–which is something I wonder about all the time.  (I’m an evaluator, after all, and I want to know what difference is made.)

Although there are strong media forces out there (NYTimes, NPR, BBC, the Guardian, among others), I wonder about the strength and effect of social media (FB, Twitter, LinkedIn, blogs, among others).  Anecdotally, I can tell you that social media is everywhere and IS changing the way people think and act.  I watch my now 17 y/o use the IM feature on her social media to communicate with her friends, set up study dates, and find out homework assignments–not the phone like I did.  I watch my now 20 y/o multitask–talking to me on Skype while reading and responding to her FB entry.  She uses IM as much as her sister.  I know that social media was instrumental in the Arab Spring.  I know that major institutions have social media connections (FB, Twitter, LinkedIn, etc.).  Social media is everywhere.  And we have no good way to determine if it is making a difference and what that difference is.

For something so ubiquitous, why is there no way to evaluate social media other than through the use of analytics?  I’ve been asking that question since I first posted my query “Is this blog making a difference?” back in March 2012.  Since I’ve been posting since December 2009, that gave me over two years of data to draw on.  That is a luxury when it comes to programming, especially when many programs are only a few hours in duration and an evaluation is expected.

I hope that this project provides useful information for those of us who have come kicking and screaming to social media and have seen the light.  Even though they are talking about the world of media, I’m hoping that they can come up with measures that address the social aspect of media. The technology provided IS useful; the question is what difference is it making?

May
02
Filed Under (criteria) by englem on 02-05-2013

We are four months into 2013 and I keep asking the question “Is this blog making a difference?”  I’ve asked for an analytic report to give me some answers.  I’ve asked you readers for your stories.

Let’s hear it for SEOs and how they pick up that title–I credit them for the number of comments I’ve gotten.  I AM surprised at the number of comments I have gotten since January (hundreds, literally).  Most say things like, “of course it is making a difference.”  Some compliment me on my writing style.  Some are in a foreign language which I cannot read (I am illiterate when it comes to Cyrillic, Arabic, Greek, Chinese, and other non-English alphabets).  Some are marketing–wanting pingbacks to their recently started blogs for some product.  Some have commented specifically on the content (sample size and confidence intervals); some have commented on the time of year (vernal equinox).  Occasionally, I get a comment like the one below, and I keep writing.

The questions of all questions… Do I make a difference? I like how you write and let me answer your question. Personally I was supposed to be dead ages ago because someone tried to kill me for the h… of it … Since then (I barely survived) I have asked myself the same question several times and every single time I answer with YES. Why? Because I noticed that whatever you do, there is always someone using what you say or do to improve their own life. So, I can answer the question for you: Do you make a difference? Yes, you do, because there will always be someone who uses your writings to do something positive with it. So, I hope I just made your day! :-) And needless to say, keep the blog posts coming!

Enough update.  New topic: I just got a copy of the third edition of Miles and Huberman (my go-to reference for qualitative data analysis).  Wait, you say–Miles and Huberman are dead.  Yes, they are.  Johnny Saldaña was approached by Sage to be the third author and revise and update the book.  A good thing, I think.  Miles and Huberman’s second edition was published in 1994.  That is almost 20 years.  I’m eager to see if it will hold as a classic, given that there are many other books on qualitative coding in press currently.  (The spring research flyer from Guilford lists several on qualitative inquiry and analysis from some established authors.)

I also recently sat in on a research presentation by a candidate for a tenure-track position here at OSU, who talked about how the analysis of qualitative data was accomplished.  Took me back to when I was learning–index cards and sticky notes.  Yes, there are marvelous software programs out there (NVivo, Ethnograph, NUD*IST); still, I will support the argument that the best way to learn about your qualitative data is to immerse yourself in it with color-coded index cards and sticky notes.  Then you can use the software to check your results.  Keep in mind, though, that you are the PI and you will bring many biases to the analysis of your data.
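Using software to check hand coding can start very small.  A toy sketch, assuming the index cards have been transcribed as (excerpt, code) pairs–the excerpts and codes here are invented:

```python
from collections import Counter

# Transcribed index cards: (excerpt, code) pairs, all invented.
cards = [
    ("the workshop changed how I plan lessons", "behavior_change"),
    ("I now share materials with colleagues",   "behavior_change"),
    ("I liked the snacks",                      "satisfaction"),
    ("the handouts were hard to read",          "satisfaction"),
]

# Compare these counts against the tallies from your by-hand card sort.
for code, n in Counter(code for _, code in cards).most_common():
    print(f"{code}: {n} excerpt(s)")
```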

Apr
23
Filed Under (criteria, Methodology, program evaluation) by englem on 23-04-2013

Harold Jarche says in his April 21 post, “What I’ve learned about blogging is that you have to do it for yourself. Most of my posts are just thoughts that I want to capture.”  What an interesting way to look at blogging.  Yes, there is content; yes, there is substance.  What there is most are captured thoughts–thoughts committed to “paper” before they fly away.  How many times have you said to yourself “if only…” because you don’t remember what you were thinking, where you were going?  It may be a function of age; it may be a function of the times; it may be a function of other things as well (too little sleep, too much information, lack of focus).

When I blog on evaluation, I want to provide content that is meaningful.  I want to provide substance (as I understand it) in the field of evaluation.  Most of all, I want to capture what I’m thinking at the moment (like now).  Last week was a good example of capturing thoughts.  I wasn’t making up the rubric content; it is real.  All evaluation needs criteria against which the “program” is judged for merit and worth.  How else can you determine the value of something?  So I ask you: what criteria do you use in the moment you decide?  (And a true evaluator will say, “It depends…”)

A wise man (Elie Wiesel) said, “A man’s (sic) life, really, is not made up of years but of moments, all of which are fertile and unique.”  Even though he has not laid out his rubric explicitly, it is clear what gives moments merit and worth–being “fertile and unique”.  An interesting way to look at life, eh?

Jarche gives us a 10-year update about his experience blogging.  He is asking a question I’ve been asking: what has changed, and what has he learned in the past 10 years?  He talks about metrics (spammers and published posts).  I can do that.  He doesn’t talk about analytics (although I’m sure he could), and I don’t want to talk about analytics, either.  Some comments on my blog suggest that I look at length of time spent on a page…that seems like a reasonable metric (see the sketch below).  What I really want to hear is what has changed (Jarche describes what has changed as perpetual beta).  Besides the constantly changing frontier of social media, I go back to the comment by Elie Wiesel–moments that are fertile and unique.  How many can you say you’ve had today?  One will make my day–one will get my gratitude.  Today I am grateful for being able to blog.
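About that time-on-page metric: if you have raw visit logs, the usual approximation is the gap between one page view and the same visitor’s next one.  A sketch with an invented log format (real analytics packages compute this for you):

```python
from datetime import datetime

# Invented log rows: (visitor, url, timestamp of page view).
log = [
    ("v1", "/blog/rubrics",  datetime(2013, 4, 23, 9, 0, 0)),
    ("v1", "/blog/criteria", datetime(2013, 4, 23, 9, 4, 30)),
    ("v1", "/blog/about",    datetime(2013, 4, 23, 9, 5, 0)),
]

# Time on a page is approximated by the gap until the same
# visitor's next page view.
log.sort(key=lambda row: (row[0], row[2]))
for (visitor, url, t1), (next_visitor, _, t2) in zip(log, log[1:]):
    if visitor == next_visitor:
        print(f"{url}: {(t2 - t1).seconds} seconds")

# The last page a visitor views has no next view, so its time is
# unknowable from the log; one reason this metric is only a rough proxy.
```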

Apr
18
Filed Under (criteria, program evaluation) by englem on 18-04-2013

A rubric is a way to make criteria (or standards) explicit, and it does that in writing so that there can be no misunderstanding.  It is found in many evaluative activities, especially assessment of classroom work.  (Misunderstanding is still possible because the English language is often not clear–something I won’t get into today; suffice it to say that a wise woman said words are important–keep that in mind when crafting a rubric.)

This week there were many events that required rubrics. Rubrics may have been implicit; they certainly were not explicit.  Explicit rubrics were needed.

I’ll start with apologies for the political nature of today’s post.

Yesterday’s activity of the US Senate is an example where a rubric would be valuable.  Gabby Giffords said it best:

Certainly, an implicit rubric for this event can be found in this statement:

Only it was not used.  When there are clear examples of inappropriate behavior–behavior that my daughters’ kindergarten teacher said was mean and not nice–a rubric exists.  Simple rubrics are understood by five-year-olds (was that behavior mean OR was that behavior nice?).  Obviously 46 senators could only hear the NRA; they didn’t hear that the behavior (school shootings) was mean.

Boston provided us with another example of the mean vs. nice rubric.  Bernstein got the concept of mean vs. nice.

Music is nice; violence is mean.

Helpers are nice; bullying is mean. 

There were lots of rubrics, however implicit, for that event.  The NY Times reported that helpers (my word) ran TOWARD those in need, not away from the site of the explosion (violence).  There were many helpers.  A rubric existed, however implicit.

I want to close with another example of a rubric: 

I’m no longer worked up–just determined, and for that I need a rubric.  This image may not give me the answer; it does, however, give me pause.

For more information on assessment and rubrics see: Walvoord, B. E. (2004).  Assessment clear and simple.  San Francisco: Jossey-Bass.

In a conversation with a colleague on the need for IRB review when what was being conducted was evaluation, not research, I was struck by two things:

  1. I needed to discuss the protections provided by IRB (the next timely topic??) and
  2. the difference between evaluation and research needed to be made clear.

Leaving number 1 for another time, number 2 is the topic of the day.

A while back, AEA365 did a post on the difference between evaluation and research (some of which is included below) from a graduate student’s perspective.  Perhaps providing other resources would be valuable.

To have evaluation grouped with research is at worst a travesty; at best unfair.  Yes, evaluation uses research tools and techniques.  Yes, evaluation contributes to a larger body of knowledge (and in that sense seeks truth, albeit contextual).  Yes, evaluation needs to have institutional review board documentation.  So in many cases, people could be justified in saying evaluation and research are the same.

NOT.

Carol Weiss (1927-2013; she died in January) has written extensively on this difference and makes the distinction clearly.  Weiss’s first edition of Evaluation Research was published in 1972.  She revised this volume in 1998 and issued it under the title Evaluation.  (Both have subtitles.)

She says that evaluation applies social science research methods and makes the case that it is the intent of the study that makes the difference between evaluation and research.  She lists the following differences (pp. 15-17, 2nd ed.):

  1. Utility;
  2. Program-driven questions;
  3. Judgmental quality;
  4. Action setting;
  5. Role Conflicts;
  6. Publication; and
  7. Allegiance.

(For those of you who are still skeptical, she also lists similarities.)  Understanding and knowing the difference between evaluation and research matters.  I recommend her books.

Gisele Tchamba, who wrote the AEA365 post, says the following:

  1. Know the difference.  I came to realize that practicing evaluation does not preclude doing pure research. On the contrary, the methods are interconnected but the aim is different (I think this mirrors Weiss’s concept of intent).
  2. The burden of explaining. Many people in academia vaguely know the meaning of evaluation. Those who think they do mistake evaluation for assessment in education. Whenever I meet with people whose understanding of evaluation is limited to educational assessment, I use Scriven’s definition and emphasize words like “value, merit, and worth”.
  3. Distinguishing between evaluation and social science research.  Theoretical and practical experiences are helpful ways to distinguish between the two disciplines. Extensive reading of evaluation literature helps to see the difference.

She also cites a Trochim definition that is worth keeping in mind, as it captures the various unique qualities of evaluation.  Carol Weiss mentioned them all in her list (above):

  •  “Evaluation is a profession that uses formal methodologies to provide useful empirical evidence about public entities (such as programs, products, performance) in decision making contexts that are inherently political and involve multiple often conflicting stakeholders, where resources are seldom sufficient, and where time-pressures are salient”.

Resources: