I was reminded recently about the 1992 AEA meeting in Seattle, WA. That seems like so long ago. The hot topic of that meeting was whether qualitative data or quantitative data were best. At the time I was a nascent evaluator, having been in the field less than 10 years, and I absorbed debates like this as a dry sponge does water. It was interesting; stimulating; exciting. It felt cutting edge.

Now 20+ years later, I wonder what all the hype was about. Now, there can be rigor in whatever data are collected, regardless of type (numbers or words); language has been developed to look at that rigor. (Rigor can also escape the investigator regardless of the data collected; another post, another day.) Words are important for telling stories (and there is a wealth of information on how story can be rigorous) and numbers are important for counting (and numbers have a long history of use–Thanks Don Campbell). Using both (that is, mixed methods) makes really good sense when conducting an evaluation in community environments, work that I’ve done for most of my career (community-based work).

I was reading another evaluation blog (ACET) and found the following bit of information that I thought I’d share as it is relevant to looking at data. This particular post (July, 2012) was a reflection by the author. (I quote from that blog.)

  • Utilizing both quantitative and qualitative data. Many of ACET’s evaluations utilize both quantitative (e.g., numerical survey items) and qualitative (e.g., open-ended survey items or interviews) data to measure outcomes. Using both types of data helps triangulate evaluation findings. I learned that when close-ended survey findings are intertwined with open-ended responses, a clearer picture of program effectiveness occurs. Using both types of data also helps to further explain the findings. For example, if 80% of group A “Strongly agreed” to question 1, their open-ended responses to question 2 may explain why they “Strongly agreed” to question 1.

Triangulation was a new (to me at least) concept in 1981 when a whole chapter was devoted to the topic in a volume dedicated to Donald Campbell, titled Scientific Inquiry and the Social Sciences. I have no doubt that this concept was not new; Crano, the author of this chapter titled “Triangulation and Cross-Cultural Research”, has three and one-half pages of references listed that support the premise put forth in the chapter: mainly, that using data from multiple different sources may increase the understanding of the phenomena under investigation. That is what triangulation is all about–looking at a question from multiple points of view; bringing together the words and the numbers and then offering a defensible explanation.

I’m afraid that many beginning evaluators forget that words can support numbers and numbers can support words.

May
24
Filed Under (criteria, program evaluation) by Molly on 24-05-2013

Recently, I was privileged to see the recommendations of  William (Bill) Tierney on the top education blogs.  (Tierney is the Co-director of the Pullias Center for Higher Education at the University of Southern California.)  He (among others) writes the blog, 21st scholar.  The blogs are actually the recommendation of his research assistant Daniel Almeida.  These are the recommendations:

  1. Free Technology for Teachers

  2. MindShift

  3. Joanne Jacobs

  4. Teaching Tolerance

  5. Brian McCall’s Economics of Education Blog

What criteria were used?  What criteria would you use?  Some criteria that come to mind are interest, readability, length, frequency.  But I’m assuming that they would be your criteria (and you know what assuming does…)

If I’ve learned anything in my years as an evaluator, it is to make assumptions explicit. Everyone comes to the table with built-in biases (called cognitive biases). I call them personal and situational biases (I did my dissertation on those biases). So by making your assumptions explicit (and thereby avoiding personal and situational biases), you are building a rubric because a rubric is developed from criteria for a particular product, program, policy, etc.

How would you build your rubric? Many rubrics are in chart format, that is, columns and rows with the criteria detailed in the cells. That isn’t cast in stone. Given the different ways people view the world–linear, circular, webbed, and there may be others–I would set yours up in the format that works best for you. The only thing to keep in mind is to be specific.

Now, perhaps you are wondering how this relates to evaluation in the way I’ve been using evaluation. Keep in mind evaluation is an everyday activity. And every day, all day, you perform evaluations. Rubrics formalize the evaluations you conduct–by making the criteria explicit. Sometimes you internalize them; sometimes you write them down. If you need to remember what you did the last time you were in a similar situation, I would suggest you write them down. No, you won’t end up with lots of little sticky notes posted all over. Use your computer. Create a file. Develop criteria that are important to you. Typically, the criteria are in a table format; an x by x form. If you are assigning numbers, you might want to have the rows be the numbers (for example, 1-10) and the columns be words that describe those numbers (for example, 1 boring; 10 stimulating and engaging). Rubrics are used in reviewing manuscripts, student papers, assigning grades to activities as well as programs. Your format might look like this: [image: generic rubric]
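For those who keep their rubric in a file, here is a minimal sketch in Python of the table idea described above. It assumes the four criteria mentioned earlier (interest, readability, length, frequency) and a 1-10 scale; apart from "boring" and "stimulating and engaging", the anchor words and the function name are hypothetical, made up only for illustration.

```python
# A rubric as a small table: one row per criterion, with words
# anchoring the low (1) and high (10) ends of the scale.
# Only the "interest" anchors come from the post; the rest are hypothetical.
RUBRIC = {
    "interest":    {1: "boring", 10: "stimulating and engaging"},
    "readability": {1: "hard to follow", 10: "clear and easy to read"},
    "length":      {1: "far too long or too short", 10: "about right"},
    "frequency":   {1: "rarely updated", 10: "updated regularly"},
}


def score_blog(ratings, rubric=RUBRIC, low=1, high=10):
    """Return the average rating across all criteria in the rubric.

    Raises ValueError if a criterion is missing or a rating is off the scale,
    so the criteria stay explicit instead of being silently skipped.
    """
    missing = set(rubric) - set(ratings)
    if missing:
        raise ValueError(f"no rating given for: {sorted(missing)}")
    for criterion in rubric:
        if not low <= ratings[criterion] <= high:
            raise ValueError(f"{criterion} must be between {low} and {high}")
    return sum(ratings[c] for c in rubric) / len(rubric)


# Hypothetical ratings for one blog.
example = {"interest": 8, "readability": 6, "length": 4, "frequency": 6}
print(score_blog(example))
```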

Or it might not. In what other configurations have you seen rubrics? How would you develop your rubric? Or would you–perhaps you prefer a bunch of sticky notes. Let me know.

May
16
Filed Under (Data Analysis, program evaluation) by Molly on 16-05-2013

Ever wonder where the 0.05 probability level came from? Ever wonder if that is the best number? How many of you were taught in your introduction to statistics course that 0.05 is the probability level necessary for rejecting the null hypothesis of no difference? This confidence may be spurious. As Paul Bakker indicates in the AEA 365 blog post for March 28, “Before you analyze your data, discuss with your clients and the relevant decision makers the level of confidence they need to make a decision.” Do they really need to be 95% confident? Or would 90% confidence be sufficient? What about 75% or even 55%?

Think about it for a minute. If you were a brain surgeon, you wouldn’t want anything less than 99.99% confidence; if you were looking at level of risk for a stock market investment, 55% would probably make you a lot of money. The academic community has held to and used the probability level of 0.05 for years (the computation of the p value dates back to the 1770s). (Quoting Wikipedia, “In the 1770s Laplace considered the statistics of almost half a million births. The statistics showed an excess of boys compared to girls. He concluded by calculation of a p-value that the excess was a real, but unexplained, effect.”) Fisher first proposed the 0.05 level in 1925 and established a one in 20 limit for statistical significance when considering a two tailed test. Sometimes the academic community makes the probability level even more restrictive by using 0.01 or 0.001 to demonstrate that the findings are significant. Scientific journals expect 95% confidence or a probability level of 0.05 or smaller.

Although I have held to these levels, especially when I publish a manuscript, I have often wondered if this level makes sense. If I am only curious about a difference, do I need 0.05? Or could I use 0.10 or 0.15 or even 0.20? I have often asked students if they are conducting confirmatory or exploratory research. I think confirmatory research expects a more stringent probability level. I think exploratory research can accept a less stringent probability level. The 0.05 seems so arbitrary.
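To make the question concrete, here is a minimal Python sketch of how one and the same result reads at different probability levels. The data are hypothetical (60 of 100 respondents preferring one option, tested against a 50/50 null hypothesis with an exact two-tailed binomial test), and the function names are made up for illustration; they do not come from any statistics package.

```python
from math import comb


def two_sided_binomial_p(successes, n, p0=0.5):
    """Exact two-tailed binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null proportion p0.
    """
    def pmf(k):
        return comb(n, k) * p0**k * (1 - p0) ** (n - k)

    observed = pmf(successes)
    # Small tolerance so ties are not lost to floating-point rounding.
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))


def decisions(p_value, alphas=(0.01, 0.05, 0.10, 0.20)):
    """For each probability level, would the null hypothesis be rejected?"""
    return {alpha: p_value < alpha for alpha in alphas}


# Hypothetical data: 60 of 100 respondents prefer option A.
p = two_sided_binomial_p(60, 100)
print(f"p-value: {p:.4f}")
for alpha, reject in decisions(p).items():
    print(f"level {alpha:.2f}: {'reject' if reject else 'do not reject'} the null")
```

With these made-up numbers the p-value lands a little above 0.05, so the finding would count as a difference at 0.10 or 0.20 but not at 0.05 or 0.01. The data are identical in every case; only the level agreed on beforehand changes the conclusion.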

Then there is the grounded theory approach which doesn’t use a probability level.  It generates theory from categories which are generated from concepts which are identified from data, usually qualitative in nature.  It uses language like fit, relevance, workability, and modifiability.  It does not report statistically significant probabilities as it doesn’t use inferential statistics.  Instead, it uses a series of probability statements about the relationships between concepts.

So what do we do?  What do you do?  Let me know.

May
10
Filed Under (criteria, Methodology, program evaluation) by Molly on 10-05-2013

Recently, I came across a blog post by Daniel Green, who is the head of strategic media partnerships at the Bill and Melinda Gates Foundation. He coauthored this post with Mayur Patel, vice president of strategy and assessment at the Knight Foundation. I mention this because those two foundations have contributed $3.25 million in seed funding “…to advance a better understanding of audience engagement and media impact…”. They are undertaking an ambitious project to develop a rubric (of sorts) to determine “…how media influences the ways people think and act, and contributes to broader societal changes…”. Although it doesn’t specifically say, I include social media in the broad use of “media”. The blog post talks about a broader agenda–that of informed and engaged communities. These foundations believe that informed and engaged communities will strengthen “… democracy and civil society to helping address some of the world’s most challenging social problems.”

Or in other words,  what difference is being made, which is something I wonder about all the time.  (I’m an evaluator, after all, and I want to know what difference is made.)

Although there are strong media forces out there (NYTimes, NPR, BBC, the Guardian, among others), I wonder about the strength and effect of social media (FB, Twitter, LinkedIn, blogs, among others). Anecdotally, I can tell you that social media is everywhere and IS changing the way people think and act. I watch my now 17 y/o, who uses the IM feature on her social media (not the phone, like I did) to communicate with her friends, set up study dates, and find out homework assignments. I watch my now 20 y/o multitask–talk to me on Skype and read and respond to her FB entry. She uses IM as much as her sister. I know that social media was instrumental in the Arab Spring. I know that major institutions have social media connections (FB, Twitter, LinkedIn, etc.). Social media is everywhere. And we have no good way to determine if it is making a difference and what that difference is.

For something so ubiquitous (social media), why is there no way to evaluate social media other than through the use of analytics?  I’ve been asking that question since I first posted my query “Is this blog making a difference?” back in March 2012.  Since I’ve been posting since December 2009, that gave me over 2 years from which to gather data.  That is a luxury when it comes to programming, especially when many programs often are a few hours in duration and an evaluation is expected.

I hope that this project provides useful information for those of us who have come kicking and screaming to social media and have seen the light.  Even though they are talking about the world of media, I’m hoping that they can come up with measures that address the social aspect of media. The technology provided IS useful; the question is what difference is it making?

May
02
Filed Under (criteria) by Molly on 02-05-2013

We are four months into 2013 and I keep asking the question “Is this blog making a difference?”  I’ve asked for an analytic report to give me some answers.  I’ve asked you readers for your stories.

Let’s hear it for SEOs and how they pick up that title–I credit that with the number of comments I’ve gotten.  I AM surprised at the number of comments I have gotten since January (hundreds, literally).  Most say things like, “of course it is making a difference.”  Some compliment me on my writing style.  Some are in a foreign language which I cannot read (I am illiterate when it comes to Cyrillic, Arabic, Greek, Chinese, and other non-English alphabets).  Some are marketing–wanting ping backs to their recently started blogs for some product.  Some have commented specifically on the content (sample size and confidence intervals); some have commented on the time of year (vernal equinox).  Occasionally, I get a comment like the comment below and I keep writing.

The questions of all questions… Do I make a difference? I like how you write and let me answer your question. Personally I was supposed to be dead ages ago because someone tried to kill me for the h… of it … Since then (I barely survived) I have asked myself the same question several times and every single time I answer with YES. Why? Because I noticed that whatever you do, there is always someone using what you say or do to improve their own life. So, I can answer the question for you: Do you make a difference? Yes, you do, because there will always be someone who uses your writings to do something positive with it. So, I hope I just made your day! 🙂 And needless to say, keep the blog posts coming!

Enough update. New topic: I just got a copy of the third edition of Miles and Huberman (my go-to reference for qualitative data analysis). Wait you say–Miles and Huberman are dead–yes, they are. Johnny Saldana (there needs to be a ~ above the “n” in his name, only I don’t know how to do that with this keyboard) was approached by Sage to be the third author and revise and update the book. A good thing, I think. Miles and Huberman’s second edition was published in 1994. That is almost 20 years. I’m eager to see if it will hold as a classic given that there are many other books on qualitative coding in press currently. (The spring research flyer from Guilford lists several on qualitative inquiry and analysis from some established authors.)

I also recently sat in on a research presentation of a candidate for a tenure track position here at OSU who talked about how the analysis of qualitative data was accomplished. Took me back to when I was learning–index cards and sticky notes. Yes, there are marvelous software programs out there (NVivo, Ethnograph, NUD*IST); I will support the argument that the best way to learn about your qualitative data is to immerse yourself in it with color-coded index cards and sticky notes. Then you can use the software to check your results. Keep in mind, though, that you are the PI and you will bring many biases to the analysis of your data.