Ever wonder where the 0.05 probability level came from? Ever wonder if that is the best number? How many of you were taught in your introduction to statistics course that 0.05 is the probability level necessary for rejecting the null hypothesis of no difference? This confidence may be spurious. As Paul Bakker indicates in the AEA 365 blog post for March 28, “Before you analyze your data, discuss with your clients and the relevant decision makers the level of confidence they need to make a decision.” Do they really need to be 95% confident? Or would 90% confidence be sufficient? What about 75% or even 55%?
Think about it for a minute. If you were a brain surgeon, you wouldn’t want anything less than 99.99% confidence; if you were looking at level of risk for a stock market investment, 55% would probably make you a lot of money. The academic community has held to and used the probability level of 0.05 for years (the computation of the p value dates back to the 1770s). (Quoting Wikipedia, “In the 1770s Laplace considered the statistics of almost half a million births. The statistics showed an excess of boys compared to girls. He concluded by calculation of a p-value that the excess was a real, but unexplained, effect.”) Fisher first proposed the 0.05 level in 1925 and established a one in 20 limit for statistical significance when considering a two tailed test. Sometimes the academic community makes the probability level even more restrictive by using 0.01 or 0.001 to demonstrate that the findings are significant. Scientific journals expect 95% confidence, that is, a probability level of 0.05 or smaller.
Although I have held to these levels, especially when I publish a manuscript, I have often wondered if this level makes sense. If I am only curious about a difference, do I need 0.05? Or could I use 0.10 or 0.15 or even 0.20? I have often asked students if they are conducting confirmatory or exploratory research. I think confirmatory research expects a more stringent probability level. I think exploratory research requires a less stringent probability level. The 0.05 seems so arbitrary.
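To make the point concrete, here is a minimal sketch in Python of how the same result can be "significant" or not depending on the probability level chosen in advance. It assumes a simple yes/no outcome tested against a 50/50 null hypothesis and uses the normal approximation to the binomial; the function names and the counts (58 and 60 out of 100) are hypothetical, made up purely for illustration.

```python
import math


def two_sided_p_value(successes, trials, null_prob=0.5):
    """Two-tailed p-value for a proportion, using the normal approximation.

    Assumes `trials` is large enough for the approximation to be reasonable.
    """
    expected = trials * null_prob
    std_dev = math.sqrt(trials * null_prob * (1 - null_prob))
    z = (successes - expected) / std_dev
    # erfc(|z| / sqrt(2)) is the probability that a standard normal
    # variable falls further than |z| from zero in either direction.
    return math.erfc(abs(z) / math.sqrt(2))


def decisions(p_value, alphas=(0.01, 0.05, 0.10, 0.20)):
    """Report, for each probability level, whether the null would be rejected."""
    return {alpha: p_value <= alpha for alpha in alphas}


# Hypothetical counts, not real data.
p_exploratory = two_sided_p_value(58, 100)
p_confirmatory = two_sided_p_value(60, 100)

print(round(p_exploratory, 3), decisions(p_exploratory))
print(round(p_confirmatory, 3), decisions(p_confirmatory))
```

With these made-up counts, 58 out of 100 would pass at the 0.20 level but not at 0.10 or 0.05, while 60 out of 100 would pass at 0.05 but not at 0.01. The data do not change; only the level of confidence the decision maker asked for does.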
Then there is the grounded theory approach which doesn’t use a probability level. It generates theory from categories which are generated from concepts which are identified from data, usually qualitative in nature. It uses language like fit, relevance, workability, and modifiability. It does not report statistically significant probabilities as it doesn’t use inferential statistics. Instead, it uses a series of probability statements about the relationships between concepts.
So what do we do? What do you do? Let me know.
Recently, I came across a blog post by Daniel Green, who is the head of strategic media partnerships at the Bill and Melinda Gates Foundation. He coauthored this post with Mayur Patel, vice president of strategy and assessment at the Knight Foundation. I mention this because those two foundations have contributed $3.25 million in seed funding “…to advance a better understanding of audience engagement and media impact…”. They are undertaking an ambitious project to develop a rubric (of sorts) to determine “…how media influences the ways people think and act, and contributes to broader societal changes…”. Although it doesn’t specifically say, I include social media in the broad use of “media”. The blog post talks about a broader agenda–that of informed and engaged communities. These foundations believe that informed and engaged communities will strengthen “… democracy and civil society to helping address some of the world’s most challenging social problems.”
Or in other words, what difference is being made, which is something I wonder about all the time. (I’m an evaluator, after all, and I want to know what difference is made.)
Although there are strong media forces out there (NYTimes, NPR, BBC, the Guardian, among others), I wonder about the strength and effect of social media (FB, Twitter, LinkedIn, blogs, among others). Anecdotally, I can tell you that social media is everywhere and IS changing the way people think and act. I watch my now 17 y/o who uses the IM feature on her social media, rather than the phone as I did, to communicate with her friends, set up study dates, and find out homework assignments. I watch my now 20 y/o multitask–talk to me on Skype and read and respond to her FB entry. She uses IM as much as her sister. I know that social media was instrumental in the Arab Spring. I know that major institutions have social media connections (FB, Twitter, LinkedIn, etc.). Social media is everywhere. And we have no good way to determine if it is making a difference and what that difference is.
For something so ubiquitous (social media), why is there no way to evaluate social media other than through the use of analytics? I’ve been asking that question since I first posted my query “Is this blog making a difference?” back in March 2012. Since I’ve been posting since December 2009, that gave me over 2 years from which to gather data. That is a luxury when it comes to programming, especially when many programs often are a few hours in duration and an evaluation is expected.
I hope that this project provides useful information for those of us who have come kicking and screaming to social media and have seen the light. Even though they are talking about the world of media, I’m hoping that they can come up with measures that address the social aspect of media. The technology provided IS useful; the question is what difference is it making?
Harold Jarche says in his April 21 post, “What I’ve learned about blogging is that you have to do it for yourself. Most of my posts are just thoughts that I want to capture.” What an interesting way to look at blogging. Yes, there is content; yes, there is substance. What there is most are captured thoughts. Thoughts committed to “paper” before they fly away. How many times have you said to yourself–if only…because you don’t remember what you were thinking; where you were going. It may be a function of age; it may be a function of the times; it may be a function of other things as well (too little sleep, too much information, lack of focus).
When I blog on evaluation, I want to provide content that is meaningful. I want to provide substance (as I understand it) in the field of evaluation. Most of all, I want to capture what I’m thinking at the moment (like now). Last week was a good example of capturing thoughts. I wasn’t making up the rubric content; it is real. All evaluation needs to have criteria against which the “program” is judged for merit and worth. How else can you determine the value of something? So I ask you: What criteria do you use in the moment you decide? (and a true evaluator will say, “It depends…”)
A wise man (Elie Wiesel) said, “A man’s (sic) life, really, is not made up of years but of moments, all of which are fertile and unique.” Even though he has not laid out explicitly his rubric, it is clear what makes them have merit and worth– “moments which are fertile and unique”. An interesting way to look at life, eh?
Jarche gives us a 10 year update about his experience blogging. He is asking a question I’ve been asking: what has changed and what has he learned in the past 10 years? He talks about metrics (spammers and published posts). I can do that. He doesn’t talk about analytics (although I’m sure he could) and I don’t want to talk about analytics, either. Some comments on my blog suggest that I look at length of time spent on a page…that seems like a reasonable metric. What I really want to hear is what has changed (Jarche talks about what has changed as being perpetual beta). Besides the constantly changing frontier of social media, I go back to the comment by Elie Wiesel–moments that are fertile and unique. How many can you say you’ve had today? One will make my day–one will get my gratitude. Today I am grateful for being able to blog.
A rubric is a way to make criteria (or standards) explicit and it does that in writing so that there can be no misunderstanding. It is found in many evaluative activities especially assessment of classroom work. (Misunderstanding is still possible because the English language is often not clear–something I won’t get into today; suffice it to say that a wise woman said words are important–keep that in mind when crafting a rubric.)
This week there were many events that required rubrics. Rubrics may have been implicit; they certainly were not explicit. Explicit rubrics were needed.
I’ll start with apologies for the political nature of today’s post.
Yesterday’s activity of the US Senate is an example where a rubric would be valuable. Gabby Giffords said it best: 
Certainly, an implicit rubric for this event can be found in this statement:
Only it was not used. When there are clear examples of inappropriate behavior, behavior that my daughters’ kindergarten teacher said was mean and not nice, a rubric exists. Simple rubrics are understood by five year olds (was that behavior mean OR was that behavior nice). Obviously 46 senators could only hear the NRA; they didn’t hear that the behavior (school shootings) was mean.
Boston provided us with another example of the mean vs. nice rubric. Bernstein got the concept of mean vs. nice.
Music is nice; violence is mean.
Helpers are nice; bullying is mean. 
There were lots of rubrics, however implicit, for that event. The NY Times reported that helpers (my word) ran TOWARD those in need not away from the site of the explosion (violence). There were many helpers. A rubric existed, however implicit.
I want to close with another example of a rubric: 
I’m no longer worked up–just determined and for that I need a rubric. This image may not give me the answer; it does however give me pause.
For more information on assessment and rubrics see: Walvoord, B. E. (2004). Assessment clear and simple. San Francisco: Jossey-Bass.
Today is the first full day of spring…this morning when I biked to the office it rained (not unlike winter…) and it was cold (also, not unlike winter)…although I just looked out the window and it is sunny so maybe spring is really here. Certainly the foliage tells us it is spring–forsythia, flowering quince, ornamental plum trees; although the crocuses are spent, daffodils shine from front yards; tulips are in bud, and daphne–oh, the daphne–is in its glory.
I’ve already posted this week; next week is spring break at OSU and at the local high school. I won’t be posting. So I leave you with this thought: Evaluation is an everyday activity, one you and I do often without thinking; make evaluation systematic and think about the merit and worth. Stop and smell the flowers.
Harold Jarche shared in his blog a comment by a participant in one of his presentations. The comment is:
Knowledge is evolving faster than can be codified in formal systems and is depreciating in value over time.
This is really important for those of us who love the printed word (me) and teach (me and you). A statement like this tells us that we are out of date the moment we open our mouths; those institutions on which we depended for information (schools, libraries, even churches) are now passé.
The exponential growth of knowledge is much like that of population. I think this graphic image of population (by Waldir) is pretty telling (click on the image to read the fine print). The evaluative point that this brings home to me is the delay in making information available.
Do you (like me) when you say, “Look it up”, think web, not press, books, library, hard copy? Do you (like me) wonder how and where this information originated when the information is so cutting edge? Do you (like me) wonder how to keep up or even if you can? Books take over a year to come to fruition (I think the 2 year frame is more representative). Journal manuscripts take 6 to 9 months on a quick journal turn around. Blogs are faster and they express opinion; could they be a source of information?
I’ve decided to go to an advanced qualitative data seminar this summer as part of my professional development because I’m using more and more qualitative data (I still use quantitative data, too). It is supposed to be cutting edge. The book on which the seminar is based won’t be published until next month (April). How much information has been developed since that book went to press? How much information will be shared at the seminar? Or will that seminar be old news (and like old news, be ready for fish)? The explosion of information, like the explosion of population, may be a good thing, or not. The question is what is being done with that knowledge? How is it being used? Or is it? Is the knowledge explosion an excuse for people to be information illiterate? To become focused (read narrow) in their field? What are you doing with what I would call miscellaneous information that is gathered unsystematically? What are you doing with information now–how are you using it for professional development–or are you?
Today’s post is longer than I usually post. I think it is important because it captures an aspect of data analysis and evaluation use that many of us skip right over: How to present findings using the tools that are available. Let me know if this works for you.
Ann Emery blogs at Emery Evaluation. She challenged readers a couple of weeks ago to reproduce a bubble chart in either Excel or R. This week she posted the answer. She has given me permission to share that information with you. You can look at the complete post at Dataviz Copycat Challenge: The Answers.
I’ve also copied it here in a shortened format:
“Here’s my how-to guide. At the bottom of this blog post, you can download an Excel file that contains each of the submissions. We each used a slightly different approach, so I encourage you to study the file and see how we manipulated Excel in different ways.
Here’s that chart from page 7 of the State of Evaluation 2012 report. We want to see whether we can re-create the chart in the lower right corner. The visualization uses circles, which means we’re going to create a bubble chart in Excel.
To fool Excel into making circles, we need to create a bubble chart in Excel. Click here for a Microsoft Office tutorial. According to the tutorial, “A bubble chart is a variation of a scatter chart in which the data points are replaced with bubbles. A bubble chart can be used instead of a scatter chart if your data has three data series.”
We’re not creating a true scatter plot or bubble chart because we’re not showing correlations between any variables. Instead, we’re just using the foundation of the bubble chart design – the circles. But, we still need to envision our chart on an x-y axis in order to make the circles.
It helps to sketch this part by hand. I printed page 7 of the report and drew my x and y axes right on top of the chart. For example, 79% of large nonprofit organizations reported that they compile statistics. This bubble would get an x-value of 3 and a y-value of 5.
I didn’t use sequential numbering on my axes. In other words, you’ll notice that my y-axis has values of 1, 3, and 5 instead of 1, 2, and 3. I learned that the formatting seemed to look better when I had a little more space between my bubbles.
Open a new Excel file and start typing in your values. For example, we know that 79% of large nonprofit organizations reported that they compile statistics. This bubble has an x-value of 3, a y-value of 5, and a bubble size of 79%.
Go slowly. Check your work. If you make a typo in this step, your chart will get all wonky.
Highlight the three columns on the right – the x column, the y column, and the frequency column. Don’t highlight the headers themselves (x, y, and bubble size). Click on the “Insert” tab at the top of the screen. Click on “Other Charts” and select a “Bubble Chart.”

You’ll get something that looks like this:

First, add the basic data labels. Right-click on one of the bubbles. A drop-down menu will appear. Select “Add Data Labels.” You’ll get something that looks like this:
Second, adjust the data labels. Right-click on one of the data labels (not on the bubble). A drop-down menu will appear. Select “Format Data Labels.” A pop-up screen will appear. You need to adjust two things. Under “Label Contains,” select “Bubble Size.” (The default setting on my computer is “Y Value.”) Next, under “Label Position,” select “Center.” (The default setting on my computer is “Right.”)
Your basic bubble chart is finished! Now, you just need to fiddle with the formatting. This is easier said than done, and probably takes the longest out of all the steps.
Here’s how I formatted my bubble chart:
Your final bubble chart will look something like this:

For more details about formatting charts, check out these tutorials.
Click here to download the Excel file that I used to create this bubble chart. Please explore the chart by right-clicking to see how the various components were made. You’ll notice a lot of text boxes on top of each other!”
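For readers who would rather script the chart than click through Excel, the steps quoted above can be sketched in a few lines of Python. This is my own rough sketch, not Ann's method: it writes a plain SVG image using only the standard library, it assumes bubble area (not width) should be proportional to the percentage, and the function name, colors, and spacing are all hypothetical. Only the first data row (x = 3, y = 5, 79%) comes from the post; the other two rows are placeholders I made up so there is something to draw.

```python
import math

# Each row mirrors the three Excel columns: x, y, bubble size (percent).
ROWS = [
    (3, 5, 79),  # from the post: large nonprofits that compile statistics
    (1, 5, 50),  # hypothetical placeholder value
    (3, 1, 30),  # hypothetical placeholder value
]


def bubble_svg(rows, scale=60, max_radius=25):
    """Return an SVG string with one labeled circle per (x, y, size) row."""
    largest = max(size for _, _, size in rows)
    width = (max(x for x, _, _ in rows) + 1) * scale
    height = (max(y for _, y, _ in rows) + 1) * scale
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for x, y, size in rows:
        # Scale the radius by the square root so that area tracks the value.
        radius = max_radius * math.sqrt(size / largest)
        cx = x * scale
        cy = height - y * scale  # SVG's y axis points down, so flip it
        parts.append(
            f'<circle cx="{cx}" cy="{cy}" r="{radius:.1f}" fill="steelblue"/>'
        )
        # Center the label on the bubble, like "Label Position: Center".
        parts.append(
            f'<text x="{cx}" y="{cy}" text-anchor="middle" '
            f'dominant-baseline="middle" fill="white">{size}%</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg = bubble_svg(ROWS)
print(svg)
```

Saving the printed text to a file ending in .svg and opening it in a web browser should show the bubbles. As in the Excel version, the non-sequential axis values (1, 3, 5) simply leave more room between the circles.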
Just spent the last 40 minutes reading comments that people have made to my posts. Some were interesting; some were advertising (aka marketing) their own sites; one suggested I might revisit the “about” feature of my blog and express why I blog (other than it is part of my work). So I revisited my “about” page, took out conversation, and talked about the reality as I’ve experienced it for the last three plus years. So check out the about page–I also updated info about me and my family. The comment about updating my “about” page was a good one. It is an evaluative activity; one that was staring me in the face and I hadn’t realized it. I probably need to update my photo as well…next time…:)
In a conversation with a colleague on the need for IRB when what was being conducted was evaluation not research, I was struck by two things:
Leaving number 1 for another time, number 2 is the topic of the day.
A while back, AEA 365 did a post on the difference between evaluation and research (some of which is included below) from a graduate student’s perspective. Perhaps providing other resources would be valuable.
To have evaluation grouped with research is at worst a travesty; at best unfair. Yes, evaluation uses research tools and techniques. Yes, evaluation contributes to a larger body of knowledge (and in that sense seeks truth, albeit contextual). Yes, evaluation needs to have institutional review board documentation. So in many cases, people could be justified in saying evaluation and research are the same.
NOT.
Carol Weiss (1927-2013, she died in January) has written extensively on this difference and makes the distinction clearly. Weiss’s first edition of Evaluation Research was published in 1972. She revised this volume in 1998 and issued it under the title of Evaluation. (Both have subtitles.)
She says that evaluation applies social science research methods and makes the case that it is the intent of the study which makes the difference between evaluation and research. She lists the following differences (pp 15 – 17, 2nd ed.):
(For those of you who are still skeptical, she also lists similarities.) Understanding and knowing the difference between evaluation and research matters. I recommend her books.
Gisele Tchamba who wrote the AEA365 post says the following:
She also cites a Trochim definition that is worth keeping in mind as it captures the various unique qualities of evaluation. Carol Weiss mentioned them all in her list (above):
Resources:
One of the expectations for the evaluation capacity building program that just finished is that the program findings will be written up for publication in scientific journals.
Easy to say. Hard to do.
Writing is HARD.
To that end, I’m going to dig out my old notes from when I taught technical writing to graduate students, medical students, residents, and young faculty and give a few highlights.
Happy writing.