Understanding Research – Political Polls and Their Context

Date: October 26, 2012 | IQS Research | News

Yesterday, IQS Research President Shawn Herbig spent an hour on the radio discussing some of the intricacies of the research and polling process. Given the current election season, one thing we know for certain is that there is no shortage of polling results being released.

That raises the question: how do we know which polls are right and which are not? Does each new poll released reflect real changes in how we think about the candidates? Is polling indicative of emotions, of behaviors, or of both? These are some of the things Herbig tackled yesterday.

We posted a discussion late last year about how it may be a good idea to look at what are called polls of polls, which aggregate the research done on a particular topic (in this case, political polling). This helps to “weed out” fluff polls that may not be very accurate, and places a heavier emphasis on the trend rather than on specific points in time.

But beyond this, understanding the methodology behind polls is useful when deciding whether or not their results are reliable. A few things to note:

1. What is the sample size? – Political polls in particular attempt to gauge what an entire country of over 200 million registered voters thinks about an election. A sample of only 385 can be representative of a population of 200 million. But oftentimes you see polls with around 1,000 respondents. Oversampling allows researchers to make cuts in the data (say, what women think, or what African Americans think) and still maintain a comfortable confidence level in the results.
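To see why roughly 385 responses can suffice, and why oversampling to 1,000 respondents helps with subgroup cuts, here is a quick back-of-the-envelope sketch (assuming the standard 95% confidence level and the worst-case proportion of 0.5):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a 95% confidence interval for a proportion from n responses."""
    return z * math.sqrt(p * (1 - p) / n)

# A sample of 385 gives roughly the classic +/- 5% margin of error;
# oversampling to 1,000 tightens it, and a 500-person subgroup cut
# (say, just the women in the sample) still stays under +/- 5%.
for n in (385, 1000, 500):
    print(f"n = {n:>5}: +/- {margin_of_error(n):.1%}")
```

Notice that the population size barely enters the picture, which is why the same 385 works for 200 million voters.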

2. How was the sample collected? – Polls on the internet, or ones done on media websites, aren’t very trustworthy. They attract a particular group of respondents, thus skewing the results one way or another. Scientific research maintains that a sample must be collected randomly in order for its results to be representative of a population. In other words, each person in the population must have the same chance of being selected for the poll as any other person.

3. Understand the context of the poll/research – When the poll was taken is crucial to understanding what it is telling us. For instance, there was a lot of polling done after each of the presidential debates. Not only did researchers ask who won the debate, but they also asked whom those being polled were going to vote for. After the first debate (which we could argue went in Romney’s favor), most polls showed that the lead Obama had going into the debate had vanished. Several polls showed Romney with a sizable lead. But was this a statistical push due to the recent debate and the emotion surrounding it? Or was the increase real?

Recent polls show a leveling between the two candidates now that the debates are over and a more objective look at the candidates can be achieved. However, it is nearly impossible to eliminate emotion from responses, especially in a context as controversial as politics.

4. Interpreting results – Interpretation ties in nicely with understanding the context of the research you are viewing. But there is a task for each of us as we interpret, and that is to leave behind our preconceived notions about the results. This is very hard to do, as it is a natural human instinct to believe what justifies our own reasoning. This is known as confirmation bias, and it can impact the way we accept or discount research.

Taking all this into account can help us to sift through the commotion and find the value of the research being produced.  This isn’t just for political polling, but can be used for all research that you encounter.  Being good consumers of research can take a lot of effort, but it is the only way to gain a more realistic view of the world around you.


Statistics – When science goes awry due to the lies of men

Date: July 15, 2011 | Shawn Herbig | News

David J. Hand recently authored a short, concise book about statistics, aptly named Statistics, and in it he attempts to bring forth an argument that statistics is a fascinating and very applicable science.  I don’t argue with Hand in the slightest – I do find statistics very interesting, namely because I recognize its everyday uses.  It is interesting enough, and if you have time and are keen on this sort of subject matter, then by all means go ahead and pick yourself up a copy.

My personal interest in the book aside, I would like to focus on a line of text at the beginning of Chapter 1. I’m sure most of us are familiar with the infamous Twain quote, “There are lies, damned lies, and statistics.” The musings of Twain that led to these words implied that statistics can be twisted, turned, mutilated, cooked, and subjected to other forms of mutation to get them to say what you want them to say. In short, many people have manipulated statistics to support a lie.

But perhaps we are less familiar with Frederick Mosteller, who once said, “It is easy to lie with statistics, but easier to lie without them.” Mosteller was one of the most recognized statisticians of the 20th century – he helped found the statistics department at Harvard, was president of numerous professional organizations dealing with statistics, and was possibly one of the most dedicated teachers of statistics in the United States.

Being such an expert in the field, Mosteller of course recognized that statistics can be manipulated to say what one wants them to say. However, a statistic isn’t just a number we use to describe things; it is a representation of the world we live in. Underneath the surface of percentages or coefficients of determination is an entire world, built upon solid science and mathematical certainties. In this sense, statistics are a beautiful thing. We can use them to understand an otherwise complex, seemingly chaotic world – to condense it down to numbers and figures that help explain our surroundings.

Of course, there are those who have been burned by statistics.  Most people, I’m sure, unknowingly.  But let us not blame the statistic.  After all, it’s hard to place blame on an inanimate representation such as a number.  It doesn’t mean us any harm.  The real fault should be placed on those people who use statistics for ill, those who manipulate the numbers for their own gain.  Statistics in their purity are not contradictory – it’s those who use them that sometimes are.  Of course, that can be applied to any discipline, not just statistics.

Undertaking research (and producing the statistics that summarize its findings) is certainly not something to be taken lightly. It requires dedication, motivation, and a clear goal of obtaining truth. There will always be those who bend the numbers to create a false truth or hide the real one, but statistics should not be slighted or disregarded because of that. What the science should provoke is conversation and understanding, in an effort to come to rational conclusions about how to move forward. As my own boss and mentor often says, statistics is perhaps one of the only sciences in which two individuals studying the same results can come up with two different conclusions, and so long as they are not truly contradictory, both of them can be right. And that is the quintessential model of decision-making: using grounded results and findings to move an initiative forward.

Maybe you don’t agree with Hand’s (and to some degree my own) proposition that statistics is fascinating and cool and the most exciting of disciplines.  But at the very least I hope you agree that statistics make the world around us better known, and thus more real.

If you would like to read Hand’s Statistics, here is the information:
Hand, David J.
Statistics, Sterling Publishing Co., New York, 2010


McKinsey’s results on employer based healthcare: It’s all in how you look at it.

Date: June 28, 2011 | Shawn Herbig | News

McKinsey & Company recently released a report claiming that 30% of employers in the United States will drop their employee insurance coverage in 2014, when the Patient Protection & Affordable Care Act (aka Obamacare) takes full effect. Before we get into the controversy surrounding these results (and they are indeed making waves across the research and government communities), let’s take a brief look at what the study indicated:

  • 30% of employers will definitely or probably stop offering ESI in the years after 2014.
  • Among employers with a high awareness of reform, this proportion increases to more than 50%, and upward of 60% will pursue some alternative to traditional ESI.
  • At least 30% of employers would gain economically from dropping coverage even if they completely compensated employees for the change through other benefit offerings or higher salaries.
  • Contrary to what many employers assume, more than 85% of employees would remain at their jobs even if their employer stopped offering ESI, although about 60% would expect increased compensation.

The point of contention is generally with the first figure mentioned above, namely that 30% of employers will drop employer-sponsored insurance. It’s a figure that you see popping up all over the place, cited by the New York Times, National Public Radio, the Los Angeles Times, and the Wall Street Journal (to name a few), not to mention all the independent bloggers and journalists posting on the topic. In response, McKinsey has released its methodology and has even created a separate email address to direct all inquiries about the study. The survey itself has also been released.

But why is this turning into such a controversy? Simply put, it does not correspond with past figures citing attrition of ESI due to the Affordable Care Act. Furthermore, it is being used as political fodder among the Republican presidential nominees to attack Obamacare. Other research conducted by the Mercer Group, the Rand Corporation, and the Urban Institute has cited attrition projections that are much lower than McKinsey’s 30%. And, what is more, it took McKinsey some time to release the methodology (they rejected any requests upfront, until they began to feel the pressure of every major news outlet screaming for it).

It is easy to become caught up in the controversy surrounding this, but as a researcher, I am more curious about the reasons the controversy exists in the first place, particularly the data underlying the results and the way in which it was collected.

To be forthright, the methodology of the research seems to be sound enough, and the questionnaire itself does not appear to be skewed in such a way as to solicit particular responses.  But if this is the case, then why all the fuss?  The answer to this question lies beneath the surface of all this, as it is a function of context framing rather than accuracy.  And this, to some degree, is addressed by McKinsey in their methodology response.

McKinsey’s study was a study of perceptions, while Urban Institute et al used forecasting models to predict the impact the healthcare bill would have on ESI.  Given this delineation, it becomes clearer perhaps why such discrepancies exist between the various studies.

But why, you may ask, should these two models differ so drastically, and which one is more correct and reliable? Well, for the second question, only time will tell and I’m not about to open that can of worms, but the first question can be answered pretty simply.

Perception studies, like the one McKinsey performed, are based on responses at a single point in time and are influenced by the emotions around the topic at that time. As we have seen, the emotions surrounding this topic are particularly contentious right now. This is not to say that all perception studies are so fraught with emotion that they cannot be trusted.

In this case, the employers were asked if they would continue their ESI based on a specific scenario.  Some 30% indicated that they likely would not.  When 2014 arrives, maybe all 30% will do exactly as they indicated, but more likely, some of the respondents will change their minds based on the final financial implications as well as new information at that time.

Forecasting models, on the other hand, are designed to take into account numerous scenarios based on what the healthcare bill may provide and the predicted responses to it. They typically use regression modeling and past performance to predict the behaviors both of the people who indicated they would drop ESI and of those who didn’t.
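As a toy illustration of the forecasting approach – emphatically not McKinsey’s, Mercer’s, or the Urban Institute’s actual models, and with invented ESI figures – a bare-bones least-squares trend fit might look like this:

```python
# Hypothetical yearly shares of employers offering ESI (invented for illustration).
years = [2006, 2007, 2008, 2009, 2010]
esi_rate = [0.62, 0.61, 0.60, 0.59, 0.585]

# Ordinary least-squares fit of a straight line through the history.
n = len(years)
mean_x = sum(years) / n
mean_y = sum(esi_rate) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(years, esi_rate))
         / sum((x - mean_x) ** 2 for x in years))
intercept = mean_y - slope * mean_x

def forecast(year):
    """Extrapolate the fitted trend to a future year."""
    return intercept + slope * year

print(f"Projected ESI offer rate in 2014: {forecast(2014):.1%}")
```

Real forecasting models layer many scenarios and covariates on top of this idea; the point is only that they extrapolate from past behavior rather than ask respondents what they intend to do.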

Perhaps emotions will die down and a larger percentage of employers will decide to stay with their ESI, or perhaps the forecasting models are underestimating the actual response come 2014 – time will provide that answer. McKinsey’s study should not be discounted simply because its results differ from previous predictions. And let’s not forget that the perceptions and opinions it measured are indeed those of the decision-makers themselves. However, this is a perfect example of how people can be misled by statistics and figures that give the appearance of contradiction. I’m not trying to argue which model is right or which is wrong. Both are valid and serve a valuable purpose. My point here is to shed some light on why these models differ.

Research is about providing answers and both models provide different parts of the answer.  If we allow our own emotions and preconceived notions to take control then we will lose this answer in the midst of controversy.


The Power of a Sample – Voodoo or Science?

Date: May 18, 2011 | Shawn Herbig | News

A recent study carried out by our company and The Civil Rights Project for Jefferson County Public Schools came under fire for a common misconception among those who don’t fully understand the power of random sampling. Without going into a long, drawn-out discussion of what the study entailed, the project aimed to gain an understanding of the Louisville community’s perceptions of the student assignment plan and the diversity goals it seeks to accomplish. Perhaps the methods would not have come under such scrutiny had the findings been less controversial, but regardless, the methods did indeed come under attack.

But if we take a moment to understand the science behind sampling methods, and realize that it is not voodoo magic, then I think the community can begin to focus on the real issues the study uncovered. To put it simply, sampling is indeed science. Without going into the theory of probability and the numerous mathematical assessments used to test the validity of a sample, we can say that a random sample, so long as the laws of probability and nature hold true, and some tear in the fabric of the universe has not occurred, is certainly representative of any population it attempts to embody.

Let us first begin to understand why this is so. When I taught statistics and probability to undergrads during my days as an instructor, I found I needed to keep this explanation simple – not because my students lacked the intelligence to fully understand it, but because probability theory can get a little sticky, and keeping the examples simple seemed to work best. Imagine we have a coin – a fair coin that is not weighted in any way (aside from a screw-up at the Treasury, in which case your coin could be worth a bundle of cash). We all know this example. If you flip it, you have a 50-50 chance of getting a particular side of that coin. In essence, that is a law of probability (the simplest of many).
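The coin-flip intuition is easy to check empirically: simulate a pile of flips and watch the share of heads settle toward 50%. A quick sketch (the seed is arbitrary, fixed only so the run is reproducible):

```python
import random

random.seed(42)  # arbitrary seed, fixed for reproducibility

# Flip a fair coin more and more times; the proportion of heads
# drifts toward the theoretical 0.5 as the flip count grows.
for flips in (10, 100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(flips))
    print(f"{flips:>9,} flips: {heads / flips:.3f} heads")
```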

Random sampling is the same way. While there are various methods to go about sampling a population randomly, Simple Random Sampling is the easiest and most commonly used. To put it simply, each member of a population is assigned a unique value, and a random generator picks values within a defined range (say 1 to 1,000,000). Each member of that population has an equal chance of being selected. These chosen members become the lucky ones to be a true representation of a population. They are not “chosen” in the sense that they get to drink the Koolaid and ascend beyond, but they are chosen to speak on behalf of an entire population. Pretty cool, huh?!
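In code, simple random sampling of the kind just described takes only a couple of lines. A sketch with a made-up population of one million member IDs:

```python
import random

random.seed(7)  # fixed only so the draw is reproducible

# Assign every member of a (made-up) population a unique ID, then let the
# random number generator pick: each ID has an equal chance of selection.
population = range(1, 1_000_001)         # IDs 1 through 1,000,000
sample = random.sample(population, 500)  # 500 distinct members

print(len(sample), min(sample), max(sample))
```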

These samples are representative because, well, probability tells us they are. I could spend pages and pages of your precious, valuable time discussing why this is the case, but that discussion would undoubtedly put you to sleep. However, this is why not every person in a population needs to be surveyed. And it is a great cost-saving measure when you only have to sample, say, 500 people to represent a much larger population. Here I could bore you again with monotonic relationships and exponential sampling benefits, but I will not do that. (You can thank me later.)

Now for the real bang! Say you want to measure satisfaction with city services within a small city of 50,000 people. In order to have a representative sample, all you need is a sample of 382 people (with a 5% margin of error). Now, say that you want to do the same study on the entire city of Louisville, with a population of nearly 1.5 million. What size sample do you think you need? Are you ready for this? The number is 385! Wow. Only 3 more randomly selected residents are needed for a population 30 times greater. The beauty of sampling, and the wonders of monotonic relationships! More on that later. You can play around with all sorts of sample size calculators (or do it by long hand, if you dare). I suggest this site.
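The 382-versus-385 numbers above fall out of Cochran’s sample-size formula with a finite population correction. A small sketch (assuming a 95% confidence level and the worst-case proportion of 0.5):

```python
import math

def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Cochran's formula, adjusted with the finite population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2  # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / population))

print(sample_size(50_000))       # the small city: 382
print(sample_size(1_500_000))    # metro Louisville: 385
print(sample_size(200_000_000))  # 200 million registered voters: still 385
```

This is the monotonic, rapidly flattening relationship the post is hinting at: past a few tens of thousands, population size stops mattering.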

Of course, if you want a smaller margin of error (in essence, if you want to be more confident that your sample truly reflects your population), you need a larger sample. But I’ll post a discussion on margins of error and confidence levels another day. I leave you now to ponder the brilliance of statistics!
