Good question. Being an analyst, I have become familiar with various platforms for analysis, from basic spreadsheet software, to open-source statistical packages, to top-of-the-line products, all of which serve the needs of researchers at varying levels of complexity. While spreadsheets are perfectly adequate for your “run of the mill” percentages and distributions, you certainly wouldn’t want to use them for, say, logistic regression. (Even if such programs offered the calculation, it would take a bold individual to trust the accuracy of the statistic it throws at you.)
But let’s be honest: much of the market research community only ever runs basic distribution statistics, and for that, a program such as Excel is perfectly adequate. It certainly provides the nice graphics you would want to pretty up your reports, something SPSS, for example, has not yet mastered. I have spent many a day mulling over how to create decent-looking charts in SPSS, but always to no avail. Then again, SPSS isn’t designed to be a graphical leader, but it certainly trumps Excel in analytical capacity.
Which leads me to the question: when is Excel not enough? IBM (the parent company of SPSS) recently released a white paper on the potential dangers of using spreadsheet applications in complex analytical procedures. Aside from this paper being an obvious sales pitch for SPSS, it does bring up some very valid points.
Before diving into these potential dangers, let’s first discuss what’s good about spreadsheets:
1. They are great tools for organizing data quickly and efficiently. The sorting functions and vertical lookup capacity of Excel, for example, are unmatched.
2. For a quick analysis of the basic characteristics of your data, spreadsheets can more than handle the simple calculations involved. If all you want is to report the percentage of customers who are satisfied with your product, then spreadsheets are indeed all you will need.
Okay, but what happens when I want to go beyond the basics? You can use Excel to calculate t-tests and correlations to more accurately describe relationships that exist in your data, but beyond that, Excel becomes a little risky to trust. It certainly has functions for regression analysis and all sorts of fancy forecasting tools, but to be quite frank, Excel’s algorithms for such tasks simply aren’t as advanced as the ones you would use in SPSS, SAS, or Stata. Yes, you can perform regression modeling in Excel, but statistical software packages not only fit the model, they also test the assumptions behind it, which improves the reliability of the results, and they report significance in more detail than spreadsheets can.
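To make the idea of "testing assumptions" concrete, here is a minimal sketch in plain Python. It is not how SPSS, SAS, or Stata work internally; it simply fits a straight line and then reports two common checks alongside the fit: the t-statistic for the slope and a Durbin-Watson statistic on the residuals (values near 2 suggest no serial pattern, values well below 2 suggest residuals that trend together). The data and the function names `fit_line` and `check_fit` are hypothetical and exist only for this illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def check_fit(x, y, intercept, slope):
    """Return R-squared, the slope's t-statistic, and Durbin-Watson."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    # Standard error of the slope uses n - 2 degrees of freedom.
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    # Durbin-Watson: squared successive residual differences over ss_res.
    dw = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)) / ss_res
    return {
        "r_squared": 1 - ss_res / ss_tot,
        "t_slope": slope / se_slope,
        "durbin_watson": dw,
    }


# Hypothetical data: the true relationship is curved (y = x squared),
# but we deliberately fit a straight line to it.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

intercept, slope = fit_line(x, y)
report = check_fit(x, y, intercept, slope)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

On this made-up data the straight line earns an R-squared of roughly 0.95, yet the Durbin-Watson value comes out at about 0.45, well below 2. That is a hint that the residuals follow a pattern and the linear model is missing the curvature, which is exactly the kind of warning a fit statistic alone will not give you.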
Of course, programs such as SPSS and SAS require much more knowledge and training to use to their full capabilities, but the end results will be much more illustrative and solid should the research call for a deeper analysis beyond the basics. Here at IQS, we use a myriad of programs, including spreadsheet software, but we recognize the limitations and functions of each program we use. When we create forecasting models, for instance, we recognize the complexity involved in the calculations, so we do not use Excel for such tasks. Research shows that 90% of all spreadsheets contain at least one error. And you can be certain that as calculations become more complex, the prevalence of these errors increases. The scary thing is that most of these mistakes go unnoticed, so companies can literally be making decisions based on faulty or inaccurate data.
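A rough way to see why errors become more likely as a spreadsheet grows: assume, purely for illustration, that every formula cell independently has the same small chance of being wrong. The 1% rate below is a made-up figure, not a number from the research mentioned above, and the function name `prob_at_least_one_error` is hypothetical. Under that assumption the chance that a sheet contains at least one error is 1 - (1 - p)^n for n formula cells.

```python
def prob_at_least_one_error(cell_error_rate, n_cells):
    """Chance that at least one of n_cells independent formulas is wrong."""
    return 1.0 - (1.0 - cell_error_rate) ** n_cells


# Assumed (hypothetical) per-cell error rate of 1%.
ASSUMED_CELL_ERROR_RATE = 0.01

for n_cells in (10, 100, 500):
    p = prob_at_least_one_error(ASSUMED_CELL_ERROR_RATE, n_cells)
    print(f"{n_cells:>4} formula cells -> {p:.1%} chance of at least one error")
```

Even with a small per-cell error rate, the chance of a flawless sheet shrinks quickly as the number of formulas climbs. Real spreadsheet errors are not independent, so treat this as intuition rather than an estimate.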
The simple moral of this story is this: Know the limitations of spreadsheets. While they can be the best thing in the world for some projects, they can be very dangerous and risky to use for others. We’ve all been there, realizing a mistake after a report has been released. Most of the time these mistakes are minor, but I personally would hate to be the one having to retract modeling results because of a mistake made in my calculations.
Here is a link to the article released by IBM on The Risks of Using Spreadsheets for Statistical Analysis.