Automation Paving the Way to Standardize ‘Good Quality’ Data in Surveys

By Tim McCarthy, Imperium General Manager

As brands look to better understand the rapidly evolving requirements and motivations of their customers, the need for quality data has never been more urgent.

But, with demand for respondents at an all-time high, one of the central challenges facing the industry is that there’s no baseline of data quality on which all parties can agree – and with so much left to subjective reviews, it’s easy to see how disagreements on data quality continue to proliferate.

In principle, everyone involved – sample providers, market research agencies and brands alike – wants the same thing: good respondents. In practice, this means removing those who demonstrate some level of bad behavior during a survey without eliminating otherwise strong candidates who’ve provided one or two sub-optimal responses. Spotting bad respondents manually is trickier than it sounds and extremely time-consuming. Moreover, without an agreed benchmark for quality, it’s not surprising that standards vary.

Respondents are often removed at the first sign of a poor response because reviewing the data more holistically is very labor-intensive. Such a review would determine whether a poor response was an isolated incident (potentially due to survey setup or design) or part of a broader pattern of unfavorable responses.

At Imperium, we believe that the answer lies in moving to a more automated respondent-scoring process. Reducing subjectivity and tailoring quality checks to the specifics of each survey is key to reaching a more balanced agreement on what constitutes “good” and “bad” respondents. A smoothly automated system also increases project speed while greatly decreasing the cost and duration of manual checks.

We’ve been reviewing data from our new QualityScore™ solution, and the review has produced some useful insights. We analyzed approximately 200K respondents across 125+ projects and found tangible links between the various types of behavior that produce an overall bad respondent score.

For example, our analysis revealed that those who scored in the bottom quartile for Open-Ends had a 40 percent likelihood of being in the bottom quartile for quality overall, while speeding (16 percent correlation) and straight-lining (10 percent correlation) were less reliable indicators of generally poor-quality respondents.
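To make the overlap figure concrete, here is a minimal sketch of how such a likelihood could be computed. It takes the bottom group to mean the lowest-scoring 25 percent of respondents, which is an assumption, and the function names and sample scores are hypothetical illustrations rather than QualityScore internals or real project data.

```python
def bottom_quartile(scores):
    """Return the IDs of the lowest-scoring 25 percent of respondents.

    `scores` maps respondent ID -> score (higher is better).
    Hypothetical helper for illustration only.
    """
    ranked = sorted(scores, key=scores.get)          # IDs, lowest score first
    return set(ranked[: max(1, len(ranked) // 4)])   # keep the bottom quarter


def overlap_rate(metric_scores, overall_scores):
    """Share of a metric's bottom-quartile respondents who are also
    in the bottom quartile for overall quality."""
    low_metric = bottom_quartile(metric_scores)
    low_overall = bottom_quartile(overall_scores)
    return len(low_metric & low_overall) / len(low_metric)


# Made-up scores for eight respondents (not real survey data).
open_end = {"r1": 0.10, "r2": 0.20, "r3": 0.50, "r4": 0.60,
            "r5": 0.70, "r6": 0.80, "r7": 0.90, "r8": 0.95}
overall = {"r1": 0.15, "r2": 0.60, "r3": 0.20, "r4": 0.50,
           "r5": 0.70, "r6": 0.80, "r7": 0.90, "r8": 0.85}

rate = overlap_rate(open_end, overall)
print(f"{rate:.0%} of low Open-End scorers are also low overall")
```

Running the same calculation for each behavior (Open-Ends, speeding, straight-lining) would show which single check is the strongest signal of an overall poor score.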

When it comes to banding respondents based on quality, QualityScore metrics have led us to some interesting observations. Our analysis shows that, on average, 8 percent of respondents fall into our poor-quality range, while 65 percent rate highly. This leaves roughly 27 percent whose results are less clear cut.
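Banding of this kind can be sketched in a few lines. The cut-off values, the 0 to 100 scale and the band names below are assumptions chosen purely for illustration; the article does not state the thresholds QualityScore actually uses.

```python
from collections import Counter


def band(score, poor_below=40.0, good_from=70.0):
    """Assign a quality band to one score on an assumed 0-100 scale.

    Both thresholds are hypothetical placeholders.
    """
    if score < poor_below:
        return "poor"
    if score >= good_from:
        return "good"
    return "review"  # the less clear-cut middle group


def band_shares(scores):
    """Return the fraction of respondents falling into each band."""
    counts = Counter(band(s) for s in scores)
    return {name: counts[name] / len(scores)
            for name in ("poor", "review", "good")}


# Made-up scores for ten respondents (not real survey data).
sample_scores = [10, 45, 55, 75, 80, 90, 95, 72, 88, 91]
shares = band_shares(sample_scores)
print(shares)
```

Keeping the middle band as its own category, rather than folding it into "poor", is what allows those respondents to be reviewed instead of discarded outright.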

It’s an important group, comprising respondents who may have triggered one or two flags but have nevertheless scored within an acceptable range. Ditching all of these respondents at the first sign of concern not only wastes a significant percentage of the potential respondent pool, which could make quotas harder to reach, but also risks biasing data at a time when listening to more diverse voices is critical.

Importantly, QualityScore uses machine learning to compare each respondent’s data against peers from that specific question/survey. For example, if any part of the survey is set up in a way that lends itself to straight-lining or poor Open-End responses, respondents will only be flagged for poor quality if there is other supporting evidence.
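The idea of judging a respondent against peers on the same question can be illustrated with a simple rule. This is a deliberately simplified stand-in, not the machine-learning approach described above: the function name, the 50 percent peer-rate limit and the supporting-evidence count are all hypothetical.

```python
def flag_straightliners(straightlined, other_flags,
                        peer_rate_limit=0.5, min_supporting=1):
    """Flag straight-liners on one grid question, judged relative to peers.

    straightlined: respondent ID -> True if they straight-lined the question.
    other_flags:   respondent ID -> count of unrelated quality flags.
    If most peers straight-lined too, the question design is the likely
    cause, so a respondent is flagged only with supporting evidence.
    All thresholds are assumptions for illustration.
    """
    peer_rate = sum(straightlined.values()) / len(straightlined)
    design_prone = peer_rate > peer_rate_limit

    flagged = set()
    for rid, did_straightline in straightlined.items():
        if not did_straightline:
            continue
        if design_prone and other_flags.get(rid, 0) < min_supporting:
            continue  # behavior is typical for this question; no other evidence
        flagged.add(rid)
    return flagged


# Question where three of four respondents straight-lined (design-prone).
prone = {"a": True, "b": True, "c": True, "d": False}
flags = {"a": 0, "b": 2, "c": 0}
print(flag_straightliners(prone, flags))

# Question where straight-lining is rare, so it stands out on its own.
rare = {"a": True, "b": False, "c": False, "d": False}
print(flag_straightliners(rare, flags))
```

In the first case only the respondent with additional flags is marked; in the second, the lone straight-liner is marked because the behavior is unusual for that question.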

Our data shows that clients using the fully automated QualityScore solution save about 85 percent of the time they would otherwise spend checking survey results to identify bad respondents. This provides time and cost savings for our clients. By reducing the potential for conflict between sample providers and market researchers, we also hope it will provide a sound basis for driving data quality higher for the industry as a whole.


Tim McCarthy is General Manager at Imperium. He has over 15 years of experience managing market research and data-collection services and is an expert in survey programming software, data analysis and data quality. Imperium is the foremost provider of technology services and customized solutions to panel and survey organizations, verifying personal information and restricting fraudulent online activities.
