What percentage of your population do you need to hear from?

The number one question we get asked is: “What percentage of our population do we need to hear from?” The answer always surprises people: it isn’t a percentage. Only the total number of responses matters. The target range is about 250 to 600 people, no matter the size of your population.

While population doesn’t matter, a few other things do.

Briefly

You can get a good representation of a whole community with a surprisingly small number of responses – if your data is statistically valid.

You’ve heard of a “margin of error”.

This tells you how close your sample should be to the true result you would get if you could get responses from the whole population. It takes about 267 responses to get a 6% margin of error, 384 for a 5% margin of error and 600 for a 4% margin of error.
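Those response counts can be reproduced with the textbook sample size formula for a proportion. The snippet below is a minimal sketch, assuming a 95% confidence level (z = 1.96) and the worst-case 50/50 split; the function name is made up for illustration and this is not necessarily how our calculator is implemented.

```python
# Sketch: responses needed for a target margin of error.
# Assumptions: 95% confidence (z = 1.96) and worst-case p = 0.5.
Z_95 = 1.96


def required_sample_size(margin, z=Z_95, p=0.5):
    """Return the (unrounded) number of responses for a given margin of error."""
    return z ** 2 * p * (1 - p) / margin ** 2


for margin in (0.06, 0.05, 0.04):
    n = required_sample_size(margin)
    print(f"{margin:.0%} margin of error -> about {n:.0f} responses")
```

Notice that the population size never appears in the formula.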

The number of responses isn’t the whole story though. If people decide to respond because of the topic, then you have a self-selection problem.

This means your sample of responses may not be representative of the broader population. The self-selection error can easily be so big that it turns data to junk, even when you have hundreds or thousands of responses.

You also have to worry about how input is structured. You need questions and answer choices that will give you the data you need. Even if you have hundreds of perfectly representative responses, biased questions and unbalanced answer sets can ruin your data too.

If you can achieve all three of these things you will have the reliable, statistically valid community input that you need.

More Details - 3 things your surveys need

1) You need 250 to 600 responses

Whether you have 10,000 people or 10,000,000 people in your community, 600 people is still a 4% margin of error.

Check out our margin of error tool here!

Our calculator gives you the standard margin of error with a 95% confidence level. For a margin of error of 4%, this means that 19 out of 20 times (95% of the time) that you take a sample, the data from your sample will be within 4 percentage points above or below the true answer.

The population size only starts to matter when it drops below a few thousand people, and even then a small population just makes the margin of error slightly smaller.
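To see how little the population size changes things, here is a rough sketch using the standard finite population correction. It assumes a 95% confidence level (z = 1.96) and a worst-case 50/50 split; the function name and the example population sizes are illustrative only.

```python
import math


def margin_of_error(n, population=None, z=1.96, p=0.5):
    """Margin of error for n responses, as a fraction (0.04 means 4%).

    Assumes 95% confidence (z = 1.96) and worst-case p = 0.5.
    If a population size is given, the standard finite population
    correction is applied, which can only shrink the margin.
    """
    moe = z * math.sqrt(p * (1 - p) / n)
    if population is not None:
        moe *= math.sqrt((population - n) / (population - 1))
    return moe


for population in (1_000, 10_000, 10_000_000):
    moe = margin_of_error(600, population)
    print(f"population {population:>10,}: margin of error {moe:.2%}")
```

With 600 responses the margin stays close to 4% for the two larger populations and drops noticeably only for the population of 1,000.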

To understand why population doesn’t matter, think of yourself randomly picking M&Ms out of a big 5 pound bag and keeping track of how many there are of each color as you group them on a table.

As you go from a few to dozens to hundreds on the table, you start to notice that the percentages for each color aren’t changing much with each additional M&M. If the bag you picked from happened to be bigger, nothing would change.

All that matters for measuring percentages for each color is that you have picked out a certain number of M&Ms from whatever size bag.
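If you want to try the M&M experiment without buying candy, a small simulation works too. This is only a sketch: the 20% share of green M&Ms, the bag sizes and the function name are invented for illustration.

```python
import random


def green_share(bag_size, n_picked, green_fraction=0.2, seed=1):
    """Randomly pick n_picked M&Ms from a bag and return the share that are green.

    green_fraction is a hypothetical value chosen for this example.
    """
    rng = random.Random(seed)
    n_green = round(bag_size * green_fraction)
    # Number the M&Ms 0..bag_size-1; the first n_green of them are green.
    picks = rng.sample(range(bag_size), n_picked)  # picking without replacement
    return sum(i < n_green for i in picks) / n_picked


for bag_size in (10_000, 10_000_000):
    share = green_share(bag_size, n_picked=600)
    print(f"bag of {bag_size:,}: {share:.1%} green in a sample of 600")
```

Both samples should land within a few points of the true 20%, even though one bag is a thousand times bigger than the other.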

2) You need responses that are not self-selected to the topic

How you get your sample of responses matters for statistical validity. If people self-select or self-organize to give input based on the topic, the error from self-selection can dwarf the sample size margin of error.

We had one customer with online survey data (489 responses) that told them 85% would pay for something. Putting 489 responses in the calculator gives a 4.4% margin of error so you might think the true result is about 81% to 89%. They used FlashVote to eliminate self-selection error and found out only 33% would pay.
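The arithmetic in that example is easy to check. The sketch below assumes the same standard formula as before (95% confidence, worst-case 50/50 split) and uses only the numbers quoted above.

```python
import math

n = 489           # responses in the online survey
observed = 0.85   # share who said they would pay
better_data = 0.33  # share found once self-selection was removed

# Standard margin of error, assuming 95% confidence and worst-case p = 0.5.
moe = 1.96 * math.sqrt(0.25 / n)
low, high = observed - moe, observed + moe

print(f"margin of error {moe:.1%}, interval {low:.0%} to {high:.0%}")
print("33% inside the interval?", low <= better_data <= high)
```

The 33% result sits far outside the 81% to 89% interval, which is why the margin of error alone says nothing about self-selection.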

Huge self-selection problems afflict all traditional public input like meetings, emails, social media or online surveys. The worst part is that you can’t know how bad your data is until you have good data, so most people have no idea that the data they have is junk.

Suppose you are picking out M&Ms from a bowl that someone sorted to be all green ones. It doesn’t matter how many you pick out, you’ll still get the wrong answer that all M&Ms are green. And without a regular bag sample for comparison, you end up stuck thinking that.
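A less extreme version of the sorted bowl is when supporters are simply more likely to respond than everyone else. The sketch below uses made-up response rates (supporters respond 10 times as often) purely for illustration; these are not measured values from any survey.

```python
def share_among_respondents(true_share, rate_supporters, rate_others):
    """Share of supporters among the people who actually respond.

    true_share: real share of supporters in the whole population
    rate_supporters / rate_others: hypothetical response rates for each group
    """
    supporters = true_share * rate_supporters
    others = (1 - true_share) * rate_others
    return supporters / (supporters + others)


# Hypothetical: 33% truly support, but supporters respond at 10% vs 1% for others.
observed = share_among_respondents(0.33, rate_supporters=0.10, rate_others=0.01)
print(f"true support 33%, support among respondents {observed:.0%}")
```

Collecting more responses does not fix this, because every extra response comes from the same lopsided mix.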

3) You need unbiased questions and answer choices

If you have a flawless sample of responses, with a huge number of randomly selected people giving input, you can still get junk data by using bad questions. The most common problems are leading questions (“How awesome are we?”) and unbalanced answer sets (“Really awesome OR Totally awesome”). Those are just 2 of the 23 quality control checks we use to cover everything from readability to presenting tradeoffs.

Put another way, if you pick out M&Ms in a dark room, you won’t see the colors you need to sort.

Summary

You can think of the “margin of error” as the best case, the lowest possible error you can have. If you have self-selection or bad questions on top of a low margin of error, your data can quickly turn to junk.

In the case of traditional offline and online public input, the data from the noisy, self-selected few is usually misleading. This is why local governments across the country are turning to FlashVote. With statistically valid community input in 48 hours, they can finally hear from the many, not just the noisy.