This page offers guidance on target sample sizes to organisations conducting audience and visitor surveys as part of the Impact & Insight Toolkit programme.

The target sample sizes included are specific to the mandatory dimensions in the Toolkit surveys. For that reason, the recommended sample sizes may be different to other sample size guidance which relates to different survey questions.

**Why is sample size important?**

When you carry out an audience or visitor survey, you are trying to find out some information about everyone who attended an event. For example, you may want to know the mean average score that would be awarded by audience members for the dimension ‘Captivation: It was absorbing and held my attention’.

Normally it is too expensive or impractical to collect data from every audience member. Instead, you survey a **sample** of audience members. A sample is a representative portion of a population you are interested in – in this case, the total audience. You use data from your sample to estimate attributes of the audience as a whole.

A general sampling principle is to survey as many people as possible with the resources available to you. Why is this? If you sample a small group of people, it is possible that, by coincidence, only people who had very positive or very negative feelings about the event are surveyed. Data from this sample would give an inaccurate and unfair representation of the event. The more people who are surveyed, the smaller the chance that this will happen, and the more accurately data from your sample will represent how those who attended felt.

So the larger the sample size, the better – but there comes a point where increasing the sample size doesn’t do a great deal more to improve the robustness of the results. The art of sampling is to aim for a sample size that gives you a level of confidence in the results that you are happy with – without spending too much time or money.
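The diminishing return can be illustrated with the textbook formula for the margin of error of a sample mean, which shrinks in proportion to one over the square root of the sample size. The sketch below is illustrative only: the standard deviation of 20 points, the 95% confidence multiplier of 1.96 and the function name `margin_of_error` are assumptions made for this example, not values taken from the Toolkit.

```python
import math

def margin_of_error(sample_size, std_dev=20.0, z=1.96):
    """Approximate 95% margin of error (in score points) for a sample mean.

    std_dev=20.0 is an assumed spread of scores, chosen only for illustration.
    """
    return z * std_dev / math.sqrt(sample_size)

# Each step doubles the sample size, but the margin shrinks by less each time.
for n in (25, 50, 100, 200, 400):
    print(f"sample of {n:>3}: +/- {margin_of_error(n):.1f} points")
```

Under these assumptions, quadrupling the sample from 25 to 100 cuts the margin by about 3.9 points, whereas quadrupling it again from 100 to 400 cuts it by only about 2 points.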

**Choosing an Appropriate Sample Size**

To choose an appropriate sample size, you need to consider the total audience size, the types of things you want to estimate and what ‘margin of error’ you are comfortable with around your estimates.

For example, suppose your total audience size is 500 and you want to estimate the mean average score awarded by audience members for the dimension ‘Captivation’. You can decide how many audience members to sample by deciding what margin of error to aim for. The margin of error is a statistical measure of how confident you are in your estimate of the mean Captivation score. It shows how close you think the average score of your sample is to the ‘true’ average score of all 500 people who experienced the work.

If your sample produces an estimated mean Captivation score of 70 with a margin of error of 10%, then it is likely that if you surveyed all 500 audience members you would get a ‘true’ mean Captivation score of between 63 and 77.

If your sample produces an estimated mean Captivation score of 70 with a margin of error of 5%, then it is likely that if you surveyed all 500 audience members you would get a ‘true’ mean Captivation score of between 66.5 and 73.5.
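The two ranges above follow from treating the margin of error as a percentage of the estimated score, which is how the arithmetic in these examples works out. A minimal sketch of that calculation, using a hypothetical helper named `likely_range`:

```python
def likely_range(estimated_mean, margin_pct):
    """Return (low, high) for a margin of error given as a percentage of the mean."""
    half_width = estimated_mean * margin_pct / 100
    return estimated_mean - half_width, estimated_mean + half_width

print(likely_range(70, 10))  # (63.0, 77.0)
print(likely_range(70, 5))   # (66.5, 73.5)
```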

The table below shows the minimum sample sizes required for Impact & Insight Toolkit evaluations to achieve different margins of error around dimension scores for different total audience sizes[1]. (These numbers are indicative and the actual error calculated from real events is expected to be different from the figures provided here.)

We recommend aiming for a 5% margin of error for most evaluations. However, if this is not feasible with the resources available to you then an 8% or 10% margin is still sufficient.
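To give a feel for how audience size and margin of error interact, the sketch below uses a common textbook formula for the sample size needed to estimate a mean, with a finite population correction. It is not the method behind the Toolkit figures, which is described in the Technical Appendix, so its results should not be expected to match the recommended sample sizes. The standard deviation of 20 points, the 95% multiplier of 1.96 and the function name `required_sample_size` are all assumptions.

```python
import math

def required_sample_size(audience_size, margin_points, std_dev=20.0, z=1.96):
    """Textbook sample size for estimating a mean to within +/- margin_points.

    std_dev and z are assumed values for illustration, not Toolkit parameters.
    """
    # Sample size needed if the audience were effectively unlimited.
    n0 = (z * std_dev / margin_points) ** 2
    # Finite population correction: smaller audiences need fewer responses.
    n = n0 / (1 + (n0 - 1) / audience_size)
    return math.ceil(n)

# A margin of 3.5 points is 5% of an expected mean score of 70.
print(required_sample_size(audience_size=500, margin_points=3.5))  # 101
```

Tightening the margin or increasing the assumed spread of scores raises the required sample size, while a smaller total audience lowers it.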

**Sample Sizes for In-Depth Analysis**

The table above gives minimum sample sizes required to estimate the mean dimension scores of the total audience with different margins of error. However, you may want to carry out more detailed demographic analysis of dimension scores. For example, you may want to explore whether one’s gender affects the way the work is experienced.

In this case, you would calculate the mean average score awarded for Captivation by:

- All those who identify as female in your sample
- All those who identify as male in your sample
- All those who identify in another way in your sample

You would then test to see if the differences between the mean averages were significant.
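One way to run such a test is a permutation test, sketched below for two groups. This is one possible approach rather than a method prescribed by the Toolkit, and the scores are invented purely for illustration. With three or more groups you would need to repeat the comparison for each pair or use a different test.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in mean scores.

    Returns the share of random relabellings whose difference in means is at
    least as large as the observed one (an approximate p-value).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled scores and split them into two arbitrary groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:size_a]) - mean(pooled[size_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical Captivation scores, invented purely for illustration.
scores_female = [72, 80, 65, 77, 90, 68, 74, 81]
scores_male = [70, 62, 75, 66, 71, 58, 69, 73]
p_value = permutation_test(scores_female, scores_male)
print(f"approximate p-value: {p_value:.3f}")
```

A small p-value suggests that a difference as large as the one observed would rarely arise by chance alone. With groups this small, only very large differences are likely to register as significant.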

You would want to achieve an appropriate margin of error around all your estimated mean Captivation scores. You would need sufficient and representative numbers of those of different genders in your sample – which would mean a much larger sample size overall.

If you are evaluating an event where you would like to analyse dimension scores by gender, age or any other demographic variables, please contact support@countingwhatcounts.co.uk to discuss what sample size to aim for.

**Evaluating Small Events**

If you are evaluating a small event with a total audience size of around 50, say, then it may be difficult to achieve the minimum sample size of 23 recommended in the table above. Achieving a sample size of 23 from a total audience of 50 would mean achieving a survey response rate of 46%, which may not be realistic.
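The response rate quoted above is simply the target sample size divided by the total audience size. A minimal sketch, using a hypothetical helper named `required_response_rate`:

```python
def required_response_rate(target_sample, audience_size):
    """Share of the audience that must respond to reach the target sample."""
    return target_sample / audience_size

print(f"{required_response_rate(23, 50):.0%}")  # 46%
```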

If you are evaluating a small event and cannot achieve the sample size recommended here, then you can still derive meaning from your evaluation results. For example, suppose you receive 10 completed surveys from a total audience of 50. You may not be able to estimate the average dimension scores of your total audience with much statistical confidence, but you can describe the experiences and characteristics of those 10 people, who make up a fifth of your audience – particularly if your survey included some open text questions to allow respondents to explain their thoughts and feelings about the event.

Please contact support@countingwhatcounts.co.uk if you would like help with interpreting data collected through the evaluation of small events.

[1] See Sample Size Guidance Technical Appendix for more information about how the figures in the table were calculated.

*The information on this page was last updated on 17 March, 2020.*