Over the past month we have been sharing, via our social media channels, some important points to remember as evaluation and the Toolkit are revisited this autumn. We present them here to encourage you to reflect on your upcoming evaluation practice and on how you might deepen the insight you gain when evaluating artistic and cultural experiences.
‘Low scores are bad.’
Yes, dimensions are scored on a scale of 0 – 100. Yes, we often think 100 is the best score. But it’s a little more complicated than that – context MATTERS.
For example, let’s say that the organisers of a particular event don’t intend for it to be ground-breaking. It is therefore not concerning if the event scores low on the dimension, ‘Originality: It was ground-breaking.’
Don’t forget – scores are relative. It is important to consider dimension scores as they relate to other respondent groups and other evaluations. For example, let’s say the event organisers state in the prior-event survey that they largely expect the event to be ground-breaking, giving the statement an average agreement score of 75. The public, however, gives an average score of 50. This indicates a misalignment between expectation and experience. It is therefore not the number itself, but the context that determines how ‘good’ a score is. To read a bit more about interpreting your final results, you can visit our Insights Report guidance.
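To make the comparison above concrete, here is a minimal sketch of how an expectation gap between respondent groups might be calculated. The individual scores below are invented for illustration; they are simply chosen so the group averages match the 75 and 50 in the example.

```python
# Hypothetical dimension scores (0-100) for one statement, by respondent group.
scores = {
    "self": [80, 70, 75],        # organisers' pre-event expectations
    "public": [55, 45, 60, 40],  # audience responses after the event
}

# Average score per group.
averages = {group: sum(vals) / len(vals) for group, vals in scores.items()}

# A positive gap means expectations ran ahead of audience experience.
gap = averages["self"] - averages["public"]

print(f"Self average:    {averages['self']:.0f}")    # 75
print(f"Public average:  {averages['public']:.0f}")  # 50
print(f"Expectation gap: {gap:.0f} points")          # 25
```

The number on its own says nothing; it is the 25-point gap between groups, read in context, that prompts the useful questions.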
‘Audience feedback is the only feedback that really matters.’
Usually when we think about collecting feedback, audiences first come to mind. However, as we touched on previously, the insight is in the detail. The triangulation between self, peer and public responses on the same dimension statement indicates where your organisation’s expectations are in line with the experiences of your peers and audiences. To read more on the value of self-assessment and triangulation within the Toolkit, take a look at this blogpost.
‘Peer reviewers selected for an evaluation must have the same job titles as the evaluation’s self-assessors.’
Whilst it is true that these people will likely be well-placed to evaluate your work, they are not the only ones whose feedback could be considered that of a peer.
Ask your team: who could provide valuable feedback on your work? Don’t box yourself in with assumptions about what a valid peer reviewer should look like. Just make sure to choose someone who is not directly involved in the curation or creation of the work; a peer reviewer should not be emotionally invested in it.
Browse the Peer Matching Resource to connect with peers outside of your existing professional networks. Any other questions about peer review? Head back to our guide or take a look at an NPO’s experience of peer review.
‘I have a small sample size, therefore I can’t take any insight from my data.’
You CAN still derive meaning from your evaluations even if your sample size isn’t large. For example, suppose you receive 10 completed surveys from a total audience of 50. You may not be able to estimate the average dimension scores of your total audience with much statistical confidence, but you can describe the experiences and characteristics of those 10 people, who make up a fifth of your audience – particularly if your survey included some open text questions to allow respondents to explain their thoughts and feelings about the event. Sample size targets should vary depending on the event. If you’re planning an event and want to know where you should be aiming, read our guide.
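As a rough illustration of what ten responses can and cannot tell you, here is a short sketch. The scores are invented, and the 95% multiplier and finite population correction are standard textbook assumptions rather than a Toolkit calculation; the point is simply that a small sample yields a wide but still informative interval.

```python
import math

# Hypothetical dimension scores from 10 respondents out of a total audience of 50.
scores = [62, 75, 80, 55, 90, 70, 68, 85, 72, 78]
n, N = len(scores), 50

mean = sum(scores) / n

# Sample standard deviation (Bessel's correction).
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

# Approximate 95% margin of error, with a finite population correction
# because the 10 respondents are a fifth of the whole audience.
fpc = math.sqrt((N - n) / (N - 1))
margin = 1.96 * (sd / math.sqrt(n)) * fpc

print(f"Mean score: {mean:.1f} +/- {margin:.1f} (approx. 95% interval)")
```

The interval is too wide to pin down the whole audience’s average precisely, which is exactly the caution in the paragraph above; but the ten responses themselves, especially any open-text answers, remain a rich description of a fifth of your audience.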
‘Numerical values can’t be placed on art.’
Within the Impact & Insight Toolkit, dimension statements ask respondents to state their level of agreement on a scale of 0 – 100. The numbers are indicative of attendees’ perspectives on a given piece of work. Alongside other values, they tell a more detailed story than standard survey questions do. Take a look at an NPO’s experience of using the results of dimension questions to further their organisation’s practices and funding applications.
In the words of one of our users, ‘I think the increased level of detail that metrics produce can inform practice and therefore inform our approach to developing new audiences much more effectively than the standard audience question that I have seen for years about whether people liked what they have seen.’
‘Using standardised metric statements on artistic outcomes or qualities is just a tick box system.’
By generating and triangulating self, peer and public responses to these metric statements, the evaluation approach allows for a complex measurement of sentiment and an exploration of the differences between how an organisation would rate its work and how its peers and public audiences do.
A tick box system suggests that agreement or disagreement with a dimension statement can be answered with a yes or a no. In reality, across a series of cultural experience dimensions, one individual can produce many different combinations of answers for a single piece of work.
Of course, analysis of each individual’s response is done with data summaries, but this level of granularity is retained and enables individual ‘response identities’ for every person who answers those questions. That’s without us mentioning the opportunity within the Culture Counts system for respondents to offer text-based answers on what they felt about the performance or event; or for cultural organisations to develop other bespoke and open questions to garner feedback from their peers and public. Not a tick box in sight. For an NPO’s perspective on considering the outputs achieved in their evaluation, take a look at this case study.
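A minimal sketch, with made-up numbers and dimension names, of how aggregate summaries and individual ‘response identities’ can coexist in the same dataset:

```python
from collections import Counter

# Hypothetical responses: each tuple is one person's scores (0-100)
# across three dimensions, e.g. (captivation, challenge, originality).
responses = [
    (90, 40, 70),
    (90, 40, 70),
    (30, 85, 60),
    (75, 75, 75),
]

# Summaries average across everyone, one figure per dimension...
averages = [sum(dim) / len(responses) for dim in zip(*responses)]

# ...but each person's full profile of answers is retained.
identities = Counter(responses)
print(f"{len(identities)} distinct response profiles "
      f"among {len(responses)} respondents")
```

Two respondents here happen to share a profile, yet the data still distinguishes three different ways of experiencing the same work, which a single yes/no tick could never capture.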
‘Some horrible funder is going to use all this stuff to make crude algorithmic funding judgements on cultural organisations.’
The Toolkit’s evaluation approach is not going to lead to crude league tables, or to the performance assessment uses and abuses commonly associated with them. The emphasis is on sophisticated aggregate data analysis, looking for insights at the level of, say, an artform or a mode of presentation.
This is demonstrated by our Quality Metrics National Test (QMNT) report, the precursor to the Impact & Insight Toolkit. We believe in the power of big data and fine-grained aggregate data analysis; combining the dimensions data with metadata tags (artform, location, etc.) allows us and our users to spot really interesting patterns in the data.
We know that the huge diversity of the culture sector renders crude comparisons inappropriate even within apparently similar groups of organisations – there are few, if any, ‘apples to apples’ comparisons possible in the cultural sector. Such benchmarking strips the aggregate data of its interpretative context, losing all meaning and insight in the process. We do not believe it offers insight for cultural organisations, investors, or policy makers. What we do believe is that, when carefully handled, the ‘big data’ can provide us all with conversation kick-starters about the importance of different artistic and cultural experiences.
‘We already know what our audiences think about our work, so we won’t learn anything new.’
It is always worth collecting feedback from your audiences, unless you believe that your organisation has no room for improvement! Each work is unique, each audience member who provides feedback is unique, and you can never know what audience members will say unless you give them the chance. Just because you’ve received particular responses in the past does not mean they will continue forever into the future! Additionally, collecting data over time allows you to see a wider story about how audiences have perceived your work. We would encourage you to think more about using the Toolkit to further your evaluation strategy: what do you want to learn; why do you want to learn it; how will you gather that insight?
We are always happy to engage in conversation with Toolkit users about their evaluations, what their results mean and how to develop their evaluations to ensure they speak to their organisational aims, so don’t hesitate to get in touch!