Using the Reporting Dashboard
Last modified: 08 May 2024

This guidance document takes you through using the Reporting Dashboard to create Activity Reports and Ambition Progress Reports. For information on the background and other context, please see the associated blogpost.

Reporting Dashboard Overview

To access the Reporting Dashboard, go to https://dashboard.impactandinsight.co.uk/ and log in with your Culture Counts login details.

The Reporting Dashboard has two report types to choose from, which you select using the sidebar to the left of the screen:
- Activity Report – how much evaluation activity has taken place.
- Ambition Progress Report – the results of the evaluations that have taken place.

If your Culture Counts account is associated with more than one organisation, you can also use the sidebar to select which organisation you want to report for.

Whilst the two report types look at different sorts of data, for both you can:
- Select the date range that you would like to see the data for and report on
- View a variety of tables and charts which provide analysis of the data
- Select the evaluations you would like to include in a PDF export of the report
- Create and download a PDF version of the report

We expect general use of the dashboard to flow in the order set out above: select a date range, look at the content, and, if you need to share the report with your colleagues, create a version you can share.

Date Selection

To select a date range, click on either the 'From' or 'To' box to bring up a date selector (see below). The date range defaults to 1st April 2023 until the current date. The earliest date that can be viewed is 1st April 2022.

Once you have selected a date range, you need to click the 'Update Report' button before the changes take effect. When you do, it may take some time for the data to load; a 'Loading report…' message is shown while the data loads.

Downloading the Report

The controls to download the report are at the very bottom of the page. If there are certain evaluations you do not wish to include in your download, you can exclude them here. As with the date selection, you need to click the 'Create Downloadable Report' button before anything will happen. When you click the button, it takes a few seconds for the downloadable report to be prepared, after which another button appears with the download link. The downloaded version of the report includes the same content as the online dashboard, plus a 'front page' which introduces the report.

Activity Report Guidance

This report shows the amount of evaluation activity that has taken place in the chosen reporting period. It has three sections:
- Data collection over time – visualising when your organisation was collecting data
- Total data collected – how much data was collected in total
- What is being measured – what dimensions your evaluations were using

Data collection over time

The first two charts visualise the data collected by your organisation during the date period you have selected. They are intended to show whether your organisation was collecting data consistently throughout the reporting period and across multiple different evaluations, or whether there are big gaps when no evaluation was taking place. The bars in both charts represent periods of consistent data collection.
A wide bar indicates that the evaluation was collecting data every day over multiple days. For example, in the first chart below, the highlighted box shows 8 days of consistent collection of public surveys. (The evaluations included in the examples are fictitious.)

The charts are also colour coded: in the first chart, the bars are coloured by the type of survey respondent; in the second, by the volume of data collected.

Total data collected

This section has a table showing the amount of data collected. For each evaluation, it shows how much data was collected within the specific reporting period (e.g., 1st April 2023 to 1st October 2023), and how much data that evaluation has collected in total. These numbers may differ because an evaluation might have started before, or continued after, the reporting period you are looking at. For example, in the table below, we can see that Whispering Pines has collected 309 public responses in total, but only 17 of those responses were in the chosen reporting period.

What is being measured

This section tells you which dimensions you have used in your evaluations. As with the rest of the report, it only looks at evaluations which were actively collecting data in the chosen reporting period.

There can be two charts in this section, although not everyone will have both. The first chart shows the dimensions you have used which are from the Dimensions Framework. The second chart shows the dimensions you have used which are outside the Dimensions Framework (if you have used any). All dimensions within the framework have met certain standards of quality and will provide rich benchmarks over time. For dimensions outside the framework, this isn't necessarily the case. There isn't necessarily a problem with continuing to use a dimension which isn't in the framework; just be aware of the potential limitations.

The chart shows the dimension name, the statement, and the number of times you have used it. The bars in the chart are colour coded by the Domain that the dimension comes from (e.g., Social Outcomes or Qualities). Each dimension is shown with its default statement only. For example, for Enjoyment, if you have used both 'I had a good time' and 'They had a good time', they will both be counted as the same dimension and only the default statement, 'I had a good time', will be shown. This is shown in the example below:

Ambition Progress Report Guidance

The Ambition Progress Report shows the results of all your evaluations which took place within the chosen reporting period in one place. It provides tools which allow you to:
- Compare the dimension results of different evaluations
- Compare your dimension results to benchmarks
- See which dimensions you are consistently achieving higher or lower results for
- Read written feedback from peer reviewers and self-assessors explaining why they gave the dimension scores they did, recommendations they have for improvements, and any other feedback on their experience

This report introduces a few concepts which we'll explain before going into the details of the report.

Glossary of new concepts

Benchmarking and Statistical Significance Tests

The current benchmark available for comparison comes from the entire aggregate Toolkit dataset and is not filtered by date, art form, or organisation.
Dimension results are compared to the benchmark using a statistical significance test. It is very rare for a dimension result to be exactly the same as the benchmark; it is usually above or below by some amount. A statistical test tells us whether that difference is just a coincidence or whether we are confident the difference is real ('significant' in statistical terms). This gives three potential outcomes for a test: above the benchmark, below the benchmark, or on par with the benchmark[1].

This is where sample sizes and margins of error become important. A larger sample size gives a smaller margin of error, so the statistical tests can detect smaller differences. In practice, this means that if you have bigger sample sizes you are more likely to see 'above' or 'below' results and less likely to see 'on par' results. Finally, these tests do not work well with very small sample sizes, so we only run a test for a dimension result if there are at least 25 public responses.

Dimension Performance Tracker (DPT)

This is a metric provided for each dimension you have used, and it changes each time you get a result which is significantly different to the benchmark. Results above the benchmark increase the DPT; results below the benchmark reduce it. For example, say your dimension result for Captivation is 0.87 and the benchmark is 0.83. We run a statistical test, and it confirms that this is a significant result, so your DPT for Captivation increases by 0.04 (0.87 is bigger than 0.83 by 0.04). This change to the DPT only occurs if the statistical test confirms that the difference is 'significant'; if it isn't, the DPT does not change. This type of metric is useful because it can be monitored over longer time periods to assess your organisation's performance against its chosen quality or outcome metrics.

Evaluation Rating

Each of your evaluations is given a rating out of 5. The rating is calculated by comparing the dimension results for that evaluation to the benchmarks, as follows:
- The rating starts at 2.5
- For every dimension result above the benchmark, the rating is increased by 0.5
- For every dimension result below the benchmark, the rating is reduced by 0.5
- If there are three or more dimension results on par with the benchmark, the rating is increased by 0.5

This also means that if you use fewer than 5 dimensions you cannot achieve a 5/5 rating. (A short sketch of how the DPT and rating calculations work is included below, after the Public Dimension Results section.)

Report Content

The report contains four sections:
- Dimension Usage – which dimensions are covered in the report and who was surveyed
- Public Dimension Results – comparing your public dimension results to benchmarks
- Evaluation Ratings – gives your evaluations a rating based on their public dimension results
- Individual Evaluation Reports – one or more pages for each evaluation included in the report, showing dimension results and any feedback from peer reviewers and self-assessors via the standardised free text questions

Dimension Usage

This section lists the dimensions that are included in the evaluations in the report and shows which domain they are from (e.g., Cultural Outcomes) and which respondent groups they were asked of.

Public Dimension Results

This section shows the Dimension Performance Tracker (DPT) metric for each of the dimensions you have used. It separates your dimensions into different tables, depending on their DPT (above, below or on par). These tables show the DPT values for each dimension, as well as how many results had enough data for a significance test and the overall mean average for that dimension. After the tables, there is a chart which shows how the individual evaluations contributed to the DPT. In the example below, we can see that the evaluation Silhouette's Dance contributed positively to the DPT for Relevance, but negatively to the DPT for Enthusiasm.
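Before moving on to the Evaluation Ratings section, here is a minimal Python sketch of the DPT and rating rules described in the glossary above. It is purely illustrative: the function names, the data shapes, and the clamp to the 0–5 scale are assumptions made for this example, not the Toolkit's actual code.

```python
def update_dpt(dpt: float, result: float, benchmark: float, significant: bool) -> float:
    """Move the Dimension Performance Tracker by the gap to the benchmark,
    but only when the statistical test confirmed the gap is significant."""
    if significant:
        dpt += result - benchmark   # above the benchmark raises the DPT, below lowers it
    return dpt


def evaluation_rating(outcomes: list) -> float:
    """Rate an evaluation out of 5 from its per-dimension test outcomes,
    each given as 'above', 'below' or 'on par'."""
    rating = 2.5                                  # starting point
    rating += 0.5 * outcomes.count("above")       # +0.5 per result above the benchmark
    rating -= 0.5 * outcomes.count("below")       # -0.5 per result below the benchmark
    if outcomes.count("on par") >= 3:             # three or more 'on par' results
        rating += 0.5
    return max(0.0, min(5.0, rating))             # assumed: keep the rating within 0-5


# The Captivation example from the glossary: 0.87 against a benchmark of 0.83.
print(round(update_dpt(0.0, 0.87, 0.83, significant=True), 2))   # 0.04
print(evaluation_rating(["above", "above", "on par", "below"]))  # 3.0
```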
Evaluation Ratings

This section shows the ratings for your evaluations, as introduced earlier. In addition to the rating itself, it provides a short explanation for the rating, the sample size (n), and the start and finish dates for the evaluation (note that these dates will only show if they have been added to the evaluation as a 'property').

Individual Evaluation Reports

This section creates a mini-report for every evaluation included in the report. The mini-report includes a table showing your dimension results, including the results of the significance test. It also shows the free text feedback from peer reviewers and self-assessors, where the standardised free text questions were used. This feedback is given in response to the following questions (if they were included in your surveys):
- Dimensions reasoning: "If you would like to, please explain your reasoning for the scores given."
- Programme recommendation: "Following on from your visit to this event are there any changes you would recommend to the organisation to inform its future programming?"
- Exceeding expectations: "What in particular exceeded your expectations?"
- Open feedback: "If you have any further comments about the work, please write them here."

Technical Appendix

Benchmark Creation

To create the benchmarks, we first take all the public survey responses to the statements which correspond to a given dimension, together with the evaluation each response belongs to. For example, for the dimension Enjoyment, that would be responses to the statements 'I had a good time', 'They had a good time' and 'We had a good time'. We then count the number of responses per evaluation; only evaluations with at least 30 responses for that dimension are included in the benchmark. The mean for each evaluation is then calculated. By the Central Limit Theorem, this gives us a sampling distribution for the dimension which is approximately normally distributed, so we can estimate population parameters for the distribution. Because the population standard deviation isn't known, we use a Student's t-distribution to model the dimension data. Finally, the model provides a confidence interval and margin of error around the mean for the dimension. For example:

- Dimension: Enjoyment
- Sample size threshold (per evaluation): 30 responses
- Evaluations included in benchmark: 113
- Dimension mean: 0.892
- Confidence level: 95%
- Margin of error: 1%
- Confidence interval: 0.883 to 0.902

Statistical Testing

When we compare your dimension results to the benchmark, we construct a confidence interval for your dimension result and then check whether it overlaps with the confidence interval for the benchmark. The dimension results do not follow a known parametric distribution, so we construct the confidence interval using bootstrapping[2]: we take 1,000 resamples from the original sample and use these to calculate a 95% confidence interval. If this interval does not overlap the benchmark interval, we consider the result 'significant'.
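To see the appendix steps end to end, below is a minimal Python sketch of the benchmark interval and the bootstrap comparison described above, written with numpy and scipy. The function names, the percentile-bootstrap approach, and the 0–1 score scale are illustrative choices for this example; this is not the Toolkit's actual implementation.

```python
# Illustrative sketch only; not the Toolkit's actual implementation.
import numpy as np
from scipy import stats


def benchmark_interval(evaluation_means, confidence=0.95):
    """Confidence interval for the benchmark from per-evaluation means
    (each evaluation already has at least 30 responses), using a Student's
    t-distribution because the population standard deviation is unknown."""
    evaluation_means = np.asarray(evaluation_means)
    return stats.t.interval(confidence,
                            df=len(evaluation_means) - 1,
                            loc=evaluation_means.mean(),
                            scale=stats.sem(evaluation_means))


def bootstrap_interval(responses, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for one evaluation's
    dimension result, built from resamples of the original responses."""
    responses = np.asarray(responses)
    rng = np.random.default_rng(seed)
    resample_means = np.array([
        rng.choice(responses, size=len(responses), replace=True).mean()
        for _ in range(n_resamples)
    ])
    tail = (1 - confidence) / 2
    return np.quantile(resample_means, [tail, 1 - tail])


def compare_to_benchmark(responses, evaluation_means, min_responses=25):
    """Return 'above', 'below' or 'on par' using the interval-overlap rule,
    or 'not tested' when there are too few public responses."""
    if len(responses) < min_responses:
        return "not tested"
    b_low, b_high = benchmark_interval(evaluation_means)
    r_low, r_high = bootstrap_interval(responses)
    if r_low > b_high:
        return "above"      # result interval sits entirely above the benchmark interval
    if r_high < b_low:
        return "below"      # result interval sits entirely below the benchmark interval
    return "on par"         # the intervals overlap, so the difference is not significant
```

Because bigger samples give narrower bootstrap intervals, larger sample sizes make non-overlapping ('above' or 'below') outcomes more likely, which matches the behaviour described in the glossary.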
[1] Technically this is actually "there is insufficient evidence for us to reject the null hypothesis that the dimension result is the same as the benchmark", but that is a mouthful.
[2] https://statisticsbyjim.com/hypothesis-testing/bootstrapping/