
Guest blog: Not all data and statistics can improve lives – how to tell the good from bad


Author: Professor Roger Halliday, CEO, Research Data Scotland

As with all guest blogs, what follows are the views of the authors and not those of SPICe or indeed the Scottish Parliament.

It was a pleasure last month to present a session about critical numbers and how to interrogate them effectively to MSPs at the Scottish Parliament alongside the Office for Statistics Regulation.

I was the Scottish Government's Chief Statistician for eleven years, and communicating the importance of numbers, both large and small, was a daily part of that role. Now, as CEO of the newly established Research Data Scotland, making data easier and more efficient to access is the cornerstone of our mission. To harness the power of data, society needs people who are confident in critically assessing and understanding numbers.

About Research Data Scotland

Research Data Scotland’s (RDS) mission is to connect researchers to public sector data, making it easier and quicker to access evidence to help others create policies that are more resilient for the benefit of everyone.

Data is often locked away in lots of individual systems, across many different organisations, and isn’t in a format that makes access or integration easy. 

We help researchers – be they in academic or other settings such as local government, third sector or business – find and make use of health, social care and administrative data to improve the lives of people in Scotland. This in turn can help Government and public bodies know what is working and better understand the causes of our policy challenges, because we can bring data together securely to give a person-, household-, business- or place-centred view of the world.

We are a not-for-profit charitable organisation created and funded by the Scottish Government, and this is only our second year of operation. We are a partnership between the Scottish Government, leading universities, and public bodies such as Public Health Scotland (PHS) and National Records of Scotland (NRS).

Our first year was about drawing together plans based on an understanding of the needs of researchers. This year is about delivering on those plans, as outlined in our new business plan, and a key part of this will be creating a new Researcher Access Service. This new service will be the end-to-end pathway enabling researchers to apply for and get access to secure data, replacing current inefficient systems. Our research tells us that, at present, it takes on average seven months for people to access the data they have requested.

One aim for Research Data Scotland is to attract research investment to Scotland. We have excellent data and, as described above, we are working to simplify access to that data under secure conditions. We’re also identifying arrangements that would let industry access data for research that benefits wider society. We recently reviewed current practice and found a lack of consistency across Scotland. There’s a need for a policy statement based upon best practice that is rooted in a firm legal position and public acceptability, and that would maximise the public benefit from industry using public data. We’re now working with a number of partners to develop that policy statement for Scotland.

Trust from the public and business that we will handle their data responsibly is the cornerstone of everything we do. We have established an approach for involving and engaging the public. As part of our commitment to involve members of the public directly in our work, we are also joining forces with the Scottish Centre for Administrative Data Research (SCADR) to co-host a public panel on data. This will enable us to share resources and to keep listening to how people wish their data to be used, while helping them see the benefits of research and the robust measures in place to keep data safe.

Not all data and statistics can improve lives – how to tell the good from bad

This brings us full circle to the session with MSPs and staff, hosted with SPICe. Data and statistics are an increasingly important tool in parliamentary scrutiny, but the volume and complexity of data can make it a challenge for MSPs and officials to interpret it confidently and to tell good data from bad. Being able to assess numbers presented to you in a media story or a report is a useful skill, and the Royal Statistical Society, amongst others, has some valuable resources that can help. Here are some tips and questions to ask yourself:

  • Context is key: is it a big number – how many or how much per person, or household? Is the baseline a sensible starting point to measure change over or has it been chosen to exaggerate or minimise change?
  • There are different averages: the mean and median often tell different stories. Which average is being used and why?
  • Accuracy and variability: in general, you can be more confident in data based on larger numbers, so a key question to ask is how big is the sample underpinning this number?
  • Surveys and polls: Some surveys give more reliable results than others. Some surveys are biased because the organisation commissioning the survey designs it to produce a particular result. So, ask who commissioned the survey and think about the potential biases that might bring in. Does the sample reflect the wider population? And were the questions asked appropriate?
  • Percentages: A key bit of terminology to get right in interpreting change is a percentage change versus a percentage point change. For example, VAT increased from 17.5% to 20% in January 2011. This is a rise of 2.5 percentage points, not a rise of 2.5%.
  • Correlation and causation: a significant correlation between two variables does not imply that one causes the other. Often there is a common cause for both variables, or it’s just a coincidence. For example, using data from the USA, this graph shows per capita consumption of cheese alongside deaths caused by becoming tangled in bedsheets. Is eating cheese causing Americans to become tangled in their bedsheets? You might be forgiven for thinking so, but even in a simple case like this the critical question to ask yourself is: does A cause B, or is it just a spurious correlation?
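
For readers who like to see the arithmetic, here is a small Python sketch of two of the tips above: percentage points versus percentage change, and mean versus median. The function names are my own, and the list of incomes is invented purely for illustration; only the VAT rates (17.5% rising to 20%) are real figures.

```python
from statistics import mean, median


def percentage_point_change(old_rate, new_rate):
    """Absolute difference between two rates that are already percentages."""
    return new_rate - old_rate


def percentage_change(old_value, new_value):
    """Relative change, expressed as a percentage of the old value."""
    return (new_value - old_value) / old_value * 100


# VAT rose from 17.5% to 20% in January 2011.
vat_points = percentage_point_change(17.5, 20.0)
vat_relative = percentage_change(17.5, 20.0)
print(f"VAT rise: {vat_points} percentage points, "
      f"or {vat_relative:.1f}% in relative terms")

# Hypothetical incomes in thousands of pounds (invented for illustration).
# One very high earner pulls the mean up but leaves the median alone.
incomes = [18, 21, 23, 25, 27, 30, 250]
print(f"mean = {mean(incomes):.1f}, median = {median(incomes)}")
```

The VAT rise is 2.5 percentage points but roughly 14.3% in relative terms, and in the invented income list the mean (about 56) is more than double the median (25). Which figure gets quoted can change the story considerably.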

Ask yourself:

  • What is being counted?
  • How was it counted?
  • Is it a big number?
  • How certain are we about a number or change?
  • What statistics give the picture needed?
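
On the question of certainty, a rough sketch of how sample size affects a survey percentage may help. It assumes a simple random sample and uses the standard normal approximation for a 95% margin of error; real surveys with weighting or clustering will differ, and the function name is my own.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample and the normal approximation;
    z=1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# How the uncertainty around a 50% survey result shrinks as the sample grows.
for n in (100, 1000, 10000):
    moe = margin_of_error(0.5, n)
    print(f"n = {n:>6}: 50% plus or minus {moe * 100:.1f} percentage points")
```

Under these assumptions a result of 50% from 100 people could plausibly be ten points out either way, while a sample of 1,000 narrows that to about three points. Note that a sample ten times larger only cuts the uncertainty by a factor of about three.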

Ask yourself:

  • Comparisons: what is being compared?
  • Inference: Does A cause B or is there common cause or spurious correlation?
  • Do the numbers feel right?

Figure: line chart showing the correlation between per capita cheese consumption and the number of people who died by becoming tangled in their bedsheets.
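
To illustrate how easily a shared trend can produce a strong correlation, here is a minimal Python sketch. The two series are invented numbers that both happen to rise over time; they are not the cheese or bedsheet data. Comparing year-on-year changes rather than levels is one common sanity check, not the only one, and the helper names are my own.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def changes(values):
    """Year-on-year differences, which strip out a steady upward trend."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Two invented, unrelated series that both drift upwards year after year.
series_a = [10, 12, 13, 15, 16, 18, 19, 21]
series_b = [40, 43, 47, 48, 52, 55, 57, 61]

r_levels = pearson(series_a, series_b)
r_changes = pearson(changes(series_a), changes(series_b))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The levels are almost perfectly correlated simply because both series rise over time. Once the shared trend is removed by looking at changes, the relationship largely disappears, which is a hint that neither series is driving the other.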

Official statistics are produced to high standards that avoid the potential pitfalls of bias and inaccurate reporting. It was excellent to hear from the official statistics regulator about their work ensuring that producers across the Government Statistical Service meet those standards, and calling out occasions where statistics quoted in public debate fall short of them.

I was really impressed to see a full house, with over 40 people at the breakfast briefing, and by the level of engagement from MSPs and staff. I hope that in future sessions with SPICe we can delve into emerging areas of concern such as AI, and into how we might remain curious and questioning of the information presented to us while we amass even more data and statistics.

Professor Roger Halliday, CEO, Research Data Scotland