The missing numbers
About obtaining data sets to support researchers
If you follow email lists, you might have seen a few recent requests for help finding data sets to support researchers. A new search trend is developing, and with it a need to improve data literacy.
Ben Goldacre has certainly made some recommendations in his recent report, Better, Broader, Safer: using health data for research and analysis, about the need for a national data repository where data is allocated crown copyright to support re-use, and about the need to improve data literacy. But how on earth do you go about starting a search?
Our Informatics Department uses the NHS Data Model & Dictionary for England. It is a one-stop shop for learning the lingo of data science. It also has access to different data sets. There may also be pockets of local work with their own data set collections, so it is worth asking around.
NHS Digital also hold a collection of National Data Sets for things like community, commissioning, workforce, paediatric critical care and female genital mutilation. For a more international perspective, check out WHO Data Collections and the Health Topic of OECD Data. OECD also have a useful overview, Health at a Glance 2021, of international health indicators covering health status, risk factors, access to healthcare, quality outcomes, expenditure and workforce.
Data can be generated from a wide range of sources. Google has now developed a search engine for data sets. For more unofficial lists, consider checking out the GitHub list; you will note that it also lists Medical Subject Headings as a data set and links to genomic data.
Inevitably, in libraries we also have the backup of data held in the literature and in charity reports. Data mining is something we need to consider doing to help plug some of the gaps in the official data sets. How else will we need to change search practice as research methodologies and technologies change?
Library Manager, Mid Cheshire Hospitals NHS Foundation Trust