Amid the furore over the delayed Care.data scheme, the reality is that the storage of pseudoanonymised patient data is already common practice, writes Dr David Springate. He argues that a national primary care database will bring big benefits – and says the risk of individuals’ data being de-anonymised by big pharma companies or criminals is remote.
Without a doubt, NHS England and the government have made a hash of the public-relations campaign building up to the launch of the Care.data scheme. They omitted references to opting out in the promotional literature and failed to properly discuss the risks, however remote, of sharing and securing personal data.
They also made a distinctly underwhelming case for the advantages of a national primary care database. Most of these faults probably stem from the communicators’ fundamental lack of understanding of the science and technology involved. But they have left the government and NHS England open to attacks on the integrity of the project and to accusations of simply wanting to sell off our personal and private records to the highest bidder.
The news has been full of opinion pieces on Care.data for the past few weeks. One thing that seems to have escaped the authors of all the articles I have read is that, for a sizeable proportion of the UK population, the sharing of their electronic medical records with academia and pharma is already a reality and has been for decades.
About a third of UK patients already have their electronic medical records held on the main current UK primary care databases (e.g. CPRD, THIN, QResearch), and many have their pseudoanonymised data accessible (for a fee) both to medical researchers such as myself and to private companies, including drug companies.
Such data are currently stored and accessed in almost exactly the same way that Care.data will be. In fact, apart from the scale of the project, there are only two real differences with the new system. First, the data will be kept on government-owned servers at the Health and Social Care Information Centre (HSCIC) rather than in the data centres of private or semi-private companies. Second, patients now have the explicit option to opt out, whereas with the current systems it is GPs who decide whether or not their practices as a whole will contribute data.
At the moment, in the majority of cases patients will not even be aware that their data are being collected, let alone be offered the opportunity to consent.
That said, the benefits of having a national primary care database should not be underestimated. A large number of medical charities, funding bodies and journals, including the British Heart Foundation, Alzheimer’s Society, Cancer Research UK, Nature and the Wellcome Trust, attest to this.
Primarily it will prove invaluable for the effective management of NHS resources. From a research perspective, it will mean the UK will have perhaps the most complete database of patient health care of any country and continue to be a world leader in ‘big health data’ and informatics.
There have already been well over a thousand peer-reviewed articles using UK primary care databases, and electronic medical records databases have been used to debunk bad science such as the claimed association between the MMR vaccine and autism.
The huge sample size available via Care.data means we will have greater statistical power to detect low-prevalence conditions, side effects and interactions that are currently, in effect, beyond the reach of evidence-based medicine. This extends to drug companies, who surely have a duty of care to monitor the population exposed to their products and to look for side effects and interactions with other drugs that would never be picked up in randomised controlled trials.
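To make the point about sample size concrete, here is a rough sketch in Python. It computes the chance of observing at least one case of a rare side effect among a given number of exposed patients, under the simplifying assumption that patients are independent and share the same risk. The event rate and both cohort sizes are purely illustrative numbers, not figures from Care.data or from any real trial.

```python
def prob_at_least_one_case(event_rate: float, n_exposed: int) -> float:
    """Chance of seeing at least one event among n_exposed patients.

    Assumes every patient has the same, independent per-patient risk,
    which is a simplification of real-world data.
    """
    return 1.0 - (1.0 - event_rate) ** n_exposed


# Illustrative assumptions only: a side effect affecting 1 patient in 100,000,
# a trial arm of 3,000 patients and a database with 1,000,000 exposed patients.
ASSUMED_RATE = 1 / 100_000
ASSUMED_COHORTS = {"trial arm": 3_000, "national database": 1_000_000}

for label, n_exposed in ASSUMED_COHORTS.items():
    chance = prob_at_least_one_case(ASSUMED_RATE, n_exposed)
    print(f"{label}: {n_exposed:,} exposed, chance of any case = {chance:.1%}")
```

With these assumed numbers the trial arm has only about a 3% chance of containing even a single case, while the larger cohort is almost certain to contain some. Seeing a case is of course only the first step towards showing an association, but without it there is nothing to analyse.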
One concern that the government has failed to address adequately is the availability of de-anonymised data. HSCIC will store identifiable data (as do CPRD and the other primary care databases now) but will only allow this to be released to specific agencies in exceptional circumstances, after approval from an independent advisory group. An example of how this system works now is the use of CPRD or FARSITE to identify patients who may be suitable for clinical trials.
Note that in these cases, identified data are used for linkage to a practice but are never available directly to the researcher. Again, Care.data represents only an extension of currently available systems.
A second concern is that companies will illegally de-anonymise our medical records and use them to, for example, target us better for their products (if they are a big pharma company) or skew our insurance claims (if they are an insurance company). The implausibility of this is fairly clear to anyone who works with electronic medical records or indeed any data on this scale. Illegal de-anonymisation is so complex that it is unlikely anyone could identify more than a very small proportion of patients, and even then with an unknown error rate.
The database would tell someone (willing to put a lot of work in) that, for example, there is a 90-year-old female with diabetes and dementia, on certain medication, registered with and attending a certain practice. To identify this patient, external information would be needed, and it is unclear where it would be obtained from. Would the company survey the area at random asking about patients who fit the profile?
And obviously that would only be ‘feasible’ for combinations of characteristics that are not prevalent (very old patients, numerous disorders or diseases, rare conditions and so on). It seems impossible that a company would break the law and gravely jeopardise its standing by embarking on such a complicated enterprise with practically zero returns.
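The point that only unusual combinations of characteristics single anyone out can be illustrated with a small sketch. The records below are entirely made up, and the field names (`age_band`, `sex`, `practice`, `conditions`) are hypothetical choices for illustration rather than the actual layout of Care.data or any existing database. The code simply counts how many records share each combination of characteristics and reports the combinations held by exactly one record.

```python
from collections import Counter

# Hypothetical characteristics an outsider might try to match on.
QUASI_IDENTIFIERS = ("age_band", "sex", "practice", "conditions")

# Entirely synthetic toy records for one imaginary practice "A".
records = [
    {"age_band": "40-49", "sex": "F", "practice": "A", "conditions": ("asthma",)},
    {"age_band": "40-49", "sex": "F", "practice": "A", "conditions": ("asthma",)},
    {"age_band": "40-49", "sex": "F", "practice": "A", "conditions": ("asthma",)},
    {"age_band": "60-69", "sex": "M", "practice": "A", "conditions": ("hypertension",)},
    {"age_band": "60-69", "sex": "M", "practice": "A", "conditions": ("hypertension",)},
    {"age_band": "90+", "sex": "F", "practice": "A", "conditions": ("dementia", "diabetes")},
]


def profile(record, keys=QUASI_IDENTIFIERS):
    """Return the combination of characteristics used to compare records."""
    return tuple(record[key] for key in keys)


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many records share each combination of characteristics."""
    return Counter(profile(row, keys) for row in rows)


def unique_records(rows, keys=QUASI_IDENTIFIERS):
    """Return the records whose combination is shared with nobody else."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[profile(row, keys)] == 1]


singled_out = unique_records(records)
print(f"{len(singled_out)} of {len(records)} records have a unique profile")
for row in singled_out:
    print(profile(row))
```

In this toy data only the 90+ patient with two conditions has a profile of her own; everyone else shares theirs with at least one other record. Even for her, the output is still just a profile, and attaching a name to it would need information from outside the database.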
Alternatively, ‘hackers’ might do the identification themselves and sell the data on to whoever will pay. The ludicrousness of such an argument is pretty obvious: why would someone pay for the name of the 90-year-old female mentioned above (who cannot be identified through the database alone, only with considerable external effort)? Would it make any difference to identity thieves what medication you are on? No.
They would be after your name, age, sex, occupation and so on; information that is either unavailable in the database (name, occupation) or is an absolute requisite for identifying your medical record in the database (age, sex).
On the other hand, your social media data, credit card information and iTunes, PayPal and Amazon passwords are all much easier to harvest and intrinsically more valuable, and harvesting them does not require PhD-level knowledge of large databases or months or years of training.
People’s concerns about Care.data are legitimate, and they deserve to have them properly addressed. But if this doesn’t happen and the project ultimately fails or is reduced in scope, it will be bad news for the NHS, for research and for patients in the UK and beyond.