A record number of clinical trials worldwide signals important medical advances to come. To make the most of the data, we need better ways to share it without compromising the privacy of the individuals involved in clinical trials. The virtues of sharing may seem obvious: the more studies, the more data available for analysis and research. The number of clinical trials going on today is impressive, as is the volume of data they produce.
As data from ClinicalTrials.gov indicates, more than 40,000 clinical trials worldwide were filed with government regulators in 2016. Studies in the United States made up almost 40% of the worldwide total, demonstrating the country’s focus on the search for new and better treatments.
Big data getting bigger
Since many of the studies will continue for years, the amount of valuable data produced will be enormous. The data from ongoing studies is expanding rapidly, partly because national governments are requiring that more data from publicly funded studies be shared. The number of registered studies with posted results was a little more than 1,800 when ClinicalTrials.gov began. Today, there are more than 23,500 studies with results. This may be only the tip of the iceberg, as many studies are privately funded and not required to register.
There are many benefits to sharing this abundance of data, including the following:
- allowing researchers to replicate the analysis in published studies,
- facilitating secondary analysis of individual trial data, as well as pooled trial data,
- supporting meta-analysis using individual patient data,
- providing transparency into the decision making of regulators approving new medications or indications,
- sharing trial designs for different therapeutic areas and their rationale, and
- making data available for educational purposes.
The pharmaceutical industry is a major source of data from clinical trials. Its companies share clinical trial data under a number of different data release mechanisms. These apply to both individual patient datasets and clinical reports.
These mechanisms address the needs of different stakeholders, from academic researchers to the media, citizen scientists, patients, and other companies working in the same therapeutic areas. Each mechanism has pros and cons in terms of the effort needed to get access to the data and the quality of the data that will be shared. The benefits of sharing are great, but so are the risks, such as patient privacy concerns and data breaches. The risks will never be zero, and may increase somewhat due to the sheer amount of data generated, but they can be managed at a reasonable level. That should be the unwavering commitment of those who are responsible for data protection.
Challenges and how to overcome them
The pharmaceutical industry has made significant investments in the last two years to develop the infrastructure and expertise to improve these data sharing mechanisms. There are still some common challenges that these companies face.
- Sharing clinical trial data raises patient privacy concerns. Any data shared needs to be anonymized to protect the patients’ identities. Not only is this a legal requirement, it also lowers the risk of exposure and data breaches that could result in lawsuits, financial penalties, and reputational damage.
- Privacy risks need to be managed across all of the data release mechanisms. Information about the same trial may be shared, for example, with a CRO that has signed a data sharing agreement. The availability of trial information in one form should not increase the privacy risks for patients when the trial information is also released in another form.
- There needs to be greater awareness of globally accepted standards. A number of guidelines and standards call for a risk-based approach to the de-identification of private health information. The Health Information Trust Alliance (HITRUST) recently released a de-identification framework, which organizations can use when creating, accessing, storing, or exchanging personal information. This framework has set a standard that health organizations can follow to protect patient privacy while enabling data use for secondary purposes. Other organizations, like the Institute of Medicine, PhUSE, the European Medicines Agency, and the Council of Canadian Academies, have published similar guidelines that permit sharing sensitive data while managing the risks of re-identification.
- It is not clear if regulators, such as the FDA, will also require companies to share their clinical trial data and whether their requirements will be consistent with other prevailing practices, such as those in Europe. In these cases, global healthcare organizations should position themselves as leaders by adopting the most robust data sharing protocols.
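To make the idea of a risk-based approach more concrete, here is a minimal sketch of one common way to quantify re-identification risk: group records by their quasi-identifiers and treat each record's risk as one divided by the size of its group. This is an illustration under stated assumptions, not the method prescribed by HITRUST or any of the other guidelines mentioned above. The field names, the choice of quasi-identifiers, and the sample records are all made up for the example.

```python
from collections import Counter

# Hypothetical quasi-identifiers; a real study would choose these per dataset.
QUASI_IDENTIFIERS = ("age_band", "sex", "region")


def reidentification_risk(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return (max_risk, average_risk) across all records.

    Each record's risk is 1 / (number of records that share its
    quasi-identifier values), i.e. 1 / equivalence-class size.
    """
    if not records:
        raise ValueError("records must not be empty")
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    risks = [1 / class_sizes[k] for k in keys]
    return max(risks), sum(risks) / len(risks)


# Made-up records for illustration only.
records = [
    {"age_band": "40-49", "sex": "F", "region": "North"},
    {"age_band": "40-49", "sex": "F", "region": "North"},
    {"age_band": "40-49", "sex": "F", "region": "North"},
    {"age_band": "40-49", "sex": "F", "region": "North"},
    {"age_band": "70-79", "sex": "M", "region": "South"},
]
max_risk, avg_risk = reidentification_risk(records)
print(f"max risk: {max_risk:.2f}, average risk: {avg_risk:.2f}")
```

In this toy dataset the single 70-79 record is unique on its quasi-identifiers, so the maximum risk is high even though the average is lower. A risk-based process would typically generalize or suppress values until the measured risk falls below a threshold appropriate to the release mechanism.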
New technology, processes, and governance innovations have been initiated by organizations involved in clinical trials. These innovations will allow large-scale, cost-effective data sharing to be sustainable in the longer term while protecting patient personal information.
These are some of the innovations:
- Natural Language Processing tools can now identify patient personal information in unstructured formats, such as large clinical study reports. These are being built into software tools that are already on the market.
- Meta search engines are being developed that allow the discovery of clinical trial data across multiple portals and repositories such as AllTrials and Vivli.
- There are now training and certification programs that teach techniques for removing personal information from clinical trial data. These programs follow the guidelines laid out by HITRUST, the Institute of Medicine, and the European Medicines Agency.
- Academic research funders, journals, and public health agencies, such as the National Institutes of Health, are increasingly active in supporting and promoting clinical trial data sharing efforts.
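As a rough illustration of what finding personal information in unstructured text involves, the sketch below redacts two kinds of identifiers from a sentence. It is a deliberately simplified stand-in: the commercial tools mentioned above rely on trained language models, whereas this example uses only fixed patterns. The `SUBJ-` subject ID format, the ISO date format, and the sample sentence are assumptions made for the example.

```python
import re

# Illustrative patterns only; the subject-ID format is hypothetical.
PATTERNS = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "SUBJECT_ID": re.compile(r"\bSUBJ-\d{4,}\b"),
}


def redact(text):
    """Replace each pattern match with a [LABEL] placeholder.

    Returns the redacted text and a per-label count of replacements.
    """
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


# Made-up sentence in the style of a clinical study report.
sample = "Subject SUBJ-00123 was enrolled on 2016-03-14 and withdrew on 2016-05-02."
redacted, counts = redact(sample)
print(redacted)
print(counts)
```

Fixed patterns like these miss names, free-text locations, and indirect identifiers, which is the gap that language-model-based tools aim to close.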
We are entering a new era where access to valuable new streams of clinical trial data is becoming the expected norm. We need this data because it can help create treatments for serious illnesses, including rare forms of cancer in adults and children. More available data means more risk to data privacy, but there are solutions for protecting personal information that can enable much more data sharing and medical innovation. Once these solutions are fully understood and applied, we should see significant benefits to patients, enhanced public trust in the pharmaceutical industry, and faster innovation in the discovery of new medicines.