Health Researchers: The COVID-19 Pandemic Needs an Open-Data Response

Nov. 3, 2020
Writing in the Health Affairs Blog, a team of healthcare researchers has examined the data situation surrounding COVID-19 and is urging the creation of an open-data network to help mitigate the pandemic.

On Monday, November 2, Sunyoung Pyo, Ph.D., Luigi Reggi, and Erika G. Martin, Ph.D., M.P.H., published an article in the Health Affairs Blog entitled “The Potential Role Of Open Data In the COVID-19 Pandemic: Challenges And Opportunities.” In it, they argue that a new approach to data needs to emerge in order to help U.S. healthcare system leaders get a better handle on the data that could change the trajectory of the pandemic.

“The scale and diffuse impact of the global 2019 novel coronavirus (COVID-19) pandemic is unprecedented in our lifetime,” the researchers write. “As of October 23, 2020, less than a year into the pandemic, there have been more than 41.7 million cases and more than one million deaths globally. Without a vaccine or widespread treatment access, the primary population-focused COVID-19 mitigation strategies are behavioral interventions such as restricting population mobility and encouraging good hygiene such as wearing facial coverings and washing hands frequently.”

Still, the authors note, “There is one tool for the COVID-19 response that was not as robust in past pandemics: open data. For about 15 years, a ‘quiet open data revolution’ has led to the widespread availability of governmental data that are publicly accessible, available in multiple formats, free of charge, and with unlimited use and distribution rights. The underlying logic of open data’s value is that diverse users including researchers, practitioners, journalists, application developers, entrepreneurs, and other stakeholders will synthesize the data in novel ways to develop new insights and applications. Specific products have included providing the public with information about their providers and health care facilities, spotlighting issues such as high variation in the cost of medical procedures between facilities, and integrating food safety inspection reports into Yelp to help the public make informed decisions about where to dine. It is believed that these activities will in turn empower health care consumers and improve population health.”

To begin with, there is a lot of publicly available data to work with. “One set of open data use cases is the curation of data from diverse sources for better visualization of the scope of the pandemic and information exchange,” the researchers write. “Researchers at Johns Hopkins University synthesized publicly available data from across the world into COVID-19 data dashboards displaying domestic and global trends in cumulative incidence, recoveries, and deaths; as well as supplemental interactive visualizations on topics such as flattening the curve and the availability of state-specific data on COVID-19 data by race. The New York Times has highly interactive domestic and global narrative data visualizations on the location of hotspots, local trajectories, deaths, and other outcomes. These narrative visualizations end with layperson summaries of current knowledge about the virus and how readers can reduce their risk. In Italy, the Io Conto civic platform allows directors and employees at public hospitals to report positive cases and other pandemic-related data, which has led to the provision of municipal-level data that were previously not accessible. These examples illustrate how non-governmental users including academics, journalists, and the civic tech community have creatively leveraged these open data to facilitate our understanding of the pandemic, communicate risk to the public, promote quick analysis by researchers, and enhance data quality efforts. The role of ‘open solutions’ to facilitate research and information on the virus has been highlighted by both the United Nations Educational, Scientific, and Cultural Organization (UNESCO) and the Organization for Economic Co-operation and Development (OECD),” they note.

Another set of use cases, the authors note, has been “the creation of mobile applications to empower consumers to make data-informed decisions on how to adjust their retail behaviors to reduce their personal risk. Early in the pandemic, South Korea disclosed information on places where infected persons visited, including the geocoded locations of retail shops and religious facilities. Using government data, private-sector software developers created popular mobile applications such as Corona 100m and Corona Map that send push notifications to users about the location of newly infected cases and their recent movements; for example, Corona 100m alerts users when they are within 100 meters of a location where an infected person has recently visited. Many other countries are following suit by implementing or developing their own ‘Corona apps.’ The Taiwanese government’s publicly available real-time data on face mask availability have been used by developers to create applications that allow users to know where masks are in supply and peak shopping times, with goals of reducing anxiety about mask shortages and limiting crowding in stores.”
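For readers curious how a 100-meter proximity alert of the kind described above might work in principle, the following is a minimal Python sketch. It is not the actual implementation of Corona 100m or any other app; the function names, the sample coordinates, and the use of the haversine great-circle formula are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_locations(user, visited, radius_m=100.0):
    """Return names of visited locations within radius_m of the user.

    user    -- (lat, lon) tuple for the user's current position
    visited -- iterable of (name, lat, lon) tuples (hypothetical open-data rows)
    """
    lat, lon = user
    return [name for name, vlat, vlon in visited
            if haversine_m(lat, lon, vlat, vlon) <= radius_m]


# Hypothetical example: one location roughly 56 m away, one roughly 1.1 km away.
user_position = (37.5665, 126.9780)
visited_places = [
    ("shop A", 37.5670, 126.9780),
    ("shop B", 37.5765, 126.9780),
]
alerts = nearby_locations(user_position, visited_places)
```

In this sketch only "shop A" falls inside the 100-meter radius. A real application would also need to handle location accuracy, data freshness, and the privacy concerns discussed later in this article.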

As a May 21 article in Reuters online by Hyonhee Shin, Hyunjoo Jin, and Josh Smith, entitled “How South Korea turned an urban planning system into a virus tracking database,” noted, “When a man in Seoul tested positive for the new coronavirus in May, South Korean authorities were able to confirm his wide-ranging movements in and outside the city in minutes, including five bars and clubs he visited on a recent night out. The fast response - well ahead of many other countries facing outbreaks - was the result of merging South Korea’s already advanced methods of collecting information and tracking the virus into a new data sharing system that patches together cellphone location data and credit card records. The Epidemic Investigation Support System (EISS), introduced in late March, effectively removed technological barriers to sharing that information between authorities, by building on the country’s ‘Smart City’ data system.”

Further, those reporters noted, “That platform was originally designed to let local authorities share urban planning information, from population to traffic and pollution, by uploading data in Excel spreadsheets and other formats. Now it forms the foundation for a data clearing house that has turbocharged South Korea’s response to the virus. While personal location and credit card data has been available for use by South Korean health investigators for years, previous systems required physical paperwork to request the data before it was uploaded to analytical software. That took investigators about two to three days to gather a patient’s personal data to trace their contacts. The new system digitizes the entire process, including the requests, and can reduce that time to less than an hour, officials say. Investigators can use it to analyze transmission routes and detect likely infection hotspots.”

The Health Affairs Blog writers concede that there are major obstacles to overcome, among them “major financial and staff resources and expertise,” as well as “data quality, timeliness, completeness, and availability.” In addition, they note that, “For countries such as Korea and Singapore that disclosed granular data, additional challenges have been privacy concerns and increased social stigma, which can in turn discourage community members from being tested. There is a serious risk of reidentification, with some academic researchers and human rights activists raising privacy and civil liberty concerns about the amount of detailed information released and how it might be used by governments (for example, in South Korea, information includes age, gender, time, and name of businesses frequented including whether a toilet was used and a mask was worn). There are already examples of these data leading to individual damages such as a suspension of Uber accounts in Mexico to a couple drivers who gave a ride to an infected patient and their recent passengers; internet mobs in South Korea who have used data to re-identify individuals and harass them; and the public outing of an Australian doctor by the health minister.”

All that having been said, Pyo, Reggi, and Martin argue that publicly available and other data can successfully be leveraged, if certain conditions are met. “First and foremost, it is critical to continue to advocate for government transparency,” they urge. “The White House’s recent mandate that US hospitals send data on COVID-19 hospitalizations and equipment to a new federal database developed by a private contractor at the Department of Health and Human Services, rather than to the existing National Healthcare Safety Network run by the Centers for Disease Control and Prevention, has led to concerns about the transparency of the new data collection process and the potential impact on data availability. Data transparency has also been a concern at the US state level.”

And, the researchers emphasize, “Beyond a culture of government transparency, continued investments in open data infrastructure are needed. Successful open data efforts require developing leadership, governance procedures such as open data handbooks and processes to ensure that de-identification standards have been met, ensuring the development and publication of comprehensive metadata, committing resources for start-up and ongoing maintenance, and making the publication of open data a routine government function.” Still, they insist, “[T]he pandemic can be an opportunity to renew attention on enhancing the quality, timeliness, and completeness of government-produced health data, by strengthening existing information management practices.”
