- Title
Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review.
- Authors
Sedlakova, Jana; Daniore, Paola; Horn Wintsch, Andrea; Wolf, Markus; Stanikic, Mina; Haag, Christina; Sieber, Chloé; Schneider, Gerold; Staub, Kaspar; Alois Ettlin, Dominik; Grübner, Oliver; Rinaldi, Fabio; von Wyl, Viktor
- Abstract
Digital data play an increasingly important role in advancing health research and care. However, most digital data in healthcare are unstructured and often not readily accessible for research. Unstructured data are often found in a format that lacks standardization and needs significant preprocessing and feature extraction efforts. This poses challenges when combining such data with other data sources to enhance the existing knowledge base, which we refer to as digital unstructured data enrichment. Overcoming these methodological challenges requires significant resources and may limit the ability to fully leverage the potential of such data for advancing health research and, ultimately, prevention and patient care delivery. While prevalent challenges associated with unstructured data use in health research are widely reported across the literature, a comprehensive interdisciplinary summary of such challenges and possible solutions to facilitate their use in combination with structured data sources is missing. In this study, we report findings from a systematic narrative review on the seven most prevalent challenge areas connected with digital unstructured data enrichment in the fields of cardiology, neurology and mental health, along with possible solutions to address these challenges. Based on these findings, we developed a checklist that follows the standard data flow in health research studies. This checklist aims to provide initial systematic guidance to inform early planning and feasibility assessments for health research studies aiming to combine unstructured data with existing data sources. Overall, the generality of reported unstructured data enrichment methods in the studies included in this review calls for more systematic reporting of such methods to achieve greater reproducibility in future studies.
Author summary: The digital revolution has led to an exponential growth of novel sources of data, such as data from social media or wearables. These data are mainly unstructured, which means they are not available in a pre-defined format that is easy to analyze. Digital unstructured data present an unprecedented opportunity for health researchers to enrich the existing knowledge base for studies and contribute to personalized and evidence-based medicine. We reviewed the literature to summarize challenges that researchers commonly encounter, and their possible solutions, when combining digital unstructured data with other data sources in health research. The novelty and wide availability of digital unstructured data are connected with two overarching barriers and challenges. First, digital unstructured data require novel forms of processing and standardization. Second, there is a lack of standardized guidelines, tools, or techniques for analyzing and incorporating them in research. Our review provides guidance for initial research planning aimed at researchers who wish to apply digital unstructured data enrichment in their studies, and best practices to overcome such challenges through a feasibility assessment.
- Subjects
ONLINE information services; PSYCHOLOGY information storage & retrieval systems; MANAGEMENT of medical records; SYSTEMATIC reviews; MEDICAL care research; RESEARCH funding; DESCRIPTIVE statistics; DATA analysis; MEDLINE
- Publication
PLoS Digital Health, 2023, Vol 2, Issue 10, p1
- ISSN
2767-3170
- Publication type
Article
- DOI
10.1371/journal.pdig.0000347