- Title
A Transparent and Adaptable Method to Extract Colonoscopy and Pathology Data Using Natural Language Processing.
- Authors
Fevrier, Helene B.; Liu, Liyan; Herrinton, Lisa J.; Li, Dan
- Abstract
Key variables recorded as text in colonoscopy and pathology reports have been extracted using natural language processing (NLP) tools that were not easily adaptable to new settings. We aimed to develop a reliable NLP tool with broad adaptability. During 1996–2016, Kaiser Permanente Northern California performed 401,566 colonoscopies with linked pathology. We randomly sampled 1000 linked reports into a Training Set and developed an NLP tool using SAS® PERL regular expressions. The NLP tool captured five colonoscopy and pathology variables: type, size, and location of polyps; extent of procedure; and quality of bowel preparation. We used a Validation Set (N = 3000) to confirm the variables' classifications using manual chart review as the reference. Performance of the NLP tool was assessed using the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen's κ. Cohen's κ ranged from 93 to 99%. The sensitivity and specificity ranged from 95 to 100% across all categories. For categories with prevalence exceeding 10%, the PPV ranged from 97% to 100% except for adequate quality of preparation (prevalence 92%), for which the PPV was 65%. For categories with prevalence below 10%, the PPVs ranged from 62% to 100%. NPVs ranged from 94% to 100% except for the "complete" extent of procedure, for which the NPV was 73%. Using information from a large community-based population, we developed a transparent and adaptable NLP tool for extracting five colonoscopy and pathology variables. The tool can be readily tested in other healthcare settings.
- Subjects
COLONOSCOPY; DATABASE management; INFORMATION retrieval; MEDICAL records; NATURAL language processing; PREDICTIVE tests; ACQUISITION of data methodology
- Publication
Journal of Medical Systems, 2020, Vol 44, Issue 9
- ISSN
0148-5598
- Publication type
Article
- DOI
10.1007/s10916-020-01604-8
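
The abstract reports sensitivity, specificity, PPV, NPV, and Cohen's κ for each category, with manual chart review as the reference. As a minimal sketch of how these five measures follow from a 2×2 comparison of NLP output against chart review, the Python below computes them from raw counts. The names `Confusion` and `agreement_metrics` are hypothetical, and the counts in the usage example are invented for illustration; they are not taken from the paper, which built its tool with SAS PERL regular expressions.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 counts for one category, with chart review as the reference."""
    tp: int  # NLP positive, chart review positive
    fp: int  # NLP positive, chart review negative
    fn: int  # NLP negative, chart review positive
    tn: int  # NLP negative, chart review negative


def agreement_metrics(c: Confusion) -> dict:
    """Return sensitivity, specificity, PPV, NPV, and Cohen's kappa."""
    n = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (c.tp + c.tn) / n
    expected = (
        (c.tp + c.fp) * (c.tp + c.fn)    # both say positive by chance
        + (c.fn + c.tn) * (c.fp + c.tn)  # both say negative by chance
    ) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
    }


# Illustrative counts only (not from the study).
example = agreement_metrics(Confusion(tp=40, fp=10, fn=5, tn=45))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on how common a category is, while sensitivity and specificity do not, the same tool can show high sensitivity and specificity alongside a lower PPV or NPV for some categories, as the abstract reports.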