Clinical data reuse for clinical research (PHIS+ pneumonia and acute appendicitis)

The PHIS+ project is enriching an administrative database with clinical data from six pediatric hospitals (Boston Children’s Hospital, Children’s Hospital of Philadelphia, Children’s Hospital of Pittsburgh, Cincinnati Children’s Hospital Medical Center, Primary Children’s Hospital (Salt Lake City, UT), and Seattle Children’s Hospital) to enable comparative effectiveness studies in the pediatric population. Clinical data in the PHIS+ repository is standardized and harmonized using biomedical terminologies and common data models. Unlike laboratory results, which are available in discrete format for comparative effectiveness research analyses, imaging reports are available only as narrative clinical text and lack standardization in structure and format. To allow efficient and rapid access to the information in these reports, we developed natural language processing (NLP) applications. Our efforts focused on two pediatric diseases: bacterial community-acquired pneumonia and complicated acute appendicitis.

Appendicitis is the most common abdominal surgical emergency in children, and its treatment is associated with relatively high resource utilization because of the high incidence of the disease and the morbidity of perforated appendicitis. Increasing evidence suggests that oral antibiotic therapy may be as efficacious as intravenous antibiotics administered through a peripherally inserted central catheter (PICC), at a lower cost and with fewer complications. Our NLP application focused on detecting patients treated with a PICC and achieved significantly higher sensitivity than methods based on existing structured, coded clinical data.

Community-acquired pneumonia (CAP) is a leading cause of hospitalization among children, but the effectiveness of common management strategies is unknown. Our NLP application was developed to identify bacterial pneumonia from pediatric diagnostic imaging reports by extracting pneumonia characteristics (i.e., presence, symmetry, and size of pleural effusion and pulmonary infiltrate). The application detected patients with probable bacterial pneumonia with significantly higher sensitivity than domain experts (71% versus 52.7%), and with similar positive predictive value and specificity.
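For readers less familiar with the evaluation measures named above, the following minimal sketch shows how sensitivity, specificity, and positive predictive value are computed from counts against a reference standard. The class name, function names, and counts are hypothetical and chosen only for illustration; they are not taken from the PHIS+ studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of classifier decisions against a reference standard."""
    true_pos: int   # disease present, classifier says present
    false_pos: int  # disease absent, classifier says present
    true_neg: int   # disease absent, classifier says absent
    false_neg: int  # disease present, classifier says absent


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly positive cases that were detected."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of truly negative cases that were correctly ruled out."""
    return c.true_neg / (c.true_neg + c.false_pos)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of positive calls that were truly positive."""
    return c.true_pos / (c.true_pos + c.false_pos)


# Hypothetical counts for illustration only (not study data).
counts = ConfusionCounts(true_pos=40, false_pos=10, true_neg=140, false_neg=10)

print(f"sensitivity: {sensitivity(counts):.3f}")
print(f"specificity: {specificity(counts):.3f}")
print(f"PPV:         {positive_predictive_value(counts):.3f}")
```

The sketch does not guard against empty denominators; a real evaluation would need to handle a sample with no positive cases or no positive calls.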



  • Meystre, S. M., Khalifa, A., Gouripeddi, R., & Rangel, S. D. (2015). Automatic Extraction of Pediatric Acute Appendicitis Treatment Devices from Diagnostic Imaging Reports in a Multi-Institutional Clinical Repository. AMIA Summit on Translational Bioinformatics. San Francisco, CA.
  • Meystre, S. M., Gouripeddi, R., Shah, S. S., & Mitchell, J. A. (2013). Automatic Pediatric Pneumonia Characteristics Extraction from Diagnostic Imaging Reports in a Multi-Institutional Clinical Repository (p. 178). Presented at the AMIA Summits Transl Sci Proc, CRI.
  • Meystre, S. M., Gouripeddi, R., Trivedi, A., & Rangel, S. D. (2013). Pediatric Acute Appendicitis Treatment Devices Automatic Extraction from Diagnostic Imaging Reports in a Multi-Institutional Clinical Repository (p. 999). Presented at the AMIA Annu Symp Proc, Washington, DC.