Transfer Learning NLP to Improve Adoption of Clinical Text in Multi-Site Studies

Page last updated April 14, 2026

Study Design: Other, Data Science
PCORnet Infrastructure: Common Data Model (CDM), Patient partners or engagement
Principal Investigator: Yonghui Wu
Institution: University of Florida and University of Florida Health
PCORnet® Network Partner: OneFlorida+
Funder: Patient-Centered Outcomes Research Institute (PCORI); (Project webpage)
Funding Date: 2024
Study Duration: 2024 – 2027
Participating PCORnet® Clinical Research Networks: INSIGHT, OneFlorida+
Therapeutic Area: Data Science
Status: Active, not recruiting

Research Question(s): Can large language models, an advanced form of artificial intelligence (AI), improve the generalizability of patient information extraction and clinical phenotyping across different healthcare systems?

Semantic Data Quality Standards for Multi-Center Clinical Research Studies and Networks

Page last updated May 11, 2026

Study Design: Other, Methods to improve study design, Methods to support data research networks
PCORnet Infrastructure: Common Data Model (CDM), Single IRB, Patient partners or engagement, Clinical Research Collaboration Agreement
Principal Investigator: L. Charles Bailey
Institution: The Children's Hospital of Philadelphia
PCORnet® Network Partner: PEDSnet
Funder: Patient-Centered Outcomes Research Institute (PCORI); (Project webpage)
Funding Date: 2021
Study Duration: 2021 – 2026
Participating PCORnet® Clinical Research Networks: PEDSnet
Therapeutic Area: Data Science
Condition: Data quality assessment, data quality analysis, data quality reporting, standards development
Status: Active, not recruiting

Research Question(s):

  1. Can we find ways to more accurately describe how suitable data are to answer a specific research question?
  2. What are the tools that can be used across studies to consistently describe whether the data are high quality?

Primary Publication(s):

Razzaghi H, Dickinson K, Wieand K, et al. A multifaceted approach to advancing data quality and fitness standards in multi-institutional networks. J Am Med Inform Assoc. 2025;ocaf181. doi:10.1093/jamia/ocaf181

Razzaghi H, Wieand K, Dickinson KL, et al. Beyond missingness: systematizing methods for comprehensive data fitness assessment in clinical research. J Med Internet Res. 2026;28(1):e76398. Published April 14, 2026. doi:10.2196/76398