In phases 1 and 2, we included adults without diabetes who had an in-person office visit at one of 19 primary care clinics at an academic medical center and at least one HbA1c of 5.7–6.4% between 7/1/2016 and 12/31/2018. We based the initial keyword search strategy on the authors’ clinical experience (Table 1). In phase 1, we identified and extracted primary care provider (PCP) encounter notes matching ≥ 1 keyword from two clinics. We identified additional keywords through chart review of a random sample of notes that contained no keyword but belonged to patients meeting the inclusion/exclusion criteria. The Supplement provides additional details.
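The keyword-matching step described above can be sketched as follows. This is a minimal illustration, not the study's extraction code: the keyword list is a hypothetical stand-in for the terms in Table 1, and the function names `matched_keywords` and `notes_with_keyword` are assumptions introduced here.

```python
import re

# Hypothetical keywords for illustration only; the study's actual list is in Table 1.
KEYWORDS = ["prediabetes", "pre-diabetes", "impaired fasting glucose", "borderline diabetes"]

# One case-insensitive pattern; word boundaries avoid matches inside longer words.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def matched_keywords(note_text):
    """Return the sorted, lower-cased keywords found in one note."""
    return sorted({m.group(0).lower() for m in PATTERN.finditer(note_text)})

def notes_with_keyword(notes):
    """Keep only notes matching >= 1 keyword; `notes` maps note id -> note text."""
    selected = {}
    for note_id, text in notes.items():
        hits = matched_keywords(text)
        if hits:
            selected[note_id] = hits
    return selected
```

Notes left out of the returned mapping are the ones a reviewer would sample from when looking for keywords the initial list missed.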
Table 1 Keywords Included in Search Strategy and Frequency of Keywords Matching to Clinical Discussion About Prediabetes
In phase 2, using data from 17 other clinics, we extracted the first PCP visit note following lab results indicating prediabetes (n = 1095 encounters) and applied the updated keyword search strategy (n = 391 encounters). Two reviewers (E.T. and J.L.S.) manually annotated the notes to determine whether they contained clinical discussions of prediabetes. We applied NLP techniques using machine learning to replicate human annotation.3 To reduce overfitting and classification bias and to assess internal validity, we applied 10-fold cross-validation, which rotates the shuffled data through training and test sets.4 We selected logistic regression and bidirectional recurrent neural networks based on performance. We evaluated classification results using sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
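As a rough illustration of the evaluation described above, the sketch below shows how shuffled 10-fold splits and the four reported metrics could be computed. It assumes binary labels (1 = note contains a prediabetes discussion, 0 = it does not); the function names are hypothetical, and the study's own models and tooling are not reproduced here.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once, then yield (train, test) index lists for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels (1 = discussion present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined when the denominator is zero (e.g. no predicted positives).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

In each fold, a classifier would be fit on the `train` indices and its predictions on the `test` indices compared against the human annotations with `classification_metrics`.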
Two reviewers (E.T. and J.L.S.) reviewed each note from phase 2 to describe the prediabetes discussions: (1) labs ordered/reviewed (HbA1c or fasting glucose), (2) lifestyle counseling, (3) diabetes prevention program (DPP) discussion/referral, (4) nutrition discussion/referral, and (5) metformin discussion or ordering/continuation. We calculated proportions using the number of patients with a documented discussion about prediabetes as the denominator and the number of patients who had each type of discussion listed above as the numerator (Stata, version 15). This study was approved by the Johns Hopkins IRB.
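The proportion calculation described above amounts to the following. The study used Stata; this Python sketch is only an illustration, and the category labels and the function name `discussion_proportions` are assumptions introduced here.

```python
from collections import Counter

# Hypothetical short labels for the five discussion types listed above.
DISCUSSION_TYPES = ["labs", "lifestyle_counseling", "dpp", "nutrition", "metformin"]

def discussion_proportions(patients):
    """Proportion of patients with each discussion type.

    `patients` holds one set of labels per patient who had a documented
    prediabetes discussion, so its length is the denominator.
    """
    denominator = len(patients)
    if denominator == 0:
        return {}
    # Sets ensure each patient is counted at most once per discussion type.
    counts = Counter(label for labels in patients for label in set(labels))
    return {d: counts[d] / denominator for d in DISCUSSION_TYPES}
```

Because a patient can have more than one type of discussion, the proportions need not sum to 1.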