Application of Information Retrieval Approaches to Case Classification in the Vaccine Adverse Event Reporting System
- First Online:
- Cite this article as:
- Botsis, T., Woo, E.J. & Ball, R. Drug Saf (2013) 36: 573. doi:10.1007/s40264-013-0064-4
Automating the classification of adverse event reports is an important step to improve the efficiency of vaccine safety surveillance. Previously we showed it was possible to classify reports using features extracted from the text of the reports.
The aim of this study was to use the information encoded in the Medical Dictionary for Regulatory Activities (MedDRA®) in the US Vaccine Adverse Event Reporting System (VAERS) to support and evaluate two classification approaches: a multiple information retrieval strategy and a rule-based approach. To evaluate the performance of these approaches, we selected the conditions of anaphylaxis and Guillain–Barré syndrome (GBS).
We used MedDRA® Preferred Terms stored in the VAERS, and two standardized medical terminologies: the Brighton Collaboration (BC) case definitions and Standardized MedDRA® Queries (SMQ) to classify two sets of reports for GBS and anaphylaxis. Two approaches were used: (i) the rule-based instruments that are available by the two terminologies (the Automatic Brighton Classification [ABC] tool and the SMQ algorithms); and (ii) the vector space model.
We found that the rule-based instruments, particularly the SMQ algorithms, achieved a high degree of specificity; however, there was a cost in terms of sensitivity in all but the narrow GBS SMQ algorithm that outperformed the remaining approaches (sensitivity in the testing set was equal to 99.06 % for this algorithm vs. 93.40 % for the vector space model). In the case of anaphylaxis, the vector space model achieved higher sensitivity compared with the best values of both the ABC tool and the SMQ algorithms in the testing set (86.44 % vs. 64.11 % and 52.54 %, respectively).
Our results showed the superiority of the vector space model over the existing rule-based approaches irrespective of the standardized medical knowledge represented by either the SMQ or the BC case definition. The vector space model might make automation of case definitions for spontaneous report review more efficient than current rule-based approaches, allowing more time for critical assessment and decision making by pharmacovigilance experts.