Cross-Sectional Relatedness Between Sentences in Breast Radiology Reports: Development of an SVM Classifier and Evaluation Against Annotations of Five Breast Radiologists

Journal of Digital Imaging

Abstract

This paper introduces the notion of cross-sectional relatedness, an informational dependence relation between sentences in the conclusion section of a breast radiology report and sentences in the findings section of the same report. It assesses the inter-rater agreement of breast radiologists on this relation, and develops and evaluates a support vector machine (SVM) classifier for detecting it automatically. A standard reference is manually created by the first author from 444 breast radiology reports. A subset of 37 reports is annotated by five breast radiologists. Inter-rater agreement is computed among their annotations and the standard reference. Thirteen numerical features are developed to characterize pairs of sentences; the optimal feature set is sought through forward selection. Inter-rater agreement, measured as F-measure, is 0.623. The SVM classifier achieves an F-measure of 0.699 against the standard reference in a 12-fold cross-validation protocol. Report length does not correlate with the classifier’s performance (correlation coefficient = −0.073). Against the annotations of the breast radiologists, the SVM classifier has an average F-measure of 0.505. The mediocre inter-rater agreement is possibly caused by (1) a definition that is insufficiently actionable, (2) the fine-grained nature of cross-sectional relatedness at the sentence level rather than, for instance, the paragraph level, and (3) the higher-than-average complexity of the 37-report sample. The SVM classifier performs better against the standard reference than against the breast radiologists’ annotations, which supports (3). The SVM’s performance on the standard reference is satisfactory. Since the optimal feature set is not breast specific, the results may transfer to non-breast anatomies. Applications include a smart report viewing environment and data mining.
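The F-measure used above as an agreement score can be illustrated with a minimal Python sketch. It assumes that each annotation is a set of (conclusion sentence index, findings sentence index) pairs marked as related, and that agreement among several raters is the mean F-measure over all pairs of raters. The function names, the toy data, and the pairwise averaging are illustrative assumptions; the abstract does not specify how the reported figures were aggregated.

```python
from itertools import combinations


def f_measure(predicted, reference):
    """F-measure between two sets of (conclusion_idx, finding_idx) pairs."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        # Assumption: two empty annotations count as full agreement.
        return 1.0
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)


def mean_pairwise_f(annotations):
    """Average F-measure over all unordered pairs of annotators."""
    scores = [f_measure(a, b) for a, b in combinations(annotations.values(), 2)]
    return sum(scores) / len(scores)


# Hypothetical toy annotations for one report (not data from the study).
annotations = {
    "rater_a": {(0, 1), (0, 2), (1, 4)},
    "rater_b": {(0, 1), (1, 4), (1, 5)},
    "rater_c": {(0, 1)},
}

if __name__ == "__main__":
    print(round(mean_pairwise_f(annotations), 3))
```

Because the F-measure is symmetric in its two arguments, it does not matter which rater is treated as the reference in each pair. The same `f_measure` function could score a classifier's predicted pairs against the standard reference.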

Notes

  1. http://www.nist.gov/tac/2010/RTE/

Acknowledgments

The authors gratefully acknowledge Yassine Benajiba, Steffen Pauws, and the anonymous referees for valuable comments on an earlier version of this paper.

Author information

Corresponding author

Correspondence to Merlijn Sevenster.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

ESM 1 (DOC 424 kb)

ESM 2 (DOC 452 kb)

About this article

Cite this article

Sevenster, M., Qian, Y., Abe, H. et al. Cross-Sectional Relatedness Between Sentences in Breast Radiology Reports: Development of an SVM Classifier and Evaluation Against Annotations of Five Breast Radiologists. J Digit Imaging 26, 977–988 (2013). https://doi.org/10.1007/s10278-013-9612-9

  • DOI: https://doi.org/10.1007/s10278-013-9612-9
