Modeling Perceptual Similarity Measures in CT Images of Focal Liver Lesions

Journal of Digital Imaging

Abstract

Motivation: A gold standard for perceptual similarity in medical images is vital to content-based image retrieval, but inter-reader variability complicates its development. Our objective was to develop a statistical model that predicts the number of readers (N) necessary to achieve acceptable levels of variability. Materials and Methods: We collected three radiologists’ ratings of the perceptual similarity of 171 pairs of CT images of focal liver lesions, rated on a 9-point scale. We modeled the readers’ scores as bimodal distributions in additive Gaussian noise and estimated the distribution parameters from the scores using an expectation maximization algorithm. We (a) sampled 171 similarity scores to simulate a ground truth and (b) simulated readers by adding noise, with standard deviation (σ) between 0 and 5 for each reader. We computed the mean values of 2–50 readers’ scores and calculated the agreement (AGT) between these means and the simulated ground truth, as well as the inter-reader agreement (IRA), using Cohen’s kappa (κ). Results: IRA for the empirical data ranged from κ = 0.41 to 0.66. For σ between 1.5 and 2.5, IRA among three simulated readers was comparable to agreement in the empirical data. For these values of σ, AGT ranged from κ = 0.81 to 0.91. As expected, AGT increased with N, ranging from κ = 0.83 to 0.92 for N = 2 to 50, respectively, with σ = 2. Conclusion: Our simulations demonstrated that even with only moderate to good IRA, excellent AGT could be obtained. This model may be used to predict the N required to accurately evaluate similarity in datasets of arbitrary size.
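
The simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The mixture parameters (means 2.5 and 7.0, unit standard deviations, equal weights) are hypothetical placeholders for the values the authors estimated with expectation maximization, which the abstract does not report. The use of a linearly weighted kappa, and the rounding and clipping of scores to the 9-point scale, are also assumptions. The function names `weighted_kappa` and `simulate_agt` are introduced here for illustration only.

```python
import numpy as np


def weighted_kappa(a, b, n_cat=9):
    """Linearly weighted Cohen's kappa for integer ratings in 1..n_cat.

    Assumption: the abstract names Cohen's kappa but not the weighting scheme.
    """
    a = np.asarray(a) - 1
    b = np.asarray(b) - 1
    observed = np.zeros((n_cat, n_cat))
    np.add.at(observed, (a, b), 1)  # joint histogram of the two ratings
    observed /= observed.sum()
    # Chance agreement: outer product of the two marginal distributions.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    idx = np.arange(n_cat)
    # Agreement weights: 1 on the diagonal, falling linearly to 0.
    weights = 1.0 - np.abs(idx[:, None] - idx[None, :]) / (n_cat - 1)
    p_obs = (weights * observed).sum()
    p_exp = (weights * expected).sum()
    return (p_obs - p_exp) / (1.0 - p_exp)


def simulate_agt(n_readers, sigma, n_pairs=171, seed=0):
    """Agreement (AGT) between the mean of simulated readers and a simulated truth."""
    rng = np.random.default_rng(seed)
    # Hypothetical bimodal ground truth: equal mixture of two Gaussians.
    low_mode = rng.random(n_pairs) < 0.5
    truth = np.where(low_mode,
                     rng.normal(2.5, 1.0, n_pairs),
                     rng.normal(7.0, 1.0, n_pairs))
    truth = np.clip(truth, 1, 9)
    # Each simulated reader sees the truth plus independent Gaussian noise.
    noise = rng.normal(0.0, sigma, (n_readers, n_pairs))
    readers = np.clip(np.rint(truth + noise), 1, 9)
    mean_scores = np.clip(np.rint(readers.mean(axis=0)), 1, 9).astype(int)
    truth_scores = np.rint(truth).astype(int)
    return weighted_kappa(mean_scores, truth_scores)


if __name__ == "__main__":
    for n in (2, 3, 10, 50):
        print(f"N = {n:2d}, sigma = 2: AGT kappa = {simulate_agt(n, 2.0):.2f}")
```

Because the mixture parameters and kappa weighting are assumed, the printed values will not reproduce the figures reported in the abstract. The sketch only shows the mechanism by which averaging more readers reduces the effect of per-reader noise.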

Acknowledgments

We are grateful to the following people for participating in our study: Aya Kamaya MD, Grace Tye MD, and Ronald Summers MD, PhD. We would like to acknowledge the following funding sources for supporting this project: SIIM 2011–2012 Research Grant and NIH Training Grant T32 GM063495.

Author information

Corresponding author

Correspondence to Jessica Faruque.

About this article

Cite this article

Faruque, J., Rubin, D.L., Beaulieu, C.F. et al. Modeling Perceptual Similarity Measures in CT Images of Focal Liver Lesions. J Digit Imaging 26, 714–720 (2013). https://doi.org/10.1007/s10278-012-9557-4

  • DOI: https://doi.org/10.1007/s10278-012-9557-4
