DeepScope: Nonintrusive Whole Slide Saliency Annotation and Prediction from Pathologists at the Microscope
Modern digital pathology departments have grown to produce whole-slide image data at petabyte scale, an unprecedented treasure chest for medical machine learning tasks. Unfortunately, most digital slides are not annotated at the image level, hindering large-scale application of supervised learning. Manual labeling is prohibitive, requiring pathologists with decades of training and outstanding clinical service responsibilities. This problem is further aggravated by the United States Food and Drug Administration’s ruling that primary diagnosis must come from a glass slide rather than a digital image. We present the first end-to-end framework to overcome this problem, gathering annotations in a nonintrusive manner during a pathologist’s routine clinical work: (i) microscope-specific 3D-printed commodity camera mounts are used to video record the glass-slide-based clinical diagnosis process; (ii) after routine scanning of the whole slide, the video frames are registered to the digital slide; (iii) motion and observation time are estimated to generate a spatial and temporal saliency map of the whole slide. Demonstrating the utility of these annotations, we train a convolutional neural network that detects diagnosis-relevant salient regions, then report accuracy of 85.15% in bladder and 91.40% in prostate, with 75.00% accuracy when training on prostate but predicting in bladder, despite different pathologists examining the different tissues. When training on one patient but testing on another, AUROC in bladder is 0.79 ± 0.11 and in prostate is 0.96 ± 0.04. Our tool is available at https://bitbucket.org/aschaumberg/deepscope.
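As a rough illustration of step (iii), the sketch below turns per-frame slide positions into a coarse observation-time map. It is not the DeepScope implementation: it assumes registration has already produced one slide coordinate per video frame, and the function name `dwell_time_saliency`, the grid cell size, and the speed threshold are all hypothetical choices made for the example.

```python
import numpy as np

def dwell_time_saliency(positions, fps, slide_shape, cell=100, max_speed=50.0):
    """Accumulate observation time (seconds) per grid cell.

    positions   : (N, 2) sequence of (x, y) slide-pixel coordinates, one per
                  video frame (assumed output of a prior registration step).
    fps         : video frame rate in frames per second.
    slide_shape : (height, width) of the digital slide in pixels.
    cell        : side length of a grid cell in pixels (hypothetical value).
    max_speed   : frames moving faster than this many pixels/second are
                  treated as navigation rather than observation (hypothetical).
    """
    positions = np.asarray(positions, dtype=float)

    # Speed of each frame relative to the previous one, in pixels/second.
    # The first frame has no predecessor, so its speed is taken as zero.
    steps = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    speed = np.concatenate([[0.0], steps]) * fps
    still = speed <= max_speed

    rows = int(np.ceil(slide_shape[0] / cell))
    cols = int(np.ceil(slide_shape[1] / cell))
    saliency = np.zeros((rows, cols))

    # Map each stationary frame to its grid cell and add one frame duration.
    r = np.clip((positions[still, 1] // cell).astype(int), 0, rows - 1)
    c = np.clip((positions[still, 0] // cell).astype(int), 0, cols - 1)
    np.add.at(saliency, (r, c), 1.0 / fps)
    return saliency

# Toy trace: 1 s of viewing at one spot, a fast pan, then 0.5 s at another.
trace = [(50, 50)] * 30 + [(50 + 100 * i, 50) for i in range(1, 6)] + [(550, 50)] * 15
saliency = dwell_time_saliency(trace, fps=30, slide_shape=(200, 1000))
print(saliency[0, 0], saliency[0, 5])
```

In this toy trace the fast pan contributes nothing, so only the two cells where the view rested accumulate time. A real pipeline would estimate motion from the video itself and could weight cells differently; this sketch only shows the bookkeeping.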
AJS was supported by NIH/NCI grant F31CA214029 and the Tri-Institutional Training Program in Computational Biology and Medicine (via NIH training grant T32GM083937). This research was funded in part through the NIH/NCI Cancer Center Support Grant P30CA008748. AJS thanks Terrie Wheeler, Du Cheng, and the Medical Student Executive Committee of Weill Cornell Medical College for free 3D printing access, instruction, and support. AJS thanks Mariam Aly for taking the photo of the camera on the orange 3D-printed mount in Fig. 1, and for discussion of attention. We acknowledge fair use of part of a doctor stick figure image in Fig. 1 from 123rf.com. AJS thanks Mark Rubin for helpful pathology discussion. AJS thanks Paul Tatarsky and Juan Perin for Caffe install help on the Memorial Sloan Kettering supercomputer. We gratefully acknowledge NVIDIA Corporation for providing us a GPU as part of the GPU Research Center award to TJF, and for their support with other GPUs.