Abstract
An important limitation to the development of AI-based solutions for In Vitro Fertilization (IVF) is the black-box nature of most state-of-the-art models, a consequence of the complexity of deep learning architectures, which raises potential bias and fairness issues. The need for interpretable AI has arisen not only in the IVF field but also in the deep learning community at large. This has started a trend in the literature where authors focus on designing objective metrics to evaluate generic explanation methods. In this paper, we study the behavior of recently proposed objective faithfulness metrics applied to the problem of embryo stage identification. We benchmark attention models and post-hoc methods using these metrics and further show empirically that (1) the metrics produce low overall agreement on the model ranking and (2) depending on the metric approach, either post-hoc methods or attention models are favored. We conclude with general remarks about the difficulty of defining faithfulness and the necessity of understanding its relationship with the type of approach that is favored.
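The claim that the faithfulness metrics "produce low overall agreement on the model ranking" is typically quantified with a rank correlation such as Kendall's tau. The sketch below is a minimal, self-contained illustration of that idea (it is not the paper's implementation): it computes Kendall's tau between two hypothetical rankings of explanation methods, where the method names and orderings are invented for illustration only.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation between two strict rankings of the same items.

    Returns +1 for identical rankings, -1 for fully reversed ones.
    Ties are not handled (rankings are assumed strict).
    """
    assert set(rank_a) == set(rank_b), "both rankings must cover the same items"
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # A pair is concordant if both rankings order x and y the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) / 2
    return (concordant - discordant) / n_pairs

# Hypothetical rankings of four explanation methods under two faithfulness metrics
ranking_metric_1 = ["Grad-CAM", "Score-CAM", "Ablation-CAM", "RISE"]
ranking_metric_2 = ["Score-CAM", "Grad-CAM", "RISE", "Ablation-CAM"]
print(kendall_tau(ranking_metric_1, ranking_metric_2))  # 0.333... : weak agreement
```

A low tau between two metrics, as in this toy example, means they disagree on which explanation method is most faithful, which is the phenomenon the paper reports.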
Supported by Nantes Excellence Trajectory (NExT).
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Gomez, T., Fréour, T., Mouchère, H. (2023). Comparison of Attention Models and Post-hoc Explanation Methods for Embryo Stage Identification: A Case Study. In: Rousseau, JJ., Kapralos, B. (eds) Pattern Recognition, Computer Vision, and Image Processing. ICPR 2022 International Workshops and Challenges. ICPR 2022. Lecture Notes in Computer Science, vol 13645. Springer, Cham. https://doi.org/10.1007/978-3-031-37731-0_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37730-3
Online ISBN: 978-3-031-37731-0