Embracing New Techniques in Deep Learning for Estimating Image Memorability

Needell, Coen D.; Bainbridge, Wilma A.

doi:10.1007/s42113-022-00126-5

Original Paper
Published: 11 April 2022

Volume 5, pages 168–184, (2022)

Computational Brain & Behavior

Abstract

Various works have suggested that the memorability of an image is consistent across people, and thus can be treated as an intrinsic property of an image. Using computer vision models, we can make specific predictions about what people will remember or forget. While older work has used now-outdated deep learning architectures rooted in shallow visual processing to predict image memorability, innovations in the field have given us new techniques to apply to this problem. Here, we propose and evaluate five alternative deep learning models which exploit developments in the field from the last 5 years, largely the introduction of residual neural networks, which are intended to allow the model to use semantic information in the memorability estimation process. These new models were tested against the prior state of the art with a combined dataset built to optimize both within-category and across-category predictions. Our findings suggest that the key prior memorability network had overstated its generalizability and was overfit on its training set. Our new models outperform this prior model, leading us to conclude that residual networks outperform simpler convolutional neural networks in memorability regression. We make our new state-of-the-art model readily available to the research community, allowing memory researchers to make predictions about memorability on a wider range of images.
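The residual networks named in the abstract are built around a skip connection: each block adds its input back to the output of its learned transformation, so the block only has to learn a correction to the identity. The following is a minimal pure-Python sketch of that idea under stated assumptions. It is not the authors' model; the names `linear`, `relu`, and `residual_block` are hypothetical helpers chosen for illustration, and a real network would use convolutional layers in a deep learning framework.

```python
from typing import Callable, List

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights: List[Vector], bias: Vector) -> Callable[[Vector], Vector]:
    """Return a function computing W v + b for a fixed W and b (illustrative helper)."""
    def apply(v: Vector) -> Vector:
        return [sum(w * x for w, x in zip(row, v)) + b
                for row, b in zip(weights, bias)]
    return apply


def residual_block(transform: Callable[[Vector], Vector], v: Vector) -> Vector:
    """Compute relu(v + F(v)), where F is the learned transformation.

    The `v + F(v)` term is the skip connection: the input bypasses F
    and is added back to F's output.
    """
    out = transform(v)
    if len(out) != len(v):
        raise ValueError("skip connection needs matching input/output sizes")
    return relu([a + b for a, b in zip(v, out)])


# With all-zero weights, F(v) = 0 and the block passes its
# (non-negative) input through unchanged.
zero_layer = linear([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
identity_layer = linear([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

x = [0.5, 2.0]
passed_through = residual_block(zero_layer, x)
doubled = residual_block(identity_layer, x)
```

In this sketch, a block whose transformation outputs zero leaves the input intact, which is the property that makes very deep stacks of such blocks easier to train than plain stacked layers.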

Availability of Data and Material

The model is available from the Python Package Index (PyPI, https://pypi.org/project/resmem/), and an online demo is available on the Brain Bridge Lab website (https://brainbridgelab.uchicago.edu/resmem). Miscellaneous data, including feature analyses, prediction performance within all subcategories of MemCat, and an archival copy of the pretrained model, are hosted on OSF (https://osf.io/qf5ry/). The data used to train ResMem came from two sources: LaMem, hosted by MIT (http://memorability.csail.mit.edu/download.html), and MemCat, hosted by the Flemish government (https://gestaltrevision.be/projects/memcat/).

Code Availability

The code for the ResMem package as published is hosted on GitHub (https://github.com/Brain-Bridge-Lab/resmem). The code used to generate figures and run analyses is split across two repositories (https://github.com/Brain-Bridge-Lab/BrainBridge-MemNet and https://github.com/Brain-Bridge-Lab/resmem-analysis).


Acknowledgements

We would like to acknowledge Lore Goetschalckx for providing us with details on MemNet’s implementation. We would also like to acknowledge Deepasri Prasad and Max Kramer for information sharing and general feedback.

Funding

Not applicable

Author information

Authors and Affiliations

Department of Psychology, University of Chicago, 5848 S. University Ave, Chicago, IL, 60637, USA
Coen D. Needell & Wilma A. Bainbridge

Contributions

W.A. Bainbridge conceived of the presented idea. C.D. Needell designed, programmed, and tuned the model. C.D. Needell and W.A. Bainbridge wrote the manuscript. W.A. Bainbridge supervised the project.

Corresponding author

Correspondence to Coen D. Needell.

Ethics declarations

Ethics Approval

Not applicable

Consent to Participate

Not applicable

Consent for Publication

Not applicable

Conflict of Interest

Not applicable

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

(PDF 722 KB)


About this article

Cite this article

Needell, C.D., Bainbridge, W.A. Embracing New Techniques in Deep Learning for Estimating Image Memorability. Comput Brain Behav 5, 168–184 (2022). https://doi.org/10.1007/s42113-022-00126-5

Accepted: 13 January 2022
Published: 11 April 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s42113-022-00126-5
