Embracing New Techniques in Deep Learning for Estimating Image Memorability

  • Original Paper
Computational Brain & Behavior

Abstract

Various works have suggested that the memorability of an image is consistent across people, and thus can be treated as an intrinsic property of an image. Using computer vision models, we can make specific predictions about what people will remember or forget. While older work has used now-outdated deep learning architectures rooted in shallow visual processing to predict image memorability, innovations in the field have given us new techniques to apply to this problem. Here, we propose and evaluate five alternative deep learning models which exploit developments in the field from the last 5 years, largely the introduction of residual neural networks, which are intended to allow the model to use semantic information in the memorability estimation process. These new models were tested against the prior state of the art with a combined dataset built to optimize both within-category and across-category predictions. Our findings suggest that the key prior memorability network had overstated its generalizability and was overfit on its training set. Our new models outperform this prior model, leading us to conclude that residual networks outperform simpler convolutional neural networks in memorability regression. We make our new state-of-the-art model readily available to the research community, allowing memory researchers to make predictions about memorability on a wider range of images.
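The residual connection the abstract refers to can be illustrated with a minimal sketch. This is a toy example in plain Python, not the published ResMem architecture: the helper names `relu`, `linear`, and `residual_block`, the two-layer form of the learned update, and the tiny weight matrices are all assumptions made for illustration. The point it shows is that a residual block adds its input back to the learned update, so the block only needs to learn a correction to the identity mapping.

```python
def relu(values):
    """Element-wise rectifier: negative entries become 0.0."""
    return [max(0.0, v) for v in values]


def linear(weights, bias, values):
    """Dense layer: one output per weight row, sum(w * v) + b."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, bias)
    ]


def residual_block(values, w1, b1, w2, b2):
    """Return relu(values + F(values)), where F is linear -> relu -> linear.

    Hypothetical toy layer for illustration; not the published architecture.
    """
    hidden = relu(linear(w1, b1, values))
    update = linear(w2, b2, hidden)
    # Skip connection: the input is added back to the learned update.
    return relu([v + u for v, u in zip(values, update)])


if __name__ == "__main__":
    zero_w = [[0.0, 0.0], [0.0, 0.0]]
    zero_b = [0.0, 0.0]
    # With all-zero weights the block passes a non-negative input through.
    print(residual_block([1.0, 2.0], zero_w, zero_b, zero_w, zero_b))
```

With all weights set to zero, the block returns a non-negative input unchanged, which is the property that makes deep stacks of such blocks easier to train than plain convolutional stacks of the same depth.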

Availability of Data and Material

The model is available from the Python Package Index (https://pypi.org/project/resmem/), and an online demo is available on the Brain Bridge Lab website (https://brainbridgelab.uchicago.edu/resmem). Miscellaneous data, including feature analyses, prediction performance within all subcategories of MemCat, and an archival copy of the pretrained model, are hosted on OSF (https://osf.io/qf5ry/). The data used to train ResMem came from two sources. LaMem is hosted by MIT (http://memorability.csail.mit.edu/download.html). MemCat is hosted by the Flemish government (https://gestaltrevision.be/projects/memcat/).

Code Availability

The code for the ResMem package as published is hosted on GitHub (https://github.com/Brain-Bridge-Lab/resmem). The code used to generate figures and run analyses is split across two repositories (https://github.com/Brain-Bridge-Lab/BrainBridge-MemNet and https://github.com/Brain-Bridge-Lab/resmem-analysis).

Acknowledgements

We would like to acknowledge Lore Goetschalckx for providing us with details on MemNet’s implementation. We would also like to acknowledge Deepasri Prasad and Max Kramer for information sharing and general feedback.

Funding

Not applicable

Author information

Contributions

W.A. Bainbridge conceived of the presented idea. C.D. Needell designed, programmed, and tuned the model. C.D. Needell and W.A. Bainbridge wrote the manuscript. W.A. Bainbridge supervised the project.

Corresponding author

Correspondence to Coen D. Needell.

Ethics declarations

Ethics Approval

Not applicable

Consent to Participate

Not applicable

Consent for Publication

Not applicable

Conflict of Interest

Not applicable

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Needell, C.D., Bainbridge, W.A. Embracing New Techniques in Deep Learning for Estimating Image Memorability. Comput Brain Behav 5, 168–184 (2022). https://doi.org/10.1007/s42113-022-00126-5

  • DOI: https://doi.org/10.1007/s42113-022-00126-5