Context-assisted face clustering framework with human-in-the-loop

  • Regular Paper
International Journal of Multimedia Information Retrieval

Abstract

Automatic face clustering, which aims to group faces that refer to the same person, is a key component of face tagging and image management. Standard face clustering approaches based on analyzing facial features can already achieve high-precision results. However, they often suffer from low recall because faces vary widely in pose, expression, illumination, occlusion, etc. To improve clustering recall without reducing the high precision, we leverage heterogeneous context information to iteratively merge clusters that refer to the same entity. We first investigate appropriate methods for utilizing context information at the cluster level, including the use of “common scene”, people co-occurrence, human attributes, and clothing. We then propose a unified framework that employs bootstrapping to automatically learn adaptive rules for integrating this heterogeneous contextual information with facial features. Finally, we discuss a novel methodology for integrating human-in-the-loop feedback mechanisms that leverage human interaction to achieve high-quality clustering results. Experimental results on two personal photo collections and one real-world surveillance dataset demonstrate the effectiveness of the proposed approach in improving recall while maintaining very high precision.
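To make the precision/recall trade-off described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It starts from pure but fragmented clusters, greedily merges the pair with the highest context score, and reports pairwise precision and recall before and after. The face IDs, the clothing tags, the Jaccard-based `clothing_score`, and the fixed `threshold` are all hypothetical stand-ins; the paper learns adaptive merging rules via bootstrapping rather than using a hand-set threshold.

```python
from itertools import combinations


def pairwise_precision_recall(clusters, truth):
    """Pairwise precision and recall of a clustering against true labels."""
    same_cluster = {frozenset(p) for c in clusters
                    for p in combinations(sorted(c), 2)}
    same_person = {frozenset(p) for p in combinations(sorted(truth), 2)
                   if truth[p[0]] == truth[p[1]]}
    tp = len(same_cluster & same_person)
    precision = tp / len(same_cluster) if same_cluster else 1.0
    recall = tp / len(same_person) if same_person else 1.0
    return precision, recall


def merge_with_context(clusters, context_score, threshold):
    """Repeatedly merge the best-scoring cluster pair at or above threshold."""
    clusters = [set(c) for c in clusters]
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            s = context_score(clusters[i], clusters[j])
            if s >= threshold and (best is None or s > best[0]):
                best = (s, i, j)
        if best is None:
            return clusters
        _, i, j = best          # i < j, so deleting j keeps index i valid
        clusters[i] |= clusters[j]
        del clusters[j]


# Hypothetical toy data: ground-truth identities and one clothing tag per face.
truth = {"f1": "alice", "f2": "alice", "f3": "alice",
         "f4": "bob", "f5": "bob", "f6": "carol"}
clothing = {"f1": "red", "f2": "red", "f3": "red",
            "f4": "blue", "f5": "blue", "f6": "green"}


def clothing_score(a, b):
    """Assumed context signal: Jaccard overlap of the clusters' clothing tags."""
    ta, tb = {clothing[f] for f in a}, {clothing[f] for f in b}
    return len(ta & tb) / len(ta | tb)


# Facial features alone: pure clusters, but f3 is split from f1 and f2.
initial = [{"f1", "f2"}, {"f3"}, {"f4", "f5"}, {"f6"}]
merged = merge_with_context(initial, clothing_score, threshold=0.5)

print(pairwise_precision_recall(initial, truth))  # (1.0, 0.5)
print(pairwise_precision_recall(merged, truth))   # (1.0, 1.0)
```

In this toy case the clothing signal merges the stray single-face cluster into the correct person, raising recall from 0.5 to 1.0 while precision stays at 1.0. A real context signal would be noisier, which is why the choice of merging rule matters.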

Notes

  1. Note that, in general, models other than the flow model considered in the paper could be used to assign weights to paths. For example, paths that pass through larger group nodes could be assigned higher weights, since larger groups of people tend to provide better context than smaller ones.

  2. Although larger datasets exist (e.g., LFW and PubFig), they are not suitable for our work because they provide only individual faces rather than whole images, whereas we focus on disambiguating faces in a photo collection.

References

  1. Ahonen T, Hadid A et al (2006) Face description with local binary patterns: application to face recognition. IEEE Trans Pattern Anal

  2. Amigo E et al (2008) A comparison of extrinsic clustering evaluation metrics based on formal constraints. Technical Report

  3. An L, Kafai M, Bhanu B (2013) Dynamic bayesian network for unconstrained face recognition in surveillance camera networks. IEEE J Emerg Select Topics Circuits Syst 3(2):155–164

  4. An L, Bhanu B, Yang S (2012) Boosting face recognition in real-world surveillance videos. In: IEEE ninth international conference on advanced video and signal-based surveillance (AVSS), pp 270–275

  5. Berg TL, Berg AC et al (2004) Names and faces in the news. In: IEEE ICPR

  6. Chen Z, Kalashnikov DV, Mehrotra S (2007) Adaptive graphical approach to entity resolution. In: JCDL

  7. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR

  8. Etemad K, Chellappa R (1997) Discriminant analysis for recognition of human face images. In: AVBPA

  9. Frey BJ, Dueck D (2007) Clustering by passing messages between data points. In: Science

  10. Gallagher A, Chen T (2008) Clothing cosegmentation for recognizing people. In: IEEE CVPR

  11. Gallagher A, Chen T (2009) Understanding images of groups of people. In: IEEE CVPR

  12. Kalashnikov DV, Chen Z, Mehrotra S, Nuray-Turan R (2011) Web people search via connection analysis. In: TKDE

  13. Kumar N et al (2011) Describable visual attributes for face verification and image search. In: IEEE TPAMI

  14. Lee YJ, Grauman K (2011) Face discovery with social context. In: BMVC

  15. Nuray-Turan R, Kalashnikov DV, Mehrotra S (2012) Exploiting web querying for web people search. In: ACM TODS

  16. Project Sherlock @ UCI. http://sherlock.ics.uci.edu

  17. Shimizu K, Nitta N et al (2012) Classification based group photo retrieval with bag of people features. In: ICMR

  18. Tang J, Hong R, Yan S, Chua T-S, Qi G-J, Jain R (2011) Image annotation by knn-sparse graph-based label propagation over noisily tagged web images. ACM Trans Intell Syst Technol 2(2):14–23

  19. Tang J, Zha Z-J, Tao D, Chua T-S (2012) Semantic-gap-oriented active learning for multilabel image annotation. IEEE Trans Image Process 21(4):2354–2360

  20. Tang J, Yan S, Hong R, Qi G, Chua T (2009) Inferring semantic concepts from community-contributed images and noisy tags. In: ACM multimedia

  21. Wu P, Tang F (2010) Improving face clustering using social context. In: ACM multimedia

  22. Yagnik J, Islam A (2007) Learning people annotation from the web via consistency learning. In: MIR

  23. Zhang W et al (2010) Beyond face: improving person clustering in consumer photos by exploring contextual information. In: ICME

  24. Zhang L, Kalashnikov DV, Mehrotra S (2013) A unified framework for context assisted face clustering. In: ACM international conference on multimedia retrieval (ACM ICMR 2013), Dallas

  25. Zhang L, Kalashnikov DV, Mehrotra S, Vaisenberg R (2013) Context-based person identification framework for smart video surveillance. Machine Vision and Applications, pp. 1–15

  26. Zhang L, Vaisenberg R, Mehrotra S, Kalashnikov DV (2011) Video entity resolution: applying ER techniques for smart video surveillance. In: PerCom workshops

  27. Zhang L, Zhang K, Li C (2008) A topical pagerank based algorithm for recommender systems. In: ACM conference on research and development in information retrieval (SIGIR), pp 713–714

  28. Zhao M, Teo Y et al (2006) Automatic person annotation of family photo album. In: CIVR

Author information

Corresponding author

Correspondence to Liyan Zhang.

Additional information

This work was supported in part by NSF grants CNS-1118114, CNS-1059436, and CNS-1063596. It is part of the NSF-supported project Sherlock @ UCI (http://sherlock.ics.uci.edu), a UC Irvine project on Data Quality and Entity Resolution [16].

About this article

Cite this article

Zhang, L., Kalashnikov, D.V. & Mehrotra, S. Context-assisted face clustering framework with human-in-the-loop. Int J Multimed Info Retr 3, 69–88 (2014). https://doi.org/10.1007/s13735-014-0052-1

  • DOI: https://doi.org/10.1007/s13735-014-0052-1
