Face retrieval system based on elastic web crawler over cloud computing

Multimedia Tools and Applications

Abstract

Web pages are a main source of available information, and their content varies widely. Facial recognition plays a key role in knowledge management and identity authentication systems. Despite rapid advances in web technologies and face recognition systems, real-time performance remains the bottleneck. The main objective of this study is to propose a real-time face retrieval system, offered as a service over cloud computing and based on a web face crawler. The proposed architecture reduces the total response time and optimizes resource utilization. The web crawlers fetch web pages and extract images into elastic storage over the cloud. Human faces are then extracted from the collected images and prepared through successive phases so that they are ready for recognition and for identifying the matching face in the collection. The proposed service relies on the Principal Component Analysis (PCA) algorithm for feature extraction and dimensionality reduction. Furthermore, K-Nearest Neighbors (KNN) is used to classify the crawled facial images over cloud resources. The experimental results show that crawling speed improves as the number of crawler instances increases. Moreover, face recognition accuracy is higher with the Euclidean distance than with other metrics such as Manhattan and cosine dissimilarity.
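The classification step described above can be illustrated with a minimal sketch of KNN under the three dissimilarity measures the abstract compares. This is not the authors' implementation: the function names, the toy two-dimensional vectors (standing in for PCA-projected face features), and the labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line (L2) distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # City-block (L1) distance: sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_dissimilarity(a, b):
    # One minus the cosine of the angle between the vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # Assumption: treat a zero vector as maximally dissimilar.
    return 1.0 - dot / (norm_a * norm_b)


def knn_classify(query, gallery, labels, k=3, metric=euclidean):
    # Rank gallery vectors by distance to the query, then take a majority
    # vote over the labels of the k nearest ones.
    ranked = sorted(zip(gallery, labels), key=lambda pair: metric(query, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each vector stands in for a PCA-projected face.
gallery = [[0.9, 1.1], [1.0, 0.8], [1.2, 1.0],
           [4.0, 4.2], [4.1, 3.9], [3.8, 4.0]]
labels = ["alice", "alice", "alice", "bob", "bob", "bob"]
query = [1.1, 0.9]

for metric in (euclidean, manhattan, cosine_dissimilarity):
    print(metric.__name__, knn_classify(query, gallery, labels, k=3, metric=metric))
```

Passing the metric as a parameter keeps the voting logic unchanged while the dissimilarity measure is swapped, which is the kind of comparison the abstract reports. Which metric performs best on real face data depends on the features and the dataset.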

Data availability

Data is available as referenced in the text.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by M. E. ElAraby. The first draft of the manuscript was written by M. E. ElAraby; Mahmoud Y. Shams commented on previous versions of the manuscript and reviewed the final version. All authors read and approved the final manuscript.

Corresponding author

Correspondence to M. E. ElAraby.

Ethics declarations

Conflicts of interest/Competing interests

Not applicable.

Code availability

New software developed with custom code.

About this article

Cite this article

ElAraby, M.E., Shams, M.Y. Face retrieval system based on elastic web crawler over cloud computing. Multimed Tools Appl 80, 11723–11738 (2021). https://doi.org/10.1007/s11042-020-10271-3

  • DOI: https://doi.org/10.1007/s11042-020-10271-3
