Abstract
Web pages are a primary source of available information, and their content varies widely. Facial recognition plays a key role in knowledge management and identity authentication systems. Despite rapid advances in web technologies and face recognition systems, real-time performance remains a bottleneck. The main objective of this study is to propose a real-time face retrieval system, offered as a service over cloud computing and based on a web face crawler. The proposed architecture reduces the total response time and optimizes resource utilization. The web crawlers fetch web pages and extract images into elastic storage in the cloud. Human faces are then extracted from the collected images and prepared through successive phases for recognition, in which the matching face in the collection is identified. The proposed service relies on the Principal Component Analysis (PCA) algorithm for feature extraction and dimensionality reduction. Furthermore, K-Nearest Neighbors (KNN) is used to classify the crawled facial images over cloud resources. The experimental results show that crawling speed improves as the number of crawler instances increases. Moreover, face recognition accuracy is higher with the Euclidean distance than with other metrics such as Manhattan and cosine dissimilarity.
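As a rough illustration of the classification step, the following minimal sketch runs a KNN vote under the three metrics named in the abstract. It is not the authors' implementation: the function names, the toy gallery, and the labels are hypothetical, and the feature vectors are assumed to be PCA projections that were computed elsewhere.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_dissimilarity(a, b):
    """One minus the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, query, k=3, metric=euclidean):
    """Majority vote among the k training items closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: metric(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical gallery: two-dimensional stand-ins for PCA-projected faces.
gallery = [
    ([0.9, 0.1], "alice"),
    ([1.0, 0.2], "alice"),
    ([0.8, 0.0], "alice"),
    ([0.1, 0.9], "bob"),
    ([0.2, 1.0], "bob"),
    ([0.0, 0.8], "bob"),
]
probe = [0.85, 0.15]

for metric in (euclidean, manhattan, cosine_dissimilarity):
    print(metric.__name__, knn_predict(gallery, probe, k=3, metric=metric))
```

On toy data this well separated, all three metrics agree; differences in accuracy of the kind the abstract reports would only appear on real face features.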
Data availability
Data is available as referenced in the text.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by ME ElAraby. The first draft of the manuscript was written by ME ElAraby, and Mahmoud Y. Shams commented on previous versions of the manuscript and reviewed the final version. All authors read and approved the final manuscript.
Ethics declarations
Conflicts of interest/Competing interests
Not applicable.
Code availability
New software developed with custom code.
About this article
Cite this article
ElAraby, M.E., Shams, M.Y. Face retrieval system based on elastic web crawler over cloud computing. Multimed Tools Appl 80, 11723–11738 (2021). https://doi.org/10.1007/s11042-020-10271-3
DOI: https://doi.org/10.1007/s11042-020-10271-3