Embedded Real-Time Visual Search with Visual Distance Estimation
Visual Search algorithms are a class of methods that retrieve images by their content. Given a database of reference images and a query image, the goal is to find a database image that depicts the same object as the query, if one exists. In many real-world applications, more than one object of interest may appear in the query image. In such cases it is often not sufficient to identify the object depicted in the query image; its precise localization within the scene viewed by the camera is also required. In this paper we propose coupling a Visual Search system, which can retrieve multiple objects from the same query image, with an additional Distance Estimation module that exploits the localization information already computed in the Visual Search stage to estimate the position of the object in three dimensions. We implement the complete image retrieval and spatial localization pipeline (including relative distance estimation) on two different embedded devices, also exploiting their GPUs to achieve near real-time performance on low-power hardware. Lastly, the accuracy of the proposed distance estimation is evaluated on a purpose-built dataset of annotated query-reference pairs.
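To illustrate how localization output could feed a relative distance estimate, the following is a minimal sketch and not the method of the paper. It assumes a pinhole camera with unchanged focal length, a planar object viewed roughly fronto-parallel, and that the Visual Search stage returns the four corners of the object in the query image. Under those assumptions the apparent linear size is inversely proportional to distance, so the ratio of projected areas gives the distance relative to the one at which the reference image was captured. The names `polygon_area` and `relative_distance` and the corner coordinates are hypothetical.

```python
import math


def polygon_area(points):
    """Area of a simple polygon given as [(x, y), ...] (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def relative_distance(ref_corners, query_corners):
    """Distance of the object in the query, in units of the reference distance.

    Assumption (not from the paper): pinhole camera, same focal length,
    planar and roughly fronto-parallel object, so linear scale ~ 1 / distance.
    """
    ref_area = polygon_area(ref_corners)
    query_area = polygon_area(query_corners)
    if ref_area <= 0 or query_area <= 0:
        raise ValueError("degenerate quadrilateral")
    # Area scales with the square of the linear scale factor.
    scale = math.sqrt(query_area / ref_area)
    return 1.0 / scale


# Hypothetical corners: the object fills a 400x300 reference image and
# appears as a 200x150 rectangle in the query image.
ref = [(0, 0), (400, 0), (400, 300), (0, 300)]
query = [(100, 50), (300, 50), (300, 200), (100, 200)]
ratio = relative_distance(ref, query)
print(ratio)
```

In this example the object appears at half its linear size, so it is estimated to be twice as far from the camera as in the reference view. Converting this ratio to a metric distance would additionally require knowing the reference capture distance or the camera intrinsics.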
Keywords: Visual Search, CBIR, LoG, SIFT, Distance estimation
This work was supported in part by the H2020 European project COSSIM.