Earth Science Informatics, Volume 4, Issue 1, pp 17–28

Satellite image retrieval using low memory locality sensitive hashing in Euclidean space

  • Ruben Buaba
  • Abdollah Homaifar
  • Mohamed Gebril
  • Eric Kihn
  • Mikhail Zhizhin
Research Article

Abstract

This paper presents the use of the Low Memory Locality Sensitive Hashing (LMLSH) technique, operating in Euclidean space, to build a data structure for the Defense Meteorological Satellite Program (DMSP) satellite imagery database. The LMLSH technique finds satellite image matches in sublinear search time. Texture feature vectors are extracted from the images using a pyramid-structured wavelet transform coupled with the Gaussian central moment technique. These feature vectors, together with families of hash functions drawn randomly and independently from a Gaussian distribution, are used to build hash tables. Given a query, the hash tables return the best matches to that query in sublinear search time. In testing, the algorithm was approximately twenty-six times faster than the Linear Search (LS) algorithm. In addition, the LMLSH algorithm randomly searches only about two percent of the entire database to find the possible matches to any given query, without loss of accuracy compared to the absolute best matches returned by its LS counterpart.
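The hash-table construction and query steps described in the abstract can be illustrated with a minimal sketch of a generic Gaussian-projection (p-stable) locality sensitive hashing index in Euclidean space. This is not the authors' LMLSH implementation: the class name `EuclideanLSH`, the number of projections per table `k`, the quantization width `w`, the random offsets `b`, the integer mixing weights `r`, and all default values are assumptions chosen for illustration. Only `d`, `L`, `m`, `P` and `q` follow the paper's notation.

```python
import numpy as np


class EuclideanLSH:
    """Generic Gaussian-projection LSH index (illustrative; not the paper's LMLSH)."""

    def __init__(self, d, L=8, k=4, w=4.0, m=1009, seed=0):
        rng = np.random.default_rng(seed)
        # One family of k Gaussian projection vectors per table: shape (L, k, d).
        self.a = rng.normal(0.0, 1.0, size=(L, k, d))
        # Random offsets in [0, w), one per projection (assumed, standard p-stable form).
        self.b = rng.uniform(0.0, w, size=(L, k))
        # Random integer weights that fold the k quantized values into one bin index.
        self.r = rng.integers(1, m, size=(L, k))
        self.w, self.m = w, m
        self.tables = [dict() for _ in range(L)]
        self.data = None

    def _bins(self, v):
        # Quantized projections floor((a.v + b) / w), then one bin index per table mod m.
        h = np.floor((self.a @ v + self.b) / self.w).astype(np.int64)
        return (h * self.r).sum(axis=1) % self.m

    def build(self, P):
        # Store every feature vector's row index in one bin of each of the L tables.
        self.data = np.asarray(P, dtype=float)
        for i, p in enumerate(self.data):
            for j, bin_id in enumerate(self._bins(p)):
                self.tables[j].setdefault(int(bin_id), []).append(i)

    def query(self, q, n_best=1):
        # Collect candidates from the query's bin in each table, then rank them
        # by exact Euclidean distance. Returns (best indices, candidates scanned).
        q = np.asarray(q, dtype=float)
        candidates = set()
        for j, bin_id in enumerate(self._bins(q)):
            candidates.update(self.tables[j].get(int(bin_id), []))
        if not candidates:
            return [], 0
        idx = np.fromiter(candidates, dtype=np.int64)
        dist = np.linalg.norm(self.data[idx] - q, axis=1)
        order = np.argsort(dist)[:n_best]
        return idx[order].tolist(), len(idx)
```

Under these assumptions, the second value returned by `query` is the number of stored vectors actually compared against the query. Dividing it by N gives the fraction of the database scanned, the quantity the abstract reports as about two percent for LMLSH. The fraction obtained from this sketch depends entirely on the assumed `k`, `w`, `L` and `m`.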

Keywords

Item · Image · Approximate nearest neighbor · Exact nearest neighbor · Texture feature vector and match set

Parameter notations

P

Data set containing the texture feature vectors of the images

N

Number of texture feature vectors in P

d

Dimensionality of the texture feature vector

p

Any vector belonging to P

ℜ^d

A d-dimensional vector space such that if p ∈ ℜ^d then p is a d-dimensional vector

q

A query vector such that q ∈ ℜ^d

L

Number of tables

m

A prime number representing the number of bins per table

N_gd(μ, σ^2)

A Gaussian distribution with mean μ and variance σ^2

H_Gj

A family of hash functions drawn randomly from N_gd(0, d^2) for table j

α

Load factor, i.e. expected number of texture feature vectors that project into the same bin

β(q, R)

A sphere of radius R centered at q

δ

The probability that a true nearest neighbor is not reported to a given q

Pr

Probability.
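The notation above ties the bin count m (a prime) to the load factor α. A minimal sketch of one way to choose m follows, assuming the textbook hash-table relation α = N / m, in which the N vectors spread evenly over the m bins. The source does not state this relation, and the helper names `is_prime` and `bins_for_load_factor` are hypothetical.

```python
import math


def is_prime(n):
    """Trial-division primality test; adequate for table sizes of modest magnitude."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def bins_for_load_factor(N, alpha):
    """Smallest prime m such that N / m <= alpha (assumes alpha = N / m)."""
    m = max(2, math.ceil(N / alpha))
    while not is_prime(m):
        m += 1
    return m
```

For example, `bins_for_load_factor(1000, 10)` starts from 100 bins and moves up to the next prime, 101, so the expected occupancy per bin stays just under the target load factor.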

Notes

Acknowledgement

We are grateful to the National Oceanic and Atmospheric Administration (NOAA)/National Geophysical Data Center in Boulder, Colorado, for providing us with the DMSP satellite imagery database. This work is partially supported by the NOAA/National Center for Atmospheric Research Educational Program under Cooperative Agreement No: NA060AR4810187.

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Ruben Buaba (1)
  • Abdollah Homaifar (1)
  • Mohamed Gebril (1)
  • Eric Kihn (2)
  • Mikhail Zhizhin (3)

  1. Autonomous Control and Information Technology Center, Department of Electrical and Computer Engineering, Greensboro, USA
  2. NOAA/NGDC, Boulder, USA
  3. Russian Academy of Science CGDS, Moscow, Russia
