NDT 2010: Networked Digital Technologies, pp. 162-171
Locality Preserving Scheme of Text Databases Representative in Distributed Information Retrieval Systems
Conference paper
Abstract
This paper proposes an efficient and effective "Locality Preserving Mapping" scheme that allows text database representatives to be mapped onto a global information retrieval system, such as a Peer-to-Peer Information Retrieval (P2PIR) system. The proposed approach uses Locality Sensitive Hash (LSH) functions and approximate min-wise independent permutations to accomplish this task. An experimental evaluation over real data, together with a comparison of the proposed schemes under different parameters, is presented to show the performance advantages of these schemes.
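To give a rough idea of how min-wise hashing can summarize a text database representative, the following Python sketch computes a MinHash signature for a set of terms and uses it to estimate the resemblance (Jaccard similarity) of two sets. It is only an illustration under stated assumptions, not the scheme evaluated in the paper: the linear hash family `(a * x + b) mod PRIME`, the signature length `k`, the SHA-1 based `term_hash`, and all function names are hypothetical choices made for this example.

```python
import hashlib
import random

# Assumption: a Mersenne prime larger than any 56-bit term hash used below.
PRIME = (1 << 61) - 1


def term_hash(term: str) -> int:
    """Map a term to a stable 56-bit integer (hypothetical helper)."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:7], "big")


def make_permutations(k: int, seed: int = 0):
    """Draw k (a, b) pairs; x -> (a*x + b) mod PRIME approximates a min-wise permutation."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def minhash_signature(terms, perms):
    """Return one minimum per permutation over the (non-empty) set of terms."""
    hashed = [term_hash(t) for t in set(terms)]
    return [min((a * x + b) % PRIME for x in hashed) for a, b in perms]


def estimate_resemblance(sig_a, sig_b) -> float:
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def jaccard(a, b) -> float:
    """Exact Jaccard similarity, for comparison with the estimate."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    perms = make_permutations(k=128)
    db1 = ["peer", "retrieval", "hash", "index", "query"]
    db2 = ["peer", "retrieval", "hash", "cluster", "graph"]
    s1 = minhash_signature(db1, perms)
    s2 = minhash_signature(db2, perms)
    print("exact:", jaccard(db1, db2))
    print("estimate:", estimate_resemblance(s1, s2))
```

Because similar term sets tend to agree in many signature positions, such signatures can serve as locality-sensitive keys: representatives with high resemblance are more likely to share a key and therefore to be placed close together. How the paper actually builds its keys and maps them onto a P2PIR system is not specified in the abstract.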
Copyright information
© Springer-Verlag Berlin Heidelberg 2010