Aggarwal, C.C., Yu, P.S.: The igrid index: reversing the dimensionality curse for similarity indexing in high dimensional space. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp. 119–129 (2000)
Amato, G., Esuli, A., Falchi, F.: A comparison of pivot selection techniques for permutation-based indexing. Inf. Syst. 52, 176–188 (2015)
Article
Google Scholar
Baldi, P., Sadowski, P., Whiteson, D.: Searching for exotic particles in high-energy physics with deep learning. Nat. Commun. 5, 4308 (2014)
Article
Google Scholar
Barrios, J.M., Bustos, B., Skopal, T.: Analyzing and dynamically indexing the query set. Inf. Syst. 45, 37–47 (2014)
Article
Google Scholar
Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is “nearest neighbor” meaningful? In: International Conference on Database Theory, Springer, New York, pp. 217–235 (1999)
Blake, C., Merz, C.: Uci repository of machine learning databases. Department of Information and Computer Science, University of California, Irvine, CA. http://www.ics.uci.edu/~mlearn/mlrepository.html (1998)
Boiman, O., Shechtman, E., Irani, M.: In defense of nearest-neighbor based image classification. In: IEEE Conference on Computer Vision and Pattern Recognition, 2008. CVPR 2008, pp. 1–8 (2008)
Chambi, S., Lemire, D., Kaser, O., Godin, R.: Better bitmap performance with roaring bitmaps (2014). arXiv:1402.6407
Chen, L., Gao, Y., Zheng, B., Jensen, C.S., Yang, H., Yang, K.: Pivot-based metric indexing. Proc. VLDB Endow. 10(10), 1058–1069 (2017)
Article
Google Scholar
Donoho, D.L.: High-dimensional data analysis: the curses and blessings of dimensionality. AMS Math. Chall. Lect. 1, 32 (2000)
Google Scholar
Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. KDD 96, 226–231 (1996)
Google Scholar
Fayyad, U., Irani, K.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Joint Conference on Artificial Intelligence, pp. 1022–1027 (1993)
Gao, J., Jagadish, H.V., Lu, W., Ooi, B.C.: Dsh: data sensitive hashing for high-dimensional k-nn search. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, ACM, pp. 1127–1138 (2014)
García, S., Luengo, J., Sáez, J.A., López, V., Herrera, F.: A survey of discretization techniques: taxonomy and empirical analysis in supervised learning. IEEE Trans. Knowl. Data Eng. 25(4), 734–750 (2013)
Article
Google Scholar
Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: Proceedings of the 25th International Conference on Very Large Data Bases, Morgan Kaufmann Publishers Inc., pp. 518–529 (1999)
Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data (New York, NY, USA), SIGMOD ’84, ACM, pp. 47–57 (1984)
Guzun, G., Canahuate, G.: Hybrid query optimization for hard-to-compress bit-vectors. VLDB J 25, 339–354 (2015)
Article
Google Scholar
Guzun, G., Canahuate, G.: Supporting dynamic quantization for high-dimensional data analytics. In: Proceedings of the ExploreDB’17 (New York, NY, USA), ExploreDB ’17, ACM, pp. 6:1–6:6 (2017)
Guzun, G., Canahuate, G.: Distributed query-aware quantization for high-dimensional similarity searches. In: Advances in Database Technology: Proceedings. International Conference on Extending Database Technology, vol. 2018, pp. 373–384 (2018)
Guzun, G., Canahuate, G., Chiu, D., Sawin, J.: A tunable compression framework for bitmap indices. In: IEEE 30th International Conference on Data Engineering (ICDE), IEEE, 2014, pp. 484–495 (2014)
Guzun, G., Tosado, J., Canahuate, G.: Slicing the dimensionality: top-k query processing for high-dimensional spaces. In: Transactions on Large-Scale Data-and Knowledge-Centered Systems XIV. Springer, New York, pp. 26–50 (2014)
Guzun, G., Canahuate, G., Chiu, D.: A two-phase Mapreduce algorithm for scalable preference queries over high-dimensional data. In: Proceedings of the 20th International Database Engineering & Applications Symposium, IDEAS (2016)
Guzun, G., McClurg, J.C., Canahuate, G., Mudumbai, R.: Power efficient big data analytics algorithms through low-level operations. In: 2016 IEEE International Conference on Big Data (Big Data), IEEE, pp. 355–361 (2016)
Hamming, R.W.: Error detecting and error correcting codes. Bell Syst. Tech/ J. 29(2), 147–160 (1950)
MathSciNet
Article
Google Scholar
Huang, Q., Feng, J., Zhang, Y., Fang, Q., Ng, W.: Query-aware locality-sensitive hashing for approximate nearest neighbor search. Proc. VLDB Endow. 9(1), 1–12 (2015)
Article
Google Scholar
Jegou, H., Douze, M., Schmid, C.: Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 117–128 (2011)
Article
Google Scholar
Johnson, J., Douze, M., Jégou, H.: Billion-scale similarity search with gpus (2017). arXiv:1702.08734
Kamel, I., Faloutsos, C.: Hilbert r-tree: an improved r-tree using fractals. In: Proceedings of the 20th International Conference on Very Large Data Bases (San Francisco, CA, USA), VLDB ’94, Morgan Kaufmann Publishers Inc., pp. 500–509 (1994)
Katayama, N., Satoh, S.: The sr-tree: an index structure for high-dimensional nearest neighbor queries. SIGMOD Rec. 26(2), 369–380 (1997)
Article
Google Scholar
Kerber, R.: Chimerge: discretization of numeric attributes. In: Proceedings of the 10th National Conference on Artificial Intelligence. San Jose, CA, July 12–16, pp. 123–128 (1992)
Kurgan, L.A., Cios, K.J.: Caim discretization algorithm. IEEE Trans. Knowl. Data Eng. 16(2), 145–153 (2004)
Article
Google Scholar
Lemire, D., Kaser, O., Gutarra, E.: Reordering rows for better compression: beyond the lexicographic order. ACM Trans. Datab. Syst. 37(3), 20:1–20:29 (2012)
Google Scholar
Leskovec, J., Rajaraman, A., Ullman, J.D.: Mining of Massive Datasets. Cambridge University Press, Cambridge (2014)
Book
Google Scholar
Liu, F., Lee, H.J.: Use of social network information to enhance collaborative filtering performance. Expert Syst. Appl. 37(7), 4772–4778 (2010)
Article
Google Scholar
O’Neil, P., Quass, D.: Improved query performance with variant indexes. In: Proceedings of the 1997 ACM SIGMOD International Conference on Management of data, ACM Press, pp. 38–49 (1997)
Pareto, V.: Manual of Political Economy. Augustus M. Kelley Publishers, New York (1906)
Google Scholar
Phung, S.L., Bouzerdoum, A., Chai, D.: Skin segmentation using color pixel classification: analysis and comparison. IEEE Trans. Pattern Anal. Mach. Intell. 27(1), 148–154 (2005)
Article
Google Scholar
Rinfret, D.: Answering preference queries with bit-sliced index arithmetic. In: Proceedings of the 2008 C3S2E Conference, ACM, pp. 173–185 (2008)
Rinfret, D., O’Neil, P., O’Neil, E.: Bit-sliced index arithmetic. In: ACM SIGMOD Record, vol. 30, ACM, pp. 47–57 (2001)
Samet, H.: The Design and Analysis of Spatial Data Structures, vol. 199. Addison-Wesley, Reading, MA (1990)
Google Scholar
Shy Goh, K., Li, B., Chang, E.: Dyndex: a dynamic and non-metric space indexer. In: In ACM Multimedia, pp. 466–475 (2002)
Tsai, C., Lee, C., Yang, W.: A discretization algorithm based on class-attribute contingency coefficient. Inf. Sci. 178(3), 714–731 (2008)
Article
Google Scholar
Tung, A., K.H., Zhang, R., Koudas, N., Ooi, B.C.: Similarity search: a matching based approach. In: VLDB’2006: Proceedings of the 32nd International Conference on Very Large Data Bases, VLDB Endowment, pp. 631–642 (2006)
Weber, R.: Parallel va-file. In: Proceeding of ECDL, Springer, New York, pp. 83–92 (2000)
Weber, R., Blott, S.: An approximation based data structure for similarity search. Technical Report (1997)
Weber, R., Schek, H.-J., Blott, S.: A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings of the 24rd International Conference on Very Large Data Bases (San Francisco, CA, USA), VLDB ’98, Morgan Kaufmann Publishers Inc., pp. 194–205 (1998)
Wu, K., Otoo, E.J., Shoshani, A., Nordberg, H.: Notes on design and implementation of compressed bit vectors. Technical Report LBNL/PUB-3161, Lawrence Berkeley National Laboratory (2001)
Wu, K., Otoo, E.J., Shoshani, A.: Compressing bitmap indexes for faster search operations. In: Proceedings of the 2002 International Conference on Scientific and Statistical Database Management Conference (SSDBM’02), pp. 99–108 (2002)
Zipf, G.K.: Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Ravenio Books, Cambridge (2016)
Google Scholar