Searchable Storage in Cloud Computing pp 153-178
Data Similarity-Aware Computation Infrastructure for the Cloud
Abstract
The cloud is emerging as a platform for scalable and efficient services. To handle massive data and reduce data migration, the computation infrastructure requires efficient data placement and proper management of cached data. We propose an efficient and cost-effective multilevel caching scheme, called MERCURY, as the computation infrastructure of the cloud. The idea behind MERCURY is to explore and exploit data similarity to support efficient data placement. To capture data similarity accurately and efficiently, we leverage low-complexity Locality-Sensitive Hashing (LSH). In our design, we identify that a conventional LSH scheme suffers not only from space inefficiency but also from homogeneous data placement. To address these two problems, we design a novel Multicore-enabled LSH (MC-LSH) that accurately captures the differentiated similarity across data. The similarity-aware MERCURY hence partitions data into the L1 cache, L2 cache, and main memory based on their distinct localities, which helps optimize cache utilization and minimize pollution in the last-level cache. Besides extensive evaluation through simulations, we also implemented MERCURY in a system. Experimental results based on real-world applications and datasets demonstrate the efficiency and efficacy of our proposed schemes (© 2014 IEEE. Reprinted, with permission, from Ref. [1]).
References
- 1. Y. Hua, X. Liu, D. Feng, Data similarity-aware computation infrastructure for the cloud. IEEE Trans. Comput. (TC) 63(1), 3–16 (2014)
- 2. IDC iView, Extracting Value from Chaos (2011)
- 3. Science Staff, Dealing with data - challenges and opportunities. Science 331(6018), 692–693 (2011)
- 4. M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica et al., A view of cloud computing. Commun. ACM 53(4), 50–58 (2010)
- 5. S. Bykov, A. Geller, G. Kliot, J. Larus, R. Pandya, J. Thelin, Orleans: cloud computing for everyone, in Proceedings of the ACM Symposium on Cloud Computing (SOCC) (2011)
- 6. S. Wu, F. Li, S. Mehrotra, B. Ooi, Query optimization for massively parallel data processing, in Proceedings of the ACM Symposium on Cloud Computing (SOCC) (2011)
- 7. L. Soares, D. Tam, M. Stumm, Reducing the harmful effects of last-level cache polluters with an OS-level, software-only pollute buffer, in Proceedings of the MICRO (2009), pp. 258–269
- 8. S. Biswas, D. Franklin, A. Savage, R. Dixon, T. Sherwood, F. Chong, Multi-execution: multicore caching for data-similar executions, in Proceedings of the ISCA (2009)
- 9. M. Chaudhuri, Pagenuca: selected policies for page-grain locality management in large shared chip-multiprocessor caches, in Proceedings of the HPCA (2009), pp. 227–238
- 10. S. Srikantaiah, R. Das, A.K. Mishra, C.R. Das, M. Kandemir, A case for integrated processor-cache partitioning in chip multiprocessors, in Proceedings of the SC (2009)
- 11. X. Ding, K. Wang, X. Zhang, SRM-buffer: an OS buffer management technique to prevent last level cache from thrashing in multicores, in Proceedings of the EuroSys (2011)
- 12. Y. Chen, S. Byna, X. Sun, Data access history cache and associated data prefetching mechanisms, in Proceedings of the SC (2007)
- 13. J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, P. Sadayappan, Enabling software management for multicore caches with a lightweight hardware support, in Proceedings of the SC (2009)
- 14. D. Zhan, H. Jiang, S.C. Seth, STEM: spatiotemporal management of capacity for intra-core last level caches, in Proceedings of the Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (2010)
- 15. D. Zhan, H. Jiang, S.C. Seth, Locality & utility co-optimization for practical capacity management of shared last level caches, in Proceedings of the ACM International Conference on Supercomputing (2012)
- 16. J. Stuecheli, D. Kaseridis, D. Daly, H. Hunter, L. John, The virtual write queue: coordinating DRAM and last-level cache policies, in Proceedings of the ISCA (2010)
- 17. Y. Hua, X. Liu, D. Feng, MERCURY: a scalable and similarity-aware scheme in multi-level cache hierarchy, in Proceedings of the IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) (2012)
- 18. P. Indyk, R. Motwani, Approximate nearest neighbors: towards removing the curse of dimensionality, in Proceedings of the STOC (1998)
- 19. A. Forin, B. Neekzad, N. Lynch, Giano: the two-headed system simulator, Technical Report MSR-TR-2006-130 (Microsoft Research, Redmond, 2006)
- 20. S. Biswas, D. Franklin, T. Sherwood, F. Chong, Conflict-avoidance in multicore caching for data-similar executions, in Proceedings of the ISPAN (2009)
- 21. PostgreSQL, http://www.postgresql.org/
- 22. R. Lee, X. Ding, F. Chen, Q. Lu, X. Zhang, MCC-DB: minimizing cache conflicts in multi-core processors for databases. Proc. VLDB 2(1), 373–384 (2009)
- 23. B. Bershad, D. Lee, T. Romer, B. Chen, Avoiding conflict misses dynamically in large direct-mapped caches, in Proceedings of the ASPLOS (1994)
- 24. Y. Yan, X. Zhang, Z. Zhang, Cacheminer: a runtime approach to exploit cache locality on SMP. IEEE Trans. Parallel Distrib. Syst. 11(4), 357–374 (2000)
- 25. K. Zhang, Z. Wang, Y. Chen, H. Zhu, X. Sun, Pac-plru: a cache replacement policy to salvage discarded predictions from hardware prefetchers, in Proceedings of the 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) (2011), pp. 265–274
- 26. G. Suh, S. Devadas, L. Rudolph, Analytical cache models with applications to cache partitioning, in Proceedings of the ACM ICS (2001)
- 27. The Forest CoverType dataset, UCI machine learning repository, http://archive.ics.uci.edu/ml/datasets/Covertype
- 28. D. Ellard, J. Ledlie, P. Malkani, M. Seltzer, Passive NFS tracing of email and research workloads, in Proceedings of the FAST (2003)
- 29. E. Riedel, M. Kallahalla, R. Swaminathan, A framework for evaluating storage system security, in Proceedings of the FAST (2002)
- 30. SPEC2000, http://www.spec.org/cpu2000/
- 31. S. Carr, K. Kennedy, Compiler blockability of numerical algorithms, in Proceedings of the Supercomputing Conference (1992)
- 32. M.S. Lam, E.E. Rothberg, M.E. Wolf, The cache performance and optimizations of blocked algorithms, in Proceedings of the ASPLOS (1991)
- 33. T.C. Mowry, M.S. Lam, A. Gupta, Design and evaluation of a compiler algorithm for prefetching, in Proceedings of the ASPLOS (1992)
- 34. J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, P. Sadayappan, Gaining insights into multicore cache partitioning: bridging the gap between simulation and real systems, in Proceedings of the HPCA (2008)
- 35. Q. Lv, W. Josephson, Z. Wang, M. Charikar, K. Li, Multi-probe LSH: efficient indexing for high-dimensional similarity search, in Proceedings of the VLDB (2007), pp. 950–961
- 36. R. Shinde, A. Goel, P. Gupta, D. Dutta, Similarity search and locality sensitive hashing using ternary content addressable memories, in Proceedings of the SIGMOD (2010), pp. 375–386
- 37. A. Joly, O. Buisson, A posteriori multi-probe locality sensitive hashing, in Proceedings of the ACM International Conference on Multimedia (2008)
- 38. G. Taylor, P. Davies, M. Farmwald, The TLB slice - a low-cost high-speed address translation mechanism, in Proceedings of the ISCA (1990)
- 39. A. Andoni, P. Indyk, Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51(1), 117–122 (2008)
- 40. L. Fan, P. Cao, J. Almeida, A. Broder, Summary cache: a scalable wide-area web cache sharing protocol. IEEE/ACM Trans. Netw. 8(3), 281–293 (2000)
- 41. B. Bloom, Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13(7), 422–426 (1970)
- 42. Y. Tao, K. Yi, C. Sheng, P. Kalnis, Quality and efficiency in high-dimensional nearest neighbor search, in Proceedings of the SIGMOD (2009)
- 43. Y. Hua, B. Xiao, D. Feng, B. Yu, Bounded LSH for similarity search in peer-to-peer file systems, in Proceedings of the ICPP (2008), pp. 644–651
- 44. TPC, http://www.tpc.org/
- 45. Y. Hua, B. Xiao, B. Veeravalli, D. Feng, Locality-sensitive bloom filter for approximate membership query. IEEE Trans. Comput. 61(6), 817–830 (2012)
- 46. Z. Zhang, Z. Zhu, X. Zhang, Cached DRAM for ILP processor memory access latency reduction. IEEE Micro 21(4), 22–32 (2001)
- 47. S. Byna, Y. Chen, X. Sun, R. Thakur, W. Gropp, Parallel I/O prefetching using MPI file caching and I/O signatures, in Proceedings of the SC (2008)
- 48. Z. Zhang, Z. Zhu, X. Zhang, Design and optimization of large size and low overhead off-chip caches. IEEE Trans. Comput. 53(7), 843–855 (2004)
- 49. N. Hardavellas, M. Ferdman, B. Falsafi, A. Ailamaki, Near-optimal cache block placement with reactive nonuniform cache architectures. IEEE Micro 30(1), 20–28 (2010)
- 50. J. Torrellas, A. Tucker, A. Gupta, Benefits of cache-affinity scheduling in shared-memory multiprocessors: a summary, in Proceedings of the ACM SIGMETRICS (1993)
- 51. H. Lee, S. Cho, B. Childers, Cloudcache: expanding and shrinking private caches, in Proceedings of the HPCA (2011), pp. 219–230
- 52. X. Zhang, S. Dwarkadas, K. Shen, Hardware execution throttling for multi-core resource management, in Proceedings of the USENIX Annual Technical Conference (2009)
- 53. A. Basu, N. Kirman, M. Kirman, M. Chaudhuri, J. Martinez, Scavenger: a new last level cache architecture with global block priority, in Proceedings of the MICRO (2007), pp. 421–432
- 54. J. Chhugani, A. Nguyen, V. Lee, W. Macy, M. Hagog, Y. Chen, A. Baransi, S. Kumar, P. Dubey, Efficient implementation of sorting on multi-core SIMD CPU architecture, in Proceedings of the VLDB (2008)
- 55. S. Park, T. Kim, J. Park, J. Kim, H. Im, Parallel skyline computation on multicore architectures, in Proceedings of the ICDE (2009)
- 56. S. Das, S. Antony, D. Agrawal, A. El Abbadi, Thread cooperation in multicore architectures for frequency counting over multiple data streams, in Proceedings of the VLDB (2009)
- 57. J. Cieslewicz, K. Ross, Adaptive aggregation on chip multiprocessors, in Proceedings of the VLDB (2007)
- 58. L. Qiao, V. Raman, F. Reiss, P. Haas, G. Lohman, Main-memory scan sharing for multi-core CPUs, in Proceedings of the VLDB (2008)
- 59. W. Han, J. Lee, Dependency-aware reordering for parallelizing query optimization in multi-core CPUs, in Proceedings of the SIGMOD (2009)
- 60. S. Tatikonda, S. Parthasarathy, Mining tree-structured data on multicore systems, in Proceedings of the VLDB (2009)
- 61. C. Kim, T. Kaldewey, V. Lee, E. Sedlar, A. Nguyen, N. Satish, J. Chhugani, A. Di Blas, P. Dubey, Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs, in Proceedings of the VLDB (2009)
- 62. M. Kleanthous, Y. Sazeides, CATCH: a mechanism for dynamically detecting cache-content-duplication and its application to instruction caches, in Proceedings of the DATE (2008)
- 63. A. Alameldeen, D. Wood, Adaptive cache compression for high-performance processors, in Proceedings of the ISCA (2004)
- 64. J. Chang, G. Sohi, Cooperative caching for chip multiprocessors, in Proceedings of the ISCA (2006)
- 65. C. Kim, D. Burger, S. Keckler, An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches, in Proceedings of the ASPLOS (2002)
- 66. Z. Chishti, M. Powell, T. Vijaykumar, Distance associativity for high-performance energy-efficient non-uniform cache architectures, in Proceedings of the MICRO (2003)
- 67. R. Manikantan, K. Rajan, R. Govindarajan, Nucache: an efficient multicore cache organization based on next-use distance, in Proceedings of the HPCA (2011), pp. 243–253
- 68. N. Lakshminarayana, J. Lee, H. Kim, Age based scheduling for asymmetric multiprocessors, in Proceedings of the ACM/IEEE Supercomputing Conference (2009)
- 69. J. Zhou, J. Cieslewicz, K. Ross, M. Shah, Improving database performance on simultaneous multithreading processors, in Proceedings of the VLDB (2005)
- 70. S. Boyd-Wickizer, R. Morris, M.F. Kaashoek, Reinventing scheduling for multicore systems, in Proceedings of the HotOS (2009)
- 71. L. Shalev, J. Satran, E. Borovik, M. Ben-Yehuda, IsoStack: highly efficient network processing on dedicated cores, in Proceedings of the USENIX Annual Technical Conference (2010)