International Journal on Digital Libraries, Volume 6, Issue 1, pp 98–111

BroadScale: Efficient scaling of heterogeneous storage systems

  • Shu-Yuen D. Yao
  • Cyrus Shahabi
  • Roger Zimmermann
Regular Paper


Scalable storage architectures allow digital libraries and archives to add storage devices to increase capacity and bandwidth, or to remove them to retire older devices. Past work in this area has mainly focused on statically scaling homogeneous storage devices. However, heterogeneous devices are quickly being adopted for storage scaling since they are usually faster, larger, more widely available, and more cost-effective. We propose BroadScale, an algorithm based on Random Disk Labeling, to dynamically scale heterogeneous storage systems by distributing data objects according to device weights. Assuming a random placement of objects across a group of heterogeneous storage devices, our optimization objectives when scaling are to ensure a uniform distribution of objects, redistribute a minimum number of objects, and maintain fast data access with low computational complexity. We show through experimentation that BroadScale meets these objectives when scaling heterogeneous storage.


Keywords: Scalable storage systems; Random data placement; Load balancing; Heterogeneous disk scaling





Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Shu-Yuen D. Yao (1)
  • Cyrus Shahabi (1)
  • Roger Zimmermann (1)
  1. Computer Science Department, University of Southern California, Los Angeles
