Adaptive Request Scheduling for Parallel Scientific Web Services

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5069)

Abstract

Scientific web services often have data models and query workloads quite different from those of commercial services, and they are much less studied. Because processing is both computation- and data-intensive, individual queries must be processed in parallel by multiple server nodes. At the same time, each query runs against portions of a large, common dataset. Existing scheduling policies from traditional environments (namely cluster web servers and supercomputers) consider either the data aspect or the computation aspect alone, and are therefore inadequate for this new type of workload.

In this paper, we systematically investigate adaptive scheduling for scientific web services, taking into account parallel computation scalability, data locality, and load balancing. Our case study focuses on high-throughput query processing on biological sequence databases, a fundamental task performed daily by millions of scientists, who increasingly prefer web services powered by parallel servers. Our research indicates that intelligent resource allocation and scheduling are crucial to the overall performance of a parallel sequence database search server. Failing to consider either parallel computation scalability or data locality can significantly hurt system throughput and query response time. Moreover, no single static strategy works best for all request workloads or all resource settings. In response, we present several dynamic scheduling techniques that automatically adapt to the request workload and system configuration when making scheduling decisions. Experiments on a cluster using 32 processors show that the combination of these techniques delivers a several-fold improvement in average query response time across various workloads.
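
The abstract does not spell out the scheduling techniques themselves, so the following is only a minimal illustrative sketch of the three concerns it names, not the paper's algorithm. It assumes a toy model in which the group size given to each request shrinks as more requests wait (computation scalability and load balancing), and idle nodes that already cache the requested database are preferred (data locality). The names `Node`, `choose_group_size`, `pick_nodes`, `schedule`, and the `max_group` parameter are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cached: set = field(default_factory=set)  # databases this node already holds
    busy: bool = False


def choose_group_size(waiting: int, idle: int, max_group: int) -> int:
    """Assumed policy: split idle nodes evenly among waiting requests."""
    if idle == 0:
        return 0
    share = idle // max(1, waiting)
    # Never exceed max_group (assumed point where speedup flattens), never go below 1.
    return max(1, min(max_group, share))


def pick_nodes(nodes, database, size):
    """Pick up to `size` idle nodes, preferring those that cache `database`."""
    idle = [n for n in nodes if not n.busy]
    # Stable sort: False sorts before True, so cache hits come first.
    idle.sort(key=lambda n: database not in n.cached)
    return idle[:size]


def schedule(nodes, queue, max_group=8):
    """Assign queued (request_id, database) pairs to node groups in FIFO order."""
    assignments = {}
    pending = list(queue)
    while pending:
        idle_count = sum(not n.busy for n in nodes)
        size = choose_group_size(len(pending), idle_count, max_group)
        if size == 0:
            break  # no idle nodes left; remaining requests keep waiting
        request_id, database = pending.pop(0)
        group = pick_nodes(nodes, database, size)
        for n in group:
            n.busy = True
            n.cached.add(database)  # the node now holds this database
        assignments[request_id] = [n.name for n in group]
    return assignments
```

In this sketch a lone request on an otherwise idle cluster receives a large group (bounded by `max_group`), while a long queue pushes each request toward a single node. That reflects the abstract's point that no single static allocation suits every workload.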

Editor information

Bertram Ludäscher, Nikos Mamoulis

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, H., Ma, X., Li, J., Yu, T., Samatova, N. (2008). Adaptive Request Scheduling for Parallel Scientific Web Services. In: Ludäscher, B., Mamoulis, N. (eds) Scientific and Statistical Database Management. SSDBM 2008. Lecture Notes in Computer Science, vol 5069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69497-7_19

  • DOI: https://doi.org/10.1007/978-3-540-69497-7_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69476-2

  • Online ISBN: 978-3-540-69497-7

  • eBook Packages: Computer Science, Computer Science (R0)
