Scalable Digital Libraries Based on NCSTRL/Dienst

  • Kurt Maly
  • Mohammad Zubair
  • Hesham Anan
  • Dun Tan
  • Yunchuan Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1923)

Abstract

NCSTRL (the Networked Computer Science Technical Report Library) is a successful digital library for scientific and technical information. It uses the Dienst protocol, which was developed by the ARPA-funded CS-TR project. We encountered several problems while implementing large-scale NCSTRL-based libraries: UPS for Los Alamos and JDL for JTASC. The document collections for these libraries can range from several hundred thousand to a few million records. First, we found that the native Dienst implementation does not scale beyond approximately 30,000 records. Second, we found that the implementation is tightly coupled to the Unix platform. Finally, the NCSTRL search interface offers limited usability when a search returns a large number of hits. To address these problems, we replaced the Dienst repository service implementation with an Oracle-based implementation using servlet technology. The Oracle database stores the index information (metadata) and is partitioned horizontally to speed up searching across different archives. Furthermore, indexes were built to speed up searches on key items such as the author name, the title, and the abstract. Our implementation significantly reduced the average user wait time for searches that returned a large number of hits. In addition, we gain the other benefits of servlet technology, such as efficiency and portability. In this paper, we present the performance results of the new implementation and compare them with those of the Dienst protocol implementation in NCSTRL.
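The storage idea in the abstract (metadata partitioned horizontally by archive, with indexes on the searched fields) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python's built-in SQLite module as a stand-in for Oracle and servlets, and the archive names, table names, and column names are all hypothetical.

```python
import sqlite3

# Hypothetical archive names; the abstract does not list its partitions.
ARCHIVES = ["arxiv", "ncstrl"]


def build_library(conn, archives):
    """Create one metadata table per archive (a simple form of horizontal partitioning)."""
    for name in archives:
        table = f"meta_{name}"  # names come from a trusted list, never from user input
        conn.execute(
            f"CREATE TABLE {table} "
            "(id TEXT PRIMARY KEY, author TEXT, title TEXT, abstract TEXT)"
        )
        # Indexes on fields that users search by.
        conn.execute(f"CREATE INDEX idx_{name}_author ON {table}(author)")
        conn.execute(f"CREATE INDEX idx_{name}_title ON {table}(title)")


def add_record(conn, archive, rec_id, author, title, abstract):
    """Insert one metadata record into the partition for its archive."""
    conn.execute(
        f"INSERT INTO meta_{archive} VALUES (?, ?, ?, ?)",
        (rec_id, author, title, abstract),
    )


def search_by_author(conn, archives, author):
    """Query only the partitions of the requested archives."""
    hits = []
    for name in archives:
        rows = conn.execute(
            f"SELECT id, title FROM meta_{name} WHERE author = ? ORDER BY id",
            (author,),
        ).fetchall()
        hits.extend((name, rec_id, title) for rec_id, title in rows)
    return hits
```

A search restricted to one archive touches only that archive's table, which is the intuition behind partitioning by archive; the indexes let author and title lookups avoid full table scans.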

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Kurt Maly (1)
  • Mohammad Zubair (1)
  • Hesham Anan (1)
  • Dun Tan (1)
  • Yunchuan Zhang (1)

  1. Department of Computer Science, Old Dominion University, Norfolk, USA
