Scalable Digital Libraries Based on NCSTRL/Dienst
- Cite this paper as:
- Maly K., Zubair M., Anan H., Tan D., Zhang Y. (2000) Scalable Digital Libraries Based on NCSTRL/Dienst. In: Borbinha J., Baker T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2000. Lecture Notes in Computer Science, vol 1923. Springer, Berlin, Heidelberg
NCSTRL (the Networked Computer Science Technical Report Library) is a successful digital library for scientific and technical information. It uses the Dienst protocol, which was developed by the ARPA-funded CS-TR project. We encountered several problems while implementing large-scale NCSTRL-based libraries: UPS for Los Alamos and JDL for JTASC. The document collections for these libraries can range from several hundred thousand to a few million documents. First, we found that the native Dienst implementation does not scale beyond approximately 30,000 records. Second, we found that the implementation is tightly coupled to the Unix platform. Finally, the NCSTRL search interface offers limited usability when a search returns a large number of hits. To address these problems, we replaced the Dienst repository service implementation with an Oracle-based implementation using servlet technology. The Oracle database stores the index information (metadata) and is partitioned horizontally to speed up searching across different archives. Furthermore, indexes were built to speed up searches on key fields such as the author name, the title, and the abstract. Our implementation significantly reduced the average user wait time for searches that return a large number of hits. In addition, we gain the other benefits of servlet technology, such as efficiency and portability. In this paper, we present the performance results of the new implementation and compare them with those of the Dienst protocol implementation in NCSTRL.
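The storage layout described in the abstract (metadata partitioned horizontally by archive, with indexes on the searched fields) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python's built-in SQLite module in place of Oracle, emulates horizontal partitioning with one table per archive, and indexes only author and title, since full-text indexing of abstracts is engine-specific. The archive keys, table names, column names, and function names are all hypothetical.

```python
import sqlite3

# Hypothetical archive keys; each one gets its own partition (table).
ARCHIVES = ["ups", "jdl"]


def open_library():
    """Create an in-memory database with one metadata table per archive."""
    conn = sqlite3.connect(":memory:")
    for archive in ARCHIVES:
        table = f"metadata_{archive}"
        conn.execute(
            f"CREATE TABLE {table} ("
            "doc_id TEXT PRIMARY KEY, author TEXT, title TEXT, abstract TEXT)"
        )
        # Indexes on the fields users search by (assumed: author and title).
        conn.execute(f"CREATE INDEX idx_{archive}_author ON {table}(author)")
        conn.execute(f"CREATE INDEX idx_{archive}_title ON {table}(title)")
    return conn


def add_record(conn, archive, doc_id, author, title, abstract):
    """Insert one metadata record into the partition for its archive."""
    if archive not in ARCHIVES:
        raise ValueError(f"unknown archive: {archive}")
    conn.execute(
        f"INSERT INTO metadata_{archive} VALUES (?, ?, ?, ?)",
        (doc_id, author, title, abstract),
    )


def search_by_author(conn, author, archives=None):
    """Search only the requested partitions; default is all archives."""
    results = []
    for archive in archives or ARCHIVES:
        if archive not in ARCHIVES:
            raise ValueError(f"unknown archive: {archive}")
        rows = conn.execute(
            f"SELECT doc_id, title FROM metadata_{archive} "
            "WHERE author = ? ORDER BY doc_id",
            (author,),
        ).fetchall()
        results.extend((archive, doc_id, title) for doc_id, title in rows)
    return results
```

Restricting a query to one archive touches only that partition, which is the intuition behind partitioning by archive; a production database would typically handle the partition routing itself rather than in application code.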