Research and Implement of Real-Time Data Loading System IMIL

  • Han WeiHong
  • Jia Yan
  • Yang ShuQiang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4256)


With the rapid development of the Internet and communication technology, massive amounts of data have accumulated in many web-based applications, such as deep web applications and web search engines. Growing data volumes pose enormous challenges to data-loading techniques. This paper presents IMIL (Internet Monitoring Information Loader), a real-time data loading system used in RT-IMIS (Real-time Internet Monitoring Information System), which monitors Internet traffic in real time, manages network security, and collects large volumes of real-time Internet information. IMIL consists of an extensible, fault-tolerant hardware architecture, an efficient bulk data loading algorithm based on SQL*Loader and the exchange partition mechanism, optimized parallelism, and guidelines for system tuning. Performance studies show the positive effects of these techniques: the loading speed of each cluster increased from 220 million records per day to 1.2 billion records per day, and the peak loading speed reached 6 TB of data with 10 clusters running in parallel. This framework offers a promising approach for loading other large and complex databases.
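The abstract names SQL*Loader and the exchange partition mechanism but gives no details of how IMIL combines them. The following is a minimal sketch of the general pattern, not the authors' implementation: data is bulk-loaded into a standalone staging table, which is then swapped into one partition of the target table. The table, partition, file, and column names are hypothetical, and the helper functions only build the text that would be passed to Oracle tools.

```python
# Sketch only: all table, partition, file, and column names are hypothetical,
# and this is not the IMIL implementation described in the paper.


def control_file(data_file, staging_table, columns):
    """Build a minimal SQL*Loader control file that appends CSV rows
    to a staging table."""
    cols = ", ".join(columns)
    return (
        "LOAD DATA\n"
        f"INFILE '{data_file}'\n"
        "APPEND\n"
        f"INTO TABLE {staging_table}\n"
        "FIELDS TERMINATED BY ','\n"
        f"({cols})\n"
    )


def exchange_partition_sql(target_table, partition, staging_table):
    """Build the Oracle DDL that swaps a loaded staging table into one
    partition of the target table."""
    return (
        f"ALTER TABLE {target_table} "
        f"EXCHANGE PARTITION {partition} "
        f"WITH TABLE {staging_table} "
        "INCLUDING INDEXES WITHOUT VALIDATION"
    )


if __name__ == "__main__":
    print(control_file("flows.csv", "flow_log_stage", ["src_ip", "dst_ip", "ts"]))
    print(exchange_partition_sql("flow_log", "p_20060101", "flow_log_stage"))
```

In this pattern the exchange step is a data dictionary operation and does not copy rows, which is why it is commonly paired with bulk loading into a staging table. Whether IMIL uses these exact options is an assumption.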


Keywords: Data Loading; Data Query; Query Plan; Database Table; Text Index





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Han WeiHong (1)
  • Jia Yan (1)
  • Yang ShuQiang (1)
  1. Computer School, National University of Defense Technology, Changsha, China
