A Web-Based Transformation System for Massive Scientific Data

  • Shi Feng
  • Jie Song
  • Xuhui Bai
  • Daling Wang
  • Ge Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4256)


In scientific research, a large share of the data obtained and generated by instruments are in text form. How to make the best use of these data has become an issue for both natural science researchers and computer professionals. Many of these data have an inherent logical structure, but they differ from self-describing semi-structured data because they are stored separately from their schema. As the volume of data grows rapidly, the traditional ways of studying these data cannot meet the need for high performance and flexible access. Relational DBMSs are a well-established technique for organizing and managing data. In this paper, a mapping model, STRIPE, between scientific text and relational databases is proposed. Using STRIPE, we design and implement a Web-based massive scientific data transformation system, which offers a solution to the problems of managing, querying and exchanging massive scientific data. The evaluation of the system shows that it can greatly improve the efficiency of scientific data transformation and offers scientists a novel platform for studying the data.
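
The abstract does not describe the internals of STRIPE, so the following is not the authors' model. It is only a minimal sketch of the general idea of mapping schema-less text records into a relational table, assuming whitespace-delimited lines and an externally supplied schema. The field names, the sample records, and the helper functions `parse_records` and `load` are all hypothetical, and SQLite stands in for whatever relational DBMS a real system would use.

```python
import sqlite3

# Hypothetical external schema: (column name, SQL type, Python converter).
# The data file itself carries no field names, so the schema lives here.
SCHEMA = [
    ("station", "TEXT", str),
    ("depth_m", "REAL", float),
    ("temp_c", "REAL", float),
]

# Made-up records standing in for instrument output in text form.
RAW_TEXT = """\
ST001 10.0 18.25
ST001 50.0 12.40
ST002 10.0 17.90
"""


def parse_records(text, schema):
    """Turn whitespace-delimited lines into typed tuples using the schema."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(schema):
            continue  # skip blank or malformed lines
        rows.append(tuple(conv(f) for (_, _, conv), f in zip(schema, fields)))
    return rows


def load(conn, table, schema, rows):
    """Create the table from the schema and bulk-insert the parsed rows."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype, _ in schema)
    marks = ", ".join("?" for _ in schema)
    conn.execute(f"CREATE TABLE {table} ({cols})")
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)


conn = sqlite3.connect(":memory:")
rows = parse_records(RAW_TEXT, SCHEMA)
load(conn, "observation", SCHEMA, rows)

# Once the text is relational, ordinary SQL gives flexible access to it.
avg_temp = conn.execute(
    "SELECT station, AVG(temp_c) FROM observation "
    "GROUP BY station ORDER BY station"
).fetchall()
```

Keeping the schema as data rather than hard-coding it in the parser reflects the point made above that such text is stored separately from its schema: the same loader could, under these assumptions, be reused for a different instrument format by swapping in a different `SCHEMA` list.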





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Shi Feng (1)
  • Jie Song (1)
  • Xuhui Bai (1)
  • Daling Wang (1)
  • Ge Yu (1)

  1. College of Information Science and Engineering, Northeastern University, Shenyang, P.R. China
