Structural Similarity Mining in Semi-structured Microarray Data for Efficient Storage Construction

  • Conference paper
On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops (OTM 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4277)

Abstract

Much research has addressed the storage of XML data, and some of it proposes methods that improve database performance by reducing the number of joins between tables. These methods are very efficient at deriving and optimizing tables from a DTD or XML schema in which elements and attributes are defined. They are not effective, however, for XML schemas that describe biological information such as microarray data: although microarray data have complex hierarchies, only a few core values appear repeatedly within them. In this paper, we propose a new algorithm that extracts the core features that occur repeatedly in an XML schema for biological information, and we show how using a decision tree rather than pattern matching to classify structural similarities improves classification speed and efficiency. We designed a database for storing biological information using the features extracted by our algorithm. Our experiments showed that the proposed classification algorithm also reduces the number of joins between tables.
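The idea of a core structure that recurs at several places in a hierarchy can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, which classifies structural similarities with a decision tree; the sketch swaps in plain frequency counting over element signatures. The sample document, its element names (`Experiment`, `Sample`, `Array`, `Feature`, `Value`), and the functions `signature` and `repeated_structures` are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical microarray-like document (not a real MAGE-ML instance):
# the same <Value> structure recurs under different parents.
SAMPLE = """
<Experiment>
  <Sample>
    <Value><name/><unit/></Value>
  </Sample>
  <Array>
    <Feature>
      <Value><name/><unit/></Value>
    </Feature>
    <Value><name/><unit/></Value>
  </Array>
</Experiment>
"""


def signature(elem):
    """Structural signature: the element's tag plus the sorted tags of its children."""
    return (elem.tag, tuple(sorted(child.tag for child in elem)))


def repeated_structures(root, min_count=2):
    """Return signatures of non-leaf elements that occur at least min_count times."""
    counts = Counter(signature(e) for e in root.iter() if len(e) > 0)
    return {sig: n for sig, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for sig, n in repeated_structures(root).items():
        print(sig, n)
```

In this sample only the `Value` structure recurs, under `Sample`, `Feature`, and `Array`. A structure like that could plausibly be stored once in a shared table instead of once per parent, which is the kind of join reduction the abstract describes.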

An erratum to this chapter can be found at http://dx.doi.org/10.1007/11915034_125.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jeong, J., Shin, D., Cho, C., Shin, D. (2006). Structural Similarity Mining in Semi-structured Microarray Data for Efficient Storage Construction. In: Meersman, R., Tari, Z., Herrero, P. (eds) On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops. OTM 2006. Lecture Notes in Computer Science, vol 4277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11915034_96

  • DOI: https://doi.org/10.1007/11915034_96

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-48269-7

  • Online ISBN: 978-3-540-48272-7

  • eBook Packages: Computer Science (R0)
