Structural Similarity Mining in Semi-structured Microarray Data for Efficient Storage Construction

Jeong, Jongil; Shin, Dongil; Cho, Chulho; Shin, Dongkyoo

doi:10.1007/11915034_96


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4277)

Included in the following conference series:

OTM Confederated International Conferences "On the Move to Meaningful Internet Systems"


Abstract

Many studies have addressed the storage of XML data, and some have proposed methods that improve database performance by reducing the number of joins between tables. These methods are very efficient at deriving and optimizing tables from a DTD or XML schema in which elements and attributes are defined. They are less effective, however, for XML schemas for biological information such as microarray data: although microarray data have complex hierarchies, only a few core values appear repeatedly within those hierarchies. In this paper, we propose a new algorithm that extracts the core features that occur repeatedly in an XML schema for biological information, and we explain how using a decision tree rather than pattern matching to classify structural similarities improves classification speed and efficiency. We designed a database for storing biological information using the features extracted by our algorithm. Our experiments showed that the proposed classification algorithm also reduced the number of joins between tables.
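
To make the idea of "core features that occur repeatedly" concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not specify. It assumes that a structural feature can be approximated by the set of child tag names of an element, and it reports the signatures shared by several elements. The sample document, the element names, and the functions child_signature and repeated_structures are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample; element names are illustrative, not taken from the paper.
SAMPLE = """
<Experiment>
  <BioSample><Name/><Value/></BioSample>
  <Array>
    <Feature><Name/><Value/></Feature>
    <Reporter><Name/><Value/></Reporter>
  </Array>
  <Description><Text/></Description>
</Experiment>
"""


def child_signature(element):
    """Return the sorted tuple of child tag names, used as a structural signature."""
    return tuple(sorted(child.tag for child in element))


def repeated_structures(xml_text, min_count=2):
    """Map each signature seen at least min_count times to the tags that share it."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    owners = {}
    for element in root.iter():  # document order, root included
        signature = child_signature(element)
        if not signature:  # leaf elements carry no structure of their own
            continue
        counts[signature] += 1
        owners.setdefault(signature, []).append(element.tag)
    return {sig: owners[sig] for sig, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    for signature, tags in repeated_structures(SAMPLE).items():
        print(signature, "shared by", tags)
```

In this toy document, BioSample, Feature, and Reporter share the same (Name, Value) structure. Under the assumption made here, such elements are candidates for a single shared table, which is one way the number of joins could be reduced.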

An erratum to this chapter can be found at http://dx.doi.org/10.1007/11915034_125.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Sejong University, 98 Kunja-Dong, Kwangjin-Ku, Seoul, 143-747, Korea
Jongil Jeong, Dongil Shin & Dongkyoo Shin
College of Business Administration, Kyung Hee University, 1 Hoegi-Dong, Dongdaemun-Ku, Seoul, 130-701, Korea
Chulho Cho

Editor information

Editors and Affiliations

STARLab, Vrije Universiteit Brussel (VUB), Bldg G/10, Pleinlaan 2, 1050, Brussels, Belgium
Robert Meersman
School of Computer Science and Information Technology, RMIT University, Bld 10.10, 376-392 Swanston Street, 3001, Melbourne, VIC, Australia
Zahir Tari
Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo S/N, 28660, Boadilla del Monte, Madrid, Spain
Pilar Herrero

About this paper

Cite this paper

Jeong, J., Shin, D., Cho, C., Shin, D. (2006). Structural Similarity Mining in Semi-structured Microarray Data for Efficient Storage Construction. In: Meersman, R., Tari, Z., Herrero, P. (eds) On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops. OTM 2006. Lecture Notes in Computer Science, vol 4277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11915034_96

DOI: https://doi.org/10.1007/11915034_96
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48269-7
Online ISBN: 978-3-540-48272-7
eBook Packages: Computer Science, Computer Science (R0)
