Enhancing Automatic Construction of Gene Subnetworks by Integrating Multiple Sources of Information

Suwannaroj, Sujimarn; Niranjan, Mahesan

doi:10.1007/s11265-007-0148-4

Published: 06 February 2008

Volume 50, pages 331–340, (2008)
Journal of Signal Processing Systems

Sujimarn Suwannaroj¹ &
Mahesan Niranjan¹

Abstract

We present an approach to extracting information from textual documents of biological knowledge and demonstrate how cellular gene pathways may be inferred from it. Natural language processing techniques are used to represent the title and abstract fields of publications and to derive gene similarity vectors, which are then subjected to cluster analysis. Gene interactions are derived by parsing sentences in the abstracts to infer causal relationships. We show how high-throughput transcriptome data may then be used to enhance the construction of gene pathways from information derived from text. Subnetworks constructed by integrating information automatically derived from the literature with gene expression data are validated by comparison with biological processes defined in the Gene Ontology (GO) database. We find that precision increases in 58% of the clusters when enhanced in this manner, while a decrease in precision is observed in a relatively small number of clusters. These results are compared with similar attempts at the same problem and appear to be better in terms of precision of network construction. We also show an example of a subnetwork found by this analysis that overlaps a known gene pathway in the KEGG and MIPS databases.
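
To make the text-similarity and validation steps concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the vector representation or the exact definition of precision, so the tf-idf weighting, the cosine similarity, the set-overlap precision, the function names, and the toy gene documents below are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each gene to a tf-idf weighted term vector built from its text."""
    tokenized = {gene: text.lower().split() for gene, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of gene documents containing each term.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for gene, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[gene] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_precision(cluster, reference):
    """Fraction of genes in a cluster that also appear in a reference set
    (an assumed stand-in for a GO biological process annotation)."""
    members = set(cluster)
    if not members:
        return 0.0
    return len(members & set(reference)) / len(members)


# Toy, made-up documents; real input would be titles and abstracts per gene.
docs = {
    "geneA": "kinase activates cell cycle arrest",
    "geneB": "kinase regulates cell cycle progression",
    "geneC": "membrane transporter imports glucose",
}
vecs = tfidf_vectors(docs)
sim_ab = cosine(vecs["geneA"], vecs["geneB"])  # share several terms
sim_ac = cosine(vecs["geneA"], vecs["geneC"])  # share no terms
```

In this sketch, genes whose associated text shares informative terms receive a higher similarity and would tend to fall into the same cluster. A cluster's precision can then be recomputed after genes are added or removed using expression data, to check whether the enhancement helped.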

Acknowledgement

We are grateful to the Language Technology Group at the University of Edinburgh (LT CHUNK), AT&T Labs-Research (Graphviz), and SGD (GoTermFinder) for making software available in the public domain. SS was funded by the Royal Thai Government.

Author information

Authors and Affiliations

The Department of Computer Science, Regent Court, 211 Portobello Street, Sheffield, S1 4DP, UK
Sujimarn Suwannaroj & Mahesan Niranjan

Corresponding author

Correspondence to Sujimarn Suwannaroj.

About this article

Cite this article

Suwannaroj, S., Niranjan, M. Enhancing Automatic Construction of Gene Subnetworks by Integrating Multiple Sources of Information. J Sign Process Syst Sign Image 50, 331–340 (2008). https://doi.org/10.1007/s11265-007-0148-4

Received: 26 September 2007
Revised: 26 September 2007
Accepted: 27 September 2007
Published: 06 February 2008
Issue Date: March 2008
DOI: https://doi.org/10.1007/s11265-007-0148-4
