Abstract
Understanding the roles and interplay of histone marks and transcription factors in the regulation of gene expression is of great interest for the development of non-invasive and personalized therapies. Computational studies at genome-wide scale provide a powerful exploratory framework from which general conclusions can be drawn. However, a genome-wide approach identifies only generic regulatory motifs, and possible multi-functional or co-regulatory interactions may remain concealed. In this work, we hypothesize that the set of protein-coding genes contains a number of distinct subpopulations of transcriptional regulatory patterns, which explain the statistical redundancy observed at the genome-wide level. We propose applying a K-Plane Regression algorithm to partition the set of protein-coding genes into clusters with specific shared regulatory mechanisms. Our approach is completely data-driven and computes clusters of genes that are fitted significantly better by cluster-specific linear models than by a single regression. These clusters are characterized by distinct and sharper histone input patterns and by different mean expression values.
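To make the clustering idea concrete, the following is a minimal sketch of a generic K-plane regression loop, not the authors' implementation. It alternates between fitting one least-squares plane per cluster and reassigning each sample to the plane with the smallest squared residual. The function name `k_plane_regression`, its parameters, the random initialization, and the stopping rule are all assumptions made for illustration; in the setting of the abstract, each row of `X` would hold the histone-mark signals of one gene and `y` its expression value.

```python
import numpy as np


def k_plane_regression(X, y, k, n_iter=50, seed=0):
    """Illustrative K-plane regression (hypothetical helper, not from the paper).

    X: (n_samples, n_features) inputs; y: (n_samples,) targets.
    Returns W, a (k, n_features + 1) array of plane coefficients with the
    intercept in the last column, and labels, the cluster index per sample.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.hstack([X, np.ones((n, 1))])   # append an intercept column
    labels = rng.integers(0, k, size=n)    # assumed: random initial partition
    W = np.zeros((k, Xb.shape[1]))

    for _ in range(n_iter):
        # Fit step: ordinary least squares on the samples of each cluster.
        for j in range(k):
            mask = labels == j
            if mask.sum() >= Xb.shape[1]:  # otherwise keep the previous plane
                W[j] = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0]

        # Assignment step: each sample moves to its best-fitting plane.
        residuals = (Xb @ W.T - y[:, None]) ** 2
        new_labels = residuals.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                          # partition is stable
        labels = new_labels

    return W, labels
```

With `k=1` the loop reduces to a single ordinary least-squares regression, which corresponds to the genome-wide baseline that the clustered models are compared against. Like other alternating schemes, this sketch can converge to a local optimum, so results depend on the initial partition.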
This work was supported by the Advanced ERC Grant “Data-Driven Genomic Computing (GeCo)” project (2016–2021), funded by the European Research Council.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Frasca, F., Matteucci, M., Morelli, M.J., Masseroli, M. (2020). Exposing and Characterizing Subpopulations of Distinctly Regulated Genes by K-Plane Regression. In: Raposo, M., Ribeiro, P., Sério, S., Staiano, A., Ciaramella, A. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2018. Lecture Notes in Computer Science, vol 11925. Springer, Cham. https://doi.org/10.1007/978-3-030-34585-3_20
DOI: https://doi.org/10.1007/978-3-030-34585-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34584-6
Online ISBN: 978-3-030-34585-3
eBook Packages: Computer Science (R0)