Finding Teams in Graphs and Its Application to Spatial Gene Cluster Discovery
Gene clusters are sets of genes in a genome with associated functionality. Often, they lie close to each other on the chromosome, which can be beneficial for their common regulation. A popular strategy for finding gene clusters exploits this proximity by identifying sets of genes that are consistently close to each other on their respective chromosomal sequences across several related species.
Yet, even more than gene proximity on linear DNA sequences, the spatial conformation of chromosomes may provide a pivotal indicator for common regulation and/or associated function of sets of genes.
We present the first gene cluster model capable of handling spatial data. Our model extends a popular computational model for gene cluster prediction, called \(\delta\)-teams, from sequences to general graphs. In this extension, \(\delta\)-teams are single-linkage clusters of a set of vertices shared between two or more undirected weighted graphs such that the largest link in the cluster does not exceed a given threshold \(\delta\) in any input graph.
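To make the definition concrete, the following is a minimal sketch of how \(\delta\)-teams on graphs could be computed by repeated splitting. It is not the algorithm of the paper. It assumes that each graph is given as a dictionary of adjacency dictionaries with non-negative edge weights, that every vertex appears as a key, and that the length of a link between two shared vertices is their shortest-path distance in the whole graph. The function names `within_delta`, `split`, and `delta_teams` are hypothetical.

```python
import heapq


def within_delta(graph, source, delta):
    """Return all vertices whose shortest-path distance from source is <= delta."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            # Paths longer than delta are never extended (assumes weights >= 0).
            if nd <= delta and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return set(dist)


def split(graph, candidates, delta):
    """Single-linkage components of `candidates` in `graph` at threshold delta."""
    remaining = set(candidates)
    parts = []
    while remaining:
        seed = remaining.pop()
        part, stack = {seed}, [seed]
        while stack:
            u = stack.pop()
            # Candidates linked to u by a distance of at most delta join its part.
            near = within_delta(graph, u, delta) & remaining
            remaining -= near
            part |= near
            stack.extend(near)
        parts.append(part)
    return parts


def delta_teams(graphs, delta):
    """Maximal shared vertex sets that stay single-linkage connected in every graph."""
    shared = set.intersection(*(set(g) for g in graphs))
    if not shared:
        return []
    todo, teams = [shared], []
    while todo:
        cand = todo.pop()
        for g in graphs:
            parts = split(g, cand, delta)
            if len(parts) > 1:
                # cand falls apart in this graph: refine each piece further.
                todo.extend(parts)
                break
        else:
            # cand is connected at threshold delta in every graph.
            teams.append(frozenset(cand))
    return teams
```

Under these assumptions, a candidate set is reported only once no input graph can split it any further, so singleton teams are included in the output and may be filtered by the caller. Each round recomputes distances from scratch, which keeps the sketch short at the cost of efficiency.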
We apply our model to human and mouse data to find spatial gene clusters, i.e., gene sets with functional associations that exhibit close neighborhood in the spatial conformation of the chromosome across species.
Keywords: Spatial gene cluster; Gene teams; Single-linkage clustering; Graph teams; Hi-C data
We are very grateful to Krister Swenson for kindly providing the Hi-C data used in this study and for his many valuable suggestions. We wish to thank Pedro Feijão for many fruitful discussions in the beginning of this project. This work was partially supported by DFG GRK 1906/1.