Abstract
Recently, conjoined genes (CGs) have emerged as important genetic factors necessary for understanding the human genome. However, their formation mechanism and precise structures have remained mysterious. Based on a detailed structural analysis of 57 human CG transcript variants (CGTVs, discovered in this study) and all (833) known CGs in the human genome, we discovered that the poly(A) signal site from the upstream parent gene region is completely removed via the skipping or truncation of the final exon; consequently, CG transcription is terminated at the poly(A) signal site of the downstream parent gene. This result led us to propose a novel mechanism of CG formation: the complete removal of the poly(A) signal site from the upstream parent gene is a prerequisite for the CG transcriptional machinery to continue transcribing uninterrupted into the intergenic region and downstream parent gene. The removal of the poly(A) signal sequence from the upstream gene region appears to be caused by a deletion or truncation mutation in the human genome rather than post-transcriptional trans-splicing events. With respect to the characteristics of CG sequence structures, we found that intergenic regions are hot spots for novel exon creation during CGTV formation and that exons farther from the intergenic regions are more highly conserved in the CGTVs. Interestingly, many novel exons newly created within the intergenic and intragenic regions originated from transposable element sequences. Additionally, the CGTVs showed tumor tissue-biased expression. In conclusion, our study provides novel insights into the CG formation mechanism and expands the present concepts of the genetic structural landscape, gene regulation, and gene formation mechanisms in the human genome.
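The proposed mechanism can be illustrated with a toy model. The Python sketch below is not the authors' analysis pipeline. It assumes a hypothetical locus whose segment names and sequences are invented for illustration, and it makes the simplifying assumption that transcription ends at the first poly(A) signal hexamer encountered (AATAAA or its common variant ATTAAA). When the upstream final exon carrying the signal is skipped, the first signal met lies in the downstream parent gene.

```python
import re

# Canonical poly(A) signal hexamer and its most common variant (DNA alphabet).
POLYA_SIGNAL = re.compile(r"AATAAA|ATTAAA")


def first_polya_signal(segments):
    """Return (segment_name, offset) of the first poly(A) signal met while
    reading the segments in 5'->3' order, or None if no signal is present."""
    for name, seq in segments:
        match = POLYA_SIGNAL.search(seq.upper())
        if match:
            return name, match.start()
    return None


# Hypothetical toy locus: names and sequences are made up for illustration.
locus = [
    ("upstream_exon1", "ATGGCTGCC"),
    ("upstream_final_exon", "GGCTGAAATAAAGC"),   # carries a poly(A) signal
    ("intergenic", "CCGTTGCAGG"),
    ("downstream_exon1", "GGATCCGTA"),
    ("downstream_final_exon", "TTCTGAAATAAATT"),  # carries a poly(A) signal
]

# Normal case: the signal in the upstream final exon is met first.
normal_stop = first_polya_signal(locus)

# Conjoined case: the upstream final exon is skipped, so the signal is gone
# and the first signal met belongs to the downstream parent gene.
skipped = [seg for seg in locus if seg[0] != "upstream_final_exon"]
conjoined_stop = first_polya_signal(skipped)

print(normal_stop)
print(conjoined_stop)
```

In a real transcript, cleavage occurs some distance downstream of the signal and depends on additional sequence elements, so the sketch captures only the ordering argument made in the abstract.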
Acknowledgements
We thank the members of the Genome Resource Center (GRC) in the Korea Research Institute of Bioscience and Biotechnology (KRIBB) for their active assistance in conducting this research project. This research was supported by grant 2009-0084206 from the Ministry of Education, Science and Technology (MEST) and grant KGM5411011 from KRIBB.
Competing financial interests
The authors declare no competing financial interests.
Author information
Additional information
Ryong Nam Kim, Aeri Kim, and Sang-Haeng Choi contributed equally to this work.
About this article
Cite this article
Kim, R.N., Kim, A., Choi, SH. et al. Novel mechanism of conjoined gene formation in the human genome. Funct Integr Genomics 12, 45–61 (2012). https://doi.org/10.1007/s10142-011-0260-1
DOI: https://doi.org/10.1007/s10142-011-0260-1