Unmapped sequencing reads identify additional candidate genes linked to magnetoreception in rainbow trout
A recent study identified candidate genes linked to magnetoreception in rainbow trout (Oncorhynchus mykiss) by sequencing transcriptomes from the brains of fish exposed to a magnetic pulse. However, the discovery of these candidate genes was limited to sequences that aligned to the reference genome. The unaligned, or unmapped, sequences may still contain valuable information from regions that are missing from, misassembled in, or divergent from the reference. Using the available sequencing data from the trout brain transcriptomes, we assembled >27 million unmapped sequences (5.8% of total sequences) into 45,142 contigs and identified 12 contigs that were differentially expressed following exposure to a pulsed magnetic field. These contigs encoded a putative superoxide dismutase, a protein necessary to prevent oxidative damage, and collagen alpha-1 type II, a structural protein important for the development and integrity of the retina. These genes are consistent with the previous study, which suggested an effect of the magnetic pulse on the oxidative consequences of free iron and on non-visual encephalic photoreceptors. Our results demonstrate the utility of assembling unmapped sequencing reads in studies of gene expression and identify additional candidate genes associated with a magnetic sense in trout.
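As an illustration of what "unmapped" means in practice, the following is a minimal sketch of selecting unmapped reads from alignment records. It assumes the alignments are stored in SAM text format, where bit 0x4 of the FLAG field marks an unmapped read; this section does not state which tools or file formats were actually used, so the function name `unmapped_reads` and the toy records are hypothetical and are not data from the study.

```python
from typing import Iterable, Iterator, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) for every record whose unmapped bit is set."""
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # SAM columns: 1 = QNAME, 2 = FLAG, 10 = SEQ
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED_FLAG:
            yield name, seq


# Hypothetical toy records for illustration only.
example = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCAA\tIIIIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGCATCGA\tIIIIIIII",
]

unmapped = list(unmapped_reads(example))
n_records = sum(1 for line in example if not line.startswith("@"))
fraction_unmapped = len(unmapped) / n_records
```

Reads collected this way could then be passed to a de novo assembler to build contigs, and `fraction_unmapped` is the same kind of quantity as the 5.8% reported above.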
Keywords: Gene expression, Transcriptomics, RNA-seq, Oncorhynchus mykiss
We thank the Duke Shared Cluster Resource for providing the computational resources necessary for the project. We also thank E. Caves, L. Schweikert, and J. Notar for comments on earlier drafts of this manuscript. This work was supported by the Duke University Scholars Program [to M.B.A.] and the Air Force Office of Scientific Research [FA9550-14-1-0208 to S.J. and R.R.F.].
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.