Abstract
Freshwater ecosystems are among the most endangered ecosystems worldwide. While numerous taxa are on the verge of extinction as a result of global changes and direct or indirect anthropogenic activity, genomic and transcriptomic resources represent a key tool for comprehending species' adaptability and serve as the foundation for conservation initiatives. The Loire grayling, Thymallus ligericus, is a freshwater European salmonid endemic to the upper Loire River basin. The species is comprised of fragmented populations that are dispersed over a small area and it has been identified as a vulnerable species. Here, we provide a multi-tissue de novo transcriptome assembly of T. ligericus. The completeness and integrity of the transcriptome were assessed before and after redundancy removal with lineage-specific libraries from Eukaryota, Metazoa, Vertebrata, and Actinopterygii. Relative gene expression was assessed for each of the analyzed tissues, using the de novo assembled transcriptome and a genome-based analysis using the available T. thymallus genome as a reference. The final assembly, with a contig N50 of 1221 and Benchmarking Universal Single-Copy Orthologs (BUSCO) scores above 94%, is made accessible along with structural and functional annotations and relative gene expression of the five tissues (NCBI SRA and FigShare databases). This is the first transcriptomic resource for this species, which provides a foundation for future research on this and other salmonid species that are increasingly exposed to environmental stressors.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
Freshwater fishes are among the most threatened vertebrates. In Europe for example, they are the second most threatened group of organisms following molluscs (Costa et al. 2021). Of the 434 fish species reported in the IUCN red list, an estimated 41.2% are threatened, salmonids being the second most affected family (Costa et al. 2021). The most common threats affecting freshwater fauna in Europe include habitat fragmentation, pollution, the presence of invasive species, dams and hydropower plants and climate change (Mueller et al. 2020). Thymallinae (graylings), a monogeneric subfamily in the diverse Salmonidae family, is represented solely by the genus Thymallus, commonly named grayling. According to the latest revision, 15 taxa are currently listed in the genus (Weiss et al. 2021). The highest diversity occurs in the Asian continent, where 12 of the 15 known species are found (Weiss et al. 2021). One of these, Thymallus arcticus (Pallas 1776), also inhabits some parts of North America, more specifically Alaska, Montana and Canada (Weiss et al. 2021). Three species are recognized in Europe: the widely distributed European grayling Thymallus thymallus (Linnaeus, 1758), the Adriatic grayling Thymallus aeliani Valenciennes, 1848 and the Loire grayling Thymallus ligericus Persat et al., 2019. Until a few years ago, the latter two were considered different lineages of T. thymallus. Thymallus ligericus is the most recently described Thymallus species (Persat et al. 2019) and a revision of the genus categorized it as vulnerable, based on its distribution range and the documented habitat fragmentation (Persat et al. 2016; Weiss et al. 2021). This grayling inhabits the upper reaches of the Loire River (France) and some of its main tributaries, such as the Vienne, the Allier, the Sioule and the Lignon Rivers (Persat et al. 2016; Weiss et al. 2021). Populations of T. ligericus have maintained their reproductive isolation despite decades of failed restocking attempts as demonstrated by the T. ligericus population in the Vienne River, which has been confronted with hatchery-reared T. thymallus over 50 years, yet no trace of these stocked individuals can be found (Persat et al. 2016).
The impact of environmental changes on species functions such as physiology, behaviour, and gene expression, is relatively unexplored in Thymallinae. Studies exist for the widely distributed T. thymallus and T. arcticus, as well as for other salmonid species and can offer insights into how environmental stressors may affect T. ligericus and other Thymallinae. Indeed it is demonstrated that events such as rising temperature, discharge fluctuations and human pollution affect physiological functions in T. thymallus and T. arcticus, as well as Salmo trutta Linnaeus, 1758 and Salmo salar Linnaeus, 1758 e.g. by inducing male bias, disrupting reproduction, growth, recruitment and feeding and swimming behaviours (Newcombe and Jensen 1996; Deegan et al. 1999; Luecke and MacKinnon 2008; Wedekind et al. 2013; Richard et al. 2015; Warren et al. 2015; Bašić et al. 2018; Opinion et al. 2020; Hayes et al. 2021; Auer et al. 2022). Thymallinae are strictly freshwater salmonids and their migratory behaviour within the same watercourse during the spawning season (West et al. 1992; Hughes and Reynolds 1994; Meyer 2001; Heim et al. 2015; Zuev et al. 2021) make them particularly sensitive to stream disturbances. The Loire is one of Europe’s longest rivers. Its first largest tributary, the Allier River, includes several biodiversity hotspots for plants and birds (Moatar et al. 2022) and hosts populations of T. ligericus. Some major urbanization of the Loire basin has arisen in the Allier Valley, impacting the environment with anthropogenic activities such as the construction of dams (Moatar et al. 2022). In 1994 the implementation of the “Loire Grandeur Nature Plan” gradually began to restore and improve the management of the Loire and Allier River environments, which had suffered from unstopped anthropogenic activities for decades (Moatar et al. 2022). Despite this, the effects of climate change have been visible in this area and the Allier River, an important regional resource in terms of domestic and irrigation water supply, has recently experienced extreme dry summers with insufficient winter recharge (Labbe et al. 2023). These extreme conditions have inevitable consequences on the ecosystem’s biota and developing mechanisms to cope with such environmental fluctuations is critical for organisms’ survival. Understanding the limits of species resilience is vital to ensure the preservation and correct management of ecosystems. Genomic and transcriptomic resources, coupled with studies of ecology and physiology, are key tools that can help uncover compensatory mechanisms such as gene expression regulation in extreme environmental conditions. Increasingly lower sequencing costs coupled with advances in high-throughput sequencing techniques allow the generation of genomic and transcriptomic resources for non-model species (Connon et al. 2018; Gomes-dos-Santos et al. 2022). Among Thymallinae, transcriptomic resources have only been generated for T. thymallus (Pasquier et al. 2016; Varadharajan et al. 2018), which is also the only species of the genus with an available whole genome assembly (Varadharajan et al. 2018; Savilammi et al. 2019). Given the critical conservation status of T. ligericus, the lack of transcriptomic resources for this species may hinder future efforts to prevent its decline. In this work, we generate a high-quality multi-tissue transcriptome of the vulnerable species Thymallus ligericus from the Alagnon River (Allier basin). By pooling information from five different tissues, we aim to ensure the robustness of the resulting transcriptome assembly. Additionally, we provide relative gene expression profiles of the five tissues. As no transcriptomic resources are yet available for this taxon, this work provides a valuable framework for future studies that will assess the species' ability to respond to stressors such as climate change and anthropogenic activity, to which T. ligericus is constantly exposed.
Methods
Animal sampling
At the end of the summer season (27/08/2022), one adult male of T. ligericus was caught by angling from the Alagnon River (45° 14′ 34.4 N-3° 10′ 23″ E; Fig. 1) in the Massiac district (Loire, France). Afterwards, 1cm3 of tissue was sampled from the brain, gills, gonad, liver, and muscle. Tissues were then stored in RNA-later (Invitrogen, Thermo Fisher Scientific) at 4 °C for 24 h and later at − 80 °C in preparation for RNA extraction. The samples were collected with the permission of the Direction Départementale des Territoires-Préfète de la Loire and genetic analyses were carried out according to the Nagoya Protocol (reference permit number: TREL2206915S/602).
RNA extraction, sequencing and library construction
Total RNA was extracted from the five tissues using the NZY Total RNA isolation kit (NZYtech) following the manufacturer´s instructions. RNA concentration was measured with DeNovix DS-11 FS and its integrity was assessed by electrophoresis in 1% agarose gel. The five samples were then sent in dry ice to Macrogen facilities (Seoul, South Korea) where strand-specific libraries with 250–300 bp insert sizes were built. Samples were sequenced using 150 bp paired-end reads on a Novaseq 6000 platform.
Raw data quality control and assembly
The quality of the raw reads was assessed for each sample with FastQC v0.11.8 (Andrews 2010). The software Trimmomatic v0.38 (Bolger et al. 2014) was then used to quality-filter the reads and trim the Illumina adapters with the following parameters: LEADING-5 TRAILING-5 SLIDINGWINDOW-5:20 MINLEN-36. Afterwards, random sequencing errors were fixed with the program Rcorrector v1.0.3 Song and Florea 2015).
With no T. ligericus genome currently available, we opted to assemble the transcriptome with a de novo approach. Assembly was carried out with Trinity v2.13.2 (Grabherr et al. 2011; Haas et al. 2013) with default parameters. Afterwards, transcript redundancy was removed with Corset v1.0.9 (Davidson and Oshlack 2014). To remove possible contamination, the assembled transcripts were then matched against the NCBI nucleotide database NCBI-nt; (Download; 24/08/2021) and Univec database (Download; 02/04/2019) in Blast-n v2.11.0 (Camacho et al. 2009). Transcripts with less than 100 bp, an e-value of 1e-5, an identity score < 90% and no match to the class Actinopterygii (Taxonomy ID:7898), as well as any transcript that matched against the Univec database were removed from the assembly. The completeness of the transcriptome was assessed both before and after redundancy/contamination removal with BUSCO v3.0.2 (Simão et al. 2015) with lineage-specific libraries of Eukaryota, Metazoa, Vertebrata and Actinopterygii. Transrate v1.0.3 (Smith-Unna et al. 2016) was then used to assess transcriptome integrity before and after redundancy/contamination removal.
ORF prediction and transcriptome annotation
Open Reading Frames (ORFs) prediction was carried out with Transdecoder v5.3.0 (https://transdecoder.github.io/). The software Blast-p v2.12.0 (Camacho et al. 2009) and hmmscan of hmmer2 package v2.4 (Finn et al. 2011) were used to search for homology and proteins by matching the transcripts against UniProtKB/Swiss-Prot48 (Bateman et al. 2017) and PFAM (Punta et al. 2012) databases. Both TransDecoder output in.gff format and the de novo assembly were processed with Gtf/Gff Analysis Toolkit (AGAT) v0.8.0 (Dainat et al. 2020) to prepare the file.gff3 for functional annotation. Afterwards, AGAT tool was used to standardize file formatting of protein and transcript fasta files. Functional annotation was carried out with InterProScan v5.44.80 (Quevillon et al. 2005) and Blast-n/p/x search. The proteins were mapped against NCBI-RefSeq—Reference Sequence Database (Download; 10/03/2022), NCBI-nr—non-redundant database (Download; 15/12/2021) and InterPro database (Download; 30/03/2019) using Blast-p/x tools implemented in DIAMOND v2.0.13 (Buchfink et al. 2014). The transcripts were searched in nt and nr databases of NCBI using Blast-n and Blast-x tools, respectively, from NCBI and DIAMOND. Finally, the blast outputs and InterProScan outputs were concatenated and integrated into the annotated gff3 file using AGAT. Gene names were assigned following the best blast hit and ranked as follows: 1 for Blast-p Hit from RefSeq database; 2 for Blast-p Hit from NCBI-nr database; 3 for Blast-x Hit from NCBI-nr database; 4 for Blast-n Hit from NCBI-nt database.
Relative gene expression
Relative gene expression was assessed in each of the five tissues using two distinct methods: the first approach involved utilizing the de novo assembled transcriptome. This analysis was conducted with Trinity scripts, aligning the trimmed and corrected reads against the transcriptome using Bowtie2 v2.3.0 (Langmead and Salzberg 2012) and quantifying gene expression with RSEM v1.2.31 (Li and Dewey 2014). For the second approach, a reference genome-based pipeline was applied, using the available T. thymallus genome as a reference. The reads were mapped to the T. thymallus genome using HISAT2 v2.2.1 (Kim et al. 2015), and the resulting SAM files were converted to BAM format and sorted using SAMtools v1.9 (Li et al. 2009). StringTie v2.1.2 Pertea et al. 2015) was used to generate read counts for each mapped dataset.
Results and discussion
RNA sequencing
Raw reads obtained for each tissue were deposited at the NCBI Sequence Read Archive database under the accession numbers SRR26130661-SRR26130665 and the BioProject PRJNA1019285. The non-redundant transcriptome assembly is provided on NCBI databases under the accession number GKPV00000000. Both redundant and non-redundant transcriptome assemblies (*_trinity_filtered.fasta.gz and *_transcriptome.fasta.gz) are also provided on Figshare platform (https://doi.org/10.6084/m9.figshare.24174285), together with transcripts (*_genes_final.fasta.gz), mRNA (*_mrna_final.fasta.gz), ORF prediction (*_cds_final.fasta.gz), proteins ORF prediction (*_proteins_final.fasta.gz), annotation files (*_annotation.gff3.gz) and relative gene expression results.
Considering the conservation status of T. ligericus, a vulnerable species with small and fragmented populations, the specimen’s capture and sacrifice must be justified. Therefore, we chose to sequence only one individual of the species. However, the robustness of the results is improved by sequencing and analyzing five different tissues, increasing the number of represented genes. In this case, the use of one individual did not represent a limitation for the generation of a robust transcriptome. Although the use of one sample precludes statistical analyses requiring higher representativeness (e.g., differential gene expression, population genetics), it represents a framework to develop less invasive methods for producing transcriptome resources. This approach has been previously used for other taxa such as bivalves, chondrichthyans and teleosts (Carruthers et al. 2018; Lan et al. 2018; Machado et al. 2018, 2020; Gomes-dos-Santos et al. 2022), providing extremely valuable genomic resources.
Raw datasets and pre-assembly processing quality control
The sequencing produced 18,708,599 paired-end raw reads for the brain tissue, 22,621,561 for the gill, 23,200,311 for the gonad, 23,250,910 for the liver and 23,118,988 for the muscle (Table 1).
After read trimming and correction (carried out respectively with Trimmomatic and Rcorrector) the quality check performed in fastQC returned high Phred score values (> 20) for both forward and reverse sequences (Fig. 2). The number of reads retained after trimming and used in the assembly are shown in Table 1.
Transcriptome assembly, annotation and relative gene expression
The newly assembled redundant transcriptome included 1,147,286 sequences and an N50 length of 922 (Table 2). The Benchmarking Universal Single-Copy Orthologs (BUSCO) allowed evaluation of the gene completeness by matching the transcriptome with Eukaryota, Metazoa, Vertebrata and Actinopterygii orthologues database (respectively 303, 978, 2,586 and 4,584). BUSCO scores show a high level of gene completeness with more than 90% (complete + duplicated) genes for the four inferred groups (Euk, Met, Vet, Act) (Table 2).
After the assembly, redundancy removal was carried out with Corset. The program removed about 75% from the initial amount of transcripts increasing the N50 from 922 to 1,221 and causing a general improvement in the quality of the assembly (Table 2). This is also reflected in the BUSCO scores. Although the percentages of matched genes are slightly lower, the percentages of single-copy genes increased significantly in the four classes (Euk, Met, Vet, Act) (Table 2), highlighting the importance and efficiency of redundancy removal. Afterwards, the non-redundant transcriptome was decontaminated with Blast-n searches against the NCBI-nt and Univec databases. Of the 68,242 predicted proteins 54,176 were functionally annotated (Table 3). All the annotation files can be consulted in FigShare (https://doi.org/10.6084/m9.figshare.24174285). The present work provides a high-quality transcriptome assembly with an N50 over 1000 bp and BUSCO scores that are congruent with those of recently published high-quality assemblies (e.g., Lan et al. 2018; Machado et al. 2018; Moreno-Santillán et al. 2019; Li et al. 2019; Shen et al. 2020; Gomes-dos-Santos et al. 2022. In our study, BUSCO scores improved remarkably after redundancy removal with the number of duplicated genes decreasing up to 50%. The de novo transcriptome assembly was obtained following a robust pipeline previously applied successfully by our team in fish as well as in other organisms such as freshwater bivalves (Machado et al. 2020; Gomes-dos-Santos et al. 2022).
Additionally, the complete tables of relative gene expression obtained with de novo and referenced pipeline for each of the five tissues are also available in FigShare (https://doi.org/10.6084/m9.figshare.24174285). Gene expression quantification was performed both using Fragments Per Kilobase of Transcript Per Million (FKPM) and Transcripts Per Million (TPM) metrics. The gene IDs followed the gff annotation file. The count of genes with TPM and FKPM values > 1, for each tissue, as identified with the two methods, are displayed in Table 4. The de novo approach yielded a significantly higher number of transcripts with TPM and FPKM values > 1 than the approach using genome reference pipeline. This outcome is expected due to the differences underlying the two approaches. The referenced pipeline is based on complete genes, while the de novo pipeline is based on transcripts. Although the first approach is deemed as a more conservative and reliable approach, important information may be lost if the reference genome used belongs to a distinct species, as is the case herein, i.e., the T. thymallus genome. In this study, the de novo pipeline that used the T. ligericus transcriptome here produced as a reference limited the loss of species-specific information. On the other hand, while working with transcripts through the de novo pipeline it is inevitable to include, for example, different unigene entries for two non-contiguous reads that may belong to the same gene. Considering the different advantages that each of the approaches brings to the analyses, both results are provided.
Conclusions
This is the first transcriptomic resource of T. ligericus, which is an important step forward considering that transcriptome assemblies are only available for a distinct Thymallus species, i.e., the European T. thymallus. Despite increasing research attention on the genomics of salmonids, whole genome assemblies are only available for a few species (e.g. Berthelot et al. 2014; Christensen et al. 2018a,b; Varadharajan et al. 2018). This is largely due to both the large size (ca. 3 Gb) and complexity of salmonid genomes, which experienced an ancestral event of whole genome duplication about 80 MY. Previous studies have shown that a quarter of the salmonid genome has undergone multiple lineage-specific re-diploidisation events and that part of it is still undergoing re-diploidisation (e.g. Robertson et al. 2017; Gundappa et al. 2022). These characteristics represent well-known hurdles for whole genome assemblies (see Carruthers et al. 2018). Under this scenario, transcriptomic resources may represent the most suitable and cost-effective solution to generate informative data at the whole genome level. By targeting solely transcribed regions, transcriptomes significantly reduce sequencing costs and facilitate the computational demand and accuracy of the assembly process (e.g., Wang et al. 2019). Moreover, transcriptomes provide fast, informative and valuable resources for meeting the demand for genomic tools to study threatened species (e.g., Connon et al. 2018). This is particularly important for understudied species whose conservation status is already at risk, such as many species of the Thymallus genus (Weiss et al. 2021). The importance of transcriptome resources as a tool to enhance the ever-developing field of conservation physiology has been previously demonstrated (e.g., Connon et al. 2018) and several works have shown how transcriptome data can be used to refine species management plans (e.g., Connon et al. 2012; Narum and Campbell 2015; Wellband and Heath 2017).
In conclusion, the results reported here constitute a significant advancement in the study of the vulnerable freshwater salmonid T. ligericus, a key resource for this distinctive genus, providing a valuable resource for future studies investigating the vulnerability of this group to present and forthcoming threats.
Data availability
All software and respective versions applied for the present work are stated in the Methods section, together with the associated parameters. When parameters are not specified, default settings are used. Raw reads are available at the NCBI Sequence Read Archive database (SRR26130661-SRR26130665; BioProject PRJNA1019285. The non-redundant transcriptome assembly is provided on NCBI databases under the accession number GKPV00000000. Redundant and non-redundant transcriptome assemblies (*_trinity_filtered.fasta.gz and *_transcriptome.fasta.gz), transcripts (*_genes_final.fasta.gz), mRNA (*_mrna_final.fasta.gz), ORF prediction (*_cds_final.fasta.gz), proteins ORF prediction (*_proteins_final.fasta.gz), annotation files (*_annotation.gff3.gz) and relative gene expression results are also provided on Figshare platform (https://doi.org/10.6084/m9.figshare.24174285).
References
Andrews S (2010) FastQC: a quality control tool for high throughput sequence data.
Auer S, Hayes DS, Führer S et al (2022) Effects of cold and warm thermopeaking on drift and stranding of juvenile European grayling (Thymallus thymallus). River Res Appl. https://doi.org/10.1002/rra.4077
Bašić T, Britton JR, Cove RJ et al (2018) Roles of discharge and temperature in recruitment of a cold-water fish, the European grayling Thymallus thymallus, near its southern range limit. Ecol Freshw Fish 27:940–951. https://doi.org/10.1111/eff.12405
Bateman A, Martin MJ, O’Donovan C et al (2017) UniProt: the universal protein knowledgebase. Nucleic Acids Res 45:D158–D169. https://doi.org/10.1093/nar/gkw1099
Berthelot C, Brunet F, Chalopin D et al (2014) The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates. Nat Commun. https://doi.org/10.1038/ncomms4657
Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. https://doi.org/10.1093/bioinformatics/btu170
Buchfink B, Xie C, Huson DH (2014) Fast and sensitive protein alignment using DIAMOND. Nat Methods 12:59–60. https://doi.org/10.1038/nmeth.3176
Camacho C, Coulouris G, Avagyan V et al (2009) BLAST+: architecture and applications. BMC Bioinformatics 10:1–9. https://doi.org/10.1186/1471-2105-10-421
Carruthers M, Yurchenko AA, Augley JJ et al (2018) De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species. BMC Genomics 19:1–17. https://doi.org/10.1186/s12864-017-4379-x
Christensen KA, Leong JS, Sakhrani D et al (2018a) Chinook salmon (Oncorhynchus tshawytscha) genome and transcriptome. PLoS ONE 13:1–15. https://doi.org/10.1371/journal.pone.0195461
Christensen KA, Rondeau EB, Minkley DR et al (2018b) The arctic charr (Salvelinus alpinus) genome and transcriptome assembly. PLoS ONE 13:1–30. https://doi.org/10.1371/journal.pone.0204076
Connon RE, D’Abronzo LS, Hostetter NJ et al (2012) Transcription profiling in environmental diagnostics: Health assessments in Columbia River basin steelhead (Oncorhynchus mykiss). Environ Sci Technol 46:6081–6087. https://doi.org/10.1021/es3005128
Connon RE, Jeffries KM, Komoroske LM et al (2018) The utility of transcriptomics in fish conservation. J Exp Biol. https://doi.org/10.1242/jeb.148833
Costa MJ, Duarte G, Segurado P, Branco P (2021) Major threats to European freshwater fish species. Sci Total Environ 797:149105. https://doi.org/10.1016/j.scitotenv.2021.149105
Dainat J, Hereñú D, Pucholt P (2020) AGAT: another Gff analysis toolkit to handle annotations in any GTF/GFF format. Zenodo. https://doi.org/10.5281/zenodo.4205393
Davidson NM, Oshlack A (2014) Corset: Enabling differential gene expression analysis for de novo assembled transcriptomes. Genome Biol 15:1–14. https://doi.org/10.1186/s13059-014-0410-6
Deegan LA, Golden HE, Harvey CJ, Peterson BJ (1999) Influence of environmental variability on the growth of age-0 and adult arctic grayling. Trans Am Fish Soc 128:1163–1175. https://doi.org/10.1577/1548-8659(1999)128%3c1163:ioevot%3e2.0.co;2
Finn RD, Clements J, Eddy SR (2011) HMMER web server: Interactive sequence similarity searching. Nucleic Acids Res 39:29–37. https://doi.org/10.1093/nar/gkr367
Gomes-dos-Santos A, Machado AM, Castro LFC et al (2022) The gill transcriptome of threatened European freshwater mussels. Sci Data 91(9):1–10. https://doi.org/10.1038/s41597-022-01613-x
Grabherr MG, Haas BJ, Yassour M et al (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29:644–652. https://doi.org/10.1038/nbt.1883
Gundappa MK, To TH, Grønvold L et al (2022) Genome-wide reconstruction of rediploidization following autopolyploidization across one hundred million years of salmonid evolution. Mol Biol Evol. https://doi.org/10.1093/molbev/msab310
Haas BJ, Papanicolaou A, Yassour M et al (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8:1494–1512. https://doi.org/10.1038/nprot.2013.084
Hayes DS, Lautsch E, Unfer G et al (2021) Response of European grayling, Thymallus thymallus, to multiple stressors in hydropeaking rivers. J Environ Manag. https://doi.org/10.1016/j.jenvman.2021.1127375
Heim KC, Wipfli MS, Whitman MS et al (2015) Seasonal cues of Arctic grayling movement in a small Arctic stream: the importance of surface water connectivity. Environ Biol Fishes 99:49–65. https://doi.org/10.1007/s10641-015-0453-x
Hughes NF, Reynolds JB (1994) Why do arctic grayling (Thymallus arcticus) get bigger as you go upstream? Can J Fish Aquat Sci. https://doi.org/10.1139/f94-216
Kim D, Langmead B, Salzberg SL (2015) HISAT: A fast spliced aligner with low memory requirements. Nat Methods 12:357–360. https://doi.org/10.1038/nmeth.3317
Labbe J, Devidal J, Albaric J (2023) combined impacts of climate change and water withdrawals on the water balance at the watershed scale — The case of the Allier alluvial Hydrosystem (France). Sustainability. https://doi.org/10.3390/su15043275
Lan Y, Sun J, Xu T et al (2018) De novo transcriptome assembly and positive selection analysis of an individual deep-sea fish. BMC Genomics 19:1–9. https://doi.org/10.1186/s12864-018-4720-z
Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. https://doi.org/10.1038/nmeth.1923
Li B, Dewey CN (2014) RSEM: Accurate transcript quantification from RNA-seq data with or without a reference genome. Bioinforma Impact Accurate Quantif Proteomic Genet Anal Res. https://doi.org/10.1201/b16589
Li H, Handsaker B, Wysoker A et al (2009) The sequence alignment/Map format and SAMtools. Bioinformatics 25:2078–2079. https://doi.org/10.1093/bioinformatics/btp352
Li B, Sun S, Zhu J et al (2019) Transcriptome profiling and histology changes in juvenile blunt snout bream (Megalobrama amblycephala) liver tissue in response to acute thermal stress. Genomics 111:242–250. https://doi.org/10.1016/j.ygeno.2018.11.011
Luecke C, MacKinnon P (2008) Landscape effects on growth of age-0 arctic grayling in tundra streams. Trans Am Fish Soc 137:236–243. https://doi.org/10.1577/t05-039.1
Machado AM, Almeida T, Mucientes G et al (2018) De novo assembly of the kidney and spleen transcriptomes of the cosmopolitan blue shark, Prionace glauca. Mar Genomics 37:50–53. https://doi.org/10.1016/j.margen.2017.11.009
Machado AM, Muñoz-Merida A, Fonseca E et al (2020) Liver transcriptome resources of four commercially exploited teleost species. Sci Data 7:1–9. https://doi.org/10.1038/s41597-020-0565-9
Meyer L (2001) Spawning migration of grayling Thymallus thymallus (L., 1758) in a Northern German lowland river. Arch fur Hydrobiol 152:99–117. https://doi.org/10.1127/archiv-hydrobiol/152/2001/99
Moatar F, Descy J, Rodrigues S et al (2022) The Loire River basin. Elsevier, Amsterdam, pp 245–272
Moreno-Santillán DD, Machain-Williams C, Hernández-Montes G, Ortega J (2019) De Novo transcriptome assembly and functional annotation in five species of bats. Sci Rep 9:1–12. https://doi.org/10.1038/s41598-019-42560-9
Mueller M, Bierschenk AM, Bierschenk BM et al (2020) Effects of multiple stressors on the distribution of fish communities in 203 headwater streams of Rhine. Elbe and Danube Sci Total Environ 703:134523. https://doi.org/10.1016/j.scitotenv.2019.134523
Narum SR, Campbell NR (2015) Transcriptomic response to heat stress among ecologically divergent populations of redband trout. BMC Genomics 16:103. https://doi.org/10.1186/s12864-015-1246-5
Newcombe CP, Jensen JOT (1996) Channel suspended sediment and fisheries: a synthesis for quantitative assessment of risk and impact. North Am J Fish Manag 16(4):693–727
Opinion AGR, De Boeck G, Rodgers EM (2020) Synergism between elevated temperature and nitrate: Impact on aerobic capacity of European grayling, Thymallus thymallus in warm, eutrophic waters. Aquat Toxicol 226:105563. https://doi.org/10.1016/j.aquatox.2020.105563
Pasquier J, Cabau C, Nguyen T et al (2016) Gene evolution and gene expression after whole genome duplication in fish: the Phylofish database. BMC Genomics 17:1–10. https://doi.org/10.1186/s12864-016-2709-z
Persat H, Mattersdorfer K, Charlat S et al (2016) Genetic integrity of the European grayling (Thymallus thymallus) populations within the Vienne River drainage basin after five decades of stockings. Cybium 40:7–20
Persat H, Weiss S, Froufe E et al (2019) A third European species of grayling (Actinopterygii, Salmonidae), endemic to the Loire River basin (France), Thymallus ligericus n. sp. Cybium 43:233–238. https://doi.org/10.26028/cybium/2019-433-0045
Pertea M, Pertea GM, Antonescu CM et al (2015) StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 33:290–295. https://doi.org/10.1038/nbt.3122
Punta M, Coggill PC, Eberhardt RY et al (2012) The Pfam protein families database. Nucleic Acids Res 40:290–301. https://doi.org/10.1093/nar/gkr1065
Quevillon E, Silventoinen V, Pillai S et al (2005) InterProScan: protein domains identifier. Nucleic Acids Res 33:116–120. https://doi.org/10.1093/nar/gki442
Richard A, Cattanéo F, Rubin JF (2015) Biotic and abiotic regulation of a low-density stream-dwelling brown trout (Salmo trutta L.) population: Effects on juvenile survival and growth. Ecol Freshw Fish 24:1–14. https://doi.org/10.1111/eff.12116
Robertson FM, Gundappa MK, Grammes F et al (2017) Lineage-specific rediploidization is a mechanism to explain time-lags between genome duplication and evolutionary diversification. Genome Biol 18:1–14. https://doi.org/10.1186/s13059-017-1241-z
Sävilammi T, Primmer CR, Varadharajan S et al (2019) The chromosome-level genome assembly of european grayling reveals aspects of a unique genome evolution process within salmonids. G3 Genes. Genomes, Genet 9:1283–1294. https://doi.org/10.1534/g3.118.200919
Shen F, Long Y, Li F et al (2020) De novo transcriptome assembly and sex-biased gene expression in the gonads of Amur catfish (Silurus asotus). Genomics 112:2603–2614. https://doi.org/10.1016/j.ygeno.2020.01.026
Simão FA, Waterhouse RM, Ioannidis P et al (2015) BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212. https://doi.org/10.1093/bioinformatics/btv351
Smith-Unna R, Boursnell C, Patro R et al (2016) TransRate: reference-free quality assessment of de novo transcriptome assemblies. Genome Res 26:1134–1144. https://doi.org/10.1101/gr.196469.115
Song L, Florea L (2015) Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads. Gigascience 4:1–8. https://doi.org/10.1186/s13742-015-0089-y
Varadharajan S, Sandve SR, Gillard GB et al (2018) The grayling genome reveals selection on gene expression regulation after whole-genome duplication. Genome Biol Evol. https://doi.org/10.1093/gbe/evy201
Wang B, Kumar V, Olson A, Ware D (2019) Reviving the transcriptome studies: an insight into the emergence of single-molecule transcriptome sequencing. Front Genet 10:1–11. https://doi.org/10.3389/fgene.2019.00384
Warren M, Dunbar MJ, Smith C (2015) River flow as a determinant of salmonid distribution and abundance: a review. Environ Biol Fishes 98:1695–1717. https://doi.org/10.1007/s10641-015-0376-6
Wedekind C, Evanno G, Székely T et al (2013) Persistent unequal sex ratio in a population of grayling (Salmonidae) and possible role of temperature increase. Conserv Biol 27:229–234. https://doi.org/10.1111/j.1523-1739.2012.01909.x
Weiss SJ, Gonçalves DV, Secci-Petretto G et al (2021) Global systematic diversity, range distributions, conservation and taxonomic assessments of graylings (Teleostei: Salmonidae; Thymallus spp.). Org Divers Evol 21:25–42. https://doi.org/10.1007/s13127-020-00468-7
Wellband KW, Heath DD (2017) Plasticity in gene transcription explains the differential performance of two invasive fish species. Evol Appl 10:563–576. https://doi.org/10.1111/eva.12463
West RL, Smith MW, Barber WE et al (1992) Autumn migration and overwintering of arctic grayling in coastal streams of the arctic national wildlife Refuge, Alaska. Trans Am Fish Soc 121:709–715. https://doi.org/10.1577/1548-8659(1992)121%3c0709:amaooa%3e2.3.co;2
Zuev IV, Andrushchenko PY, Chuprov SM, Zotina TA (2021) Structural features of scales of Baikal grayling Thymallus baicalensis under conditions of an altered hydrological regime. Inl Water Biol 14:60–66. https://doi.org/10.1134/S1995082920060176
Acknowledgements
This work was supported by the Foundation for Science and Technology to GS (SFRH/BD/139069/2018 and COVID/BD/152956/2022), EF (CEECINST/00027/2021/CP2789/CT0003), LFCC (https://doi.org/10.54499/CEECINST/00133/2018/CP1510/CT0004) and by FCT Strategic Funding UIDB/04423/2020 and UIDP/04423/2020. The authors would like to acknowledge the amateur angler Hervé Brun who caught the analyzed specimen for us in compliance with local regulations. The authors would also like to thank the editors and reviewers who have provided very helpful corrections, thus improving the overall quality of this work.
Funding
Open access funding provided by FCT|FCCN (b-on).
Author information
Authors and Affiliations
Contributions
E. F., L. F. C. C. and S. W. conceived this work. G. S. and H. P. provided all authorizations for sampling and carried out sample collection. I. V. carried out RNA extractions. A. M. M. provided most of the scripts used for data analyses. G. S. and A. G. performed all bioinformatics analyses. G. S. and E. F. provided the first draft and all the authors read, revised, and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Secci-Petretto, G., Weiss, S., Gomes-dos-Santos, A. et al. A multi-tissue de novo transcriptome assembly and relative gene expression of the vulnerable freshwater salmonid Thymallus ligericus. Genetica (2024). https://doi.org/10.1007/s10709-024-00210-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10709-024-00210-7