Introduction

As a powerful genomic tool, genetic linkage maps have been widely used for mapping quantitative trait loci (QTL), positional cloning of candidate genes, anchoring whole-genome sequence scaffolds, and comparative genomic and evolutionary studies1,2,3. A genetic linkage map is constructed from polymorphic DNA markers that are genotyped in a family and arranged in linear order along chromosomes according to meiotic recombination frequency. The type and number of DNA markers are important determinants of the resolution and density of a genetic linkage map, and innovations in DNA marker technology have repeatedly brought dramatic improvements in map resolution. In particular, the advent of high-throughput next-generation sequencing (NGS) technology has revolutionized DNA marker discovery. Since DNA markers were first introduced in the 1980s, many types have been developed. Generally, molecular markers can be classified into three major types: (1) hybridization-based markers such as restriction fragment length polymorphisms (RFLPs)4; (2) PCR-based markers such as random amplification of polymorphic DNA (RAPD)5, amplified fragment length polymorphisms (AFLPs)6 and simple sequence repeats (SSRs); and (3) sequence-based markers such as single nucleotide polymorphisms (SNPs)7, which are DNA sequence variations at a single nucleotide. Although SNPs are less polymorphic than SSR markers because of their biallelic nature, they are the most abundant and most uniformly distributed markers in the genome. Compared with low-throughput markers based on size discrimination or hybridization, SNPs are amenable to high-throughput technologies such as NGS, which makes it possible to rapidly identify a large number of SNPs across the genome. However, whole-genome resequencing with NGS for SNP discovery and genotyping is practical only for a few model species that have a simple genome or a reference genome sequence. 
It is not applicable to the majority of species, which have complex (e.g. highly repetitive) genomes and no prior genomic information. To overcome these obstacles, several genome complexity reduction techniques have been developed, including complexity reduction of polymorphic sequences (CRoPS)8, restriction site-associated DNA sequencing (RAD-seq)9, genotyping-by-sequencing (GBS)10 and sequence-based genotyping (SBG)11. Because complexity reduction filters out a large portion of repetitive sequences, these methods can be applied to genome-wide SNP discovery and genotyping in large genomes without the need for a reference genome. Recently, a modification of the RAD sequencing method, termed specific-locus amplified fragment sequencing (SLAF-seq), was reported by Sun et al.12. Unlike RAD-seq, SLAF-seq selects restriction site-associated fragments by size, which avoids random interruption of the DNA, and the selected SLAFs are sequenced by paired-end sequencing with a double-barcode genotyping system. This approach has been reported to be more efficient and cost-effective for SNP screening and genotyping than RAD-seq12, and it has been increasingly used to develop high-density genetic maps in a variety of plants and animals1,2,13,14, especially in species without reference genome information.

In aquaculture species, many high-density linkage maps for fishes15,16 and shellfishes3,17 have been developed using SNP and/or SSR markers, and QTL for important traits, including sex18,19,20 and growth17,18,21, have been identified. A large number of gene-associated SNPs derived from EST and RNA-seq datasets have also been discovered22,23,24,25,26, which could greatly benefit breeding programs and genome-wide association studies. For shrimps and crabs such as Penaeus monodon27, Fenneropenaeus chinensis28, Litopenaeus vannamei29 and Portunus trituberculatus30, earlier linkage maps relied mainly on the dominant markers AFLP or RAPD, and only a few SNP- and/or SSR-based high-density linkage maps have been reported recently31,32,33,34. The Chinese mitten crab (Eriocheir sinensis) is the most economically important cultivated crab species in China. Three major natural populations are distributed in the basins of the Liaohe, Yangtze and Oujiang rivers. The southern population from the Yangtze River basin is thought to have better growth performance and has become widely cultivated. Through years of traditional phenotype-based selective breeding, several improved varieties of E. sinensis have been obtained by selection from northern and southern populations35,36. These fast-growing selected populations have promoted the development of the crab farming industry to some extent. However, the mitten crab culture industry still faces many problems, such as sexual precocity, in which crabs reach maturity one year early at a small size, causing great losses to farmers37. In addition, little is known about the genetic basis of most traits related to commercial production. Most previous genetic studies on the mitten crab focused on population genetics38,39. 
In recent years, transcriptomic analyses using RNA-Seq technology have been conducted on immunity, molting, metamorphosis and reproduction38,40,41,42. A high-density linkage map of the mitten crab comprising 10,358 markers was developed for a northern population with the 2b-RAD method32, but the identity of the markers is unclear because their nucleotide sequences have neither been described nor deposited in a public database. Whole-genome sequencing of the crab has just been completed in our laboratory. We previously constructed a first-generation SSR-based linkage map of the mitten crab using an F1 full-sib family from an intercross between the Liaohe (northern) and Yangtze (southern) river populations in China31. However, the resulting sex-specific maps consisted of only 457 and 466 SSR markers and included many triplets and doublets, which made it impossible to develop an integrated sex-averaged map. An integrated genetic map with more markers would provide a valuable framework for genome sequence assembly and for elucidating the crab genome, and would accelerate breeding programs. Here, we used the same full-sib family as the mapping population and randomly selected 147 F1 offspring for SNP discovery and genotyping by SLAF-seq. We then constructed a high-density integrated SNP and SSR genetic map of the mitten crab comprising 18,309 molecular markers. A growth-related genomic region was localized by QTL mapping and genome-wide QTL-association analysis.

Results and Discussion

Mapping family

The selection of parents for establishing a mapping family is a key step in constructing a high-density map. Backcross, F2 and recombinant inbred line populations are the most commonly used for linkage mapping. Most shrimps and crabs, however, have a life span of only one or two years, which makes it infeasible to develop a backcross population, so pseudo-F1 populations are usually used as mapping families2,28,30,43,44. In the pseudo-testcross procedure, two highly heterozygous parents with significant genetic differences are crossed to produce a set of F1 progeny. Previous studies showed that mitten crabs from northern and southern populations display trait divergence, for example in body size45, and experiments with cultured crabs have shown that this differentiation is genetically based46,47. Yangtze and Liaohe crabs have different gene pools, as reflected by different allele frequencies in their genomes39. With this in mind, we established a mapping family from an intercross between a Yangtze crab and a Liaohe crab. Thousands of progeny were produced from a single pair of parents, making it possible to obtain accurate estimates of marker positions in the map.

Discovery of SLAF markers and genotyping

The mitten crab E. sinensis genome has a large number of chromosomes (2n = 146)48, and a sufficient number of markers is an essential prerequisite for constructing a dense genetic linkage map. To this end, we established SLAF libraries for high-throughput sequencing, which generated a total of 123,599,603 paired-end reads (Table 1). Among them, the proportion of high-quality bases (Q score > 30) was 85.8% and the guanine-cytosine (GC) content was 44.3%. After sequence alignment and clustering (for details see Materials and Methods) and discarding of low-depth and suspected repetitive SLAFs, a total of 235,619 high-quality SLAFs were defined, of which 194,887 were detected in the female parent and 200,452 in the male parent. The numbers of reads for these SLAFs were 5,771,500 and 6,360,679, with an average coverage of 29.6-fold and 31.7-fold per SLAF in the female and male parents, respectively (Table 1). In the 147 progeny of the mapping population, the average number of SLAFs was 107,758, with a coverage of 3.15-fold per progeny. Of the 235,619 high-quality SLAFs, 127,677 were polymorphic, 105,981 were non-polymorphic and the other 1,961 were repetitive. The polymorphism rate of these high-quality SLAFs was 54.2%, similar to the SSR polymorphism rate (51.9%) detected previously in the same population31, suggesting some genetic differentiation between the germplasm resources of the southern and northern populations. After removal of SLAFs lacking parental information, 114,527 of the polymorphic SLAFs were retained and classified into eight segregation patterns (Fig. 1). As shown in Fig. 1, over 21,613 markers were homozygous in both parents, with genotype aa or bb, and therefore do not segregate in the progeny. 
After filtering out these non-segregating markers and low-quality SLAF markers with average sequencing depths of less than 10-fold in the parents or integrity of less than 90% in the progeny, 20,803 markers conformed to the F1 population segregation codes ab × cd, ef × eg, hk × hk, lm × ll and nn × np (Table 2). At an MLOD threshold of 5.0, 17,680 of these 20,803 markers were defined as effective markers and used for subsequent genetic linkage mapping. The average sequencing depths of these 17,680 markers were 59.34-fold, 49.63-fold and 4.26-fold in the female parent, the male parent and each progeny, respectively (Table 3). All effective SLAF marker sequences are presented in Supplementary Table S1. Compared with the 10,358 markers previously identified within a northern population32, the number of effective markers in the present study increased substantially when genotyping was performed in the intercross family between northern and southern populations. The abundance of polymorphic markers suggests high heterozygosity and complexity of the crab genome.
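
The depth and integrity criteria described above can be expressed as a simple predicate. The following Python sketch is illustrative only and is not the pipeline used in this study: the `Marker` record, its field names and the `passes_filters` helper are hypothetical, "integrity" is assumed to mean the fraction of progeny with a genotype call, and the parental depth is assumed to be the mean of the two parents.

```python
from dataclasses import dataclass
from typing import List, Optional

# Segregation codes retained for an F1 population (ASCII "x" stands for the cross sign).
F1_PATTERNS = {"ab x cd", "ef x eg", "hk x hk", "lm x ll", "nn x np"}


@dataclass
class Marker:
    """Hypothetical record for one SLAF marker."""
    name: str
    female_depth: float                  # sequencing depth in the female parent
    male_depth: float                    # sequencing depth in the male parent
    pattern: str                         # segregation code, e.g. "lm x ll"
    progeny_calls: List[Optional[str]]   # one genotype per progeny, None if missing


def integrity(marker: Marker) -> float:
    """Fraction of progeny with a non-missing genotype call (assumed definition)."""
    if not marker.progeny_calls:
        return 0.0
    called = sum(call is not None for call in marker.progeny_calls)
    return called / len(marker.progeny_calls)


def passes_filters(marker: Marker,
                   min_parent_depth: float = 10.0,
                   min_integrity: float = 0.90) -> bool:
    """Keep markers with an F1 segregation code, adequate parental depth and integrity."""
    mean_parent_depth = (marker.female_depth + marker.male_depth) / 2
    return (marker.pattern in F1_PATTERNS
            and mean_parent_depth >= min_parent_depth
            and integrity(marker) >= min_integrity)
```

Keeping the thresholds as parameters makes it easy to check how sensitive the number of retained markers is to the 10-fold and 90% cut-offs.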

Table 1 Statistics of SLAF sequencing data and high-quality marker depths.

Figure 1 Number of SLAF markers for eight segregation patterns.

Table 2 Statistics of the segregation patterns for SLAF markers.
Table 3 Summary of valid SLAF marker depths.

Construction of genetic linkage maps

To develop an integrated SNP- and SSR-based map, the newly developed effective SLAF markers were combined with 629 SSR markers from the first-generation linkage map31 for linkage analysis. Using the two-way pseudo-testcross strategy, sex-specific linkage maps were first constructed for each parent at a LOD threshold of 5.0, resulting in 73 linkage groups (LGs), consistent with the haploid chromosome number of E. sinensis. All of these markers were mapped onto the genetic maps. The female map contained 12,332 markers (520 SSR and 11,812 SLAF) and spanned 15,467.37 cM with an average interval of 1.25 cM, while the male map contained 12,699 markers (508 SSR and 12,191 SLAF) and spanned 13,817.44 cM with an average interval of 1.09 cM (Table 4). The sex-specific linkage maps were then integrated into a sex-averaged map (Fig. 2; Supplementary Figure S1), which consisted of 18,309 markers (629 SSR and 17,680 SLAF) and spanned 14,821.92 cM with an average interval of 0.81 cM (Table 4). Using two calculation methods for genome size estimation49,50, the genome map length was estimated to be 14,069.79 cM for the male, 15,733.29 cM for the female and 15,025.28 cM for the sex-averaged map. Genome coverage was calculated as the ratio of the observed to the estimated map length. On this basis, the genome coverage of the male, female and sex-averaged linkage maps was 98.30%, 98.30% and 98.64%, respectively. Because the present maps are much longer than the map created by Cui et al.32 using only SNP markers, we reconstructed the map using only SLAF markers (Supplementary Figure S2). The total length of the reconstructed map was approximately 5,000 cM shorter, indicating that the extra length resulted from the integration of SSR markers rather than from genotyping errors. The increased length is most likely due to the uneven distribution of SSR markers within the LGs. As shown in Fig. 2, most SSR markers were located in the terminal regions of LGs such as LG1-6, which substantially extended the lengths of these LGs. Compared with the 10,358 markers in the previous map32, a total of 18,309 markers were grouped in the present map, and this large increase in the number of mapped markers probably also contributed to the extra map length. Additionally, unlike the 2b-RAD method, the SLAF-seq approach sequences only regions near the enzyme sites. The uneven distribution of enzyme sites leads to an uneven distribution of SNP markers along the linkage map, which could also result in a longer map. To avoid artificial inflation of map lengths, we applied strict criteria for the screening and genotyping of SNP markers, as described above. The high coverage and integrity of each resulting SNP locus greatly reduced genotyping errors and helped ensure the accuracy of the linkage analysis.
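
The summary statistics quoted above follow from simple ratios, which the short Python sketch below makes explicit. It is a worked illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, the average interval is assumed to be map length divided by the number of markers (which reproduces the reported intervals to two decimals), and coverage is taken as observed length over estimated length. Coverage values computed from the rounded lengths quoted in the text may differ in the last digits from the reported percentages.

```python
def average_interval(map_length_cm: float, n_markers: int) -> float:
    """Average marker interval in cM, assumed here to be length / marker count."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return map_length_cm / n_markers


def genome_coverage(observed_cm: float, estimated_cm: float) -> float:
    """Genome coverage (%) as the ratio of observed to estimated map length."""
    if estimated_cm <= 0:
        raise ValueError("estimated_cm must be positive")
    return 100.0 * observed_cm / estimated_cm


# Lengths (cM), marker counts and estimated genome lengths (cM) quoted in the text.
MAPS = {
    "female": (15467.37, 12332, 15733.29),
    "male": (13817.44, 12699, 14069.79),
    "sex-averaged": (14821.92, 18309, 15025.28),
}

for name, (length, markers, estimated) in MAPS.items():
    print(f"{name}: interval {average_interval(length, markers):.2f} cM, "
          f"coverage {genome_coverage(length, estimated):.2f}%")
```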

Table 4 Summary of 73 linkage groups for the mitten crab E. sinensis.
Figure 2 Distributions of SLAF (black bar) and SSR (red bar) markers in the linkage groups.

Segregation distortion markers on the map

Segregation distortion is a widespread phenomenon, defined as a deviation of the observed genotypic frequencies from the expected Mendelian segregation ratios. This deviation is thought to be caused by biological factors, such as gametic and zygotic selection, and/or environmental factors51,52,53. Biological segregation distortion is usually associated with a cluster of skewed markers within a chromosomal region, termed a segregation distortion region (SDR)51,52. In the present study, 2,879 of the 18,309 mapped markers exhibited significant segregation distortion from Mendelian expectations (P < 0.05) on the sex-averaged map (Supplementary Table S2), and they were distributed on every linkage group. A total of 209 SDRs were found on 66 LGs. The average frequency of segregation distortion markers (15.72%) was similar to that observed previously in the same mapping population genotyped with SSR markers only31, suggesting that the segregation distortion is unlikely to be caused by experimental artifacts. The distribution of distorted markers varied greatly between and within LGs. The number of segregation distortion markers per linkage group ranged from 4 to 197. The frequencies of segregation distortion markers on LG14, LG21 and LG23 (63.74%, 65.89% and 62.07%, respectively) were much higher than on the other LGs. The degree of linkage between adjacent markers in each LG is presented as ‘Gap < 5 cM’ in Supplementary Table S2; it ranged from 90.55% to 100% with an average of 96.13%. Although the molecular mechanism of segregation distortion remains unclear, accumulating data show that segregation distortion markers do not have a large effect on QTL mapping; instead, including distorted markers in linkage map construction can increase the genome coverage of the genetic map and help improve the detection of linked QTLs53,54.
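
As an illustration of how distorted markers and SDRs can be flagged, the Python sketch below applies a chi-square goodness-of-fit test against the expected 1:1 or 1:2:1 ratio and then searches for runs of adjacent distorted loci. It is a minimal sketch, not the software used in this study: the function names are hypothetical, the P-value uses the closed-form chi-square survival function for one or two degrees of freedom, and "more than three adjacent loci" is interpreted as a run of at least four.

```python
import math
from typing import List, Sequence, Tuple


def chi_square_p(observed: Sequence[int], ratio: Sequence[int]) -> Tuple[float, float]:
    """Chi-square statistic and P-value for observed counts against a Mendelian ratio."""
    n = sum(observed)
    if n == 0 or len(observed) != len(ratio):
        raise ValueError("need non-empty counts matching the ratio length")
    total = sum(ratio)
    expected = [n * r / total for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(ratio) - 1
    if df == 1:      # e.g. 1:1 for lm x ll or nn x np markers
        p = math.erfc(math.sqrt(chi2 / 2))
    elif df == 2:    # e.g. 1:2:1 for hk x hk markers
        p = math.exp(-chi2 / 2)
    else:
        raise ValueError("this sketch only handles 1 or 2 degrees of freedom")
    return chi2, p


def distortion_regions(distorted: Sequence[bool], min_run: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of runs of at least min_run distorted loci.

    `distorted` must be ordered by map position within one linkage group.
    """
    regions = []
    start = None
    for i, flag in enumerate(list(distorted) + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                regions.append((start, i - 1))
            start = None
    return regions
```

For example, a 1:1 marker scored as 100 and 47 in the 147 progeny would be flagged at P < 0.05, whereas counts of 80 and 67 would not.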

Differences in recombination rates between sexes

Differences in recombination rates between the sexes have been reported in many species, in which meiotic recombination is suppressed in one sex55,56,57. Generally, the heterogametic sex (XY or ZW) has the lower recombination rate. As shown in Table 4, the overall female-to-male ratio of recombination rates for the sex-specific maps was 1.12:1. The sex recombination ratio for shared markers was also calculated as in previous studies58. Across all LGs, a total of 6,726 informative markers were shared between the female and male maps (Supplementary Table S3). Based on the shared markers, the total length of the linkage map was 15,367.58 cM in the female and 13,641.96 cM in the male. The female-to-male ratio of the recombination rate for shared markers was therefore 1.13:1 (Supplementary Table S3), similar to that for all markers in the sex-specific maps (Table 4). This result is consistent with the first SSR-based linkage map of the mitten crab31 and with maps of other decapod species30. Although the total lengths of the female and male maps are similar, markedly different recombination ratios between the sex-specific maps were observed in some LGs. The female-to-male ratios of recombination rates for shared markers in individual linkage groups ranged from 0.50:1 in LG12 to 2.41:1 in LG41. The highest ratios (greater than 2.0) were observed in LG41 (2.41:1) and LG48 (2.04:1), while the lowest (less than 0.6) were observed in LG12 (0.50:1) and LG34 (0.54:1) (Supplementary Table S3).
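
The ratios above are obtained by dividing the female map length by the male map length over the same set of markers. The Python sketch below reproduces the whole-map figures from the lengths quoted in the text and shows how linkage groups with strongly divergent ratios could be flagged. The function names and the per-group input format are hypothetical, and the cut-offs of 2.0 and 0.6 are simply the bounds mentioned in the text.

```python
from typing import Dict, List, Tuple


def female_male_ratio(female_cm: float, male_cm: float) -> float:
    """Female-to-male ratio of map length, used as a proxy for recombination rate."""
    if male_cm <= 0:
        raise ValueError("male_cm must be positive")
    return female_cm / male_cm


def divergent_groups(lengths: Dict[str, Tuple[float, float]],
                     high: float = 2.0,
                     low: float = 0.6) -> List[Tuple[str, float]]:
    """List linkage groups whose female:male ratio is above `high` or below `low`.

    `lengths` maps a linkage group name to (female_cm, male_cm) for shared markers.
    """
    flagged = []
    for group, (female_cm, male_cm) in sorted(lengths.items()):
        ratio = female_male_ratio(female_cm, male_cm)
        if ratio > high or ratio < low:
            flagged.append((group, ratio))
    return flagged


# Whole-map lengths (cM) quoted in the text.
all_markers = female_male_ratio(15467.37, 13817.44)
shared_markers = female_male_ratio(15367.58, 13641.96)
print(f"all markers {all_markers:.2f}:1, shared markers {shared_markers:.2f}:1")
```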

QTL mapping for phenotypic sex

The genetic mechanism of sex determination in crabs is controversial. Early karyotypic analyses revealed an XY-XX sex determination system in several crabs59,60. However, a recent linkage analysis in one family indicated that sex-linked markers were heterozygous and segregated only in the female parent, suggesting a ZW-ZZ system in the mitten crab31. Unfortunately, these sex-linked markers have not been converted into sex-specific markers to support the putative system. In this study, 146 markers linked with phenotypic sex were assigned to LG48 (Fig. 3d), on which a marked difference in recombination rates between the two sexes was observed, as mentioned above. The recombination rate was lower in the male than in the female, suggesting that the male might be the heterogametic sex (XY), since the heterogametic sex usually has the lower recombination rate. Surprisingly, none of the 146 sex-linked markers was skewed towards either the female or the male parent, as judged by the sex ratio of the segregating markers (Supplementary Table S4). Thus, our data cannot conclusively support either an XY-XX or a ZW-ZZ system. Furthermore, in a genome-wide analysis using these sex-linked markers, no known sex-related gene was found in the scaffolds anchored to LG48. Potential sex determination genes, including sxl, tra and dmrt, were detected in scaffolds anchored to other LGs (data not shown). A similar result was reported previously in a search of the crab transcriptomes32. In addition, we previously screened the mitten crab and prawn genomes in parallel for sex-linked markers using amplified fragment length polymorphism (AFLP). Two reliable female-specific AFLP fragments were successfully detected in the prawn61, whereas no sex-specific marker was identified in the mitten crab even though many more AFLP primer combinations were used in the screening62. Accordingly, the sex determination mechanism in the crab could be more complicated than that in the prawn. 
To draw a robust conclusion on the sex determination system of the crab, more families and populations should be included in sex-linkage analyses, and sex-specific markers need to be isolated to validate the sex chromosomes.

Figure 3 QTL mapping for the growth traits including body length (a), width (b) and weight (c), and sex (d) in E. sinensis. The blue and red curves indicate logarithm of odds (LOD) scores and percentage of explained phenotypic variance (Expl) of SLAF markers against their genetic position on LG53.

QTL mapping and association analyses for major growth traits

QTL mapping analysis was performed for the growth traits body length, width and weight (Supplementary Table S5). Interestingly, all growth-related QTL significant at the genome-wide level were detected in the same linkage group, LG53 (Fig. 3, Table 5), suggesting that LG53 may represent a major chromosome controlling the growth of E. sinensis. The three closest markers, Marker253119, Marker01950 and Marker209500, were localized to a 169.43–170.60 cM region significantly associated with each of the growth traits (length, width and weight), explaining 78.5–95.5% of the phenotypic variance (Table 5). The high phenotypic variance explained by these loci suggests that they are major QTL for growth performance in the crab. Generally, the phenotypic variance explained (Expl) by a QTL is strongly affected by mapping population size and individual gene effects. If two QTLs are linked in repulsion phase on one chromosome, the Expl value can be substantially biased and may even exceed 100%63,64. Thus, the high contribution of each growth-related QTL in the crab is most likely because the QTLs are tightly linked in repulsion. Given that the mapping population used in the QTL analysis was derived from an intercross between the northern and southern populations, this result suggests that major growth-related QTL linked in repulsion might play important roles in the genetic control of potential heterosis in growth performance of the crab, as revealed in hybrid crops65, which merits further validation.

Table 5 Summary of markers significantly associated with growth-related QTL in LG53.

By means of genome-wide QTL-association analysis, the markers in the linkage map were further mapped onto the genomic scaffolds of the mitten crab draft genome (unpublished data). A total of 13,198 markers were uniquely aligned to the scaffolds after removal of multiple alignments, indicating consistency among the SNP markers, the linkage map and the genome scaffolds. As shown in Table 5, Marker253119 and Marker209500 were anchored onto Scaffold225377 and Scaffold291249, respectively. Several growth-related genes, namely ligand of numb protein X 2 (LNX2), p21-activated kinase 2 (PAK2), FMRFamide receptor and octopamine receptor, are located in the vicinity of the markers. LNX2 functions as an E3 ubiquitin ligase, and its silencing affects the Notch and Wnt signaling pathways66, which play key roles in cellular proliferation and differentiation. Intriguingly, cattle LNX2 was also identified in a genomic region associated with residual body weight gain67, suggesting that LNX2 may be a conserved regulator of individual growth in both invertebrates and vertebrates. PAK2, a serine/threonine kinase, acts as an effector of the GTPases Rac1 and Cdc42 and is required for actin cytoskeletal remodeling, cell cycle progression, and apoptosis or proliferation68,69. Full-length PAK2 stimulates cell survival and cell growth. The FMRFamide receptor, a receptor for the FMRFamide peptides, has regulatory roles at skeletal neuromuscular junctions in insects and is involved in muscular differentiation and growth. The presence of these positional candidate genes suggests that this map is efficient and accurate for QTL mapping of growth traits. Further studies using more families and populations should be performed to determine the detailed genetic architecture and regulatory mechanisms of these growth-related genes involved in cell proliferation and growth, and their specific association with growth traits in the mitten crab. 
The locus could be an ideal candidate target for marker-assisted selection in crab breeding.
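
The removal of multiple alignments mentioned above amounts to keeping only markers with a single alignment hit. A minimal Python sketch of that step is given below, assuming the alignment output has already been parsed into (marker, scaffold, start) tuples. The tuple layout and the `unique_anchors` helper are hypothetical and do not describe the actual scripts used in this study.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, int]  # (marker name, scaffold name, alignment start)


def unique_anchors(hits: Iterable[Hit]) -> Dict[str, Tuple[str, int]]:
    """Return {marker: (scaffold, start)} for markers with exactly one alignment hit."""
    hits = list(hits)
    hits_per_marker = Counter(marker for marker, _scaffold, _start in hits)
    return {
        marker: (scaffold, start)
        for marker, scaffold, start in hits
        if hits_per_marker[marker] == 1   # drop markers that align more than once
    }
```

Markers that align to several scaffolds, or to several positions on one scaffold, are discarded entirely rather than assigned to their best hit, which is the conservative choice when scaffolds are to be positioned on the map.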

Conclusions

We constructed a second-generation high-density linkage map of the mitten crab E. sinensis. The integrated map, comprising 17,680 SNP and 629 SSR markers on 73 linkage groups, represents the most saturated genetic map to date for the crab and may assist in de novo assembly of the genome sequence. Three growth-related QTL, for body length, body width and body weight, were identified in a genomic region on LG53, each explaining a high proportion of the phenotypic variance, suggesting that they are major growth QTL. The QTL markers were further mapped onto the genomic scaffolds of the mitten crab draft genome, and several growth-related genes were found to map close to these loci. The high-density linkage map can therefore serve as an efficient platform for fine mapping of QTL and positioning of sequence scaffolds, and will be valuable for better understanding the genomic structure of the crab and for speeding up crab breeding programs.

Materials and Methods

Mapping family and sampling

An F1 mapping population was derived from a cross between a female and a male parent that were collected separately from the Liaohe River (northern population) in Liaoning Province and the Yangtze River (southern population) in Jiangsu Province, China31. The F1 progeny were reared in outdoor ponds at a mitten crab farm on Chongming Island, Shanghai, China. Plastic boards were set up around each pond to prevent the crabs from escaping, and the water inlet and outlet of each pond were fitted with plastic nets (mesh size: 0.17 mm) to exclude indigenous fishes and other predators from the water source and drainage channel. In the first year, individuals were raised from megalopae to juveniles (“coin-sized” crabs) at high density (30–100 individuals/m²), and in the second year from juveniles to adults at low density (1–2 individuals/m²)70. The pond water was at ambient temperature and fluctuated with the season (spring and winter: 6–21 °C; summer and autumn: 24–33 °C). Pond water quality was maintained at pH 7.0–9.0, dissolved oxygen >3 mg/L, ammonia <0.4 mg/L and nitrite <0.15 mg/L. Juvenile and adult individuals were randomly sampled and stored in 100% ethanol at −80 °C. Three growth-related traits, body length, body width and body weight, were measured as previously described31. Genomic DNA was extracted from leg muscles following the standard phenol-chloroform protocol71. Crab assays were conducted in accordance with the guidelines of COPE (Committee on Publication Ethics).

SLAF library construction and sequencing

To generate a large number of high-quality SLAFs, a pre-experiment was designed to evaluate the restriction enzymes and the sizes of the restriction fragments. SLAF libraries were constructed according to the optimal pre-designed scheme with two restriction enzymes. Specifically, genomic DNA from each sample was first completely digested at 37 °C with Hae III and RsaII (NEB, Ipswich, MA, USA) and then incubated with the Klenow Fragment (3′ → 5′ exonuclease) (NEB) and dATP at 37 °C to add a single-nucleotide A overhang to the digested fragments. The A-tailed DNA fragments were then ligated to Duplex Tag-labeled Sequencing adapters (PAGE purified, Life Technologies) using T4 DNA ligase. PCR was performed using diluted restriction-ligation samples, dNTPs, High-Fidelity DNA polymerase and the primers AATGATACGGCGACCACCGA and CAAGCAGAAGACGGCATACG (PAGE purified, Life Technologies). The PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, High Wycombe, UK), pooled and separated on a 2% agarose gel. Fragments of 264–464 bp (with indexes and adaptors) were excised, purified using a QIAquick Gel Extraction Kit (QIAGEN) and diluted for paired-end sequencing on an Illumina HiSeq™ 2500 sequencing platform (Illumina, Inc., San Diego, CA, USA) at Biomarker Technologies Corporation in Beijing. Real-time monitoring was performed for each cycle during sequencing. For quality control, the ratio of high-quality reads with quality scores greater than Q30 (a quality score of 30, corresponding to a 0.1% chance of a base-call error and thus 99.9% confidence) in the raw reads and the guanine-cytosine (GC) content were calculated.
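
The relationship between a Phred quality score Q and the base-call error probability P is P = 10^(−Q/10), so Q30 corresponds to P = 0.001. The Python sketch below shows how the Q30 ratio and GC content can be computed from read data. It is a hedged illustration, not the vendor's quality-control software: the function names are hypothetical, the quality strings are assumed to use the Phred+33 ASCII encoding, and bases with a score of 30 or above are counted as passing.

```python
from typing import Iterable


def phred_to_error(q: float) -> float:
    """Base-call error probability for Phred score q: P = 10 ** (-q / 10)."""
    return 10 ** (-q / 10)


def q30_ratio(quality_strings: Iterable[str], threshold: int = 30, offset: int = 33) -> float:
    """Fraction of bases whose Phred score is at least `threshold` (Phred+33 assumed)."""
    total = passed = 0
    for quals in quality_strings:
        for ch in quals:
            total += 1
            if ord(ch) - offset >= threshold:
                passed += 1
    return passed / total if total else 0.0


def gc_content(sequences: Iterable[str]) -> float:
    """Fraction of G and C among the A, C, G and T bases (ambiguous bases are ignored)."""
    gc = acgt = 0
    for seq in sequences:
        for base in seq.upper():
            if base in "ACGT":
                acgt += 1
                if base in "GC":
                    gc += 1
    return gc / acgt if acgt else 0.0
```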

SLAF-seq data analysis and genotyping

The SLAF-seq data were filtered and quality-assessed, and then grouped and genotyped by the method described by Sun et al.12. Raw reads were assigned to 149 individuals according to their barcode sequences. Polymorphic SLAFs were sorted into eight segregation patterns: ab × cd, ef × eg, ab × cc, cc × ab, hk × hk, lm × ll, nn × np and aa × bb. Because the mapping population is an F1, SLAF markers with the pattern aa × bb were excluded from genetic map construction. Additionally, SLAF markers with average sequencing depths of less than 10-fold in the parents were filtered out.
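
The eight segregation patterns can be derived from the parental genotypes alone. The Python sketch below shows one way to do this. It is an assumption-laden illustration rather than the genotyping software used here: the `segregation_pattern` function is hypothetical, genotypes are given as pairs of allele labels with the female parent first, and the codes are assigned by the common convention for outbred full-sib families, in which the letters indicate how many alleles the parents carry and share.

```python
from typing import Optional, Tuple

Genotype = Tuple[str, str]  # two allele labels, e.g. ("A", "G")


def segregation_pattern(female: Genotype, male: Genotype) -> Optional[str]:
    """Classify a marker by parental genotypes; None means non-polymorphic.

    ASCII "x" stands for the cross sign; the female parent is written first.
    """
    f_alleles, m_alleles = set(female), set(male)
    f_het, m_het = len(f_alleles) == 2, len(m_alleles) == 2

    if not f_het and not m_het:
        # Both parents homozygous: fixed difference, or no polymorphism at all.
        return "aa x bb" if f_alleles != m_alleles else None
    if f_het and not m_het:
        # Female heterozygous only: does the male carry one of her alleles?
        return "lm x ll" if m_alleles <= f_alleles else "ab x cc"
    if m_het and not f_het:
        # Male heterozygous only: does the female carry one of his alleles?
        return "nn x np" if f_alleles <= m_alleles else "cc x ab"
    # Both heterozygous: classify by the number of shared alleles.
    shared = len(f_alleles & m_alleles)
    return {2: "hk x hk", 1: "ef x eg", 0: "ab x cd"}[shared]
```

Under this convention only lm × ll, nn × np, hk × hk, ef × eg and ab × cd markers segregate in a way that is informative for pseudo-testcross mapping, which matches the codes retained for map construction.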

Genetic linkage map construction

Markers heterozygous in one parent and homozygous in the other, which were expected to segregate 1:1 in the F1 generation, were termed “female” or “male” markers, depending on the sex of the heterozygous parent, and were used to construct the sex-specific linkage maps. Markers heterozygous in both parents, which were expected to segregate 1:2:1 in the F1 generation, were termed “biparental” markers. The segregation type (SEG) of the female and male markers was set to lm × ll and nn × np, respectively, and that of biparental markers to hk × hk. The two characters to the left and right of the “×” in these codes correspond to the two marker alleles of the first and second parent, respectively; each distinct marker allele is represented by a different character. The high-quality SLAF markers and the 629 SSR markers from the first-generation map were selected for genetic map construction. Marker segregation ratios were tested using the chi-square (χ²) test. Markers showing segregation distortion were also integrated into the map, and regions with more than three adjacent loci showing significant segregation distortion (P < 0.05) were defined as segregation distortion regions (SDRs). Linkage grouping, marker ordering, genotyping error correction and map evaluation were performed using the HighMap method as described by Liu et al.72. Markers were divided into linkage groups (LGs) using the single-linkage clustering algorithm at a logarithm of odds (LOD) threshold of 4.0 and a maximum recombination fraction of 0.4.
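
To illustrate the grouping step, the Python sketch below computes a two-point LOD score for pairs of 1:1 markers and merges markers into groups by single-linkage clustering with a union-find structure. It is a simplified sketch and not the HighMap implementation: the function names are hypothetical, genotypes are coded 0/1 with None for missing calls, linkage phase is handled by taking the smaller of the two recombinant classes, and the LOD is the standard two-point statistic n·log10(2) + k·log10(r) + (n − k)·log10(1 − r), with r = k/n.

```python
import math
from itertools import combinations
from typing import Dict, List, Optional, Sequence, Tuple

Calls = Sequence[Optional[int]]  # one 0/1 genotype per progeny, None if missing


def two_point_lod(a: Calls, b: Calls) -> Tuple[float, float]:
    """Return (recombination fraction, LOD) for two 1:1 markers."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return 0.5, 0.0
    k = sum(1 for x, y in pairs if x != y)
    k = min(k, n - k)          # unknown phase: use the smaller class as recombinants
    r = k / n
    lod = n * math.log10(2) + (n - k) * math.log10(1 - r)
    if k > 0:
        lod += k * math.log10(r)
    return r, lod


def single_linkage_groups(markers: Dict[str, Calls],
                          lod_min: float = 4.0,
                          r_max: float = 0.4) -> List[List[str]]:
    """Group markers: any pair with LOD >= lod_min and r <= r_max joins two groups."""
    parent = {name: name for name in markers}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]   # path compression
            name = parent[name]
        return name

    for m1, m2 in combinations(markers, 2):
        r, lod = two_point_lod(markers[m1], markers[m2])
        if lod >= lod_min and r <= r_max:
            parent[find(m1)] = find(m2)

    groups: Dict[str, List[str]] = {}
    for name in markers:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())
```

Because every pair of markers is compared, this sketch scales quadratically with the number of markers and is meant only to show the logic of threshold-based grouping.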

QTL mapping and genome-wide association analyses

QTL analysis was conducted with the R/qtl software using the composite interval mapping (CIM) method73,74. The significance of each QTL interval was tested with a likelihood-ratio statistic (LOD). The LOD threshold for significance (P = 0.05) was determined using 1,000 permutations, and a QTL was considered significant if its LOD score exceeded this threshold. Morphological and growth phenotypes were treated as quantitative traits. Three growth-related traits, body length, body width and body weight, were measured and evaluated as previously described31. Phenotypic sex was treated as a binary trait, with females scored as “1” and males as “0”. Genome-wide QTL-association analysis was conducted as follows: when a QTL significantly associated with growth-related traits was detected, the corresponding SLAF tag sequences were extracted and anchored to genomic scaffolds using Blat75. Both homology-based analysis and de novo gene prediction were performed on the genomic scaffolds using the program Augustus76.
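
The permutation procedure for the LOD threshold can be sketched as follows. This Python example is a simplified stand-in and does not reproduce the CIM analysis in R/qtl: it uses single-marker regression, in which LOD = (n/2)·log10(RSS0/RSS1) compares the residual sum of squares without and with the marker genotype, and all function names are hypothetical. Phenotypes are shuffled relative to genotypes, the maximum LOD across markers is recorded for each shuffle, and the threshold is the (1 − α) quantile of these maxima.

```python
import math
import random
from typing import Hashable, List, Sequence


def marker_lod(genotypes: Sequence[Hashable], phenotypes: Sequence[float]) -> float:
    """LOD from regressing the phenotype on genotype class at a single marker."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    by_class = {}
    for g, y in zip(genotypes, phenotypes):
        by_class.setdefault(g, []).append(y)
    rss_full = 0.0
    for values in by_class.values():
        class_mean = sum(values) / len(values)
        rss_full += sum((y - class_mean) ** 2 for y in values)
    if rss_null == 0.0:
        return 0.0               # no phenotypic variation to explain
    if rss_full == 0.0:
        return float("inf")      # genotype classes explain the phenotype perfectly
    return (n / 2) * math.log10(rss_null / rss_full)


def permutation_threshold(marker_genotypes: Sequence[Sequence[Hashable]],
                          phenotypes: Sequence[float],
                          n_perm: int = 1000,
                          alpha: float = 0.05,
                          seed: int = 0) -> float:
    """Genome-wide LOD threshold: (1 - alpha) quantile of the per-permutation maximum LOD."""
    rng = random.Random(seed)
    shuffled: List[float] = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)    # break any genotype-phenotype association
        maxima.append(max(marker_lod(g, shuffled) for g in marker_genotypes))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]
```

Taking the maximum LOD over all markers within each permutation is what makes the resulting threshold genome-wide rather than per-marker.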

Additional Information

How to cite this article: Qiu, G.-F. et al. A second generation SNP and SSR integrated linkage map and QTL mapping for the Chinese mitten crab Eriocheir sinensis. Sci. Rep. 7, 39826; doi: 10.1038/srep39826 (2017).
