Introduction

Because of global warming, the major pests of soybean [Glycine max (L.) Merr.] have shifted from foliage-feeding coleopteran and lepidopteran pests to sap-sucking hemipteran pests during the last few decades1. Various stink bugs and soybean aphid species are the main hemipteran pests that damage soybean plants by sucking the juice from plant tissues2. Significant losses in soybean yield, quality, and germination have been caused by stink bug feeding3. The soybean aphid can cause up to 58% yield loss, and a $2.4 billion loss has been estimated annually in the US4.

Although synthetic pyrethroid insecticides have been widely used to manage insects, they might also harm plants. Phytotoxicity is defined as a harmful effect on various physiological processes. In various crop species including soybean, the several studies on interactions between plant physiology and pesticides have been reported5,6,7,8,9. The symptoms of phytotoxicity differ, including leaf speckling, leaf margin necrosis (browning) or chlorosis (yellowing), brown or yellow leaf spots, leaf cupping or twisting, plant stunting, and plant death10.

The insecticidal mode of action of pyrethroids, which derive etofenprox, relies on their ability to bind to voltage-gated sodium channels, disrupting insects’ nerve system11. Voltage-gated sodium channels are the most effective targets for the neurotoxic effects of pyrethroids, while voltage-gated chloride, and calcium channels work as secondary sites of action for a subset of pyrethroids11,12. Although the insecticide of etofenprox has been widely used in soybean fields to manage pests, few physiological and genetic studies of interactions between plants and pesticides have been reported. In a previous study, we firstly reported a novel trait, sensitivity to phytotoxicity of etofenprox (PE) in soybean, and it was illustrated that it was controlled by a single dominant gene through genetic analysis13. Only a limited number of genotypes, Danbaek and Kwangan, indicated leaf shrinkage due to the etofenprox application13. PE in Danbaek and Kwangan was regulated by the same gene inherited from a common ancestor, Tohoku 6913.

Because research tools have been rapidly developed, especially high-throughput genotyping systems (eg., Axiom SoyaSNP array), genomic research of soybean has been widely reported14,15,16,17,18. For example, single genes could be associated with flowering (Glyma.10g221500) and pod shattering (Glyma.16g141600), based on high throughput single nucleotide polymorphisms (SNPs) data in soybean15,16. Using genotypic data of Korean soybean collection comprising 1957 domesticated and 1079 wild accessions, domestication and evolutionary history were studied17. Through Axiom SoyaSNP array data of 2782 soybean collections, the core collection was developed for further genome-wide association study (GWAS)18.

The main goal of this study was to identify candidate genes responsible for sensitivity to PE using genomic approaches in soybean. To achieve this goal, a high saturated genetic map was constructed using a recombinant-inbred line (RIL) population, and a validation study was used with different RIL populations using TaqMan based SNP marker. Furthermore, whole-genome resequencing data from diverse soybean germplasm was used to develop SNPs in candidate genes associated with PE.

Results

The sensitivity of PE was identified as a qualitative trait regulated by a single dominant gene13. In this study, two RIL populations were used to identify candidate genes and validate the results of genetic mapping. Among the 113 RIL populations derived from Daepung (insensitive) × Danbaek (sensitive), 79 lines showed sensitivity to PE and 34 lines showed insensitivity to PE. Among the 138 RIL population derived from 5002T (insensitive) × Kwangan (sensitive), 83 and 55 lines showed sensitivity and insensitivity to PE, respectively. Both populations exhibited discrete phenotypic distribution indicating PE is a qualitative trait.

A highly saturated genetic map was constructed using Axiom 180K SoyaSNP array in the Daepung × Danbaek population (Fig. 1A, Fig. S1, Table S1). Out of 170,223 markers identified from the chip analysis, 9712 markers were used to construct 20 linkage groups, comprising 485 markers per linkage group, on average. Among the 1715 SNPs from the chip analysis, 358 SNP markers were used to construct chromosome 6 of the genetic map (Fig. S1, Table S1). The genomic region of 22.8 cM conferring 7.6 Mbp (Gm06:14,359,496..21,913,046) was associated with the color of pubescence and the T gene (Gm06:18,731,136..18,737,982) was located in this region. To construct chromosome 13 of the genetic map, 881 SNPs out of 1953, were used (Fig. S1, Table S2). The genomic region of 2.5 cM, corresponding to 29 Kbp (Gm13:17,365,410..17,336,628) was associated with the color of flower and W1 gene (Gm13:17,312,425..17,317,057) was 20 Kbp apart from this region (Fig. S1, Table S2).

Figure 1
figure 1

The genetic linkage map and its physical region of the locus conferring PE, which was identified by genetic mapping, on chromosome 16 from two RIL population. (A) The locus related to PE on the chromosome 16 from Daepung × Danbaek RILs population. (B) The genotypes distribution analysis for candidate region using RILs with extreme phenotypic data. The red box shows candidate region of PE. (C) Validation of candidate region using the twelve TaqMan base SNP markers in 5002T × Kwangan RIL population. (D) The nine candidate genes based on the reference genome of soybean Wm82.a2.v121.

Out of 1075 SNP markers, 166 SNP markers were used to construct chromosome 16 (Fig. S1, Table S1). The genomic region of 9.6 cM was associated with PE, which was 548 kbp (Gm16:33,766,944..34,315,611). Genotype distribution analysis was conducted using 20 RILs showing extremely sensitive and insensitive phenotypes to refine the candidate region. Around two markers of AX-90424663 and AX-90394393, 29 polymorphic SNP markers were eliminated during the map construction because of redundancy. Through genotype analysis, including those unmapped SNP markers, the left and right borders of the candidate region were shifted from Gm16:33,766,944..34,315,611 to Gm16:34,149,756..34,750,023 (Fig. 1B). Thus, the region between AX-90401124 and AX-90494358 (~ 600 kbp, Gm16:34,149,756..34,750,023) exhibited clear discrimination of genotype distribution between the two groups, extremely sensitive and insensitive to PE (Fig. 1B).

To validate the results, 138 5002T × Kwangan RIL population was genotyped, where 83 and 55 lines showed sensitivity and insensitivity to PE, respectively, and the sensitivity of PE was scored. Around the candidate region identified using the Daepung × Danbaek RIL population, the genomic region between two markers, PE-07 and PE-11, showed clear discrimination of haplotypes, narrowing down the candidate region to ~ 80 kbp related to PE (Fig. 1C, Fig. S2). The results indicates that this region (~ 80 kbp, Gm16:34,261,202..34,341,781) appears to play a key role in soybean sensitivity to PE (Fig. 1C, Fig. S2).

Based on the Williams 82 genome assembly (https://soybase.org/gb2/gbrowse/glyma.Wm82.gnm4/ accessed on 18 January 2021), nine gene models were located with the candidate genomic region of ~ 80 kbp region related to PE21 (Fig. 1D). Out of nine annotated genes, 25 SNPs among Danbaek, Kwangan and Daepung were identified on the eight genes (Table 1). Among 25 SNPs, 15 SNPs resulted in nonsynonymous substitutions, located on the seven genes, including Glyma.16g181700, Glyma.16g181900, Glyma.16g182100, Glyma.16g182200, Glyma.16g182300, Glyma.16g182400 and Glyma.16g182500 (Table 1).

Table 1 Nine annotated gene models and SNPs in the exon of the genes in candidate gene regions between Danbaek, Kwangan and Daepung.

To confirm the genetic variations within the seven gene models, nucleotide sequences of the genes were compared among 31 soybean genotypes including Danbaek and Kwangan. Several germplasms of insensitivity to PE had the same alleles as Danbaek and Kwangan at SNP-01, 03, 05, 06, 07, 09, 10, 11, 12, 13, 14 and 15 (Table 2). However, all genotypes with insensitivity to PE had distinct alleles from Danbaek and Kwangan at SNP- 02, 04 and 08, located between 34,277,712 and 34,315,218 bp on soybean chromosome 16 (Table 2). These three SNPs cause amino acid substitutions in three genes, and it is proposed that these three genes, Glyma.16g181900, Glyma.16g182200, and Glyma.16g182300, are promising candidate genes associated with PE in soybean.

Table 2 SNPs within genes showed amino acid changes in diverse soybean germplasm.

Discussion

Through the 180K Axiom SoyaSNP assay, the high-density genetic map was constructed for the Daepung × Danbaek RIL population. In the population, the colors of flower and pubescence, which had been reported to be controlled by single genes previously19,20, were divided into purple (70) and white (43), gray (76), and brown (37), respectively. The responsive genes, T and W1, for the colors of pubescence and flower could be identified using the constructed genetic map.

The candidate region (~ 80 kbp) conferring PE was narrowed down from (kbp) to ~ 80 kbp using the second RIL population (5002T × Kwangan) through TaqMan based SNP assay. The genomic region contains nine annotated gene models on chromosome 16 based on the Williams 82 genome assembly (https://soybase.org/gb2/gbrowse/glyma.Wm82.gnm4/ accessed on 18 January 2021). Using high-depth resequencing data on 31 soybean germplasm, three candidate genes, Glyma.16g181900, Glyma.16g182200 and Glyma.16g182300 had amino acid substitutions between genotypes with insensitivity (29 genotypes) and sensitivity (Danbaek and Kwangan) to PE. Glyma.16g182200 is the most promising candidate gene associated with sensitivity to PE, Therefore, accurately evaluated phenotypes, high-density genetic map, and genomic information from diverse germplasm would be essential for successful genetic mapping to identify candidate gene(s) of a target trait.

Arabidopsis homologous gene, AT5G02370.1, a homologous Glyma.16g181900, encodes adenosine triphoshpate binding microtubule motor family protein (https://www.arabidopsis.org/). In wild-type, trichomes have a stalk and three or four branches. In contrast, in zwichel (zwi) mutants, the trichomes have a shortened stalk and only two branches resulting in leaf senescence, similar to the morphological symptoms induced by PE22. Glyma.16g182300 encodes cleavage and polyadenylation specificity factor 100 (CPSF 100), a homologue of AT5G23880.1 in Arabidopsis. In Arabidopsis, CPSF100 mutant showed early flowering phenotypes induced by modified RNA processing of flowering control locus A enhancer of flowering24,25. Besides flowering time, the CPSF protein family is involved in various functions in biological processes such as environmental response and amino acid metabolism23,26,27. A variant of this gene might also affect the responses of plants to environmental stresses at the level of transcription, causing sensitivity to PE.

AT5G23890.1, Arabidopsis orthologue of Glyma.16g182200, encodes glycosylphosphatidylinositol (GPI)—anchored protein (GPI-AP), allowing various proteins to associate with membrane lipid bilayers and anchor on the external surface of the plasma membrane28. The GPI oligosaccharide structure is ubiquitous among eukaryotes with a common minimal backbone comprising three mannoses, one non-N-acetylated glucosamine, and an inositol phospholipid, which covalently links the carboxyl terminus (C terminus) of GPI-Aps to the lipid bilayer29. GPI-Aps include diverse families, such as cell wall structure proteins, proteases, enzymes, and lipid transfer proteins. They are involved in several functional processes, including cell wall composition, cell wall component synthesis, polar cell expansion, stress responses, hormone signaling responses, and pathogen responses in Arabidopsis28.

Voltage-gated sodium channels are important for initiating and propagating the action potential of neurons. Because of their essential roles in electrical signaling, sodium channels are primary targets of synthetic insecticides, including etofenprox. These sodium channel neurotoxins bind to the receptor on the sodium channel, changing different channel properties, including ion selectivity, ion conductance and/or channel opening and closing12. GPI-Aps have been reported to play roles in ion channels30,31,32.

TEX101, a germ cell-specific GPI-AP, belongs to the lymphocyte antigen 6 (Ly6)/urokinase-type plasminogen activator receptor (uPAR)-(LU) protein superfamily33. A recent crystal structure analysis of TEX101 provides evidence that this molecule has two LU domains, which function as a regulator of ion channels34. This structural feature holds great promise for elucidating the actual interactions between this molecule and a group of molecules associated with the ion channels35. In zebrafish, voltage-gated sodium channels are not located at the cell surface in the mutant of a subunit of GPI transamidase, which is crucial for membrane anchoring of GPI-anchored proteins. The biogenesis of GPI-anchored proteins is necessary for cell surface expression of sodium channels of neurons in zebrafish35,36.

The protein structure of the candidate gene was predicted using the AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/) (Fig. S3)37. Comparing the predicted structures of Glyma.16g182200, the energy minimized prediction model showed the difference in length of residue between the Daepung and Danbaek types (Fig. S3B,C). Taken together, the PE symptoms in this study might have originated from the structural disruption at the active site of GPI-AP that is anchored on ion channels, which are the targets of pyrethroid insecticide, causing leaf shrinkage or senescence by cell wall disruption.

The SNP markers and high-resolution genetic map identified in the this study will facilitate MAS of PE in soybean. PE-sensitive genotype specific SNP markers were identified. Among the three candidate genes, Glyma.16g182200, encodes GPI-AP, is the most promising causal single gene regulating PE in soybean. Further functional characterization would be required through RNA expression analysis or cloning to transformation of the candidate gene. Identification of the responsive gene will improve our understanding of the basic mechanism of sensitivity to PE in soybean.

Materials and methods

Plant materials and the evaluation of three qualitative traits

Two recombinant inbred line (RIL) populations were used in the present study. The first population of RILs was developed from a cross between Daepung38 and Danbaek39. Daepung showed insensitivity to etofenprox and Danbaek showed sensitivity to etofenprox13. After artificial crossing in 2012, F1 progeny was advanced using single seed descent in the Dankook University greenhouse resulting in 119 F5:8 lines used for this study in 2019.

The second 138 RIL population was derived from a cross between 5002T40 showing insensitivity to etofenprox and Kwangan41 showing sensitivity to etofenprox, in the experimental field of Kyungpook National University. 5002T was developed for high yield with maturity group V in the southern USA in 200240 and Kwangan was developed for high-protein13. This population was used to confirm genetic mapping and inheritance of sensitivity to PE with a different genetic background. Furthermore, from these two RIL populations, SNP markers associated with sensitivity to etofenprox were developed.

The RILs were sowed using a mixture of horticultural soil and the nursery bed soil at the ratio of 3: 1 in 50 deep cell seed trays, 55 × 27 × 12 cm (Wide × Length × Height). Etofenprox 20% EC 1000 × dilution was foliar sprayed four times in two weeks at V1 stage13. Distilled water was used as a negative control. A 10 mL was applied to each plant in each application and the phenotype (sensitivity/insensitivity) was evaluated by presence/absence of leaf shrinkage a week after the final treatment. The treatments were conducted with three biological replications13.

Flower color and pubescence color were also scored in two RIL populations for detecting W1 and T genes, respectively, previously reported19,20. Three qualitative phenotypic data (sensitivity of PE, flower color, and pubescence color) were recorded by two types, reference allele or alternative allele.

DNA extraction, SNP genotyping, and genetic map construction

Genomic DNA from each line and the parents of two RIL populations was extracted from young trifoliate leaves using the cetyltrimethylammonium bromide method42 with the following modifications : an incubation time of 90 min, re-suspension of the DNA pellet in 500 μL 1 × TE, and no RNase A treatment. First, all DNA was quantified using an ND-1000 Spectrophotometer and diluted to 100 ng µL−1 for further study.

For the SNP genotyping, Axiom 180K SoyaSNP assay (Affymetrix, CA, USA) was used14 in the Deapung × Danbaek RIL population. Genomic DNA from the lines and parents was hybridized into the Affymetrix GeneTitan array system and then scanned using GeneTitan Scanner (Affymetrix, CA, USA) following the manufacturer’s protocol. SNP genotype analysis was conducted based on Axiom Genotyping Solution Data Analysis User Guide (http://www.affymetrix.com)43.

The genetic map was constructed using the QTL ICIMapping software ver. 4.144 according to the following parameters: binning by segregation distort 0.01, grouping by a 3.0 logarithm of odds threshold, nnTwoOpt (nearest neighbor with a two-opt heuristic algorithm) for ordering algorithm45, and rippling by the sum of adjacent recombination. The kosambi mapping function was used for the genetic map construction46. The genotype information of the unmapped SNP marker is used to narrow down the mapping region. Redundant markers were removed for map construction.

TaqMan SNP genotyping assay and Sequence comparison with soybean germplasm

To validate the result of genetic mapping from the Deapung × Danbaek RIL population, the probes for the TaqMan assay were designed based on the twelve SNPs originating from Axiom 180K SoyaSNP assay (SFC-dye, South Korea) (Table S2). The extracted DNA was mixed with SFC Master Mix buffer, probes, and primers for the TaqMan SNP genotyping assay. The polymerase chain reactions (PCRs) were run on ABI Step one plus real-time PCR instrument (Applied Biosystems, CA, USA). The PCR conditions were initially 2 min at 50 ℃, 10 min at 95 ℃ for denaturation reaction, and 40 cycles (95 ℃ for 15 s and 60 ℃ for 1 min). At the end of each cycle, the fluorescence intensities of VIC and FAM were measured at the end of each cycle. The fluorescence intensity results were analyzed using the StepOne software V.2.3 (Applied Biosystems, CA, USA)47.

Sequence variations showing amino acid changes within the candidate genes were detected by mapping resequencing data of Danbaek, Kwangan and Daepung against soybean reference genomic sequence21. To confirm the sequences variation, diverse soybean germplasm, totally 31 soybean germplasm, including 25 parents for the Korean soybean nested association mapping populations, were used (Table 3)48. Out of 31 germplasm, two Korean soybean cultivars, Danbaek and Kwangan, present sensitivity to PE and the other 29 germplasms were insensitive to PE.

Table 3 Information of 31 soybean germplasms used for sequence comparison.

Protein structure prediction

The protein structure of the candidate gene was obtained from the AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/)37. The previously identified nonsynonymous substitution was generated using the mutation function in Swiss PDB viewer 4.1.0 and energy minimization was conducted to obtain predicted protein structure in Danbaek. Both the predicted protein structures of Daepung and Danbaek were visualized and compared using USCF Chimera X software49.

Ethical approval

The soybean cultivars ‘Danbaek’, ‘Kwangan’ and ‘Daepung’ were provided by National Agrobiodiversity Center in Jeonju, Korea. The soybean cultivar ‘5002T’ was provided by USDA Germplasm Resources Information Network (GRIN) database (https://www.ars-grin.gov). All the experiments carried out on plants in this study followed relevant institutional, national, and international guidelines and legislation.