Virus diseases constitute the most serious threat to soybean production in many tropical areas. The most common virus of soybean worldwide is Soybean mosaic virus (SMV), a member of the genus Potyvirus (Berger et al. 2005). It has a positive-sense, single-stranded RNA (ssRNA) genome of approximately 10 kb, with a genome-linked viral protein (VPg) covalently bound to the 5′ end and a poly(A) tail at the 3′ end (Riechmann et al. 1992). The SMV genome encodes one large polyprotein, which is cleaved by virus-encoded proteases to yield at least 10 mature proteins (Jayaram et al. 1992). SMV causes severe symptoms such as mosaic or necrosis in many soybean cultivars and is easily transmitted by aphids in the field, resulting in significant reductions in soybean yield and quality. The seed-borne nature of SMV poses a serious threat to soybean cultivation.

During 2013–14, seven soybean varieties (Bragg, DS2706, DS2708, DSb19, JS335, JS93-05 and SL958) screened under the All India Coordinated Research Project (AICRP) trial at the experimental farm of the ICAR Research Complex for NEH Region, Meghalaya, India, showed yellow mosaic and rugosity of leaves (Fig. 1a). The disease severity index (DSI) was estimated following a standard rating scale (supplementary Table 1). Mean DSI across the seven varieties ranged from 1.85 to 44.45 %. The lowest mean DSI (1.85 %) was recorded for DS2706 and the highest (44.45 %) for DS2708, followed by DSb19 (43.71 %). Symptomatic leaf samples were collected from the field and examined under an electron microscope (EM) by the leaf-dip method using 2 % aqueous uranyl acetate (UA). EM analysis revealed the presence of flexuous filamentous virus particles (Fig. 1b), indicating a possible Potyvirus (Berger et al. 2005) infection in soybean. Therefore, an attempt was made to identify and characterize the virus species using a reverse transcription-polymerase chain reaction (RT-PCR) based method.
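The rating scale itself is given only in supplementary Table 1, so the exact DSI formula used here is not shown. As an illustration, a widely used form of the index divides the sum of per-plant ratings by the maximum possible sum and expresses the result as a percentage. The sketch below assumes that form; the function name, the 0–5 scale and the plant scores are hypothetical and are not taken from the study.

```python
def disease_severity_index(ratings, max_rating):
    """Return DSI (%) as sum(ratings) / (n_plants * max_rating) * 100.

    Assumption: each plant receives one integer score between 0 (healthy)
    and max_rating (most severe). The scale used in the study is in its
    supplementary Table 1 and may differ from this one.
    """
    if not ratings:
        raise ValueError("at least one plant rating is required")
    if max_rating <= 0:
        raise ValueError("max_rating must be positive")
    if any(r < 0 or r > max_rating for r in ratings):
        raise ValueError("ratings must lie between 0 and max_rating")
    return 100.0 * sum(ratings) / (len(ratings) * max_rating)


# Hypothetical plot: 10 plants scored on an assumed 0-5 scale.
example_ratings = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
print(disease_severity_index(example_ratings, max_rating=5))  # 40.0
```

A mean DSI per variety, as reported above, would then be the average of such values over replicate plots.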

Fig. 1

a Mosaic symptom in naturally infected soybean leaves along with non-symptomatic healthy soybean leaves. b Transmission electron micrograph of flexuous filamentous Potyvirus particle in infected leaf tissue. c RT-PCR detection of Potyvirus using degenerate primers, viz. CIFor/CIRev (M = 1 kb DNA ladder; lane 1 = template from non-symptomatic plant; lanes 2, 3, 5, 6 = template from symptomatic plant; lane 4 = negative control) and POT1/POT2 (M = 1 kb DNA ladder; lanes 1, 2, 3 = template from symptomatic plant; lane 4 = template from non-symptomatic plant; lane 5 = negative control)

Total RNA was extracted from both symptomatic and symptomless leaf samples using the RNeasy Plant Mini Kit (Qiagen, Valencia, CA), and complementary DNA (cDNA) was synthesized (RevertAid, Fermentas, India). PCR was carried out using two sets of Potyvirus-specific degenerate primers, viz., CIFor/CIRev (Ha et al. 2008) and POT1/POT2 (Colinet et al. 1998). The former set was reported to amplify a ~700 bp region of the cylindrical inclusion protein (CI) domain (Ha et al. 2008), and the latter set was designed to amplify a ~1,300 bp region encompassing the partial nuclear inclusion protein and coat protein (NIb-CP) domain (Colinet et al. 1998) of the Potyvirus open reading frame (ORF). All the symptomatic leaf samples showed virus-specific amplification of ~700 bp and ~1,300 bp in the RT-PCR assay with CIFor/CIRev and POT1/POT2, respectively (Fig. 1c).

The RT-PCR amplicons of both primer sets from two samples (varieties DSb19 and DS2708) were gel purified (GeneJET, Fermentas, India) and each fragment was sequenced bi-directionally (Chromous Biotech, Bangalore, India). The partial sequences were assembled and submitted to the National Center for Biotechnology Information (NCBI) GenBank as partial CI protein (KJ001224-KJ001225) and partial polyprotein for NIb and CP (KJ001226-KJ001227) for isolates DSb19 and DS2708, respectively. The partial sequences of the CI (674 bp) and NIb-CP (798 bp) domains of the two isolates from Meghalaya (DSb19 and DS2708) shared 99.6-100.0 % identity at both the nucleotide and amino acid levels. Thus, the two isolates belong to the same species, as their identity is well above the 85.0 % nucleotide sequence identity threshold proposed for differentiating species within Potyvirus (Berger et al. 2005). The initial BLAST analysis showed that the partial CI and NIb-CP domains of both isolates shared 99.0 % nucleotide identity with the previously reported SMV isolate Ar13 (KF135488) from Iran. Similarly, the corresponding proteins encoded by the partial CI (224 amino acids) and NIb-CP (265 amino acids) domains showed 99.0 % amino acid identity with the polyprotein encoded by the same isolate (protein id AGP03223). Further, the new isolates were compared with published sequences of SMV and other potyviruses affecting crops of the family Fabaceae, viz., Bean common mosaic virus (BCMV), Bean common mosaic necrosis virus (BCMNV), Bean yellow mosaic virus (BYMV) and Cowpea aphid-borne mosaic virus (CABMV). For SMV, one representative isolate from each location available in the NCBI database was considered. For the partial CI domain, the pairwise alignments (CLUSTAL W, DNASTAR Inc. 7.1, USA) showed nucleotide sequence identities of 89.0–97.8 % with SMV and 64.1–75.2 % with other potyviruses. The corresponding values at the amino acid level were 97.3–99.6 % and 63.8–87.1 %, respectively.
Similarly, for the partial NIb-CP domain, the new isolates shared nucleotide and amino acid identities of 91.7–99.0 % and 97.0–99.2 %, respectively, with SMV isolates, but only 59.6–67.0 % and 49.6–72.8 %, respectively, with other potyviruses. In the nucleotide-based neighbor-joining (NJ) phylogeny for the partial CI and NIb-CP domains, the new isolates (DS2708 and DSb19) grouped with reported isolates of SMV (Fig. 2). A similar clustering pattern was also observed at the amino acid level (data not shown). Thus, SMV was identified as the causal agent of mosaic disease in soybean grown under the mid-hill conditions of Meghalaya, India. Finally, we revalidated our findings by screening the infected samples with the SMV CP-specific primers SMV-CPf/SMV-CPr (Wang and Ghabrial 2002). The infected samples gave a specific amplicon of ~460 bp (Fig. 3a), consistent with the previous report (Wang and Ghabrial 2002). The seeds harvested from the infected plants showed a seed coat mottling symptom termed ‘bleeding hilum’ (Fig. 3b). SMV has been reported as a contributing factor to crop loss by causing seed coat mottling (Kennedy and Cooper 1967).
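The identity comparisons described above reduce to counting matching positions between two aligned sequences and comparing the percentage with a demarcation threshold. The following is a minimal sketch of that arithmetic, not the DNASTAR procedure used in the study: the function names and toy sequences are hypothetical, gapped columns are simply skipped (other tools may count them differently), and the 85.0 % cut-off is the value quoted in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def same_species(identity_pct, threshold=85.0):
    """True when identity meets the threshold quoted in the text (85.0 %)."""
    return identity_pct >= threshold


# Toy aligned fragments (hypothetical, not the KJ001224-KJ001227 sequences).
seq_a = "ATGGCTAGCTAGGCTA"
seq_b = "ATGGCTAGCTAGGCTT"
identity = percent_identity(seq_a, seq_b)
print(identity, same_species(identity))  # 93.75 True
```

Applied to the ranges reported here, values of 89.0–99.0 % against SMV isolates would pass such a check, whereas the 59.6–75.2 % nucleotide identities against BCMV, BCMNV, BYMV and CABMV would not.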

Fig. 2

Phylogenetic tree based on nucleotide sequences of CI domain (a) and NIb-CP domain (b) of two newly sequenced isolates of SMV (in box) from India along with reported SMV and other potyviruses. The analyses were conducted in MEGA6 using Neighbor-Joining method. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates) are shown next to the branches (shown only when >50 %). The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method and are in the units of the number of base substitutions per site. Each sequence is labelled with the GenBank accession number followed by virus name, origin and isolate name. Virus acronyms: SMV soybean mosaic virus, BCMV bean common mosaic virus, BCMNV bean common mosaic necrosis virus, BYMV bean yellow mosaic virus, CABMV cowpea aphid-borne mosaic virus

Fig. 3

a Revalidation of SMV infection through RT-PCR using SMV CP-specific primers (M = 1 kb DNA ladder; lanes 1, 2, 3, 4 = template from symptomatic plant; lane 5 = template from non-symptomatic plant; lane 6 = negative control). b Seed mottling symptom on seeds harvested from SMV-infected plant

The molecular characteristics of SMV have been studied and reported from several North American and East Asian countries, including the USA, Canada, Korea, Japan and China (Seo et al. 2009). In India, the occurrence of SMV has been reported from the plains, viz., New Delhi (Nariani and Pingaley 1960), Uttar Pradesh (Singh et al. 1976) and Karnataka (Naik and Murthy 1997), from 1960 onwards, mainly on the basis of host range, transmission and physical properties. To date, SMV has not been characterized at the molecular level in India. This study reports, for the first time, the partial characterization of SMV from India based on the CI and NIb-CP domains. Plants grown from SMV-infected soybean seeds provide the primary inoculum source, so SMV may have been introduced into North-east India through seeds. To the best of our knowledge, this is the first molecular evidence of SMV infection in soybean from India.