Introduction

Rice (Oryza sativa L.) is one of the most popular staple foods, consumed by more than half of the world’s population, particularly in Asia. As in many South Asian countries, rice is widely consumed in India in cooked, puffed and pounded forms. Aromatic rice, a special group of rice, is highly regarded for its excellent aroma and superior grain quality. The Indian subcontinent is home to Basmati rice, which is popular among consumers as a superior, scented, long slender grain rice and fetches a premium price in national and international markets. Along with the popular Basmati genotypes, a handful of traditional heirloom rice genotypes also possess excellent aroma, making many of these unrecognized landraces an attractive low-cost alternative to high-priced Basmati rice. West Bengal holds a leading position in rice production in India and has considerable diversity in cultivated folk rice genotypes. Two such cultivars of West Bengal, ‘Tulaipanji’ and ‘Gobindabhog’, have already received Geographical Indication (GI) tags, have gained considerable attention in the international market for their use in multinational cuisines and were featured in global sporting events1. The sub-Himalayan Terai region of West Bengal harbors wide variability in rice cultivation and is identified as a hotspot for growing non-Basmati aromatic rice2. The rich agrarian heritage of the ‘Rajbonshis’, the ancient tribal residents of this region, has played a predominant role in conserving its rice agrobiodiversity. The abundant monsoon and the large number of rivers in this region have facilitated the formation of the Buri Balason, Teesta-Dharala, Kranti-Golabari and Mekhliganj rice bowls, among others, where aromatic landraces of rice, locally known as ‘bhogdhan’, have been cultivated since time immemorial3,4.
Besides aroma, these indigenous rice landraces, collectively regarded as folk rice cultivars, offer genetic and nutritional diversity, sustainability, reduced carbon footprint and imports, tolerance to many diseases and pests owing to a broader gene pool, and adaptability to local soil and climatic conditions5. Since the Green Revolution, the introduction of High-Yielding Varieties (HYV) in rice cultivation has led to the gradual disappearance of folk rice cultivars from farmers’ fields6,7,8, so that only a handful of these genotypes survive9,10,11,12,13. Although replacing these aromatic landraces with high-yielding modern varieties ensures good yield, it poses a colossal threat to rice agrobiodiversity and causes permanent damage to the rice ecosystems of different states of India14.

Aromatic landraces have lower yield potential than HYVs but offer considerable variation in agronomic, phenotypic, nutritional and disease tolerance traits. In spite of their low yield potential, rice landraces have proven valuable as sources of resistance to rice blast disease in the past15,16. The hot and humid climate of the sub-Himalayan Terai region favors the spread of blast disease, caused by the fungus Magnaporthe oryzae, one of the most dreaded diseases in all rice-growing regions of the world17. Profuse leaf blast is very common in this region and causes significant yield loss. The heirloom genotypes of this region have not been properly screened to date for the occurrence and severity of blast disease, and their sources of resistance have not been investigated for gene specificity. The blast pathogen is believed to follow the gene-for-gene hypothesis17,18. More than one hundred blast-resistance (R) genes and around 350 QTLs conferring resistance to blast disease have been reported15,19, many of which have been cloned and biochemically characterized. To our knowledge, there are no studies on the presence of blast-resistance (R) genes in these landraces. Mining these R genes is needed to combat region-specific blast strains, because the long co-evolution of these genetically diverse landraces with the blast pathogen may have endowed these genotypes with tolerance to specific strains of the pathogen.

In the present study, we analyzed the genetic diversity of locally grown aromatic landraces and determined the nutritive potential of these heirloom genotypes. The genotypes were also screened for resistance or tolerance to blast disease, and suitable aromatic landraces were identified for use in future breeding programs. Allele mining for known sources of resistance was performed in these cultivars. Our results show, for the first time, the prospects of these scented landraces in terms of nutritional quality and blast disease resistance.

Materials and methods

Plant materials

The indigenous cultivars were collected from different parts of the sub-Himalayan region of West Bengal and neighboring states. These genotypes are maintained in the Uttar Banga Krishi Viswavidyalaya rice repository after purity breeding. For morphological characterization and blast disease screening, the genotypes were evaluated for two consecutive years (kharif seasons of 2018 and 2019) in the field of the Uttar Banga Krishi Viswavidyalaya Instructional Farm. IR64 was used as the blast-resistant check and ‘Gotra bidhan’ as the local blast-susceptible check.

Field practices

The experimental field was divided into seventy-four plots of 5 m × 3 m, with 1.5 ft spacing between plots and a 3 ft footway around the field. After a seed viability test with 1% brine water, seeds were treated with tricyclazole (2 g per kg of seed) and sown in seedbeds. Twenty-five-day-old seedlings were transplanted to the main field with plant-to-plant spacing of 25 cm and line-to-line spacing of 50 cm. Fertilizer was applied at N:P2O5:K2O = 70:50:40 kg/hectare (w:w). Half of the N was applied as a basal dose; the remaining half was split and applied 30 days after transplanting (DAT) and before the flowering stage. During the experiment, daily weather reports covering rainfall (mm), maximum and minimum temperature (°C) and maximum and minimum relative humidity (%) were retrieved from the Integrated Agromet Advisory Services (Gramin Krishi Mousam Seva), UBKV, Coochbehar.

Plant phenotypic parameters

All phenotypic parameters were evaluated in two replications each year. Plant height, tiller number per plant, panicle number per plant, panicle length and leaf angles were measured after panicle maturation, whereas hundred-seed weight, filled grains per panicle, grain length, grain width and yield parameters were recorded after harvest and drying of the seed material. In each case, the mean value over both years was calculated as an average of at least thirty plants taken from the two replicate blocks. The upper leaf and lower leaf of every effective tiller were identified for measurement of leaf angle. The first leaf after the flag leaf of the tiller was considered the upper leaf, whereas the 3rd or 4th leaf of an effective tiller was considered the middle leaf. In each case, effective tillers of at least 15 plants were measured with a protractor at the panicle maturation stage, and average leaf angles were calculated.

Disease scoring parameters

The disease-related parameters included blast disease score, lesion number per leaf, lesion size, lesion type and sporulation center. Disease scoring was done at 15-day intervals starting from the 3rd of August each year. Symptoms were scored following the standardized disease scoring scale defined by IRRI20,21. The minimum score of ‘0’ indicates no disease, and the maximum score of ‘9’ indicates severe symptoms, with coalesced eye-shaped spots on the leaf surface leading to complete drying of the leaves (Fig. 1). Percentage Disease Index (PDI) values were calculated at 15-day intervals for 2 months after transplanting from the nursery to the main field, and Area Under Disease Progress Curve (AUDPC) values were then calculated from the PDI values. Leaves with distinct disease symptoms were used to count lesion number for each genotype in upper, middle and lower leaves. The typical spindle-shaped brown lesions were counted 60 days after transplanting. Average lesion number was calculated from at least 50 such leaves in each case. Lesion area was calculated from the length and breadth of the lesion, measured with a millimeter scale. Lesion type was determined visually: initial undefined yellow chlorotic lesions were scored as one, whereas fully matured brown necrotic lesions with a white or gray center and a brown margin were scored as two. Average values for lesion type were calculated from at least 50 leaves for each genotype. Sporulation center was scored as present (one) or absent (zero) based on brown to black necrotic centers in the spindle-shaped scars.

Figure 1

Pictorial representation of blast symptoms in the scale of 1 to 9 on rice leaves as evaluated for blast disease scoring.
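The PDI and AUDPC calculations described in this section can be sketched as follows. The text does not state the formulas used, so this sketch assumes the commonly used forms: PDI as the sum of leaf scores divided by (number of leaves × maximum grade), expressed in percent, and AUDPC by the trapezoidal rule over the scoring dates. The function names and the example scores are hypothetical.

```python
def percent_disease_index(scores, max_grade=9):
    """PDI (%) from per-leaf scores on a 0..max_grade scale (assumed formula)."""
    if not scores:
        raise ValueError("at least one score is required")
    return 100.0 * sum(scores) / (len(scores) * max_grade)


def audpc(days, pdi_values):
    """Area under the disease progress curve by the trapezoidal rule."""
    if len(days) != len(pdi_values) or len(days) < 2:
        raise ValueError("need two or more matching observations")
    area = 0.0
    for i in range(len(days) - 1):
        mean_pdi = (pdi_values[i] + pdi_values[i + 1]) / 2.0
        area += mean_pdi * (days[i + 1] - days[i])
    return area


# Hypothetical leaf scores for one genotype at four scoring dates, 15 days apart.
scoring_days = [0, 15, 30, 45]
leaf_scores = [[0, 1, 1, 0], [1, 3, 3, 1], [3, 5, 5, 3], [5, 7, 7, 5]]
pdi_series = [percent_disease_index(s) for s in leaf_scores]
print([round(p, 2) for p in pdi_series], round(audpc(scoring_days, pdi_series), 2))
```

A genotype whose PDI stays low across all scoring dates yields a small AUDPC, which is how the tolerant and susceptible groups are separated later in the text.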

In vitro disease progression assay

Infected leaf samples were collected from the field, and the fungus was purified by the single-spore isolation method on water agar medium and cultured on potato dextrose agar medium. The fungal DNA was isolated and sequenced, and the ITS sequences were analyzed by NCBI-BLAST for confirmation. Leaf cuttings of different genotypes were inoculated with the confirmed, purified Magnaporthe culture for the in vitro disease progression assay. Disease progression was measured by the appearance of symptoms in time-lapse photography and by subsequent microscopic studies.

Nutritional parameters

The grains of selected aromatic genotypes (Fig. 2) were harvested, cleaned, dehusked and powdered. These powdered samples were used for estimation of all biochemical parameters.

Figure 2

Paddy grain of the 35 aromatic landraces grown across sub-Himalayan Terai region of India. The low land region of north eastern India in the outer foothills of Himalaya and the north of the Indo-Gangetic plains is defined as the sub-Himalayan Terai region.

Carbohydrate parameters

Total Soluble Sugars (TSS) and Starch (STA) content of the rice powder were estimated by the anthrone method22. The final absorbance was measured at 630 nm, and the concentrations of STA and TSS were calculated from a standard reference plot of glucose (10–100 µg). Reducing and Non-Reducing Sugars (RES and NRS) were estimated by the DNS method22, with absorbance measured at 510 nm and concentrations calculated from a standard reference plot of glucose (20–100 µg). Amylose content (AMY) was estimated by the iodine-colorimetric method22, with absorbance measured at 590 nm; the amylose concentration (%) was calculated from a standard reference plot of pure potato amylose (200–1000 µg). Amylopectin content (%) was determined by subtracting the amylose content from the total starch content.
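Reading a concentration off a standard reference plot, as described above, amounts to fitting a line through the standards and inverting it. The following is a minimal sketch that assumes a linear absorbance response over the standard range; the function names and the absorbance readings are hypothetical and are not taken from the study.

```python
def fit_standard_curve(amounts_ug, absorbances):
    """Least-squares line: absorbance = slope * amount + intercept."""
    n = len(amounts_ug)
    if n < 2 or n != len(absorbances):
        raise ValueError("need two or more matching standards")
    mean_x = sum(amounts_ug) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts_ug)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts_ug, absorbances))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to recover the amount (µg) in a sample."""
    return (absorbance - intercept) / slope


def amylopectin_percent(starch_percent, amylose_percent):
    """Amylopectin obtained by difference, as stated in the text."""
    return starch_percent - amylose_percent


# Hypothetical glucose standards (µg) and their absorbance readings.
glucose_ug = [10, 25, 50, 75, 100]
readings = [0.10, 0.22, 0.42, 0.62, 0.82]
slope, intercept = fit_standard_curve(glucose_ug, readings)
print(round(amount_from_absorbance(0.50, slope, intercept), 2))
```

Sample readings that fall outside the range of the standards should be diluted and re-measured rather than extrapolated.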

Protein content

The protein content of the rice powder was estimated by the Bradford method23. The absorbance was recorded at 595 nm, and the protein percentage was calculated from a reference plot of Bovine Serum Albumin (20–100 µg).

Resistant starch and glycemic index

Resistant Starch (RS) of the rice powder was estimated using a Megazyme kit24. The absorbance was measured at 510 nm. Amyloglucosidase was used as a standard enzyme. Resistant starch content was calculated using the formula RS = ΔE × F/W × 9.27, where ΔE = absorbance and F/W = conversion of absorbance to micrograms (100 µg). The Glycemic Index (GI) of the rice powder was estimated in vitro following Kumar et al.24. The absorbance was measured at 510 nm. Maltose (200 mg) was used as the standard carbohydrate. Average values were used to plot curves, and the area under the curve (AUC) was computed. The Hydrolysis Index (HI) for each rice variety was calculated by dividing the AUC of the sample by that of maltose and expressed as a percentage. The predicted Glycemic Index (PGI) was calculated as PGI = 39.71 + (0.549 × HI).
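The chain from hydrolysis curve to predicted glycemic index can be illustrated with a short sketch. It uses the PGI equation given above and assumes that the AUC is obtained by the trapezoidal rule, which the text does not specify. The sampling times and hydrolysis values are hypothetical.

```python
def area_under_curve(times_min, hydrolysis_percent):
    """Trapezoidal area under a hydrolysis curve (assumed integration rule)."""
    if len(times_min) != len(hydrolysis_percent) or len(times_min) < 2:
        raise ValueError("need two or more matching time points")
    pairs = list(zip(times_min, hydrolysis_percent))
    return sum(
        (y0 + y1) / 2.0 * (t1 - t0)
        for (t0, y0), (t1, y1) in zip(pairs, pairs[1:])
    )


def hydrolysis_index(auc_sample, auc_reference):
    """HI (%) = AUC of the sample relative to the reference carbohydrate."""
    return 100.0 * auc_sample / auc_reference


def predicted_glycemic_index(hi):
    """PGI = 39.71 + 0.549 * HI, as given in the text."""
    return 39.71 + 0.549 * hi


# Hypothetical hydrolysis curves (% hydrolysed) for maltose and one sample.
times = [0, 30, 60, 90, 120, 180]
maltose_curve = [0, 40, 60, 72, 80, 88]
sample_curve = [0, 15, 25, 32, 36, 40]
hi = hydrolysis_index(
    area_under_curve(times, sample_curve),
    area_under_curve(times, maltose_curve),
)
print(round(hi, 2), round(predicted_glycemic_index(hi), 2))
```

Because the intercept of the PGI equation is 39.71, no sample can be assigned a predicted GI below that value under this model.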

Antioxidant activity

The antioxidant activity of the powdered rice was estimated by DPPH method25. A stock solution of DPPH (2, 2-diphenyl-1-picrylhydrazyl) was made by dissolving 24 mg in 100 ml methanol and working solution was prepared by diluting 10 ml of stock solution with 45 ml of methanol. The absorbance was recorded at 515 nm using UV–VIS spectrophotometer, along with control (Methanol—0.5 ml + DPPH—2.5 ml). The percentage of inhibition was expressed by putting the values into the formula:

$$\text{Percentage of inhibition } (\%) = \frac{\left( \text{Absorbance of the control} - \text{Absorbance of the test sample} \right) \times 100}{\text{Absorbance of the control}}$$

The percentage of DPPH scavenging potential was plotted against the concentration of samples. The concentration of the sample necessary to decrease the DPPH concentration by 50% was obtained by interpolation from linear regression curve and denoted as IC50 value (μg/ml).
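The inhibition formula and the IC50 interpolation described above can be sketched as follows, assuming an ordinary least-squares line through the inhibition values. The control absorbance, sample absorbances and concentrations are hypothetical.

```python
def percent_inhibition(abs_control, abs_sample):
    """DPPH inhibition (%) from control and sample absorbance at 515 nm."""
    return (abs_control - abs_sample) * 100.0 / abs_control


def ic50_from_linear_fit(concentrations, inhibitions):
    """Concentration giving 50% inhibition, from a least-squares line."""
    n = len(concentrations)
    if n < 2 or n != len(inhibitions):
        raise ValueError("need two or more matching points")
    mean_x = sum(concentrations) / n
    mean_y = sum(inhibitions) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, inhibitions))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope == 0:
        raise ValueError("inhibition does not change with concentration")
    return (50.0 - intercept) / slope


# Hypothetical readings: one control and five sample concentrations (µg/ml).
control_absorbance = 0.80
concentrations = [10, 20, 40, 60, 80]
sample_absorbances = [0.72, 0.64, 0.48, 0.32, 0.16]
inhibitions = [percent_inhibition(control_absorbance, a) for a in sample_absorbances]
print(round(ic50_from_linear_fit(concentrations, inhibitions), 2))
```

A lower IC50 means that less sample is needed to scavenge half of the DPPH, so it indicates stronger antioxidant activity.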

Aroma

The sensory test of rice aroma was performed by adding 1.7% KOH solution to the rice powder, following the method described by Hien et al.26. Based on aroma, the genotypes were ranked as (1) mild, (2) medium or (3) strong. Only 10 samples were evaluated on a given day, as handling more may introduce bias.

Genomic DNA isolation and genetic diversity analysis

Seeds of the selected genotypes were germinated in water-soaked Petri plates under controlled conditions. Genomic DNA was isolated from young leaf tissue following the CTAB (cetyltrimethylammonium bromide) method with minor modifications. The quality of the purified genomic DNA was assessed by 0.8% agarose gel electrophoresis. The DNA samples were then diluted with nuclease-free water to a concentration of 20 ng/μl and subjected to amplification using SSR or ISSR markers27,28 and the required PCR cocktail. Amplified alleles resolved on agarose gels (1.5–3%) were scored as a binary matrix, where the presence or absence of an allele was denoted as 1 and 0, respectively. Scoring was based on distinct, unambiguous and well-resolved bands. Parameters such as heterozygosity index (H), polymorphic information content (PIC), resolving power and discriminating power were calculated using the iMEC server (https://irscope.shinyapps.io/iMEC/)29. UPGMA-based genetic clustering was performed with NTSYS-PC version 1.8030.
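How H and PIC follow from a binary score matrix can be illustrated with a short sketch. It assumes the two-state forms often used for presence/absence scoring, H = 1 − (p² + q²) and PIC = 1 − (p² + q²) − 2p²q², where p is the band frequency and q = 1 − p. This is an illustration only and does not reproduce the iMEC server's implementation. The marker names and scores are hypothetical.

```python
def band_frequency(scores):
    """Fraction of genotypes in which the band is present (scored 1)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


def heterozygosity_index(p):
    """H = 1 - (p^2 + q^2) for a two-state marker (assumed form)."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q)


def polymorphic_information_content(p):
    """PIC = 1 - (p^2 + q^2) - 2 p^2 q^2 for a two-state marker (assumed form)."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


# Hypothetical binary scores (1 = band present) across eight genotypes.
band_scores = {
    "marker_A": [1, 0, 1, 0, 1, 0, 1, 0],
    "marker_B": [1, 1, 1, 1, 1, 1, 1, 0],
}
for name, scores in band_scores.items():
    p = band_frequency(scores)
    print(name, round(heterozygosity_index(p), 3), round(polymorphic_information_content(p), 3))
```

Under these forms, PIC cannot exceed 0.375 and H cannot exceed 0.5, both reached when the band is present in exactly half of the genotypes.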

Allele mining for blast genes

Molecular screening of the genotypes was performed for the presence of major blast resistance genes, and the genotypes were differentiated on the basis of the presence or absence of these genes. PCR amplification was carried out in a 20 μl reaction volume containing 2 μl template DNA, (conc.) of Master Mix containing dNTP, MgCl2, Taq buffer and Taq DNA polymerase, forward and reverse primers and water. For scoring the marker genes, the amplified PCR products were separated by electrophoresis in 1.5% agarose gels stained with ethidium bromide alongside a DNA ladder (NEB). All PCR reactions were carried out twice for reproducibility.

Statistical analysis

Principal component analysis (PCA) and clustering

PCA was carried out to identify the significant biochemical and disease-related parameters. Based on the PCA results, bi-plot analysis was carried out to assess the influence of the biochemical and phenotypic attributes on the different genotypes. The k-means non-hierarchical clustering algorithm was used to group the rice genotypes based on biochemical and phenotypic attributes, and the number of clusters was determined using the gap statistic method. PCA and k-means clustering were carried out using R software version 3.5.1, Patched (2018-07-02 r74950) Platform: x86_64-w64-mingw32/x64 (64-bit)31. Correlation analysis between nutritional parameters and disease attributes was performed using SAS, version 9.332.
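The grouping step can be illustrated with a minimal k-means sketch. The study itself used R, and this Python version is not the authors' code. It assumes that the traits are standardized before clustering, takes the number of clusters as given (the gap statistic is not implemented) and starts from fixed centers for reproducibility. The trait values are hypothetical.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(c) / n for c in columns]
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
        for c, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centers, max_iter=100):
    """Lloyd's algorithm: assign points to the nearest center, then update centers."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical (resistant starch %, glycemic index %) values for six genotypes.
traits = [[2.2, 46.0], [2.1, 55.0], [2.0, 54.0], [0.8, 68.0], [0.7, 70.0], [0.9, 66.0]]
scaled = standardize(traits)
labels, centers = k_means(scaled, initial_centers=[scaled[0], scaled[3]])
print(labels)
```

Standardizing first keeps a trait measured on a large scale, such as GI, from dominating the distance over a trait measured on a small scale, such as RS.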

Logistic regression analysis

Logistic regression modeling was carried out using the IBM Statistical Package for the Social Sciences (SPSS trial version 2033), with only the significant phenotypic characters (except AUDPC) as explanatory variables and the presence or absence of blast disease as the response variable. If X1, X2,…Xp are p explanatory variables and Y is the binary response variable taking the values 0 and 1 for absence and presence of a particular attribute, respectively, then the functional form of the binary logistic regression equation is

$$\pi = P\left( Y = 1 \right) = \frac{1}{1 + e^{-z}} + \epsilon$$

where π is the probability of occurrence of the event under consideration and \(z = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \ldots + \beta_{p} X_{p}\). Here, \(\epsilon\) is the error term. The parameters of the equation are generally estimated through an iterative maximum likelihood estimation procedure. The goodness of fit of the model was determined by the Hosmer and Lemeshow goodness-of-fit test.
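Once the coefficients have been estimated, the fitted model maps a set of explanatory values to a probability. The sketch below evaluates only the deterministic part of the equation above and does not perform the maximum likelihood fit. The coefficients and predictor values are hypothetical.

```python
import math


def logistic_probability(intercept, coefficients, predictors):
    """P(Y = 1) = 1 / (1 + exp(-z)), with z = b0 + sum(b_i * x_i)."""
    if len(coefficients) != len(predictors):
        raise ValueError("one coefficient is required per predictor")
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted coefficients for two disease-related predictors.
b0 = -3.0
betas = [0.25, 1.5]
few_lesions = logistic_probability(b0, betas, [2.0, 0.0])
many_lesions = logistic_probability(b0, betas, [12.0, 1.0])
print(round(few_lesions, 3), round(many_lesions, 3))
```

A probability above a chosen cut-off, commonly 0.5, would classify the genotype as diseased.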

Kendall's tau-b correlation coefficient (τb)

Kendall's tau-b correlation coefficient (τb), a nonparametric measure of association between two qualitative variables, was calculated using IBM-SPSS (SPSS trial version 2033). It was calculated between the presence of blast resistance genes and tolerance to blast disease in the different genotypes, where tolerance or susceptibility was assigned based on the clustering of the genotypes and their AUDPC values.
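For reference, τb can be computed directly from its standard definition, τb = (C − D) / √((n0 − n1)(n0 − n2)), where C and D are the numbers of concordant and discordant pairs, n0 = n(n − 1)/2, and n1 and n2 count the tied pairs in each variable. The sketch below is a plain implementation of that definition, not the SPSS routine. The gene and tolerance codes are hypothetical.

```python
import math
from collections import Counter


def kendall_tau_b(x, y):
    """Kendall's tau-b with the standard correction for ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    n0 = n * (n - 1) // 2
    ties_x = sum(t * (t - 1) // 2 for t in Counter(x).values())
    ties_y = sum(t * (t - 1) // 2 for t in Counter(y).values())
    denominator = math.sqrt((n0 - ties_x) * (n0 - ties_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical coding: 1 = gene present / tolerant, 0 = gene absent / susceptible.
gene_present = [1, 1, 1, 0, 0, 1, 0, 0]
tolerant = [1, 1, 0, 0, 0, 1, 1, 0]
print(round(kendall_tau_b(gene_present, tolerant), 3))
```

With two binary variables, as here, most pairs are tied on at least one variable, which is why the tie-corrected tau-b is used rather than tau-a.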

Results and discussion

Field attributes of indigenous farmers’ varieties collected from sub-Himalayan Terai region

A total of ninety-nine genotypes collected from different parts of the sub-Himalayan Terai region were subjected to purity breeding for at least five years, and the pure lines were deposited in the Uttar Banga Krishi Viswavidyalaya (UBKV) rice repository. The performance of the genotypes for morphological and yield traits was assessed in the experimental plots. The majority of these genotypes are long-duration, photosensitive and tall and have low yield potential (Table 1). The grains of these genotypes are predominantly long or short bold, and very few genotypes have long to medium slender grains. Among the ninety-nine genotypes, twenty-seven were found to have mild to strong aroma, of which only a few exhibited very strong aroma. ‘Tulaipanji’, ‘Gobindabhog’, ‘Kalonunia’, ‘Radhunipagol’, ‘Konkanijoha’, ‘Chakhao’ etc. are traditionally important cultivars in northern Bengal as well as in different states of north eastern India. ‘Kalonunia’ and ‘Tulaipanji’ in particular are of great traditional value in the Terai and Duars region and have excellent market potential. At UBKV, eight photo-insensitive lines of these two cultivars have been developed with the fragrance kept intact, using mutation breeding and selection programs34. Since aroma is the most important criterion for consumer preference of these genotypes, only thirty-five aromatic genotypes (twenty-seven farmers’ genotypes and eight UBKV-developed lines derived from them) out of a total of one hundred and seven genotypes (Supplementary information 1a, 1b, Table 1) were chosen for further studies. A detailed description of the selected genotypes, with their origin, ecology, place of collection and yield potential, is given in Table 2.

Table 1 Mean performance of ninety-nine indigenous farmers’ varieties of rice (Oryza sativa L.) for yield and its attributes.
Table 2 Description of the selected traditionally important aromatic cultivars along with photo insensitive line developed from these aromatic cultivars.

Genetic diversity of indigenous aromatic cultivars using microsatellite markers

A total of forty-two microsatellite markers (thirty-two SSR and ten ISSR) were used to evaluate the genetic diversity of the selected aromatic genotypes (Table 3)27,28. A total of 45 alleles were detected using SSR markers and 30 alleles using purine-rich ISSR markers. Except for RM469, all the SSR markers were polymorphic. Percentage polymorphism varied from 14.29% for RM460 to 97.14% for RM108 and RM434. The Polymorphism Information Content (PIC) value varied from 0.052 (RM434) to 0.375 (RM288), with the highest heterozygosity index (5.00) (Table 4). RM1, RM23, RM38, RM250, RM314, RM3134, RM3872, RM6250 etc. exhibited good resolving and discriminatory power among the SSR markers. All the purine-rich ISSR markers exhibited very good discriminatory power and high PIC, ranging from 0.295 to 0.374, and proved more effective for diversity analysis. Genetic clustering based on the microsatellite marker analysis was performed using NTSYS software (Fig. 3, Supplementary information 2a, 2b). Clustering by the UPGMA method grouped the thirty-five accessions into three distinct clusters, with distance coefficients ranging from 0.45 to 0.85. Cluster I consisted of two subgroups. Subgroup I has five genotypes, ‘Tulaipanji’, ‘Kalshipa’, ‘Dubarikomal’, ‘Dharmaphou’ and ‘Dudheswar’; the majority of these are photosensitive, long-duration, tall and lodging-susceptible, with low yield potential (2–2.5 t/ha). Seven genotypes, ‘Radhatilak’, ‘Kalturey’, ‘Sadanunia’, ‘Tulsibhog’, ‘Kataribhog’, ‘A-1-1’ and ‘Rangakomal’, constitute subgroup II; the majority of these exhibit very low yield potential (1.5–2 t/ha). The second cluster comprises sixteen genotypes, of which four are black rice introduced from the north eastern state of Manipur; the remaining genotypes, ‘Kalonunia’, ‘Gobindabhog’, ‘Radhunipagol’, ‘Konkanijoha’, ‘Rampha’ etc., are known for their excellent aroma.
The selections from ‘Kalonunia’ with strong aroma are found in this cluster. The third cluster mainly comprises comparatively high-yielding photo-insensitive lines of ‘Tulaipanji’ and ‘Kalonunia’ along with two ‘Chakhao’ cultivars. The yield potential of this cluster varied from 3.5 to 4 t/ha.

Table 3 Details of the SSR and ISSR markers used in the study.
Table 4 Polymorphism information of the thirty-five genotypes generated using microsatellite markers.
Figure 3

UPGMA based genetic clustering of thirty-five aromatic landraces determined by polymorphism obtained from SSR and ISSR markers using NTSYS-PC version 1.8030 (http://www.exetersoftware.com/cat/ntsyspc/ntsyspc.html).

Evaluating genetic diversity to characterize these heirloom cultivars is necessary and relatively easy owing to the availability of a nearly saturated molecular map for rice27,28,35. SSRs are interspersed throughout the genome and are known as a mono-locus, co-dominant, highly informative and easy-to-analyze marker system, whereas ISSR markers rely on inter-tandem repeats of di-, tri-, tetra- or penta-nucleotides found at microsatellite loci and give an array of amplified products. Purine-rich ISSR markers have proved very useful in determining genetic relationships among diverse populations36,37,38. In the present study, we analyzed the genetic diversity of locally grown scented landraces using ISSR and SSR markers and clustered the cultivars based on the results. The marker analysis indicates that each of these cultivars is distinct from the others and that the population under study is genetically diverse. Notably, the marker-based clustering grouped the genotypes according to yield potential, which supports the accuracy and usefulness of the microsatellite marker system in determining the purity and phylogeny of the germplasm. This natural population of aromatic cultivars, with its wide genetic variability, may prove to be a good resource for quality traits and disease tolerance. These cultivars were further analyzed for grain quality attributes and tolerance to the destructive blast disease.

Analysis of nutritional variability in selected rice genotypes

Nutritional parameters including Total Soluble Sugar (TSS), Reducing Sugar (RES), Non-Reducing Sugar (NRS), Starch (STA), Amylose (AMY), Resistant Starch (RS), Glycemic Index (GI), Protein (PRO), Antioxidant properties (ANT) and Aroma (ARO) were measured for the thirty-five selected genotypes (Table 5). The TSS content of the genotypes varied from 88.89% (‘Chakhao Sampark’) to 37.44% (‘Tulsibhog’), whereas the average starch content ranged from 85.45% (‘Sadanunia’) to 45.86% (‘Konkanijoha’). This wide variability in carbohydrate content prompted us to measure the amylose content, as the ratio of amylose to amylopectin determines the stickiness and flakiness of rice after cooking. Higher amylose content was found in KNS-2-1-1 (31.27%), ‘Tulsibhog’ (29.10%) and ‘Kabra’ (28.77%), whereas low amylose was found in genotypes such as the ‘Chakhao’ selections, ‘Kalojeera’, ‘Konkanijoha’, ‘Gobindabhog’ and ‘Kalturey’. Good resistant starch (RS) content was found in ‘Kataribhog’ (2.25%), ‘Chakhao Sampark’ (2.22%) and KNS-2-1-1 (2.11%). Although most of the genotypes showed an average GI value of more than 60%, a few cultivars such as ‘Dubarikomal’ (54.77%), ‘A-1-1’ (58.67%), ‘KNS-2B-S1’ (55.08%) and ‘TPUR-B-1-IET 28104’ (55.63%) exhibited an encouragingly low glycemic index. Powdered ‘Kataribhog’ grains had a remarkably low GI of 45.72% in the initial analysis, which is even lower than the GI of rice varieties recommended for diabetic patients39. The protein content of the genotypes varied from 4.11% in ‘Rampha’ to 9.47% in ‘Dharmaphou’. When total antioxidant activity was evaluated in powdered grains, all the genotypes with black pericarp (‘Chakhao sel-2’, ‘Chakhao sel-3’, ‘Chakhao Sampark’, ‘Chakhao Poiterin’ and ‘Kalshipa’) exhibited the highest antioxidant potential.
Apart from the black genotypes, four indigenous cultivars, ‘Sadanunia’, ‘Kabra’, ‘Kalturey’ and ‘Rangakomal’, exhibited very high antioxidant activity comparable to that of the black varieties. All the varieties exhibited good to very good aroma.

Table 5 Variation in different nutritional attributes of selected indigenous aromatic rice genotypes.

Based on the nutritional parameters, the genotypes were clustered statistically. Resistant Starch (RS), Glycemic Index (GI), Non-Reducing Sugar (NRS) and Total Soluble Sugar (TSS) were found to contribute to the variability of the cultivars (Fig. 4A, Supplementary information 3a). Both the cluster plot and Principal Component Analysis divided the genotypes into four distinct clusters (Fig. 4B). The first cluster, with genotypes such as ‘Kataribhog’, ‘Radhatilak’, ‘Radhunipagol’, ‘Dudhkalam’, ‘Kalokhasa’, ‘Dubarikomal’, ‘Dharmaphou’ and ‘Konkanijoha’, was low in Glycemic Index (GI) and high in Resistant Starch (RS). The second cluster, constituted by ‘Rampha’, ‘Tulaipanji’, ‘Kalonunia’, ‘Dudheswar’, ‘Kalshipa’, ‘Kalojeera’, ‘KNS-2′-1’, ‘KNS-3′-1’ and ‘T6M-3-3’, had relatively high GI values. The third group consisted of the photo-insensitive lines of ‘Tulaipanji’ such as ‘TSP6-M3-4’, ‘T4M-3-5’ and ‘TPUR-B-1(IET 28104)’, along with black varieties such as ‘Chakhao sel-3’, ‘Chakhao sel-2’ and ‘Chakhao Poiterin’ and two popular genotypes, ‘Sadanunia’ and ‘Gobindabhog’. All these genotypes exhibited high starch content. PCA of the variables showed that Resistant Starch (RS), Total Soluble Sugar (TSS) and Glycemic Index (GI) contributed significantly to the grouping of the genotypes. Starch content (STA) and Glycemic Index (GI) were found in the opposite dimension to Resistant Starch (RS) in the PCoA biplot (Fig. 4C). Correlation analysis suggested that RS and GI are negatively correlated, whereas Protein (PRO), Antioxidant activity (ANT) and Aroma (ARO) were positively correlated in these genotypes (Fig. 4D).

Figure 4

PCoA biplot of thirty-five genotypes based on their important nutritional attributes. All the analysis was performed using R software31, version 3.5.1, Patched (2018–07-02 r74950) Platform: x86_64-w64mingw32/x64 (64-bit) (https://www.R-project.org/). (A) Contribution of various nutritional traits based on the mean values for the grouping of the rice genotypes. The Y-axis depicts the contribution of the parameters in percentage (%). (B) Cluster analysis of the rice genotypes, the X and Y axis represent the PC1 (dim1) and PC2 (dim2) and the percentage (%) of total variation defined by PC1 and PC2. (C) PCoA biplot exhibiting the grouping of the genotypes as well as the nutritional characters. The X and Y axis represent the PC1 (dim1) and PC2 (dim2) and the percentage (%) of total variation defined by PC1 and PC2. (D) Dimension wise distribution of the nutritional characters contributing towards the clustering of the genotypes. The genotypes are denoted numerically or in short as follows 1. ‘Tulaipanji’ (Tlp), 2. ‘Radhatilak’ (Rad), 3. ‘Kalshipa’ (Kls), 4. ‘Rangakomal’ (Rng), 5. ‘Sadanunia’ (Sad), 6. ‘Tulsibhog’ (Tul), 7. ‘Kalturey’ (Klt), 8. ‘Kataribhog’ (Ktb), 9. ‘A-1-1’ (A11), 10. ‘Dubarikomal’ (Dbk), 11. ‘Dharmaphou’ (Dmp), 12. ‘Dudheswar’ (Ddh), 13. ‘Dudhkalam’ (Ddk), 14. ‘Rampha’ (Ran), 15. ‘Gobindabhog’ (Gob), 16. ‘Konkanijhoha’ (Kkj), 17. ‘Kalokhasa’ (Kkh), 18. ‘Kabra’ (Kab), 19. ‘Radhunipagol’ (Rdh), 20. ‘Kalojeera’ (Klj), 21. ‘KNS-2′-1’ (K2′1), 22. ‘KNS-3′-1’ (K3′1), 23. ‘KNS-2-1-1’ (K211), 24. ‘Kalonunia’ (Kln), 25. ‘Chakhao sel-1’ (Cs1), 26. ‘Chakhao Poiterin’ (Chp), 27. ‘Chapka Chakhao’ (Chc), 28. ‘Chakhao sel-2’ (Ch2), 29. ‘Chakhao sel-3’ (Ch3), 30. ‘Chakhao Sampark’ (Chs), 31. ‘KNS-2B-S1’ (K2s1), 32. ‘T4M-3-5’ (T435), 33. ‘TSP6-M3-4’ (TS34), 34. ‘TPUR-B-1’ (TSPM-3-1) (T31), 35. ‘T6M-3-3’ (T633). The attributes are described in short form as follows. 
TSS total soluble sugar, RES reducing sugar, NRS non-reducing sugar, STA starch, AMY amylose, RS resistant starch, GI Glycemic Index, PRO protein, ANT antioxidant, ARO aroma.

Considering the high nutritional and ethno-medicinal values of local land races, the present study has special merits. Starch is the major contributor which determines the appearance of rice after cooking and its consistency40. The starch granules are aggregation of linear amylose chain and highly branched amylopectin fractions41. The proportion of amylose and amylopectin has profound effect on the physicochemical properties of rice like stickiness, water absorption, volume expansion, hardness and texture of rice40. The amylose content of Pusa Basmati was found to be less when compared with ‘Gobindabhog’42. Along with good cooking quality, modern health-conscious consumers prefer rice with high medicinal value. Due to the lifestyle diseases like obesity, Type II diabetes mellitus, hypertension, etc. the popularity of rice is declining in many countries as white starch is considered to be one of the detrimental source of energy. Very few studies have been conducted for the estimation of health benefitting GI and RS of the Indian scented rice. Rice generally contains less than 3% of type 5 resistant starch43. Resistant starch produces zero calories on digestion, and offers health benefit for diabetic patients. Many of the previous work has also reported strong negative correlation of RS and GI in rice cultivars even after cooking44 where as high amylose content rice cultivars have been reported to have low GI24,45. GI of Pusa Basmati 1121 has been reported as 58.41% which has been shown to reduce by steaming of the paddy or by different cooking treatments46,47. As compared to basmati; the land races like, ‘Kataribhog’, ‘A-1-1’, ‘Dubarikomal’, ‘KNS-2B-S1’, ‘TPUR-B-1 (IET 28104)’ etc. demonstrates very good health parameters with low GI and high RS. Although black rice has been known to have high antioxidant activity but the local genotypes like ‘Sadanunia’, ‘Rangakomal’, ‘Kalshipa’ etc. has shown promising antioxidant activity which is comparable with black cultivars. 
Non-Basmati aromatic cultivars from Bangladesh have been evaluated for their phenol and flavonoid content and shown to have good antioxidant potential48. Cultivars such as ‘Gopalbhog’, ‘Gobindabhog’ and ‘Badshabhog’ have been found to have good nutritional value49. Altogether, the results indicate that the local cultivars of this region have very good quality traits and can be popularized as healthy rice or utilized in breeding programs for quality enhancement.

Screening of the selected aromatic genotypes against rice blast disease

The sub-Himalayan Terai region is a hotspot for blast disease, and the meteorological data indicate that mid-September to late October is the most suitable period for outbreak of the disease (Supplementary dataset 1). The AUDPC (Area Under Disease Progress Curve) indicates that cultivars such as ‘Sadanunia’, ‘Tulsibhog’, ‘Chakhao Sampark’ and ‘T4M-3-5’ exhibited extreme tolerance to leaf blast, whereas ‘Gobindabhog’, ‘Konkanijoha’, ‘Kalonunia’, ‘Rampha’, ‘KNS-2B-S1’, ‘KNS-3′-1’ and others were highly susceptible to leaf blast under natural conditions (Table 6, Fig. 5A). The PDI values show that the disease progressed very rapidly in September for all the genotypes (Fig. 5B), and the distinction between tolerant and susceptible cultivars can be observed during this period. The in vitro disease progression assay suggested that the susceptible cultivars developed symptoms after 24 h of inoculation, whereas the tolerant cultivars did not develop symptoms even after 48 h of inoculation (Fig. 5C). Time-lapse microscopy showed spore germination and hyphal development within 24 h post inoculation in susceptible cultivars such as ‘Gobindabhog’ and ‘Konkanijoha’, but not in tolerant cultivars such as ‘Sadanunia’ and ‘T4M-3-5’ (Fig. 5D). Both the field and the in vitro experiments showed distinct differences in disease severity among the selected genotypes.
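The AUDPC values discussed above summarize repeated severity scores into a single number per genotype. This section does not restate the formula, so the following is only a minimal sketch that assumes the conventional trapezoidal rule; the function name `audpc` and the weekly PDI readings are hypothetical illustrations, not data from the study.

```python
def audpc(days, severity):
    """Area under the disease progress curve by the trapezoidal rule.

    days     -- observation times in days, strictly increasing
    severity -- disease severity (e.g. PDI in %) recorded at each time
    """
    if len(days) != len(severity) or len(days) < 2:
        raise ValueError("need at least two paired observations")
    total = 0.0
    for i in range(len(days) - 1):
        interval = days[i + 1] - days[i]
        if interval <= 0:
            raise ValueError("observation days must be strictly increasing")
        # Mean severity of two consecutive readings times the days between them
        total += (severity[i] + severity[i + 1]) / 2 * interval
    return total


# Hypothetical weekly PDI readings (%), NOT values from the study
days = [0, 7, 14, 21, 28]
tolerant_pdi = [0, 2, 5, 8, 10]
susceptible_pdi = [0, 10, 30, 55, 70]

print("tolerant AUDPC:", audpc(days, tolerant_pdi))
print("susceptible AUDPC:", audpc(days, susceptible_pdi))
```

Because the area accumulates over the whole observation window, a genotype on which the disease builds up early and quickly receives a much larger AUDPC than one that stays lightly infected throughout.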

Table 6 Variation in plant morphological and disease related characters of selected indigenous aromatic rice genotypes.
Figure 5

Screening of the local landraces for the occurrence of leaf blast disease. (A) Calculated AUDPC of all the genotypes from two years of field trials and scoring of the plants for disease symptoms. (B) PDI of some highly susceptible and tolerant cultivars between mid-August and early October. (C) Time-lapse photography of leaves from susceptible (‘Kalonunia’) and tolerant (‘Sadanunia’) cultivars after inoculation with purified blast fungus. (D) Microscopic image of hyphal growth in susceptible (‘Kalonunia’) and tolerant (‘Sadanunia’) cultivars.

Identification of important phenotypic characters related to blast disease

Phenotypic data such as plant height (PLH), tiller number (TLN) and upper, middle and lower leaf angles (ULA, MLA and LLA, respectively) were recorded for the selected rice genotypes, and their impact on disease parameters such as lesion number in upper, middle and lower leaves (LUL, LML and LLL, respectively), lesion size (LSZ), sporulation center (SPC) and AUDPC was calculated (Table 6). Principal component analysis showed that disease parameters such as lesion number in upper, middle and lower leaves (LUL, LML and LLL), lesion type (LST), sporulation center (SPC) and AUDPC contributed significantly to the clustering of the genotypes (Fig. 6A, Supplementary information 3b). Among the morphological traits, only upper leaf angle (ULA) contributed significantly to the clustering. Cluster analysis based on these disease parameters resulted in four distinct clusters (Fig. 6B). Clusters I and II comprised genotypes such as ‘Sadanunia’, ‘Chakhao Sampark’, ‘TSP6-M3-4’ and ‘T4M-3-5’, which showed low AUDPC values (< 600) with few disease lesions on the leaves and were considered highly tolerant to blast, whereas genotypes in cluster III exhibited relatively higher AUDPC and were considered less tolerant. Cluster IV comprised cultivars with high AUDPC (> 1000) and dense disease lesions on all leaves, which were considered highly susceptible to blast. PCA showed that AUDPC and the lesion numbers on lower, upper and middle leaves fell on the same dimension, whereas the leaf angle parameters fell on the opposite dimension (Fig. 6C). Correlation and regression analysis suggested that upper and middle leaf angles have a significant negative correlation with AUDPC (Table 7, Fig. 6D). Tiller number and plant height had a non-significant but positive correlation with occurrence of the disease.
A correlation study between the nutritional and disease parameters was also performed. Aroma (ARO) and antioxidant activity (ANT) showed a significant but weak positive correlation with AUDPC, whereas non-reducing sugar (NRS) content showed a very weak negative correlation with AUDPC (Supplementary information 4), suggesting that nutritional traits and disease resistance are independent characters.
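The correlation and regression results reported above can be illustrated with a small calculation. The sketch below assumes Pearson's product-moment correlation and an ordinary least-squares slope; this section does not state which coefficient the authors used, and the leaf angle and AUDPC values are hypothetical, chosen only to mimic a negative relationship.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical upper leaf angles (degrees) and AUDPC values, NOT study data
upper_leaf_angle = [12, 18, 25, 31, 40]
audpc_values = [1150, 980, 760, 540, 410]

r = pearson_r(upper_leaf_angle, audpc_values)
slope = ols_slope(upper_leaf_angle, audpc_values)
print(f"r = {r:.3f}, slope = {slope:.1f} AUDPC units per degree")
```

A negative `r` together with a negative slope corresponds to the pattern described in the text: genotypes with wider upper leaf angles tend to have lower AUDPC. A significance test would still be needed before calling such a correlation significant.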

Figure 6

PCoA biplot of thirty-five genotypes based on their important phenotypic and disease related attributes. All analyses were performed using R software31, version 3.5.1 Patched (2018-07-02 r74950), platform x86_64-w64-mingw32/x64 (64-bit) (https://www.R-project.org/). (A) Contribution of various phenotypic and disease related attributes based on the mean values for the grouping of the rice genotypes. The Y-axis depicts the contribution of the parameters in percentage (%). (B) Cluster analysis of the rice genotypes; the X and Y axes represent PC1 (dim1) and PC2 (dim2) and the percentage (%) of total variation explained by PC1 and PC2. (C) PCoA biplot exhibiting the grouping of the genotypes as well as the phenotypic and disease related attributes. The X and Y axes represent PC1 (dim1) and PC2 (dim2) and the percentage (%) of total variation explained by PC1 and PC2. (D) Dimension-wise distribution of the phenotypic and disease related attributes contributing towards the clustering of the genotypes. The genotypes are designated either numerically or in short form as follows: 1. ‘Tulaipanji’ (Tlp), 2. ‘Radhatilak’ (Rad), 3. ‘Kalshipa’ (Kls), 4. ‘Rangakomal’ (Rng), 5. ‘Sadanunia’ (Sad), 6. ‘Tulsibhog’ (Tul), 7. ‘Kalturey’ (Klt), 8. ‘Kataribhog’ (Ktb), 9. ‘A-1-1’ (A11), 10. ‘Dubarikomal’ (Dbk), 11. ‘Dharmaphou’ (Dmp), 12. ‘Dudheswar’ (Ddh), 13. ‘Dudhkalam’ (Ddk), 14. ‘Rampha’ (Ran), 15. ‘Gobindabhog’ (Gob), 16. ‘Konkanijoha’ (Kkj), 17. ‘Kalokhasa’ (Kkh), 18. ‘Kabra’ (Kab), 19. ‘Radhunipagol’ (Rdh), 20. ‘Kalojeera’ (Klj), 21. ‘Chakhao Sel.-1’ (Cs1), 22. ‘Chakhao Poiterin’ (Chp), 23. ‘Chapka Chakhao’ (Chc), 24. ‘Chakhao sel-2’ (Ch2), 25. ‘Chakhao sel-3’ (Ch3), 26. ‘Kalonunia’ (Kln), 27. ‘KNS-2′-1’ (K2′1), 28. ‘KNS-3′-1’ (K3′1), 29. ‘KNS-2-1-1’ (K211), 30. ‘KNS-2B-S1’ (K2s1), 31. ‘Chakhao Sampark’ (Chs), 32. ‘TPUR-B-1(TSPM-3–1)’ (T31), 33. ‘T4M-3-5’ (T435), 34. ‘T6M-3-3’ (T633), 35. ‘TSP6-M3-4’ (TS34).
The phenotypic attributes are denoted as: PLH, plant height; TIN, tiller number; LLA, lower leaf angle; MLA, middle leaf angle; ULA, upper leaf angle; LLL, lesion number in lower leaf; LML, lesion number in middle leaf; LUL, lesion number in upper leaf; LST, lesion type; LSZ, lesion size; SPC, sporulation centre; AUD, AUDPC.

Table 7 Correlation and regression analysis of the morphological and disease related characters with AUDPC.

Association of Pi genes with the disease

The selected genotypes were screened for the presence of twenty-three well-characterized rice blast resistance (Pi) genes (Table 8). Almost all the landraces possessed a number of Pi genes. Genotypes such as ‘A-1-1’ and ‘Kataribhog’ contained all twenty-three Pi genes (Supplementary information 5), whereas the lowest number of Pi genes was found in ‘T6M-3-3’ and ‘T4M-3-5’. Pi 27t showed the highest frequency in the population and was found in every genotype. Similarly, Pi5, Pizt, pib, Pikm and Pita/Pita2 were found at high frequency in these landraces (Fig. 7), whereas Pik-h, pi-9 and pi-1 were found at very low frequency. To assess the association of the Pi genes present in these genotypes with occurrence of the disease, Kendall’s tau association analysis was performed using an R package. All genotypes from clusters I and II, together with genotypes from cluster III with AUDPC below 800, were considered tolerant, whereas genotypes in clusters III and IV with AUDPC above 800 were considered susceptible. Six Pi genes, pi37, pizt, pikh, pi9, pita-pita2 and pik, showed a strong Kendall’s tau-b association (p < 0.001) with blast tolerance in these genotypes (Fig. 7, Supplementary information 6). Pikp, pikm and pi-33 also exhibited a significant association with blast tolerance, whereas fourteen of the twenty-three markers showed no association with blast tolerance in these genotypes (Fig. 7).
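The association analysis above pairs a presence/absence score for each Pi gene with a tolerant/susceptible call for each genotype. The following is a minimal sketch of Kendall's tau-b for such tied binary data, written from the standard definition rather than from the authors' R or SPSS workflow; the function name and the eight-genotype example are hypothetical, and no significance test is included.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b for two paired samples, with the usual tie correction."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # pairs tied on both variables do not enter the formula
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denominator = math.sqrt(
        (concordant + discordant + tied_x_only)
        * (concordant + discordant + tied_y_only)
    )
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical scores for eight genotypes, NOT data from the study:
# 1 = gene present / genotype tolerant, 0 = gene absent / genotype susceptible
gene_present = [1, 1, 1, 1, 0, 0, 0, 0]
is_tolerant = [1, 1, 1, 0, 0, 0, 0, 1]

print(f"tau-b = {kendall_tau_b(gene_present, is_tolerant):.2f}")  # prints tau-b = 0.50
```

For two binary variables tau-b equals the phi coefficient of the 2 × 2 contingency table, which gives a quick way to check the result by hand.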

Table 8 Details of Pi genes analysed in the study.
Figure 7

Distribution of twenty-three major blast resistance genes in the local landraces. The frequency of each gene in the landraces was calculated. Significant Kendall’s tau-b association of a Pi gene with tolerance to the disease is indicated by asterisks (** significance level < 0.01; * significance level < 0.05). Kendall's tau-b correlation coefficient was calculated using IBM SPSS (trial version 20)33 (https://www.ibm.com/analytics/spss-statistics-software).

Fitted model for prediction of susceptibility to blast disease based on morphological traits

Based on the results of the principal component analysis, the significantly contributing variables other than AUDPC were chosen to develop a logistic regression model for predicting the disease susceptibility of a genotype from its morphological data. Goodness of fit was assessed with the Hosmer and Lemeshow test, which was non-significant (p = 0.694) at the 5% level of significance, indicating that the model fits the data well.

The fitted model is:

$$P\left[ \mathrm{Blast} = 1 \right] = \frac{1}{1 + \exp \left[ - \left( 5.566 + 0.892\,\mathrm{LUL} - 0.759\,\mathrm{LML} - 2.845\,\mathrm{LST} - 0.144\,\mathrm{LLL} + 2.847\,\mathrm{SPC} - 0.329\,\mathrm{ULA} \right) \right]}$$

The model predicts that a one-unit increase in upper leaf lesion number (LUL) multiplies the odds of blast occurrence by 2.439 (Table 9, Supplementary information 7). Conversely, a one-unit increase in upper leaf angle (ULA) multiplies the odds by 0.720, that is, the odds of blast occurrence fall as the upper leaf angle widens. Sporulation center (SPC) was found to be the most important parameter for the prediction of blast occurrence. Based on the significant phenotypic parameters considered, the overall correct classification rate across all thirty-five genotypes was 75.7%, and the correct classification rate within the susceptible genotypes was 86.7% (Supplementary information 7).
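The odds ratios quoted above follow from exponentiating the model coefficients. The sketch below assumes that the printed coefficients are on the log-odds scale, which is the reading consistent with the reported values of 2.439 for LUL and 0.720 for ULA; the dictionary and function names are illustrative only.

```python
import math

# Coefficients as printed in the fitted model (assumed to be log-odds coefficients)
coefficients = {
    "LUL": 0.892,
    "LML": -0.759,
    "LST": -2.845,
    "LLL": -0.144,
    "SPC": 2.847,
    "ULA": -0.329,
}


def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(beta)


for name, beta in coefficients.items():
    direction = "raises" if beta > 0 else "lowers"
    print(f"{name}: odds ratio {odds_ratio(beta):.3f} ({direction} the odds of blast)")
```

With the rounded coefficients, exp(0.892) ≈ 2.44 and exp(−0.329) ≈ 0.72, which agree with the reported odds ratios up to rounding of the printed coefficients. An odds ratio below 1 means the odds fall as the predictor increases.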

Table 9 The logistic regression model was fitted by considering only significant variables except AUDPC based on PCA results.

Rice blast, caused by the fungal pathogen Magnaporthe oryzae, is a major constraint that causes huge yield losses every year and is considered one of the most destructive diseases of rice50,51,52,53. The Indian subcontinent has faced seven severe epidemics of the disease in the last two decades53. Rice blast is influenced by several climatic conditions such as relative humidity, temperature and light intensity. Several disease prediction models based on weather parameters are available in different countries50,54, but prediction models based on plant morphological traits linked with blast disease are scarce. Morphological traits such as plant height, tiller number and leaf angle have been reported to affect disease severity in many crops. Plant height and percentage of unfilled grain have been reported to be positively correlated with the severity of the disease55. Much earlier, Mohanty et al.56 reported a positive correlation of leaf angle, leaf pubescence, epicuticular wax and quantity of conidia deposited with disease incidence. The cultivars in the present study showed a range of adaptability to blast disease. Our data suggest a significant negative correlation between the leaf angle of the genotypes and occurrence of the disease. Leaves with smaller leaf angles may protect the spores from direct sunlight and favor spore germination50,57, as direct sunlight is highly detrimental to the germination of blast spores58. Moreover, reduced leaf angles result in dense canopy cover and increased canopy temperature, which may create a congenial micro-environment for the fungus and may favor the selective outbreak of the disease in these genotypes.

The search for sources of resistance against blast disease has been going on all over the world, and around 120 resistance genes have been reported to date. The majority of the Pi genes are known to encode nucleotide binding site (NBS)-leucine rich repeat (LRR) proteins59. The genes found to be associated with blast tolerance in the present study, such as Pi 9 and pi 37, are known to confer broad-spectrum resistance to blast disease60,61. Pikh, pikm and pikp are known to be different alleles of pik, which acts as a two-protein system in the plant against blast62. Pita and pizt are also known to confer complete resistance to blast disease. It can be presumed that the landraces acquired these resistance genes while combating continuous pathogen pressure over a long period of time. Along with these known genes, unknown genes may also contribute to the resistance of the genotypes against blast disease. Detailed investigation of the resistant genotypes may lead to the isolation of novel genes or QTLs linked with blast disease resistance.

Conclusion

In the current study, proximate analysis of non-Basmati aromatic rice genotypes identified cultivars such as ‘Kataribhog’, ‘Sadanunia’ and ‘Kalshipa’ having low glycemic index, high resistant starch and high antioxidant potential, respectively. On the basis of collective evidence from two years of field trials and in vitro experiments, blast-resistant local genotypes such as ‘Sadanunia’, ‘Chakhao Sampark’ and ‘T4M-3-5’ were identified. Allele mining for resistance genes in these genotypes demonstrated a significant association of six Pi genes with resistance against blast disease. A prediction model based on plant morphological characters was developed, which correctly classified more than 85% of the susceptible genotypes (75.7% of all genotypes) for the occurrence of blast disease. The scope for promoting non-Basmati scented rice in domestic as well as global markets is growing, and it is high time to highlight and popularize these folk cultivars for their nutritional and disease resistance attributes. Many of these traditional cultivars may also be used as donors for traits such as biotic and abiotic stress resistance and aroma in rice improvement programs. In addition, the use of these cultivars as donors will result in a large number of segregants in subsequent generations owing to their wide genetic base. It is also important to build strategies for improving these genotypes in terms of yield, photosensitivity, disease resistance, cooking quality and benefit to human health, using mutation breeding or biotechnological tools while keeping desirable traits such as aroma intact. Combined approaches for the betterment of these heirloom rice cultivars will encourage farmers to take up the cultivation of their own traditional genotypes over the HYVs.