1 Introduction

1.1 Nutritional and Ecological Value of Leguminous Crops

Legumes are plants of the family Fabaceae (Leguminosae), which accounts for about 5% of the roughly 400,000 plant species known on earth. The term ‘legume’ derives from the Latin word legūmen, meaning “beans inside pods”. The Fabaceae family contributes several of the major “founder crops”, those domesticated earliest in known human history [1]. Of the “Big Eight” founder crops, known to have been cultivated as early as 10,000 BC, four are legumes: pea (Pisum sativum), chickpea (Cicer arietinum), lentil (Lens culinaris), and bitter vetch (Vicia ervilia) [2]. Legume cultivation is of key importance in meeting worldwide demand for food grain and forage. The edible seeds of leguminous plants are generally termed pulses. The commonly cultivated grain legumes among the 20,000 species distributed worldwide are common bean, faba bean, pea, chickpea, cowpea, pigeon pea, lentil, peanut, grass pea and horse gram [3]. According to Food and Agriculture Organization (FAO) statistics for 2018, common bean (dry bean) was the most widely cultivated grain legume worldwide, with 34.5 million hectares under cultivation, followed by chickpea (17.8 million ha) and cowpea (12.5 million ha). Estimated worldwide grain legume production exceeded 92 million tons. Most grain legume production comes from India, China, Canada, Australia, the USA, Brazil, Argentina and Russia. India is the largest producer of grain legumes, contributing one quarter of total global production [4]. In Europe, soybean, faba bean and field pea are currently the most cultivated legumes. Soybean production in particular has increased dramatically in the last decade because of high demand for livestock feed. According to European Commission (EC) reports, 943,000 ha of land was under soya cultivation in 2019, and this area is expected to increase by 44% by 2030. However, only 43% of the legumes consumed in Europe are produced domestically [5].

Nutritional Impact

The United Nations declared 2016 the “International Year of Pulses” to highlight the importance of the nutritious seeds produced by legumes. Grain legumes are sometimes referred to as ‘poor man’s meat’ because of their high protein content. The storage proteins in pulses make them the richest plant-based source of protein, which ranges from 16–50% of total dry weight. Along with this high protein content, adequate amounts of dietary fiber, vitamins, complex carbohydrates, sugars, minerals, and fatty acids make legume seeds one of the healthiest food sources [6]. Beans also contain non-nutritional phytochemicals such as phytosterols, polyphenols, trypsin inhibitors, phytate, lectins and saponins. Some of these non-nutritional components of legume grains have been termed anti-nutrients, meaning that they make nutrients less available to the body through reduced digestion, absorption, and bioavailability. Enzyme inhibitors such as trypsin and α-amylase inhibitors block the active sites of these enzymes and thereby reduce digestibility. Further, lectins have been found to interfere with nutrient absorption in the small intestine by attaching to specific epithelial cell receptors [7]. However, recent research has found that these phytochemicals play a critical role in the body’s normal homeostasis through their antioxidant and anti-inflammatory activities [8]. Moreover, consumption of pulses in the daily diet has been found to help prevent cardiovascular disease and obesity, to help control type 2 diabetes, and to reduce the risk of certain cancers, including prostate and breast cancer [9].

Ecological Impact

Members of the Fabaceae family are well known for their agronomic and ecological role, which stems from their ability to fix nitrogen [1]. They possess specialized root structures called ‘nodules’. Root nodules are symbiotic structures that develop in the root hairs of leguminous plants in association with diazotrophic rhizobia, through a complex bilateral signaling pathway initiated in nitrogen-deficient soils. Rhizobia are a group of gram-negative bacteria that colonize root hairs in a host-specific manner and fix gaseous atmospheric nitrogen into a form usable by plants (i.e., ammonia). A diazotroph is any bacterium or archaeon that converts atmospheric nitrogen into a usable form. This harvesting of gaseous nitrogen by rhizobial colonies is termed biological nitrogen fixation (BNF). The symbiosis is a two-way relationship, as the bacteria living in root nodules feed on the carbon-rich metabolites produced by the host plant. Moreover, legume cropping has been found to improve soil structure (e.g., water retention) and nutrient bioavailability (e.g., phosphorus mobilization) and to break disease cycles through the activities of these rhizobial communities in root nodules. These unique features of legume crops play a critical part in maintaining soil health and adequate soil nitrogen levels, thus reducing the need for synthetic nitrogen fertilizers and pesticides. Further, this capacity of legumes has been used extensively in crop rotation systems for sustainable use of agricultural land, mitigation of greenhouse gas (GHG) emissions and enhanced carbon sequestration [4]. The term ‘crop rotation’ refers to growing different crops on the same land in a specific sequence so as to maintain soil fertility, rhizosphere microbial diversity, nutrient availability and resistance to plant pests.
Rotating legumes with cereal crops has been found to reduce greenhouse gas emissions 5–7 fold compared with non-legume cultivation [5].

Challenges in Production

Despite all these advantages, global legume cultivation has declined over the last few decades. According to FAO statistics from 2015, in Europe alone the legume cultivation area fell to 1.8 M ha from the 5.8 M ha recorded in 1961 [10]. This decline was attributed mainly to socio-economic factors and to the limited field yield of legumes. Inconsistent yields affect the profitability of the crop in comparison with other crops grown in the same season [11]. The output value of legumes per unit area is highly variable and relatively low compared with major cereals such as rice, wheat and corn [12]. Pea, for example, has shown variability in market output ranging from 25% to 78% in Europe [10]. This low profitability is associated more strongly with unstable yields than with market prices. The factors limiting yields include both (i) abiotic stresses, e.g., salinity, drought, low soil fertility and extreme temperatures, and (ii) biotic stresses, e.g., bacterial, fungal and viral diseases and infestation by other pests.

2 Improvement in Legumes

2.1 Breeding of Legumes

The economy-driven emphasis on extensive cereal breeding programs after the Green Revolution is one of the reasons for the lack of focus on legume breeding and cultivation in the last few decades. Furthermore, conventional breeding of legumes is constrained by the limited genetic diversity of available germplasm and by limited collection facilities and reference genome databases [13]. The natural inclination of legumes towards self-pollination is the main reason for the relatively low diversity of their germplasm. Breeders, in general, screen for and select favorable combinations of genes and mutations in plant genotypes to provide the desired set of characteristics. This is done by genome-wide screening of all available wild and adapted germplasm of the species for resistance to biotic and abiotic stresses [14]. Rapid advances in genome sequencing and analysis technologies have helped breeders to link phenotypic variation in a population with particular loci, termed quantitative trait loci (QTL). In the last decade, the genomes of around 35 legume crops have been sequenced, along with complete transcriptome profile analyses [15]. This wealth of genomic data has been built into valuable tools for marker-assisted selection (MAS) of desired phenotypes in leguminous crops, employing genetic markers such as simple sequence repeats (SSR), single nucleotide polymorphisms (SNP), amplified fragment length polymorphisms (AFLP) and restriction fragment length polymorphisms (RFLP) [16].

2.2 Modern Genome Editing Tools

After the discovery of restriction endonucleases and their successful use in making recombinant DNA molecules, the term ‘genetic engineering’ came into the limelight. Genetic engineering (GE) essentially means the introduction of foreign genetic material, using restriction enzymes and ligases, into a suitable host to provide a desired trait [17]. Since its development, the technology has been used continuously in various fields of crop improvement, with significant global impact. The transformation of crops with CRY1 family genes from Bacillus thuringiensis for insect pest resistance is one of the main successes of GE technology so far [18]. The next development in applied genetics was the discovery of homing endonucleases (HEs), or meganucleases, such as I-SceI, which are encoded by mobile genetic elements (MGEs) and are involved in recombination [19]. These meganucleases generate double-stranded breaks at specific sites in the DNA, and the breaks are then repaired by natural DNA repair pathways such as homology-directed repair (HDR) or non-homologous end joining (NHEJ), depending on the repair mechanism available in the given cell. Double-stranded break repair is generally accompanied by insertions or deletions (indels) of some base pairs, leading to genome editing in a locus-specific manner [20]. The success of genome editing with meganucleases was limited by the low frequency of their recognition sites in the DNA [21]. To overcome this limitation, synthetic genomic scissors called zinc finger nucleases (ZFNs) were developed in 2002. A ZFN is a hybrid molecule containing DNA-binding zinc finger domains (Cys2His2) and the nuclease domain of the FokI endonuclease. The zinc fingers can be engineered to recognize specific sites in the genome, and this capacity is used for genome editing by engineering ZFNs for specific DNA regions [22].
One limitation of ZFN technology was compromised specificity that depended on the DNA sequence flanking the recognition sites, called context-dependent specificity. Along similar lines, transcription activator-like effector nucleases (TALENs) were developed by combining the DNA-recognizing domains of transcription activator-like (TAL) effectors with the nuclease domain of the restriction enzyme FokI for site-specific mutagenesis [23]. Despite the similarity in basic concept, the major difference between TALENs and ZFNs lies in the frequency and accuracy of their cleavage sites. Each zinc finger domain recognizes 3–4 bases, whereas each TALE repeat is specific to a single base, a specificity conferred by the repeat variable di-residue (RVD) of the TALE protein’s DNA-binding domain, which makes it possible to join several modules without interfering with the recognition sequence. Moreover, TALENs are relatively simple to design, making DNA recognition and binding less complex than with ZFNs [24]. The most advanced genome editing tool is the clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) protein system. The CRISPR-Cas genome editing system is adapted from a natural adaptive defense mechanism that many bacterial and archaeal species use against recurring bacteriophage infections. CRISPR loci consist of virus-derived sequences, termed ‘spacers’, placed between regularly clustered repeats in bacterial and archaeal genomes [25]. CRISPR loci are associated with a set of specialized proteins called Cas proteins. These Cas proteins have highly specific nuclease activity mediated by RNA-DNA complementarity. Upon bacteriophage infection, the spacer sequences are transcribed into CRISPR RNA (crRNA), which guides the Cas protein to cleave the viral DNA and thereby stops the infection [26].
Adapting this system by artificially designing guide RNA (gRNA) and Cas protein constructs has led to an era of third-generation genome editing. RNA-DNA base-pair complementarity makes the CRISPR-Cas system a more precise genome editing tool than ZFNs and TALENs, which rely on protein-DNA interaction for targeting [24].
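
The one-repeat-to-one-base recognition described above for TALENs can be illustrated with a simple lookup. The sketch below is illustrative only: the RVD-to-base pairs used (NI for A, HD for C, NG for T, NN for G) are the commonly cited preferences and are an assumption here, since this chapter does not list them, and the function names are hypothetical. Real RVDs are less strict than a lookup table suggests (NN, for instance, also tolerates A).

```python
# Commonly cited RVD-to-base preferences. This mapping is an assumption of
# the sketch; it is not taken from the chapter.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def tale_target(rvds):
    """Return the DNA sequence a TALE repeat array would be expected to bind.

    Each repeat contributes exactly one base, which is why repeats can be
    chained in any order to build a chosen recognition sequence.
    """
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


def rvds_for_target(dna):
    """Return the RVD series (one per base) for a desired target sequence."""
    base_to_rvd = {base: rvd for rvd, base in RVD_TO_BASE.items()}
    return [base_to_rvd[base] for base in dna.upper()]


if __name__ == "__main__":
    print(tale_target(["NI", "HD", "NG", "NN"]))  # prints ACTG
    print(rvds_for_target("GATC"))
```

By contrast, a zinc finger array would need one engineered domain per 3–4 bases, and the behavior of each domain can depend on its neighbors, which a per-base table like this cannot capture.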

ZFN technology has been employed in soybean to generate heritable mutations at the Dicer-like 1 (DCL1) loci, leading to defective processing of miRNA precursor transcripts [27], and at the Dicer-like 4 (DCL4a and DCL4b) loci for hairy root transformation by increasing the growth of lateral roots [28]. TALEN technology has been used effectively to improve seed nutritional characteristics, increasing oleic acid content in soybean [29] and peanut [30] or reducing linoleic acid content in soybean [31], by targeting the fatty acid desaturase 2-1A (FAD2-1A), 2-1B (FAD2-1B) and FAD3A genes. TALENs have also been applied successfully in soybean to produce mutants by targeting the Dicer-like 2 gene [32].

3 CRISPR-Mediated Genome Improvement in Legumes

CRISPR-Cas-based genome editing is a relatively simple process requiring only Cas9 endonuclease activity and a guide RNA (gRNA). The gRNA consists of (i) a CRISPR RNA (crRNA) that binds the target sequence and (ii) a trans-activating crRNA (tracrRNA) that mediates target recognition and cleavage. Variations on this basic pattern include the use of a different Cas protein (e.g., Cas12a or Cas13) with different protospacer adjacent motif (PAM) requirements. The PAM is a 2–5 bp sequence flanking the target sequence that facilitates Cas protein binding and is often a limitation on the applicability of the CRISPR-Cas system [33]. The critical steps in plant genome improvement through CRISPR-Cas include (i) optimization of gRNA and Cas9 constructs, (ii) successful transformation or use of a suitable delivery vehicle, (iii) detection of the resulting mutations and (iv) regeneration of mutated callus [16]. CRISPR-based genetic improvement has been explored extensively in various crops to generate mutations and then select high-quality cultivars. However, the lack of optimized protocols for the successful transformation and regeneration of legume plantlets from callus is a major barrier to their genome editing [34]. Browning of tissues and recalcitrant in vitro rooting behavior are the key barriers to regeneration in many pulses [35]. Agrobacterium-mediated transformation using seed tissues has achieved some success in recent years, particularly in soybean [36] and the model legumes Medicago truncatula and Lotus japonicus [37]. Other legumes with successful reports of CRISPR-based editing include cowpea, pea, chickpea, alfalfa and peanut, as listed in Table 16.1.
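
To make the PAM constraint concrete, the following minimal sketch scans a DNA sequence for candidate Cas9 target sites. It assumes the widely used Streptococcus pyogenes Cas9, whose PAM is 5'-NGG-3' located immediately 3' of a 20-nt protospacer; other Cas proteins need a different pattern. The function names and the example sequence are hypothetical, and real guide design tools also score efficiency and specificity, which this sketch does not attempt.

```python
import re


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(seq, spacer_len=20):
    """List candidate sites as (strand, start, protospacer, pam) tuples.

    Assumes an NGG PAM directly 3' of the protospacer (SpCas9 convention).
    For the "-" strand, start is an index into the reverse complement.
    """
    seq = seq.upper()
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # The lookahead lets overlapping sites all be reported.
        for m in re.finditer(pattern, s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


if __name__ == "__main__":
    # Made-up sequence used purely for illustration.
    example = "ATGCTAGCTAGGATCCGATCGATCGGTACCTTGGAGCTC"
    for hit in find_targets(example):
        print(hit)
```

A region with no NGG nearby yields an empty list, which is the practical meaning of the PAM being a limitation: the target must sit next to a compatible motif, or a Cas variant with a different PAM has to be chosen.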

Table 16.1 Genome editing of various legume crops for improved varieties

The use of Cas9 variants to reach a wider range of target sites, optimized guide RNA vector constructs, multiple gRNAs targeting the same gene or different genes at the same time [52], and the development of regeneration protocols are key areas of research in legume genetic improvement. Further, ongoing regulatory debates focus on non-specific complementary binding of gRNAs and off-target cleavage, which may have unwanted effects in the host [50].
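
The off-target concern can be sketched as a search for genomic sites that nearly match the guide. The example below is a deliberately simplified illustration, not a validated prediction method: it counts plain base mismatches in a sliding window, ignores the PAM, bulges and position-dependent mismatch tolerance, and uses an arbitrary threshold. The function names and the default `max_mismatches` are assumptions made for this sketch.

```python
def mismatches(guide, site):
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def potential_off_targets(guide, genome, max_mismatches=3):
    """Return (position, site, n_mismatches) for near-matches to the guide.

    Perfect matches (0 mismatches) are treated as the intended target and
    skipped, so only imperfect sites within the threshold are reported.
    """
    guide, genome = guide.upper(), genome.upper()
    n = len(guide)
    hits = []
    for i in range(len(genome) - n + 1):
        window = genome[i:i + n]
        d = mismatches(guide, window)
        if 0 < d <= max_mismatches:
            hits.append((i, window, d))
    return hits


if __name__ == "__main__":
    # Toy sequences; a real guide would be about 20 nt long.
    print(potential_off_targets("ACGT", "ACGTACCT", max_mismatches=1))
```

In practice the same idea is applied genome-wide on both strands, and candidate guides with many close matches elsewhere in the genome are usually discarded in favor of more unique ones.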