Introduction

The symbiosis resulting from members of the genus Frankia interacting with the roots of 8 dicotyledonous plant families (referred to as actinorhizal plants) is found worldwide and contributes to the ability of actinorhizal pioneer plants to grow in poor and marginally fertile soils [1]. This symbiotic association has drawn interest because of its high rate of soil nitrogen input and the ability of the plants to overcome harsh environmental conditions [2]. The molecular mechanism for the establishment of an actinorhizal nitrogen-fixing root nodule remains elusive [3]. Molecular phylogeny of the Frankia genus has consistently identified four main clusters regardless of the typing locus used [1]. These Frankia clusters also follow and support the host specificity groups proposed by Baker [4]. Cluster 1 is divided into sub-cluster 1a, which includes F. alni and relatives that are infective on Alnus and Myricaceae, and sub-cluster 1b, whose strains, including F. casuarinae, are infective on Allocasuarina, Casuarina and Myricaceae [5]. Cluster 2 contains F. coriariae [6] and uncultured microsymbionts of Coriariaceae, Datiscaceae, Dryadoideae and Ceanothus, while cluster 3 comprises F. elaeagni [5], F. discariae [7] and closely related strains that are infective on Colletieae, Elaeagnaceae, Gymnostoma and Myricaceae. Finally, cluster 4 groups a broad range of non-nitrogen-fixing and infective strains including F. inefficax [8], together with “F. asymbiotica” [9] and other related strains that are unable to establish a symbiosis with actinorhizal plants. As has been established for rhizobial and arbuscular mycorrhizal symbioses, LysM receptor-like kinases (LysM-RLKs) are also involved in the perception of Frankia signal molecules by the actinorhizal plant [10, 11]. However, the bacterial signals triggering this symbiosis remain unknown.
At present, more than 30 Frankia genomes from strains in pure culture have been sequenced and annotated [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30] and two Candidatus genomes have been generated from nodule metagenomes [31, 32]. Analysis of the Frankia genomes failed to reveal the presence of the common canonical nodABC genes [33], which also appear to be missing in several photosynthetic [34] and non-photosynthetic [35] bradyrhizobia. The only exceptions were found in the two Candidatus Frankia genomes, which contained the canonical nodABC and sulfotransferase nodH genes [32, 36]. This contradictory situation justifies additional sequencing of genomes from cultivated Frankia strains to gain insight into the depth of the pangenome pool covered. Here we report the first proof of the presence of homologs of the rhizobial canonical nodABCH genes within the draft genome of a cultivated Frankia isolate, strain NRRL B-16219, and the widespread occurrence of nodAB in field-collected Ceanothus americanus microsymbionts.

Organism information

Classification and features

Strain NRRL B-16219 metabolizes short-chain fatty acids, TCA-cycle intermediates and carbohydrates (Table 1). It is infective on members of Elaeagnaceae and on Morella cerifera and produces effective root nodules [4, 37]. Consistent with its host range, strain NRRL B-16219 is phylogenetically affiliated to cluster 3, whose members are known to effectively nodulate plants of the Elaeagnaceae, Rhamnaceae and Myricaceae families. Phylogenetic analysis based on the 16S rRNA gene sequence showed that strain NRRL B-16219 was most closely related to the type strains of F. discariae DSM 46785T (99.78%) and F. elaeagni (98.26%) (Fig. 1).

Table 1 Classification and general features of Frankia sp. strain NRRL B-16219 according to MIGS [45]
Fig. 1

Maximum likelihood (ML) phylogenetic tree based on the 16S rRNA gene sequences (1400 nt), showing the relationships between Frankia NRRL B-16219 and Frankia species. The ML tree was inferred using the GTR + GAMMA model and rooted by midpoint-rooting; the branches are scaled in terms of the expected number of substitutions per site. The numbers above the branches are support values when larger than 60% from ML (left) and MP (right) bootstrapping

Frankia sp. strain NRRL B-16219 shows typical Frankia morphological structures: branched hyphae, vesicles (the site of nitrogenase activity) and multilocular sporangia containing non-motile spores (Fig. 2).

Fig. 2

Scanning electron micrograph of strain NRRL B-16219 after growth for 4 weeks in liquid DPM medium at 28 °C showing hyphae (h), vesicles (v) and sporangia (s)

Fig. 3

Organization of nod genes in Frankia NRRL B-16219 and Candidatus Frankia datiscae Dg1 genomes. Sizes, localization and orientation of the genes are displayed proportionally. These genes are not detectable in any other Frankia genome except Candidatus Frankia Dg2

Extended feature descriptions

Strain NRRL B-16219 represents one of the rare Frankia strains directly isolated from soil on plate medium without passing through a plant trapping assay. The strain was isolated from the rhizosphere of Ceanothus jepsonii [37] following a complex protocol of soil treatment with phenol (0.7%), sample fractionation by ultracentrifugation in a sucrose density gradient, and plating on solid DPM without a nitrogen source. Strain NRRL B-16219 developed unpigmented white colonies after 4 weeks of growth on DPM medium at 28 °C without shaking. The strain was phenotyped using GENIII microplates in an Omnilog device (BIOLOG Inc., Hayward, USA) as previously described [5]. It was able to metabolize acetic acid, citric acid, D-cellobiose, dextrin, D-fructose, D-mannitol, D-mannose, fructose-6-phosphate, fusidic acid, glucose-6-phosphate, D- and L-malic acid, p-hydroxy-phenylacetic acid, propionic acid and D-serine, and to grow in the presence of 1% sodium lactate and up to 1% NaCl. Growth occurred between pH 5.0 and 6.8. The strain was tolerant only to rifamycin.

Genome sequencing information

Genome project history

Because it is one of the rare strains isolated directly from soil, strain NRRL B-16219 was selected as part of an effort to gain insight into the depth of the pangenome pool and to identify symbiotic signaling molecules. The sequencing project was completed in April 2016 and the generated data were submitted as a draft genome to GenBank under BioProject PRJNA318440 and the accession number MAXA00000000.1.

Growth conditions and genomic DNA preparation

The studied strain was kindly provided by David Labeda (ARS USDA bacterial collection) under the strain ID NRRL B-16219. The strain was grown at 28 °C in stationary culture in 1-L bottles containing DPM medium [5] supplemented with 0.5 mM NH4Cl as nitrogen source. Biomass from a 1-month-old culture was harvested by centrifugation at 9000 × g for 15 min and rinsed several times with sterile distilled water. The mycelial mats were broken by repeated passages through syringe needles of progressively smaller diameter (21G to 27G). Genomic DNA extraction was performed using Plant DNeasy kits (Qiagen, Hilden, Germany) following the recommendations of the manufacturer. Prior to genome sequencing, the quality of the isolated DNA was checked by using it as a PCR template; partial sequences of several housekeeping genes and of the 16S rRNA gene were generated and analyzed [16].

Genome sequencing and assembly

Sequencing of the draft genome of Frankia sp. NRRL B-16219 was performed at the Hubbard Center for Genome Studies (University of New Hampshire, Durham, NH) using Illumina technology [38]. A standard Illumina shotgun library was constructed and sequenced on the Illumina HiSeq2500 platform with paired-end reads of 2 × 250 bp. The Illumina sequence data were trimmed with Trimmomatic version 0.32 [39] and assembled using SPAdes version 3.5 [40] and ALLPATHS-LG version r52488 [41].

Genome annotation

The genome was annotated via the NCBI Prokaryotic Genome Annotation Pipeline. Additionally, nod gene prediction analysis was done using similarity search tools within the Integrated Microbial Genomes-Expert Review system developed by the Joint Genome Institute, Walnut Creek, CA, USA [42]. This whole-genome shotgun sequence has been deposited at DDBJ/EMBL/GenBank under the accession number MAXA00000000.1. The version described in this paper is the first version, MAXA00000000.1. A summary of the project information is shown in Table 2.

Table 2 Project information

Genome properties

The draft genome of Frankia NRRL B-16219 consisted of 289 DNA contigs, corresponding to an estimated genome size of 8,032,739 bp and a GC content of 71.7%. The draft genome contained 6859 total genes, including 6211 protein-encoding genes (90.55%), 561 pseudogenes (8.17%) and 53 RNAs (0.76%) (Table 3). The classification of genes into COG functional categories is shown in Table 4.
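
The contig count, total assembly length and GC content reported above are straightforward to recompute from a multi-FASTA file of contigs. The following is a minimal Python sketch, not the procedure used in this study; the function name `assembly_summary` and the two toy contigs are illustrative assumptions and are not taken from the NRRL B-16219 assembly.

```python
from io import StringIO


def assembly_summary(fasta_handle):
    """Return contig count, total length (bp) and GC% for a FASTA stream.

    GC% is computed over all sequence characters, so ambiguous bases
    such as N count toward the length but not toward G+C.
    """
    contigs = 0
    total = 0
    gc = 0
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            contigs += 1  # each header line starts a new contig
            continue
        seq = line.upper()
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc / total if total else 0.0
    return {"contigs": contigs, "total_bp": total, "gc_percent": gc_percent}


# Toy input: hypothetical sequences used only to exercise the function.
toy = StringIO(">contig_1\nGGCC\nGCAT\n>contig_2\nGGCCGGCC\n")
summary = assembly_summary(toy)
print(summary)
```

With a real assembly, the same function could be called on an open file handle, for example `with open("contigs.fasta") as fh: assembly_summary(fh)`, where the file name is a placeholder.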

Table 3 Genome statistics
Table 4 Number of genes associated with the general COG functional categories

Insights from the genome sequence

Comparison of genomes from Frankia sp. NRRL B-16219 and other Frankia species

The Frankia sp. NRRL B-16219 genome was compared to all of the Frankia genomes available in the NCBI genome database, comprising seven Frankia species (F. alni, F. casuarinae, F. elaeagni, F. coriariae, F. discariae, F. inefficax and “F. asymbiotica”), two Candidatus Frankia genomes and other Frankia sp. strains. As shown for other closely related strains from cluster 3, strain NRRL B-16219 has one of the largest genome sizes (8,032,739 bp), with a high GC content of 71.72%. Genes shown or suggested to be involved in the actinorhizal symbiosis were detected. Nitrogenase genes were organized into one operon, nifH-D-K-E-N-X-orf1-orf2-W-Z-B-U, and a non-linked nifV gene. Genes encoding the hydrogenase subunits were clustered into two operons. Genes for two different types of truncated hemoglobins, HbN and HbO, were also present.

Nodulation pathway

In rhizobia, the common canonical nodABC genes play a key role in triggering root nodule formation in legumes. They direct the synthesis of the lipochito-oligosaccharide Nod factors, signals that are secreted in response to host-plant flavonoids perceived by compatible rhizobial strains [43]. The Nod factors are perceived by the host plant through LysM-RLKs, and the resulting signal transduction cascade triggers bacterial invasion of root cortical cells and the genesis of functional nodules. Despite the presence of these LysM-RLKs in actinorhizal plants [11], none of the Frankia genomes from cultivated strains contained any homologous nod genes [33], although they are present in the two Candidatus Frankia genomes [32, 36]. Six nod-like genes were detected in the NRRL B-16219 draft genome (Additional file 1: Table S1), organized into two regions (Fig. 3). The first cluster contained genes encoding nodA1, nodC, an ABC-2 type transport system ATP-binding protein (nodJ), an ABC-2 transporter efflux protein of the DrrB family (nodI) and nodH. The second cluster contained nodA, nodB and a nodH gene. The predicted NodA, NodB, NodC and NodH proteins of Frankia sp. strain NRRL B-16219 showed amino acid sequence similarities of 86–93% with those of the uncultured Frankia (Dg1 and Dg2) and of 57–67% with those of α- and β-rhizobia (Additional file 2: Table S2). Further phylogenetic analysis (Fig. 4) showed that the Frankia Nod proteins were positioned at the root of both the α- and β-rhizobial NodABC proteins, as previously reported [4, 8]. They were most closely related to those of plant-nodulating Betaproteobacteria of the genera Burkholderia and Paraburkholderia. The GC content of the Frankia nod genes ranged from 57.9% for nodA to 66.37% for nodB, which is quite similar to that of some rhizobial species, including Methylobacterium and Burkholderia. For both Frankia and rhizobia, the GC content of the nod genes was lower than that of the total genome sequence.
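
Pairwise comparisons of predicted proteins such as those summarized above are usually reported as a percentage over aligned positions. As a hedged illustration only, the Python sketch below computes percent identity from two sequences that have already been aligned (gaps written as `-`). It is an assumption that identity over ungapped columns is an adequate stand-in here: the similarity values in Table S2 may have been obtained with a different tool or scoring scheme, and the function name and toy fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real Nod protein sequences).
frag_a = "MKT-LLVA"
frag_b = "MKTALIVA"
identity = percent_identity(frag_a, frag_b)
print(f"{identity:.1f}% identity")
```

Excluding gapped columns from the denominator is one of several common conventions; dividing by the full alignment length or by the shorter sequence length gives lower values for the same alignment.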

Fig. 4

Maximum likelihood phylogeny based on the amino acid sequences of NodA (a), NodB (b), NodC (c) and NodH (d). GC content is provided for nod genes and for genomes (in parentheses). Only bootstrap and probability values larger than 50% are shown

Field collected microsymbionts of Ceanothus americanus contain nod genes

Root nodules were collected from Alnus glutinosa, Casuarina glauca and Elaeagnus angustifolia growing in Tunisia and from Ceanothus americanus and Elaeagnus umbellata growing in Durham, New Hampshire, USA. The nodA-nodB region from C. americanus nodules was PCR-amplified and sequenced. Based on alignments of the nodA and nodB gene sequences of Dg1 and NRRL B-16219, a primer set was designed in this study: forward primer nodAF (5′-AGCGCGACCCGAGCTCAGGATAATCG-3′) and reverse primer nodBF (5′-CGATCCCACCCGGATGGAGCTGC-3′). The sequenced PCR products were translated into amino acid sequences to permit detection of the 23 aa at the beginning of the 193-aa NodA, the intergenic region (160 nucleotides) and the 41 aa at the end of the 230-aa NodB. Both sequences showed 100% sequence similarity to the homologous regions of the NodA (23/193 aa) and NodB (41/230 aa) protein sequences of Candidatus Frankia Dg1. Regardless of their affiliation to cluster 2 or to cluster 3 (Fig. 5), all of the analyzed C. americanus microsymbionts contained the nodAB genes. In contrast, the expected PCR product could not be amplified from A. glutinosa, C. glauca, E. umbellata and E. angustifolia microsymbionts. This result is consistent with previous reports that no homologous nod genes are retrievable in sequenced genomes from strains isolated from these actinorhizal plant species [33].
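
The translation step described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the workflow used in this study: the helper names are hypothetical, the toy amplicon is invented, and the codon assignments are those of the standard genetic code, which are shared by the bacterial code (NCBI translation table 11). The two primer sequences are copied from the text and are assumed to be written 5′ to 3′ as synthesized.

```python
from itertools import product

# Standard genetic code in NCBI "TCAG" order (64 codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as given in the text.
NOD_AF = "AGCGCGACCCGAGCTCAGGATAATCG"
NOD_BF = "CGATCCCACCCGGATGGAGCTGC"


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def three_frame_translations(seq):
    """Translate the three forward reading frames of a sequence."""
    return [translate(seq[offset:]) for offset in range(3)]


def find_frame(seq, peptide):
    """Return the forward frame offset whose translation contains peptide, or None."""
    for offset, protein in enumerate(three_frame_translations(seq)):
        if peptide in protein:
            return offset
    return None


# Toy amplicon (hypothetical): two leading bases, then ATG GCC AAG TAA.
toy_amplicon = "CCATGGCCAAGTAA"
frames = three_frame_translations(toy_amplicon)
print(frames)
print(find_frame(toy_amplicon, "MAK"))
# How the reverse primer reads on the forward strand of the amplicon.
print(reverse_complement(NOD_BF))
```

In practice, a sequenced PCR product would replace `toy_amplicon`, and the peptide passed to `find_frame` would be the known NodA or NodB fragment from a reference such as Dg1; products sequenced from the reverse primer would first be passed through `reverse_complement`.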

Fig. 5

Neighbor-joining phylogenetic tree based on glnA gene sequences. Only bootstrap and probability values larger than 50% are shown. Frankia strains or microsymbionts with nod genes, either present in their genomes or detected by PCR-sequencing analysis, are marked in bold

Conclusions

We report here the genome sequence of a Frankia strain directly isolated from rhizosphere soil. The generated draft genome was assembled into 289 contigs corresponding to 8,032,739 bp, which falls within the size range of Frankia cluster 3 [33]. Bacterial factors triggering actinorhizal symbiosis remain enigmatic, since many sequenced Frankia genomes have revealed the absence of the canonical nod genes. It has been hypothesized that most Frankia strains use a novel nod-independent pathway for the infection process of actinorhizal plants. In contrast, the two Candidatus Frankia genomes, Dg1 and Dg2, contain canonical nod genes [32, 36]. Here we provide the first proof of the presence of nod genes in the genome of a cultivated Frankia strain. In addition, a PCR-sequencing approach suggested that nod genes are widespread only in C. americanus microsymbionts. This situation is similar to that of legume symbionts, for which two nodulation pathways are described: the well-studied nod-dependent pathway and an alternative nod-independent pathway. The majority of rhizobia use the nod-dependent pathway, while some photosynthetic [34] and non-photosynthetic [35] bradyrhizobia use the alternative nod-independent pathway. Moreover, some rhizobia use both pathways, and the use of the nod-independent pathway seems to depend strongly on the host species rather than on the presence or absence of nod genes in a given bradyrhizobial genome [44]. For Frankia, almost all host plants are infected through the nod-independent pathway, while the nod-dependent process may be present only in unstudied actinorhizal species such as members of the genus Ceanothus.