Introduction

The ‘Mycoplasma mycoides cluster’ comprises five species/subspecies: Mycoplasma mycoides subsp. mycoides, Mycoplasma leachii, Mycoplasma mycoides subsp. capri, Mycoplasma capricolum subsp. capripneumoniae and Mycoplasma capricolum subsp. capricolum [1, 2]. Among them, Mycoplasma mycoides subsp. mycoides, the causative agent of contagious bovine pleuropneumonia (CBPP), is an economically very important bacterial pathogen of cattle in sub-Saharan Africa. CBPP was first described in Europe as early as 1773 [3], and the causative Mycoplasma was cultivated and characterized in Europe in 1898 [4]. The disease has been shown to have spread from Europe to North America, Africa, Australia and Asia via livestock movements. Currently the disease is endemic and widespread in sub-Saharan Africa, from western through central to eastern Africa. In Europe the last outbreaks were reported in Spain, Italy, Portugal and France in the 1980s and 1990s [5]. Compared with the other members of the ‘Mycoplasma mycoides cluster’, with the exception of Mycoplasma capricolum subsp. capripneumoniae, Mycoplasma mycoides subsp. mycoides shows limited sequence diversity, probably due to its recent emergence about 300 years ago [5, 6].

Currently the complete genomes of only three Mycoplasma mycoides subsp. mycoides strains have been deposited in GenBank: the type strain PG1 [7], which is often used in laboratories but is considered to be avirulent; the Australian outbreak strain Gladysdale [8]; and the European outbreak strain 57/13 [9]. PG1 has been shown to differ genetically and phenotypically from field strains of Mycoplasma mycoides subsp. mycoides, showing attenuated cytotoxicity and reduced adhesion to bovine epithelial cells [5, 10, 11], most likely because of the multiple in vitro passages this strain underwent before being deposited in strain collections. In particular, strain PG1 contains two large 24 kb repeats, while 27 field strains isolated from three different continents contain only one [11]. Strain Gladysdale was isolated in Australia around 1953 [12]. Strain 57/13 was isolated in Italy in 1992. None of these three strains, therefore, represents a virulent African strain. The genetic diversity of Mycoplasma mycoides subsp. mycoides strains has been reported to be highest in Africa [5], where the disease is present in many countries of the sub-Saharan region [13]. We sequenced and annotated the genomes of two virulent African strains, Afadé and B237, which are frequently used as challenge strains in animal experiments [14–18]. The strains were re-isolated directly from experimentally infected animals and were not passaged further, apart from filter-cloning to promote uniformity before genomic DNA was isolated for sequencing. The genomic sequence information from this work will contribute to comparative genomic analyses and thereby to the characterization of the core and pan genome of the ‘Mycoplasma mycoides cluster’, and of Mycoplasma mycoides subsp. mycoides in particular. The genomic information will also be useful for downstream ‘omics’ applications, such as proteomics, transcriptomics and reverse vaccinology approaches.

Organism information

Classification and features

Mycoplasma mycoides subsp. mycoides is an obligate parasite that resides in the respiratory tract of animals. It is a non-motile, non-sporulating bacterium that lacks a cell wall and has a pleomorphic shape. Transmission electron microscopy images were generated for both strains, Afadé and B237 (Fig. 1). Cell pellets were fixed in 150 mM HEPES, pH 7.35, containing 1.5 % formaldehyde and 1.5 % glutaraldehyde for 30 min at room temperature and then at 4 °C overnight. After dehydration in acetone and embedding in EPON, ultrathin sections of 40 nm were mounted on formvar-coated copper grids, poststained with uranyl acetate and lead citrate [19] and observed in a Morgagni TEM (FEI). Images were taken with a side-mounted Veleta CCD camera.

Fig. 1

Typical fried egg-shaped colony of Mycoplasma. a Afadé, b B237. Transmission electron microscopy of Afadé (c) and B237 (d). Ultrathin sections reveal cell bodies (CB) and thin protrusions (black arrowheads, top left). Multiple protrusions can originate from one cell body (top right). Multiple constrictions along protrusions lead to a necklace-like appearance in some regions (bottom left, white arrowheads). Branching along the protrusions occurs (bottom right, asterisk).

Interestingly, transmission electron microscopy revealed protrusions resembling the attachment organelle observed in Mycoplasma pneumoniae [20–23]. The physiological function of these protrusions and of the branching phenotype remains to be defined in future studies. The general features of Mycoplasma mycoides subsp. mycoides strains Afadé and B237 are presented in Table 1 and Appendix: Table 6.

Table 1 Classification and general features of Mycoplasma mycoides subsp. mycoides strains Afadé and B237

We previously confirmed that both strains, Afadé and B237, are Mycoplasma mycoides subsp. mycoides using phenotypic growth characteristics, species-specific PCR and a Multi-Locus Sequence Typing (MLST) method [5, 6]. Mycoplasma mycoides subsp. mycoides strain Afadé originates from Northern Cameroon and was isolated at the Farcha laboratories in Chad in 1965 [24]. It has since been used in several experimental infections [14–18]. The filter-cloned strains used for this sequence analysis were re-isolated from experimentally infected cattle [14, 17] that showed severe clinical signs and pathomorphologic lesions typical of CBPP. Mycoplasma mycoides subsp. mycoides strain B237 was originally isolated in 1997 in Thika, Kenya, by the Kenya Agricultural Research Institute (KARI).

Figure 2 shows a phylogenetic tree based on 16S rRNA gene sequences. The 16S rRNA gene sequences from Mycoplasma mycoides subsp. mycoides strains Gladysdale, 57/13 and PG1, Mycoplasma mycoides subsp. capri strains 95010 and GM12, Mycoplasma capricolum subsp. capricolum strain ATCC27343, Mycoplasma capricolum subsp. capripneumoniae strain M1601, Mycoplasma leachii strains 99/014/6 and PG50, and Mycoplasma feriruminatoris strain G5847 (Accession numbers: CP002107, CP010267, NC_005364, NC_015431, NZ_CP001668, NC_007633, CM001150, NC_017521, ANFU01000033, NC_014751, respectively) were retrieved from GenBank. All Mycoplasma genome sequences retrieved from GenBank contain two copies of the 16S rRNA gene, with the exception of Mycoplasma feriruminatoris, where two copies are present but are not resolved in the draft genome [25].

Fig. 2

Phylogenetic tree based on 16S rRNA sequences showing the relationship between Mycoplasma mycoides subsp. mycoides strains Afadé and B237 and members of the ‘Mycoplasma mycoides cluster’ and their closest relatives. The alignment length was 1,439 bp. The tree was generated with PhyML v.3.0 [48] using the HKY85 model of evolution and 1,000 bootstrap replicates. Only bootstrap values over 500 are shown.

Genome sequencing information

Genome project history

Sequencing and quality assurance were performed at the Lausanne Genomic Technologies Facility, Center for Integrative Genomics, University of Lausanne, Switzerland. Assembly and finishing were done at the Institute for Genome Sciences and the International Livestock Research Institute. Functional annotation was produced by the Institute for Genome Sciences Analysis Engine [26] (http://www.igs.umaryland.edu/research/bioinformatics/analysis/index.php). Table 2 presents the project information and its association with MIGS version 2.0 compliance [27].

Table 2 Project information

Growth conditions and genomic DNA preparation

Both strains were grown at 37 °C in PPLO medium (Difco, Cat. No. 255420) supplemented with 20 % heat-inactivated horse serum (Sigma, Cat. No. H1138), 0.5 % glucose, 0.03 % penicillin G, 20 mg/ml thallium acetate and 0.9 g/L yeast extract.

Liquid cultures of Mycoplasma were filter-cloned using a 0.22 μm filter to disrupt possible cell aggregates. A ten-fold serial dilution (1/10 to 1/10,000,000,000) was made immediately and 50 μl was plated on PPLO agar.
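The dilution series described above can be written out as a short calculation. The following Python sketch assumes ten-fold steps (consistent with the stated range of 1/10 to 1/10,000,000,000) and uses a purely hypothetical stock titre to estimate how many colonies a 50 μl plating would yield at a given dilution; neither the function names nor the titre come from the original protocol.

```python
from fractions import Fraction


def tenfold_series(steps: int = 10) -> list[Fraction]:
    """Cumulative dilution factors 1/10, 1/100, ..., 1/10**steps."""
    return [Fraction(1, 10 ** k) for k in range(1, steps + 1)]


def expected_colonies(stock_cfu_per_ml: float,
                      dilution: Fraction,
                      plated_ul: float = 50.0) -> float:
    """Expected colony count when plated_ul of a diluted culture is plated.

    stock_cfu_per_ml is a hypothetical titre of the undiluted culture.
    """
    return stock_cfu_per_ml * float(dilution) * plated_ul / 1000.0


if __name__ == "__main__":
    # Hypothetical stock of 1e9 CFU/ml, for illustration only.
    for factor in tenfold_series():
        print(f"{factor}: {expected_colonies(1e9, factor):.3g} colonies per 50 ul")
```

Plates from the dilutions that give well-separated colonies are the ones from which a single colony can be picked.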

After 3–4 days of incubation at 37 °C, a single colony was picked and used to inoculate 4 ml of PPLO medium, which was aliquoted and stored at −80 °C.

Filter-cloned Mycoplasma were grown overnight in 100 ml PPLO medium at 37 °C. Before entering the stationary growth phase, the culture was centrifuged at 2,862 g for 1 h, and the pellet was resuspended in 2.5 ml of TNE buffer (0.01 M Tris–HCl, pH 8.0; 0.01 M NaCl; 0.01 M EDTA). Subsequently, 50 μl SDS (10 %) and 50 μl Proteinase K (20 mg/ml) were added and the tubes were incubated at 37 °C for 2 h. After addition of 26 μl of 100 mM PMSF, the tubes were incubated for 15 min at room temperature; 25 μl of RNase A (10 mg/ml) was then added, followed by incubation at 37 °C for 1 h. Sodium acetate and buffer-saturated phenol were added (25 μl of 1.5 M NaOAc, pH 5.2, and 2,250 μl of phenol), and the solution was mixed by vortexing and centrifuged at 15,870 g for 10 min. The top phase was transferred to a new tube and mixed with phenol:chloroform:isoamyl alcohol (25:24:1), followed by another centrifugation at 15,870 g for 10 min, and the top phase was again transferred to a new tube. Finally, the DNA was precipitated with isopropanol, washed with 70 % ethanol, dried and resuspended in 200 μl of 2 mM Tris, 0.2 mM EDTA.

Genome sequencing and assembly

The genome sequence of Mycoplasma mycoides subsp. mycoides strain Afadé was generated using a combination of Pacific Biosciences RS (PacBio) sequencing (65,280 reads, 2,853 bp average read length) and Illumina MiSeq sequencing (7,078,010 reads, 295 bp average read length), the latter down-sampled to cover 50 times the expected genome size. The sequencing errors of the long PacBio single-molecule reads were corrected with the shorter, high-accuracy Illumina reads using the Celera Assembler (CA) PacBio correction module PBcR (version 7.0, [28]). The resulting corrected PacBio reads were randomly subsampled to 25-fold genome coverage and assembled using CA (version 7.0, [29]), yielding 18 contigs with a total size of 1,278,455 bp. Eight of these contigs make up the draft genome of strain Afadé.

The whole genome sequence of Mycoplasma mycoides subsp. mycoides strain B237 was obtained using PacBio sequencing (59,775 reads, 2,674 bp average read length). PacBio reads were corrected with the PBcR self-correction module. Corrected reads, randomly subsampled to 25-fold genome coverage, were assembled with CA and yielded 2 contigs with a total size of 1,208,895 bp. One long contig comprised the entire genome and contained the other contig (5,091 bp) in a repeat region. The final genome sequences had 24-fold coverage for Afadé and 23-fold coverage for B237.
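The coverage figures quoted above follow from simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome size. The Python sketch below illustrates this and the fraction of reads to keep when subsampling to a target coverage. The expected genome size of 1.2 Mb is an assumption made for illustration (the text does not state the value used), and the helper names are hypothetical rather than part of the Celera Assembler or PBcR.

```python
def fold_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_len / genome_size


def subsample_fraction(current_cov: float, target_cov: float) -> float:
    """Fraction of reads to keep so that coverage drops to target_cov.

    Returns 1.0 when the data set is already at or below the target.
    """
    return min(1.0, target_cov / current_cov)


if __name__ == "__main__":
    ASSUMED_GENOME_SIZE = 1_200_000  # bp, illustrative assumption

    # Raw PacBio read statistics as reported for strain Afade.
    raw_cov = fold_coverage(65_280, 2_853, ASSUMED_GENOME_SIZE)
    print(f"raw PacBio coverage: {raw_cov:.1f}x")
    print(f"fraction to keep for 25x: {subsample_fraction(raw_cov, 25):.3f}")
```

With these inputs the raw PacBio data correspond to roughly 155-fold coverage. Note that the subsampling in the text was applied to the corrected reads, whose number and length are not reported, so the fraction printed here is only indicative.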

The contigs of both assemblies were aligned against the two Mycoplasma mycoides subsp. mycoides reference genomes of Gladysdale [8] and PG1 [7] available in GenBank (CP002107, NC_005364) using MUMmer [30]. All small contigs (<15,000 bp) aligned to regions already covered by larger contigs. On closer inspection, most of these contigs aligned to a previously characterized 26 kb region [11], consisting of a tandem repeat of three 8 kb segments interspersed with transposon elements. Due to its repetitive nature, this 26 kb region was not clearly resolved during the assembly process. To resolve part of it, we designed unique primer pairs and amplified two long-range PCR fragments of 4,800 and 5,200 bp, respectively. For each genome, both Sanger-derived sequences were aligned to the assembled genomes before and after polishing with multiple iterations of the PacBio Quiver algorithm (version 0.9.0 [31]). We verified that, in the regions covered by the Sanger sequences, all substitution mismatches were resolved by Quiver. A few indels that remained in the post-polishing alignment were not corrected by Quiver and were fixed manually.
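The redundancy check described above, in which small contigs align to regions already covered by larger ones, can be sketched as an interval-containment test. The Python example below is a simplified illustration and not the authors' procedure: it assumes that alignment coordinates on a single reference have already been extracted (for instance from MUMmer output) into plain tuples, and the `Hit` record and function names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    contig: str
    length: int     # contig length in bp
    ref_start: int  # 1-based, inclusive reference coordinate
    ref_end: int    # 1-based, inclusive reference coordinate


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redundant_small_contigs(hits, max_small_len=15_000):
    """Names of small contigs whose hits all lie inside large-contig coverage."""
    large_cov = merge_intervals(
        (h.ref_start, h.ref_end) for h in hits if h.length >= max_small_len
    )
    small_names = {h.contig for h in hits if h.length < max_small_len}
    redundant = set()
    for name in small_names:
        own_hits = [h for h in hits if h.contig == name]
        if all(
            any(s <= h.ref_start and h.ref_end <= e for s, e in large_cov)
            for h in own_hits
        ):
            redundant.add(name)
    return redundant
```

A contig that extends beyond the region covered by the large contigs is not flagged, so it would still be inspected manually.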

Genome annotation

Open reading frames (ORFs) were predicted using Prodigal 2.50 [32]. Functional annotation was produced by the Institute for Genome Sciences Analysis Engine [26].

We annotated the small contigs overlapping bigger ones described above separately and noticed that these contigs had more ambiguous characters and ORFs that were on average half the size of the corresponding ORFs in larger contigs (498 nt versus 920 nt). This was due to insertions and deletions. We therefore excluded the small contigs from the assemblies and report 1 contig for Mycoplasma mycoides subsp. mycoides strain B237 and 8 contigs for Mycoplasma mycoides subsp. mycoides strain Afadé.

We also reannotated the genomes of Mycoplasma mycoides subsp. mycoides strain PG1, Mycoplasma mycoides subsp. mycoides strain Gladysdale and Mycoplasma mycoides subsp. mycoides strain 57/13 using the same Engine, for ease of comparison.

Genome properties

The genomes of Mycoplasma mycoides subsp. mycoides strains Afadé and B237 have a total size of 1,190,241 bp and 1,203,804 bp, respectively. The GC-content of both genomes is 23.9 %. Both strains have two copies of the 12 kb and 13 kb repeats described in [11]; the difference in size between the two genomes is therefore not due to a missing copy in Afadé.

A total of 1,124 ORFs as well as 30 tRNAs and 2 copies of the 23S, 16S and 5S rRNA operons were predicted. The average gene length is 920 bp and 927 bp for Afadé and B237, respectively. The coding density of the genome is 86.7 %. Signal peptides were detected using PSORTb v3.0 [33] and LipoP v1.0 [34]. Transmembrane helices were detected with the TMHMM server v2.0 [35, 36]. CRISPR repeats were searched with the CRISPR Finding program online. The properties and the statistics of both genomes are summarized in Tables 3, 4 and 5.
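For readers who wish to recompute summary statistics such as GC-content and coding density from a sequence and its annotation, a minimal Python sketch is given below. It is an illustration under stated assumptions, not the pipeline used here: the sequence is passed as a plain string, genes are given as 1-based inclusive (start, end) coordinates on a linear sequence, and overlapping genes are counted only once. The function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G+C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


def coding_density(features, genome_length: int) -> float:
    """Fraction of the genome covered by at least one gene.

    features: iterable of (start, end) pairs, 1-based and inclusive.
    Overlapping features are merged so shared bases count once.
    """
    covered = 0
    current_end = 0
    for start, end in sorted(features):
        start = max(start, current_end + 1)  # skip bases already counted
        if end >= start:
            covered += end - start + 1
            current_end = end
    return covered / genome_length
```

As a rough cross-check, multiplying the number of ORFs by the average gene length and dividing by the genome size gives a value close to, but not identical with, the reported coding density, since that shortcut ignores overlaps and non-ORF features.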

Table 3 Summary of the B237 and Afadé genomes: one circular chromosome
Table 4 Nucleotide content and gene count levels of the genome
Table 5 Number of genes associated with the 25 general COG functional categories

Insights from the genome sequence

The genomes of the two African strains Mycoplasma mycoides subsp. mycoides Afadé and B237 were compared to the three previously sequenced Mycoplasma mycoides subsp. mycoides strains Gladysdale, PG1 and 57/13 using CloVR and Sybil [37, 38]. Figure 3 shows a synteny gradient of the aligned genomes. Although all genomes contain a high number of transposable elements, no major rearrangements were observed. These results fit well with the very recent emergence of the pathogen, estimated to be as young as 300 years, and the narrow host specificity of Mycoplasma mycoides subsp. mycoides [5].

Fig. 3

Synteny gradient display for the four available Mycoplasma mycoides subsp. mycoides genomes, using PG1 as a reference. A white bar in the reference denotes a region with no gene annotation. The matching genes are colored based on the relative position in their respective genomes (yellow for the beginning and blue for the end). Genes shown in black are part of a paralogous cluster in their respective genome and therefore do not have a single native location. The GC-content in % is plotted for the reference genome.

The core genome length is 1,148,950 bp. A total of 773 SNPs were identified when comparing the five core genomes. Only 72 SNPs distinguish B237 from Afadé, and 266 SNPs separate the Australian and European strains Gladysdale and 57/13. PG1 is the most distant from the other four genomes, with 399, 483, 465 and 425 SNPs when compared to Afadé, Gladysdale, 57/13 and B237, respectively. This confirms previous reports [5].
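Pairwise SNP counts of the kind reported above can be derived from a core-genome alignment by counting the columns at which two strains differ. The Python sketch below shows the idea on already-aligned, equal-length sequences held in a dictionary. It is a simplified illustration rather than the CloVR/Sybil workflow: gaps and ambiguous bases are ignored, and the function names are hypothetical.

```python
from itertools import combinations

VALID = set("ACGT")


def pairwise_snp_counts(aligned: dict[str, str]) -> dict[tuple[str, str], int]:
    """Count substitution differences for every pair of aligned sequences."""
    if len({len(s) for s in aligned.values()}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    counts = {}
    for a, b in combinations(sorted(aligned), 2):
        counts[(a, b)] = sum(
            1
            for x, y in zip(aligned[a].upper(), aligned[b].upper())
            if x in VALID and y in VALID and x != y
        )
    return counts


def variable_sites(aligned: dict[str, str]) -> int:
    """Number of alignment columns with more than one unambiguous base."""
    columns = zip(*(s.upper() for s in aligned.values()))
    return sum(1 for col in columns if len(set(col) & VALID) > 1)


if __name__ == "__main__":
    # Toy alignment, not real data.
    toy = {"s1": "ACGTAC", "s2": "ACGTAT", "s3": "TCGTAT"}
    print(pairwise_snp_counts(toy))
    print(variable_sites(toy))
```

The total number of variable sites across all strains is generally smaller than the sum of the pairwise counts, because a single variable column contributes to several pairs.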

We looked for homologs of the cytadhesin proteins P1, P30, P40, P65, P90, HMW1 and HMW3 from Mycoplasma pneumoniae in the Afadé and B237 proteomes using blastp. No significant hits were found for any of the proteins. Other proteins might be involved in the adhesion process and will need to be identified and characterized.
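A homology screen of this kind is often evaluated by filtering blastp results on the E-value. The Python sketch below parses BLAST tabular output (the standard 12-column format 6: query, subject, percent identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E-value, bit score) and keeps hits below a threshold. The threshold of 1e-5 is an assumption chosen for illustration, since the significance cut-off used in this study is not stated, and the identifiers in the usage example are made up.

```python
def significant_hits(lines, max_evalue: float = 1e-5):
    """Return (query, subject, evalue, bitscore) for hits passing the cut-off.

    lines: iterable of tab-separated BLAST format 6 records.
    max_evalue: illustrative threshold, not taken from the study.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 columns, got {len(fields)}")
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue <= max_evalue:
            hits.append((fields[0], fields[1], evalue, bitscore))
    return hits


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        "P1\tprotA\t25.0\t100\t70\t5\t1\t100\t1\t100\t0.5\t30.1",
        "P30\tprotB\t60.0\t200\t80\t0\t1\t200\t1\t200\t1e-20\t250.0",
    ]
    print(significant_hits(example))
```

An empty result for every query, as reported here, means that no subject protein passed the chosen cut-off; it does not exclude distant homologs that sequence similarity alone cannot detect.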

Conclusions

As expected, the genomes of the two African strains differ from those of the laboratory type strain PG1, the European outbreak strain 57/13 and the Australian outbreak strain Gladysdale. These genome sequences should therefore be included in subsequent genome comparisons and ‘omics’ studies. The presence of protrusions and branching phenotypes in these two strains, together with the absence of genes encoding proteins similar to those characterized in Mycoplasma pneumoniae, suggests that other, possibly novel, proteins encoded in the Mycoplasma genomes are responsible for the development of protrusions and branching.