Objective

Spiroplasmas are wall-less, Gram-positive bacteria with motile helical cells that can infect numerous organisms, including plants, insects, mites, ticks, crustaceans, and mammals. Economically important spiroplasmas in the U.S.A. are plant pathogens such as Spiroplasma citri, causal agent of citrus stubborn disease (CSD) [1] and brittle root of horseradish (Armoracia rusticana) [2], and S. kunkelii, causal agent of corn stunt [3]. In California, CSD is endemic [4] and can be a serious disease of citrus [5, 6]. Its incidence ranges from sporadic to high, depending on factors that include the abundance of the S. citri vector Neoaliturus tenellus (Baker), also known as Circulifer tenellus, the beet leafhopper (BLH); the proximity of citrus to annual crops and dichotomous hosts infected with S. citri that are also hosts of the BLH; and a hot, semi-arid climate and habitat [4, 7]. To identify genomic differences among S. citri populations, the bacterium was isolated from different hosts at different times and cultured, and its DNA was purified and subjected to PacBio sequencing. Whole-genome sequences were assembled for five strains, adding to the two S. citri genome sequences already in the public database [8, 9]. The genome sequences reported here will be used for comparative genomics and to better understand the etiology, relationships, and evolution of the spiroplasmas. In addition, these genomic data will be used to improve on previously published detection assays for S. citri [4, 10, 11].

Data description

Spiroplasma citri was isolated and grown in LD8 medium [12], triple cloned, and stored at −80 °C. S. citri strain C189 was established in 1972 from a Navel orange tree (Citrus sinensis (L.) Osb.) [1] in Riverside, California, by grafting to Madame Vinous sweet orange seedlings, and has been maintained in planta at the Citrus Clonal Protection Program, University of California, Riverside, California. S. citri strain BR-12 was obtained in 1981 from horseradish in Collinsville, Illinois [2]. S. citri strain LB319 was isolated in 2007 from a Spring Navel orange tree in Ducor, California. S. citri strain BLH-13 was isolated in 2010 from BLH collected from parsley (Petroselinum crispum) in Mettler, California. S. citri strain BLH-MB was isolated in 2011 from BLH collected from Russian thistle (Salsola tragus) in Parlier, California.

Cultures were re-established for this study, and total genomic DNA was extracted by the CTAB method [13]. Sequencing was performed on the PacBio (Menlo Park, CA, USA) RS II platform using single molecule real-time (SMRT) cell v3 with sequencing polymerase (P) and chemistry (C) 4.0 v2 (P6C4) [9]. The library was prepared following the PacBio procedure "Preparing >30 kb libraries using SMRTbell Express Template Preparation Kit" according to the manufacturer’s specifications. Adapter screening and quality filtering of raw sequencing data were performed using SMRT Analysis (PacBio) with default settings. Sequencing of the S. citri genomes yielded between 40,816 and 122,010 reads per strain, totaling between 5.4 Mb and 1.7 Gb. N50 values ranged from 17,795 to 20,974 bp.
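The N50 statistic reported above can be illustrated with a minimal Python sketch. This is not the authors' pipeline (the values here came from PacBio software); the function name `n50` and the example lengths are hypothetical and serve only to show how the statistic is defined.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all sequenced bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Reached only when no lengths were supplied.
    raise ValueError("lengths must not be empty")


if __name__ == "__main__":
    # Hypothetical read lengths in bp, for illustration only.
    example_lengths = [2000, 3000, 4000, 5000, 6000]
    print(n50(example_lengths))
```

Because the statistic is weighted by bases rather than by read count, a small number of long reads can raise the N50 well above the median read length.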

For each of the five S. citri strains, filtered subreads generated by PacBio were assembled into contigs using Canu 1.8 [14]. To check for contig circularity, ~500 bp segments from each end of a contig were used as queries in BLASTn searches of the PacBio read data. Reads that connected both ends were used to close the contig. The chromosome or plasmid status of each contig was further confirmed by BLASTn analyses against the GenBank database. The S. citri chromosome was circularized for all five strains and ranged from 1,576,550 to 1,742,208 bp, with an average coverage of 59-fold and an average G+C content of 25.4%. Total genome size ranged from 1,611,714 to 1,832,173 bp for strains from plants and from 1,968,976 to 2,155,613 bp for strains from the BLH. Characterization of extrachromosomal DNA identified one or two plasmids in strains from plant hosts and eight or nine plasmids in strains from the BLH. The genome sequence data have been deposited in the NCBI database under accession numbers CP046368-CP046373 and CP047426-CP047446 (Table 1; BioProject; Data set 1–5). Each contig was annotated with the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [15], which predicted 38 RNA genes for all strains and between 1,908 and 2,556 coding sequences. These data extend the sequence database of S. citri and should help to improve detection assays for S. citri and provide insight into the evolution of plant-pathogenic spiroplasmas.
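As a rough illustration of two steps described above, the following Python sketch extracts the terminal segments of a contig (the queries one would then pass to BLASTn) and computes G+C content. It is a sketch under stated assumptions, not the authors' code: the function names `contig_ends` and `gc_content` and the toy sequence are hypothetical, and the BLASTn search itself is not reproduced here.

```python
def contig_ends(sequence, end_length=500):
    """Return the first and last `end_length` bases of a contig.

    These two segments are the kind of query used to look for
    reads that span both ends of a candidate circular contig.
    """
    if len(sequence) < 2 * end_length:
        raise ValueError("contig is too short for non-overlapping ends")
    return sequence[:end_length], sequence[-end_length:]


def gc_content(sequence):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Hypothetical toy contig, for illustration only.
    toy_contig = "A" * 500 + "ATGC" * 50 + "T" * 500
    left, right = contig_ends(toy_contig)
    print(len(left), len(right))
    print(round(gc_content(toy_contig), 1))
```

Ambiguous bases such as N are excluded from the denominator in this sketch; other tools may count them differently, so small discrepancies in reported G+C percentages are possible.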

Table 1 Overview of data files/data sets

Limitations

  • Contigs that did not clearly associate with the chromosome were designated as putative plasmids.

  • Plasmids that were not circularized were assumed to be linear.