Introduction

Most cyanobacteria use chl a as their sole magnesium tetrapyrrole and the common phycobilisome as the bulk LHC. The prochlorophytes are a unique pigment subgroup of the phylum Cyanobacteria: besides chl a, they contain other chls (b; 2,4-divinyl a; 2,4-divinyl b; f; g) as antenna pigments and do not depend on PBP-containing photoreceptors [1]. Prochlorophytes with these distinguishing features are few and comprise three marine unicellular genera (Prochloron, Prochlorococcus, Acaryochloris) and one freshwater filamentous genus (Prochlorothrix). Unicellular Prochlorococcus spp. dominate the phytoplankton of oligotrophic regions of the world’s oceans and are of exceptional importance for global primary productivity [2]. Prochloron sp. and Acaryochloris sp. were isolated in symbiotic association with colonial ascidians [3, 4]. In contrast to the other prochlorophytes, P. hollandica is characterized by low abundance and patchy distribution [5]; more detailed genome analysis could help explain the ecophysiological background of this microorganism.

The genus Prochlorothrix is represented by two cultivable free-living species, Prochlorothrix hollandica and Prochlorothrix scandica, as well as a number of uncultured strains known from environmental 16S rRNA sequences [6]. The distinction between P. hollandica and P. scandica is based predominantly on molecular-genetic characters: DNA reassociation of less than 30 % and a difference in DNA GC content (mol%) of more than 5 % [5].

P. hollandica was isolated from a water bloom in Lake Loosdrecht (near Amsterdam, Netherlands) and validly published under the rules of the Bacteriological Code with the type strain CCAP 1490/1T [7, 8]. Strain CCAP 1490/1 was generously supplied in 1994 by Dr. Hans C.P. Matthijs (Amsterdam University) and has since been stored as CALU1027 at the Collection of Cultures of Algae and Microorganisms of St. Petersburg State University, CALU [9]. Prochlorothrix hollandica is also maintained as different strains under the collection indexes CCMP34, CCMP682, NIVA-5/89 and SAG10.89, and the strain PCC9006 has been reported as well [10]. Another filamentous strain, Prochlorothrix scandica, was isolated from the phytoplankton of Lake Mälaren (Sweden) and is maintained as NIVA-8/90 and CALU1205 [11].

Among prochlorophytes, the first genomes to be sequenced were the small genomes of unicellular Prochlorococcus sp. strains from the LL- and HL-clades [2, 12, 13]. The four sequenced genomes of symbiotic Prochloron didemni P1-P4 are second in number [14]. Acaryochloris marina genomes were sequenced for the strains CCME5410 and MBIC11017 [15], but only one paper has mentioned a P. hollandica PCC9006 genome, sequenced by Shih et al. in the context of improving the global cyanobacterial phylogeny [16]. Here we report that the genomic DNA of P. hollandica CCAP 1490/1T (CALU1027) was sequenced and the resulting draft genome was annotated to support investigations in the comparative genomics of cyanobacteria and prochlorophytes.

Organism information

Classification and features

A representative genomic 16S rDNA sequence of strain P. hollandica CCAP 1490/1T (CALU1027) was compared with those of other prochlorophytes and with cyanobacterial type strain sequences obtained from GenBank. The tree was reconstructed using the neighbor-joining method with the Kimura 2-parameter substitution model in MEGA 6.0 [17, 18]. The phylogenetic position of P. hollandica CALU1027 is shown in Fig. 1. Representatives of the genus Prochlorothrix are morphologically similar to other filamentous non-heterocystous cyanobacteria (Subsection III, Oscillatoriales) [19]. In particular, P. hollandica CALU1027 produces long (>300 μm), straight, unbranched, non-motile trichomes (Fig. 2). Individual cells are 1.6 ± 0.1 μm wide and 11.8 ± 0.9 μm long, which matches previously reported data [2, 4]. The opaque polar aggregates of gas vesicles resemble those of the Pseudanabaena type, but P. hollandica trichomes have shallower intercellular constrictions (1/5 − 1/8 of the cell diameter). Trichomes multiply by occasional breakage without forming hormogonia. Neither a sheath nor a mucilaginous capsule visible by light or electron microscopy was ever observed; the cell envelope shows a typical Gram-negative triple-layer contour [5]. A brief survey of P. hollandica CALU1027 properties according to the MIGS recommendations [20] is given in Table 1.
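For readers unfamiliar with the substitution model named above, the following minimal sketch computes a pairwise Kimura 2-parameter distance, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. It is an illustration only and not the MEGA implementation; the function name `k2p_distance` is hypothetical, the sequences are assumed to be pre-aligned, and sites with gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Assumption: sequences are already aligned and of equal length.
    Sites where either base is not A, C, G or T are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the input that a neighbor-joining procedure would cluster into a tree.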

Fig. 1

Phylogenetic position of P. hollandica CALU1027 within cyanobacteria. GenBank accession numbers are indicated in parentheses. The numbers above branches indicate bootstrap support from 1000 replicates

Fig. 2

Light micrograph of P. hollandica CALU1027. Scale bar 10 μm

Table 1 Classification and general features of P. hollandica CALU1027

Genome sequencing information

Genome project history

The WGS project AJTX02 has been deposited at DDBJ/EMBL/GenBank under accession AJTX00000000 (20.02.2013) and updated, in this research, as Draft Genome Project AJTX00000000.2 (29.04.2015). The assembled contigs have been deposited in NCBI. The project information and its association with the MIGS are summarized in Table 2.

Table 2 Project information

Growth conditions and genomic DNA preparation

P. hollandica CALU1027 was grown in BG-11 medium [2]. The strain is a moderate mesophile that grows well at 20-22 °C under continuous illumination. For DNA isolation, cells were harvested by centrifugation and treated with 2 μg/mL Proteinase K in 0.1 M Tris-HCl (pH 8.5), 1.5 M NaCl, 20 mM Na2EDTA, and 2 % cetyltrimethylammonium bromide at 55 °C for 3-4 h. DNA was purified by a standard protocol of organic extraction and ethanol precipitation.

Genome sequencing and assembly

For genome sequencing, DNA was randomly fragmented using a Q800R sonicator system. After size selection, 500 bp DNA fragments were used to construct sequencing libraries, which were sequenced as 250 bp paired-end reads on the Illumina MiSeq platform according to the manufacturer’s protocol, yielding 3,679,738 read pairs. Reads were processed with Trimmomatic 0.32 [21]; 3,665,348 read pairs remained after filtering. These reads were assembled with SPAdes 3.5 [22]. From the resulting assembly, the P. hollandica CALU1027 contigs were selected and scaffolded with Contiguator 2.7.4 [23], using assembly GCF_000332315.1 as a reference. The draft genome of P. hollandica CALU1027 comprised about 5.5 Mbp in 286 contigs organized into 10 scaffolds; the contig N50 was 33,173 bp and the scaffold N50 was 1,244,169 bp (Table 3).
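The N50 statistic quoted above can be illustrated with a short sketch: it is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and are not taken from the CALU1027 assembly; assemblers and assessment tools may differ in details such as minimum-length cutoffs.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no sequence lengths given")

    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy contig lengths, for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)
```

Applied separately to the contig lengths and to the scaffold lengths of an assembly, the same function would give the two N50 values of the kind reported in Table 3.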

Table 3 Genome statistics

Genome annotation

Protein-coding genes in the draft genome assembly were predicted using the NCBI Prokaryotic Genome Annotation Pipeline (v.2.10), with the best-placed reference protein set annotation method and GeneMarkS+ [24]. The annotated features were genes, CDS, rRNA, tRNA, ncRNA, and repeat regions. Functional assignments of the predicted ORFs were based on a BLASTP homology search against the WGS of the phylogenetically closest cyanobacteria and the NCBI non-redundant database. Functional assignment was also performed with a BLASTP homology search against the Clusters of Orthologous Groups (COG) database [25, 26]. In total, 2855 genes (66 %) were assigned a putative function, and the remaining genes were annotated as either hypothetical proteins or proteins of unknown function.

Genome properties

The GC content of the P. hollandica CALU1027 genome was 54.56 %. Gene annotation revealed 3737 protein coding genes, 12 rRNA genes, and 44 tRNA genes. COG annotations of protein coding genes are presented in Table 4.
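As a minimal sketch of how a GC percentage of this kind can be derived from assembled sequences, the function below counts G and C among the unambiguous bases of all contigs. The name `gc_percent` and the toy sequences are hypothetical; excluding N and other ambiguity codes from the denominator is an assumption, and the tool that produced the reported value may handle them differently.

```python
def gc_percent(sequences):
    """GC content (%) over all unambiguous bases in the given sequences."""
    gc = 0
    acgt = 0
    for seq in sequences:
        s = seq.upper()
        g_c = s.count("G") + s.count("C")
        a_t = s.count("A") + s.count("T")
        gc += g_c
        acgt += g_c + a_t  # N and other ambiguity codes are excluded

    if acgt == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / acgt


# Hypothetical toy contigs, for illustration only.
toy_gc = gc_percent(["ATGC", "GGCCNN"])
```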

Table 4 Number of genes associated with general COG functional categories

Insights from the genome sequence

The assembly and analysis of the P. hollandica CALU1027 genome annotation revealed a repertoire of genes necessary for autonomous energy and substrate metabolism: 743 detected genes relevant to 129 metabolic pathways have orthologs in P. hollandica CALU1027 and other cyanobacteria (Table 5). Comparative genome analysis of P. hollandica CALU1027 with the filamentous heterocystous cyanobacterium Anabaena variabilis ATCC29413 and the unicellular prochlorophytes Prochlorococcus marinus CCMP1375 and Acaryochloris marina MBIC11017 revealed that the main differences lay in the metabolism of amino acids and carbohydrates, membrane transport and stress response systems (data not shown).

Table 5 Selected functional capacities

Chl a/b-containing Prochlorothrix and Prochloron were long considered to share a common ancestry with the chloroplasts of green algae and higher plants [27, 28]. However, P. hollandica and other prochlorophytes were shown to possess unique genes, pcbA − pcbC, which encode chl a/b-LHC apoproteins that are dissimilar to the CAB apoprotein superfamily of the chloroplast antenna [1930]. Notably, we found some PS II proteins that are commonly absent in cyanobacteria but usually present in the chloroplasts of green algae and higher plants: PsbW (6.1 kDa, nuclear encoded), PsbT (5 kDa, nuclear encoded), PsbR (10 kDa) and PsbQ (16 kDa, oxygen evolving complex protein). We also found that P. hollandica contains an ortholog of the hetR gene (a key regulator of heterocyst differentiation), although these filamentous non-heterocystous cyanobacteria lack nitrogenase and other prerequisites for diazotrophy [31, 32].

Conclusions

Study of the P. hollandica CCAP 1490/1T (CALU1027) genome is valuable for analysing the evolution of photosynthesis genes and for the comparative genomics of cyanobacterial adaptation.