Applied Microbiology and Biotechnology, Volume 77, Issue 1, pp 167–173

Polyphosphate kinase genes from full-scale activated sludge plants


  • K. D. McMahon
    • Department of Civil and Environmental Engineering, University of Wisconsin–Madison
  • Suzan Yilmaz
    • Department of Civil and Environmental Engineering, University of California at Berkeley
  • Shaomei He
    • Department of Civil and Environmental Engineering, University of Wisconsin–Madison
  • Daniel L. Gall
    • Department of Civil and Environmental Engineering, University of Wisconsin–Madison
  • David Jenkins
    • Department of Civil and Environmental Engineering, University of California at Berkeley
  • Jay D. Keasling
    • Department of Chemical Engineering, University of California at Berkeley
    • Department of Bioengineering, University of California at Berkeley
    • Physical Biosciences Division, Lawrence Berkeley National Laboratory
Environmental Biotechnology

DOI: 10.1007/s00253-007-1122-6

Cite this article as:
McMahon, K.D., Yilmaz, S., He, S. et al. Appl Microbiol Biotechnol (2007) 77: 167. doi:10.1007/s00253-007-1122-6


The performance of enhanced biological phosphorus removal (EBPR) wastewater treatment processes depends on the presence of bacteria that accumulate large quantities of polyphosphate. One such group of bacteria has been identified and named Candidatus Accumulibacter phosphatis. Accumulibacter-like bacteria are abundant in many EBPR plants, but not much is known about their community or population ecology. In this study, we used the polyphosphate kinase gene (ppk1) as a high-resolution genetic marker to study population structure in activated sludge. Ppk1 genes were amplified from samples collected from full-scale wastewater treatment plants of different configurations. Clone libraries were constructed using primers targeting highly conserved regions of ppk1, to retrieve these genes from activated sludge plants that did, and did not, perform EBPR. Comparative sequence analysis revealed that ppk1 fragments were retrieved from organisms affiliated with the Accumulibacter cluster from EBPR plants but not from a plant that did not perform EBPR. A new set of more specific primers was designed and validated to amplify a 1,100 bp ppk1 fragment from Accumulibacter-like bacteria. Our results suggest that the Accumulibacter cluster has finer-scale architecture than previously revealed by 16S ribosomal RNA-based analyses.


Keywords

Enhanced biological phosphorus removal; Activated sludge; Rhodocyclus; Accumulibacter phosphatis; Polyphosphate kinase


Introduction

The enhanced biological phosphorus removal (EBPR) activated sludge process is widely used to remove phosphorus from municipal wastewaters. The process is often studied in laboratory-scale sequencing batch reactors (SBRs) fed with synthetic feed containing acetate or propionate as carbon sources. Researchers on four continents employing similar SBR operating conditions have repeatedly enriched bacteria affiliated with a phylogenetically coherent group within the β-Proteobacteria (assessed using 16S ribosomal RNA [rRNA] analysis; Crocetti et al. 2000; Hesselmann et al. 1999; Liu et al. 2001; McMahon et al. 2002a). This group shares >96.4% 16S rRNA sequence identity and appears to be a member of the Rhodocyclaceae family. Fluorescence in situ hybridization (FISH) was used to document their abundance and confirm their role in EBPR in lab-scale SBRs (Crocetti et al. 2000; Hesselmann et al. 1999; Liu et al. 2001; McMahon et al. 2002a), pilot-scale systems (Lee et al. 2003), and full-scale wastewater treatment plants (WWTPs; He et al. 2006, 2007; Kong et al. 2004; Zilles et al. 2002b). Hesselmann et al. (1999) proposed a genus and species name for one member of this group: Candidatus Accumulibacter phosphatis. Members of the candidate genus have not yet been cultured in isolation and must be studied in lab-scale enrichment cultures or in full-scale WWTPs.

Recently, the metagenomes of two highly Accumulibacter-enriched EBPR sludges were sequenced, confirming the presence of two distinct Accumulibacter “species” in the two lab-scale SBRs (Garcia Martin et al. 2006). In this paper, we will discuss Accumulibacter-like bacteria using a hierarchical nomenclature of “Group-Cluster-Type-Clade,” which is similar to the Linnaean “Phylum-Genus-Species-Strain,” acknowledging the open debate about bacterial species definitions (e.g., Gevers et al. 2005) and neglecting levels between Phylum and Genus. We consider the “Accumulibacter cluster” to constitute the organisms generally detected by commonly used PAOMIX 16S rRNA-targeted FISH probes (Crocetti et al. 2000; Hesselmann et al. 1999). Bacteria responsible for EBPR can take up large quantities of inorganic phosphate (Pi) and store it as polyphosphate (polyP). In model organisms, polyphosphate kinase 1 (ppk1) is the primary enzyme thought to be responsible for polyphosphate synthesis from ATP (Ahn and Kornberg 1990; Akiyama et al. 1992; Tzeng and Kornberg 1998). Ppk1 genes were previously retrieved from Accumulibacter-like bacteria cultivated in an acetate-fed SBR in Berkeley, CA, USA, using degenerate PCR primers (McMahon et al. 2002a). In that study, four ppk1 genotypes were discovered, two of which (Types I and II) appeared to be derived from Accumulibacter-like bacteria, sharing a maximum of 86% nucleotide identity. Unlike the 16S rRNA gene, the ppk1 gene appears to be a powerful genetic marker for revealing finer-scale population structure within co-occurring groups of Accumulibacter-like organisms.

In the present study, we recovered ppk1 gene fragments from four full-scale WWTPs in the San Francisco Bay Area, CA, USA (Table 1) to determine if these comparatively more diverse activated sludge communities contained Accumulibacter-like ppk1s similar to those observed in SBRs. Two of the plants used high rate anaerobic selectors, one was achieving complete biological nitrogen and phosphorus removal using an anaerobic-anoxic-aerobic configuration, and one plant did not have an anaerobic zone. The Accumulibacter-like ppk1s retrieved were used to design new PCR primers specific for the Accumulibacter cluster. We verified the primer set specificity and ability to amplify ppk1 gene fragments from five additional full-scale WWTPs located throughout the USA.
Table 1

Treatment plant characteristics

Treatment plant | Location | Configuration at time of sampling | Size^a (mgd) | SRT^b (days)
Oro Loma Sanitary District (OL) | San Leandro, CA | High rate anaerobic selector (anaerobic/aerobic) | |
Contra Costa County Sanitary District (CC) | Martinez, CA | High rate anaerobic selector (anaerobic/aerobic) | |
San Jose / Santa Clara Water Pollution Plant (SJ) | Alviso, CA | Four-stage anoxic/aerobic with step feed | |
City of Las Vegas Water Pollution Control Facility (LV) | Las Vegas, NV | Anaerobic/anoxic/aerobic (MUCT)^c | |
Madison Metropolitan Sewerage District Nine Springs Plant (NS) | Madison, WI | Anaerobic/anoxic/aerobic (MUCT) | |
Hampton Roads Sanitary District Nansemond Plant (NAN) | Suffolk, VA | Anaerobic/anoxic/aerobic (VIP) | |
Hampton Roads Sanitary District Virginia Initiative Plant (VIP) | Norfolk, VA | Anaerobic/anoxic/aerobic (VIP) | |
Durham Advanced Wastewater Treatment Facility (DUR) | Tigard, OR | Anaerobic/anoxic/aerobic (A2O) | |
East Bay Municipal Utilities District (EB) | Oakland, CA | Conventional activated sludge with pure oxygen aeration | |
City of Oshkosh Wastewater Treatment Plant (OSK) | Oshkosh, WI | Conventional activated sludge | |

^a Average actual flow treated in millions of gallons per day (mgd)

^b Solids retention time (SRT)

^c Modified University of Cape Town (MUCT)

Materials and methods

Full scale activated sludges

Activated sludge samples were taken from the mixed liquor channels of each WWTP. For WWTPs sampled in California, aliquots were immediately centrifuged (3,000 × g for 10 min) and the biomass pellets were frozen on dry ice for transport to the laboratory, where they were stored at −80°C. Additional aliquots were immediately filtered through glass fiber filters (Whatman GF/B) for the soluble Pi assay. The suspended solids (total and volatile) and phosphorus content of the sludge were assayed within 2 h of sampling after transport to the laboratory at room temperature. All other WWTPs were sampled as described elsewhere (He et al. 2007). Total suspended solids, volatile suspended solids, soluble reactive phosphorus, and total phosphorus were measured by Standard Methods 2540B, 2540E, 4500-P C, and 4500-P B.5, respectively (APHA 1995).

Lab-scale sequencing batch reactors

Two sequencing batch reactors were operated as previously described (Garcia Martin et al. 2006; McMahon et al. 2002a, b; Schuler and Jenkins 2003). Reactor “UCB” was operated at the University of California at Berkeley for 2 years and was originally inoculated with activated sludge from the Southeast Water Pollution Control Plant in San Francisco, CA. Reactor “UWM” was operated for 2 years at the University of Wisconsin–Madison and was originally inoculated with activated sludge from the Nine Springs Wastewater Treatment Plant in Madison, WI.

Genomic DNA extraction

Bulk genomic DNA was extracted from sludge samples using a series of enzymatic digestions, followed by phenol-chloroform extraction, as described previously (Purkhold et al. 2000) with minor modifications (Garcia Martin et al. 2006). Some aliquots were carried through the same extraction procedure described by Purkhold et al. (2000), while some were subjected to bead beating after the second enzyme digestion, as described (McMahon et al. 2002a).

PCR amplification of ppk1 fragments using degenerate primers

PCR conditions were optimized for each batch of genomic DNA. The following conditions were fixed for all samples. Amplification was carried out in 50 μl reactions containing 1X PCR buffer II (Applied Biosystems, Foster City, CA), 3.5 mM MgCl2, 200 μM of each dNTP, 400 nM of each forward and reverse primer (NLDE-F and TGNY-R, Table 2), 5% dimethyl sulfoxide, 200 ng bovine serum albumin, and 2.5 U AmpliTaq Gold (Applied Biosystems) on an MJ Research DNA Engine thermal cycler. A touch-down PCR program was used: an initial 12 min denaturing step at 94°C, followed by ten cycles of 94°C for 45 s, an optimized annealing temperature for 45 s (decreasing 0.5°C per cycle), and 72°C for 2 min. An additional 25 cycles were carried out with the same denaturing and extension conditions, but with 45°C annealing for 45 s, followed by a final 12 min extension at 72°C. The initial annealing temperature for each sample was varied using the gradient feature of the thermal cycler. The optimum was between 48 and 50°C, touching down to 43 and 45°C, respectively. The template concentration was titrated during optimization; the optimum amount was generally between 20 and 200 ng of extracted DNA per reaction.
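The touch-down program described above can be written out cycle by cycle. The following is a minimal sketch, assuming that the first touch-down cycle anneals at the starting temperature and that each later cycle is 0.5°C cooler; the function name and defaults are illustrative and not part of the original protocol. Under this convention the tenth cycle anneals at 43.5°C for a 48°C start, whereas the end points quoted in the text (43 and 45°C) equal the start minus ten decrements, so the thermal cycler may count the decrement differently.

```python
def touchdown_schedule(start_temp_c, step_c=0.5, touchdown_cycles=10,
                       fixed_temp_c=45.0, fixed_cycles=25):
    """Return the annealing temperature (deg C) for every cycle of the program.

    Assumption: cycle 1 anneals at start_temp_c and each following
    touch-down cycle is step_c cooler than the one before it.
    """
    touchdown = [start_temp_c - step_c * i for i in range(touchdown_cycles)]
    return touchdown + [fixed_temp_c] * fixed_cycles


schedule = touchdown_schedule(48.0)
print(len(schedule))   # total number of cycles (touch-down + fixed)
print(schedule[:10])   # annealing temperatures of the touch-down phase
print(schedule[10])    # annealing temperature of the fixed phase
```

Changing `start_temp_c` to 50.0 gives the schedule for the upper end of the optimum range reported above.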

ppk1 clone library construction

PCR products from several replicate reactions (usually 3–4) were purified by extraction from agarose gels using spin columns (Qiagen, Valencia, CA) and were cloned into pCR4 using the TOPO TA Cloning Kit for Sequencing (Invitrogen, Carlsbad, CA). Approximately 90 clones from each library were screened using restriction fragment length polymorphism (RFLP) analysis with MspI and AluI (Boehringer Mannheim, Germany), in separate digests (Dojka et al. 1998). Unique representatives were chosen for sequencing.
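The logic of RFLP screening can be illustrated in silico. The sketch below is not the procedure used in the study, which compared banding patterns on gels; it assumes the insert sequences are already known and groups them by predicted fragment lengths. The clone names and sequences are hypothetical toy data. The recognition sites used are C^CGG for MspI and AG^CT for AluI; both are palindromic, so scanning one strand is sufficient.

```python
from collections import defaultdict

# Recognition site and cut position within the site for each enzyme.
ENZYMES = {
    "MspI": ("CCGG", 1),  # C^CGG
    "AluI": ("AGCT", 2),  # AG^CT
}


def fragment_lengths(seq, enzyme):
    """Predicted fragment lengths (longest first) for a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


def rflp_type(seq):
    """Combine the two separate digests into one hashable pattern."""
    return tuple(tuple(fragment_lengths(seq, e)) for e in ("MspI", "AluI"))


# Hypothetical toy inserts; real ppk1 fragments are far longer.
clones = {
    "clone_a": "AAAACCGGTTTTAGCTAAAA",
    "clone_b": "AAAACCGGTTTTAGGTAAAA",
    "clone_c": "AAAACCGGTTTTAGCTAAAA",
}

groups = defaultdict(list)
for name, seq in clones.items():
    groups[rflp_type(seq)].append(name)

for pattern, names in groups.items():
    print(pattern, names)
```

One representative per pattern would then be chosen for sequencing. Clones with different sequences can still share a pattern, which is why RFLP types give only a lower bound on sequence diversity.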

PCR amplification of ppk1 fragments using Accumulibacter-specific primers

Amplification of Accumulibacter ppk1 fragments was carried out on 5 ng of genomic DNA from activated sludges, or on approximately 1 to 3 ng of PCR products amplified from ppk1 clones (generated using vector primers). The reaction mixtures contained 1X PCR buffer II (Applied Biosystems), 3.0 mM MgCl2, 200 μM of each dNTP, 400 nM of each forward and reverse primer (Table 2), 5% DMSO, and 0.05 U/μl of AmpliTaq Gold DNA polymerase (Applied Biosystems). The PCR was conducted on an iCycler (Bio-Rad, Hercules, CA), with the program consisting of an initial 10-min denaturation step at 95°C, followed by 30 cycles at 95°C for 30 s, 68°C for 1 min and 72°C for 2 min, and then a final extension at 72°C for 12 min. Products were visualized by agarose gel electrophoresis. Products amplified from the UWM reactor were cloned, screened, and sequenced as described above.
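As a quick check on the two thermal cycling programs, the programmed hold times can be summed. This is a minimal sketch that counts only the hold times stated in the text; ramp times between temperatures depend on the instrument and are ignored, so actual run times will be longer. The function name is illustrative.

```python
def program_minutes(initial_s, cycles, step_seconds, final_s):
    """Total programmed hold time in minutes, ignoring ramp times."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60


# Accumulibacter-specific program: 10 min, 30 x (30 s, 1 min, 2 min), 12 min.
specific = program_minutes(10 * 60, 30, (30, 60, 120), 12 * 60)

# Degenerate-primer touch-down program: 12 min, (10 + 25) cycles of
# (45 s, 45 s, 2 min), 12 min. Hold times are the same in both phases.
degenerate = program_minutes(12 * 60, 10 + 25, (45, 45, 120), 12 * 60)

print(specific, degenerate)
```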
Table 2

Primers used for PCR

Oligonucleotide (abbreviation) | Sequence (5′–3′) | Target gene
NLDE-0199F^a (NLDE-F) | | Most ppk1 homologs^b
 | | Accumulibacter-ppk1 cluster

^a Primers are numbered according to the full length Type I ppk1 sequence (pKDM12: accession number AF502200), with the number referring to the location of the base at the 3′ end of the primer.

^b Degenerate primers were previously described (McMahon et al. 2002a).

Phylogenetic analysis

A dataset of known and putative ppk1 gene sequences was constructed by searching available public databases using the BLAST network service on the NCBI website (Altschul et al. 1990). The sequence fragments were translated, and the amino acid sequences were aligned against the dataset using the Seqlab program in the GCG software package version 10.0 (Genetics Computer Group, WI). Translated sequences were inspected for characteristic motifs to confirm their homology to known ppk1s (e.g., amino acid sequence ARFDE; Tzeng and Kornberg 1998). Alignments were masked manually to exclude positions with gaps in more than 15% of sequences, and the remaining aligned positions were exported for analysis. Bayesian inference of phylogeny was carried out with MrBayes version 3.1.1 using default priors (Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003). A mixed amino acid substitution model was used for amino acid-derived phylogenies. For nucleic acid-based phylogenies, the general time reversible model was used with rates varying according to codon positions. Trees were visualized and printed using the program PAUP* version 4.0b10 (Sinauer Associates, Sunderland, MA).
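The masking criterion above (exclude alignment positions with gaps in more than 15% of sequences) was applied manually in this study. The following is a minimal programmatic sketch of the same rule, assuming the alignment is held as a mapping from sequence name to an aligned string and that gaps are written as "-" or "."; the function name and the toy alignment are hypothetical.

```python
def mask_gappy_columns(alignment, max_gap_fraction=0.15, gap_chars="-."):
    """Drop columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    Returns (masked alignment, list of retained column indices).
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for col in range(length):
        gaps = sum(s[col] in gap_chars for s in seqs)
        if gaps / len(seqs) <= max_gap_fraction:
            keep.append(col)
    masked = {name: "".join(s[c] for c in keep) for name, s in alignment.items()}
    return masked, keep


# Toy alignment of ten sequences around the ARFDE motif.
toy = {f"seq{i}": "ARFDE" for i in range(10)}
toy["seq0"] = "A-FDE"   # gap in column 1
toy["seq1"] = "A-FD-"   # gaps in columns 1 and 4

masked, kept = mask_gappy_columns(toy)
print(kept)             # column 1 (20% gaps) is dropped, column 4 (10%) is kept
print(masked["seq1"])
```

Retained columns can still contain gaps in individual sequences, as in `seq1` here; the mask only removes positions that are poorly represented across the dataset.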

Nucleotide sequence accession numbers

The GenBank accession numbers for the nucleotide sequences determined in this study are AY963820-AY963838, DQ466618-DQ466723, DQ630733-DQ630737, DQ868996-DQ869001, and DQ883814.


Results

Sludge characteristics

A selection of performance parameters for each sludge used for ppk1 library construction are reported in Table 3. The low Pns content of the EB sludge and the relatively high effluent soluble Pi concentration reflect the fact that the aeration basin at this plant was aerobic and that the EB sludge was not carrying out EBPR. The other three sludges had higher Pns contents and lower effluent soluble Pi concentrations and appeared to be carrying out EBPR. Performance data for the other sludges used to confirm Accumulibacter-ppk1 primer specificity can be found elsewhere (He et al. 2007).
Table 3

Physical and chemical characteristics of sludges used for ppk1 library construction

Activated sludge source^a | Percent Pns-content per VSS [100% × mg Pns (mg suspended solids)−1] | Percent Pns-content per TSS [100% × mg Pns (mg suspended solids)−1] | (mg l−1) | (mg l−1) | Aerobic zone soluble Pi (mg P l−1)

^a Abbreviations described in Table 1

Sludge ppk1 diversity captured in clone libraries

Previously designed degenerate ppk1-specific primers (McMahon et al. 2002a) were used to construct clone libraries from WWTP sludges EB, SJ, OL, and CC. Amplification with the degenerate primers was inefficient, and several reactions were required to obtain sufficient product for gel extraction and cloning. A summary of the number of RFLP types recovered from each library is provided in Table S1. Initial phylogenetic analyses were carried out on an aligned dataset of 171 confirmed or putative ppk1 amino acid sequences from this study and from public databases. To reduce computational requirements, possible replicate sequences were identified and removed from the dataset if they shared greater than 98% DNA sequence identity with another sequence derived from the same sample. The reduced dataset, containing 145 taxa, was used for further analyses, the results of which are presented in Figs. 1 and S1. Five distinct lineages are evident from inspection of the tree. These were arbitrarily designated Groups I–V for the purposes of discussing tree topology. Representatives of all major groups, except Group IV, were obtained from the sludge samples. Excluding replicates, we recovered 30 sequences affiliated with Group I, 16 with Group II, 11 with Group III, and 38 with Group V.
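The replicate-removal step described above can be sketched as a greedy filter. This is an illustration under stated assumptions, not the authors' implementation: sequences are taken to be already aligned, identity is computed over columns where neither sequence has a gap, and the first sequence encountered in each group of near-identical sequences is the one retained. All names and the toy records are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence is gapped."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def dereplicate(records, threshold=98.0):
    """Keep a record unless it shares > threshold % identity with an
    already kept record from the same sample.

    records: list of (name, sample, aligned_sequence) tuples.
    """
    kept = []
    for name, sample, seq in records:
        is_replicate = any(
            sample == k_sample and percent_identity(seq, k_seq) > threshold
            for _, k_sample, k_seq in kept
        )
        if not is_replicate:
            kept.append((name, sample, seq))
    return kept


records = [
    ("SJ-1", "SJ", "A" * 100),
    ("SJ-2", "SJ", "A" * 99 + "C"),     # 99% identical to SJ-1, same sample
    ("SJ-3", "SJ", "A" * 97 + "CCC"),   # 97% identical to SJ-1, same sample
    ("OL-1", "OL", "A" * 100),          # identical to SJ-1, different sample
]
print([name for name, _, _ in dereplicate(records)])
```

Because the filter is greedy, the result can depend on input order; identical sequences from different samples are deliberately retained, matching the same-sample condition in the text.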
Fig. 1

Phylogenetic tree indicating major ppk1 clades, as inferred using the entire dataset, excluding replicates. Phylogenetic analyses were carried out on amino acid sequences (395 homologous positions) deduced from gene fragments in the in-house database plus the new sequences obtained in the current study. MrBayes version 3.1.1 was used for Bayesian analyses with 1.8 million generations, yielding 34,800 trees following 60,000 generations of burnin. Sequence names were omitted for clarity. A horizontal version of this tree with sequence names is presented in Fig. S1. The Accumulibacter cluster falls within Group I and is featured in more detail in Fig. 2. Previously defined sequence Types III and IV (McMahon et al. 2002a) fall within Group V which is featured in more detail in Fig. S2

Accumulibacter-like ppk1s were associated with Group I, a highly supported lineage containing ppk1s from β-Proteobacteria and γ-Proteobacteria. Notably, Group I contains the previously described “Type I” and “Type II” ppk1s retrieved from lab-scale EBPR reactors (McMahon et al. 2002a). Group II was also a distinct lineage, containing ppk1s from only α-Proteobacteria (Fig. S1). Groups III and IV contained only Actinobacteria and Cyanobacteria ppk1s, respectively. Group V contained ppk1s from Escherichia coli, Vibrio cholerae, and Pseudomonas aeruginosa. The partitioning of γ-Proteobacteria between Groups I and V is noteworthy, because it means that ppk1 protein- and 16S rRNA gene-based phylogenies are incongruent. This implies that ppk1 may have been horizontally transferred.

Accumulibacter-like ppk1 genes detected in EBPR sludge

Several sequences were obtained from sludges SJ, OL, and CC that were closely affiliated with the ppk1 fragments retrieved from lab-scale SBRs, in Group I. A tree of Accumulibacter-like ppk1 DNA sequences was constructed to better illustrate fine-scale topologies within this cluster (Fig. 2). Five distinct lineages emerged, suggesting the presence of at least five coherent clades within the Accumulibacter cluster (one within Type I and four within Type II). The minimum pairwise DNA identity within each clade ranged from 93.5 to 98.1%. The minimum pairwise DNA identity across the whole Accumulibacter cluster was 79.6%.
Fig. 2

Phylogram indicating inferred relatedness of ppk1s from the Accumulibacter clade, based on DNA sequences. The tree was constructed with all sequences determined to affiliate with the Accumulibacter clade (including replicates). MrBayes was used for two separate runs of 3 million generations, yielding a total of 59,902 trees following 5,000 generations of burnin per run. The consensus tree is presented here with probabilities supporting major clades shown next to the nodes. All 59,902 trees were used to calculate the probabilities. For ease of identification with their source sludge, clones named UCB-## have been renamed from their original publication (McMahon et al. 2002a) in which they were named HP1-ppk2-##
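The tree counts reported in the captions of Figs. 1 and 2 can be checked against the stated run lengths and burn-in. The sampling intervals are not given in the text; in the sketch below, intervals of 50 and 100 generations are assumptions chosen because they reproduce the reported counts, as is the choice of whether the sample taken at the end of burn-in is included. The function name is illustrative.

```python
def sampled_trees(generations, burnin, sample_every, count_start=False):
    """Number of trees retained after discarding burn-in generations.

    count_start=True also counts the sample taken at the burn-in boundary.
    """
    n = (generations - burnin) // sample_every
    return n + 1 if count_start else n


# Fig. 1: one run, 1.8 million generations, 60,000 generations of burn-in.
fig1 = sampled_trees(1_800_000, 60_000, sample_every=50)

# Fig. 2: two runs, 3 million generations each, 5,000 generations of burn-in.
fig2 = 2 * sampled_trees(3_000_000, 5_000, sample_every=100, count_start=True)

print(fig1, fig2)
```

Under these assumed settings the arithmetic gives 34,800 and 59,902 trees, which agrees with the captions; other combinations of interval and boundary handling could also be consistent with the reported values.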

Novel groups of ppk1 found in all sludges

Several novel lineages with no cultured representatives were identified during phylogenetic analyses of the newly expanded ppk1 dataset (Figs. 1 and S1). A separate tree was constructed to better illustrate the fine-scale topology of Group V, which could be confidently divided into three subgroups (Fig. S2). Subgroups Vc and Vb included Types III and IV, respectively, which were previously obtained from a lab scale SBR (McMahon et al. 2002a). All four full-scale sludges contributed members to each of the three subgroups, suggesting that the organisms contributing these genes are ubiquitous in full-scale plants, regardless of configuration. The ecological or engineering significance of organisms possessing Group V ppk1 remains to be determined.

Accumulibacter-ppk1 specific primer design and validation

The previously described degenerate primers targeting ppk1 in most bacteria can be used to retrieve novel ppk1 fragments from uncultured organisms (McMahon et al. 2002a). However, amplification with these primers is often non-specific and highly inefficient, probably because of the high level of degeneracy. Therefore, we designed new PCR primers to specifically target the Accumulibacter cluster. The primers amplified Accumulibacter-ppk1 fragments from positive controls (Fig. S3a) and sludge genomic DNA known to contain Accumulibacter-ppk1 (Fig. S3b). They are also highly specific: no amplification products were detected when ppk1 fragments not affiliated with the Accumulibacter cluster or non-EBPR sludge DNA were used as template (Fig. S3a and b, respectively).


Discussion

Accumulibacter-like bacteria are thought to be responsible for phosphorus accumulation in most volatile fatty acid-fed laboratory scale EBPR systems (Crocetti et al. 2000; Hesselmann et al. 1999; Levantesi et al. 2002; Liu et al. 2001; Oehmen et al. 2005; Pijuan et al. 2004) as well as many full-scale WWTPs (Beer et al. 2006; He et al. 2006, 2007; Kong et al. 2004; Zilles et al. 2002a, b). In this paper, we describe a first effort to characterize the population structure of Accumulibacter-like bacteria using a genetic locus that provides more phylogenetic resolution than the commonly used 16S rRNA gene. The four full-scale WWTPs were chosen to represent a variety of activated sludge configurations.

Several new sequences retrieved from EBPR systems formed a distinct cluster in phylogenetic trees with other previously sequenced Accumulibacter-like ppk1s. We named the phylogenetically coherent lineage that contained this cluster “Group I” to provide a nomenclature that allows discussion of the Accumulibacter clades within a more broadly defined group. The new Accumulibacter-like sequences retrieved from the full-scale sludges clearly affiliated with the previously named Type II ppk1 (McMahon et al. 2002a). Clone SJ2-65 was the most closely affiliated with Type II, sharing 99% DNA sequence identity with the UCB clones. Notably, a clone from the SBR at the University of Wisconsin–Madison (UWMH-E5) also shared 99% DNA sequence identity with both SJ2-65 and the Type II UCB clones, suggesting that the same strain of Accumulibacter was present in the Madison SBR. This SBR also contained Type I ppk1s. The fact that no Type I ppk1s were recovered from the full-scale plants is intriguing, as it implies that Type I ppk1s are comparatively rare at full scale. A quantitative survey of a larger selection of WWTPs (including those used to inoculate the SBRs) is required to confirm this observation.

The retrieval of sequences associated with several distinct clades of Accumulibacter-like ppk1 fragments is consistent with the hypothesis that several phylogenetically distinct groups of these organisms exist in WWTPs. Those clustering with the Type I and II SBR sequences are likely from Accumulibacter-like organisms. We propose that sequences branching outside the designated Accumulibacter cluster, but within Group I, are derived from organisms qualifying as members of a different genus. It is not yet possible to correlate evolutionary distance based on 16S rRNA with ppk1 DNA or amino acid sequences. Thus, we cannot determine whether these sequences originated from the Accumulibacter-related “Dechloromonas subgroup” defined by Zilles et al. (2002b). However, because 16S rRNA sequences belonging to this subgroup were obtained by Zilles et al. (2002b) from a conventional activated sludge plant not performing EBPR, our hypothesis is consistent with the fact that several EB clones (also from a conventional non-EBPR plant) are part of a group that is distinctly separate from the Accumulibacter-ppk1 clade.

We emphasize that it is not possible to draw quantitative conclusions about ppk1 diversity, distribution, or abundance based on the number of unique RFLP types and sequences obtained using PCR-based clone libraries. Many sources of bias, including different efficiencies of DNA extraction, PCR amplification, and cloning, contribute to non-quantitative results (Wintzingerode et al. 1997). Degenerate primers are probably even more likely to produce amplification bias, as some sequences will match the primers more closely than others.

The newly designed and validated Accumulibacter-specific ppk1 primers will be useful for harvesting additional Accumulibacter-ppk1 sequences from full-scale WWTPs, for further studies of Accumulibacter cluster diversity and structure, and as a diagnostic tool to determine whether Accumulibacter-like organisms are present. The ppk1 locus is much more divergent than the 16S rRNA locus and can distinguish between the different Accumulibacter sub-clusters, making it a promising candidate for a reliable genetic marker for Accumulibacter-mediated EBPR. The relationship between fine-scale population structure within the Accumulibacter cluster and EBPR process performance remains to be elucidated.


Acknowledgments

The authors wish to thank Philip Hugenholtz, Hector Garcia Martin, Victor Kunin, Jason Flowers, and Daniel Noguera for helpful discussions. This research was supported by National Science Foundation Grants BES-9912472 and BES-0332181 to DJ and JDK, and by BES-0332136 to KDM.

Supplementary material

ESM: Polyphosphate kinase genes from full-scale activated sludge plants (253_2007_1122_MOESM1_ESM.doc, DOC 802 kb)

Copyright information

© Springer-Verlag 2007