Background

In the natural environment crude oil, a complex mixture of light and heavy hydrocarbons and inorganic compounds, is degraded by members of the Bacteria and Archaea, as well as by certain plants and fungi. Significant work has been done to identify the taxonomic groups and pathways involved in the bioremediation of crude oil, motivated in part by the need to improve predictions of in situ degradation rates of oil and of targeted hydrocarbon compounds. The 2010 Macondo Well blowout in the Gulf of Mexico highlighted some of the gaps in our understanding of how crude oil is degraded in situ in the pelagic marine environment, as both the hydrocarbons emerging from the damaged well and the in situ marine microbial community reacted in ways that were unpredicted [1–3]. In that case, components of the crude oil advecting in a deep plume below 1000 m in the Gulf were consumed in situ by the marine microbial community, reducing ecological disturbance at the sea surface.

The extent of the microbial crude oil catabolism at the relatively cold temperature (approximately 5°C) of the deep plume was considered surprising [2]. It has long been recognized, however, that bacteria can respond quickly to crude oil in near-freezing seawater [4]. Even sea ice microbial communities, living at temperatures below the freezing point of seawater, can respond to inputs of diesel fuel and crude oil [5–7]. Low temperature crude oil degradation has also been observed in polar and alpine soil [8–10], and by several Bacterial strains in culture [11–13]. Despite these and other advances in understanding the potential for low temperature bioremediation, the presence of crude oil degradation genes in the available psychrophile genomes has not been investigated, though recent work has suggested that these genes might be broadly distributed across the Bacteria and Archaea [14]. By identifying such genes and evaluating differences between gene products and homologues from mesophiles, we hoped to identify structural differences that may enable crude oil catabolism at low temperatures. In addition to improving our ability to predict in situ bioremediation in cold environments, this knowledge paves the way for the rational design or modification of enzymes for improved function at in situ temperature in polar and sub-polar environments. These considerations are important for small scale, reduced energy, environmental clean-up strategies involving bioreactors and other technologies. Rational protein manipulation has already resulted in enzymes of potential value for environmental cleanup and industrial processes [15, 16]; however, this work has been limited to possible terrestrial, not marine, applications at standard conditions for temperature and pressure.

By mass a considerable fraction of crude oil is n-alkanes (alkanes): straight chain, saturated hydrocarbons with no cyclic functional groups. The shortest and most volatile alkanes are the natural gas components methane, ethane, butane, and propane, all of which are important substrates for a variety of Bacteria and Archaea. Even approaching the freezing point of water these small alkanes remain in the gas phase and are therefore highly bioavailable. Bioavailability decreases with the increasing number of carbons in an alkane molecule, reaching a minimum with large, extremely hydrophobic waxes [4]. At mesophilic growth temperatures alkanes larger than C16 are solid, necessitating the use of emulsifiers to improve bioavailability [17]. To degrade alkanes of different lengths Bacteria and Archaea have evolved a diverse array of enzymes, collectively termed alkane hydroxylases. All alkane hydroxylases function by oxidizing the terminal or subterminal carbon, converting the alkane into an alcohol [18]. This conversion “activates” the alkane for processing by downstream enzymes, starting with alcohol dehydrogenase.

The diversity of alkane hydroxylases, described in recent reviews [17–20], is briefly summarized here. Operating on the lowest molecular weight alkanes (approximately C1-C4) are the soluble methane monooxygenase (SMMO), particulate methane monooxygenase (PMMO), and propane/butane monooxygenase (P/BMO) enzymes. Acting on mid-weight alkanes (roughly C5-C16) are a group of alkane hydroxylases belonging to the cytochrome p450 family of enzymes and the membrane-bound non-heme AlkB enzymes. Less is known about the degradation of long chain alkanes, but two enzymes, AlmA and LadA, have been identified that utilize alkanes larger than C20 [21, 22].

To explore the diversity of alkane hydroxylases in the genomes of psychrophilic Bacteria we conducted a de novo annotation of nineteen psychrophile genomes, searching for homologues of known alkane hydroxylase genes. To evaluate what properties of these proteins might enable catalytic function at low temperature we compared protein parameters between putative alkane hydroxylases from psychrophiles and mesophiles averaged across the whole protein, within secondary structure elements, and, for protein flexibility, within specific residues along the length of the protein.

Methods

Identifying alkane hydroxylases

Proteins representative of alkane hydroxylases were identified in the Universal Protein Resource (Uniprot) database [23] by protein name search for ‘alkane hydroxylase’, ‘methane monooxygenase’, ‘propane monooxygenase’, ‘butane monooxygenase’, ‘LadA’, and ‘AlmA’. Proteins belonging to uncultured organisms or identified as fragments were excluded from further analysis, while duplicated names or sequences were reduced to a single copy. An exception was made to allow fragments for AlmA, as all AlmA proteins in the database were described as fragments yet were of similar length. Conserved domains were identified in the representative alkane hydroxylases by hmmscan in HMMER v3.0 [24] against the PFAM-A database of protein families [25] with an E-value cutoff of 10⁻⁵. Hmmscan uses profile hidden Markov models, a representation of amino acid probability by position, to match query sequences against a database.

Nineteen psychrophile strains (maximum growth temperature < 20°C) and nineteen closely related mesophile strains were selected and their genomes downloaded from GenBank (Additional file 1: Figure S1). Psychrophiles were identified according to temperature annotations in the GOLD [26] and HIMA [27] databases, and to reviews by Casanueva et al. [28] and Siddiqui et al. [29]. Care was taken to include all plasmids and chromosomes with the genome of each strain. Open reading frames (ORFs), defined as any region longer than 150 bp without a stop codon, were translated and searched for conserved protein domains against the PFAM-A database [25] using hmmscan in HMMER v3.0 [24] and an E-value cutoff of 10⁻⁵. Coding sequences (CDS, ORFs containing a pfam domain) with a hit to a pfam present in alkane hydroxylases were extracted for further analysis.
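The ORF definition above (any stop-free region longer than 150 bp) can be illustrated with a minimal Python sketch. This is not the script used in the study; the function names (`stop_free_regions`, `six_frame_orfs`), the standard stop codon set, and the choice to scan all six reading frames are assumptions made for illustration.

```python
# Minimal sketch (assumed, not the authors' script): find stop-free regions
# longer than min_len bp in all six reading frames of a DNA sequence.
STOPS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def stop_free_regions(seq, min_len=150):
    """Yield (frame, start, end) for stop-free stretches longer than min_len bp."""
    for frame in range(3):
        region_start = frame
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOPS:
                if pos - region_start > min_len:
                    yield frame, region_start, pos
                region_start = pos + 3  # next region begins after the stop
        # Handle the stretch between the last stop codon and the sequence end.
        last_codon_end = frame + 3 * max(0, (len(seq) - frame) // 3)
        if last_codon_end - region_start > min_len:
            yield frame, region_start, last_codon_end


def six_frame_orfs(seq, min_len=150):
    """Return (strand, frame, nucleotide sequence) for every qualifying region."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame, start, end in stop_free_regions(s, min_len):
            orfs.append((strand, frame, s[start:end]))
    return orfs
```

Each returned nucleotide region would then be translated (for example with the `translate()` method of a Biopython `Seq` object) before the hmmscan search.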

Complete records for diagnostic pfams were downloaded as fasta files from the Protein Family website (PFAM; http://pfam.xfam.org/). For PFAM datasets larger than 5,000 sequences, 5,000 sequences were randomly selected for analysis. For each pfam, the PFAM dataset was combined with the proteins of that family from the Uniprot, psychrophile, and mesophile datasets. These combined protein sets were aligned using three iterative alignments in Clustal Omega v1.2 [30], a program that allows for high-quality alignments of large numbers of protein sequences. The alignments were then filtered using an in-house script (filter_seqs_selective.py) which trims the alignment to the last start and earliest end position of the proteins from the Uniprot dataset. Proteins from the psychrophile, mesophile, or PFAM datasets that did not meet a minimum length guideline after filtering were eliminated from further analysis. After filtering, the sequences were aligned one more time, and a distance matrix of each pfam was created using the --full and --use-kimura flags in Clustal Omega v1.2.
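The subsampling and trimming steps described above can be sketched as follows. This is an illustrative reconstruction, not the contents of filter_seqs_selective.py: the function names, the dictionary-based alignment representation, and the `min_len` value are all assumptions, since the minimum length guideline is not stated in the text.

```python
# Illustrative sketch (assumed, not filter_seqs_selective.py itself).
import random


def subsample(records, limit=5000, seed=0):
    """Randomly keep at most `limit` records; smaller datasets are kept whole."""
    records = list(records)
    if len(records) <= limit:
        return records
    return random.Random(seed).sample(records, limit)


def residue_bounds(aligned_seq, gap="-"):
    """Return (index of first residue, index just past the last residue)."""
    start = len(aligned_seq) - len(aligned_seq.lstrip(gap))
    end = len(aligned_seq.rstrip(gap))
    return start, end


def trim_alignment(alignment, reference_ids, min_len=100, gap="-"):
    """Trim to the last start and earliest end among the reference proteins.

    alignment: dict of name -> aligned sequence (all the same length).
    reference_ids: names of the Uniprot reference proteins.
    min_len: hypothetical minimum number of residues a non-reference
             sequence must retain after trimming.
    """
    bounds = [residue_bounds(alignment[name], gap) for name in reference_ids]
    start = max(b[0] for b in bounds)  # last start position
    end = min(b[1] for b in bounds)    # earliest end position
    trimmed = {}
    for name, seq in alignment.items():
        window = seq[start:end]
        n_residues = len(window.replace(gap, ""))
        if name in reference_ids or n_residues >= min_len:
            trimmed[name] = window
    return trimmed
```

Trimming to the region shared by all reference proteins means that the subsequent distance matrix compares only columns in which every known alkane hydroxylase is represented.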

To describe sequence similarity within pfams we used nonmetric multidimensional scaling (NMDS) of Kimura-corrected genetic distance [31] in the R package Vegan [32]. This method was selected over phylogenetic trees based on the ease with which points in a region of interest on a 2D NMDS plot can be selected programmatically, compared to selecting branches on a phylogenetic tree. Although NMDS plots have been used to describe protein homology previously [33], this method is not in wide use. To validate the NMDS approach to describing sequence similarity we compared the Euclidean distance between NMDS points in the first and second dimension, maximum likelihood distance from a phylogenetic tree based on the same alignment, and bit scores from a reciprocal blastp search. We used the combined protein dataset for the FA_desaturase pfam for this analysis and generated a tree of the filtered alignment using FastTree OpenMP v2 [34] with the JTT+CAT model. Summed branch lengths between all branch tips were extracted from the tree with an in-house script (dist_from_tree.py) using the Phylo package in Biopython [35]. To describe the relationship between phylogenetic tree distance, bit score, and NMDS distance, linear models were fit to a randomly selected subset of the data (n = 10,000) in log-linear space for NMDS and phylogenetic distance and log-log space for NMDS and bit score distance. Goodness of fit was further evaluated by exploring the distribution of the residuals.
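For orientation, the quantities compared in this validation can be sketched in Python. The Kimura correction is written here in its commonly cited form, d = -ln(1 - p - 0.2p²), where p is the fraction of differing residues; the function names and the gap handling are assumptions, and Clustal Omega's own implementation may treat highly divergent pairs differently.

```python
# Sketch under stated assumptions; not the pipeline used in the study.
import math


def p_distance(a, b, gap="-"):
    """Fraction of differing residues over columns ungapped in both sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no shared ungapped columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def kimura_distance(p):
    """Kimura-corrected protein distance, d = -ln(1 - p - 0.2 * p**2)."""
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0:
        raise ValueError("sequences too divergent for this approximation")
    return -math.log(arg)


def nmds_distance(point_a, point_b):
    """Euclidean distance between two points in NMDS space."""
    return math.dist(point_a, point_b)


def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)
```

A fit in log-linear space corresponds to passing one variable through `math.log` before calling `r_squared`, and a log-log fit to transforming both; which variable is transformed in the log-linear case is not specified in the text.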

For NMDS analysis we determined the ideal number of dimensions to be three for fewer than 3,000 sequences, four for between 3,000 and 6,000 sequences, and five for more than 6,000 sequences. Sequences that placed far from the majority of points in a 2D plot of the NMDS analysis, and thus prevented the identification of distinct clusters for the majority of points, were culled from the original alignment and a new distance matrix was constructed before re-running the NMDS analysis. Clusters on the final 2D NMDS plots that contained proteins from the Uniprot, psychrophile, and mesophile datasets were selected for further analysis.
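The dimension rule and the cluster selection criterion can be expressed compactly. The sketch below is illustrative only: the point representation (dictionaries with `x`, `y`, and `source` keys), the rectangular selection region, and the function names are assumptions rather than the procedure actually coded for the study.

```python
# Illustrative sketch; data layout and names are assumptions.
def nmds_dimensions(n_sequences):
    """Number of NMDS dimensions used for a dataset of a given size."""
    if n_sequences < 3000:
        return 3
    if n_sequences <= 6000:
        return 4
    return 5


def select_box(points, xmin, xmax, ymin, ymax):
    """Return the points that fall inside a rectangular region of a 2D NMDS plot."""
    return [p for p in points
            if xmin <= p["x"] <= xmax and ymin <= p["y"] <= ymax]


def is_candidate_cluster(members,
                         required=("uniprot", "psychrophile", "mesophile")):
    """True if the cluster contains proteins from all three datasets."""
    present = {p["source"] for p in members}
    return all(source in present for source in required)
```

Selecting points by coordinate range in this way is the programmatic convenience that motivated the use of NMDS plots over phylogenetic trees.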

Analysis of protein parameters

The flexibility, grand average of hydropathy (GRAVY), isoelectric point, and aromaticity parameters of proteins were calculated with the ProtParam module in Biopython [35]. Aliphatic index was calculated using the method of Ikai [36]. To determine the parameters by secondary structure, the α-helix, β-strand, and coil regions for each protein were determined by the stand-alone version of psipred [37] and the runpsipredplus script. The best database for secondary structure prediction was evaluated by comparing predictions using the NCBI nr database, uniref90, and Pfam-A for one candidate alkane hydroxylase against predictions obtained from an intensive 3-D structural prediction model using Phyre2 [38]. Both databases achieved a prediction accuracy of 71.8%, just below the prediction of psipred as implemented by the Phyre2 server (72.7%). We used Pfam-A for further predictions due to the smaller size of that database. Protein parameters were recalculated using a 9 residue window (selected for consistency with the window used in flexibility calculation), and the per-position parameter was taken as the mean of the window centered on that position. Per-position values for each parameter were then extracted for comparison according to secondary structure. Differences in parameters between psychrophile and mesophile proteins within clusters and secondary structure elements were evaluated using the Wilcoxon test. Differences in parameters between psychrophile and mesophile proteins within taxonomic pairs were evaluated with a pairwise comparison. Because multiple parameters were investigated, all p-values derived from the Wilcoxon test were corrected for multiple comparisons using the Holm-Bonferroni method.
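As an illustration of the per-protein and per-position calculations, the sketch below computes the aliphatic index of Ikai, AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)] with X in mole percent, and applies a parameter function over a sliding 9 residue window. The function names and the decision to report values only for positions with a complete window are assumptions.

```python
# Minimal sketch, assuming complete windows only; not the authors' code.
from collections import Counter


def aliphatic_index(seq):
    """Aliphatic index (Ikai): A + 2.9 * V + 3.9 * (I + L), in mole percent."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)

    def mole_percent(residue):
        return 100.0 * counts[residue] / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def windowed(seq, func, window=9):
    """Apply func to each full window; key each value by the window's center index."""
    half = window // 2
    return {i + half: func(seq[i:i + window])
            for i in range(len(seq) - window + 1)}
```

Calling `windowed(seq, aliphatic_index)` yields a per-position profile that can then be split by the predicted secondary structure of each center residue.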

To evaluate differences in flexibility, widely considered important for cold activity, between putative alkane hydroxylases from psychrophiles and mesophiles on a by-position basis, we aligned the flexibility parameters for all proteins in each cluster according to a multiple sequence alignment generated in Clustal Omega v1.2 [30] using in-house scripts (align_params.py, align_params.r). For each position in the alignment the mean flexibility and standard deviation were calculated for psychrophile and mesophile proteins. Positions in the alignment where the difference in means (psychrophile proteins – mesophile proteins) between the two groups exceeded the sum of the standard deviations were flagged as sites of significant deviation. To place these findings in the context of protein tertiary structure, 3D models were constructed of a representative psychrophile protein in each cluster using the intensive modeling option in Phyre2 [38]. Residues with significant differences in flexibility were color-highlighted in the models using Discovery Studio Visualizer (Accelrys).
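The flagging criterion can be sketched as below. This is not align_params.py; the input layout (one list of aligned flexibility values per protein, with `None` at gap positions), the use of the sample standard deviation, the comparison on the absolute difference, and the requirement of at least two values per group are assumptions.

```python
# Sketch of the flagging rule under stated assumptions; not align_params.py.
from statistics import mean, stdev


def flag_sites(psychrophile_rows, mesophile_rows):
    """Return (column, mean difference) where |difference| exceeds the summed SDs.

    Each row holds the aligned per-position flexibility values of one protein,
    with None at alignment gaps. A positive difference indicates greater mean
    flexibility in the psychrophile proteins.
    """
    flagged = []
    n_columns = len(psychrophile_rows[0])
    for col in range(n_columns):
        psy = [row[col] for row in psychrophile_rows if row[col] is not None]
        meso = [row[col] for row in mesophile_rows if row[col] is not None]
        if len(psy) < 2 or len(meso) < 2:
            continue  # SD is undefined for fewer than two values
        difference = mean(psy) - mean(meso)
        if abs(difference) > stdev(psy) + stdev(meso):
            flagged.append((col, difference))
    return flagged
```

The sign of each returned difference distinguishes sites of increased flexibility in psychrophiles from sites of reduced flexibility.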

All in-house scripts can be obtained from https://github.com/bowmanjeffs/cold_ah.

Results

The Uniprot searches collectively returned 939 alkane hydroxylase proteins after culling. These proteins belonged to 16 pfams of which seven were determined to have a regulatory or electron carrier binding function, or to be the result of an erroneous classification (MmoB_DmpM, ADH_zinc, LXG, Nol1_Nop2_Fmu, DUF900, NAD_binding_1, FAD_binding_6). Four of the remaining nine pfams were represented in the psychrophile genomes (Table 1). Among these four was FA_desaturase, used to show the correlations between Euclidean distance in 2D NMDS space and phylogenetic tree distance (R² = 0.4232) and bit score (R² = 0.5029) and thus the validity of using NMDS plots (Figure 1). NMDS plots of all four pfams contained clusters with both psychrophilic and Uniprot proteins, indicating close sequence similarity (Figure 2, Table 2). The Pyr_redox_3 and Bac_luciferase pfams each had only one cluster, corresponding to AlmA and LadA respectively. FA_desaturase had two clusters; cluster 0 corresponds to the AlkB group of membrane-bound alkane hydroxylases, while cluster 1 is defined by only a single Uniprot protein annotated as alkane-1 monooxygenase. The p450 family also had two clusters; cluster 0 corresponds to the Bacterial p450 alkane hydroxylase, while cluster 1 corresponds to the Eukaryotic p450 alkane hydroxylase. A total of 26 putative alkane hydroxylases were identified in the psychrophile genomes and 41 in the mesophile genomes (Additional file 2: Table S1).

Table 1 Occurrence of conserved protein family (pfam) domains linked to alkane hydroxylases (AH) in each dataset
Figure 1

Euclidean distance in 2D NMDS space as a function of bit score (top) and phylogenetic distance (bottom). Euclidean distance was compared to bit score and phylogenetic distance to evaluate the fidelity of these parameters. Euclidean distance in the FA_desaturase pfam is correlated with bit score (R² = 0.4232, n = 2,439,512), obtained from reciprocal blast, and with phylogenetic distance (R² = 0.5029, n = 2,439,512), as summed branch lengths from a maximum-likelihood tree. Orange lines are linear models fit to the complete data sets; only 10,000 randomly selected data points are plotted.

Figure 2

NMDS plots of genetic distance within four protein families (pfams). The distance between two points on the plot reflects the genetic distance between the two sequences, so neighboring points have similar sequences and are likely to have similar functions. Clusters of points identified as candidate alkane hydroxylases, due to the presence of known alkane hydroxylases from Uniprot, are outlined with gray boxes.

Table 2 Number of candidate alkane hydroxylases observed in each of the psychrophile and mesophile genomes examined

The Pyr_redox_3, Bac_luciferase, FA_desaturase, and p450 families contained sufficient psychrophile and mesophile proteins to allow a comparison of protein parameters within each family. Considering parameters averaged across the protein, no pfam had a statistically significant difference in any parameter between the psychrophile and mesophile populations. We use the term “trending” to describe possible relationships between parameters and temperature class when p < 0.05 by the Wilcoxon test but differences did not meet the significance threshold after applying the Holm-Bonferroni method to correct for multiple comparisons [39]. Trending differences were observed in three of the four pfams. Flexibility and tryptophan content trended lower in psychrophile FA_desaturase, while GRAVY and lysine content trended higher. For p450, threonine content trended higher in psychrophiles. For Bac_luciferase, alanine, isoleucine, and lysine trended lower in psychrophiles, while cysteine, methionine, arginine, and tyrosine trended higher.
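The distinction between "significant" and "trending" can be made concrete with a short sketch of the Holm-Bonferroni step-down adjustment. The function names and the 0.05 threshold argument are assumptions for illustration; the p-values in the usage note are invented and are not results from this study.

```python
# Minimal sketch of Holm-Bonferroni adjustment; names are hypothetical.
def holm_bonferroni(pvalues):
    """Return Holm-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def trending(pvalues, alpha=0.05):
    """Indices with raw p < alpha that are no longer significant after correction."""
    adjusted = holm_bonferroni(pvalues)
    return [i for i, (raw, adj) in enumerate(zip(pvalues, adjusted))
            if raw < alpha and adj >= alpha]
```

For example, with four hypothetical raw p-values of 0.01, 0.04, 0.03, and 0.005, the second and third would be labeled trending under this definition, while the first and fourth would remain significant after correction.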

Comparing between taxon pairs (as given in Table 2) revealed more differences for parameters averaged across the whole protein (Table 3). For p450, four psychrophile-mesophile taxon pairs were available for analysis: Octadecabacter antarcticus 307 and Rhodobacter sphaeroides ATCC 17025, Glaciecola psychrophila 170 and Glaciecola agarilytica 4H37YE5, Octadecabacter arcticus 238 and Ketogulonicigenium vulgare Y25, and Terriglobus roseus DSM18391 and Terriglobus saanensis SP1PR4. Because multiple genes were present in some of these genomes, a total of nine pairwise comparisons were possible. In all nine comparisons the psychrophile protein had a lower isoelectric point and arginine content than the mesophile protein, while valine was elevated in psychrophiles in all comparisons; asparagine and threonine were elevated in psychrophile proteins in all but one comparison (Table 3).

Table 3 Pairwise parameters for candidate alkane hydroxylases in two conserved pfam domains, p450 and Pyr_redox_3

There were only two psychrophile-mesophile pairs available for the analysis of putative alkane hydroxylases from the Pyr_redox_3 family: G. psychrophila 170 and G. agarilytica 4H37YE5 and Psychrobacter cryohalolentis K5 and Acinetobacter baumannii AYE. Due to the large number of putative hydroxylases belonging to this pfam in G. psychrophila 170 and A. baumannii AYE, however, 11 comparisons were possible (Table 3). Cysteine and valine were elevated in the psychrophile proteins for all but one comparison; glutamic acid was reduced in the psychrophile proteins for all comparisons. For FA_desaturase only one taxon pair was available for analysis: O. antarcticus 307 and R. sphaeroides ATCC 17025, with four possible comparisons. Given the limited number of comparisons, pairwise FA_desaturase parameters were not explored further.

Considering protein parameters by the secondary structure elements α-helix, β-sheet, or coil also revealed no statistically significant differences in protein physical parameters. The strongest trends were observed for psychrophile FA_desaturases: lowered flexibility in the coil and α-helix regions and reduced acidic residues and lysine in the α-helices. Considering taxon pairs for p450 (Table 3), isoleucine was generally reduced in psychrophile α-helices and β-sheets but elevated in coils. Asparagine and valine were always higher in the coil region for psychrophiles. Flexibility, isoelectric point, alanine, glycine, and proline were all generally elevated in β-sheets. Flexibility and aspartic acid were elevated in α-helices, while arginine was reduced. For psychrophile Pyr_redox_3 (Table 3), aspartic acid was elevated in the coil while glycine was reduced. In psychrophile β-sheets, GRAVY and cysteine were both elevated. Glutamic acid and asparagine were reduced in psychrophile α-helices while alanine was elevated.

Local differences in flexibility between psychrophile and mesophile proteins within each cluster were apparent after alignment of all the proteins within each cluster (Figure 3), although there was considerable variation in the number of significant differences between clusters and in the direction of the differences (whether greater or lesser flexibility in the psychrophile data set). The Pyr_redox_3 cluster 0 had only five sites with significant differences (see vertical orange and green lines in Figure 3), while the other clusters, Bac_luciferase cluster 0, FA_desaturase cluster 1, FA_desaturase cluster 0, p450 cluster 1, and p450 cluster 0, had many more such sites: 127, 65, 52, 20 and 16, respectively. Summing the difference in mean values (blue line, Figure 3) returned a positive value for p450 cluster 1 (higher degree of flexibility in the psychrophile proteins analyzed) and a negative value for the other clusters (higher degree of flexibility in the mesophile proteins). Restricting this analysis to only those sites with a significant difference in means (sites with an orange or green line, Figure 3) produced a similar result: p450 cluster 0 and p450 cluster 1 yielded a positive sum while the remaining clusters gave a negative sum.

Figure 3

Alignment of the flexibility parameter between putative alkane hydroxylases in psychrophiles and mesophiles. Blue line indicates the difference in mean flexibility for the psychrophile and mesophile proteins, black line indicates the sum of the standard deviations for these two groups. Positive values for the mean (blue line) indicate positions in the alignment where the flexibility was greater for the psychrophile proteins; negative values, where flexibility was reduced. Gaps in the data reflect gaps in the alignment that prevented the calculation of the mean or standard deviation (SD). Center residues for windows with a significant increase in flexibility for psychrophiles and mesophiles are indicated by orange and green vertical dashed lines, respectively.

The Phyre2 protein fold prediction server produced high confidence (90% or more residues modeled at 90% or greater confidence) for the Pyr_redox_3, Bac_luciferase, and p450 pfams. FA_desaturase could not be modeled with high confidence. By correlating the residues in modeled proteins (Figure 4) with positions with a significant difference in mean flexibility, we identified sites that may reflect mutations that enhance protein activity at low temperature (Figure 4). PDB files of the modeled proteins are provided as Additional file 3.

Figure 4

Predicted 3D structures for representatives of the four clusters from psychrophiles with high confidence predictions. Proteins are colored from C-terminal (red) to N-terminal (blue). Positions indicated by the orange and green vertical lines in Figure 3 are highlighted as yellow (increased flexibility in psychrophiles) or green (reduced flexibility in psychrophiles).

Discussion

Alkanes are ubiquitous in marine and soil environments, occurring as by-products of cell metabolism and death; they also enter the marine environment from natural hydrocarbon seeps and anthropogenic sources. As a result alkane degradation is likely widespread among heterotrophic bacteria [14]. Several studies have demonstrated that some psychrotolerant Bacteria (growth at 0°C but maximum growth temperature > 20°C), which overlap ecologically and geographically with psychrophilic bacteria, can catabolize alkanes [12, 13, 40, 41], while crude oil degradation has been observed in a variety of cold environments [2, 5, 6, 8–10]. Despite the ubiquity of low molecular weight alkanes, the AMO, AmoC, MeMO_Hyd_G, Monooxygenase_B, and Phenol_Hydrox pfams were not detected in the psychrophile genomes examined. Because these pfams are known to be restricted to relatively few taxonomic groups, their absence in the analyzed psychrophiles, though indicative of an inability to catabolize low molecular weight alkanes in this group, may not be surprising. The nineteen psychrophile genomes available to explore in this analysis, however, represent only a very small sampling of psychrophile functional diversity. Psychrophiles are known, for example, to undergo C1 metabolism [41], yet none of these strains has been targeted for genome sequencing. As more psychrophile genomes are sequenced and published, they can be explored for additional alkane catabolism pathways predicted from environmental evidence but missing in the current set of analyzed genomes. Given that the diversity of enzymes involved in alkane degradation is also not fully explored, the current genomes may well contain alkane hydroxylases lacking sufficient sequence homology to known alkane hydroxylases. This possibility was highlighted by a recent genomic study of a cold-adapted Colwellia strain obtained from the deep hydrocarbon plume in the Gulf of Mexico [42]. No genes for known alkane hydroxylase were identified in this strain despite abundant ancillary data linking Colwellia to short chain alkane degradation following the Macondo Well blowout [42].

Because alkane bioavailability is positively correlated with temperature and negatively correlated with chain length, the preferential degradation of short chain alkanes is expected in cold environments. Surprisingly, we found several candidate alkane hydroxylases homologous to LadA and AlmA, enzymes associated with the degradation of long-chain alkanes. These putative long-chain alkane hydroxylases are ecologically diverse, occurring in the genomes of sea ice Bacteria (Octadecabacter arcticus, O. antarcticus, and Glaciecola psychrophila) and tundra soil Bacteria (Terriglobus saanensis). Confirming the ability of these strains to degrade long-chain alkanes or similar substrates will be a priority in future work. Because O. arcticus, O. antarcticus, and G. psychrophila are all associated with sea ice (Table 2), which in springtime hosts a high density of ice algae, their hypothesized ability to degrade long-chain alkanes may result from a preference for ice-algal lipids. The bioavailability of lipids and long-chain alkanes can be enhanced at low temperature by naturally occurring surfactants, most notably microbially produced exopolymers (EPS) [43]. Sea ice is rich in EPS [44] which may enable the catabolism of these compounds even at low temperature.

A considerable body of literature is dedicated to determining what protein modifications enable enzymatic function at low temperature. At low temperatures water molecules interact more tightly with the protein surface, reducing the overall flexibility of the protein. To counter the effect of low temperature on enzyme function, cold-active proteins make use of a variety of amino acid substitutions. The sum impact of these different substitutions, including their interactions and feedbacks, is difficult to predict. Compounding this difficulty is the co-occurrence of low temperature and low water activity, as found in virtually all ice matrices (e.g., permafrost, glacial ice, and sea ice). Optimization to low water activity and low temperature may be more difficult than optimization to low temperature alone.

Although all analyzed pfams showed some differences between psychrophiles and mesophiles for the measured parameters, no coherent overall optimization strategy was evident. The clearest trends appeared in our pairwise comparisons, which were limited to Pyr_redox_3 and p450. High flexibility and low isoelectric point appeared to be important for cold adaptation in p450, where asparagine, threonine, and valine were all enriched and arginine reduced. For Pyr_redox_3, cysteine and valine were enriched and glutamic acid was reduced. These modifications differ somewhat from those described in previous work on amino acid substitutions among cold-adapted proteins. Working with a more limited number of psychrophile genomes, and without distinguishing between protein families, Metpally and Reddy [45] did not report a role for valine or cysteine in protein cold adaptation. Our findings suggest that amino acid substitution patterns may require a more nuanced view, differing between proteins as a result of protein structure, strain taxonomy, and ecology.

One challenge to evaluating protein temperature optimization is the localization of some parameters. Although changes to isoelectric point and hydropathy are likely to be globally distributed in a protein, at least among secondary structure elements or among sites of a given solvent accessibility, optimized flexibility may come about through the modification of only specific residues [46, 47]. Regions of consistently increased flexibility were present in alignments from all six putative alkane hydroxylase clusters, though the generalized location of increased flexibility varied between cluster representatives (Figure 4). P450 cluster 0 had several regions of increased flexibility at probable hinge points on bends, loops, and in the coil region. Interestingly, three of these were centered on methionine residues (Met8, Met269, and Met295 in the representative p450 cluster 0 protein from Glaciecola psychrophila). Methionine is known to play a role in low temperature optimization of other heme-binding proteins by providing alternate heme-binding sites in the event of partial denaturation [47]. In the G. psychrophila p450, however, most of these sites were located toward the exterior of the protein and are unlikely to interact with heme. P450 cluster 1 had no evidence of increased flexibility in loops or coils, but did have regions of increased flexibility in the core of the protein. Bac_luciferase cluster 0 had large differences in local flexibility between the psychrophile and mesophile proteins. Regions of increased flexibility included bends likely to function as hinge points and residues near the protein active site.

Conclusions

Although the total number of putative alkane hydroxylases in the analyzed psychrophiles was smaller than in a taxonomically related group of mesophiles, the metabolic potential for alkane degradation in the psychrophiles is clear. These findings are consistent with environmental observations of crude oil degradation in sea ice, permafrost, and most recently the cold deep ocean. As in other cold-active enzymes, the putative alkane hydroxylases show clear and, within clusters, consistent differences in amino acid composition and protein parameters from mesophilic homologues. These proteins are good candidates for rate studies, such as enzyme assays using fluorescently labeled substrates, and rational manipulations, such as targeted mutation for enhancement of the substrate range or optimal physicochemical conditions.