Cloning and structural basis of fluorescent protein color variants from identical species of sea anemone, Diadumene lineata

Diadumene lineata is a colorful sea anemone with orange-striped tissue on the body column and plain tentacles with red lines. We subjected Diadumene lineata to expression cloning and obtained genes encoding an orange fluorescent protein (OFP: DiLiFP561) and two red fluorescent proteins (RFPs: DiLiFP570 and DiLiFP571). These proteins formed obligate tetramers. All three proteins showed bright fluorescence, with brightness values of 58.3 mM−1·cm−1 (DiLiFP561), 43.9 mM−1·cm−1 (DiLiFP570), and 31.2 mM−1·cm−1 (DiLiFP571), comparable to those of commonly used red fluorescent proteins. Amplitude-weighted average fluorescence lifetimes of DiLiFP561, DiLiFP570 and DiLiFP571 were determined as 3.7, 3.6 and 3.0 ns, respectively. We determined a crystal structure of DiLiFP570 at 1.63 Å resolution. The structure revealed that the chromophore has an extended π-conjugated structure similar to that of DsRed. Most of the amino acid residues surrounding the chromophore were common between DiLiFP570 and DiLiFP561, except M159 of DiLiFP570 (lysine in DiLiFP561), which is located close to the chromophore hydroxyl group. Interestingly, a similar K-to-M substitution has been reported in a red-shifted variant of DsRed (mRFP1). It is striking that the naturally evolved color-change variants are consistent with a mutation introduced through protein engineering. The newly cloned proteins are promising orange and red fluorescent markers for imaging that benefits from a long fluorescence lifetime.


Introduction
Fluorescent proteins are widely used in biology for exploring various phenomena in cells and in organisms [1][2][3][4][5][6][7]. Among them, red fluorescent proteins are preferable for deep tissue imaging because light scatters less at longer wavelengths [8,9]. Various red fluorescent proteins have been discovered, mainly from Cnidarians [10]. Diadumene lineata (syn. Haliplanella lineata, Diadumene luciae) is a small sea anemone native to the coast of Japan [11]. It is called the 'orange-striped green sea anemone' because it has conspicuous orange stripes on the column [12]. Careful observation shows that it also has red stripes on its tentacles. These stripes and tentacles emit bright fluorescence under blue light. This observation motivated us to acquire both orange and red fluorescent proteins from the same animal species. We performed expression cloning with Diadumene lineata cDNA employing a frame-insensitive expression vector, pRSET-TriEX [13], and obtained a gene encoding an orange fluorescent protein. We further acquired two genes encoding novel red fluorescent proteins by PCR cloning using the sequence information of the orange fluorescent protein. Here we report the photophysical properties and oligomeric states of the new fluorescent proteins, as well as the crystal structure of one of the red fluorescent proteins.

Expression cloning
Total RNA was extracted from a single orange stripe tissue of the body column of Diadumene lineata with TRIzol reagent according to the manufacturer's protocol [13]. The extracted total RNA was converted to cDNA using the SuperScript Double-Stranded cDNA Synthesis Kit (11917-010, Invitrogen, Carlsbad, CA, USA). The double-stranded cDNA was phosphorylated and inserted at the SmaI site of pRSET-TriEX by blunt-end ligation to construct a plasmid for expression cloning. Escherichia coli (JM109(DE3)) was transformed with this plasmid. A colony that emitted fluorescence was selected, and the plasmid encoding the fluorescent protein was prepared and sequenced. The entire coding sequence (CDS) was amplified by PCR and cloned into the BamHI/EcoRI sites of pRSET B for protein expression, spectral analysis, and oligomeric state analysis.

PCR cloning
Total RNA was prepared from a single tentacle. cDNA was prepared from the total RNA with the same protocol as described above [13]. The cDNA was used as PCR template using the same primers as for amplifying the CDS region of the orange fluorescent protein. The pRSET vector containing the cDNA was used for the transformation of E. coli competent cells. Two red colonies, slightly differently colored, were selected and plasmids were prepared for sequencing and protein expression.

Photophysical properties
Photophysical characterization (absorption, emission, lifetime measurements) of DiLiFPs was performed in phosphate buffer (10 mM, pH 7.0) supplemented with 150 mM NaCl at 21 °C. Absorption spectra were recorded with a Cary-4000 spectrometer (Agilent Technologies). Fluorescence spectra were recorded with a Fluorolog 3 spectrofluorometer (Horiba Jobin Yvon) with 2 nm excitation/emission bandpass and corrected for the instrumental response characteristics. The theoretical extinction coefficient of DiLiFPs at 280 nm was calculated from the amino acid sequence (W, Y and C) [14] and used to determine the total protein concentration. In this calculation, we assumed that there is no intramolecular cysteine S-S bonding. The concentrations of matured chromophore were determined by the alkaline denaturation method [15][16][17]. Based on this matured chromophore concentration, the chromophore extinction coefficient was calculated. The fraction of properly folded fluorescent protein was calculated as the ratio of the matured chromophore concentration to the total protein concentration. The fluorescence quantum yield was determined by a comparative method using rhodamine 101 in ethanol (Φf = 0.915) as a standard [18].
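The quantities described in this paragraph follow from simple relations, which can be sketched as below. This is an illustration under stated assumptions, not the authors' analysis code: the per-residue coefficients (5500 for W, 1490 for Y, 125 per disulfide, in M−1·cm−1) are commonly used literature values and may differ from those in Ref. [14]; the function names, the single-point form of the comparative quantum yield relation, and the default refractive indices (about 1.333 for water and 1.361 for ethanol) are likewise assumptions.

```python
# Sketch only. Coefficients are commonly used literature values (assumed here);
# the paper's Ref. [14] may use slightly different numbers.
EPS_W = 5500.0        # M^-1 cm^-1 per tryptophan at 280 nm
EPS_Y = 1490.0        # M^-1 cm^-1 per tyrosine at 280 nm
EPS_CYSTINE = 125.0   # M^-1 cm^-1 per disulfide bond at 280 nm


def protein_eps280(sequence, n_disulfides=0):
    """Theoretical extinction coefficient at 280 nm from a one-letter sequence.

    n_disulfides=0 mirrors the stated assumption of no S-S bonding.
    """
    seq = sequence.upper()
    return (seq.count("W") * EPS_W
            + seq.count("Y") * EPS_Y
            + n_disulfides * EPS_CYSTINE)


def folded_fraction(chromophore_conc, protein_conc):
    """Matured chromophore concentration over total protein concentration."""
    return chromophore_conc / protein_conc


def relative_quantum_yield(phi_ref, f_sample, f_ref, a_sample, a_ref,
                           n_sample=1.333, n_ref=1.361):
    """Comparative quantum yield from integrated emission (f) and absorbance (a).

    Standard relation: phi = phi_ref * (F/F_ref) * (A_ref/A) * (n/n_ref)^2.
    """
    return (phi_ref * (f_sample / f_ref) * (a_ref / a_sample)
            * (n_sample / n_ref) ** 2)
```

Here `phi_ref` would be 0.915 for the rhodamine 101 standard named in the text; all other inputs are measured values that are not given in this section.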
Fluorescence lifetime was measured by time-correlated single photon counting using a previously reported setup [19], consisting of an FT200 spectrometer (PicoQuant) coupled with a 200 fs excitation laser. The excitation pulse (530 nm, 200 fs, 4 MHz) was provided by a femtosecond Ti:sapphire laser (Coherent Chameleon Ultra II) coupled to an intracavity frequency-doubled OPO (APE) and a pulse picker (4 MHz). Briefly, the emission from the protein solution (absorbance < 0.1 at 530 nm, Hellma spectroscopic cell 1 × 1 cm) was collected at 90° through a polarizer set at the magic angle and a monochromator (bandpass 4 nm). The single-photon events were collected by a cooled microchannel plate photomultiplier tube R3809U (Hamamatsu) and recorded by a PicoHarp 300 TCSPC system (PicoQuant, 4 ps bin time). The instrumental response function (IRF) was recorded at the laser excitation wavelength using colloidal silica (Ludox), and its full width at half maximum was ~ 40 ps. All decays were collected until reaching 10^4 counts. The decays were recorded at three different emission wavelengths covering the steady-state emission spectra (DiLiFP561: 575-600-625 nm/DiLiFP570 & DiLiFP571: 600-625-650 nm) and globally analyzed with a sum of exponentials convolved with the measured IRF using the FluoFit software (PicoQuant). The quality of the fit was judged by the weighted residuals and their autocorrelation function.
Photostabilities of DiLiFPs were analyzed by determining the photobleaching quantum yield based on the protocol reported by De Keersmaecker et al. [20]. Details are available in the supporting information.
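The sum-of-exponentials model and the amplitude-weighted average lifetime reported later can be written out as a minimal sketch. The function names and the example amplitudes and lifetimes are hypothetical, and the convolution with the measured IRF performed by the fitting software is deliberately left out.

```python
import math


def multiexp_decay(t, amplitudes, lifetimes):
    """Ideal decay I(t) = sum_i a_i * exp(-t / tau_i), without IRF convolution."""
    return sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, lifetimes))


def amplitude_weighted_lifetime(amplitudes, lifetimes):
    """Amplitude-weighted average lifetime: sum(a_i * tau_i) / sum(a_i)."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("amplitudes must not sum to zero")
    return sum(a * tau for a, tau in zip(amplitudes, lifetimes)) / total


# Made-up two-component example (values in ns), not fitted data from this study.
example_amplitudes = [0.8, 0.2]
example_lifetimes = [4.0, 1.0]
example_average = amplitude_weighted_lifetime(example_amplitudes, example_lifetimes)
```

With these made-up inputs the average is 0.8 × 4.0 + 0.2 × 1.0 = 3.4 ns. The amplitude-weighted form is used here because it is the average named in the text; an intensity-weighted average would give a different number.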
Maturation of DiLiFPs was measured by monitoring the increase of fluorescence intensity in a protein solution rapidly prepared (<1 h) on ice from bacteria in which the protein expression was induced for 1 h. Details are available in the supporting information.

Expression of fluorescent proteins for spectral and oligomeric states analyses
Fluorescent proteins were prepared as reported previously [13]. Briefly, E. coli competent cells (JM109(DE3)) were transformed with pRSET-DiLiFP561, pRSET-DiLiFP570 or pRSET-DiLiFP571 for protein expression. The expressed proteins were extracted from E. coli cells and purified by immobilized metal affinity chromatography using Ni-NTA agarose resin (QIAGEN) according to the manufacturer's protocol. Desalting columns (PD-10, GE Healthcare Life Sciences) were used to remove imidazole and exchange the buffer to 10 mM phosphate-buffered saline (pH 7.4).

Pseudo-native polyacrylamide gel electrophoresis
Pseudo-native polyacrylamide gel electrophoresis (PN-PAGE) analysis [21] was performed on 15% polyacrylamide gel (15% acrylamide, 0.09% bis-acrylamide, 0.375 M Tris-HCl, pH 8.8, and 0.1% SDS) with 5% polyacrylamide stacking gel (5% acrylamide, 0.13% bis-acrylamide, 0.125 M Tris-HCl, pH 6.8, and 0.1% SDS). Samples (7.5 µg of fluorescent protein per well) were loaded on the gel in a buffer containing 0.0625 M Tris-HCl, pH 6.8, 1% SDS, and 10% glycerol without boiling. An unstained protein marker (XL-Ladder Broad, APRO Life Science Institute, Inc., Naruto, Japan) was used as the molecular weight standard. The transmission and fluorescence images of the gel were taken with a digital camera (PowerShot G11, Canon). After taking the fluorescence image, the gel was stained with a Coomassie brilliant blue stain (CBB-G250, nacalai tesque, Kyoto, Japan) to detect all protein bands. The gel was illuminated with a light table for the transmission image or with a Transilluminator (TFML-26, UVP) equipped with a Visi-Blue Plate (with Orange Cover) for the fluorescence image [13].

DiLiFP570 expression for crystallization
For structure determination, we subcloned the DiLiFP570 gene into a pET28b vector, whose N-terminal hexahistidine tag can be removed by thrombin digestion. The DiLiFP570 gene was PCR-amplified. In order to use the NdeI site on the vector for subcloning, intrinsic NdeI sites in the fluorescent proteins were removed by introducing a silent mutation: the codon coding 61st histidine 'CAT' was changed to 'CAC'. The DiLiFP570 gene was cloned into the pET28b vector. DiLiFP570 (pET28b) was expressed in BL21(DE3) using LB medium at 28 °C. After overnight expression, cells were harvested and disrupted by 15 min sonication. Cell suspension after sonication was centrifuged and the supernatant was applied onto a Ni-NTA column. An N-terminus hexahistidine tag of the purified DiLiFP570 sample was cleaved by thrombin and was removed by a subsequent Ni-NTA column chromatography step.
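The silent-mutation step can be checked in a few lines: swapping CAT for CAC keeps histidine (both codons encode His in the standard genetic code) while destroying an NdeI recognition sequence (CATATG). The sketch below uses a made-up 12-base fragment, not the DiLiFP570 sequence, and assumes that the histidine codon forms the first three bases of the NdeI site; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

NDEI_SITE = "CATATG"


def translate(cds):
    """Translate an in-frame coding sequence, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def replace_codon(cds, codon_index, new_codon):
    """Return cds with the codon at zero-based codon_index replaced."""
    start = 3 * codon_index
    return cds[:start] + new_codon + cds[start + 3:]


# Hypothetical fragment: GGT CAT ATG GAA (Gly-His-Met-Glu) containing CATATG.
fragment = "GGTCATATGGAA"
mutated = replace_codon(fragment, 1, "CAC")

same_protein = translate(fragment) == translate(mutated)
site_removed = NDEI_SITE in fragment and NDEI_SITE not in mutated
```

For the real construct the codon index would be that of His61 (zero-based index 60 if numbering starts at the initiator methionine, which is an assumption about the numbering used here).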

Crystallization of DiLiFP570
Crystallization of DiLiFP570 was performed by the hanging-drop vapor diffusion method. The crystallization condition was 3.0 M ammonium sulphate, 0.1 M imidazole at pH 6.0 at 20 °C with a protein concentration of 2.5 mg/mL. X-ray diffraction data were collected at beamline BL-5A of the KEK Photon Factory, Tsukuba, Japan. The collected data were processed by XDS [22]. The initial phase was determined by molecular replacement using MOLREP [23]. The closest homologous protein structure was a GFP-like nonfluorescent chromoprotein (PDBID: 3VK1) with a sequence homology of 67%, which was used as a search model. Structural refinement was performed by PhenixRefine [24] and Refmac5 [25]. A structural model was built using Coot [26]. The constructed structural model was deposited in the Protein Data Bank through PDBj (pdbj.org) with ID 7X2B. Pictures of the protein structure were created with Pymol (www.pymol.org).

Cloning of fluorescent protein genes
We collected a sea anemone, Diadumene lineata, from the Japan Sea coast. It showed orange stripes in the column and red stripes along the tentacles under blue light (Fig. 1a). We performed expression cloning of fluorescent proteins from body tissue bearing an orange stripe and found a bright fluorescent colony on the plate (Fig. 1b). From that colony, a gene encoding an orange fluorescent protein was obtained. We named this fluorescent protein 'DiLiFP561', based on the absorption maximum in the visible range shown in the following section. A BLAST [27] search revealed that two very similar fluorescent protein sequences (plum-1 and plum-2) have been reported recently by Sarper et al. [28]. The amino acid sequence of DiLiFP561 is 99.09% and 97.73% identical with plum-2 and plum-1, respectively.
Using the N-terminal and C-terminal DNA sequences of DiLiFP561, we performed PCR cloning on the tentacle and found red fluorescent colonies on the plate (Fig. 1c). We obtained genes encoding red fluorescent proteins from two bright clones and named them 'DiLiFP570' and 'DiLiFP571'. When compared to DiLiFP561, sequence alignments for DiLiFP570 and DiLiFP571 showed 15 and 12 amino acid substitutions out of 220, respectively, which correspond to identity percentages of 93.18% and 94.55% (Fig. 2). We registered the sequence information of DiLiFP561, DiLiFP570, and DiLiFP571 in DDBJ with the accession numbers LC682408, LC682409, and LC682410, respectively.
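The identity percentages quoted above follow directly from the substitution counts, as the short sketch below shows. The function name is illustrative, and the calculation assumes a gap-free alignment over all 220 residues.

```python
def percent_identity(n_substitutions, aligned_length):
    """Percent identity for a gap-free alignment with a known substitution count."""
    return 100.0 * (aligned_length - n_substitutions) / aligned_length


identity_570 = round(percent_identity(15, 220), 2)  # DiLiFP570 vs DiLiFP561
identity_571 = round(percent_identity(12, 220), 2)  # DiLiFP571 vs DiLiFP561
```

These evaluate to 93.18 and 94.55, matching the values in the text.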

Oligomeric states of DiLiFPs
Oligomeric states of the proteins were analyzed by pseudo-native SDS PAGE (Fig. 3) [13,21]. The theoretical molecular weight of DiLiFP monomers calculated from their sequences was ~ 25 kDa. All DiLiFPs migrated almost the same distance as a tetrameric fluorescent protein (Kaede [7]) with a molecular weight of ~ 100 kDa (Fig. 3b). A small fraction of DiLiFPs was seen at the same distance as a monomeric fluorescent protein (eGFP) (Fig. 3a); however, this fraction was non-fluorescent. From this result, we concluded that DiLiFPs formed obligate tetramers.

Photophysical characterization of purified fluorescent proteins
Absorption and fluorescence spectra of the purified DiLiFPs were measured at pH 7.0 (Table 1, Fig. 4). The absorption bands at 280 nm were assigned to amino acid residues. The visible absorption maxima originating from the chromophores of DiLiFP561, DiLiFP570, and DiLiFP571 were located at 561, 570, and 571 nm, respectively. A shoulder was also observed at higher energy, revealing a vibronic structure.
The extinction coefficients at these maxima were 1.08 × 10^5, 9.54 × 10^4, and 8.44 × 10^4 M−1·cm−1, respectively, based on the matured chromophore concentration [15][16][17]. The fractions of properly folded DiLiFP561, DiLiFP570, and DiLiFP571 were 55.6%, 50.6%, and 47.8%, respectively. The emission spectrum of DiLiFP561 was the mirror image of the absorption spectrum, with a vibrational structure (Fig. 4). The fluorescence spectrum had its maximum at 578 nm, and the fluorescence quantum yield was calculated as 0.54. In contrast, the emission spectra of DiLiFP570 and DiLiFP571 were characterized by a broad band with only one maximum and a relatively large Stokes shift. The emission maxima of DiLiFP570 and DiLiFP571 were at 600 and 605 nm, respectively, and the fluorescence quantum yields were 0.46 and 0.37. The broadband emission of the red fluorescent proteins could be a sign of higher flexibility, due to a larger rearrangement in the excited state concomitant with the existence of a non-radiative pathway.
Absorption and fluorescence spectra of DiLiFP561 were similar to those of plum in Ref. [28]. This result was expected, since the amino acid sequence of DiLiFP561 was 99.09% and 97.73% identical with plum-2 and plum-1, respectively. In contrast, the absorption and fluorescence spectra of DiLiFP570 and DiLiFP571 were red-shifted compared to those of plum-2 and plum-1.
The fluorescence decays were measured at three different wavelengths: the emission maximum, and 25 and 50 nm longer than the maximum. Two-phase exponential regression analyses were performed for all three proteins. An additional fast component (0.3 ns) was required for DiLiFP570 and DiLiFP571. This component appeared as a decay component at 600 nm, and as a rise component at 625 and 650 nm. The amplitude-weighted average fluorescence lifetimes of DiLiFP561, DiLiFP570 and DiLiFP571 were calculated to be 3.7, 3.6 and 3.0 ns, respectively, at their emission maximum wavelengths (Table 1, Fig. 5). No significant difference was observed in the average fluorescence lifetime detected at the longer wavelengths (Table S2, Supporting information). So far, only five orange and red fluorescent proteins have been reported with longer lifetimes than DiLiFP561 and DiLiFP570: mKO (4.1 ns), KO (4.2 ns) [29], mCherry-XL (3.9 ns) [30], mCyRFP1 (3.7 ns) [31] and mScarlet (3.9 ns) [32]. The shorter lifetime of DiLiFP571 compared with the others was in line with its fluorescence quantum yield. DiLiFP570 had a 95% contribution from the longest component, whereas the decay curve of DiLiFP571 had about a 75% contribution from this component. This can be linked to the existence of more than two states in the ground state [33] or to states formed only after excitation. The short 0.3 ns component observed for DiLiFP570 and DiLiFP571 changes sign (decay/rise) and could be assigned to excited state proton transfer or Förster resonance energy transfer between a neutral and an anionic species, as observed for red Kaede species [34,35].
To evaluate the pH dependency of DiLiFPs, emission spectra at various pH values were acquired (Fig. 6). A blue shift of the emission spectra at acidic pH was observed only for DiLiFP570 and DiLiFP571. Fluorescence of DiLiFPs became dimmer at lower pH (Fig. 6D).
The apparent pKa values, which were estimated from the pH values that showed a half maximum of the fluorescence intensity, were 4.3, 4.5 and 4.6 for DiLiFP561, DiLiFP570 and DiLiFP571, respectively.
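One simple way to read an apparent pKa off a pH titration is to interpolate the pH at which the intensity crosses half of its maximum. The sketch below assumes linear interpolation between neighbouring data points, which may differ from the procedure actually used; the function name and the titration values are made up for illustration.

```python
def apparent_pka(ph_values, intensities):
    """pH at which intensity first rises through half of its maximum.

    Uses linear interpolation between adjacent points sorted by pH.
    """
    half = max(intensities) / 2.0
    points = sorted(zip(ph_values, intensities))
    for (ph_lo, i_lo), (ph_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= half <= i_hi and i_hi != i_lo:
            return ph_lo + (half - i_lo) * (ph_hi - ph_lo) / (i_hi - i_lo)
    raise ValueError("intensity never crosses half of its maximum")


# Made-up titration (normalized intensities), not data from Fig. 6.
example_ph = [3.0, 4.0, 5.0, 6.0]
example_intensity = [0.1, 0.4, 0.8, 1.0]
example_pka = apparent_pka(example_ph, example_intensity)
```

For the made-up data the half-maximum (0.5) lies between pH 4 and pH 5, giving 4 + 0.1/0.4 = 4.25. Fitting a sigmoidal titration curve would be a more robust alternative when the points are noisy.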
The brightness of a fluorescent protein is defined as the product of the molar extinction coefficient and the fluorescence quantum yield. The brightness at pH 7 of DiLiFP561 and DiLiFP570 was 58.3 and 43.9 mM−1·cm−1, respectively. The brightness of DiLiFP571 was 31.2 mM−1·cm−1, which was lower than that of the other DiLiFPs. Among the long-lifetime orange and red fluorescent proteins reported so far, only mScarlet shows higher brightness (70.0 mM−1·cm−1) than DiLiFP561 and DiLiFP570.
The final property we evaluated for imaging applications was maturation. DiLiFPs for this experiment were purified rapidly (less than 1 h) on ice from bacteria in which the expression of the proteins was induced for 1 h. The fluorescence intensities of DiLiFP570 and DiLiFP571 were saturated in less than 24 h and the level was maintained for 96 h (Fig. S1, Supporting information). Their fluorescence intensities at 3 h were more than half of the plateau level. The maturation of DiLiFP561 showed a similar trend, although the fluorescence intensity at 24 h was ~ 80% of that at 96 h (Fig. S1d, Supporting information). Therefore, the time required for half maturation of all three DiLiFPs was shorter than 3 h. From these results, we concluded that the maturation of DiLiFPs is fast enough for imaging applications.
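The brightness values can be reproduced from the extinction coefficients and quantum yields reported above. The sketch below only restates that product; the function name and dictionary layout are illustrative.

```python
def brightness_mM(extinction_M, quantum_yield):
    """Brightness in mM^-1 cm^-1 from an extinction coefficient in M^-1 cm^-1."""
    return extinction_M * quantum_yield / 1000.0


# Extinction coefficients (M^-1 cm^-1) and quantum yields as reported in the text.
proteins = {
    "DiLiFP561": (1.08e5, 0.54),
    "DiLiFP570": (9.54e4, 0.46),
    "DiLiFP571": (8.44e4, 0.37),
}

brightness = {name: round(brightness_mM(eps, phi), 1)
              for name, (eps, phi) in proteins.items()}
```

This gives 58.3, 43.9 and 31.2 mM−1·cm−1 for DiLiFP561, DiLiFP570 and DiLiFP571, in agreement with the stated values.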

Crystal structure of DiLiFP570
The crystal of DiLiFP570 diffracted to a resolution of 1.63 Å. The crystal belonged to the I222 space group, and one protein molecule was contained in the asymmetric unit (Table 2).
The DsRed-like tetramer can be generated by the symmetry operation (Fig. 7a). The protein molecule contains an artifact sequence after thrombin cleavage at its N-terminus (Gly-Ser-His) and these residues were disordered in the structure. Except for these disordered terminal residues, electron density map for the rest of the entire protein molecule was clear. Met62-Tyr63-Gly64 forms the DsRed-like chromophore in the structure with the extended two double bonds, which was confirmed by the positioning of N and Cα atoms of Met62 on the same plane of chromophore aromatics (Fig. 7b).
The photophysical properties of fluorescent proteins are influenced by the stabilization of β-strands and hydrogen bonds [38,39]. To reveal the molecular mechanism of the fluorescent color difference between DiLiFP561 and DiLiFP570, we mapped the residues which differ between the two FPs (Fig. 7c; non-conserved residues are shown with yellow stick representation). We found that most of these residues are located at relatively distant positions from the chromophore and are exposed to the solvent. Among them, only Met159 was located in the vicinity of the chromophore (Fig. 7c). For DiLiFP561, residue 159 is Lys, a positively charged residue, while there is no charge in DiLiFP570 at this position. The residue corresponding to Lys159 is Lys163 in DsRed [40]. The primary amine of the Lys163 side chain is considered a critical group for maintaining the negative charge of the chromophore phenol oxygen via a salt bridge at all times. A similar Lys-to-Met substitution was observed in a monomeric DsRed variant, mRFP1, generated by directed evolution [41]. Red shifts of the emission and excitation peaks are also reported for mRFP1; the emission and excitation peak wavelengths of mRFP1 are 24 and 26 nm longer than those of DsRed, while DiLiFP570/DiLiFP571 show 9/10 nm red shifts in their excitation and 22/27 nm in their emission peak wavelengths compared with DiLiFP561. The red shifts of fluorescence for DiLiFP570/DiLiFP571 are therefore consistent with the difference between DsRed and mRFP1. We think that the absence of the positive charge and associated interactions in DiLiFP570/DiLiFP571 could be the reason for the red shifts of fluorescence. The positive charge of Lys159 localizes electron density towards the hydroxyl oxygen of the chromophore. We consider that this localization is eliminated by the methionine substitution at position 159, and that the resulting delocalization stabilizes the π-conjugation, leading to a red shift of fluorescence.

Conclusion
Considering that all DiLiFPs shown in this study originate from natural organisms, it is interesting that the naturally obtained color-change mutation is consistent with the mutation found through protein engineering during the DsRed modifications. To our knowledge, this type of observation is the first in the fluorescent protein research field. Hence, cloning of fluorescent proteins from uninvestigated species will still provide interesting insights for developing fluorescent proteins. DiLiFP561 and DiLiFP570 were orange and red fluorescent proteins with long fluorescence lifetimes and relatively high pH stability. The newly cloned orange and red fluorescent proteins are promising as fluorescent markers for imaging that requires a long lifetime and/or is performed under mildly acidic conditions.
Acknowledgements We thank the beamline staff of KEK-PF, Tsukuba, Japan, for the X-ray diffraction data collection. This work was supported in part by YU-COE program of Yamagata University, the Category 1 research grant from KU Leuven (C14/16/053) and PRESTO of JST (JPMJPR10QA). DL acknowledges KU Leuven Facultaire Luik Onderzoeksfonds (FLOF) for financial support. Chevreul Institute (FR 2638), Ministère de l'Enseignement Supérieur, de la Recherche et de l'Innovation, Hauts-de-France Region and FEDER are acknowledged for supporting and partially funding this work. We thank the reviewers for their insightful and helpful comments.
Data availability The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Declarations
Conflict of interest On behalf of all authors, the corresponding author states that there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.