Biological context

Several unrelated types of protein appear to have intrinsic surfactant activity as their primary function, and on current evidence latherin is one of them. Latherin is one of the most abundant proteins in horse sweat; it is also found in horse saliva and is a known allergen for some humans (McDonald et al. 2009). Its function is believed to be to wet the hydrophobic hairs of the pelt, thereby speeding the translocation of sweat water to the pelt surface for evaporative cooling (McDonald et al. 2009). The hydrophobins, a family of surface-active proteins produced by filamentous fungi, and RSN-2, a surfactant protein present in the foam nests of certain species of frogs, have been investigated at the protein structure level (Linder 2009; Cooper et al. 2005; Fleming et al. 2009). Latherin exhibits no amino acid sequence similarity to either of these proteins, but is instead a member of the palate, lung, and nasal epithelium carcinoma-associated family of proteins (PLUNCs) found in mammals (McDonald et al. 2009). Latherin and the PLUNCs are, in turn, related to the larger, two-domain bactericidal/permeability-increasing protein (BPI), cholesteryl ester-transfer protein (CETP) and lipopolysaccharide-binding protein (LBP). Although the functions of individual PLUNCs have not been confirmed, they are postulated to play a role in the innate immune response (Bingle and Craven 2002). The structures of BPI and CETP are available, but no structure for any member of the PLUNC family has thus far been reported.

Latherin’s amino acid sequence is unusually rich in aliphatic residues, in particular leucine, which accounts for almost 25 % of its residues, compared with an average of 9.67 % across all proteins in the SwissProt database (McDonald et al. 2009). This abundance of leucines is also a feature of one of the human PLUNCs that, like latherin, exhibits strong surfactant activity (Gakhar et al. 2010). Latherin therefore presents an opportunity not only to investigate the relationship between structure and function in a unique mammalian surfactant protein, but potentially also to understand the structure and function of the PLUNCs as a whole, for which there is currently little or no structural or direct functional information.
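The leucine content quoted above is a simple residue-frequency calculation. The following is a minimal sketch of how such a figure could be obtained from a one-letter sequence string; the function name and the toy sequence are hypothetical illustrations (the toy sequence is not the latherin sequence), and only the 9.67 % database average is taken from the text.

```python
from collections import Counter


def residue_fraction(sequence: str, residue: str) -> float:
    """Return the fraction of positions in `sequence` occupied by `residue`."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return Counter(seq)[residue.upper()] / len(seq)


# Hypothetical 20-residue toy sequence for illustration only;
# this is NOT the latherin sequence.
toy_sequence = "MLKLLALLGLVSLAELLNKA"

# SwissProt-wide leucine average quoted in the text (McDonald et al. 2009).
SWISSPROT_LEU_PERCENT = 9.67

leu_percent = 100.0 * residue_fraction(toy_sequence, "L")
enrichment = leu_percent / SWISSPROT_LEU_PERCENT

print(f"Leu content: {leu_percent:.2f} % ({enrichment:.1f}x the database average)")
```

Applied to the mature latherin sequence (UniProt P82615 without the signal peptide), the same calculation would be expected to give a value close to the roughly 25 % reported above.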

Methods and experiments

A synthetic latherin (sLath) gene, based upon the previously described cDNA encoding latherin (GenBank AF491288; UniProt/Swiss-Prot P82615), excluding the presumptive secretory leader/signal peptide and optimised for expression in Escherichia coli, was purchased from GeneArt. The sLath gene was then directionally inserted into the NcoI and BamHI sites of the pET32a expression vector (Novagen), allowing for the production of recombinant latherin extended by two extra amino acids, AM (single-letter amino acid code), at the N-terminus of the wild-type sequence. The ‘sLath/pET32’ plasmid was transformed into Tuner (DE3) cells (Novagen). Expression was carried out in Luria–Bertani broth for non-labeled samples or in M9 minimal media (Sambrook et al. 1989) containing the relevant isotope(s) for the production of single (15N only) or double (15N, 13C) labeled samples. The protein was purified to near homogeneity, as estimated by SDS-PAGE, as described previously (McDonald et al. 2009).

For NMR, the protein was concentrated to approximately 600 μM in 50 mM NaCl, 20 mM sodium phosphate, 1 mM sodium azide, pH 7.5. D2O was added to a final concentration of 5 % (v/v). All experiments were performed at 310 K using a Bruker AVANCE 600 MHz spectrometer equipped with 5 mm triple-resonance probes and pulsed-field gradients. The WATERGATE tailored selective excitation sequence was typically used for water suppression (Piotto et al. 1992). Proton chemical shifts were referenced relative to the H2O offset frequency, and heteronuclear chemical shifts were calculated from the proton reference according to the method of Wishart et al. (1995). NMR spectra were processed using AZARA (Wayne Boucher, Department of Biochemistry, University of Cambridge, http://www.bio.cam.ac.uk/azara) and assigned using CCPNmr Analysis (Vranken et al. 2005). Maximum entropy reconstruction (Laue et al. 1986) was used to enhance the resolution of the indirect dimensions of three-dimensional experiments.
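Indirect referencing of the heteronuclear dimensions can be illustrated with a short calculation. The sketch below assumes the frequency ratios (Ξ) recommended by Wishart et al. (1995), relative to the 1H resonance of DSS; the function names and the example 1H zero-ppm frequency are hypothetical and are not taken from the experiments described here. How the 1H zero-ppm frequency is derived from the H2O offset is not shown.

```python
# Frequency ratios (Xi) relative to the 1H resonance of DSS,
# as recommended by Wishart et al. (1995).
XI_13C = 0.251449530
XI_15N = 0.101329118


def zero_ppm_frequency(h1_zero_mhz: float, xi: float) -> float:
    """Absolute frequency (MHz) at 0 ppm for a heteronucleus, given the
    absolute 1H frequency (MHz) at 0 ppm and the ratio Xi."""
    return h1_zero_mhz * xi


def shift_ppm(observed_mhz: float, zero_mhz: float) -> float:
    """Chemical shift (ppm) of a signal observed at `observed_mhz`,
    relative to the 0 ppm frequency `zero_mhz`."""
    return (observed_mhz - zero_mhz) / zero_mhz * 1e6


# Hypothetical 1H zero-ppm frequency for a nominal 600 MHz spectrometer.
h1_zero = 600.13

c13_zero = zero_ppm_frequency(h1_zero, XI_13C)
n15_zero = zero_ppm_frequency(h1_zero, XI_15N)

print(f"13C 0 ppm: {c13_zero:.6f} MHz")
print(f"15N 0 ppm: {n15_zero:.6f} MHz")
```

Once the zero-ppm frequencies are fixed in this way, every 13C and 15N shift follows from the single proton reference, so no separate reference compound is needed for the heteronuclear dimensions.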

Sequence-specific resonance assignment of the latherin backbone was accomplished with the aid of 2D 15N-HSQC (see Fig. 1), 3D HNCACB, 3D CBCA(CO)NH (Muhandiram and Kay 1994), 3D HNCO (Kay et al. 1994), 3D HNCACO, 3D HBHA(CBCA)NH (Wang et al. 1994) and HBHA(CBCACO)NH spectra. The majority of aliphatic sidechain carbon and proton resonances were located by navigating from the backbone data using 2D 13C-HSQC, 3D (H)C(CO)NH-TOCSY and 3D H(C)(CO)NH-TOCSY spectra (Grzesiek and Bax 1992). The many overlapping leucine sidechain resonances were assigned using 3D methyl-selective experiments (Uhrin et al. 2000) modified to remove CH2 resonances from the methyl proton-carbon planes (see Fig. 2). Remaining aliphatic resonances were identified using 3D 13C-edited [1H, 1H]-NOESY spectra. A proportion of aromatic sidechain 13C/1H signals (histidine Hδ1, tryptophan Hδ1, tyrosine Hδ,ε and phenylalanine Hδ,ε) were assigned using 2D HBCBCGCDHD and 2D HBCBCGCDCEHE spectra (Yamazaki et al. 1993), and the remainder were identified from the 13C-edited [1H, 1H]-NOESY spectrum.

Fig. 1

The 15N HSQC spectrum of latherin at 310 K. The residue-specific assignments are indicated, and the crosspeaks assigned to sidechain NH2 groups are linked by horizontal dashed lines

Fig. 2

The methyl region of the 13C, 1H correlation spectra of latherin, illustrating the increased resolution of the leucine methyl crosspeaks possible with (a) the me-HCCH-TOCSY experiment as compared to (b) the conventional 13C-HSQC

Extent of assignments and data deposition

All latherin polypeptide backbone resonances were assigned, with the exception of the N-terminal residues A(−2), M(−1), A(0); two isolated residues, S59 and K82; and a number of residues located in a dynamic loop region (G145, N146, S149, L150, N153, A154). A total of 93.51 % of backbone residues were identified, while assignment of non-labile amino acid sidechain protons is 94.23 % complete. The majority of the missing assignments belong to residues within the dynamic loop region spanning residues 145–154. Despite the high relative abundance of leucine within the protein, the experiments allowed full assignment of all leucine residues. A few resonances displayed chemical shifts outside the known distribution of shifts. Sidechain protons in residues D85, R113 and N180 all displayed ring current shifts attributable to their close proximity to aromatic residues. The Cγ of L135 has an atypical chemical shift of 31.09 ppm. This residue is buried within the hydrophobic core of the protein, surrounded by other aliphatic residues, and analysis of its stereochemical properties in the calculated structure indicated φ, ψ, χ1 and χ2 dihedral angles in favourable regions. The atypical chemical shift of L135 Cγ therefore remains to be explained.

The 1H, 13C and 15N chemical shift assignments have been deposited with the BioMagResBank database (http://www.bmrb.wisc.edu), accession number 19067.