Introduction

The secretoglobins (SCGBs) comprise a family of secreted proteins found in mammals, including marsupials. The first SCGB to be discovered was found in rabbits; it was initially called blastokinin [1], later uteroglobin [2], and is now designated SCGB1A1 (in some early literature, the SCGB family is referred to as the 'uteroglobin' family). Eventually, the term 'secretoglobin' was coined to refer to the characteristics that all family members have in common. The 'secreto-' portion of the name indicates that these proteins are secreted. A second reason was proposed for the prefix 'secreto-': their functions had largely remained a secret (Lehrer, R., personal communication). The suffix 'globin' was given because secretoglobins form dimers consisting of two four-α-helix-bundle monomers, creating a hydrophobic binding pocket reminiscent of the globin fold, which is an eight-α-helix bundle with a pocket for a molecule such as a heme group [3].

Secretoglobins are found at high levels in many secretions, including those of the uterus, prostate, lung, and lacrimal and salivary glands, with any given secretoglobin often being expressed in more than one tissue. For example, mRNA expression of every SCGB family member (except SCGB1D2) has been demonstrated in human airways [4]. In general, the physiological and pathophysiological functions of most individual SCGBs remain to be defined. Nevertheless, roles currently ascribed to SCGBs include lung maintenance and repair, immune modulation and, at least in rodents, mate selection. Some SCGB family members, such as mammaglobin, have been successfully used as epithelial cancer biomarkers.

SCGBs are small (~10 kDa in humans) proteins that dimerise before secretion. Dimers are resistant to proteases, heat and extremes of pH [5, 6]. The crystal structures of several SCGBs have been resolved, including those of rabbit and rat uteroglobin (Protein Data Bank identifiers [PDB ID]: 1UTG, 2UTG, 1UTR), rat Clara-cell specific protein (CCSP) (PDB ID: 1CCD) and feline CH2 (Fel d 1) (PDB ID: 1PUO, 1ZKR, 2EJN) [7]. These proteins contain four α-helical structures and assemble into homo- or heterodimers orientated in anti-parallel fashion, held together by covalent disulphide bonds (via one to three conserved Cys residues) and non-covalent interactions [8].

The uteroglobin (UGB) dimer forms an internal hydrophobic cavity, located at the interface between the two subunits; this is where hydrophobic ligands bind, including steroid hormones, some polychlorinated biphenyl metabolites, retinoids and various eicosanoid mediators of inflammation [9, 10]. Each UGB subunit consists of four α-helices, which do not form a canonical four-helix bundle motif but, rather, a boomerang-shaped structure. The subunits are connected in an anti-parallel fashion to form a dimer in which helices 3 and 4 are involved in the dimer interface. In the structure of SCGB1A1, six residues (Phe6, Leu13, Tyr21, Phe28, Met41 and Ile63) in each subunit have been identified as being particularly important to this aspect of UGB structure [11]. All of these are accessible to the ligand except Phe28, which probably functions in maintaining the dimer interface. The other five are involved in ligand binding. The aromatic residues Phe6 and Tyr21 are critical to this binding and cannot be replaced by aliphatic amino acids. Conversely, Leu13 is accessible to solvent in the hydrophobic pocket and is commonly substituted by aromatic amino acids. This suggests that Leu13 may be involved in determining ligand specificity.

Sources of secretoglobin genes and proteins

Protein sequences for human SCGBs were accessed from UniProt [12] through the HUGO Gene Nomenclature Committee website (http://www.genenames.org). Sequences for mouse SCGBs were retrieved from the National Center for Biotechnology Information (NCBI) gene database (http://www.ncbi.nlm.nih.gov/gene), and from the 'supplementary data' of Laukaitis et al. [13]. Sequences were aligned with T-COFFEE using the most accurate mode, which combines multiple sources of sequence homology and structural information, where available [14].
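As an illustration only, the pooling of human and mouse sequences into a single input file for alignment might be sketched as follows. This is not the procedure used for this report: the helper names (`parse_fasta`, `merge_records`, `to_fasta`, `tcoffee_command`) and the toy sequences are hypothetical, and the `-mode accurate` argument is an assumption about the T-COFFEE command line that should be checked against the documentation of the installed version.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}; duplicate identifiers are rejected."""
    records: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            if not words:
                raise ValueError("header line without an identifier")
            current = words[0]  # identifier is the first word of the header
            if current in records:
                raise ValueError(f"duplicate identifier: {current}")
            records[current] = ""
        elif current is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[current] += line
    return records


def merge_records(*sources: Dict[str, str]) -> Dict[str, str]:
    """Combine several record sets, refusing to overwrite an identifier silently."""
    merged: Dict[str, str] = {}
    for source in sources:
        for name, seq in source.items():
            if name in merged:
                raise ValueError(f"identifier present in two sources: {name}")
            merged[name] = seq
    return merged


def to_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Serialise records back to FASTA text, wrapping sequences at `width` columns."""
    lines: List[str] = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def tcoffee_command(fasta_path: str) -> List[str]:
    """Build (but do not run) an alignment command.

    Assumption: '-mode accurate' selects the accurate mode mentioned in the text.
    """
    return ["t_coffee", fasta_path, "-mode", "accurate"]
```

The command is returned as a list so that it can be inspected before being passed to `subprocess.run`; rejecting duplicate identifiers early avoids one sequence silently replacing another when records from several databases are pooled.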

Human gene family members

As is commonly the case for a newly discovered family of proteins, SCGBs were originally named based on the location in which they were most highly expressed; this led to the same SCGB often being 'rediscovered' and named multiple times. In 2000, a standard nomenclature was established, when all proteins in the family were named SCGBs and assigned family and subfamily names [3]. The nomenclature system was based on that used for the cytochrome P450 [15, 16] and nuclear hormone receptor [17] superfamilies, and was guided by the phylogenetic relationships of known SCGB family members, assembled by Ni and colleagues [18]. This provided a convenient and systematic naming system for an entire superfamily. In this report, the most common names used for each protein are listed, along with their standardised names. The human genome contains 11 SCGB genes and five pseudogenes (Figure 1).

Figure 1

Phylogenetic tree of mouse (m) and human (h) SCGBs. For simplicity, and to avoid clutter, only SCGB1B27 (ABPA) and SCGB2B27 (ABPBG) of the mouse androgen-binding protein (ABP) group are included. SCGB protein sequences were aligned using T-COFFEE [14] and analysed using nearest-neighbour-joining methods, with 10,000 bootstrap replicates, in the PHYLIP package [19]. Nodes with ≥50 per cent bootstrap confidence levels have been labelled.
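The idea behind the bootstrap confidence levels can be sketched in a few lines. The example below is a simplified illustration, not the PHYLIP workflow used for the figure: it handles only four taxa (for which the first neighbour-joining merge already fixes the unrooted topology), it uses the uncorrected p-distance (an assumption, as the distance model is not stated here), and the function names and toy alignment are hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, FrozenSet, Iterable


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned, gap-free columns at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def first_nj_join(alignment: Dict[str, str]) -> FrozenSet[str]:
    """Pair that neighbour joining merges first: the pair minimising
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)."""
    names = list(alignment)
    n = len(names)
    d = {frozenset(p): p_distance(alignment[p[0]], alignment[p[1]])
         for p in combinations(names, 2)}
    row_sum = {x: sum(d[frozenset((x, y))] for y in names if y != x) for x in names}
    best = min(combinations(names, 2),
               key=lambda p: (n - 2) * d[frozenset(p)] - row_sum[p[0]] - row_sum[p[1]])
    return frozenset(best)


def resample_columns(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """One bootstrap replicate: draw alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def split_support(alignment: Dict[str, str], pair: Iterable[str],
                  replicates: int = 1000, seed: int = 0) -> float:
    """Fraction of replicates whose four-taxon tree groups `pair` together."""
    names = set(alignment)
    if len(names) != 4:
        raise ValueError("this sketch only handles exactly four taxa")
    target = frozenset(pair)
    complement = frozenset(names - target)
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        joined = first_nj_join(resample_columns(alignment, rng))
        hits += joined in (target, complement)
    return hits / replicates


if __name__ == "__main__":
    toy = {  # invented sequences, for illustration only
        "A": "MKLAVTLALV", "B": "MKLAVSLALV",
        "C": "MRIATTFCLL", "D": "MRIASTFCLI",
    }
    support = split_support(toy, ("A", "B"))
    print(f"support {support:.2f}; labelled on tree: {support >= 0.5}")
```

A node would be labelled under the ≥50 per cent rule when `split_support` returns at least 0.5. For larger trees, the full neighbour-joining procedure and a consensus step over all replicates are required, as provided by dedicated packages.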

SCGB1A1 subfamily

UGB (SCGB1A1), also known as blastokinin and CCSP, was initially discovered in the rabbit uterus and is the founding family member [2]. For this reason, more information about its biology is available than for many of the other SCGBs. SCGB1A1 proteins differ from other SCGBs in that they are homodimers (that is, they are composed of two identical monomers) and their subunits lack the middle Cys residue found in other SCGBs. In humans, high SCGB1A1 levels are found in peripheral airway surface fluid, where it is one of the most abundant proteins; it is also expressed in uterine endometrium and the prostate [20]. In the airways, SCGB1A1 is expressed in several cell types, especially Clara cells, and appears to play a role in immunomodulation through regulation of cell infiltration and in tissue repair after injury [20].

SCGB1A1 may also exert anti-tumorigenic activity. For example, ablation of the mouse Scgb1a1 gene in some strains is usually lethal and survivors develop tumours [21]. Conversely, recombinant SCGB1A1 inhibits proliferation and invasion of some cancer cell lines [20]. Studies of the Scgb1a1(-/-) knockout mouse suggest that SCGB1A1 may provide protection from oxidative stress and exert anti-inflammatory actions, in addition to providing resistance to pollutant-induced injury [22]. Interestingly, SCGB1A1 is initially downregulated to allow the body to respond to an infection [23].

SCGB1B subfamily

The human genome contains six genes that cluster phylogenetically with genes encoding mouse androgen-binding proteins (Scgb1b and Scgb2b). These genes were described based on genomic analysis, and have been given SCGB4A designations [24]. Based on phylogenetic clustering of their protein sequences, however, we propose that these genes be changed to SCGB1B and SCGB2B designations, to reflect their similarity to the mouse proteins. The SCGB1B subfamily includes SCGB1B1P (formerly ABPA1P), SCGB1B2P (formerly SCGB4A1P) and SCGB1B3 (formerly SCGB4A4). SCGB1B1P and SCGB1B2P are predicted to have become pseudogenes, whereas SCGB1B3 has no obvious inactivating mutations. Interestingly, however, SCGB1B2P is the only SCGB1B member having evidence for expression in expressed sequence tag (EST) databases.

SCGB1C1

SCGB1C1 has been shown to be localised to Bowman's glands in the olfactory mucosa. Here, it is thought to act as an odorant-binding protein, with ligands appearing to be small, hydrophobic molecules [25].

SCGB1D subfamily

SCGB1D1 and SCGB1D2 are also known as lipophilin A and lipophilin B, respectively. The lipophilins form heterodimers with SCGB2A proteins, which further associate to form tetramers. They have been identified in the prostatic fluid of rats and in the lacrimal gland fluid of humans and rabbits [26]; little is known about their function.

SCGB1D4 is widely distributed throughout the body; however, expression is particularly strong in the lymph node, tonsil, cultured lymphoblasts and ovary. It is inducible by interferon-γ in lymphoblast cells. SCGB1D4 appears to exhibit immunological functions, including regulation of chemotactic migration and invasion [27]. There is one pseudogene in this subfamily identified as SCGB1D1P1. We propose that it be renamed SCGB1D5P, keeping it in line with the other members of this subfamily.

SCGB2A subfamily

SCGB2A1 is also known as lipophilin C. SCGB2A2 is also known as mammaglobin and is expressed in a highly tissue-specific manner in breast epithelium, where it forms heterodimers with SCGB1D2 [28].

SCGB2B subfamily

The SCGB2B gene subfamily includes SCGB2B1P (formerly ABPBG1P), SCGB2B2 (formerly SCGB4A2, SCGBL) and SCGB2B3P (formerly SCGB4A3P). Of the SCGB1B and SCGB2B subfamilies, only SCGBL is listed in the HGNC database, but we propose a name change to include it in the SCGB superfamily-naming system. There is only a single reference to SCGB2B2 in the literature [24] but there is evidence for its expression in EST databases.

SCGB3A subfamily

SCGB3A1 and SCGB3A2, identified in 2002, have high structural homology [29] with SCGB1A1. Their expression appears to be localised principally to epithelial organs, such as the lung, mammary gland, trachea, prostate and salivary gland [30]. In the bronchial epithelium, expression is decreased after injury [29]. It has been proposed [29] that SCGB3A1 might have similar and overlapping expression and function with SCGB1A1.

SCGB3A1 is a candidate tumour-suppressor gene and a target gene for endothelial PAS domain protein 1 (EPAS1, formerly HIF2α) [31]. SCGB3A1 expression is diminished in many human cancers (including lung, prostate, pancreatic and nasopharyngeal); hypermethylation of the SCGB3A1 promoter has also been reported for many malignancies [31]. SCGB3A2 has been shown to be induced by T-helper cell 1 (Th1) cytokines but suppressed by proinflammatory and Th2 cytokines [4]. Any given cytokine can evoke different responses in different SCGBs [4]. Intranasal administration of recombinant SCGB3A2 suppresses allergen-induced lung inflammation, further highlighting similarities between SCGB3A2 and SCGB1A1 [32].

Mouse gene family members

Scgb1a1

This gene encodes mouse UGB and is orthologous to human SCGB1A1.

Scgb1c1

This gene encodes a protein that is the mouse equivalent of human SCGB1C1.

Scgb1b and Scgb2b: The androgen-binding protein (ABP) family

Sixty-four of the 68 mouse Scgb genes belong to a family that has been called the ABP family [13]. These proteins are heterodimers consisting of two distinct types of subunits, SCGB1B (previously called ABPA-like), and SCGB2B (previously ABPBG-like). These were originally isolated from mouse saliva and described based on their ability to bind androgens [33]. ABPs have since been shown to be expressed in glands of the face and neck, as well as in the prostate and ovary [34]. The role of ABPs in communication is supported by the expression of many Abpa (Scgb1b) and Abpbg (Scgb2b) mRNAs in the brain (olfactory lobe), sensory organs (olfactory epithelium, vomeronasal organ), glands of the head and neck (parotid, sublingual, submaxillary and lacrimal) and sexual tissues (prostate and ovary and preputial and clitoral glands) [13].

Scgb3a genes

Scgb3a1 and Scgb3a2 encode predicted proteins that align well with human SCGB3A1 and SCGB3A2 protein structures and are most likely orthologous to them.

Evolution

SCGB members have amino acid sequences that are highly divergent within the superfamily, complicating the identification of group members. To test whether all entries found were related to known SCGBs, a jackHMMER profile was created (an iterated sequence profile search, seeded with human SCGB1A1), which confirmed group membership for all human and mouse SCGBs [35] with an expected value of less than 0.001. HomoloGene [36], a software program that analyses groups of homologous proteins across multiple species, currently recognises 21 SCGB clades (Figure 2). The SCGB genes encode proteins that all have a similar structure [18]. Despite high amino acid sequence divergence, many structural features (such as helical bundles and the ability to dimerise) are retained [11, 18]. This is consistent with a highly flexible and rapidly evolving gene superfamily and is likely to have aided in the evolution of the diverse functions of the superfamily.
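The expected-value cut-off described above can be illustrated with a short filter over tabular search output. This is a sketch under stated assumptions rather than the script used here: it assumes the search was run with HMMER's `--tblout` option (for example `jackhmmer --tblout hits.tbl -N 5 query.fasta targets.fasta`, where the file names are placeholders), that the full-sequence E-value is the fifth whitespace-separated column of that table, and the function name and the table contents are invented for demonstration.

```python
from typing import List, Tuple

# Invented example rows in the layout assumed for a --tblout file.
TOY_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias  (further columns)
toy_seq_1  -  toy_query  -  1.2e-30  105.3  0.1  2.0e-30  104.6  0.1  1.0  1  0  0  1  1  1  1  made-up example
toy_seq_2  -  toy_query  -  4.0e-04   18.9  0.0  5.1e-04   18.6  0.0  1.1  1  0  0  1  1  1  1  made-up example
toy_seq_3  -  toy_query  -  0.35       9.1  0.0  0.50       8.7  0.0  1.0  1  0  0  1  1  1  1  made-up example
"""


def significant_hits(tblout_text: str, max_evalue: float = 1e-3) -> List[Tuple[str, float]]:
    """Return (target, E-value) pairs with E-value strictly below `max_evalue`,
    sorted from most to least significant. Comment lines start with '#'."""
    hits: List[Tuple[str, float]] = []
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target = fields[0]          # column 1: target sequence name
        evalue = float(fields[4])   # column 5: full-sequence E-value (assumed layout)
        if evalue < max_evalue:
            hits.append((target, evalue))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for name, evalue in significant_hits(TOY_TBLOUT):
        print(name, evalue)
```

The comparison is strict so that it matches the 'less than 0.001' criterion in the text; candidate sequences absent from the returned list would not be counted as confirmed family members.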

Figure 2

Phylogenetic tree of mouse androgen-binding proteins. Protein sequences were aligned using T-COFFEE [14] and analysed using nearest-neighbour-joining methods, with 10,000 bootstrap replicates, in the PHYLIP package [19].

When SCGBs were named in 2000, six human SCGBs were described and divided into five groups, based on proposed evolutionary relationships [18]. Currently, there are 11 described human SCGB genes. Figure 1 shows mouse and human proteins on a phylogenetic tree for this family. In the case of the ABP proteins, we have used SCGB1B27 (ABPA27) and SCGB2B27 (ABPBG27) to represent the mouse SCGB1B and SCGB2B groups, respectively. Table 1 lists chromosomal locations of human SCGBs, and only those mouse genes that share orthology. Four human SCGBs have direct mouse orthologues; the ABP subfamily includes three human SCGB1Bs versus 30 mouse Scgb1bs and three human SCGB2Bs versus 34 mouse Scgb2bs. The human genome contains the SCGB1D and SCGB2A subfamilies, both of which are absent in the mouse.

Table 1 Comparison of human secretoglobin genes (SCGBs) with only those mouse Scgbs that share orthology

The ABP (Scgb1b/Scgb2b) family contains genes for two different types of subunit, ABPA (SCGB1B) and ABPBG (SCGB2B) [37], located adjacent to each other on mouse chromosome 7 (Table 2). This 'recent, phylogenetically independent proliferation of close paralogs, or lineage specific gene family expansion' is an example of an 'evolutionary bloom' [37]. Such blooms have been studied most notably in the large and diverse cytochrome P450 family [38]. It has been suggested that these evolutionary blooms might simply represent a stochastic process [37].

Table 2 The mouse androgen-binding protein (ABP) family, complete with the newly proposed Scgb nomenclature

The genes that encode any Scgb1b/Scgb2b pair tend to be next to each other on the chromosome and orientated in a 'head-to-head' (3'-5'|5'-3') fashion. These structures have been called 'modules' [39]. It appears that there was a single Scgb1b-Scgb2b module, which has expanded dramatically in some species (64 genes in mouse, 43 in rabbit). In other lineages, such as the primates, it has resulted largely in pseudogenes, and in species such as the shrew and elephant it has been lost altogether [13]. Interestingly, in humans there are three such modules. Although at least two modules have become pseudogenes, it remains possible that the SCGB1B2-SCGB2B2 module might be active, based on EST data. The mouse shows the most extensive expansion, which began in the ancestor of the genus Mus [13] after divergence from rat, less than 17 million years ago, and apparently has involved two different modes of duplication [39].
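The 'module' arrangement can be made concrete with a small sketch that scans an ordered gene list for adjacent Scgb1b/Scgb2b pairs in head-to-head orientation. The interpretation used here is an assumption: 'head-to-head' is read as the left-hand gene lying on the minus strand and the right-hand gene on the plus strand, so that their 5' ends face each other. The `Gene` record, the function name, and all gene names and coordinates are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    subunit: str  # "1b" for Scgb1b-type genes, "2b" for Scgb2b-type genes
    start: int    # leftmost coordinate on the chromosome
    end: int      # rightmost coordinate on the chromosome
    strand: str   # "+" or "-"


def head_to_head_modules(genes: List[Gene]) -> List[Tuple[str, str]]:
    """Return names of neighbouring gene pairs that form a candidate module:
    one 1b-type and one 2b-type gene, the left on '-' and the right on '+'."""
    ordered = sorted(genes, key=lambda g: g.start)
    modules: List[Tuple[str, str]] = []
    for left, right in zip(ordered, ordered[1:]):
        divergent = left.strand == "-" and right.strand == "+"
        one_of_each = {left.subunit, right.subunit} == {"1b", "2b"}
        if divergent and one_of_each:
            modules.append((left.name, right.name))
    return modules


# Invented coordinates, for illustration only.
TOY_GENES = [
    Gene("toy1b_1", "1b", 100, 200, "-"),
    Gene("toy2b_1", "2b", 300, 400, "+"),
    Gene("toy1b_2", "1b", 1000, 1100, "-"),
    Gene("toy2b_2", "2b", 1200, 1300, "+"),
    Gene("toy1b_3", "1b", 2000, 2100, "+"),  # unpaired gene, not part of a module
]

if __name__ == "__main__":
    print(head_to_head_modules(TOY_GENES))
```

Only immediately adjacent genes are compared, so a pseudogene or unrelated gene lying between two partners would break the pair; whether that is the desired behaviour depends on how a module is defined in a given annotation.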

Association of SCGBs with disease

SCGBs have been linked to multiple disease states, either as participants or as biomarkers. SCGB1A1 may serve as an early biomarker for lung injury, owing to the regenerative role of cells that secrete SCGB1A1 [20, 40]. In addition, SCGB1A1 may act as a tumour suppressor [41] and has been shown to be up- or downregulated in various human lung cancers [42]. Proteomic analysis of lacrimal gland fluid has revealed that patients with dry eyes have a decrease in SCGB1D1, SCGB1D2 and SCGB2A1 expression; the condition 'dry eyes' may be caused by post-translational modifications [43]. In addition, SCGB1D2 has been reported to be upregulated in breast cancer, making it a potential marker for this type of malignancy [44]. In this context, panels of autoantibodies to tumour-associated antigens in breast cancer include SCGB1D2, which, when combined with others, may have diagnostic potential. SCGB1D2 is also downregulated in pituitary adenomas [45].

SCGB2A1 has been shown to be a prognostic marker in epithelial ovarian cancer [46, 47] and endometrial cancer [48]. Because SCGB2A2 expression is highly specific to breast epithelial tissue, it has been proposed as a marker for detecting breast cancer metastases to sentinel lymph nodes and distant tissues [49-51]. SCGB2A1 overexpression has also been evaluated as a marker for breast cancer, with mixed conclusions [28, 52-55].

SCGB3A1 has been shown to be differentially expressed in smokers with lung cancer [56]. Its decreased expression has been correlated with increased tumour burden in non-small-cell lung cancer [31]. A SCGB3A2 polymorphism has been associated with increased asthma risk in a Japanese population [57, 58]. In chronic rhinosinusitis, SCGB3A2 levels in sino-nasal tissue are inversely correlated with the total number of infiltrating inflammatory cells, as well as scores of symptom severity [4].

Conclusions

The SCGBs represent an intriguing family of biologically active proteins. The relatively recent revelations of anti-inflammatory and immunomodulatory functions, together with their potential as cancer biomarkers, underscore their physiological and pathophysiological importance. However, a great deal more needs to be elucidated regarding the actions of individual SCGBs. Further studies directed at characterising the individual SCGBs are necessary, the results of which are likely to yield valuable targets for therapeutic intervention.

One of the most intriguing characteristics of the mammalian 'Abp' genes, the Scgb1b/Scgb2b subset of the SCGB gene family, is their evolutionarily independent expansions (so-called 'evolutionary blooms') in a number of mammalian lineages. Discovery of the reason for these blooms may lead to a better understanding of how these SCGBs function in different mammals.