Introduction

Lipocalins (LCNs) are members of a family that includes a diverse group of low-molecular-weight (18–40 kDa) proteins. The larger members of this family undergo cleavage to form the ultimate LCN protein. Comprising usually 150–180 amino-acid residues, these proteins belong to the calycin superfamily and are widely dispersed throughout all kingdoms of life [1]. LCNs are evolutionarily conserved and share an eight-stranded antiparallel β-sheet structure; this forms a “barrel” which is the internal ligand-binding site that interacts with and transports small hydrophobic molecules—such as steroid hormones, odorants (e.g., pheromones), retinoids, and lipids [2, 3].

There are three main structurally conserved regions (SCR1, SCR2, SCR3) that are shared in the lipocalin fold; these represent a moiety composed of three loops that are close to each other in the three-dimensional structure of the β-strands that make up the barrel [4,5,6,7]. Based on the SCRs, two separate groups have been proposed: the kernel LCNs and the outlier LCNs [3]. The kernel LCNs represent a core set of proteins sharing the three characteristic motifs, while the outlier LCNs, which are more divergent family members, typically share only one or two motifs [4]. Based on this categorization, retinol-binding protein 4 (RBP4), α1-microglobulin (A1M), apolipoprotein D (APOD), complement C8 gamma chain (C8G), prostaglandin D2 synthase (PTGDS), and the major urinary proteins (MUPs) have all been classified as kernel lipocalins, while odorant-binding proteins (OBP2A, OBP2B) and von Ebner’s gland protein (LCN1) are included in the outlier category [4, 7].

Depending on the structure of the individual LCN, the binding-site pocket can accommodate molecules of various sizes and shapes—thus contributing to the diversity of functions within this protein family [5]. Lipocalin crystal structures confirm the highly conserved eight continuously-hydrogen-bonded antiparallel β-strand domains creating the barrel.

The fatty acid-binding protein (FABP) gene family is considered a related, but distinct, subfamily of the calycin superfamily [8] and will not be discussed further here. Another subset of the lipocalins worthy of mention is the immunocalin subfamily. These include α1-acid glycoprotein, α1-microglobulin/bikunin precursor, and glycodelin, each of which exerts significant immunomodulatory effects in cell culture [9, 10]; interestingly, all three are encoded by genes in the human Chr 9q32-34 region, together with at least four other lipocalins (neutrophil gelatinase-associated lipocalin, complement factor γ-subunit, tear prealbumin, and prostaglandin D synthase), which also might exert anti-inflammatory and/or antimicrobial activity [11].

Lipocalin family in humans

Among bacteria, plants, fungi, and animals, more than 1000 LCN genes have been identified to date. Nineteen LCN genes, encoding functional LCN proteins, exist in the human genome (Table 1). The dendrogram in Fig. 1 shows the evolutionary relatedness of these human LCN proteins.

Table 1 List of all human LCN and mouse Lcn genes, with official gene symbols, full protein name, aliases, chromosomal locations, isoforms, National Center for Biotechnology Information (NCBI) RefSeq mRNA accession numbers, NCBI RefSeq protein accession numbers, and total number of amino acids (# of AAs) [information retrieved and confirmed from https://www.genenames.org/]
Fig. 1

Dendrogram of lipocalins (LCNs) in the human genome. Although the names listed are the official human gene symbols [https://www.genenames.org/], this dendrogram is based on the alignment of proteins (listed in Table 1), using multiple sequence alignment by CLUSTALW (http://www.genome.jp/tools/clustalw/)

In humans, LCNs are located in blood plasma and other body fluids such as tears and genital secretions, in which they serve as carriers for a variety of small molecules [5]. LCNs also can play important roles in disease such as diabetic retinopathy [12], and, as a result, they are extensively used clinically as biochemical markers. For example, A1M (α1-microglobulin/bikunin precursor, encoded by the AMBP gene) is a biomarker of proteinuria and indicator of declining renal function [13].

Lipocalin-1 (LCN1; human tear pre-albumin, or von Ebner’s gland protein) is one of four major proteins in human tears, acting as a lipid sponge on the ocular surface [14, 15]. LCN1 is produced by lacrimal glands and secreted into tear fluid. Decreased LCN1 levels are associated with Sjögren’s disease, LASIK-induced dry-eye disease [16], and diabetic retinopathy [12].

Lipocalin-2 (LCN2; also known as neutrophil gelatinase-associated lipocalin) mediates various inflammatory processes by suppressing macrophage interleukin-10 (IL10) production [17, 18]. Several studies have shown that LCN2 gene expression in adipose tissue is elevated in insulin-resistant states [19, 20]. LCN2 is also involved in kidney development and used as a biomarker for acute and chronic renal injury [21].

Odorant-binding proteins 2A and 2B (encoded by the OBP2A and OBP2B genes) are members of the LCN family. OBP2A is highly expressed in the oral sphere (e.g., nasal mucus, salivary, and lacrimal glands), whereas OBP2B is expressed in endocrine organs (e.g., mammary gland and prostate) [22]. Functioning as soluble-carrier proteins, OBP2A and OBP2B can bind reversibly to odorants [23].

The AMBP gene encodes the α1-microglobulin/bikunin precursor protein; α1-microglobulin (A1M) is the lipocalin derived from proteolytic cleavage of AMBP [24]. A1M is secreted into plasma, where it can exist free or bound to immunoglobulin A or albumin. Although the molecular weight of A1M is 27.0 kDa, it is freely filtered through the glomerulus and reabsorbed by proximal tubular cells [25]; for this reason, A1M is a biomarker of proteinuria, i.e., increased levels in urine indicate a defect in proximal tubules. A1M is considered to be a major marker for progressive impairment of renal function, as well as for early diagnosis of acute allograft rejection [13, 24, 26]. Recent studies have shown A1M to be expressed in rat retinal explants and to have oxygen radical-scavenging and reductase properties; these findings suggest that A1M might protect against oxidative stress and possibly be involved in the response to retinal detachment [27, 28].

Other members of the LCN family include apolipoproteins D (APOD) and M (APOM)—which interestingly exhibit structural similarities to LCNs rather than to other apolipoproteins. APOD is an atypical apolipoprotein, because it is highly expressed in mammalian tissues such as liver, kidney, and central nervous system. APOD is a component of HDL cholesterol. Recent studies have shown that abnormal APOD expression is associated with altered lipid metabolism; three distinct missense mutations (Phe36Val, Tyr108Cys, and Thr158Lys) in African populations link APOD with metabolic syndrome [29]. A recent study showed that APOM, which resides in the plasma HDL fraction, acts as a chaperone for sphingosine-1-phosphate (S1P) and facilitates interaction between S1P and plasma HDL, thereby exhibiting a vasculoprotective effect [30].

The protein encoded by the complement C8 gamma chain gene (C8G) is one of the three subunits present in complement component 8 (C8). C8 is an oligomeric protein composed of three non-identical subunits (α, 64 kDa; β, 64 kDa; γ, 22 kDa); the gamma chain is the only one that belongs to the lipocalin family [31]. C8 is part of the membrane-attack complex (MAC), in which it associates irreversibly with the complement proteins C5b, C6, C7, and C9 to form a cytolytic complex that inserts into, and directly lyses, microbes [32]. Activation of complement triggers the assembly of MAC, which is then deployed to kill a wide range of Gram-negative bacteria [33]. Two functionally distinct C8-deficiency states have been identified: the first reflects a lack of the alpha and gamma chains and has been reported in Afro-Caribbean, Hispanic, and Japanese populations; the second results from lack of the beta chain and is found mainly in Caucasians [34, 35]. Deficiency of C8 complement is a very rare primary immunodeficiency associated with invasive and recurrent infections by Neisseria meningitidis [32, 36, 37].

Orosomucoids (ORM1 and ORM2), α1-acid glycoproteins (trivial name AGPs), belong to the subfamily of immunocalins. ORM1 is an acute phase protein secreted by hepatocytes in response to inflammation, with its expression being regulated by pro-inflammatory cytokines such as IL1 and IL6, the chemokine IL8, and glucocorticoids [38]. ORM1 and ORM2 are polymorphic proteins—commonly referred to as ORM/AGP with four variants in humans: AGP F1; AGP F2; AGP S, encoded by the ORM1 gene; and AGP A, encoded by the ORM2 gene [39]. AGPs are important members of the lipocalin family, because their capacity to bind to basic drugs can affect plasma free drug concentrations, playing a key role in a drug’s volume of distribution, metabolism, and therapeutic effect [40]. The ORM1 and ORM2 proteins have been recently identified as predictive urinary biomarkers for rheumatoid arthritis [41]. In addition, they are predictive markers for systemic lupus [42] and chronic inflammation [43].

The progestagen-associated endometrial protein (PAEP) is a secreted immunosuppressive glycoprotein (28 kDa), also termed glycodelin, i.e., one of the immunocalins. Studies have shown that PAEP downregulation can lead to abortion during the first trimester—due to increased activation of the immune system [44, 45]. In addition, PAEP has been found expressed in many tumors (e.g., gynecological malignancies, lung cancer, and melanoma) [46,47,48].

The protein encoded by the prostaglandin D2 synthase (PTGDS) gene is a glutathione-independent prostaglandin synthase (PTGDS). PTGDS is involved in the arachidonic acid cascade, converting prostaglandin H2 to prostaglandin D2 (PGD2), and is preferentially expressed in brain [49]. Increased PTGDS expression has been shown in patients having attention deficit hyperactivity disorder, compared with patients having bipolar disorder [50]. Another study suggests that dysregulated PTGDS mRNA expression is associated with rapid-cycling bipolar depression [49]. Enhanced PTGDS expression has also been associated with various malignancies [51,52,53,54,55].

Plasma retinol-binding protein 4 (RBP4) is a 21-kDa transporter of all-trans-retinol and belongs to the lipocalin family [56, 57]. RBP4 circulates in plasma as a moderately tight 1:1 molar complex with vitamin A. RBP4 is secreted mainly by hepatocytes and also by adipose tissue [58]. In humans, increased circulating RBP4 levels have been correlated with obesity [59], insulin resistance, and type-2 diabetes [60, 61]. Insulin resistance has long been considered to play a key role in the development of non-alcoholic fatty liver disease (NAFLD) [62], which is associated with altered RBP4 levels. Information in the literature on this association, however, is controversial. Several studies have reported significantly increased RBP4 levels in patients with NAFLD [63,64,65,66], whereas other studies have shown no difference in RBP4 levels between control and NAFLD groups [67, 68].

There is limited information in the literature regarding human LCN6, LCN8, LCN9, LCN10, LCN12, or LCN15.

Lipocalin family in mice

Lipocalins have been extensively studied in the mouse. Forty-five proteins belong to this family in mice (Table 1), including the major urinary proteins (MUPs) (Fig. 2). Most LCN genes are found in both humans and mice (Table 1); the exceptions are LCN1, which is found only in human, and Lcn3, Lcn4, Lcn5, Lcn11, Lcn16, Lcn17, and all the functional Mup genes, which are found in mouse but not human.

Fig. 2

Dendrogram of mouse Lcn and Mup proteins. Although the names listed are the official mouse gene symbols [http://www.informatics.jax.org/], this dendrogram is based on the alignment of proteins (listed in Tables 1 and 2), using multiple sequence alignment by CLUSTALW (http://www.genome.jp/tools/clustalw/)

MUPs are intriguing small proteins (19–21 kDa) found in mouse urine; in rats, MUPs are known as α2u-globulins [69,70,71,72]. (For the purpose of this review, MUPs will refer to major urinary proteins in both mouse and rat.) While the presence of proteinuria is considered in humans to be a pathological renal condition, this is not the case for mice or rats [70]. Under physiological conditions, rodents excrete substantial levels of protein in urine, with MUPs accounting for > 90% of total protein content [70, 73, 74], playing a key role in chemo-signaling between animals to coordinate social behavior [75]. MUPs represent highly homologous proteoforms that control the release of volatile pheromones for urinary scent marks by transporting them into the vomeronasal organ (VNO) [76, 77]. MUPs bind pheromones within the hydrophobic calyx of the protein structure where hydrophobic binding sites exist for small lipophilic ligands. The affinity of each MUP for specific ligands varies, according to its subtype [70, 78], and depends on the amino-acid sequence in the binding domain [79, 80]. MUP affinity is most affected by polymorphisms that influence amino acids on the luminal surface of the ligand-binding domain (pocket)—rather than on the protein surface where most sequence differences are observed [73]. In addition, MUPs may act as direct stimulants of pheromone receptors [81].

MUPs are primarily synthesized in post-pubescent mouse liver in response to various hormones such as testosterone, growth hormone, thyroxine, insulin, and glucocorticoids [82, 83]. MUP synthesis is sex-dependent—resulting in (three- to fourfold) higher protein concentrations in post-pubescent males than female mice [84].

Because their expression is stimulated by androgens, MUP synthesis is sex-dependent, with higher (three- to fourfold) protein levels occurring in adult males than in females or immature males [70, 85]. For example, in C57BL/6 mice, MUPs represent 3.5–4% of total protein synthesized in male liver, but only 0.6–0.9% in female liver [86]. Mup mRNA is also expressed in a number of secretory tissues, such as nasal tissue and the mammary, salivary, submaxillary, and lacrimal glands [87, 88], as well as skeletal muscle, kidney, brain, spleen, heart, epididymal adipocytes, and brown adipose tissue [89,90,91]. MUP synthesis is initiated in response to different hormonal signals during various developmental stages; for example, liver synthesis of MUPs begins at onset of puberty and continues through adulthood [92], whereas MUP synthesis in lacrimal gland starts 1 to 2 weeks before onset of puberty and continues into adulthood [83]. In addition, the specific Mup mRNA subtype produced varies from tissue to tissue [93] (Table 2).

Table 2 Mouse tissues known to express Mup mRNA [93]

The MUP gene cluster in mouse and human genomes

Interestingly, the mouse Mup gene cluster (22 protein-coding genes; Table 3) can be divided into two subgroups. The first group (Mup3, Mup4, Mup5, Mup6, Mup20, and Mup21) is slightly older (Fig. 2) and contains a more divergent class of genes. The second group comprises the remaining 16 Mup genes, which share almost 99% sequence identity [75, 94]. The predicted gene, previously designated Gm21320 (“gene model 21320”), has now been renamed Mup22, cf. [http://www.informatics.jax.org/].

Table 3 List of all mouse Mup genes [http://www.informatics.jax.org/], with official gene symbols, aliases, chromosomal locations, isoforms, National Center for Biotechnology Information (NCBI) RefSeq mRNA accession numbers, NCBI RefSeq protein accession numbers, and total number of amino acids (# of AAs) [information retrieved and confirmed from https://www.ncbi.nlm.nih.gov/genome]

As members of the LCN family, MUPs exhibit conservation in the common three-dimensional structure of the protein family, i.e., a central area pocket formed by eight hydrophobic β-strand domains that form a barrel (Fig. 3) [81, 95]. This structure enables the MUPs to serve as carrier proteins for small lipophilic molecules such as pheromones and other chemical signals [78, 81]. All 22 mouse Mup protein-coding genes are located in a cluster (the Mup locus) on Chr 4 (Fig. 4 a, b) [96]. There are also 29 Mup-ps pseudogenes in the Chr 4 Mup cluster (intriguingly, the one remaining pseudogene, Mup-ps22, is located on Chr 11).

Fig. 3

Structure of prototypical mouse urinary protein. The crystal structure consists of eight β-strands, forming a calyx-shaped barrel (red); this encloses an internal ligand-binding site. There are also an α-helix (green) and four 3₁₀-helices (blue); the hydrophobic pocket is located inside the barrel. AB, BC, CD, DE, EF, FG, GH, HI, and βI denote the amino-acid segments between the β-strands (diagram taken from Ref. [95])

Fig. 4

Chromosomal location of mouse Mup genes and pseudogenes. a The Mup cluster region, located at 60,498,012 bp to 60,501,960 bp (red vertical rectangle). Taken from the Ensembl genome browser. b The Chr 4 region (in greater detail), showing ten of the 22 Mup genes (Gm21320 is Mup22) in the Mup cluster and 12 of the 29 Mup-ps pseudogenes

An “evolutionary bloom” is defined as a recent, phylogenetically independent proliferation of close paralogs or lineage-specific gene family expansion [97]. Examples of this phenomenon have been extensively studied in the large and diverse cytochrome P450 superfamily [97]. For example, the koala’s ability to detoxify eucalyptus leaves appears to be due to an evolutionary bloom within a cytochrome P450 gene group; the koala’s CYP2C subfamily was found to comprise 31 putative protein-coding functional enzymes, compared to 15 Cyp2c genes in mouse and just four CYP2C genes in human [98]. Another example is the mouse Scgb gene superfamily, which includes a number of encoded androgen-binding proteins involved in mate selection [99]; this is fascinating, because the Mup cluster (described herein) also encodes proteins involved in mate selection. It has been suggested that these evolutionary blooms might represent simply a stochastic process [97]. However, it is more likely that these blooms result from environmental pressures on the organism to survive (i.e., find food, avoid predators, and reproduce) at a particular moment in evolutionary time.

Mup gene polymorphisms in rat and mouse have shown significant differences. Yet, such differences have not been seen in genomes of other mammalian species for which whole-genome sequences have been explored [75]. Although MUP homologs in rat and mouse share only ~65% of their amino-acid sequence, there is a characteristic six amino-acid consensus sequence (Glu-Glu-Ala-Ser-Ser-Thr) that remains highly conserved between these two species [100]. In general, species differences in MUP proteins appear to be mainly due to glycosylated MUP amino-acid residues that occur in rats, but not mice [100]. Most other mammalian species (e.g., dog, baboon, gorilla, and chimpanzee) have only one functional protein-coding MUP gene, except for the horse, which has three functional MUP genes [75].
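A short conserved motif such as this hexapeptide can be located in a protein sequence with a simple string scan. The sketch below is illustrative only: the three toy sequences are invented for the example and are not real MUP sequences, and the helper names (motif_to_one_letter, find_motif) are hypothetical.

```python
# Three-letter to one-letter amino-acid codes for the residues in the motif.
THREE_TO_ONE = {"Glu": "E", "Ala": "A", "Ser": "S", "Thr": "T"}


def motif_to_one_letter(motif):
    """Convert a hyphen-separated three-letter motif to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in motif.split("-"))


def find_motif(sequence, motif):
    """Return the 1-based start position of every match of motif in sequence."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = sequence.find(motif, start + 1)
    return positions


CONSENSUS = motif_to_one_letter("Glu-Glu-Ala-Ser-Ser-Thr")

# Invented toy sequences (assumption: NOT taken from any database entry).
toy_sequences = {
    "toy_a": "MKAAHAEEASSTGRNF",
    "toy_b": "MKLLAAGHAEEASSTRGN",
    "toy_c": "MKAAQEEAQSTGKNF",
}

hits = {name: find_motif(seq, CONSENSUS) for name, seq in toy_sequences.items()}
for name, positions in hits.items():
    print(name, positions if positions else "motif not found")
```

An exact-match scan is the simplest choice; tolerating substitutions would require a position-weight matrix or a regular expression, which is beyond this sketch.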

The human MUP-related gene is a pseudogene (MUPP, located at Chr 9q32). Using the UCSC genome browser [https://genome.ucsc.edu/], one can visualize that human Chr 9q32 is syntenic to mouse Chr 4 at 60,498,012 bp to 60,501,960 bp, where the Mup cluster of 22 Mup genes is located; in fact, the human ZFP37 and mouse Zfp37 genes flank the “MUP region” in human and mouse, respectively. The human MUPP locus exhibits a high degree of sequence similarity to mouse Mup functional genes but contains coding-sequence disruptions that prevent the gene product from being formed [101]. The human MUPP shows a G > A transition (relative to the chimpanzee MUP sequence) that disrupts a splice-donor site [75]; this is interesting because this G > A mutation has not been observed in mammals other than humans [101]. The human MUPP pseudogene sequence is most similar to the mouse Mup-ps4 pseudogene [75].

One of the main functions of MUP proteins is to promote aggressive behavior through binding to vomeronasal pheromone receptors (V2Rs) in the accessory olfactory neural pathway. Although MUPs and V2Rs have co-expanded in mouse, rat, and opossum, all human V2Rs have become inactive, possibly leading to the pseudogenization of the single human MUP gene [102, 103]. In other words, the absence of the specific V2R removed the selection pressure for a functional MUP ligand.

Parallel expansions of Mup clusters

The last common ancestor of rat and mouse had either a single Mup gene or a small number of Mup genes [75]. To determine the extent of Mup gene expansions across non-rodent lineages, Logan and colleagues identified orthologs of the Slc46a2 and Zfp37 genes (and the contiguous genomic sequence spanning the interval between these two genes) in nine additional placental mammals [75]. Whereas C57BL/6J mice have a cluster of 22 distinct Mup genes on Chr 4 and rats have nine distinct Mup genes, mammalian species such as dog, pig, baboon, chimpanzee, bush baby, and orangutan each have a single Mup gene (with no evidence of additional pseudogenes). By contrast, the human genome has only the one pseudogene.

A neighbor-joining dendrogram of human LCN and mouse MUP proteins is illustrated in Fig. 5; subfamilies can be distinguished based on evolutionary divergence. Note that all mouse MUPs are clustered into a subgroup near the top of the dendrogram, whereas the human LCNs are split into several different branches—due to the high degree of divergence of LCN proteins. The mouse Mup cluster divergence is most closely associated with human LCN9 and PAEP (Fig. 5). Note that the evolutionarily oldest human LCN genes include ORM1, ORM2, APOM, APOD, RBP4, and LCN8.
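For readers unfamiliar with how a neighbor-joining dendrogram is derived from pairwise distances, the following minimal sketch implements the basic algorithm and returns the tree in Newick format. It is not the CLUSTALW pipeline used for the figures: the five-taxon distance matrix is a toy example (not LCN or MUP data), the function name neighbor_joining is hypothetical, and refinements such as correcting negative branch lengths are omitted.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor joining.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a Newick string; internal nodes are labeled by their subtree.
    """
    nodes = list(names)
    d = dict(dist)  # work on a copy so the caller's matrix is untouched

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other current nodes.
        r = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}

        def q(pair):
            a, b = pair
            return (n - 2) * get(a, b) - r[a] - r[b]

        # Join the pair with the smallest Q value (first pair wins on ties).
        i, j = min(combinations(nodes, 2), key=q)
        dij = get(i, j)
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))  # branch from i to new node
        lj = dij - li                                  # branch from j to new node
        u = f"({i}:{li:g},{j}:{lj:g})"
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = (get(i, k) + get(j, k) - dij) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    a, b = nodes
    return f"({a}:{get(a, b):g},{b});"


# Toy symmetric distance matrix for five taxa (assumption: illustrative only).
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_dist = {
    frozenset((taxa[x], taxa[y])): matrix[x][y]
    for x, y in combinations(range(len(taxa)), 2)
}
tree = neighbor_joining(taxa, toy_dist)
print(tree)
```

In practice, the input distances would come from a multiple sequence alignment of the proteins, and the resulting Newick string can be rendered by any standard tree viewer.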

Fig. 5

Dendrogram of human LCNs and mouse MUPs, combined. Although the names listed are the official human gene symbols and mouse Mup gene symbols, this dendrogram was based on the alignment of proteins (listed in Tables 1 and 3) using multiple sequence alignment by CLUSTALW (http://www.genome.jp/tools/clustalw/). Note that the human LCN9 gene is evolutionarily closest to the mouse Mup cluster in this dendrogram

Functions of MUPs in mice

MUPs and chemical communication

Due to their influence on pheromones, MUPs appear to be involved in regulating transmission of social signals—such as identity, territorial marking, and mate choice [104,105,106]. Most pheromones are small volatile molecules that influence aggression, mating, feeding, and territorial behavior within the same species [103, 107]. Mice use pheromones as cues to regulate social behaviors. Neurons that detect pheromones reside in at least two separate organs within the nasal cavity: the vomeronasal organ (VNO) and the main olfactory epithelium (MOE). Each pheromone molecule is thought to activate a dedicated subset of these sensory neurons—similar to the manner in which odorants are received by dedicated subsets of mammalian olfactory receptors. However, the identity of the responding neurons that regulate specific social behaviors remains largely unknown.

Pheromones have a short half-life, which can be prolonged by binding to the characteristic barrel pocket in the MUP protein. In addition, gradual release of a pheromone from a MUP protein allows the half-life of these airborne odor signals to be extended, e.g., to be used as mammalian scent marks [108, 109].

MUPs are also linked to reproductive success in males [110] and to social behavior, by adjusting the animal’s odor profile in response to different stimuli. This function underscores how social environment plays an important role in MUP production in both male and female mice. For example, MUP synthesis is upregulated in a male mouse housed with a female, but downregulated when a male mouse is housed with other males only [111]. Perhaps related to this, MUPs can be predictive of the onset of aggressive and dispersal behavior among male mice [103].

In addition to serving as pheromone carriers, MUPs can function as pheromones themselves. They facilitate chemical information exchange to convey specific information (e.g., gender, social and reproductive status) between animals [70]. Recent research has revealed that MUPs also act as kairomones, causing a fear reaction in response to predators [112]. For example, the rat kairomone that triggers defensive behavior in mice is encoded by the Mup13 gene [112].

Members of the MUP family are known to be involved in intraspecies interplay, especially male-male aggression in mice [103]. Female mice are attracted to urine-borne male pheromones. MUP20, for example, has been shown to be rewarding and attractive to female mice. MUPs excreted by male mice can also influence reproductive behavior and promote female attraction [103, 113,114,115]. The molecular mechanism promoting spontaneous ovulation involves direct stimulation of VNO nerves by four residues on the NH2-terminus of MUP proteins [116]. MUPs may also be involved in mediating individual recognition and inbreeding avoidance [117, 118].

MUPs and metabolism

MUPs also appear to be involved in energy metabolism—actions reminiscent of the lipocalins that have been implicated clinically in lipid disorders and metabolic syndrome caused by obesity and type-2 diabetes [119, 120]. For example, mouse MUP1 regulates systemic glucose metabolism by modulating the hepatic gluconeogenic and/or lipogenic programs [121,122,123]. Caloric restriction dramatically reduces MUP1 expression in mouse liver [124, 125] and appears to decrease MUP4 and MUP5 expression, as well [124, 126].

Decreased hepatic MUP1 levels have been linked to obesity and type-2 diabetes in mice with either genetic (leptin receptor-deficient db/db) or dietary fat-induced obesity [121, 127]. Similar decreases in MUP1 are found in extrahepatic organs, such as adipose tissue and the hypothalamus, after caloric restriction [128, 129]. Furthermore, MUP1 was also found to lower blood glucose levels by inhibiting expression of phosphoenolpyruvate carboxykinase and glucose-6-phosphatase, two rate-limiting enzymes for gluconeogenesis [127]. These studies suggest that mouse MUP1, and possibly other MUP family members, play key roles in energy metabolism and potentially contribute to the development of metabolic diseases such as type-2 diabetes.

Human LCN-like genes and their mouse orthologs

We have listed the human LCN-related genes and mouse Lcn-like genes in Table 1, and the mouse Mup cluster genes in Table 3. How similar are the human LCN-like proteins to their mouse orthologs? Among the 19 LCN-like genes in the human genome (Table 4), 17 have mouse orthologs, whereas human LCN1 and PAEP do not. Within the LCN cluster, LCN6 and LCN8 exhibit the highest percent similarity: 74% and 70%, respectively. Among all 19 LCN-like genes, RBP4 and APOM reveal the highest percent similarity (86% and 81%, respectively); LCN15 and OBP2A display the lowest percent similarity (39%) to their mouse orthologs.
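As a hedged illustration of how a percent-identity value of this kind can be computed from a pairwise alignment, the sketch below counts matching residues over the aligned, non-gap columns. The denominator convention (gap columns excluded) is an assumption, since several conventions are in use and the one behind Table 4 is not stated; the function name percent_identity and the two short aligned strings are invented for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap (assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment rows, not real human or mouse lipocalin sequences.
toy_human = "MKT-LLVA"
toy_mouse = "MKSALL-A"
identity = percent_identity(toy_human, toy_mouse)
print(f"{identity:.1f}% identity")
```

Here six columns are gap-free and five of them match, so the toy pair scores about 83% identity; counting gap columns in the denominator instead would give a lower value.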

Table 4 List of the 19 LCN-like genes in the human genome and their similarity to their mouse orthologs, expressed as percent identity (%)

Human and mouse ancestors are estimated to have diverged from one another ~ 80 million years ago. Table 4 confirms the relatively rapid rate of evolutionary divergence by the LCN-like genes, which is consistent with their function of requiring evolutionarily quick adaptation to changing environments; this is similar to, e.g., beta-defensin (DEFB) genes, which number almost four dozen in the human genome and encode broad-spectrum antimicrobial cationic peptides [130]. Out of 19 LCN-like genes, the appearance of two novel human genes (LCN1 and PAEP) during the past ~ 80 million years is further evidence of an enhanced evolutionary rate for this gene superfamily.

Rapid rates of evolutionary divergence stand in sharp contrast to, e.g., highly conserved transcription factors, whose human-mouse orthologs are generally > 95% similar in protein sequence. In fact, in assays for complementation of lethal growth defects in yeast, almost half (47%) of the yeast genes could be successfully humanized [131], even though the yeast-human divergence occurred well over one billion years ago.

Conclusions

Lipocalins (LCNs) are members of a family of evolutionarily conserved small proteins that possess a binding pocket. The LCN proteins (18–40 kDa) are encoded by 19 human LCN-related genes and 45 mouse Lcn-related genes. LCN proteins are expressed in numerous tissues and play important roles in physiological processes by transporting molecules in plasma and other body fluids. In humans, LCNs are extensively used clinically as biochemical markers in various diseases—such as diabetic renal disease, systemic lupus erythematosus, and chronic inflammation.

In mice, major urinary proteins (MUPs) are also members of the lipocalin family. The Mup cluster of 22 functional protein-coding Mup genes (plus 29 of 30 Mup-ps pseudogenes) is confined to mouse Chr 4 and represents an “evolutionary bloom,” because only one or a few MUP genes are functional in other mammals. In fact, no functional MUP gene exists in the human genome—although a human MUPP pseudogene located at Chr 9q32 is syntenic to the Mup cluster on Chr 4.

The MUP protein structure contains a conserved “barrel” formed by eight β-strands, with the characteristic central hydrophobic binding pocket. Mouse MUP proteins are expressed mainly in the liver, secreted into the bloodstream, and excreted by the kidney. MUPs are involved in the communication of information in urine-derived scent marks and can also serve as pheromones themselves. Circulating MUPs may also contribute to regulation of nutrient metabolism, possibly by suppressing hepatic gluconeogenesis and lipid metabolism. However, it still remains unclear how MUPs, especially mouse MUP1, regulate energy metabolism and the gluconeogenic pathway. Further studies will be needed to shed light on these mechanisms.