Background

The avian eggshell is a bioceramic formed of calcium carbonate and an organic matrix pervading and enveloping the calcite crystals. Eggshell formation is the last step of egg production, a process most comprehensively studied in the chicken due to its commercial importance [1, 2]. Egg production starts in the ovary with massive yolk accumulation over six to ten days. Most yolk components are produced in the liver, secreted into the blood stream and transported to the ovary, where they are taken up by receptor-mediated transport at the oolemma, the plasma membrane of the egg cell [3]. After ovulation the egg, the bulk of which is yolk covered by a proteinaceous inner vitelline membrane, enters the oviduct to start an approximately 22 h-long journey driven by peristaltic movements of the oviduct wall. In the first section of the oviduct, the infundibulum, the egg is covered by the outer vitelline membrane. Egg white (albumen) production takes place in the next section of the oviduct, the magnum. Eggshell formation starts in the white isthmus with assembly of the eggshell membranes from soluble components secreted by cells lining the oviduct. The first small calcite accretions form on specialized, regularly spaced nucleation sites on the external shell in the red isthmus [4]. Bulk mineralization takes place in the eggshell gland (uterus), where the egg stays for 16-17 h. The final step in eggshell assembly is the deposition of the cuticle, which also covers the openings of eggshell pores traversing the calcified layer. Eggshell calcification is thought to be controlled by matrix proteins secreted by epithelial cells lining isthmus and uterus [5-7]. However, our previous proteomic analyses of the soluble proteome of the chicken calcified eggshell layer [8, 9] indicated that the matrix contains not only proteins produced by the shell gland epithelium but also proteins produced and secreted in other sections of the oviduct.
Apparently leftovers of the assembly processes along the egg production line migrate with the egg and end up in the uterus fluid, where they are eventually incorporated into the growing calcitic layer. Several such proteins, especially the major egg white proteins ovalbumin [10], lysozyme [11] and ovotransferrin [12], were shown by immunohistochemical methods to reside in the intra-crystalline matrix itself and not to be just surface contaminants. This location was also confirmed for osteopontin, a protein previously identified in bone that is also secreted in the eggshell gland in response to the entry of the egg and is therefore an example of a protein occurring in different biomineral systems of one organism [13]. However, although egg white proteins may weakly influence calcium carbonate crystallization in vitro [11, 12, 14, 15], their role, if any, in eggshell mineralization remains unclear. In addition to egg proteins the matrix proteome contained basement membrane components, endoplasmic reticulum residents, Golgi complex proteins, and proteins of other intracellular compartments. These may have reached the oviduct fluid as by-products of secretion or may have been released by damaged cells of the oviduct epithelium. In total, the soluble proteome of the mineralized layer, the thickest eggshell compartment, comprised more than 520 proteins [8, 9]. Further proteomic studies of insoluble proteins of the calcified layer and the cuticle [16] and solubilized cuticle [17] contributed several interesting new proteins to the overall chicken eggshell proteome. This unexpected complexity of the eggshell proteome raised the problem of how to distinguish matrix proteins functioning in matrix assembly and calcification from a background of non-functional proteins. Because birds are not as easily accessible to genetic manipulation as unicellular or some invertebrate species, one has to resort to more practicable methods.

The concept that functional eggshell proteins should be produced in the eggshell gland has stimulated gene expression studies with microarrays comparing differential expression between eggshell gland tissue of juvenile and sexually mature hens [18] and between eggshell gland tissue and tissues of other oviduct sections [19]. Of these studies, the second one may be more relevant to the identification of eggshell proteins, because it focused on the eggshell gland during deposition of the eggshell. Altogether 605 transcripts were highly expressed in uterine tissue as compared to other oviduct sections and 437 corresponded to known proteins present in protein sequence databases. Of these, 52 corresponded to matrix constituents identified by proteomics or protein biochemical methods previously. However, many of these transcripts or corresponding proteins are not likely to play a role in eggshell production, because the list also contains such proteins as tubulin, actin, glyceraldehyde-3-P-dehydrogenase or ezrin. A disadvantage of such transcriptomic studies is that they will not recognize major eggshell proteins such as ovotransferrin, ovalbumin or serum albumin, which are not produced in the eggshell gland but may nevertheless influence eggshell mineralization and eggshell properties in some way [11, 12, 14, 15]. Another possible way to identify functionally important eggshell proteins, and the one we pursue here, is the comparison of eggshell matrices of different avian species.

Mass spectrometry (MS)-based in-depth and high-throughput studies depend on the availability of comprehensive sequence databases created by genome sequencing projects. The first available avian genome sequence was that of chicken [20], followed recently by the zebra finch [21] and turkey [22] genome sequences. For the present study we chose the turkey eggshell because turkey belongs to the same family as chicken (Phasianidae) and has a very similar genome [22, 23]. The present proteomic analysis of the acid-soluble and acid-insoluble proteins of the turkey calcified eggshell layer showed many similarities, but also unexpected differences, between chicken and turkey eggshell proteomes.

Materials and methods

Preparation of proteins and peptides

Turkey eggshells (strains Converter and Big 6) were obtained from Putenzucht Miko GmbH, A-4871 Zipf, Austria. These were fertilized, but not incubated eggs. The empty shells of five eggs of each strain were washed under tap water and then cleaned in 5% EDTA for 60 min at room temperature to facilitate mechanical removal of membranes, cuticle and attached contaminants. The membranes were peeled off and the cuticles were removed by rubbing under flowing de-ionized water. The pieces of calcified eggshell were washed with water, dried and demineralized in 10% acetic acid (1 g of shell/20 ml) overnight in the cold room. The suspension was centrifuged for 1 h at 4°C and 12,000 × g(av) to separate acid-soluble from acid-insoluble matrix. The pellets were washed three times by re-suspension in 10 vol 10% acetic acid, centrifugation at 4°C and 12,000 × g(av) for 30 min, and lyophilized. Supernatants were successively dialyzed against 3 × 10 vol 10% acetic acid and 2 × 10 vol 5% acetic acid at 4-6°C (Spectra/Por 6 dialysis membrane, molecular weight cut-off 2000; Spectrum Europe, Breda, The Netherlands) and lyophilized.

Proteins were separated by SDS-PAGE using pre-cast 4-12% Novex Bis-Tris gels in MES buffer, using reagents and protocols supplied by the manufacturer (Invitrogen, Carlsbad, CA), except that 1% β-mercaptoethanol was used as reducing agent in the sample buffer. The sample was suspended in 20 μl sample buffer/100 μg of matrix, heated to 70°C for 10 min, and centrifuged to remove sample buffer-insoluble material. Three lanes were loaded with 80 μg of organic matrix in each of three separate experiments per strain and fraction. Gels were stained with colloidal Coomassie (Invitrogen) after electrophoresis, and cut into suitable slices for in-gel reduction, carbamidomethylation and digestion with trypsin as described [24]. The molecular weight marker was Novex Sharp pre-stained (Invitrogen). The eluted peptides were cleaned with C18 Stage Tips [25] before mass spectrometric analysis.

LC-MS and data analysis

Peptide mixtures were analyzed by on-line nanoflow liquid chromatography using the EASY-nLC 1000 system (Proxeon Biosystems, Odense, Denmark, now part of Thermo Fisher Scientific) with 20 cm capillary columns of an internal diameter of 75 μm filled with 1.8 μm Reprosil-Pur C18-AQ resin (Dr. Maisch GmbH, Ammerbuch-Entringen, Germany). Peptides were eluted with a linear gradient of 5-30% buffer B (80% acetonitrile in 0.1% formic acid) over 100 min, followed by 30-60% B over 12 min and 80-95% B over 8 min, at a flow rate of 250 nl/min. The eluate was electro-sprayed into an Orbitrap Elite (Thermo Fisher Scientific, Bremen, Germany) using a Proxeon nanoelectrospray ion source. The Orbitrap Elite was operated in an HCD top 10 mode essentially as described [26, 27]. The resolution was 120,000 for full scans and 15,000 for fragments (both specified at m/z 400). Ion target values were 1e6 and 5e4, respectively. Exclusion time was 90 s. Raw files were processed using the Andromeda search engine-based version 1.3.9.3 of MaxQuant (http://www.maxquant.org/) with the second peptide, iBAQ and match between runs (match time window 0.5 min; alignment time window 20 min) options enabled [28-30]. For protein identification the ENSEMBL turkey protein database from release 66, 2012, (http://www.ensembl.org/info/data/ftp/index.html) was downloaded and combined with the reversed sequences and sequences of common contaminants, such as human keratins. Carbamidomethylation was set as fixed modification. Variable modifications were oxidation (M), N-acetyl (protein) and pyro-Glu/Gln (N-term). The initial mass tolerance was 7 ppm for full scans and 20 ppm for MS/MS. Two missed cleavages were allowed and the minimal length required for a peptide was seven amino acids. The peptide and protein false discovery rates (FDR) were set to 0.01.
The maximal posterior error probability (PEP), which is the individual probability of each peptide to be a false hit considering identification score and peptide length, was set to 0.01. Two sequence-unique peptides (sum of sequence-unique peptides in acid-soluble and acid-insoluble fractions of six technical replicates forming one biological replicate) occurring at least three times in total in two different technical replicates were required for high-confidence protein identifications. Identifications with only two sequence-unique peptides were routinely validated using the MaxQuant Expert System software [31] considering the assignment of major peaks, occurrence of uninterrupted y- or b-ion series of at least four consecutive amino acids, preferred cleavages N-terminal to proline bonds, the possible presence of a2/b2 ion pairs and immonium ions, and mass accuracy. The iBAQ (intensity-based absolute quantification) [32] option of MaxQuant was used to calculate, based on the sum of peak intensities, the approximate share of each protein in the total proteome.
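The acceptance rule for high-confidence identifications can be written down as a small filter. The following Python sketch is only one possible reading of the criteria stated above (at least two sequence-unique peptides, at least three identifications in total, in at least two different technical replicates); the function name, the input layout and the example identifiers are hypothetical and are not part of MaxQuant or of the original analysis.

```python
from collections import defaultdict


def accepted_proteins(peptide_hits, min_unique=2, min_occurrences=3, min_replicates=2):
    """Return the set of protein IDs passing the acceptance rule.

    peptide_hits: iterable of (protein_id, peptide_sequence, replicate_id)
    tuples, one tuple per identification of a sequence-unique peptide.
    This input layout is an assumption made for illustration.
    """
    peptides = defaultdict(set)     # distinct sequence-unique peptides per protein
    replicates = defaultdict(set)   # technical replicates in which the protein was seen
    occurrences = defaultdict(int)  # total number of peptide identifications

    for protein, peptide, replicate in peptide_hits:
        peptides[protein].add(peptide)
        replicates[protein].add(replicate)
        occurrences[protein] += 1

    return {
        protein
        for protein in peptides
        if len(peptides[protein]) >= min_unique
        and occurrences[protein] >= min_occurrences
        and len(replicates[protein]) >= min_replicates
    }


# Hypothetical example data: only P1 satisfies all three conditions.
hits = [
    ("P1", "AAAAAAK", "S1"), ("P1", "CCCCCCR", "S1"), ("P1", "AAAAAAK", "I2"),
    ("P2", "DDDDDDK", "S1"), ("P2", "EEEEEEK", "S1"),
    ("P3", "FFFFFFK", "S1"), ("P3", "FFFFFFK", "S2"), ("P3", "FFFFFFK", "S3"),
]
print(accepted_proteins(hits))
```

Identifications resting on exactly two sequence-unique peptides would still need the manual spectrum validation described above; the filter does not replace that step.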

Sequence database searches were performed with FASTA (http://www.ebi.ac.uk/Tools/sss/fasta/) [33] against current releases of Uniprot Knowledgebase (UniProtKB) and International Protein Index (IPI). Other bioinformatics tools used were clustalW2 for sequence alignments (http://www.ebi.ac.uk/Tools/msa/clustalw2/), InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan/) [34] for domain predictions, SignalP 4.1 (http://www.cbs.dtu.dk/services/SignalP/) [35] for signal sequence prediction, and the Venn diagram plotter (http://omics.pnl.gov/software/VennDiagramPlotter.php) for preparing Venn diagrams.

Results and discussion

For this study we used two biological replicates, each consisting of the pooled, washed calcified shell layers of five eggs of either strain Converter (biological replicate A) or strain Big 6 (biological replicate B). Each of these biological replicates was analyzed in six technical replicates, three for the acid-soluble fraction and three for the acid-insoluble fraction. Matrix yields were approximately 8 mg of acid-soluble matrix and 16 mg of acid-insoluble matrix per g of dry shell calcified layer, together constituting approximately 2.5% of the total shell weight. No differences were detected in PAGE patterns of samples from different strains (not shown). Acid-soluble eggshell matrix and acid-insoluble matrix were analyzed separately. For each technical replicate three identical PAGE lanes were cut into 18 slices for in-gel digestion (Figure 1). MS analysis of the eluted peptides produced a total of 216 raw files. Technical replicate groups were analyzed with MaxQuant, first separately, then after combining results of acid-soluble and acid-insoluble fractions of each strain. After grouping obvious fragments of identical proteins and almost identical proteins together, pool A (strain Converter) yielded 555 protein groups and pool B (strain Big 6) 647 protein groups, with an overlap of 505 protein groups (Figure 2A). Almost all of the protein groups identified in only one pool were of low or very low abundance, each protein representing between <0.0001% and <0.01% of the total proteome as judged by their iBAQ values. Furthermore, most of these proteins were also identified in the respective other pool but below statistical acceptance thresholds. Therefore these differences most likely represent experimental variation rather than differences between strains.
The combined proteomes yielded 697 protein groups (Additional file 1: Table S1) that may represent a slightly lower number of proteins because not all fragments of identical proteins distributed over different database entries may have been identified unequivocally. This was more than the sum of different proteins identified in the acid-soluble shell proteome [8, 9], the insoluble shell proteome [16] and the cuticle [17] of the chicken eggshell (Figure 2B), a finding that may also reflect technical progress. The overlap between total chicken and turkey eggshell proteomes was 52% and increased to 85% if only turkey proteins of >0.01% of the total proteome (172 proteins) were considered, and to 94% with proteins of greater than 0.1% abundance (47 proteins). This indicates that most of the differences occurred among the low and very low-abundance proteins. The numerical difference between combined chicken eggshell proteins and turkey eggshell proteins was 151 proteins (protein groups). This increase may be attributable in large part to developments in mass spectrometry instrumentation (FT-ICR [8] versus the much faster Orbitrap Elite, this study). However, the greater number of technical replicates and the analysis of both acid-soluble and acid-insoluble fractions may also have contributed. The complete lists of identified protein groups, including those not accepted for Additional file 1: Table S1, are contained in Additional file 2: Table S2 ProteinGroups pool A, and Additional file 3: Table S3 ProteinGroups pool B. The identified peptides are listed in Additional file 4: Table S4 Peptides pool A and Additional file 5: Table S5 Peptides pool B.
These files also contain additional accession numbers for groups with more than one protein, numbers of total, razor and sequence-unique peptides, their distribution over gel sections, iBAQ intensities, peptide sequences, and other relevant data not only for accepted identifications but also for identifications with only one peptide or two peptides in only one technical replicate.
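The counts reported in this section are mutually consistent, which can be checked with simple arithmetic. The short Python sketch below uses only numbers stated in the text; the variable and function names are chosen here for illustration.

```python
def union_size(n_a, n_b, n_shared):
    """Size of the union of two sets by inclusion-exclusion."""
    return n_a + n_b - n_shared


# Protein groups per pool and their overlap, as reported in the text.
pool_a, pool_b, shared = 555, 647, 505

combined = union_size(pool_a, pool_b, shared)  # 697 protein groups in total
only_a = pool_a - shared                       # 50 groups found only in pool A
only_b = pool_b - shared                       # 142 groups found only in pool B

# Raw files: 2 biological replicates x 6 technical replicates x 18 gel slices.
raw_files = 2 * 6 * 18                         # 216

print(combined, only_a, only_b, raw_files)
```

The union of the two pools (697) equals the number of protein groups reported for the combined proteomes, and the product of replicates and gel slices reproduces the 216 raw files.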

Figure 1

SDS-PAGE of turkey eggshell matrix proteins. Each lane was loaded with 80 μg of matrix. S, acid-soluble matrix; I, acid-insoluble matrix. Molecular markers are shown to the left. Slices for in-gel digestion are indicated to the right.

Figure 2

Comparison of turkey biological replicate proteomes (A) and of chicken and turkey eggshell proteomes (B). The numbers of identified chicken eggshell proteins were compiled from [8, 9, 16, 17] and cover acid-soluble proteins of the calcified layer [8, 9], insoluble proteins of the calcified shell [16], and cuticle proteins [17].

Of the 697 protein groups, 122 were identified only in acid-soluble fractions and 20 only in acid-insoluble fractions (Additional file 1: Table S1). Most of these were identified in shells of only one turkey strain at low abundance (<0.001%) and may reflect experimental variation rather than true solubility preferences. However, 38 proteins were identified in one fraction of both biological replicates and may therefore reflect real differences in distribution among solubility fractions (Additional file 1: Table S1). This may especially be true for proteins identified with more than three sequence-unique peptides or an abundance of >0.001%, such as the antimicrobial peptide NK-lysin (H9H1A3_MELGA), the possibly extracellular ribonuclease G1NAU2_MELGA, the protein similar to Cys-rich secretory protein 3 contained in entry G1NN67_MELGA, or galectin-3 (G1NLL4_MELGA).

Similar to the chicken eggshell proteome, the turkey eggshell proteome contained many proteins occurring in other egg compartments, especially the egg white. These are likely leftovers of egg assembly that migrate together with the egg from the site of their secretion into the oviduct lumen to the eggshell gland [8]. Other proteins may have reached the oviduct fluid as by-products of secretion, shedding of extracellular domains of membrane proteins, or may have been released by damaged epithelial cells. This mixture of proteins is then supplemented with proteins specifically secreted from uterus epithelia and may be inserted into the growing eggshell during mineralization.

When looking for functional eggshell proteins, an obvious choice is to inspect the major proteins, although minor proteins can of course also affect matrix assembly and mineralization, especially if they have catalytic properties. In previous studies we distinguished major proteins from minor ones using the exponentially modified protein abundance index (emPAI), a quantification method relying on spectral counts and relating the number of identified unique parent ions to the number of theoretically possible peptides [36]. Although this method is in principle well suited for the purpose of identifying major proteins, the results are not particularly intuitive and the division into abundance groups as practiced before is somewhat arbitrary [8]. In the present study we therefore used the more refined iBAQ procedure [32] as implemented in recent MaxQuant versions. This method is based on peak intensities of identified peptides, and these can be normalized to the sum of all intensities, yielding the percentage of each component in relation to the total proteome. This showed that 47 major proteins, or protein groups, each accounting for >0.1% of the total, constituted approximately 95% of the total identified turkey eggshell proteome. Of these 47 proteins we previously identified 44 (94%) in the chicken eggshell proteome and 24 were among the 50 most abundant of the chicken matrix proteins (Table 1). Furthermore, the messages of 11 of them were previously found to be up-regulated in epithelia of egg-containing uterus in different transcriptomic studies (Table 1). However, there were also some unexpected differences as detailed below.
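The normalization step described above, in which each intensity is expressed as a percentage of the summed intensities, can be sketched in a few lines of Python. The sketch assumes that per-protein iBAQ values have already been computed by the search software; the function names and the example values are hypothetical and do not reproduce data from this study.

```python
def ibaq_shares(ibaq):
    """Convert per-protein iBAQ values to percent of the summed iBAQ."""
    total = sum(ibaq.values())
    if total <= 0:
        raise ValueError("summed iBAQ intensity must be positive")
    return {name: 100.0 * value / total for name, value in ibaq.items()}


def major_proteins(shares, threshold_percent=0.1):
    """Names of proteins above the abundance threshold, most abundant first."""
    selected = [name for name, share in shares.items() if share > threshold_percent]
    return sorted(selected, key=lambda name: shares[name], reverse=True)


# Hypothetical iBAQ values for four proteins (arbitrary intensity units).
example = {
    "protein_A": 6.0e9,
    "protein_B": 3.0e9,
    "protein_C": 9.995e8,
    "protein_D": 5.0e5,
}

shares = ibaq_shares(example)
majors = major_proteins(shares)
major_fraction = sum(shares[name] for name in majors)
print(majors, round(major_fraction, 3))
```

With a threshold of 0.1%, the same two steps applied to the full protein list would give the set of major proteins and their combined share of the identified proteome.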

Table 1 The most abundant proteins of turkey eggshell matrix

Ovocleidins and ovocalyxins (“eggshell-specific” proteins)

By far the most abundant protein in the identified turkey eggshell proteome was ovocleidin-116 (OC-116) (Table 1). As frequently observed for major proteins, OC-116 was identified with hundreds of peptides all over the PAGE gradient (Additional file 2: Table S2 and Additional file 3: Table S3). However, we found the highest concentration in slice 5 (Figure 1) corresponding to a Mr of 80 to 110 kDa. The presence of OC-116 fragments, observed in chicken matrix preparations [37], could also have contributed to the PAGE pattern and the resulting peptide distribution. Some representative spectra identifying and distinguishing the turkey protein are shown in Figure 3. OC-116 was first detected in chicken eggshell as the core protein of a proteoglycan with a molecular mass of approximately 120kDa [38] and was subsequently characterized by molecular cloning and sequencing [37, 39]. A substantial fraction of it occurs in the eggshell matrix without attached glycosaminoglycan chains, but in an N-glycosylated form [40]. In the chicken eggshell matrix we previously showed that OC-116 was one of the most abundant proteins of the acid-soluble proteome [8] and phosphoproteome [9]. OC-116 was first described as an eggshell-specific protein [39] but was subsequently also identified in egg yolk [41], the vitelline membrane [42] and egg white [43]. Interestingly, however, it was also identified in chicken bone [44] and is expressed in chicken osteoblasts and osteocytes during bone development and mineralization [45], indicating some similarities between bone and eggshell mineralization. The supposed mammalian homolog of OC-116, matrix extracellular phosphoglycoprotein (MEPE) [45], was shown to regulate bone formation, for instance by inhibiting growth plate cartilage mineralization [46]. In comparison to chicken OC-116 the turkey protein sequence (ENSMGAP00000007641/G1N6E1_MELGA) lacked approximately 250 amino acids (aa) at the C-terminus. 
Sequence identity in the remaining overlap of approximately 500 aa was 80%. Consequently only four of the 40 sequence-unique peptides identified matched both the chicken and the turkey sequence. Database searching after we had added the chicken OC-116 sequence to the turkey sequence database produced a single peptide, 671QVEQVRHADRLR682, matching the C-terminus (Figure 4). Compared to other OC-116 peptides this peptide was rarely identified and occurred in only very few technical replicates. The distribution of OC-116 peptides over the PAGE slices would rather indicate a turkey OC-116 of a length similar to that of the chicken protein.
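A percent identity value of this kind is computed over the aligned region of two sequences. The Python sketch below shows one common convention, counting identical residues over all alignment columns in which neither sequence has a gap; other conventions use a different denominator, and the text does not state which one was applied. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over gap-free columns of a pairwise alignment.

    Both inputs must be rows of the same alignment and therefore equally long.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy example: 8 identical residues in 10 compared columns.
print(percent_identity("ACDEFGHIKL", "ACDQFGHIRL"))
```

Because a low identity changes many tryptic peptide sequences, peptides identified against one species' sequence will often fail to match the ortholog, which is consistent with the small number of shared peptides noted above.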

Figure 3

Representative spectra of turkey ovocleidin-116 (ENSMGAP00000007641/G1N6E1) peptides. Top, doubly charged peptide (aa46-59) with PEP 2.4e-15 and a mass error of −1.0 ppm. Bottom, quintuply charged peptide (aa234-273) with PEP 8.5e-215 and a mass error of −0.2 ppm. MaxQuant Expert System annotations not contained in simple fragment annotation were added for some major peaks (in black). These were the [M + H]+ ion and the histidine immonium ion in the upper spectrum and a series of internal fragments and the phenylalanine immonium ion in the lower spectrum. Full expert annotation is not shown for the sake of clarity.

Figure 4

Spectrum of a turkey OC-116 peptide indicating the presence of a C-terminus similar to chicken OC-116. This peptide was not found in the turkey OC-116 database sequence but matches the chicken OC-116 C-terminus (aa671-682), which is missing in the turkey database entry. This quadruply charged peptide was identified with a posterior error probability (PEP) of 0.0005 and a mass error of −0.144 ppm. It is the only evidence for the presence in the turkey protein of a C-terminus similar to that of chicken OC-116. The N-terminal glutamine was cyclized to pyroglutamate during peptide fractionation under acidic conditions, a frequently observed modification of peptides with N-terminal glutamine.

Ovocleidin-17 (OC-17) is a major chicken eggshell protein [8] completely unrelated to OC-116 [47, 48]. Its amino acid sequence was established by Edman sequence analysis of the purified protein [48]. The presence of a C-type lectin-like domain lacking the characteristics of a typical true C-type lectin and a tendency to aggregate during isolation suggested a function as structural matrix protein [48]. However, OC-17 was also reported to affect in vitro calcium carbonate crystallization [49] and to possess antimicrobial activity [50]. Neither the genome-derived protein sequence database of turkey nor that of chicken contained a sequence of significant similarity to OC-17, indicating that it may be encoded in the 5-10% of the genome sequences still missing for both species [23]. Addition of the chicken OC-17 protein sequence to the turkey protein sequence database yielded two OC-17 peptides, 47SAAELRLLAELLNASR62 and 75VWIGLHR81 (Figure 5). Furthermore, the existence of a protein similar to OC-17 in turkey eggshell matrix would be in agreement with the results of a comparative Western blotting study using anti-chicken OC-17 antiserum [51]. The OC-17 peptides peaked in gel fractions 15 and 16 (Additional file 2: Table S2 and Additional file 3: Table S3), corresponding to the size of OC-17. However, major proteins of similar size, such as avidin and cystatin, also showed a peak of their peptide distribution in these same fractions and very likely correspond to the two major bands observed in these sections (Figure 1). Nevertheless, the sum of evidence may indicate that OC-17 is present in turkey eggshell matrix, although at an unknown percentage of the total proteome.

Figure 5

Spectra indicating the presence of OC-17 in the turkey eggshell proteome. This figure shows selected spectra of the two ovocleidin-17 peptides. Top, this triply charged peptide was identified with a mass error of 0.7 ppm and a PEP of 1.9e-102. Bottom, doubly charged peptide with a mass error of 0.3 ppm and a PEP of 0.01. Four MaxQuant Expert System annotations for major peaks are shown in black in addition to the simple fragment annotation. These were two internal fragments, the immonium ion of tryptophan, and an a2 ion with loss of the tryptophan residue from the peptide backbone. Full expert annotation was omitted for the sake of clarity.

Ovocalyxin-36 (OCX-36) is a major protein of the turkey eggshell matrix proteome (Table 1) and the chicken eggshell matrix proteome [8]. In chicken it is secreted in oviduct sections where eggshell production takes place and its gene expression is strongly up-regulated during mineralization [19, 52]. Its sequence contains a bactericidal permeability-increasing domain and OCX-36 was therefore suggested to play a role in egg defense against microbial contamination [52].

Two other ovocalyxins, OCX-32 and OCX-21, were not found in the turkey eggshell matrix proteome. OCX-32 [53] apparently did not have a counterpart in the published turkey genome sequence. Addition of the chicken protein sequence to the turkey sequence database used with MaxQuant did not produce any evidence for its presence in the turkey eggshell matrix. OCX-21 is a name given on various occasions [17, 19] to the sequence contained in entry IPI00574331 of the chicken sequence database; it is identical to gastrokine-2 (E1C2G7_CHICK), a secretory protein of mammalian gastric surface mucous cells [54]. FASTA database searches showed that the turkey protein sequence database did contain a homolog of the protein, with 92.5% sequence identity to the chicken protein, in accession ENSMGAP00000000035/G1MPS6_MELGA. Therefore, the absence of any identified peptide of this major chicken eggshell matrix protein [8] may indicate its absence from the turkey eggshell matrix.

Major proteins not previously identified in eggshells

Among the major turkey eggshell matrix proteins (Table 1) were two proteins that had not been identified in other eggshell matrices or egg fractions before, periostin and trefoil family peptide 2 (TFF-2). Periostin accounted for almost 11% of the total matrix proteome and was not identified in the chicken matrix, although a periostin sequence was present in the IPI_CHICK sequence database used for that study [8]. As expected from its name, periostin was detected in the periosteum that covers the outer surface of bone, but it also occurs in other collagen-rich mammalian connective tissues [55]. The role of periostin in bone mineralization is not clear at present, although it seems to be important for bone growth and repair, and there is as yet no indication of its possible function in eggshell formation. Its presence in shell matrix now provides another strong link of eggshell mineralization to bone metabolism.

TFF-2 belongs to a family of small proteins expressed predominantly in the mucosa of the gastrointestinal tract [56, 57] and it seems to play a role in mucosal protection and repair. Its location and function in the mammalian gastrointestinal tract apparently overlap partially with those of gastrokines such as OCX-21/gastrokine-2. Indeed, interactions between gastrokines and TFFs have been reported [54]. TFF-2 binds to mucins, especially to mucin 5AC [58], which was another major component of the turkey eggshell matrix (Table 1). In vitro binding of TFF-2 to mucin induces the formation of highly viscous complexes [59]. Thus, the function of these two matrix components may be to protect the shell gland mucosa of egg-laying hens.

Also not previously identified in eggshell matrix was the egg white protein meleagrin [60]. Its chicken homolog gallin was shown to have antimicrobial activity [61]. The expression of gallin was highest in the magnum section of the oviduct and about 140 times less in the eggshell gland. This indicates that this small protein reaches the eggshell gland essentially as a left-over of egg white assembly by co-migration through the oviduct together with the unfinished egg [61].

Other major proteins possibly involved in mineralization or matrix assembly of chicken and turkey

Osteopontin is a major non-collagen protein in bone but also occurs in many other tissues and body fluids. It has an inhibitory effect on various normal and pathological mineralization processes [62]. In the chicken oviduct it is secreted exclusively in the eggshell gland and accumulates in the shell matrix [13]. Osteopontin of different species is highly phosphorylated and its inhibitory effect was shown to depend on phosphorylation. Similar to chicken osteopontin [8, 9] the abundance of turkey osteopontin was most probably greatly underestimated because the phosphorylated peptides were not identified in this general survey. The kinase phosphorylating extracellular matrix proteins such as OC-116 and osteopontin was recently identified as FAM20C protein [63, 64], a Golgi lumen resident that was also found in chicken and turkey eggshell matrix (Table 1). Another obvious candidate for a protein with a role in mineralization is G1N6Y5_MELGA, which contains a predicted α-carbonic anhydrase domain and could therefore be involved in carbonate production for the growing eggshell calcite layer or the control of CO2 concentration in the uterus fluid. A protein with a probable, yet unknown, function in eggshell assembly is glypican-4 (Table 1). Similar to the expression of osteopontin [65] its expression is massively up-regulated in response to the mechanical strain exerted onto the eggshell gland walls upon entry of the egg and ceases shortly before completion of the eggshell [66]. Other proteins may have a more general function in eggshell matrix production, such as the extracellular chaperone clusterin, which may be important in maintaining the proper folding of matrix proteins during matrix assembly [67]. SPARC/BM40/osteonectin, which is also abundant in other vertebrate mineralized tissues such as bone and teeth, may participate in regulation of matrix assembly [68].

Major egg white proteins in the eggshell matrix

Major egg white proteins known from chicken that were identified among the major turkey eggshell matrix proteins were ovalbumin, avidin, ovotransferrin, cystatin and lysozyme C (Table 1). Many others were identified at lower abundance (Additional file 1: Table S1). Two of the major proteins, lysozyme C and ovotransferrin, had been shown by immunohistochemical methods to be true components of the chicken eggshell matrix before any proteomic analyses were performed [11, 12]. Several of these proteins have antimicrobial activity and their main function during egg production may be in the defense of egg and oviduct against microbial contamination either by direct attack of bacterial cell walls (lysozyme C), iron sequestration (ovotransferrin), biotin sequestration (avidin), or protease inhibition (cystatin) [1, 2]. Lysozyme C, ovotransferrin, and ovalbumin have a very weak effect on calcium carbonate crystal morphology in vitro [11, 12, 14, 15] but their possible role in in vivo crystallization, if any, remains unclear at present. An attractive idea in this respect is that such proteins showing only very weak or no interaction with calcite may nevertheless influence eggshell mineralization by maintaining a proper environment in the eggshell gland with respect to pH, CO2 concentration and soluble calcium availability in the uterus fluid [14].

Conclusions

Analysis of the turkey eggshell matrix proteome revealed some unexpected differences as compared to the chicken eggshell matrix, although both species belong to the same family (Phasianidae). The turkey eggshell contains a new major eggshell component, the bone protein periostin. There were also differences among a group of so-called eggshell-specific proteins produced in the chicken eggshell gland epithelial cells and thought to be very important for eggshell production. Two of these, OC-116 and OCX-36, were also among the major turkey eggshell matrix proteins. Another one, OCX-21/gastrokine-2 was missing in the proteome, although a very similar protein sequence was contained in the genome-derived turkey protein sequence database. The sequences of OC-17 and OCX-32 were not contained in the turkey sequence database but may be encoded in the 5-10% of the genome not yet sequenced [22, 23]. Addition of the chicken sequences of these two proteins to the database enabled identification of two OC-17 peptides suggesting the presence of this protein in the turkey eggshell matrix. However, this approach was not successful in the case of OCX-32, indicating that it either was not in the matrix or that the sequences are too different for such a cross-species identification. More eggshell proteomes are needed to identify a possible common set of shell mineralization-controlling avian proteins along with more transcriptome studies, immunochemical approaches to eggshell protein localization, and functional tests with isolated proteins.