Background

By definition, a biomarker is a characteristic that can be measured and evaluated as an indicator of normal biological processes, pathological processes or pharmacologic responses to therapeutic interventions. Several applications for biomarkers have been described, including the use of biomarkers to diagnose the presence or absence of a disease [1, 2], predict or evaluate the efficacy of new and existing drug therapies [3, 4], or to serve as surrogates for measuring clinical outcomes [5]. To be considered a “good” biomarker, an indicator of disease must exhibit accuracy, sensitivity, and specificity. More simply stated, the expression of a biomarker must be specific to a disease and the biomarker should remain unchanged during unrelated disorders. Likewise, reliable and reproducible quantification of the biomarker must be demonstrated [6, 7].
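As a minimal sketch of how the criteria named above are usually computed, the following Python snippet derives sensitivity, specificity, and accuracy from the four outcomes of a diagnostic test. The function name and the example counts are hypothetical and are not taken from any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of diseased samples correctly flagged
    specificity = tn / (tn + fp)  # fraction of healthy samples correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # fraction of all calls that are correct
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 mastitic and 100 healthy milk samples
sens, spec, acc = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

A candidate marker that is elevated in unrelated disorders would raise the false-positive count and therefore lower specificity, even if its sensitivity remained high.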

The development of soft ionization techniques in mass spectrometry (MS), including matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI), enabled the use of MS to characterize biopolymers such as peptides and proteins, and greatly enhanced proteomic research [8]. By definition, the term proteomics refers to the use of a scientific approach to elucidate all proteins within a cell or tissue at a given time and physiological condition [9]. Proteomics typically involves the use of analytical methodologies such as two-dimensional gel electrophoresis (2D-GE) or liquid chromatography (LC) to separate proteins or peptides, and MS to isolate, identify, and characterize proteins and their associated post-translational modifications (PTMs). Mass spectrometric-based (MS-based) proteomic methodologies boast several advantages over earlier protein detection and characterization methods, including the capacity to detect a greater number of proteins in a given sample, as well as accurate protein identification and quantification without a reliance on antibodies. Proteomics likewise offers an advantage over genomic analyses in biomarker studies because a weak correlation often exists between mRNA levels and actual protein concentration [10-13].

Prior to the dominance of MS-based analytical approaches in proteomic biomarker studies, most initial protein analyses involved the use of antibody-based strategies, namely Western blots and enzyme-linked immunosorbent assays (ELISAs). Though extremely accurate, the most crucial element of an antibody-based detection strategy is a highly specific antibody-antigen interaction. Though a substantial amount of information has been generated on the expression of cytokines and soluble mediators of inflammation expressed during intra-mammary infections in dairy cows using ELISAs specifically [14], the use of approaches that rely on the availability of species-specific antibodies limits the identification and characterization of novel protein candidates in biomarker discovery analyses. Additionally, very few antibodies are commercially available for ruminant species as compared to more traditional laboratory animal species such as mice and rabbits, which further restricts the number of bovine proteins that can be analyzed by Western blot or ELISA. Finally, it has been well established that many proteins are modified post-translationally during disease, but because there is no practical protocol for the development of an immunoassay targeting a modified site if neither the site nor the modification is known, the use of antibody-based detection strategies can also limit the detection of potential disease specific PTMs.

A tremendous amount of research has been dedicated to elucidating the mechanisms and mediators involved in the bovine host response to intra-mammary infection with Gram-positive and Gram-negative pathogens [14], due largely to staggering economic losses caused by the disease, the limited number of efficacious treatment options, and the lack of accurate biomarkers to evaluate the efficacy of new animal drugs proposed as primary and adjunctive mastitis therapies. In addition, the prominent use of antimicrobials to prevent and combat mastitis infections in dairy cattle has garnered significant attention due to fears regarding the potential impact of antibiotic use in agriculture on the emergence of resistant strains of bacteria. Furthermore, while antibiotics are an effective treatment regimen for most cases of contagious mastitis caused by Gram-positive pathogens, several studies have demonstrated that antibiotics have little or no efficacy in treating clinical or subclinical cases of mastitis caused by environmental or Gram-negative bacteria [15, 16].

The outer membrane of Gram-negative bacteria is characterized by the presence of lipopolysaccharide (LPS), a compound known to stimulate a rapid inflammatory response in the bovine mammary gland [17, 18]. Mastitis infections caused by Gram-negative pathogens are problematic to treat, mainly because Gram-negative bacteria are protected from most antibiotics, detergents and chemicals by their outer membrane. Additionally, antibiotics do nothing to counter the deleterious effects of the endotoxin released by the bacteria. The only treatment that has shown promise as an adjunctive therapy for the profound intra-mammary inflammation associated with coliform mastitis is the use of non-steroidal anti-inflammatory drugs or NSAIDs [19-21]. However, due to the lack of valid criteria to evaluate efficacy, only the nonselective NSAID Banamine® has been approved for use in lactating dairy cattle. Consequently, the need to identify biomarkers to evaluate the efficacy of new and existing drug therapies and to facilitate new veterinary drug approvals has provided a stimulus for investigations into changes in the bovine milk proteome during mastitis.

The biological complexity of bovine milk, including the numerous reported PTMs of milk proteins, the presence of multiple variants of the dominant casein proteins, and the extreme dynamic range of the proteins that comprise bovine milk, has prevented the extensive characterization of low abundance proteins present in the bovine milk proteome [22]. Despite the common proteomic bottlenecks related to sample complexity, attempts have been made to study the dynamics of differential protein expression during bovine mastitis [23-28]. Consequently, over the past 10 years, significant advances have been made in identifying low abundance proteins in various bovine milk fractions collected both before and during clinical mastitis infections [23-28].

The proteomic analyses of mastitic bovine milk performed thus far have utilized different strategies including 2-dimensional gel electrophoresis (2D-GE) followed by MALDI-time-of-flight (TOF)/MS, and liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS). Proteomics has been used to evaluate modulation of milk proteins during mastitis in milk samples collected from cows with naturally-occurring mastitis infections [23, 24, 29], as well as in milk samples collected before and at time points following experimental induction of coliform mastitis by intra-mammary infusion with Escherichia coli or LPS [25-28]. Additionally, proteomics has been used to investigate proteolysis in bovine milk following infusion with lipoteichoic acid isolated from Staphylococcus aureus [30], and comparisons have been drawn between host defense proteins detected in both human and bovine milk fractions [31]. Various quantification strategies have likewise been used to assess modulation in the bovine milk proteome during mastitis including densitometry [23], spectral counting [26, 27], and incorporation of stable isotopes [28]. In all, roughly 80 proteins related to the host response to intra-mammary infections have been robustly identified in bovine milk as a result of proteomic investigations conducted in the past 10 years (Table 1).

Table 1 Proteins identified in bovine milk fractions using proteomic strategies

To a lesser extent, proteomic strategies have also been applied to the analysis of bovine mammary tissue, but the reported analyses have focused on profiling enzymes involved in milk synthesis and the production of milk lipids, and not on differential protein expression during mastitis [32, 33]. Other analyses, however, have focused on the use of proteomics to identify virulence factors, antigenic proteins, cell wall components, and proteins unique to select bacterial strains isolated from cases of bovine mastitis, and have contributed more directly to current knowledge of pathogen responses during clinical intra-mammary infections [34-37]. Specifically, proteomic analyses of veterinary pathogens, including etiological agents of mastitis, have identified potential targets for vaccine development, and elucidated potential mechanisms employed by invading bacteria to survive in the host environment [34-37].

Though still hindered by the dynamic and heterogeneous cellular composition of the matrix, the use of proteomic methodologies to obtain a more complete and unbiased characterization of host and pathogen responses during clinical mastitis could lead to the identification of a biomarker, or pattern of biomarkers, indicative of the disease. Likewise, the characterization of antigens specific to divergent strains of mastitis-causing bacteria and pathogen responses to the host environment could provide the necessary targets for the development of new preventatives. Should the difficulties inherent to the characterization of a complex proteome be overcome, and the criteria for accuracy, sensitivity, and specificity met, the establishment of biomarkers of mastitis would prove useful in evaluating the efficacy of existing or new drugs to treat secondary inflammation caused by Gram-negative pathogens, or for the discovery of potential new drug targets for the treatment of all intra-mammary infections.

Proteomic Strategies for Biomarker Discovery

The focus of proteomic-based biomarker discovery analyses is typically the identification and characterization of proteins present in a given biological tissue or fluid, the assessment of differential protein expression between different samples, or the detection and evaluation of the PTMs of target proteins. Accordingly, MS has emerged as the dominant approach in protein biomarker discovery analyses. Protein identification through the use of MS can be divided into two main categories, referred to as top-down and bottom-up. The primary distinguishing feature between the two proteomic approaches is that a top-down approach involves the direct ionization and fragmentation of intact proteins using MS, whereas bottom-up proteomics entails the proteolytic digestion of protein mixtures followed by the chromatographic separation of peptides and the ionization, fragmentation, and mass analysis of peptides by MS for protein identification [38, 39]. Another aspect that differentiates top-down proteomics from the bottom-up approach is the retention of PTM and cleavage information of identified proteins when using a top-down approach versus the loss of such information when dealing with tryptic peptides in a bottom-up experiment [39]. Top-down methods are likewise based on the principle that the sequence of any given protein in a sample is available in its entirety for analysis, as opposed to identification based on the detection of a tryptic peptide that must be sorted out from a complex mixture of peptides [38]. Though gaining in popularity, top-down proteomics has not yet been applied to the evaluation of host or pathogen responses during bovine mastitis, perhaps because top-down protein characterization requires MS instruments with extremely high mass resolution and accuracy, and the capability of fragmenting large ions [39].
Due to relative ease of the methodologies, availability of several supportive software options, the accessibility of numerous suitable instrument systems, and the establishment of viable quantification strategies, bottom-up proteomics has dominated biomarker discovery endeavors related to bovine mastitis. Over the past two decades specifically, 2D-GE followed by MALDI-TOF/MS, and LC-MS/MS have become the most widely used bottom-up proteomic approaches for protein identification in bovine milk.

Two-Dimensional Gel Electrophoresis (2D-GE) and MALDI-TOF MS

Protein profiling by 2D-GE is characterized by a first dimension separation of proteins by isoelectric point, followed by a second dimension sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) separation by molecular size. Advances in 2D-GE technology including gel strips with immobilized pH gradients (IPG) for isoelectric focusing have dramatically increased the resolving power and reproducibility of 2D-GE. Additionally, the development of radioactive and fluorescent labeling has improved the ability to visualize proteins in a 2D gel, as well as the detection of low-abundance and post-translationally modified proteins [40].

In a MALDI-TOF/MS experiment, the protein or peptide(s) of interest is mixed with a suitable energy-absorbing matrix and allowed to co-crystallize by air-drying on a stainless steel plate. In addition to the ability to co-crystalize with the analyte and absorb the wavelength of the laser employed, compounds used as MALDI matrices must also be vacuum stable, soluble in solvents that are compatible with the sample being analyzed, and must enable and promote the ionization of the analyte of interest [41]. Matrices used in MALDI-TOF/MS are typically highly substituted aromatic molecules with carboxylic acid moieties capable of absorbing high levels of UV light at a specific wavelength, such as sinapinic acid, which is commonly used for the analysis of intact proteins, or alpha-cyano-4-hydroxycinnamic acid, which is often used for the MALDI analysis of peptides [41].

The generation of ions in MALDI-TOF/MS is initiated by short pulse irradiation with a laser, typically nitrogen or neodymium-doped yttrium aluminum garnet, and occurs when the matrix becomes electronically excited following absorption of photons from the UV laser. Proteolytically digested peptides in the sample accept a proton from the matrix and become singly charged positive ions as they are converted into the gas phase and ejected from the matrix. The ions are then directed into the TOF analyzer, where they are separated according to their mass-to-charge (m/z) values to generate a mass spectrum. Separation of ions in a MALDI-TOF experiment occurs due to the velocity differences that exist between ions once the pulse of ions exits the ionization source and is dispersed in time down the flight tube. TOF mass analysis is based on the principle that after acceleration to a constant kinetic energy, ions travel at velocities that are inversely related to the square root of their m/z values [42]. Thus, lighter ions “fly” down the TOF tube faster and reach the detector sooner than heavier ions [42]. The subsequent identification of proteins from a MALDI mass spectrum is accomplished by the comparison of the set of peptide masses generated from a specific protein, or the peptide mass fingerprint, to a protein database containing theoretically calculated mass fingerprints of all known proteins.
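The inverse square-root relationship follows from equating the energy gained during acceleration, zeV, with the kinetic energy of the ion, (1/2)mv². A minimal Python sketch of the resulting flight time is given below. It assumes an idealized linear, field-free drift tube; the 1 m tube length and 20 kV accelerating voltage are illustrative defaults rather than the settings of any particular instrument.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, in coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit, in kilograms


def flight_time(mz, length_m=1.0, voltage=20000.0):
    """Flight time (s) of an ion with the given m/z (Da per unit charge).

    Idealized model: z*e*V = (1/2)*m*v**2, so v = sqrt(2*e*V / (m/z)).
    length_m and voltage are assumed, illustrative values.
    """
    velocity = math.sqrt(2.0 * E_CHARGE * voltage / (mz * DALTON))
    return length_m / velocity


for mz in (1000.0, 4000.0):
    print(f"m/z {mz:.0f}: {flight_time(mz) * 1e6:.1f} microseconds")
```

Under these assumptions, a fourfold increase in m/z doubles the flight time, which is why ions of different m/z arrive at the detector separated in time.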

Fragmentation reactions, or post-source decay (PSD), can often occur during travel down the flight tube of a TOF MS instrument, possibly due to collisions between ions and neutral matrix molecules, or between ions and residual gas molecules [43]. PSD fragment ions can be monitored using TOF instruments equipped with a reflectron, which is an electrostatic mirror consisting of a series of electrical lenses that possess progressively higher repelling potentials [42]. A reflectron not only improves the mass resolution of the MALDI MS spectrum, but likewise allows for the separation of PSD fragment ions and the generation of a PSD spectrum, which can be searched against a protein database to obtain peptide sequence information. Using a MALDI-TOF instrument, however, requires the generation of several PSD spectra from different mass regions in order to generate fragmentation information adequate for peptide identification. Subsequently, the sequencing of peptides using MALDI-TOF MS is most often performed using instruments equipped with dual TOF analyzers (MALDI-TOF/TOF). In contrast to single TOF MS instruments, MALDI-TOF/TOF instruments allow for the generation of complete fragment ion spectra in a single acquisition as well as improved precursor ion selection, and are thus preferable for accurate peptide sequencing using MALDI [44].

Liquid Chromatography Tandem Mass Spectrometry (LC-MS/MS)

Despite the recent advancements in reproducibility and protein quantification, 2D-GE as a means of protein separation still suffers from issues including diminished capacity for the isolation of low abundance proteins, hydrophobic proteins, or proteins with extreme isoelectric points. Consequently, LC has emerged as the preferred method for in-solution separation of proteins and peptides prior to mass analysis using MS. Using the LC-MS/MS approach, proteins are proteolytically digested into peptides using a protease such as trypsin, which cleaves on the carboxyl side of arginine and lysine residues, and separated online by 1-dimensional (1-D) or 2-dimensional (2-D) LC prior to introduction into the mass spectrometer for mass analysis. Separation of peptides is accomplished by either strong cation-exchange chromatography, reverse-phase (RP) chromatography, or a combination of both separation strategies, followed by ionization and mass analysis of the peptides and peptide fragments by ESI-MS/MS [45]. In traditional 2-D LC, peptides are separated in the first dimension by charge using ion exchange chromatography, and are then further separated in the second dimension by hydrophobicity using RP-LC. Alternatively, peptides can be separated by 2-D LC using RP-LC in both dimensions. In 1-D LC-MS/MS experiments, peptide mixtures are typically separated only by hydrophobicity by passage over a column packed with non-polar stationary phase. Using either approach, the number of proteins identified using LC-MS/MS is directly dependent on the efficiency of peptide separation prior to introduction into the mass spectrometer [46]. Poor chromatographic resolution increases the frequency of the co-elution of peptides off the LC column prior to introduction into the mass spectrometer, which can result in the production of a tandem mass spectrum of a peptide mixture that often fails to yield a match when searched against a protein database.
Additionally, the potential for selection of a peptide from a low abundance protein for further fragmentation by a process called collision-induced dissociation (CID) will decrease with poor LC peak resolution, and peptides from dominant proteins will be preferentially selected for analysis.
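The proteolytic digestion step described above can be sketched in silico. The Python function below cleaves a protein sequence on the carboxyl side of lysine (K) and arginine (R); the exception for a following proline and the optional allowance for missed cleavages are commonly applied conventions that are assumed here and are not stated in the text. The example sequence is hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in silico trypsin digest of a one-letter sequence.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    # Cleavage positions (index of the first residue of the next peptide)
    sites = [i + 1 for i, aa in enumerate(sequence[:-1])
             if aa in "KR" and sequence[i + 1] != "P"]
    bounds = [0] + sites + [len(sequence)]

    peptides = []
    for start in range(len(bounds) - 1):
        # Join up to `missed_cleavages` additional adjacent fragments
        last = min(start + 2 + missed_cleavages, len(bounds))
        for end in range(start + 1, last):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


# Hypothetical sequence; the R followed by P is not cleaved
print(tryptic_digest("AAKRPGGRLLK"))
```

The resulting peptide list is the kind of theoretical digest against which observed peptide masses or tandem mass spectra are compared during a database search.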

In LC-MS/MS experiments, ESI is the dominant method of ionization. Ionization occurs in ESI when the peptide solution is dispersed as a fine spray of charged droplets after passage through a heated metal capillary tube to which voltage is applied. The charged droplets are desolvated by a dry inert gas, and multiply charged ions are produced. Nano-spray ionization (NSI) functions in a manner similar to ESI, but flow rates from the LC instrument into the ionization source of the mass spectrometer are much lower with nano-spray than those used for ESI. Because flow rates are reduced from microliters per minute to nanoliters per minute when using nano-spray, droplet formation can occur without the addition of heat or a sheath gas, smaller charged droplets are formed, and ionization efficiency is greatly improved [47].

Ions resulting from either ESI or NSI are directed into the vacuum chamber of the MS instrument, and are resolved according to their m/z ratio. In tandem mass spectrometry (MS/MS), the masses of precursor ions are determined in the first MS scan, and an MS spectrum is generated. From each MS scan, a pre-determined number of ions can be selected for further fragmentation by CID. Fragmentation by CID involves the introduction of an inert gas such as argon (Ar) or helium (He) into the collision cell of the mass spectrometer which, through impact with the selected precursor ions, results in further fragmentation of the ions. The second stage of tandem MS is used to analyze the masses of the fragment, or product, ions produced by CID, and results in the production of a tandem or MS/MS mass spectrum. Peak lists generated from the fragment ion masses in tandem mass spectra are then distilled and searched against a protein database to determine the amino acid sequence of the peptides in the complex mixture. The assignment of the sequenced peptides to a given protein is the means by which protein identification is ultimately accomplished [46].
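The selection of a pre-determined number of precursor ions from each MS scan can be illustrated with a short Python sketch. Ranking precursors by signal intensity is an assumption made here for illustration, as are the function name and the peak values; actual instrument methods typically apply additional criteria.

```python
def select_precursors(ms1_peaks, top_n=3):
    """Pick the top_n most intense precursor m/z values from one MS scan.

    ms1_peaks: list of (mz, intensity) tuples. Intensity ranking is an
    assumed selection rule, used only to illustrate the idea.
    """
    ranked = sorted(ms1_peaks, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in ranked[:top_n]]


# Hypothetical MS scan: two intense peptides dominate the spectrum
scan = [(445.12, 1.0e4), (512.30, 8.0e5), (623.84, 2.0e3),
        (701.40, 5.0e5), (850.45, 9.0e4)]
print(select_precursors(scan))
```

In this hypothetical scan, the weakest ions are never fragmented, which mirrors the preferential selection of peptides from dominant proteins described earlier.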

Common Proteomic Bottlenecks

Comparative proteomic analyses are designed to elucidate changes in the relative abundance of proteins among different biological states, most commonly healthy versus diseased. Detection of the same peptides from a given protein is not always possible in comparative studies, however, because PTMs of peptides as a result of disease are expected. Characterization of PTMs is crucial for biomarker discovery, because much of the regulation of the biological activity of proteins is mediated by the modification of peptide amino acid residues, including the phosphorylation of serine and threonine, and the glycosylation of asparagine, arginine, or tyrosine. Unfortunately, characterization of PTMs has been hindered in past experiments because modifications are labile and are often lost in a CID experiment. Electron-transfer dissociation (ETD), a fragmentation strategy better suited to the analysis of PTMs, was recently introduced, however, and shows promise as a strategy for the characterization of protein modification during disease [48]. The ETD technique uses electrons to promote fragmentation along the peptide backbone, which produces a series of c- and z-ions, whereas CID fragmentation most often produces a series of b- and y-ions. Protonation during ionization occurs most often at the N-terminal amino group; however, the charge can be localized to any of the nitrogen atoms that comprise the amide bond [42]. As a result, all three of the peptide backbone bonds can be cleaved, and either the N- or C-terminal fragments may retain the charge (Fig. 1). When the charge is retained on the N-terminus, the ions are denoted as either a_n, b_n, or c_n, whereas C-terminal ions are designated as x_n, y_n, or z_n.
While side chain information (R group) is lost during the formation of b- and y- ions, the fragmentation of the peptide backbone using ETD and the generation of c- and z- ions allows for amino acid side chains and modifications such as glycosylation and phosphorylation to remain intact, making it possible not only to deduce the amino acid sequence of a peptide, but also to detect any modified residues [48].
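To make the ion nomenclature concrete, the following Python sketch computes singly protonated b-, y-, c-, and z-ion m/z values from monoisotopic residue masses. It assumes standard monoisotopic masses and one common convention for the c- and z-series (c = b + NH3, z = y − NH3); the radical z-type ions typically observed in ETD are heavier by one hydrogen atom. The peptide and the `mods` argument are hypothetical.

```python
# Monoisotopic residue masses (Da)
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, WATER, AMMONIA = 1.007276, 18.010565, 17.026549


def fragment_ions(peptide, mods=None):
    """Singly charged b, y, c and z ion m/z values for a peptide.

    mods: optional {zero-based residue index: mass shift in Da}, e.g. a
    hypothetical phosphorylation of about +79.96633 Da.
    """
    mods = mods or {}
    masses = [RESIDUE[aa] + mods.get(i, 0.0) for i, aa in enumerate(peptide)]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]           # N-terminal
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]  # C-terminal
    c = [m + AMMONIA for m in b]  # assumed convention: c = b + NH3
    z = [m - AMMONIA for m in y]  # assumed convention: z = y - NH3
    return {"b": b, "y": y, "c": c, "z": z}


ions = fragment_ions("GASK", mods={2: 79.96633})  # hypothetical phosphoserine
for series, values in ions.items():
    print(series, [round(v, 4) for v in values])
```

Fragments that contain the modified residue carry its mass shift, so a modification that survives fragmentation can be localized by finding where in an ion series the shift first appears.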

Figure 1. Diagram of N- and C-terminus fragment ions generated from all three possible cleavages of peptide backbone bonds

Other limitations that obstruct comparative proteomic experiments include the types of proteins present in the sample to be analyzed, protein abundance or dynamic range, and the chosen proteomic strategy. A select number of highly abundant proteins can sometimes represent a large percentage of the total protein concentration in a given sample, which can lead to a higher probability of selection of peptides from the abundant proteins for MS/MS analysis, and hindered detection of low abundance proteins. For example, serum albumin can account for more than half the total protein concentration of plasma (30-50 g/L), whereas minor proteins such as interleukins are present in only picogram concentrations [49]. However, identification of low abundance proteins is critical to biomarker discovery, as low-copy-number gene products have a high probability of being potential drug targets and biological markers of disease [50].

Preparation strategies are typically employed to deplete high abundance proteins prior to LC-MS/MS in an effort to reduce sample complexity and enhance detection of low abundance proteins. Some traditional depletion strategies include the precipitation of abundant proteins, the use of size exclusion filters or 1D-SDS PAGE to fractionate a complex mixture of intact proteins by mass, and targeted removal of abundant proteins using immuno-depletion techniques. There are several commercially available depletion kits optimized for use with serum and plasma that selectively deplete a number of the most abundant blood proteins including serum albumin, immunoglobulins, complement, and acute phase proteins. Additionally, the use of a bead-bound library of peptides with random rearrangements of six amino acids and the capacity for binding a large number of different proteins is gaining popularity as a means to equalize the abundance of proteins in a complex biological sample [51].

Specific depletion techniques used in prior proteomic analyses of bovine milk have included acid precipitation of the casein proteins [23, 28, 29], and the use of bead-bound peptide libraries [27]. When considering comparative proteomic analyses of normal and mastitic milk, however, the divergence that exists in the number of hydrophobic and hydrophilic proteins in a given sample at each physiological state, the varying sizes and charge states of proteins present in each matrix, PTMs, and the cellular distribution of proteins in one sample versus another make the application of a universal sample preparation strategy unfeasible. Additionally, though depletion strategies are commercially available, they are often optimized for only one biological matrix, are not amenable to all types of samples, most notably bovine milk, and can lead to the non-selective depletion of proteins of potential interest [27]. Likewise, profound changes in the dynamic range, modification, and protein composition of a given sample due to the induction of a disease such as mastitis can further confound the use of sample preparation strategies [27]. When no effective sample preparation strategy is applicable, several proteomic approaches are often combined to increase protein detection and subsequent proteome coverage. For example, LC techniques are more efficient for detection of hydrophobic, low molecular mass, and basic proteins than 2D gel based methods, whereas 2D-GE enables the detection and visualization of PTMs [50]. Despite limitations and inherent drawbacks, proteomic strategies offer a wider range of capabilities than classic approaches to protein characterization such as ELISA and, with further advances, could factor prominently in the establishment of biomarkers of mastitis.

Quantification

Advances in MS have provided a means for the accurate and sensitive identification of differentially expressed proteins in complex biological samples, but another important criterion for the establishment of disease biomarkers is reliable quantification. Relative and absolute quantification of modulation in protein abundance using proteomic strategies is a topic that has garnered significant attention in recent years [52-55]. Proteomic-based quantification methods can be assigned to one of two broad categories: the incorporation of stable isotope labels into proteins or peptides prior to MS analysis, or the alternative, which is to use a label-free quantitative method [55].

The basis of most labeled quantification methods is the theory that a labeled peptide will behave in the same fashion chemically as its unlabeled counterpart, and therefore the two peptides will have identical chromatographic and MS properties [55]. The addition of a label, however, does impart a mass difference between the two peptides, and thus relative abundance can be inferred by comparing the respective signal intensities of the labeled and unlabeled peptides in the same MS run [55]. Labels can be incorporated in several ways, with the most popular means being either metabolically, chemically, or enzymatically [53, 55]. Examples of labeling strategies include: metabolic labeling commonly known as stable isotope labeling by amino acids in cell culture or SILAC [56]; proteolytic labeling with 18O [57]; or isotope incorporation by several means including chemical derivatization also known as isotope coded affinity tags or ICAT [10], isobaric tags for relative and absolute quantitation (iTRAQ) [58], or global internal standard technology [59].
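The comparison of labeled and unlabeled signal intensities can be sketched as follows. This minimal Python example computes a per-peptide intensity ratio and summarizes the ratios for one protein by their median on a log2 scale; the use of the median, the function names, and the intensity values are assumptions made for illustration only.

```python
import math
import statistics


def peptide_ratios(pairs):
    """Ratio of unlabeled (light) to labeled (heavy) intensity per peptide.

    pairs: {peptide: (light_intensity, heavy_intensity)} from the same MS run.
    Peptides lacking a signal in either channel are skipped.
    """
    return {pep: light / heavy
            for pep, (light, heavy) in pairs.items()
            if light > 0 and heavy > 0}


def protein_log2_ratio(pairs):
    """Median log2 ratio across the peptides of one protein (assumed summary)."""
    ratios = peptide_ratios(pairs)
    return statistics.median(math.log2(r) for r in ratios.values())


# Hypothetical intensities for three peptides of one protein
pairs = {"LLK": (4.0e5, 1.0e5), "AAK": (2.0e5, 1.0e5), "RPGGR": (8.0e5, 2.0e5)}
print(peptide_ratios(pairs))
print(protein_log2_ratio(pairs))
```

A peptide detected in only one channel yields no ratio in this sketch, which is one reason labeled comparisons become difficult when a protein is present in only one physiological state.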

The derivatization of the primary amine groups of proteins or peptides with isobaric tags, or iTRAQ, has become a very popular LC-MS/MS quantification strategy in recent years. Quantification using iTRAQ is performed by the fragmentation of the attached isobaric tag, which generates a low molecular mass reporter ion [58]. The popularity of iTRAQ in proteomic screens, including the analysis of mastitis pathogens [36], the bovine milk fat globular membrane [60], and bovine milk following in vivo LPS challenge [28], is due primarily to the capacity to simultaneously analyze four or more differential protein pools.

However, though very accurate, labeling strategies can be cost-limiting, unsuitable for some types of biological matrices, and cannot be performed retrospectively. In terms of comparative proteomic analysis of normal versus mastitic milk aimed at biomarker discovery, labeling strategies are not always feasible for protein quantification. Due to dramatic changes in protein composition during clinical mastitis, accurate comparison of the abundance of a peptide that is present at one physiological state but not the other is problematic [27]. As a result, label-free strategies, which are based on the correlation between the abundance of a protein or peptide in a sample and the MS signal [55], have increased in popularity in recent years. Ion intensity, determined by extracted ion chromatograms (XIC), is one of the most accurate and widely used methods of label-free quantification. Using XIC, the number and intensity of selected precursor ions at a particular m/z are summed, and the peak areas are used as a measure of relative abundance [61]. Another popular label-free approach is spectral counting, which is defined as the number of MS/MS spectra that contribute to the identification of a given protein [62, 63]. Spectral counting is based on the assumption that abundant proteins, when proteolytically digested, will yield numerous copies of the same peptide, and that peptides from abundant proteins have higher probabilities of triggering multiple MS/MS events than peptides from lower abundance proteins. Similar to spectral counts, the number of unique peptides assigned to each protein in a sample has likewise been used as a measure of relative protein abundance [26, 62]. An inherent drawback of label-free methods, though, is the assumption that the linearity of response will be the same for each protein, which often does not hold true because the chromatographic behavior of peptides tends to vary [55].
Additionally, because the amino acid composition of every peptide differs, the ionization potential of each peptide is unique, and may not always be correlated to abundance. Thus, many spectra must be acquired and data normalized when using label-free methods [61]. Nevertheless, label-free quantification does not require any extra sample processing and can be performed retrospectively; two attributes which make non-labeled quantification methods the continued focus of development in biomarker discovery research [55].
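As a minimal sketch of spectral counting with normalization, the Python function below converts raw spectral counts from two samples into fractions of each sample's total and reports a log2 fold change per protein. The pseudocount, which keeps proteins detected in only one sample from producing a division by zero, is an assumption introduced here, as are the protein labels and counts.

```python
import math


def log2_fold_change(control, disease, pseudocount=1.0):
    """Log2 fold change (disease vs. control) of normalized spectral counts.

    control, disease: {protein: number of MS/MS spectra assigned}.
    Counts are normalized to the sample total; pseudocount is an assumed
    smoothing term for proteins absent from one sample.
    """
    proteins = set(control) | set(disease)
    total_c = sum(control.values()) + pseudocount * len(proteins)
    total_d = sum(disease.values()) + pseudocount * len(proteins)
    result = {}
    for p in proteins:
        frac_c = (control.get(p, 0) + pseudocount) / total_c
        frac_d = (disease.get(p, 0) + pseudocount) / total_d
        result[p] = math.log2(frac_d / frac_c)
    return result


# Hypothetical spectral counts for normal and mastitic whey
normal = {"beta-casein": 150, "serum albumin": 20}
mastitic = {"beta-casein": 40, "serum albumin": 110, "lactoferrin": 35}
for protein, change in sorted(log2_fold_change(normal, mastitic).items()):
    print(f"{protein}: {change:+.2f}")
```

Normalizing to the sample total is only one of several possible schemes, and any such scheme still inherits the differences in ionization behavior between peptides noted above.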

Modulation in the Bovine Milk Proteome During Mastitis

Proteomic strategies were first applied to the study of the bovine milk proteome in the late 1980s, and were predominantly focused on analysis and characterization of the most abundant milk proteins. The initial proteomic ventures utilized either 2D-GE or LC alone, and were performed only on bovine whole milk [64-66]. Similarly, the earliest attempt to identify bovine milk proteins via LC coupled with MS was limited to the detection of only the major milk proteins β-lactoglobulin, α-lactalbumin, α-casein, β-casein, and κ-casein, and lacked peptide sequencing data [67]. More recently, however, proteomics has been used to characterize and, in some studies, quantify changes in the bovine milk proteome that represent the host response to infection [23-29].

Initial Comparative Proteomic Analyses of Bovine Milk

The first proteomic endeavor aimed at the identification of novel markers for bovine mastitis was a comparison of differentially expressed proteins in normal and mastitic bovine milk from seven cows with naturally-occurring clinical infections using 2D-GE followed by MALDI-TOF-MS with post-source decay (PSD) [29]. Milk sample preparation prior to proteomic analysis was relatively simple, and consisted of removal of the milk fat by centrifugation at 4°C and acid precipitation of the caseins. The proteomic analyses detected lipocalin-type prostaglandin D synthase only in 2D gels generated from milk samples collected from cows with clinical mastitis [29]. Other proteins detected in the 2D-GE profiles of the milk samples analyzed included serum albumin and κ-casein [29]. Though the results were extremely limited and only a single differentially expressed protein in mastitic bovine milk was reported, the study nonetheless marked the first instance of published data on a marker of inflammation identified in bovine milk during mastitis using proteomic strategies.

A later attempt to identify differentially expressed proteins in whey from healthy cows versus cows with clinical mastitis also employed 2D-GE followed by enzymatic digestion of isolated proteins and MALDI-TOF-MS [23]. Mastitis was defined strictly by the identification of clinical signs, and milk sample preparation prior to analysis included fat removal and precipitation of the casein proteins by the addition of ammonium sulphate, followed by dialysis to remove traces of the salt [23]. Despite attempts to remove the casein fraction from the milk prior to proteomic analysis, however, αS1-casein, β-casein, and κ-casein were still found in rather high abundance in normal milk. Also identified were the proteins serum albumin, transferrin, microsomal triglyceride protein, β-lactoglobulin, and α-lactalbumin [23]. Marked increases in serum albumin and transferrin, concurrent with apparent decreases in the caseins, β-lactoglobulin, and α-lactalbumin, were evident in the 2D-GE profiles generated from whey samples from cows with mastitis, and supported the well-established theory that serum protein concentration increases in milk during clinical mastitis infections as a result of the breakdown of the blood-milk barrier [23]. While no additional biomarker information was gained and protein identification was restricted to major milk proteins, the results were the first to demonstrate temporal changes in bovine milk protein expression during mastitis without a reliance on antibody-based strategies.

Detection of Proteins Related to the Host Response in Bovine Milk

The first extensive report of protein modulation in separate sub-cellular fractions of bovine milk collected during different physiological states included direct LC-MS/MS, as well as 2D-GE, excision of protein spots, and identification by either MALDI-TOF-MS or LC-MS/MS [24]. The milk fractions analyzed included milk from a cow in peak lactation, colostrum from a fresh cow, and milk from a cow with naturally-occurring clinical mastitis. Though the sample size was limited, the results of the analyses were the most comprehensive to date in terms of the number of low abundance proteins identified, and the number of host response proteins that were isolated from mastitic milk [24]. An additional novel aspect of the proteomic analyses conducted by Smolenski and colleagues was that no sample clean-up or depletion of high abundance proteins was carried out prior to analysis. Despite the presence of caseins in the fractions analyzed, the study marked the first reported identification of proteins such as apolipoprotein A-1, cathelicidin-1, heat shock 70 kDa protein, peptidoglycan recognition receptor protein (PGRP), calgranulin B and C, and serum amyloid A (SAA) in milk fractions collected from a cow with naturally-occurring mastitis [24].

The first reported comparison of protein expression patterns in milk from cows before and after experimental induction of E. coli mastitis utilized 2D-GE followed by peptide sequencing using MALDI-TOF PSD to characterize differentially expressed proteins [25]. Unlike prior 2D-GE-MALDI analyses of mastitic bovine milk, no removal of high-abundance proteins was performed prior to analysis. The analyses involved milk samples collected from 8 cows before, and 18 h after, intra-mammary inoculation with E. coli, and only proteins present in whey fractions of all 8 cows were sequenced to avoid reporting a protein that represented a potentially unique response to infection. Despite the lack of sample clean-up, the low abundance proteins transthyretin, lactadherin, β-2-microglobulin precursor, α-1-acid-glycoprotein (A1AG), and complement C3 precursor were identified in whey samples from healthy cows. Whey samples at 18 h post infection were characterized by an abundance of serum albumin, in spots of varying mass and isoelectric point, as well as increased transthyretin and complement C3 precursor levels. Also detected at 18 h post inoculation were the antimicrobial peptides (AMPs) cathelicidin-1, -2, -3, and -4, and the proteins β-fibrinogen, α-2-HS-glycoprotein, S100-A12, and α-1-antiproteinase [25]. The most notable results of the analyses, however, were the detection of the cathelicidin cationic AMPs, and the identification of the acute phase protein (APP) A1AG in both normal and mastitic whey samples, as neither the AMPs nor A1AG had been reported in previous comparative 2D-GE proteomic analyses of bovine milk [23, 24, 29]. In addition, prior reports of APP expression in milk during bovine mastitis were all antibody-based detections, and only documented the identification of the APPs SAA, haptoglobin (HPT), and lipopolysaccharide binding protein [68–70].

Temporal Expression and Quantification of Bovine Milk Protein Modulation during Clinical Mastitis

The first proteomic analyses of a more extensive longitudinal series of bovine milk samples, collected from 8 mid-lactation cows before and over the course of experimental infection with E. coli, utilized an ultra-high-pressure LC instrument coupled to a quadrupole TOF-MS [26]. Similar to earlier analyses [24], only a select number of proteins related to the host response were identified in whey from milk following E. coli challenge. LC-MS/MS conducted on whey from milk samples collected just prior to infusion with E. coli and at various time points following infection resulted in the identification of the high to medium abundance proteins αS1-, αS2-, β-, and κ-casein, and the whey proteins serum albumin, β-lactoglobulin, and α-lactalbumin. A select number of lower abundance markers of inflammation, including lactoferrin, transferrin, apolipoprotein A-I, fibrinogen, Glycam-1, PGRP, and cathelicidin-1, were also identified. Despite limited protein identifications, normalized peptide counts for each protein identified were used to evaluate temporal changes in milk proteins following infection [26]. Additionally, to assess the accuracy of LC-MS/MS-based label-free quantification strategies, the modulation in abundance of serum albumin, lactoferrin, and transferrin in milk during disease, as evaluated using proteomic-based methods, was compared with protein abundance measured using ELISAs. The comparison of label-free, proteomic-based quantification with abundance profiles generated by ELISAs revealed that label-free LC-MS/MS methods yielded results comparable to antibody-based detection, and were a viable means of tracking changes in relative protein abundance in milk during disease [26]. Despite the identification of primarily abundant milk proteins, the results likewise indicated that, with further methodology refinement, LC-MS/MS could be used to evaluate temporal changes in proteins related to host response for which no antibody existed [26].
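One simple way to express how closely a label-free temporal profile tracks an ELISA profile is a Pearson correlation across the shared sampling time points. The sketch below is illustrative only: the choice of Pearson correlation is an assumption rather than the statistic reported in [26], and the function name and all values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long series of measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with >= 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of squares and cross-products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for one protein at five time points after challenge.
normalized_peptide_counts = [2.0, 5.0, 14.0, 20.0, 9.0]  # label-free LC-MS/MS
elisa_concentrations = [0.1, 0.3, 0.9, 1.2, 0.5]         # arbitrary units

agreement = pearson_r(normalized_peptide_counts, elisa_concentrations)
```

A coefficient near 1 would indicate that the two methods rise and fall together over time, which is the kind of agreement in overall pattern described above; it says nothing about agreement in absolute concentration.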

More recent comparative proteomic analyses of whey from bovine milk before, and at time points after experimental challenge with E. coli [27] or LPS [28], have resulted in the more robust detection of bovine milk proteins, including the identification of a greater number of proteins related to the host response to infection. The separate comparative analyses both involved evaluation of samples collected before and following experimental induction of clinical mastitis, both reported quantification of changes in the relative abundance of milk proteins related to host response, and both detected equivalent numbers and types of proteins (Table 1). The distinguishing factor between the two comparative proteomic analyses was the application of the iTRAQ labeling strategy for quantification of protein modulation [28], versus the use of spectral counts, a label-free proteomic approach, to assess changes in the relative abundance of minor milk proteins [27]. Other divergent aspects of the in vivo challenge studies included the number of animals enrolled in each study, as well as sample preparation strategies. Methodologies employed in the proteomic analysis of host response to LPS-mediated intra-mammary inflammation differed slightly from previous E. coli challenge studies [25–27] in that only 3 cows were enrolled in the study, and acid precipitation of caseins was performed prior to LC-MS/MS analyses.

Similar to the prior reports on the use of label-free quantification strategies [26], the accuracy of LC-MS/MS label-free quantification performed using samples collected in the E. coli challenge study was compared to data generated using antibody-based strategies [27]. Unlike prior reports, however, total spectral counts were used as a measure of relative abundance and, due to increased sensitivity of the instrument system, the temporal expression of the low abundance acute phase proteins HPT and SAA was evaluated as opposed to that of major milk proteins [27]. Comparisons of ELISA data and total spectral counts for the two APPs revealed trends in temporal expression with very similar overall patterns, which reaffirmed the utility of using LC-MS/MS data to model expression patterns of proteins related to the host response in bovine milk during clinical mastitis. By contrast, quantification of host responses to LPS-mediated inflammation was calculated using fold-change in equivalent iTRAQ reporter ion intensities for peptides identified in pre-challenge milk samples and at time points following challenge [28].
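The fold-change calculation on reporter ion intensities can be sketched as a ratio of the post-challenge to the pre-challenge intensity for the same peptide. The function names and the default 2-fold cut-off are assumptions for illustration, and the sketch omits the channel normalization and peptide-to-protein aggregation that a full iTRAQ workflow would require.

```python
import math


def fold_change(pre_intensity, post_intensity):
    """Ratio of post-challenge to pre-challenge reporter ion intensity."""
    if pre_intensity <= 0 or post_intensity <= 0:
        raise ValueError("reporter ion intensities must be positive")
    return post_intensity / pre_intensity


def log2_fold_change(pre_intensity, post_intensity):
    """Symmetric scale: +1 is a doubling, -1 is a halving."""
    return math.log2(fold_change(pre_intensity, post_intensity))


def is_modulated(pre_intensity, post_intensity, threshold=2.0):
    """True if the change is at least `threshold`-fold in either direction.

    The default of 2.0 is a hypothetical cut-off chosen for this example.
    """
    change = abs(log2_fold_change(pre_intensity, post_intensity))
    return change >= math.log2(threshold)


# Hypothetical reporter ion intensities for one peptide.
ratio = fold_change(100.0, 250.0)
```

Working on the log2 scale treats increases and decreases symmetrically, so a protein that halves in abundance is flagged by the same criterion as one that doubles.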

Though the time frame of the reported modulation in protein expression differed due to inoculation with E. coli versus purified LPS, which is known to stimulate a more rapid inflammatory response than E. coli in the bovine mammary gland [18], and the methods of determining changes in protein abundance were not the same, the identified proteins with altered expression patterns were similar for the two separate analyses. Proteins related to the host response to infection detected in whey from bovine milk at time points following infection were predominantly vascular derived, acute phase, antimicrobial, complement, or related to immune response, and fell into categories that could be broadly classified as secondary effects of cytokine induction [27, 28]. Cytokine expression during the bovine innate immune response to invading pathogens has been well characterized, and the induction of vascular leak and APP synthesis, as well as the stimulation of neutrophils as a result of cytokine production, have been previously established [14]. Following intra-mammary inoculation with E. coli, several vascular-derived proteins including all three chains of the blood coagulation protein fibrinogen, apolipoprotein A-1, and serotransferrin were detected in bovine milk as early as 12 h after challenge, with peak abundance detected at 24 h following induction of mastitis [27]. The same proteins were detected following challenge with LPS, and were 2-fold or higher in expression level 7 h after infusion [28]. Similarly, peak expression of complement C3, determined using spectral counts, was detected in bovine milk 18 h after induction of coliform mastitis [27], while 2-fold changes in reporter ion intensity were apparent for complement C3 peptides in bovine milk 7 h after LPS challenge [28]. The reported increases of serum proteins in milk following E. coli or LPS challenge are most likely the result of vascular leakage induced by tumor necrosis factor-alpha (TNF-α) or interleukin-1 beta (IL-1β) expression, and corresponded well with prior reports of peak cytokine expression during E. coli- and LPS-induced mastitis [14, 18]. In all, 13 of the same proteins were found to have biologically relevant changes in relative abundance during clinical mastitis in both comparative proteomic analyses, including complement factors, the AMPs cathelicidin-1 and PGRP, apolipoprotein A-1, the APPs serotransferrin, HPT, and SAA, and the somewhat poorly characterized proteins kininogen, inter-alpha-trypsin inhibitor heavy chain-4 (ITIH4), and clusterin [27, 28].

The most interesting aspect of both quantitative proteomic analyses of bovine milk following experimental induction of mastitis was the modulation of apolipoproteins and the somewhat novel candidate biomarkers ITIH4, kininogen, and clusterin. Quantification of changes in abundance of the potentially novel candidates was accomplished using both iTRAQ [28] and spectral counts [27]. The temporal expression patterns of the candidate proteins, modeled using LC-MS/MS spectral count data (Fig. 2), likewise appeared to be in accord with prior reports of inflammatory mediator expression during coliform mastitis [14]. Though apolipoproteins are known to be involved in lipid transport and are major components of high density lipoproteins, other roles for the apolipoproteins during disease and inflammation have been proposed [71]. The specific role of the apolipoproteins during inflammation related to coliform mastitis has not yet been determined, but the apolipoproteins could inhibit neutrophil activation, as well as the release of inflammatory cytokines [72, 73]. By similarity, bovine ITIH4 is assumed to be a serine protease inhibitor involved in the acute phase response. Reports of ITIH4 expression in cattle, however, have been limited to isolation of the APP from the serum of heifers with experimentally induced summer mastitis [74]. Prior to recent comparative proteomic analyses of bovine milk following in vivo challenge, ITIH4 had never been identified in bovine milk, and its exact role during coliform mastitis remains unknown. Kininogen, on the other hand, is part of the plasma kallikrein-kinin system, and is known to play key roles in complement activation [75] and the release of bradykinin [76]. Bradykinin levels in milk from cows with experimentally induced coliform mastitis have yet to be investigated, but the kinin peptides are presumed to be potent mediators of vasodilation, pain, and udder edema during clinical mastitis [77]. Similar to ITIH4, the function of clusterin during coliform mastitis also remains unclear. Prior reports have indicated that clusterin could possess anti-inflammatory properties [78, 79], but the only inference with regard to the role of clusterin in the bovine mammary gland is that clusterin could be associated with mammary gland involution, the clearance of cellular debris, or apoptotic cell death [80]. Though more focused follow-up analyses are required, the results of the comparative proteomic analyses of bovine milk during experimentally-induced clinical mastitis have provided information that could prove useful in the design and execution of future studies, and have shed light on potential candidates for the establishment of inflammatory biomarkers in bovine milk.

Figure 2. The temporal expression of the potentially novel biomarkers of mastitis apolipoprotein A-1, clusterin, kininogen-2, and ITIH4 (mean spectral counts ± standard error) detected in whey from mastitic milk [27]
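The summary statistic named in the caption, mean spectral counts ± standard error, can be computed per time point as sketched below. The function name and the counts are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the source does not state how the standard error was derived.

```python
import math


def mean_and_sem(values):
    """Return the mean and the standard error of the mean.

    Assumption: standard error = sample standard deviation / sqrt(n),
    with the sample variance computed using an n - 1 denominator.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are required")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Hypothetical spectral counts for one protein in whey from three cows
# sampled at the same time point after challenge.
mean_count, sem_count = mean_and_sem([1.0, 2.0, 3.0])
```

Repeating the calculation at each sampling time yields the points and error bars of a temporal expression profile.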

Comparison of Host Defense Proteomes of Human and Bovine Milk

Associations between proteins related to the host defense present in human and bovine milk have recently been reported [31]. The total numbers of proteins identified following proteomic analyses of both human and bovine milk differed by only one protein, and more than half of the reported proteins were detected in samples from both species [31]. Using LC-MS/MS, analyses of whey and the milk fat globule membrane from both human and bovine milk resulted in the identification of 44 proteins related to the host response in human milk fractions, and 51 host defense proteins in the equivalent bovine samples. Thirty-three defense-related proteins were identified in milk fractions from both species, and included several APPs, complement factors, members of the cathelicidin family of cationic AMPs, clusterin, immunoglobulins, osteopontin, several mucins, lactoferrin, calgranulins, as well as protease inhibitors [31]. Though similar proteins were detected in the human and bovine milk fractions, the abundance and distribution of the proteins differed between the two species. Specifically, the authors noted the higher prevalence of antibacterial proteins in the bovine milk fractions, compared to a marked increase in the detection and abundance of immunoglobulins in human milk fractions [31]. Possible explanations for the differential prominence of AMPs, including lactoperoxidase in bovine milk, and immunoglobulins in human milk, were higher thiocyanate levels in the ruminant diet and differences in immune system development in human babies and bovine calves, respectively [31].
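The overlap implied by the host defense counts above (44 human, 51 bovine, 33 shared) can be worked through with simple set arithmetic. The helper below and its field names are hypothetical; only the three input counts are taken from the text.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarize the overlap between two protein lists from their sizes."""
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed either list size")
    union = n_a + n_b - n_shared  # inclusion-exclusion
    return {
        "union": union,
        "only_a": n_a - n_shared,
        "only_b": n_b - n_shared,
        "shared_fraction_a": n_shared / n_a,
        "shared_fraction_b": n_shared / n_b,
        "jaccard": n_shared / union,
    }


# Host defense proteins: a = human milk fractions, b = bovine milk fractions.
summary = overlap_summary(44, 51, 33)
```

With these inputs, 62 distinct host defense proteins were reported across the two species, 11 only in human and 18 only in bovine milk fractions, and the shared proteins account for 75% of the human list and roughly 65% of the bovine list.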

Comparative Proteomic Analyses of Mastitis Pathogens

Compared to proteomic analyses of bovine milk and the genomic analyses of common food-borne and mastitis pathogens, only limited data exist regarding changes in the proteomes of common mastitis pathogens isolated from mastitic bovine milk. The limited number of proteomic analyses conducted on prevalent mastitis pathogens is most likely explained by the prior emphasis on genomic analyses, as well as the fact that comparative proteomics is still an emerging technology in food animal research and veterinary medicine. Alternatively, the dominance of comparative bovine milk analyses could be explained by the relative ease of sample collection and the ability to obtain much larger sample volumes.

To date, reports of the proteomic analysis of pathogens specifically isolated from cases of clinical mastitis have been limited to an investigation into the cell wall components of S. aureus isolated from bovine mastitis [34], proteomic analysis of changes in E. coli when grown in milk and laboratory media [36], serological proteome analysis of immunogenic proteins from a strain of S. aureus isolated from cows with sub-clinical mastitis [35], and the proteomic characterization of different bovine mastitis S. aureus isolates [37].

A focus on virulence factors and cell wall components of S. aureus has predominated in proteomic analyses of mastitis pathogens, due primarily to the fact that S. aureus is the most common Gram-positive etiological agent of contagious bovine mastitis. Additionally, the surface-associated secretory products, leukotoxins, and enterotoxins expressed by S. aureus represent potential targets for mastitis vaccine development. Surface components of mastitis pathogens are of particular interest due to the role these proteins play in the adhesion of invading bacteria to bovine mammary tissue, and the resulting potential for resistance to phagocytosis by host milk cells [81]. In an initial proteomic characterization of S. aureus isolated from bovine mastitis, the majority of the proteins identified by 2D-GE followed by MALDI-TOF MS were classified as either cell wall or membrane-associated proteins [34]. A specific discovery, however, was the detection of DnaK in bovine mastitis S. aureus isolates, a major surface-exposed antigen that could be involved in the recognition of epithelial cell receptors [34].

Serological proteome analysis of bovine mastitis S. aureus isolates utilized 2D-GE followed by MALDI-TOF MS, and resulted in the identification of the three highly immunogenic proteins DNA translocase FtsK, ribosomal protein S1, and a Tell-like protein [35]. Detection of DNA translocase FtsK was noteworthy, as the protein is required for DNA replication, recombination, and transfer within and between cells, and could potentially be involved in peptidoglycan synthesis [82, 83]. Likewise, the antigenic properties of ribosomal protein S1 have been reported [82], and Tell-like proteins were previously classified as both virulence and drug-resistance factors [84].

The most recent proteomic analysis of bovine mastitis S. aureus isolates also employed 2D-GE followed by MALDI-TOF MS for protein detection and identification. However, the study focused on the comparative analyses of 17 different S. aureus strains isolated from cows with clinical and sub-clinical mastitis [37]. The proteomic comparison of divergent bovine mastitis S. aureus strains identified 12 proteins that were conserved across the majority of the strains, including alkyltransferase-like protein, zinc metalloproteinase aureolysin, glycerophosphodiester phosphodiesterase, lipoteichoic acid synthase, pyruvate dehydrogenase, and stringent starvation proteins A and B [37]. Conversely, 15 proteins exhibited variable expression patterns across the isolates analyzed, including the serine proteases spore photoproduct lyase B, C, and F, and the superantigens toxic shock syndrome toxin-1, staphylococcal enterotoxin C, formyl peptide receptor-like-1 inhibitory protein, hyaluronate lyase precursor A1 and A2, and penicillin-binding protein 2 [37]. Data generated by the comparative proteomic analyses of the different S. aureus strains isolated from cows with clinical mastitis not only supported prior theories that superantigens possess immunomodulatory effects and play an important role in mastitis pathogenesis, but are also likely to factor into future endeavors to characterize host-specificity of S. aureus isolates as well as targets for vaccine development [37].

Though there are numerous reports of the proteomic analyses of E. coli isolates, one study in particular focused on changes in the E. coli proteome when a strain isolated from a case of bovine mastitis was grown in fresh milk versus laboratory media [36]. The comparative proteomic analyses of E. coli grown in different types of inhibitory media detected several proteins that could represent specific mechanisms by which E. coli evades host immune detection and survives in the bovine milk environment. Findings that indicated mechanisms by which the bacteria adapted in order to survive in bovine milk included the up-regulation of several proteins in E. coli grown in fresh bovine milk, including β-galactosidase, an enzyme involved in lactose metabolism, up-regulation of siderophores involved in iron-chelation, as well as increased expression of LuxS, an enzyme that is critical for the synthesis of bacterial hormone-like compounds involved in inter-bacterial communications [36]. Additionally, all of the identified flagellar proteins were down-regulated, including flagellin, which has been identified as a ligand for toll-like receptor 5. Down-regulation of flagellin could represent a way the bacteria escape detection by the host immune system [36].

Summary and Conclusion

The identification of all proteins that comprise the bovine milk proteome, the discovery and establishment of biomarkers indicative of clinical mastitis, and a more complete understanding of pathogen responses in the host environment during clinical mastitis infections have been hindered by several analytical challenges. Caveats to the use of proteomic strategies for the discovery of biomarkers of host and pathogen responses during mastitis include the lack of a universal sample preparation strategy capable of overcoming the complexity of the biological matrices both before and during disease, the wide dynamic range of proteins present in the bovine milk proteome, protein identification reports based on the use of different sample preparation strategies and instrument systems, each with associated strengths and weaknesses, and the intrinsic variability apparent across biological replicates during in vivo challenge models. Notwithstanding inherent drawbacks, the data generated during recent proteomic analyses of modulation in host and pathogen responses during clinical mastitis have expanded current knowledge of the biological mechanisms involved in bovine mastitis, and revealed several possible candidate biomarkers of the disease.

Though the future development and validation of a single biomarker specific to bovine mastitis may not be feasible, establishment of protein expression profiles indicative of the host response during clinical mastitis infections could prove valuable for early disease detection, as well as for distinguishing between local and systemic responses to infections caused by divergent bacterial pathogens. Protein biomarker profiles could also serve as a means to evaluate response to treatments, and could aid in the review and approval of new veterinary therapeutics. Because mastitis is associated with profound inflammation and pain that severely compromise the health and wellbeing of infected animals, the establishment of a protein biomarker or biomarker profile could also have a substantial impact on animal welfare when used for early detection. Likewise, should the use of proteomic strategies enable the direct detection and quantification of a protein biomarker in bovine milk, the biomarker could be used for on-line surveillance when incorporated into automated milking systems. Though further evaluations are needed and additional refinement of sample preparation and analysis strategies is required, the results of the proteomic analyses conducted to date on bovine milk and etiological agents of the disease have provided information that could prove useful in the design and execution of future studies of inflammatory biomarkers in bovine milk, and have identified targets for therapeutic and vaccine development that merit more focused hypothesis-driven analyses.