Introduction

Soybeans (Glycine max L. Merr.) are an excellent source of high-quality protein, polyunsaturated fats, vitamins, minerals, and other nutrients for both human food and animal feed (Krishnan 2000; Wang et al. 2003). Soy proteins are added to a wide range of food products, such as infant formulas, meal replacement drinks, and sports bars and beverages. In addition to increasing the nutritional value of food, soy proteins reduce cholesterol levels and thereby lower the risk of cardiovascular disease, as established by the US Food and Drug Administration and the Joint Health Claims Initiative in the UK, among other regulatory bodies.

Food allergies are an increasing health and wellness concern in most countries of the world, and North America is no exception. Currently, an estimated 2–6% of the North American population suffers from food allergies, and soybean is one of the eight foods that require labeling in many countries (EU, Japan, Canada, and USA). Due to its widespread use in the food and beverage industry, soybean is often a “hidden ingredient,” which adds to its significance as a food allergen (Zarkadas et al. 1999; Hefle 2001; L’Hocine and Boye 2007).

Soybean seed contains approximately 36–38% protein, dominated by two major storage proteins, glycinin and β-conglycinin, which account for ~40% and ~25% of the total protein, respectively (Fehr et al. 2003; Nielsen et al. 1989). Because glycinin and β-conglycinin greatly affect the nutritional value and quality of soybean products, these two storage proteins and the genes encoding them have been studied extensively.

Immunoglobulin E (IgE) Western blotting analysis has been widely used for the identification of food allergens (Batista et al. 2007; Xiang et al. 2008; Herian et al. 1990). However, Western blotting alone is often insufficient to accurately identify allergens because multiple proteins can occupy a single band or gel spot. In this study, we show how the use of high-resolution 2D gel electrophoresis combined with knockout soy lines and tandem mass spectrometry (MS) allows for allergen identification with much higher confidence. Liquid chromatography (LC)/MS/MS has evolved into a powerful tool for accurate mass measurement and produces molecular weight (Mw) information as well as sequence-specific fragmentation data for each peptide. These data are then used to search a protein database for the identification of proteins (Perkins et al. 1999). Recently generated soybean proteome databases based on high-resolution 2D gel separation, coupled with rapidly expanding genome sequence data, offer an attractive means to conclusively identify IgE-reactive proteins (Hajduch et al. 2005).

The three main objectives of this study were to use high-resolution gel electrophoresis and Western blotting to separate and locate the soybean proteins capable of binding IgE from soybean-allergic and sensitive patient serum, to use MS to confirm previously reported allergens and identify potential novel allergens, and to rank and attempt to draw conclusions on the prevalence and significance of these allergens among North Americans with soybean food allergies.

Materials and Methods

Materials

Immobilized pH gradient (IPG) Ready Strips™, pH 4–7 and 5–8, 7 and 17 cm, Precision Plus Protein™ Standards, BioSafe™ Coomassie, and Immun-Blot™ PVDF membrane (0.2 µm) were obtained from Bio-Rad Laboratories, Mississauga, ON. Immobiline DryStrips, pH 6–11, 7 and 17 cm, carrier ampholytes 3.5–10, IPG buffer 6–11, and ECL Plus horseradish peroxidase (HRP) substrate were from GE Healthcare, Baie d'Urfé, QC. Low-melting agarose SeaPlaque™ was obtained from BioWhittaker Molecular Applications, Rockland, ME. Acrylamide/bisacrylamide solution, urea, thiourea, glycine, glycerol, sodium dodecyl sulfate (SDS), 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), and Tris (base) were obtained from BioShop Canada Inc., Burlington, ON. N,N,N′,N′-tetramethylethylenediamine, 1,4-dithiothreitol (DTT), Brilliant (Coomassie) Blue R-250, iodoacetamide, trichloroacetic acid, and β-mercaptoethanol were obtained from Sigma-Aldrich Canada, Ltd., Oakville, ON. Blocking reagent was purchased from Roche Diagnostics, Laval, QC. Anti-human IgE HRP was purchased from Southern Biotech, Birmingham, AL. All other chemicals and reagents were of the highest purity commercially available.

Selection of Plant Materials

The soybean lines used in this investigation consisted mostly of Harovinton but also included null genotypes, with different glycinin and β-conglycinin subunit compositions developed by Dr. V. Poysa at the Greenhouse and Processing Crops Research Centre, Harrow, ON (Poysa et al. 2006; Zarkadas et al. 2007). Harovinton is a northern-adapted cultivar used as the quality standard for Canadian tofu-type soybeans.

Soy-Allergic and Soy-Sensitive Human Sera

A total of 32 human sera were acquired from two sources: nine Canadian sera from Drs. Susan Hefle and Steve Taylor (University of Nebraska) and 23 American sera from PlasmaLab International, Everett, WA (www.plasmalab.com). Clinical symptoms of allergy to soybean and ImmunoCAP scores for four legume allergens are listed in Table 1. All of the Canadian patients were allergic to soy; however, only 14 of the American patients were allergic to soy, and nine were sensitive as represented by their ImmunoCAP score. Patients with clinical symptoms of allergy were designated allergic, whereas those lacking symptoms but with high ImmunoCAP scores were designated sensitive.

Table 1 Clinical characteristics of the 32 soy allergic/sensitive patients

Soybean Protein Preparation for 1D Polyacrylamide Gel Electrophoresis

Soybean seeds were ground into a fine powder by using a standard coffee grinder or a mixer mill MM301 (Retsch, Newtown, PA www.retsch-us.com/us/). The ground seed (50 mg) was extracted directly into 1 mL Tris-buffered saline with EDTA (20 mM Tris, pH 7.6; 150 mM NaCl, and 1 mM EDTA) for 2–3 h at RT on a nutator shaker. The slurry was centrifuged at 18,000×g for 10 min, and the supernatant was kept at −20 °C until used. Total protein content was estimated by Bradford microassay (Bradford 1976) and typically ranged from 10–15 mg/mL.

Soybean Protein Preparation for 2D Gel Electrophoresis

Ground soybean (50 mg) was prepared in one of two ways. It was either resuspended directly in 1 mL of rehydration buffer [8 M deionized urea, 2 M deionized thiourea, 2% (w/v) CHAPS, 50 mM DTT, 0.2% (v/v) carrier ampholytes, and a trace of bromophenol blue] and centrifuged at 18,000×g for 10 min, after which the supernatant was kept at −20 °C until used, or extracted as previously described (Zarkadas et al. 2007) with the following modifications: all centrifugations were performed at 35,000×g, and the pellet was dried on ice in a fume hood for 20 min. Total protein content was estimated by a modified Bradford microassay, in which rehydration buffer was added to the standard curve samples and HCl (2.6 mM) was added to all samples.

1D Gel Electrophoresis

Soybean protein samples for gel application were prepared as previously described (Zarkadas et al. 2007) with the following modification: after heating at 98 °C, the samples were centrifuged for 10 s at ~1,000×g prior to application. Electrophoresis was performed on vertical slab gels (Mini-Protean II or Protean II xi, Bio-Rad) containing 12.5% or 15% acrylamide with a 4% stacking gel according to Laemmli (1970) at a constant voltage of 100–150 V (~1.5 h) for the Mini-Protean II and 50–100 V (~20 h) for the Protean II xi, until the tracking dye migrated to the bottom of the gel. Gels were either Coomassie-stained or transferred to polyvinylidene fluoride (PVDF) membrane for Western blotting. Coomassie-stained gels were scanned and dried as previously described (Zarkadas et al. 2007).

Isoelectric Focusing using Immobiline pH Gradient Strips

Isoelectric focusing was performed on 7 or 17 cm, pH 4–7 or 5–8, linear IPG ReadyStrips or on 7 or 17 cm, pH 6–11, Immobiline DryStrips as previously described (Zarkadas et al. 2007). Strips were hydrated with 75–100 µg protein (7 cm) in 125 µl of rehydration buffer or with 0.5–1.0 mg protein (17 cm) in 300 µl of rehydration buffer according to the manufacturer’s instructions. The 7-cm strips were focused at 20 °C at 250 V for 20 min, 4,000 V for 2.5 h followed by 4,000 V for 10,000 Vh for a total of ~14,000 Vh, whereas the 17-cm strips were focused at 20 °C at 250 V for 20 min, 10,000 V for 2.5 h followed by 10,000 V for 40,000 Vh for a total of ~50,000 Vh.

2D Gel Electrophoresis

2D electrophoresis was performed as previously described for the 7-cm strips (Zarkadas et al. 2007). For the 17-cm strips, 4 ml of equilibration buffers I and II were used, and the strips were embedded on top of 12.5% or 15% acrylamide gels for the Protean II XL (20 cm, Bio-Rad). Gels were Coomassie-stained and scanned as previously described or stained with Bio-Safe Coomassie as per the manufacturer’s instructions for MS.

IgE Western Blotting

Proteins separated on 1D or 2D gels were transferred to PVDF membranes at 100 V for 1 h in Towbin buffer supplemented with 20% methanol in a Mini-Trans-Blot Cell (Bio-Rad), dried for 1 h at RT, then blocked O/N at 4 °C in 1% blocking reagent in Tris-buffered saline Tween-20 (TBST; 20 mM Tris, pH 7.6, 150 mM NaCl, 0.05% Tween-20) in glass dishes. 1D membranes were cut into strips and hybridized in 10-ml polypropylene tubes with 1/25–1/1,400 of patient serum in TBST supplemented with 0.5% blocking reagent O/N at 4 °C on a nutator with gentle rotation, whereas 2D membranes were hybridized directly in the glass dishes. Membranes were vigorously washed in TBST 5 × 10 min. The secondary antibody, anti-human IgE HRP, was added at 1/2,000–1/2,500 in TBST supplemented with 0.5% blocking reagent for 1 h at RT, and the membranes were then washed as before. Blots were detected with ECL Plus substrate and scanned with a Molecular Dynamics Storm Imager 840 (GE Healthcare). PVDF membranes were Coomassie-stained post-detection and scanned as described previously.

Mass Spectrometry (LTQ)

Linear trap quadrupole (LTQ) MS was performed at the Proteomics Platform of the Eastern Quebec Genomics Centre as follows.

Protein In-Gel Digestion

Bands of interest were placed in 96-well plates and then washed with water. Tryptic digestion was performed on a MassPrep liquid handling robot (Waters, Milford, USA) according to the manufacturer’s specifications and to the protocol of Shevchenko et al. (1996), with modifications suggested by Havlis et al. (2003). Briefly, proteins were reduced with 10 mM DTT and alkylated with 55 mM iodoacetamide. Trypsin digestion was performed using 105 mM of modified porcine trypsin (Sequencing grade, Promega, Madison, WI) at 58 °C for 1 h. Digestion products were extracted using 1% formic acid, 2% acetonitrile followed by 1% formic acid and 50% acetonitrile. The recovered extracts were pooled, vacuum centrifuge dried, and then resuspended into 8 µl of 0.1% formic acid, and 4 µl were analyzed by MS.

Mass Spectrometry

Peptide samples were separated by online reversed-phase (RP) nanoscale capillary LC (nanoLC) and analyzed by electrospray MS. The experiments were performed with a Thermo Surveyor MS pump connected to an LTQ linear ion trap mass spectrometer (ThermoFisher, San Jose, CA, USA) equipped with a nanoelectrospray ion source (ThermoFisher, San Jose, CA, USA). Peptide separation took place on a PicoFrit column BioBasic C18, 10 cm × 0.075 mm internal diameter (New Objective, Woburn, MA), with a linear gradient from 2% to 50% solvent B (acetonitrile, 0.1% formic acid) in 30 min at 200 nL/min (obtained by flow-splitting). Mass spectra were acquired in data-dependent acquisition mode using Xcalibur software version 2.0. Each full scan mass spectrum (400–2,000 m/z) was followed by collision-induced dissociation of the seven most intense ions. The dynamic exclusion (30 s exclusion duration) function was enabled, and the relative collisional fragmentation energy was set to 35%.
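The data-dependent acquisition logic described above (top seven precursors per full scan, 30 s dynamic exclusion) can be sketched in a few lines of Python. This is only a conceptual illustration, not the instrument's firmware or the Xcalibur implementation; the class name TopNSelector and the 0.5 m/z exclusion window are assumptions introduced here and do not come from the study.

```python
from dataclasses import dataclass, field


@dataclass
class TopNSelector:
    """Conceptual top-N precursor picker with dynamic exclusion (hypothetical helper)."""
    top_n: int = 7                       # precursors fragmented per full scan (from the text)
    exclusion_s: float = 30.0            # dynamic exclusion duration in seconds (from the text)
    mz_range: tuple = (400.0, 2000.0)    # full-scan range in m/z (from the text)
    mz_tol: float = 0.5                  # ASSUMED width of the exclusion window, not in the text
    _excluded: list = field(default_factory=list)  # (m/z, expiry time) pairs

    def select(self, scan_time, peaks):
        """peaks: list of (mz, intensity) from one full scan; returns the chosen m/z values."""
        # Forget exclusions whose 30 s window has elapsed.
        self._excluded = [(mz, t) for mz, t in self._excluded if t > scan_time]
        chosen = []
        # Walk through the peaks from most to least intense.
        for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
            if len(chosen) == self.top_n:
                break
            if not (self.mz_range[0] <= mz <= self.mz_range[1]):
                continue
            if any(abs(mz - ex_mz) <= self.mz_tol for ex_mz, _ in self._excluded):
                continue  # fragmented recently, skip so weaker ions get a turn
            chosen.append(mz)
        # Everything fragmented now is excluded for the next 30 s.
        self._excluded.extend((mz, scan_time + self.exclusion_s) for mz in chosen)
        return chosen
```

The point of dynamic exclusion is visible when the same peak list is offered twice within 30 s: the second call passes over the ions already fragmented and selects less intense ones, which is how lower-abundance peptides in a gel band get sequenced.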

Database Searching

All MS/MS samples were analyzed using Mascot (Matrix Science, London, UK; version 2.2.0, http://www.matrixscience.com; Perkins et al. 1999). Mascot was set up to search the Uniref100 database (Schneider et al. 2005) (release 13.2) assuming the digestion enzyme trypsin. Mascot was searched with a fragment ion mass tolerance of 0.50 Da and a parent ion tolerance of 2.0 Da. Iodoacetamide derivative of cysteine was specified as a fixed modification, and oxidation of methionine was specified as a variable modification. Two missed cleavage reactions were allowed.
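The search settings above (trypsin, up to two missed cleavages, fixed carbamidomethylation of cysteine, variable oxidation of methionine, 2.0 Da parent ion tolerance) can be illustrated with a small in-silico digest. The sketch below is not Mascot and does not reproduce its probability-based scoring; it only shows which candidate peptides such settings admit for a given precursor mass. The function names are hypothetical, the residue masses are standard monoisotopic values, and the cleavage rule (after K or R, but not before P) is the conventional trypsin rule assumed here.

```python
import re

# Standard monoisotopic residue masses (Da). Cysteine carries the fixed
# carbamidomethyl modification (+57.02146 Da) from iodoacetamide alkylation.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919 + 57.02146,
    "L": 113.08406, "I": 113.08406, "N": 114.04293, "D": 115.02694,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333,
    "W": 186.07931,
}
WATER = 18.01056          # added once per peptide (free N- and C-termini)
MET_OXIDATION = 15.99491  # variable modification on methionine


def tryptic_peptides(sequence, missed_cleavages=2):
    """Cleave C-terminal to K/R (not before P), allowing missed cleavages."""
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


def peptide_mass(peptide, oxidized_met=0):
    """Neutral monoisotopic mass with a given number of oxidized methionines."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + oxidized_met * MET_OXIDATION


def candidates(sequence, observed_mass, tol=2.0, missed_cleavages=2):
    """Peptides (and Met oxidation states) within the parent ion tolerance."""
    hits = []
    for pep in tryptic_peptides(sequence, missed_cleavages):
        for n_ox in range(pep.count("M") + 1):
            mass = peptide_mass(pep, n_ox)
            if abs(mass - observed_mass) <= tol:
                hits.append((pep, n_ox, round(mass, 4)))
    return hits
```

A wide 2.0 Da parent window, as used for the ion trap data, admits many candidate peptides per precursor, which is why the fragment ion match (0.50 Da tolerance) and the downstream probability filters carry most of the discriminating power.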

Criteria for Protein Identification

Scaffold (version Scaffold-2_00_06, Proteome Software Inc., Portland, OR) was used to validate MS/MS based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 95.0% probability as specified by the Peptide Prophet algorithm (Keller et al. 2002). Protein identifications were accepted if they could be established at greater than 95.0% probability and contained at least two identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm (Nesvizhskii et al. 2003). Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony.
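The acceptance rule stated above can be written as a simple filter. The following is a minimal sketch, not Scaffold's implementation: the probabilities are taken as given (Scaffold derives them from the Peptide Prophet and Protein Prophet models), the function and argument names are hypothetical, and counting distinct peptide sequences toward the two-peptide minimum is an assumption made here.

```python
from collections import defaultdict


def accepted_proteins(peptide_hits, protein_probs, min_prob=0.95, min_peptides=2):
    """Apply the >95% peptide, >95% protein, >=2 peptides rule.

    peptide_hits  : iterable of (protein_id, peptide_sequence, peptide_probability)
    protein_probs : dict mapping protein_id -> protein probability
    """
    confident = defaultdict(set)
    for protein_id, sequence, prob in peptide_hits:
        if prob > min_prob:                      # peptide accepted at >95.0%
            confident[protein_id].add(sequence)  # ASSUMPTION: distinct sequences are counted
    return sorted(
        protein_id
        for protein_id, sequences in confident.items()
        if len(sequences) >= min_peptides
        and protein_probs.get(protein_id, 0.0) > min_prob
    )
```

Requiring two independent peptides guards against single-spectrum identifications, at the cost of possibly missing small or low-abundance proteins that yield only one detectable tryptic peptide.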

Mass Spectrometry (Q-TOF 2)

Quadrupole time-of-flight (Q-TOF) 2 MS was performed at the NRC Proteomics Facility, Institute for Biological Sciences as follows.

Protein In-Gel Digestion

Bands of interest were washed with 30% acetonitrile/100 mM ammonium bicarbonate then covered with an 8 µg/ml solution of modified porcine trypsin (sequencing grade, Promega, Madison, WI) at 37 °C overnight. Digestion products were directly analyzed by MS without extraction.

Mass Spectrometry

Peptide mixtures were placed in a 96-well plate and separated by LC (CapLC; Waters, Mississauga, ON) equipped with a Dionex Acclaim PepMap100 C18, 5 µm, 100 Å, 300 µm × 5 mm trap and a C18, 75 µm × 5 cm PicoFrit column (New Objective) at a flow rate of approximately 500 nl/min. The gradient used was 5–45% acetonitrile/0.1% formic acid in 35 min followed by 45–85% acetonitrile/0.1% formic acid in 3.5 min (column wash). The ions were then analyzed with a Q-TOF 2 hybrid mass spectrometer (Waters-Micromass, Mississauga, ON) at one scan/second. The mass spectrometer was set to operate in automatic MS/MS acquisition mode, and spectra were acquired on doubly, triply, and quadruply charged ions with the Masslynx software (Waters-Micromass, Mississauga, ON). The scan range in TOF-MS mode was m/z 400–1,600 (m/z 50–1,600 in MS/MS mode), and MS/MS was performed on the three most abundant multiply charged peaks for each TOF-MS scan.

Results

Patient Characteristics

Sera from 23 adult patients with clinical manifestations of soy allergy and nine adult patients exhibiting soy sensitivity (as expressed by their ImmunoCAP score) were obtained from two sources for this study. Case histories of these individuals are summarized in Table 1. The cohort consisted of 16 men and 16 women with a mean age of 34.2 years (range 18–64). This North American population represents a wide spectrum of diagnosed soybean food allergies and sensitivities. For allergic patients, ImmunoCAP scores provide an estimate of the severity of reactions against soybean, and these can be as high as >100 kU/L (CS2), whereas for sensitive patients, the highest value in this population was 10–14 kU/L (RM). As shown in Table 1, allergic symptoms and sensitivities to soybean and other legumes varied greatly and were generally not well correlated with ImmunoCAP scores. Patients with a very high ImmunoCAP score were not necessarily anaphylactic; however, most patients with anaphylactic symptoms to soy and peanuts tended to have a very high ImmunoCAP score.

1D screening with Patient Sera

Thirty-two soy-allergic/sensitive patient sera were individually used in Western blot analysis to screen the tofu cultivar Harovinton soybean proteins separated by 1D gel electrophoresis on 12.5% polyacrylamide gels (Figs. 1a, b). The serum dilution of each patient was adjusted (titers varying from 1/25 to 1/1,400) to allow visualization of only the most reactive bands. In some cases, even a very low serum dilution (that is, a high serum concentration) did not reveal any intense bands (HS 1/25, 08 1/50, TO 1/200). This result correlated with a low ImmunoCAP score (HS = 0.6, 08 = ND, TO = 4.0 kU/L, Table 1). In other cases, a high serum dilution revealed up to ten bands, some of which were very intense (CS2 1/1,400, DB 1/800, and DP 1/800). The latter two patients are also anaphylactic to soy. These three subjects share a common IgE reaction against both the α subunit of β-conglycinin (#3) and the A2 subunit of glycinin (#10) (see Table 3). Holzhauser et al. (2009) have recently shown that these two allergens, Gly m 5 (β-conglycinin) and Gly m 6 (glycinin), were potentially indicative of severe allergic reactions to soy in a European adult cohort. Most individual sera in our study recognized between four and six soybean proteins at varying intensities. However, some sera had an intense reaction to only one specific protein (LN, PK, and RC). Generally, the ImmunoCAP score was not well correlated with any one particular soybean protein, with the number of bands, or with the intensity of the reaction.

Fig. 1

Western blots of Harovinton soy proteins (15 µg) separated on 12.5% polyacrylamide gels and probed with either American (a) or Canadian (b) soy-allergic or sensitive patient sera and detected with anti-human IgE-HRP (1/2,000). Molecular weight ranges are marked on the left side, patient codes (see Table 1) are marked at the top, and serum dilutions used are noted on the bottom in each panel. A Coomassie-stained soy protein gel is illustrated on the extreme right. Red numbered bands were picked and sent for tandem MS analysis and identification (see Table 2, red). Red numbers also correspond to the allergens described in Table 3

In addition to the 32 clinically diagnosed patient sera screened in this study, we used the non-allergic, non-sensitive sera of four individuals who do not have any known food allergies or sensitivities. A very faint background reaction was observed in each of these controls against both the glycinin A3 subunit (#9) and the SBg7S low kilodalton subunit (#15). The Western blots in this study were detected with anti-human IgE as described in “Materials and Methods,” and no bands were observed when anti-human IgG, IgA, or IgM was used as the secondary antibody (data not shown), suggesting that the reactions to soybean proteins are exclusively IgE-mediated.

Four of the Canadian patient sera in this study (LM, CC, RC, and CS) were previously used in a Western blot analysis against soybean protein (Herian et al. 1990, 1992). Bands labeled with red numbers in Figs. 1a and b were excised from large 1D and/or 2D gels and subsequently identified by tandem MS (see Table 2, red). By doing so, we determined that the low Mw protein band (20 kDa) that Herian et al. (1990, 1992) detected is in fact a late embryogenesis abundant (LEA) group III protein (#16).

Table 2 Identification of allergens by tandem MS

Identification of Soy Allergens with Knockout Lines

Soy proteins extracted from Harovinton and various storage protein knockout lines were also separated on a 12.5% 1D gel and probed with patient serum to identify allergens. Figure 2 illustrates an example of this method of analysis in which serum from one patient reacted to the glycinin subunit A3 on a Western blot. On this blot, the glycinin A3 subunit was missing from the soybean lines in lanes 2, 3, 6, 7, and 9 (lane 1 contained the molecular weight markers). It was present in the soybean lines in lanes 4, 5, 8, and 10, which showed conclusive bands at the correct molecular mass of A3. By using protein extracts from seeds lacking various combinations of glycinin subunits (A2, A3, A4, and A5), we demonstrated that LM patient serum reacted to only one member of this closely related family of proteins. This highlights the power of knockout lines as a tool for allergen screening. Although many knockout lines of the major storage proteins glycinin and β-conglycinin have been created, few knockout or null lines are currently available for other soybean proteins (examples include Kunitz trypsin inhibitor (KTI), lipoxygenase, lectin, and P34).

Fig. 2

a Coomassie-stained PVDF membrane after transfer of various glycinin and β-conglycinin soybean knockout lines separated by 12.5% polyacrylamide gel. b Western blot of membrane in a probed with LM serum (1/400) and detected with anti-human IgE-HRP (1/2,000). Lane 1 Precision Plus markers, lanes 2–10 15 µg of soy protein extract. 2 α′A3 null, 3 α′A3 null, 4 α′A4A5 null, 5 α′A2 null, 6 A1A2A3A4A5 null, 7 α′A1A2A3 null, 8 A1A2A4A5 null, 9 α′A1A2A3A4A5 null, 10 Harovinton
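The reasoning behind the knockout-line screen in Fig. 2 can be made explicit with a short sketch. It takes the lane genotypes from the caption above, predicts in which lanes each subunit should be present, and asks which subunit's presence pattern matches the lanes where the serum reacted. The helper names are hypothetical, and the sketch assumes that each "null" label lists every missing subunit and that Harovinton carries all of them.

```python
import re

# Lane genotypes as listed in the Fig. 2 caption (lane 1 holds the markers).
LANES = {
    2: "α′A3 null", 3: "α′A3 null", 4: "α′A4A5 null", 5: "α′A2 null",
    6: "A1A2A3A4A5 null", 7: "α′A1A2A3 null", 8: "A1A2A4A5 null",
    9: "α′A1A2A3A4A5 null", 10: "Harovinton",
}
SUBUNITS = ("α′", "A1", "A2", "A3", "A4", "A5")


def missing_subunits(genotype):
    """Subunits absent from a line; ASSUMES the 'null' label is exhaustive."""
    if not genotype.endswith("null"):
        return set()  # wild-type cultivar, nothing missing
    return set(re.findall(r"α′|A\d", genotype))


def lanes_expected_to_show(subunit):
    """Lanes in which a band for this subunit should appear."""
    return sorted(lane for lane, genotype in LANES.items()
                  if subunit not in missing_subunits(genotype))


def candidate_subunits(reactive_lanes):
    """Subunits whose presence pattern matches the lanes where IgE bound."""
    observed = sorted(reactive_lanes)
    return [s for s in SUBUNITS if lanes_expected_to_show(s) == observed]


if __name__ == "__main__":
    # IgE-reactive lanes reported in the text for LM serum.
    print(candidate_subunits([4, 5, 8, 10]))
```

With this lane layout, every subunit has a distinct presence pattern except A4 and A5, which are missing from the same lines and therefore could not be told apart from each other by this panel alone.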

2D Screening of Soybean Allergens with Patient Sera

Figure 3 illustrates four examples of 2D gel screening with patient sera followed by tandem MS to identify allergens (Table 2). This analysis allowed the identification of two previously unreported soybean allergens, a class I low Mw heat shock protein (HSP) and an LEA group III protein, in Fig. 3b and d, respectively. It also confirmed three known allergens: glycinin A3 (Fig. 3f) and both the α and β subunits of β-conglycinin (Fig. 3g).

Fig. 3

Coomassie-stained 2D gels of Harovinton soybean proteins (75 µg) and IgE Western blots. a 15%, 5–8 2D gel; b membrane transfer from a probed with DP serum (1/800); c 15%, 6–11 2D gel; d membrane transfer from c probed with RC serum (1/100); e 12.5%, 4–7 2D gel; f membrane transfer from e probed with LM serum (1/300); g membrane transfer from e probed with CS serum (1/100). Anti-human IgE-HRP was used at 1/2,000 in all blots except for d where it was used at 1/2,500. Molecular weight and pH ranges are indicated on the left and at the top of each gel/blot, respectively. Identified allergens are indicated with an arrow

Although the novel LEA allergen was detected on both 1D (Figs. 1a, b) and 2D (Fig. 3d) gels by patient serum screening, HSP was only detected on a 2D gel (Fig. 3b) by one patient serum (nine patients screened). Because of its low relative abundance, as shown on a Coomassie-stained 2D gel (Fig. 3a), HSP may be masked by other more abundant proteins at that location (~16 kDa) on a 1D gel, which could explain why it has never been detected on a 1D blot with patient serum.

Soy Protein Identification by Tandem MS

All MS data in Table 2 originated from large 1D or 2D gels, which allowed for better resolution of the separated proteins. As expected, the Mascot score increased with the size of the protein identified, as more peptides could be identified.

There was a good match between the predicted (Table 2) and the observed (Fig. 3) Mw and pI of these proteins, which suggests that the correct protein was identified by MS.

Frequency of Patient Reactions Against Soy Proteins

The entire cohort of North Americans was used to calculate the frequency of reactions to individual soybean proteins (Table 3). The results are expressed as the frequency, among the 32 patients, of positive reactions to specific or unidentified proteins in soybean seed as shown in Fig. 1a and b. The most prevalent proteins detected by IgE were glycinin subunit A3, followed by an unidentified 30 kDa protein, Gly m Bd 28K, SBg7S low kilodalton subunit, an unidentified 140 kDa protein, SBg7S, P34/lectin, glycinin subunit A2, LEA group III protein, and the sucrose-binding protein homologue S-64. It is not surprising to see an elevated frequency for A3 and the SBg7S low kilodalton subunit, as these allergens were also weakly detected in non-soy-allergic individuals. Both unidentified proteins and S-64 reacted frequently, yet rather faintly. Therefore, Gly m Bd 28K, P34/lectin, glycinin A2, and LEA were the allergens that reacted with the highest frequency and intensity in our study. P34 and lectin were grouped together because it was not possible to distinguish between the two allergens on 1D gels. Only lectin was detected by tandem MS analysis (Table 2), probably due to its higher abundance compared to P34.

Table 3 Frequency of patient reactions on 1D gels and candidate soy allergen
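The frequencies in Table 3 amount to counting, for each numbered band, how many of the sera gave a positive reaction and dividing by the cohort size. A minimal sketch of that tally follows; the function name and the input layout are hypothetical, and the example patient codes and band assignments in the usage lines are placeholders, not data from this study.

```python
def reaction_frequencies(reactions, n_patients):
    """Rank bands by how many sera recognised them.

    reactions  : dict mapping patient code -> set of band labels with a positive reaction
    n_patients : cohort size used as the denominator
    Returns a list of (band, count, percent) sorted by descending count, then name.
    """
    counts = {}
    for bands in reactions.values():
        for band in bands:
            counts[band] = counts.get(band, 0) + 1
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(band, n, round(100.0 * n / n_patients, 1)) for band, n in ranked]


if __name__ == "__main__":
    # Placeholder input for illustration only (NOT the study's results).
    example = {
        "P1": {"A3", "LEA"},
        "P2": {"A3"},
        "P3": set(),
        "P4": {"LEA", "Gly m Bd 28K"},
    }
    for band, n, pct in reaction_frequencies(example, n_patients=len(example)):
        print(f"{band}: {n} sera ({pct}%)")
```

A tally of this kind records presence or absence only. As the text notes, frequency and band intensity are separate considerations, so a frequently but faintly recognised band would need an additional intensity score to be ranked appropriately.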

These findings seem to cast doubt on the previous reports that the soybean P34/Gly mBd 30K allergen is the most immunodominant allergen in the population (Ogawa et al. 1991; Joseph et al. 2006) since several other proteins are as significant, if not more significant, in this cohort. Some early reports of soy allergen identification by Western blot have used sera from children (Ogawa et al. 1991); however, our study has demonstrated that, although this protein was frequently detected by adult IgE, it was not responsible for intense reactions (see band 11 on Fig. 1).

Discussion

We have demonstrated that high-resolution 1D and 2D gel separation of soybean seed proteins prior to immunodetection with human serum IgE, soybean knockout lines, and tandem MS analysis, when used in conjunction, are powerful methods for allergen identification. By comparing non-allergic/non-sensitive sera with those of a cohort of 32 soy-allergic/sensitive patients, we detected significant soybean allergens (Fig. 1). We demonstrated that some soybean proteins were frequently recognized by serum IgE (in more than 50% of patients; see Fig. 1 and Table 3), and these included the G5 glycinin A3 subunit, a 30-kDa unidentified protein, Gly m Bd 28K, Basic 7S globulin precursor (SBg7S) and its low Mw subunit, a 140-kDa unidentified protein, P34/lectin, G2 glycinin A2 subunit, a 15-kDa LEA group III protein, and sucrose-binding protein homologue S-64.

As previously mentioned, the A3 and SBg7S may have been more frequently detected in soy-allergic and sensitive sera partly because of their weak background reaction with non-allergic non-sensitive sera. Therefore, the reported frequency in Table 3 may be overestimated; however, in a few soy-allergic patients, the A3 band was significantly more intense than background (Fig. 1) and should still be considered a significant allergen. In the case of SBg7S, it was detected by MS, once as the full-length protein of 44 kDa (band 8, Fig. 1) and once as the low kilodalton subunit (band 15, Fig. 1). The fact that it was recognized in two independent experiments raises the confidence level that it is, in fact, a true novel allergen. Moreover, both SBg7S forms (16 and 44 kDa) were often simultaneously recognized by the same patient IgE (see CO, CY, RJ, RM, DB, AG, and LM in Fig. 1).

The early study by Ogawa et al. (1991) was instrumental in ranking the frequency of IgE-binding soybean proteins among a group of Japanese children (mean age 6 years) with atopic dermatitis. Although they reported a high frequency of reactions to the protein Gly m Bd 30K (later described as P34), in the present study of adults (mean age, 34 years) with known soybean allergies/sensitivities, the frequency and intensity of reaction to P34 were not superior to those of some other proteins (Fig. 1, Table 3). Because lectin and P34 co-migrated on the gel and could not be distinguished in Fig. 1, the frequency of reaction reported in Table 3 is the sum of serum IgE binding to both. A similar failure to detect P34 by MS has recently been reported (Batista et al. 2007). Therefore, it is possible that, individually, each of these allergens reacts with sera from fewer than 50% of this population. In fact, preliminary results of patient serum screening on 2D gels would seem to indicate that these two allergens are not major or are not being detected by 2D gel screening (data not shown). We are presently making use of a recently discovered P34-null germplasm to further investigate these North American patients (Joseph et al. 2006). Although this group of patients is relatively small in number, it suggests a trend with regard to the soybean proteins most likely to provoke allergic reactions among North Americans. The list of allergenic proteins and their frequencies in this cohort is presented to demonstrate the results of the largest screening of soy-allergic patients to date, but it is intended to be neither complete nor absolute. It is very difficult to predict with any degree of certainty whether these trends would translate to the overall soy-allergic population.

While several studies have shown that IgE from soy-sensitive patients reacts primarily to the glycinin and β-conglycinin fractions (Holzhauser et al. 2009; Ogawa et al. 1995), we have shown in this study that a high percentage of patient IgE also reacts to non-storage proteins, including a seed maturation protein known as LEA protein. In fact, two patients in this cohort had an IgE reaction only to LEA protein (Fig. 1a, PK; Fig. 1b, RC). LEA proteins are known to play a role in desiccation and stress tolerance in many plant seeds, allowing them to survive the dry storage phase. This particular type of LEA protein (group III) and the gene encoding it have been well characterized by Shih et al. (2004). We are continuing the characterization of this allergen by epitope mapping in an effort to better define regions of allergenicity.

Several studies have reported that the Gly m Bd 28K is an allergen (Hiemori et al. 2004). It has been determined that both the 23 kDa C-terminal domain and the 28 kDa N-terminal domain of this protein were detected by serum IgE. Our results, as shown in Fig. 1, band 14, indicate that Gly m Bd 28K is a protein of ~24 kDa, which is consistent with this previous report. MS analysis of this spot (Table 2) also indicated that both domains bind to serum IgE. A similar number of peptides were detected by MS for both the N-terminal (seven) and the C-terminal (nine) domains (Table 2). Upon careful observation of Fig. 1a, there appears to be a doublet in the ~24 kDa vicinity, which could contain both domains of Gly m Bd 28K.

Another seed maturation protein, which was identified by 2D gel Western blot analysis in this study, was the PM31 HSP (Fig. 3b). This protein was only detected by the IgE of one patient in a 2D Western blot; therefore, it is somewhat premature to speculate on its relative significance, but at this point, it does not appear to be a major allergen. However, other HSPs have been demonstrated to bind human IgE from patients sensitized to Penicillium (Shen et al. 1997), patients with cystic echinococcosis (Ortona et al. 2003), and patients sensitized to corn and wheat dust (Chiung et al. 2000).

Kunitz trypsin inhibitor (KTI) has been previously identified as a soy allergen (Moroz and Yang 1980). Despite the fact that it is a well-resolved protein on a pH 4–7 2D gel (pI ~4.7, Mw ~18 kDa), KTI was never detected by these 32 patient sera in IgE Western blot analysis. KTI null genotypes are available to further screen against this North American cohort; however, it appears that it is a minor allergen at best. Likewise, the basic subunit of glycinin G2 (B2) has also been previously reported as an allergen (Helm et al. 2000); however, in this study, it was not possible to detect it by patient sera on either 1D or 2D IgE Western blots.

We demonstrated in this study that A2 and A3 glycinin subunits are some of the most reactive to patient IgE in Western blots (Figs. 1a, b, 2, and 3). These findings are curious given the high amino acid sequence homology (78%) between the A3 and A4 subunits. One would have expected to also see the A4 subunit detected with patient serum; yet, the A4 subunit, which is well-resolved and identified on a pH 4–7 2D gel (Zarkadas et al. 2007), was never detected in IgE Western blots. We have not yet screened gels with appropriate pH and M w ranges to study the reactivity of the A5 subunit, which shares even higher amino acid homology (85%) with A3.

Another example of the exquisite specificity and sensitivity of patient serum IgE toward soybean proteins was the α subunit of β-conglycinin, which was recognized by serum IgE of patient CS, whereas the highly homologous (82%) α′ subunit was not (Fig. 3). Epitope mapping and alanine scanning experiments may allow us to elucidate the antigenic regions, whether linear or conformational epitopes, of these and other allergens that are responsible for this high degree of specificity. Ogawa et al. (1995) have previously observed this same phenomenon with serum IgE reactions toward the α subunit of β-conglycinin.

A limitation of performing these patient surveys on small-format 1D gels is the limited ability to conclusively resolve allergenic proteins; resolution would be considerably better on large-format gels, which would also allow for better alignment of individual blots and better protein identification. Although this is a preliminary screen of this cohort, we are confident that the large-format gels used to isolate proteins for MS have correctly identified the proteins listed in Tables 2 and 3.

Another of the limitations of 1D gel electrophoresis for MS analysis is the fact that the most abundant proteins in the band are most often identified; however, the most abundant protein in the mixture is not necessarily the most allergenic. Using large format 2D gels of high resolution (those whose pH ranges provide the best separation for the protein under investigation) may help alleviate this problem and reduce the chances of cross-contamination but may also reduce the amount of protein per spot and, therefore, is likely to return a lower Mascot score, as was observed herein.

By using soybean protein knockout lines where possible, high-resolution 2D gel Western blots with well-characterized allergic patient serum, coupled with MS analysis and continually increasing genome sequences in public databases, the confidence level of allergen identification should rise. The simultaneous use of these three methods could become indispensable to conclusively identify new allergens with the utmost degree of confidence.