Development and adoption of genetic engineering to introduce a foreign gene to create a functionally distinct, genetically modified (GM) plant has been one of the most rapidly adopted technologies in the history of agriculture [1]. The technology allows the introduction of one or a few individual genes from any living organism into the genome of the recipient plant, whereas traditional breeding requires sexually compatible gene sources and acceptors and transfers thousands of genes, including undesirable genes [1]. Global cultivation of commercial GM crops began in 1996 with approximately 4.3 million acres, then expanded to more than 335 million acres (134 million hectares) by 2009 [2]. Since 1994, regulators in the United States have approved many GM events of insect pest–resistant corn, cotton, and potatoes; herbicide-tolerant canola, corn, cotton, and soybeans; and virus-resistant papayas, potatoes, and squash that have been evaluated for food, feed, and environmental safety [3]. Most of the genes introduced into GM plants encode proteins that are expressed at low levels (10–500 ppm; <0.2% of total protein) and either have a history of safe use or are similar to commonly consumed proteins [4]. Insecticidal cry 1A genes were transferred from a bacterium that has been commonly used by organic farmers. The glyphosate (herbicide)-tolerant enzyme introduced into soybeans is very similar to enzymes in nearly all bacteria and green plants. These plants require fewer pesticide applications or provide options for weed and disease control. A few introduced genes encode an antisense RNA that blocks viral infection or inhibits expression of specific endogenous plant genes without expression of any new protein [5]. Approximately 15.4 million farmers grew approved GM crops on 148 million hectares in 29 countries in 2010, representing 81% of soybean, 64% of cotton, 29% of corn, and 23% of canola cultivation [2].

Development and testing of a new GM crop typically requires 8 to 12 years, including more than 4 years of safety and environmental testing, before regulatory approval and commercial release. Because there is no global approval and registration process for foods derived from GM organisms, approvals are country specific, and testing requirements sometimes differ. However, the Codex Alimentarius Commission (Codex), under the Food and Agriculture Organization and the World Health Organization of the United Nations, attempts to harmonize food safety testing to protect human health while facilitating international trade. The Codex food safety guideline for GM plants was published in 2003 and includes a comprehensive assessment of potential allergenicity [6]. Regulations in the United States follow Codex guidelines and are administered by the US Food and Drug Administration (FDA), which has overall responsibility for food safety, while the Environmental Protection Agency shares responsibilities in evaluating food safety of GM crops that produce insecticidal proteins or antiviral or antifungal products [7]. The European Food Safety Authority (EFSA), a scientific advisory panel for European regulators, published food safety guidelines for GM plants in 2004 [8]. However, the European Union (EU) has been slow to approve any GM crops, even though tests demonstrate that they meet international safety standards and have been marketed in other countries for many years. Currently, the EU only allows cultivation of one potato and two corn (maize) GM events, and the importation of food or feed from less than half of the GM varieties accepted in the United States [9].

Testing and registration requirements should be clear. However, during the past 14 years, we have observed an increasing number of publications describing scientifically unsound studies, as well as demands by international regulators (or scientific advisors) for studies and tests that are not based on clear principles of risk assessment. In our opinion, these tests do not improve safety but do increase the chance of rejecting potentially useful products.

Food Safety

No food is safe for all consumers at all levels of consumption, even when prepared by traditional methods. Approximately 1% of the global population has celiac disease and must avoid foods containing wheat, barley, or rye grains. Those with food allergies must avoid the specific food(s) that elicit their symptoms. Navy beans and kidney beans (Phaseolus vulgaris) must be boiled for approximately 10 min to inactivate the abundant lectin phytohemagglutinin in order to avoid intestinal bleeding and discomfort. The guiding principle for GM safety since the early 1990s has been that foods prepared using the new GM variety should be as safe as those prepared using the non-GM counterpart [6].

Risk assessment for food safety involves four main components: hazard identification, hazard characterization, exposure assessment, and risk characterization [10, 11]. The primary focus for GM crop safety is on evaluating the potential toxicity of the protein or metabolites of GM enzymes and the allergenicity of the introduced protein(s) based on the extensive historical knowledge of toxins and allergens and proven test methods [12••, 13••]. The risk assessment process should provide information to guide regulators regarding identification of any hazard (allergen or cross-reactive protein), characteristics of the at-risk population (those with specific allergies), risk severity, and information to estimate exposure [14, 15].

Allergenicity Assessment

Essentially any protein could be allergenic for at least one person if administered with an IgE-inducing adjuvant. However, everyone is exposed to hundreds of thousands of proteins during his or her lifetime that do not sensitize or elicit allergic reactions. The risk associated with an allergen is for those who have been sensitized to a protein causing the production of protein-specific IgE antibodies that can elicit an allergic reaction. Individuals who are not sensitized are not at risk, and some who are sensitized will never experience an allergic reaction. The risk of allergy from traditional foods can only be managed if allergic individuals and their families know the identity of the food that causes the allergy, if the foods they consume are prepared without the allergen, and if they have reliable information about food ingredients in processed foods and in restaurants to enable avoidance. To be effective, the allergenicity assessment of GM crops must focus on the same risks of food allergy as posed by traditional foods. The primary goal is to prevent the transfer of an existing allergen or celiac-inducing protein into a new food source and to protect those who are allergic or have celiac disease.

The overall prevalence of food allergy in North America is approximately 6%, while less than 1% are allergic to peanut, one of the most commonly cited and potent allergenic foods [16]. Food allergy is caused by a small number of proteins in even the most commonly allergenic foods, such as peanuts, tree nuts, fish, milk, or eggs. People allergic to the same food may react to different proteins. It is not possible to predict who will become allergic or to which foods or proteins they will become allergic. Many countries now require clear labeling of a food’s major ingredients as well as refined ingredients from the major allergenic sources. Food companies go to great lengths to segregate and label allergenic ingredients that are included in packaged foods. However, accidents still occur from foods obtained from commercial packages, restaurants, and homes of friends and family members. Studies have estimated that 100 to 200 fatal reactions occur in the United States when allergic consumers are unexpectedly exposed to the food that causes their allergy (usually peanut or a tree nut), and that there are more than 100,000 visits to a hospital emergency department, in addition to mild reactions for which medical care was not sought [16, 17].

The Codex guideline on the allergenicity assessment of GM plants begins with an evaluation of the history of human exposure to the source of the gene and considers whether there is evidence of allergy associated with the protein or the source [6, 13••]. A similar risk is posed by proteins that are similar enough in sequence and structure to a food allergen that IgE from the allergic individual can bind and elicit an allergic reaction upon first exposure to the similar protein [13••, 18, 19]. Genes taken from wheat or relatives of wheat are also evaluated to ensure that they will not induce celiac disease [6]. The amino acid sequence of the new protein is compared with those of known allergens to determine whether the protein is already known to be allergenic or is so similar to an allergen (probably >50% identical) that cross-reactivity is possible. If the source or sequence comparison suggests possible risk, sera from appropriately allergic individuals should be tested for specific IgE binding and, if indicated, biological activity [13••]. In addition, the characteristics commonly associated with proteins that are important food allergens are evaluated to consider possible de novo sensitization or elicitation (if sensitized). Similar evaluations were performed prior to 2003 based on earlier guidelines [7, 16]. The assessment strategy worked well to identify and stop development of the only publicly identified potential product to date that would have presented a clear risk of allergy to a population of tree nut–allergic individuals [20]. However, some uncertainties exist. Allergies to some sources are so rare that donors cannot be identified to perform serum tests. Furthermore, the predictive values of biochemical characteristics, including protein stability in pepsin and abundance, are not known. There are stable and abundant dietary proteins that do not cause allergy, as well as labile and moderately abundant proteins that do cause food allergy. 
Although improvements are possible, it is also important to consider that when any individual begins consuming a traditional food, there is a risk of sensitization and allergy.

The current allergenicity assessment could be strengthened by an explanation of probable hazards and a description of who is at risk and how exposure and risk assessment should be evaluated. The lack of clarity has led some scientists and regulators to search for methods enabling perfect predictions or to demand additional unproven tests or modification of test methods without validation. Additionally, the concept of “substantial equivalence” of GM plants relative to similar varieties of non-GM plants is often interpreted to mean any statistically significant difference is unacceptable. In the context of allergenicity, scientists and regulators want to see results showing no increased expression of endogenous allergens at least for commonly allergenic crops such as soybean [6, 8]. However, natural variation across non-GM varieties is not documented, and the influence of environmental factors often will yield modest to marked differences in the expression of a wide variety of proteins, including allergens. More importantly, someone who is allergic to a food should avoid consumption of that food. Those without allergy can consume the food at will. Where is the risk?

Source of the Gene

The identity of the source of the introduced gene must be revealed, and literature related to human exposure and risk should be thoroughly reviewed. The information is essential to consider whether the protein may be an allergen, or whether there is an established “history of safe use” (HOSU) for the source and the protein. The HOSU is defined differently by various regulatory agencies [21]. The EU (1997) requires documentation of extensive use prior to May 1997 for HOSU. The FDA (1997) defines foods or food ingredients as GRAS (generally recognized as safe) only if commonly consumed prior to January 1, 1958. In our opinion, the definition of HOSU for the allergenicity assessment should be based on more current data and knowledge of allergy and allergens. Few allergens were well-defined before 1990. The evaluation should address two questions. First, was the gene encoding the GM protein isolated from an important source of allergy, including foods, pollen, arthropods, animal dander, latex, insect venom, or mosquito saliva? Second, is there documentation that humans have been exposed to the protein without any evidence of allergenicity?

Many allergenic proteins have been identified in the past 20 years, and recent publications typically describe more definitive evidence compared with 1990. Although allergenic protein sources contain hundreds to thousands of proteins, only a few proteins from any one source are demonstrated to be allergens. Many are now listed in allergen-specific databases (eg, AllergenOnline from the Food Allergy Research and Resource Program at the University of Nebraska [22] and the International Union of Immunological Societies Allergen Nomenclature database [23]).

Bioinformatics Search Comparison to Allergens

Allergens and cross-reactive proteins cannot be identified by structure or sequence similarity alone. Although the biochemical structure and function of many allergenic proteins may be classified into a few protein families, many members of those families are not allergens, and many allergens belong to diverse families [24]. Although sequence comparisons will not predict three-dimensional structures or IgE-binding sites (epitopes), it is clear that similar overall three-dimensional structures require relatively good sequence conservation and that the sequence alignment tools of BLAST (Basic Local Alignment Search Tool) or FASTA can easily match proteins that are sufficiently similar to suspect shared cross-reactivity [25]. Overall sequence identity greater than 70% suggests cross-reactivity, whereas proteins sharing less than 50% identity are unlikely to share IgE cross-reactivity [26]. Although proteins with identity matches of less than 50% are unlikely to share IgE binding, lower alignment score criteria have been used as a threshold to trigger serum IgE testing to evaluate potential cross-reactivity [27••]. An appropriate assessment requires a well-curated database as well as an alignment tool and appropriate criteria [25].
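
The identity bands described above can be written as a small lookup. The following is only a sketch: the function name `cross_reactivity_band` and the label "indeterminate" for the 50% to 70% range are assumptions made for illustration, since the text characterizes only the regions above 70% and below 50%.

```python
def cross_reactivity_band(percent_identity: float) -> str:
    """Map overall percent identity to the rough bands described in the text.

    Assumption: the 50-70% range is labeled "indeterminate" here because the
    text does not characterize it.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity > 70.0:
        return "cross-reactivity suggested"
    if percent_identity < 50.0:
        return "cross-reactivity unlikely"
    return "indeterminate"


if __name__ == "__main__":
    for value in (85.0, 60.0, 40.0):
        print(value, cross_reactivity_band(value))
```

A band of this kind only prioritizes candidates for serum IgE testing; it does not establish or exclude cross-reactivity by itself.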

Allergenic Protein Databases

The 1996 publication by Metcalfe et al. [18] only listed 150 allergenic proteins. Version 11 (February 2011) of our peer-reviewed AllergenOnline database lists amino acid sequences of 1,491 allergenic proteins, including isoforms from 553 taxonomic-protein groups (protein types listed by species) [22]. The sequences were selected from more than 24,000 sequences labeled with the keyword “allergen” from the 13 million protein sequence entries in the public protein database at the National Center for Biotechnology Information [28]. The review of publications describing the proteins and evidence of IgE binding and allergenicity is used to remove National Center for Biotechnology Information entries that are from allergenic organisms but without proof of allergy for the protein, and those listed as allergens only because of sequence similarity to an allergen. The other available databases are not updated regularly or do not list clear selection criteria for inclusion [25]. Some scientists have suggested searching for specific IgE-binding sequences (epitopes) rather than whole sequences, but most proposed “epitopes,” including many of those in the Immune Epitope Database from the La Jolla Institute of Allergy and Immunology [29], have not been verified as IgE or allergenic epitopes.

Bioinformatics Search Tools and Criteria

Metcalfe et al. [18] proposed using a FASTA local sequence alignment tool developed by W. Pearson of the University of Virginia to compare the amino acid sequence of the introduced GM protein with those of known allergens to identify probable homologues. Then alignments were further evaluated to identify matching segments of eight contiguous amino acids as plausibly representing shared epitopes (B-cell–/IgE-binding sites or T-cell epitopes) that might elicit cross-reactions. They recommended using sera from individuals allergic to the matched allergen to test for possible cross-reactivity. However, the description of their method was ambiguous, and subsequent regulatory studies have typically used a simple “word” search to identify matches of eight contiguous amino acids without consideration of homology, which likely overpredicts potential cross-reactivity [25]. A later recommendation by an expert panel suggested reducing the length of the identity matches from eight to six, but without any validation [30]. Several authors challenged the short amino acid identity match as providing primarily false-positive identities as well as missing potentially cross-reactive matches [27••, 31, 32]. A slightly less conservative criterion of greater than 35% identity over 80 or more amino acids using FASTA or BLASTP was also proposed by the Food and Agriculture Organization and the World Health Organization of the United Nations panel and was accepted by Codex [6, 33]. The FASTA approach with a criterion of greater than 35% identity identifies fewer false-positive matches and thus far does not seem to have missed any known cross-reactive matches [13••, 33]. Although some regulators continue to rely on the short amino acid criterion, there are no data to demonstrate a positive predictive value in the absence of extensive homology. Regulators in Asia have insisted that three GM products approved in the United States be tested for serum IgE binding due to short amino acid matches. 
Serum IgE binding did not show cross-reactivity between Cry 1 F and the house dust mite allergen Der p 7 that shares an isolated seven amino acid match [34]. Two other GM proteins have been tested for serum IgE cross-reactivity based on isolated eight amino acid matches, and results for both studies were negative (personal communication with one developer; unpublished results from our laboratory). However, those studies were complicated, time consuming, and expensive, and results of IgE binding are often difficult to interpret because of low-level nonspecific or irrelevant IgE “binding” [35–37]. Because the purpose of the bioinformatics search is to identify matches that may require further evaluation by IgE binding, full-length sequence evaluation or an increase in the threshold from 35% identity toward 50% for the 80 amino acid alignment should be considered [27••, 38].
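
To make the difference between the two search criteria concrete, the sketch below contrasts a simple "word" search for contiguous identical residues with a windowed identity calculation. It rests on stated assumptions: the function names are hypothetical, and the window scan is ungapped, whereas FASTA and BLASTP build gapped local alignments. It illustrates the criteria only and is not a substitute for those tools or for a curated allergen database.

```python
from __future__ import annotations


def shared_kmers(query: str, allergen: str, k: int = 8) -> set[str]:
    """Return every stretch of k contiguous residues present in both sequences."""
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return {query[i:i + k] for i in range(len(query) - k + 1)
            if query[i:i + k] in allergen_kmers}


def best_window_identity(query: str, allergen: str, window: int = 80) -> float:
    """Best percent identity over any ungapped window of `window` residues.

    Simplification (assumption): each diagonal is scanned without gaps,
    unlike the gapped local alignments produced by FASTA or BLASTP.
    """
    best = 0
    for offset in range(-(len(allergen) - window), len(query) - window + 1):
        # Overlapping region of the two sequences on this diagonal.
        q_start = max(0, offset)
        a_start = max(0, -offset)
        length = min(len(query) - q_start, len(allergen) - a_start)
        if length < window:
            continue
        matches = [query[q_start + i] == allergen[a_start + i]
                   for i in range(length)]
        running = sum(matches[:window])
        best = max(best, running)
        # Slide the window one residue at a time along the diagonal.
        for i in range(window, length):
            running += matches[i] - matches[i - window]
            best = max(best, running)
    return 100.0 * best / window


def exceeds_identity_criterion(query: str, allergen: str,
                               threshold: float = 35.0) -> bool:
    """True if the best 80-residue window exceeds the identity threshold."""
    return best_window_identity(query, allergen) > threshold
```

Raising `threshold` from 35.0 toward 50.0 corresponds to the adjustment suggested above. An isolated hit from `shared_kmers` with a low windowed identity is the situation in which the short-match criterion flags a protein without supporting homology.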

A few potential GM products have been described in the literature that present matches of greater than 35% identity over 80 amino acids to allergens. Those products should be tested by specific serum IgE binding if they were intended for commercial use [13••]. The gene of α-amylase inhibitor from common bean was introduced into field peas to protect against the storage beetle, which often causes a high percentage of loss [39]. The protein sequence of α-amylase inhibitor is slightly more than 40% identical to peanut agglutinin, a minor allergen of peanut. Serum IgE-binding studies are in progress in our laboratory to test for cross-reactivity, but to date, only 1 serum from 34 peanut-allergic individuals had clear and specific IgE binding to peanut agglutinin, and that serum did not bind to α-amylase inhibitor (unpublished results). Serum IgE from a few patients with high IgE binding to cross-reactive carbohydrate determinants (CCDs) on phytohemagglutinin and other glycoproteins bound to the α-amylase inhibitor due to CCDs. The IgE binding was inhibited by preincubating the sera with nonhomologous CCD proteins. Additional tests are being performed to evaluate the ability of α-amylase inhibitor to activate basophils that are sensitized with IgE from peanut-allergic individuals or those with CCD binding, but thus far, results are negative (unpublished).

Some criticize the use of bioinformatics because it has limited ability to predict the allergenicity (rather than potential cross-reactivity) of a test protein [40]. However, no single test can predict allergenicity a priori, and the intent is merely to identify proteins that present a potential risk of cross-reactivity and require specific serum tests for verification [27••].

Jank and Haslberger [41] advocated the use of algorithms that predict antigenic epitopes. However, to date, no algorithms have proven predictive for allergenicity, and antigenicity predictions are not accurate. Others have tried to use peptide mimotopes to predict cross-reactivity [42], or allergenic motifs based on protein dimension vector calculations to predict possible epitope similarity [43], but those methods and databases have not been widely tested for prediction accuracy [33].

In Vitro IgE-Binding Tests

Antigen-specific IgE-binding assays should be performed when the source of the gene is commonly allergenic, or when the bioinformatics search identifies a significant match to a known allergen [6, 7, 18, 27••]. Various test formats may present proteins in conformations that are not equivalent. Native protein epitopes may be presented in enzyme-linked immunosorbent assay (ELISA) or native gel immunoblots. Proteins separated in sodium dodecyl-sulfate polyacrylamide gel electrophoresis (SDS-PAGE) or SDS-PAGE with reducing agent (β-mercaptoethanol) should present sequential and some conformational epitopes. More than one method should be used. The GM protein, the suspected allergen (source or sequence matched), and other control proteins should be included to evaluate specificity of binding. In addition, reciprocal inhibition tests are needed to determine the specificity of IgE binding [37]. In vitro IgE binding may be due to the presence of a single epitope or low-avidity binding and may not allow effective IgE cross-linking on mast cells and basophils [19]. The biological importance of IgE binding may be tested further by in vitro basophil histamine release or in vivo via skin prick tests [44].

Protein Stability in Pepsin and Processing

General observations that many important food allergens are stable to digestion by pepsin, and that some are still able to elicit an allergic response after cooking, led to the reliance on those properties as indicators that a given food protein is likely to sensitize some consumers [6, 17]. However, it is likely that those characteristics are related to the ability of some proteins to elicit systemic allergic reactions rather than sensitization. Although the predictive value of the pepsin assay can be debated, tests using uniform, standardized conditions with pH at 1.2 or 2.0 and a fixed ratio of pepsin activity to test protein provide a relatively good correlation with allergenicity [45–47]. Tests to measure the stability of the protein under heating conditions are frequently requested by regulators. However, the conditions for heating or processing are not standardized, and there are no clear hypotheses or test methods that allow standardization. It is also clear that some highly stable proteins do not cause food allergy [47]. The abundance of food allergens suggests a plausible relationship for risk assessment, as many food allergens represent 1% or more of the total protein in the food ingredient [45]. However, as some highly allergic individuals react to milligram amounts of some allergenic proteins, no consensus exists regarding a lower limit of concern.

Evaluating Potential Changes in Endogenous Allergen Content

There are few data documenting normal variation of the expression levels of various allergenic proteins for currently used varieties of most crops. However, most regulators expect a relative comparison of IgE binding to a new GM soybean and genetically similar non-GM varieties of soybean because soybean is considered a commonly allergenic crop. Testing requirements from the EFSA recently have increased markedly [13••]. Our laboratory has conducted studies for biotechnology developers to meet EFSA requirements for IgE binding to immunoblots from both one-dimensional and two-dimensional gel electrophoresis–separated soybean proteins using multiple individual allergic sera as well as inhibition ELISA tests (unpublished results). However, no clear limits have been established for acceptance or rejection. Minor differences were noted but were similar to differences between other commercial soybean varieties. Some regulators are now asking for similar tests to evaluate new GM varieties of rarely allergenic crops such as corn (maize). Because those with allergies should avoid their allergen, the requirement to test for differences in expression is not relevant to safety.

Recently, some scientists who have been involved as advisors to regulators of GM products have performed comparative proteomics tests or animal antigenicity tests on previously approved GM products without describing a testable hypothesis or end points of acceptability. Monsanto’s (St. Louis, MO) insect-protected GM corn event (MON810) was approved in 1996 but was the subject of two European studies of questionable merit [48, 49]. The first study looked for changes in the proteome by replicate two-dimensional gel electrophoresis and staining of samples of MON810 in a background of DKC6575, compared with its “non-transgenic near-isogenic line” (identity not disclosed). The authors found as many differences between replicates as between the GM and non-GM lines. Rather than conclude that there was no biologically significant difference, the authors discussed the complexity of the tests and data interpretation and suggested that even more detailed studies are needed for “accurate safety evaluation of crop plants using Omics technologies.” Similarly, Adel-Patient et al. [49] dosed mice with purified Cry 1Ab and with MON810 to look for immune responses and metabolic changes. After seeing no significant changes, they still concluded that further tests are needed to evaluate potential unintended effects. Although regulatory guidelines recommend evaluating new GM plants for potential increases in allergen content [6, 8], they ignore recent published information that demonstrates serum IgE binding differs more than twofold in some patients across non-GM crop varieties [13••].

Animal Models

Several guidance documents and publications have suggested using animal models to predict sensitizing potential and allergenicity, but they also recognize that current models have not been demonstrated to predict which food proteins cause allergy in humans [6, 8, 13••]. Although a validated animal model would be useful to help evaluate the potential risk of de novo sensitization, test results in several species (mice, rats, dogs, guinea pigs, and swine) have failed to provide consistent results with fixed sensitization protocols for accurately predicting human responses [50].

Past Experiences for the Allergenicity Assessment of Genetically Modified Plants

The assessment process has proven effective at identifying the only publicly disclosed potential GM variety that would have posed a real risk of allergy to specifically allergic consumers [20]. A few other GM crops under development may pose some risk of allergy based on our evaluation of published information, but those are unlikely to pass regulatory assessment without additional serum IgE testing to demonstrate a low probability of risk. Some uncertainties still exist because the available tests are imperfectly predictive. The pepsin-stable Cry 9 C protein in StarLink (Bayer CropScience AG, Monheim, Germany) corn probably posed little risk, as expression of the protein was approximately 50 ppm in corn grain. However, regulators have yet to articulate a policy to integrate data on stability and abundance to provide acceptance criteria.

Conclusions

In our opinion, most current allergenicity assessment procedures for GM food crops are based on the best available science. There is no published evidence of allergic reactions to any GM protein or any adverse human health reactions associated with consumption of foods from GM crops during the past 14 years. However, the allergenicity assessment by most countries should be updated to emphasize the priorities of hazard identification, risk characterization, and exposure. It should center on protecting those with existing allergy from unexpected exposure to an allergen or proven cross-reactive protein focusing on: (1) evaluation of the history of safety (or risk) of the source of the gene, (2) sequence comparison of the GM protein to known allergens by FASTA or BLAST to identify matches of greater than 35% (or higher) identity over alignments of at least 80 amino acids, and (3) specific serum IgE-binding tests to be performed only if the source of the gene is a common allergen or the sequence matches an allergen (step two). Evaluation of de novo sensitization should be improved after further study to optimize and integrate information on the stability of the protein in pepsin and the abundance of the protein in food. Any requirement for additional tests, including evaluation of potential changes in endogenous allergen expression, should be weighed carefully to ensure methods are scientifically valid and end points are established that have biological meaning. Based on current evidence, consumers should feel confident that approved GM crops are as safe as traditional crops, and scientists should consider limiting studies to those that are predictive of food safety.
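
The three-step focus listed above can be summarized as a simple decision rule. The sketch below is illustrative only: the class `Candidate`, its field names, and the function `needs_serum_ige_test` are hypothetical, and the default threshold of 35% can be raised where a higher criterion is adopted.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of steps (1) and (2) for one introduced protein."""
    source_is_common_allergen: bool   # step 1: history of the gene source
    best_identity_over_80aa: float    # step 2: percent identity to any known allergen


def needs_serum_ige_test(candidate: Candidate,
                         identity_threshold: float = 35.0) -> bool:
    """Step 3: test serum IgE binding only if step 1 or step 2 raises a flag."""
    sequence_flag = candidate.best_identity_over_80aa > identity_threshold
    return candidate.source_is_common_allergen or sequence_flag


if __name__ == "__main__":
    examples = {
        "allergenic source, low identity": Candidate(True, 12.0),
        "safe source, 41% identity": Candidate(False, 41.0),
        "safe source, 20% identity": Candidate(False, 20.0),
    }
    for label, candidate in examples.items():
        print(label, "->", needs_serum_ige_test(candidate))
```

The rule deliberately returns no verdict on allergenicity; it only indicates whether specific serum testing is warranted.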