Genetically modified organisms (GMOs, in this context crops) are grown globally in larger quantities than ever before, and it is anticipated that this increase will continue in the years to come. Research on GMOs is similarly increasing and is performed in established as well as upcoming GMO-producing countries (ISAAA 2016). Guidelines to assess new GM varieties have been globally harmonised to a large extent (Kleter et al. 2001). The general guidelines have been formulated in the FAO/WHO Codex, which aims to provide guidance on the risk assessment of foods derived from recombinant-DNA plants. The Codex guidelines CAC/GL 44-2003 and CAC/GL 45-2003 predominantly describe a framework and the requirements for safety testing (WHO/FAO 2003a, b). The foundation of the safety assessment of GMOs is the comparison of the GMO with a genetically similar conventional crop that has a history of safe use. According to Codex, the GMO and its direct comparator should be compared for differences, caused by intended and unintended effects, in composition as well as in agronomic, phenotypic and molecular characteristics (Fig. 1).

Fig. 1 Example of a random plot field trial, in this case for VCU of winter wheat at Wageningen Plant Research AGV in the Netherlands. The basic set-up is the same for the different types of field trials; differences lie in the detailed requirements, which may vary per country. Photograph courtesy of Ir. L. van den Brink, Wageningen Plant Research

The applicant has to supply an extensive set of data for this comparative safety assessment, the extent of which may differ per country. The data that are generally requested from the applicant are divided into four categories: (1) Information on the parent crop, (2) Information on the donor of the transgene and regulatory sequences, transgenes and the delivery process, (3) Characterization of gene products (intended effects), and (4) Characterization of other products that may differ between the GM variety and its close comparator (unintended effects) (Canada Ministry of Justice 2015; CTNBio 2008; EC 2013; EFSA GMO Panel 2011; FDA 1997; FSANZ 2007; MOA China 2013). The field trials investigated in this paper were performed to obtain compositional data that will form part of the safety assessment of the GM crop for food and feed. This comparative compositional analysis is an important tool to identify potential safety issues that relate to any (un)intended effects in the GM crop. To perform a compositional analysis, a finite number of relevant compounds is analysed. The Organisation for Economic Co-operation and Development (OECD), an intergovernmental organisation in which representatives of 34 industrialised countries in North and South America, Europe, and Asia and the Pacific region participate, has constructed lists of compositional variables that should be analysed for each new crop variety (OECD 2015). These compounds are listed in the OECD consensus documents that were drafted per crop; such documents are currently available for 23 different crops, and another 15 are under development. The OECD objective is to allow a science-based risk assessment approach that is mutually acceptable among member countries. The documents do not give a testing protocol for the field trials to be performed, but provide a background of the physiology of the crop, including relevant nutrients and anti-nutrients. The OECD essentially offers a list of nutrients, anti-nutrients, natural toxins and allergens that are considered most relevant for the respective crops.

This paper provides an overview of the current field trial approaches for GM crops in countries around the world, with a focus on the practice of field trials to generate plant materials for the comparative compositional analysis of the GM plant versus its closest comparator and/or additional commercially available non-GM cultivars.

In addition, the efficiency of GMO field trials was investigated, as well as how their requirements compare to the field trial requirements for variety registration. Field trials for variety registration are conducted to determine whether a new variety is effectively distinguishable from other known varieties (DUS-trials), and, in the case of agricultural crops, whether a new variety has added value for cultivation and use (VCU-trials) (Groenewoud 2014). These field trials are conducted separately from, and in parallel to, the field trials necessary for GMO risk assessment. It may be more effective in cost and labour, both for the applicant and for regulatory agencies, if the two pre-market assessments could be combined or their data exchanged, particularly in those cases where VCU trials will need to be performed.

GMO field trials in risk assessment

A literature search was performed on legislation, guidance documents and scientific opinions on specific GMOs to obtain information on globally applied field trial protocols. For the comparison of approaches, data were obtained from applications and decisions for GMO approvals available on the websites of the EFSA (EU) (EFSA 2014), APHIS and FDA (US) (FDA 2015; USDA APHIS 2015), FSANZ (Australia and New Zealand) (FSANZ 2013), CFIA and HC (Canada) (CFIA 2014b; Health Canada 2015), CTNBio (Brazil) (CTNBio 2006), the Biosafety Office and MOA (China) (Biosafety Office China 2015; MOA China 2013) and the FSC (Japan) (FSC Japan 2015). Not all relevant information could be obtained in this way. Therefore, additional information was obtained by contacting relevant competent agencies in other countries. In this way an overview was made of how field trials are conducted under different legislations worldwide, and the results were compared. A summary of all data is given in Table 1. In those cases where available documents contained limited information on the required comparators and the organization of field trials, the relevant authorities were contacted to obtain additional information.

Table 1 Overview of factors included in the comparative compositional analysis

As far as can be deduced, most field trial sites worldwide are organised in a similar manner, containing a number of small replication areas. Three types of direct comparators are used in GMO field trials: (near-)isogenic non-GM crops with a history of safe use, negative segregants and, occasionally, GM near-isogenic parent lines with a history of safe use. Negative segregants are descendants of GM plants that lack the GM trait; they can, for example, result from crossing non-GM plants with hemizygous GM plants or from self-fertilizing hemizygous GM plants (EFSA GMO Panel 2011). In the EU and Japan, an isogenic non-GM crop with a history of safe use should preferably be used as a comparator (EC 2011, 2013; FSC Japan 2015). The Food Safety Committee Japan has approved an application using a negative segregant as direct comparator in a single case, although this is not explicitly allowed by legislation (FSC Japan 2015). In the US, Canada, Brazil, Australia and New Zealand, an isogenic non-GM crop with a history of safe use or a negative segregant can be used, depending on the crop (Canada Ministry of Justice 1994; CTNBio 2008; FDA 1997; FSANZ 2007). In China, non-GM comparators are mentioned in the application documents, but it is not clear whether negative segregants are permitted (Biosafety Office China 2015; MOA China 2013).

The guidance documents of the EU (EC 2013), and previously of EFSA (EFSA 2011), stipulate that every applicant should include at least a number of non-GM reference varieties (RVs) that are generally regarded as safe, in order to determine the range of natural variation for a certain compound in a specific field trial. Non-GM RVs are used because (1) the reference data are more representative when multiple crop varieties are included, and (2) the total plant population is increased to allow improved statistical assessments. The FSANZ risk assessment documents do not require the use of RVs, but competent authorities may request their use on a case-by-case basis (FSANZ 2007, 2014). The Chinese, US, Canadian, Brazilian and Japanese legislations do not require the presence of RVs (Canada Ministry of Justice 1994; CTNBio 2008; FDA 1997; FSC Japan 2015), and depend on literature values and the ILSI crop composition database for the evaluation of any observed differences between the GMO and the direct comparator (ILSI 2003). It was found that the number of RVs actually included in field trials does not differ much between countries (data not shown). The most likely explanation is that dossiers are compiled for use in multiple countries, so in practice the data generated will meet the data requirements of the most demanding countries. Except for the EU, Japan and Australia/New Zealand, all responsible agencies perform most GMO field trials in their own country. For the EU and Australia/New Zealand, it is required that the field trials are conducted in locations that are agronomically and meteorologically similar to the regions where the crop will eventually be cultivated (EC 2013; FSANZ 2011).

The number of crop sites in most legislations is comparable. Risk assessments in the EU, the US, Canada and Australia/New Zealand were based on an average of 12.1, 10.2, 9.7 and 8.9 crop sites per field trial, respectively, calculated over all assessments for which the number of sites was known (period 2010–2015). Comparable data could not be retrieved for China, Brazil and Japan. Crop sites are counted across seasons, e.g. five crop sites used for the same field trials over three different seasons were counted as fifteen crop sites.
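The counting convention can be made explicit with a short sketch. The function name and the figures for the second trial are hypothetical and serve only to illustrate how sites are tallied across seasons and then averaged; they are not data from the assessments reviewed here.

```python
def count_crop_sites(sites_per_season):
    """Count crop sites across seasons: a site used in a season counts once."""
    return sum(sites_per_season)


# Example from the text: five crop sites used in each of three seasons.
trial_a = count_crop_sites([5, 5, 5])

# Hypothetical second trial (illustration only): eight sites in one season.
trial_b = count_crop_sites([8])

# Average number of crop sites per field trial over the trials with known counts.
average_sites = (trial_a + trial_b) / 2

print(trial_a, trial_b, average_sites)
```

Under this convention the first trial contributes fifteen crop sites, matching the example in the text.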

The common sources of literature mentioned in application documents are peer-reviewed scientific literature and standard crop composition databases. The EU prefers the use of RVs included in the same GM field trials over the use of data from literature and database sources (EFSA GMO Panel 2011). If the GM compositional values do not fall within the natural range of variation as observed in the RVs, the observed differences are assessed for possible hazards on a case-by-case basis. In other legislations literature is consulted regularly, both literature values as published in peer-reviewed journal articles and data from the International Life Sciences Institute crop composition database (ILSI 2003). The ILSI crop composition database contains compositional ranges measured in rapeseed, maize, cotton, rice and soybean over different growing seasons and locations. Another publicly accessible food composition database for the safety assessment of GM crops as foods and feeds has been developed in Japan. This database contains multiple rice and soybean varieties and is focussed on regions in Japan (NARO 2011).
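The range check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the natural range is taken simply as the minimum and maximum across the RVs, whereas regulatory guidance may prescribe a more formal statistical treatment, and the compound values and function names are hypothetical.

```python
def natural_range(reference_values):
    """Return the (lowest, highest) value observed across the reference varieties."""
    return min(reference_values), max(reference_values)


def outside_natural_range(gm_value, reference_values):
    """True if the GM value falls outside the range spanned by the RVs."""
    low, high = natural_range(reference_values)
    return gm_value < low or gm_value > high


# Hypothetical values for one compound (arbitrary units), illustration only.
rv_values = [10.2, 11.5, 9.8, 12.1]

for gm_value in (11.0, 13.4):
    flagged = outside_natural_range(gm_value, rv_values)
    action = "assess case by case" if flagged else "within natural range"
    print(gm_value, action)
```

A flagged value is not a hazard in itself; it only marks the compound for the case-by-case assessment mentioned in the text.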

Field trials for plant variety registration (EU)

Field trials that have to be performed within the framework of applications for the registration of new plant varieties are regulated by member states according to CPVO protocols (Community Plant Variety Office, EU) and UPOV guidelines (International Union for the Protection of New Varieties of Plants) (CPVO 2015; UPOV 2015). DUS (Distinct-Uniform-Stable) and VCU (Value for Cultivation and Use, for arable crops) assessments are carried out to determine whether a plant variety meets the requirements to be registered as a new variety based on its phenotype (UPOV 2002). DUS assessment is required by the UPOV and is conducted by national agencies in EU member states (the situation is similar for other countries under the UPOV convention, but this is not detailed further in this paper). The novelty and function of a new crop are assessed during these trials. A set of criteria on qualitative and quantitative characteristics of the crop species has been put together to determine differences using statistical methods (UPOV 1961, 1978). Examination of uniformity is performed to ensure that a new crop variety fulfils the same standards for homogeneity among individual plants as other varieties of the same crop species. In line with Article 6(1)(c) of the 1961/1972 and 1978 Acts of the UPOV Convention, a variety is uniform if it is sufficiently homogeneous, having regard to the particular features of its sexual reproduction or vegetative propagation (UPOV 1961, 1978). Uniformity is measured with a set of unique qualitative and quantitative phenotypic characteristics that is assessed using statistical models. Examination of stability is performed to verify that a potential new variety does not change over more than one season. For many varieties, stability and uniformity are intertwined; often a variety that has proven to be uniform within a single harvest will be a stable variety over multiple seasons (UPOV 2002). The UPOV Convention requires a variety to be stable in its essential characteristics, that is, it must remain true to its description after repeated reproduction or propagation or, where the breeder has defined a particular cycle of reproduction or multiplication, at the end of each cycle (UPOV 1961, 1978). The set of crop characteristics used to test stability is similar to the one used to test uniformity.

Variety registration requires field trials on two different occasions or at two different locations: data may be obtained from one location in two seasons, or from two locations in a single season. For variety registration, (field) trials are performed on a small scale, in a glasshouse or outside, depending on the crop species. Tests usually contain up to 100 plants on an area of typically less than 100 m², but this depends on crop species and reproduction system (CPVO 2015).

Observations for VCU (only for arable crops for EU listing) have to be made during and after the growing season. The cultural value of a new crop variety is assessed in terms of whether the crop is a novel addition to existing crop varieties. The utility value of a new crop variety is assessed by investigating e.g. crop emergence in the field, weed suppression, insect resistance, susceptibility to plant diseases, percentage crop rot, shape of the roots and crop yield. Trials to obtain yield statistics are performed by the breeders themselves. VCU-research is aimed at determining the value of the new variety by assessing e.g. crop ground coverage, consequences of harmful influences and, with these factors combined, the total financial return (Groenewoud 2014). To obtain information on resistances, a number of standard commercial varieties are included in the trials. An applicant can directly request VCU trials for variety registration for new GMOs. These trials have specific requirements with respect to size and organisation, including the use of RVs (Raad voor Plantenrassen 2015a, b, c). It is common to conduct VCU-trials over multiple years/seasons and at varying locations. During the trials, the conductor of the trial records field emergence as a percentage of the seeds sown, weed suppression, insect presence/resistance and susceptibility to leaf diseases or infestations. After harvest, yield, contaminations, shape and quality are assessed and used in the evaluation of the variety for registration.

Variety registration in many countries follows similar lines of investigation and is very much crop-dependent. In many countries outside Europe, European procedures are used, sometimes with local adaptations related to the number of sites and, especially, the varieties used for comparison.

GMO field trials versus field trials for variety registration

Before a new GM crop variety can enter the EU market, the variety will have to be evaluated in two pre-market assessment procedures that both include the performance of field trials. From the overview presented in this paper, it is clear that there are different requirements for field trials within the framework of GMO risk assessment and for similar trials within the framework of variety registration.

Field trials within the framework of GMO risk assessment generally require an extensive isolation zone around the crop site; this is usually not convenient in the small-scale VCU trials. GMO compositional trials have to be conducted in a meteorologically and geographically representative area and can be conducted on a smaller scale, but need at least to be performed under various conditions (soil, climate) that cover as much as possible the range of conditions under which the crop could be grown in practice. The idea behind this requirement is that some unintended effects may become evident primarily under particular, for instance stressed, conditions. Since the likelihood that stressed conditions (that may significantly change the physiology of the plant) occur in practice may still be limited in the current set-up, specifically applying controlled stress conditions, e.g. in the glasshouse, may be more informative and efficient. Although 100% certainty cannot be achieved with any approach, including the current approach, it would appear that if growth under controlled stress conditions does not lead to the identification of new hazards, it is unlikely that such hazards would surface under other (field) conditions. This would critically depend on the feasibility of identifying representative stress conditions and on the possibility of performing such stress experiments in a more cost-efficient manner than is currently the case with the requirement for multiple field trials and related analyses. Future research to improve the hazard identification in new plant materials may further focus on such aspects. It may still be that further research finds that in practice any relevant (toxicological) effect will already be observed under any normal growing condition, provided that the analytical techniques used are sufficiently informative, as well as cost-efficient.

A clear difference between the two types of field trials is the size of the field plots. The set-up of the field trials for the GMO risk assessment procedure will differ per species, but the number of plants will generally be in the order of 1000 to 10,000, depending on crop species, country and stage of the field trial (confined, open, large scale) in the safety assessment. In DUS and VCU research an area of typically less than 100 m² is cultivated, where the number of plants may vary per crop species. A single GM field trial site is typically between 0.5 and 5 ha. GMO field trials contain the new GM plant, a conventional counterpart and multiple RVs that are cultivated in a random block design to eliminate location specificity. When the RVs are disregarded, the size of the field trials becomes more similar, with the GMO field trial shrinking to a small number of blocks of up to 100 m². It seems likely that the size of these field trials can be reduced further without a direct effect on the potential to identify unintended effects, but this would require further investigation. Also, some VCU trials may require a specific crop placement for examining, for instance, resistance against certain pests or diseases. An optimised random block design of the field trials will need to be developed to accommodate this aspect in basic GMO field testing procedures.
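To make the random block design mentioned above concrete, the sketch below generates a layout in which every block contains each entry once, in an independently randomised order. The number of blocks, the number of RVs and the function name are assumptions chosen for illustration; actual trial designs depend on the crop and on the applicable guidance.

```python
import random


def random_block_layout(entries, n_blocks, seed=None):
    """Return one randomised plot order per block; each entry appears once per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(entries)   # copy, so the input list is left untouched
        rng.shuffle(block)      # independent randomisation within each block
        layout.append(block)
    return layout


# Hypothetical trial: the GM line, its comparator and four reference varieties.
entries = ["GM", "comparator", "RV1", "RV2", "RV3", "RV4"]
layout = random_block_layout(entries, n_blocks=4, seed=1)

for number, block in enumerate(layout, start=1):
    print(f"block {number}: {block}")
```

Because the GM line and its comparator occur together in every block, local differences in soil or exposure affect both, which is the purpose of blocking. A design that also fixes the position of certain plots, as some VCU trials may require, would need restrictions on the shuffle that are not shown here.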

Field trials for GMO risk assessment may be conducted anywhere in the world, as long as the meteorological and environmental conditions are representative of the countries producing the specific crop. If the producer wishes to cultivate the crop in the EU, this may include EU locations as well. In this sense, it may be more efficient to use data from the larger-scale GM field trials for VCU assessment as well. The resulting data may be used to assess the new GM variety for its safety and, if it is proven safe, can lead to direct registration on the list of (EU) registered varieties. To meet both goals, the GMO applicant has to conduct the field trials for compositional analysis in the country where it plans to register and grow the new variety. For example, in the European Union it is possible to apply for variety registration on an EU-wide level, as is the case with the authorization and risk assessment of new GMOs.

Discussion and conclusions

In all investigated legislations, regulations and guidelines, the novel GM crop variety is compared to a genetically close non-GM counterpart to assess the presence of any unintended effects of the breeding process on the physiology of the GM plant. This comparison is in all cases done by growing the GM plant in the vicinity of the non-GM counterpart and by subsequently analysing the different plant parts for differences in phenotype and composition. In most countries no standards or technical regulations for the performance of these field trials are available. According to all legislations, the comparative analysis should preferably be performed with a near-isogenic direct comparator with a history of safe use. This allows for the most direct comparison to identify unintended changes resulting from the genetic modification. In specific cases, a non-GM near-isogenic comparator will not be available for the comparison; in those cases, other genetically close comparators that were not near-isogenic have been used. It may also be acceptable in some countries to use negative segregants as a comparator, but as these lines are basically experimental lines that have not been assessed for their safety, the use of negative or null segregants as a comparator in the risk assessment procedure is often discouraged. In the US, the use of multiple negative segregants is advocated in those cases where no other adequate comparator is available.

For the comparative assessment it is important to note that, when comparing 20–50 compounds per crop, differences between the GM crop and its comparator for one or more compounds will regularly be observed. This is to be expected in a statistical evaluation that is based on a large set of observations and tests each compound at a 95%, or similar, confidence level: at that level roughly one in twenty comparisons is expected to appear significant by chance alone, even when no true difference exists. Besides these statistical considerations, other reasons for observed differences may be slight differences in environmental conditions or growth stage between the GM and non-GM lines, despite the use of a random block design in the field.
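The statistical expectation can be illustrated with a small calculation. The sketch assumes that each compound is tested separately at a significance level of 0.05 and that the tests are independent; the latter is a simplification, since levels of related compounds are often correlated. The function names are illustrative.

```python
def expected_false_positives(n_compounds, alpha=0.05):
    """Expected number of chance findings when no true differences exist."""
    return n_compounds * alpha


def prob_at_least_one(n_compounds, alpha=0.05):
    """Probability of at least one chance finding, assuming independent tests."""
    return 1 - (1 - alpha) ** n_compounds


for n in (20, 50):
    expected = expected_false_positives(n)
    probability = prob_at_least_one(n)
    print(f"{n} compounds: {expected:.1f} expected, P(at least one) = {probability:.2f}")
```

Under these assumptions, a comparison of 20 compounds yields one chance finding on average, and with 50 compounds at least one such finding is more likely than not by a wide margin. A statistically significant difference for a single compound is therefore a prompt for further assessment and not evidence of an unintended effect in itself.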

The EU requires the use of RVs in field trials for comparative purposes under the same environmental conditions; Australia/New Zealand, Japan and the US prefer the use of RVs, but in these countries they are not obligatory. In Canada, the inclusion of RVs in a field trial is dependent on the previously collected data (CFIA 2014a). In Brazil no additional RVs are required, but they are taken into account when included in the application. Information on the use of RVs in China could not be retrieved from applications and published risk assessment reports (Biosafety Office China 2015; CTNBio 2006; EFSA 2014; FDA 2015; FSANZ 2015; FSC Japan 2015). Despite the use of RVs, however, values are occasionally observed for specific compounds in the GM crop that fall outside the ranges of natural variation as determined by the reference set. The chance that this occurs depends on the representativeness of the selected reference set. Within the EU, these differences are then separately assessed on the basis of scientific literature and for biological relevance (EFSA 2011). However, based on a detailed analysis of EU risk assessment reports so far, the added value of including RVs in the same experiment has not been demonstrated: there has been no case where a value outside the range of the RVs has led to additional analyses to conclude on possible safety issues. There is a growing amount of good-quality data available on crop plant composition in databases such as the ILSI crop composition database. It can therefore be disputed whether the use of reference varieties is essential for the initial screening for potential unintended effects. The use of reference varieties may, however, be of use in a second instance, when such unintended effects have been tentatively identified and are considered to be of potential toxicological relevance. Such a tiered approach seems defensible and even advisable, as in over 20 years of experience with this type of field trial there has not been an example of an unintended effect that has been overlooked as a result of the lack of a selected set of most relevant reference varieties in the comparative compositional assessment. More generally, there are no examples of GMOs that have been rejected on the basis of observed levels of components as part of the comparative compositional analysis. Generally, a high phenotypic and compositional similarity between the GMO and the parent crop is shown.

Within the GMO risk assessment procedure the applicants may choose their own field trial locations. These are usually not in the EU, primarily because most GM crop developers are located outside of the EU. After market approval of new GM varieties the producer will need to perform additional field trials under the procedure for variety registration (either national registration or European registration). The latter procedure includes the assessment of DUS and VCU (the latter for arable crops) characteristics. Field trials for variety registration generally do not cover safety aspects, with the exception of the establishment of the glycoalkaloid content in potatoes in the Netherlands and Sweden, but focus on an assessment of phenotypic characteristics with relevance to cultivation of the crop. The protocols for field trials as part of the GMO risk assessment procedure and the procedure for variety registration (i.e. VCU) have been assessed and appear at present to be compatible to some extent, although the requirements for the GMO risk assessment trials are currently more elaborate, and although VCU is only performed on arable crops. It seems, however, possible to increase the cost efficiency of the field trials by sharing the field trial data between both pre-market procedures more often than is currently done. The fact that both types of field trials are part of pre-market assessment procedures and that the requirements are highly similar seems to argue for more efficient procedures. Sharing the data between both pre-market assessment procedures would help to perform the field trials as cost-efficiently as feasible.