Introduction

Escherichia coli strains are enteric bacteria. Most of them are part of the beneficial natural flora of the human and animal intestine. Some strains have acquired mechanisms to cause disease and have thus become pathogenic for humans. E. coli strains are classified by their heat-stable somatic (O) antigens and flagellar (H) antigens, two major antigens of Enterobacteriaceae (Ryan 2004), into >380 serotypes (Karmali et al. 2010). However, this classification gives no immediate information about the pathogenicity of an E. coli strain. Pathogenic E. coli are divided into two major groups: diarrhoeagenic E. coli (DEC) and extraintestinal pathogenic E. coli (ExPEC). These groups can be further categorized into pathotypes, based on the virulence factors present in the E. coli genome and on the clinical symptoms of the host. Two pathotypes belong to the ExPEC, namely, uropathogenic (UPEC) and neonatal meningitis (NMEC) E. coli. The DEC group consists of eight pathotypes, namely, shigatoxigenic (STEC) (including the enterohemorrhagic (EHEC) E. coli (Croxen and Finlay 2010; Croxen et al. 2013)), enteropathogenic (EPEC), enterotoxigenic (ETEC), enteroinvasive (EIEC; including Shigella spp.), enteroaggregative (EAggEC), diffusely adherent (DAEC), adherent invasive (AIEC), and the recently described enteroaggregative shigatoxigenic (EAggSTEC) E. coli (Clements et al. 2012; Croxen and Finlay 2010; Croxen et al. 2013). In the case of STEC, Karmali’s seropathotype classification is based on a serotype-specific spectrum of disease frequency and severity, since only a limited number of serotypes appear to be associated with the majority of human disease (Karmali et al. 2010).

The natural reservoirs of pathogenic E. coli are the intestinal tracts of animals, mainly ruminants (Clements et al. 2012). Human infection occurs mainly by consumption of contaminated food products of animal origin, contaminated raw food products such as salads, or drinking water contaminated with animal or human waste, through direct person-to-person spread due to poor hygiene, and also through direct contact with infected animals (Clements et al. 2012; Karmali et al. 2010). The STEC pathotype causes mostly sporadic disease and is the most important in terms of number of human cases (European Food Safety Authority (EFSA) and European Centre for Disease Prevention and Control (ECDC) 2013, 2015, 2016). The pathogenicity of STEC (also called VTEC) is linked to the production of shiga-toxins (sometimes referred to as Vero toxins; encoded by stx1 and/or stx2) in combination with an epithelial cell adhesion factor (encoded by the intimin gene (eae)) (Clements et al. 2012; Karmali et al. 2010). STEC can cause mild to bloody diarrhea, often with abdominal cramps and fever. Hemolytic-uremic syndrome (HUS) is a severe complication of STEC infection and can lead to acute renal failure in young children (EFSA and ECDC 2013, 2015, 2016). In the European Union (EU), STEC is the fourth most common cause of human zoonosis, with 5995 confirmed human cases in 2014, a hospitalization rate of 39.2%, and a fatality rate of 0.20% (EFSA and ECDC 2016). In 2011, these statistics were even worse due to a large outbreak caused by sprouts contaminated with an O104:H4 serotype of E. coli (EAggSTEC) in Germany and France (ECDC 2013; King et al. 2012; Wadl et al. 2011).
The strain involved in this large outbreak had an unusual combination of pathogenic features typical for enteroaggregative E. coli (EAggEC) together with the capacity to produce shiga-toxin 2 (stx2), the latter being the hallmark of STEC (Scheutz et al. 2011). Consequently, the EU extended the regulation 2073/2005 on the microbiological criteria for foodstuffs (Commission of the European Communities 2005) with the amendment 209/2013 (Commission of the European Communities 2013). These adaptations propose significant changes in the control strategy, especially to test for the presence of the top five STEC serogroups causing human disease (i.e., O157, O26, O111, O103, and O145) and the O104:H4 serotype involved in the German outbreak.

To comply with this new EU legislation, a new standard for the detection of these serogroups was published, i.e., the ISO/TS 13136:2012 (ISO: International Organization for Standardization 2012), and includes a qPCR detection assay targeting the stx-genes as a first screening for STEC (ISO: International Organization for Standardization 2012) prior to detection of the serogroup. Several other qPCR assays for stx and other virulence genes are available in literature but most of them are limited to STEC detection (e.g., Anklam et al. 2012; Bugarel et al. 2010; Nielsen and Andersen 2003; Paton and Paton 1998; Pavlovic et al. 2010; Perelle et al. 2004; Sharma and Dean-Nystrom 2003; Wasilenko et al. 2014). Moreover, some methods allowing the detection of several pathotypes of E. coli such as STEC, EAggEC, EHEC, EPEC, ETEC, EIEC, and DAEC, targeting other virulence genes, have been developed (e.g., multiplex conventional PCR (Aranda et al. 2004; Baccin Fialho et al. 2013; Botteldoorn et al. 2003; Chandra et al. 2013; Kuwayama et al. 2011) and more recently also qPCR assays (Barletta et al. 2013; Fukushima et al. 2009; Liu et al. 2013; Tzschoppe et al. 2012)). However, these detection systems are not compatible to run under the same condition and target a single gene within a single assay. This leads to a diagnosis based on multiple experiments performed on different analytical runs and can lead to false negative results in case of gene mutation or deletion in the annealing site of the primers of the targeted genes (Barbau-Piednoir et al. 2013b).

In this paper, we report the development of the pathogenic E. coli detection system (CoSYPS Path E. coli) based on a set of 13 qPCR assays targeting 7 genes of interest that can be combined to be used in a single run in a 96-well plate format. Two assays target most of E. coli (uidA gene) and the 11 other assays target 6 different markers of 5 E. coli pathotypes with two assays per target. Applied to isolates, this detection system allows the discrimination of five E. coli pathotypes, i.e., STEC (including EHEC), EPEC, EAggEC, EAggSTEC, and EIEC.

In addition, the CoSYPS Path E. coli system would be of great interest for screening food matrices, as it answers the need to detect virulence factors from E. coli pathotypes other than STEC, i.e., EAggEC, EAggSTEC, EIEC, and EPEC. It would be an appropriate tool for screening a food sample for the possible presence of a broader range of pathogenic E. coli.

Materials and methods

Bacterial strains

The bacterial isolates used in this study are listed in Table 1. A panel of DNA extracted from 113 bacterial isolates (76 E. coli isolates, 3 other species of the Escherichia genus, and 34 isolates from 22 other genera), two mold species, and three animal species was tested. The bacterial isolates were obtained from National Reference Centres and Laboratories. The meat samples were purchased at a retail shop.

Table 1 Selectivity assessment of the 13 SYBR® Green qPCR assays of the CoSYPS Path E. coli

Bacterial growth conditions, DNA extraction, and DNA quantification

Overnight cultures of each bacterial isolate were grown in brain-heart infusion (BHI) broth or Bolton broth (for Campylobacter) at the appropriate temperature and oxygen condition. The total DNA from each of the bacterial isolates was extracted using the “Gram-negative or Gram-positive bacteria” protocol of the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany). Fungal genomic DNA (gDNA) was extracted with the ZR Fungal/Bacterial gDNA Extraction Kit (Zymo Research, Irvine, CA, USA). The total DNA from meat was extracted using the “Animal Tissue” protocol of the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany). All kits were used according to the manufacturer’s recommendations. The DNA quality was verified on agarose gel (1%) and the DNA concentration was measured using a NanoDrop® 2000 device (ThermoFisher Scientific, Schwerte, Germany).

Calculation of bacterial genomic copy number

The bacterial genomic copy number was calculated according to the genome size of each targeted bacterial isolate using the formula published in Barbau-Piednoir et al. (2013a).
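The formula of Barbau-Piednoir et al. (2013a) is not reproduced here. As a rough illustration only, the sketch below uses a widely used conversion that assumes an average mass of 650 g/mol per double-stranded base pair; this constant, the function name, and the example genome size are assumptions and may differ from the cited formula.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one double-stranded base pair

def genomic_copies(dna_ng: float, genome_size_bp: float) -> float:
    """Estimate the number of genome copies in a given mass of pure gDNA."""
    grams = dna_ng * 1e-9
    return grams * AVOGADRO / (genome_size_bp * BP_MASS_G_PER_MOL)

# Illustrative input: 1 ng of gDNA from a 5.5 Mb genome (assumed size)
copies_per_ng = genomic_copies(1.0, 5.5e6)
print(f"{copies_per_ng:.3g} genome copies per ng")
```

With such an estimate, a stock solution can be diluted to the theoretical copy numbers used in the selectivity and sensitivity tests.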

Design and in silico assessment of primer pairs

A uniform primer design approach was applied in the development of all primer pairs, as previously described for the primer design for the Salmonella and Listeria detection and discrimination system (Barbau-Piednoir et al. 2013a, 2013b). The first step consisted of identifying genes of interest, either genus or pathotype specific, by means of a bibliographic study (Clements et al. 2012; Croxen and Finlay 2010). The second step included the collection of primer sequences available in the literature targeting the selected genes giving an amplicon between 60 and 120 bp (Anklam et al. 2012; Botteldoorn et al. 2003; Fukushima et al. 2009; Kim et al. 2010; Nielsen and Andersen 2003; Pavlovic et al. 2010; Perelle et al. 2004; Sharma et al. 1999; Takahashi et al. 2009; Thiem et al. 2004; Tzschoppe et al. 2012). If none were found, primer pairs were designed, preferentially within conserved regions, using the “Primer 3” program (http://frodo.wi.mit.edu/primer3/ (Rozen and Skaletsky 2000)) with the “product size range” specification set at “60 to 120 bp” and “primer size” optimal set at “22 bases.” In the third step, a collection of bacterial DNA sequences of other foodborne pathogenic bacteria and bacteria naturally present in food matrices was retrieved from the NCBI public database (http://www.ncbi.nlm.nih.gov/sites/entrez). An in silico test of the primer pairs was subsequently performed as previously described in Barbau-Piednoir et al. (2013a). Only primer pairs that gave in silico the expected amplicon were retained for the following steps.

Qualitative SYBR® Green qPCR assay and optimal primer concentration

All qPCR assay reactions and analysis of the results were performed according to the protocol described in Barbau-Piednoir et al. (2013a).

The optimal concentration of the selected primer pairs was determined by testing one to three positive isolates with different concentrations of each primer, i.e., between 250 and 1000 nM. The concentration giving the lowest Cq value without formation of a high level of primer dimer was selected. At this selected concentration, a positive sample at a concentration around the limit of detection (LOD) should not present a primer dimer dissociation peak higher than the dissociation peak corresponding to the amplicon derived from the positive sample. The primer pairs used in this study and their optimal concentrations are presented in Table 2.

Table 2 Primer pair sequences, concentration, amplicon size, and Tm value for each SYBR® Green qPCR assay

Selectivity test and inclusivity, exclusivity, and accuracy calculation

Primer pairs that passed the in silico evaluation were tested for their selectivity in situ. This selectivity test consisted of two steps:

  1. A preliminary selectivity test involving a few target isolates and a few non-target isolates (most important foodborne pathogenic bacteria) was performed. Primer pairs amplifying only the DNA extracted from the target isolates were tested for full selectivity.

  2. The full selectivity test allows testing the inclusivity, exclusivity, and accuracy of each developed qPCR assay. This experiment includes target and non-target isolates representing species belonging to 28 genera (76 E. coli isolates, 3 other species of the Escherichia genus, and 34 isolates from 22 other genera), two mold species and three animal species, and a no-template control (NTC) (Table 1). The non-target microorganisms relevant to test the exclusivity were chosen among taxonomically closely related and not closely related (pathogenic or not) bacteria that can be present in the food matrices (ISO: International Organization for Standardization 2011).

The qPCR reactions were performed with approximately 10⁴ copies of genomic DNA under the conditions referred to above.

Five criteria were set to define a “specific signal” in the selectivity analysis of a SYBR® Green qPCR assay (as described in Barbau-Piednoir 2010). They are (1) the presence or absence of an (exponential) amplification, (2) the presence of a single peak with a unique Tm value upon melting analysis, (3) the presence or absence of a single band on agarose gel, (4) the correct size of that band, and (5) the sequence of the amplicon.

The inclusivity, exclusivity, and accuracy of the assay can be calculated from the selectivity test. The inclusivity represents the ability of the assay to detect its targets. The exclusivity represents the ability of the assay to not detect the non-targets. The accuracy represents the closeness of agreement between a test result and the accepted reference value (International Organization for Standardization (ISO) 1993). Their formulas are the following (EU-RL for E. coli 2013):

$$ \mathbf{IN}=\frac{\mathbf{TP}}{\left(\mathbf{TP}+\mathbf{FN}\right)}\times \mathbf{100} $$
$$ \mathbf{EX}=\frac{\mathbf{TN}}{\left(\mathbf{TN}+\mathbf{FP}\right)}\times \mathbf{100} $$
$$ \mathbf{AC}=\frac{\left(\mathbf{TP}+\mathbf{TN}\right)}{\mathbf{N}}\times \mathbf{100} $$

where AC is the accuracy, IN is the inclusivity, EX is the exclusivity, TP is the number of true positive samples, TN is the number of true negative samples, FP is the number of false positive samples, FN is the number of false negative samples, and N is the total number of tested samples.
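As a minimal sketch, the three formulas can be computed directly from the counts of a selectivity test. The function name and the example counts are hypothetical and are not taken from Table 1; N is assumed to equal TP + FN + TN + FP.

```python
def selectivity_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return inclusivity (IN), exclusivity (EX) and accuracy (AC) in percent."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one target and one non-target sample")
    n = tp + fn + tn + fp  # N: total number of tested samples (assumed)
    return {
        "IN": 100.0 * tp / (tp + fn),
        "EX": 100.0 * tn / (tn + fp),
        "AC": 100.0 * (tp + tn) / n,
    }

# Hypothetical counts for one assay
result = selectivity_metrics(tp=18, fn=2, tn=95, fp=0)
print({name: round(value, 1) for name, value in result.items()})
```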

Dynamic range and calculation of the PCR efficiency

Primer pairs presenting an acceptable selectivity (i.e., amplifying all targets and none of the non-targets, as expected) were subsequently examined for their dynamic range and PCR efficiency as described previously (Barbau-Piednoir et al. 2013a, 2013b). The dynamic range of a qPCR assay is the range of concentrations where it performs linearly. The dynamic range was assessed for the 13 SYBR® Green qPCR assays of the CoSYPS Path E. coli by analyzing in duplicate a serial dilution of pure isolate gDNA (10,000 to 0.01 theoretical genomic copies) in a carrier DNA background (4 ng/μL calf thymus DNA (CTD); Invitrogen, Carlsbad, CA, USA). The isolates used were STEC O145:H- (TIAC1681) and STEC O103:H2 (TIAC614) for the uidA and eae assays, Shigella flexneri 2a (12–0081) and Shigella boydii 2 (12–0531) for the ipaH assays as no EIEC strains were available, STEC O157:H7 (TIAC617) and STEC O118:H16 (TIAC1804) for the stx1and2–4 assay, STEC O55:H12 (TIAC1703) and STEC O118:H16 (TIAC1804) for the stx1–185 assay, STEC O157:H7 (TIAC617) and STEC O145:H28 (TIAC623) for the stx2–81 assay, and EAggEC O104:H2 (TIAC2322) and EAggSTEC O104:H4 (TIAC2003) for the aggR and aaiC assays. The carrier DNA prevents dilution errors that arise at low gDNA concentrations. This analysis also allows the assessment of the coefficient of determination (R2) and the PCR efficiency (E) of the SYBR® Green qPCR assays. The coefficient of determination (R2) indicates how well the data fit the linear regression curve. The PCR efficiency (E) can be calculated according to the formula reported by Rutledge and Cote (2003). Although no performance values are given in the latest GMO guidelines for qualitative methods (European Network of GMO Laboratories (ENGL) 2015), R2 ≥ 0.98 and a PCR efficiency ranging between 80 and 120% have been indicated as good performance criteria for the validation of qualitative qPCR methods (Broeders et al. 2014).
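The sketch below illustrates how R2 and E can be obtained from a dilution series, assuming the efficiency is derived from the slope of the standard curve (Cq against log10 of the copy number) in its commonly used form, E = 10^(-1/slope) - 1. The Cq values are hypothetical, and the function names are illustrative only.

```python
import math

def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq against log10(copies); returns slope, intercept, R2."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in cq)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2

def efficiency_percent(slope):
    """PCR efficiency in percent from the standard-curve slope (assumed formula)."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0

def meets_criteria(r2, efficiency):
    """Performance criteria quoted in the text: R2 >= 0.98 and E between 80 and 120%."""
    return r2 >= 0.98 and 80.0 <= efficiency <= 120.0

# Hypothetical dilution series (theoretical genomic copies and mean Cq values)
copies = [10000, 1000, 100, 10, 1]
cq = [21.5, 24.9, 28.2, 31.6, 35.0]

slope, intercept, r2 = fit_standard_curve(copies, cq)
eff = efficiency_percent(slope)
print(f"slope={slope:.2f}, R2={r2:.4f}, E={eff:.1f}%, acceptable={meets_criteria(r2, eff)}")
```

A slope close to -3.32 corresponds to a doubling of the amplicon at each cycle, i.e., an efficiency close to 100%.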

Sensitivity test

The sensitivity of the chromosomal targets (i.e., uidA, eae, and aaiC) was assessed to determine the LOD of these SYBR® Green qPCR assays. The LOD is defined as “the lowest amount or concentration of analyte in a sample,” which can be reliably detected (with a level of confidence of 95%), but not necessarily quantified (ENGL 2015). The strains used were STEC O145:H- (TIAC1681) and STEC O103:H2 (TIAC614) for the eae and uidA assays and EAggSTEC O104:Hnt (TIAC1951) and EAggSTEC O104:H4 (TIAC2003) for the aaiC assays. To determine the LOD, a range of copy numbers between 10 and 0.1 theoretical genomic copies was tested (i.e., 10, 5, 2, 1, 0.5, 0.2, and 0.1). Each dilution was tested in six replicates, for both isolates. Moreover, the analysis was performed at three independent times, under repeatable conditions, resulting in 36 repeats for each dilution point. It should be noted that the dilution points below one theoretical genomic copy were included to check the correctness of the dilution series. Indeed, for chromosomal targets, it is statistically impossible to obtain amplification in all replicates at dilution points below 1 theoretical genomic copy, so none of these dilution points should give 100% positive signals.
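The statistical argument can be illustrated with a simple sampling model. The sketch below assumes that the number of target copies pipetted into a reaction follows a Poisson distribution and that a single copy is always amplified; both are simplifying assumptions, and the function names are illustrative.

```python
import math

def p_at_least_one_copy(mean_copies: float) -> float:
    """Probability that a reaction receives one or more target copies (Poisson model)."""
    return 1.0 - math.exp(-mean_copies)

def p_all_replicates_positive(mean_copies: float, replicates: int = 36) -> float:
    """Probability that every one of the replicates receives at least one copy."""
    return p_at_least_one_copy(mean_copies) ** replicates

# Dilution points used for the LOD determination
for mean_copies in (10, 5, 2, 1, 0.5, 0.2, 0.1):
    single = p_at_least_one_copy(mean_copies)
    all_36 = p_all_replicates_positive(mean_copies)
    print(f"{mean_copies:>4} copies: P(positive)={single:.3f}, P(36/36 positive)={all_36:.2e}")
```

Under this model, even at one theoretical copy per reaction only about 63% of the replicates are expected to contain a template, which is why 100% positive signals at or below that level would point to an error in the dilution series.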

While some of the targets are located on the bacterial chromosome and occur in single copy (i.e., uidA, eae, and aaiC), some other targets are present in several copies. The shiga-toxin genes 1 and 2 (stx1 and stx2) of the STEC strains are prophagic genes, and a single bacterial host can harbor more than one Stx prophage (Fogg et al. 2012). The transcriptional activator of the aggregative adherence fimbriae (aggR) of the EAggEC strains is located on a plasmid (Nataro et al. 1994). The invasion plasmid antigen H (ipaH) is present in multiple copies both on a plasmid and on the chromosome of EIEC and Shigella spp. (Venkatesan et al. 1989). Since these four genes can be present in multiple copies in a single bacterium, the dilution series strategy described above to determine the LOD of the qPCR assays was not performed for these targets. Only the range of genomic copy numbers between 1 and 10 was tested to confirm that these amounts are detected 100% of the time by these SYBR® Green qPCR assays. The tested strains were STEC O145:H- (TIAC1681) and STEC O103:H2 (TIAC614) for the stx1 assays, STEC O157:H7 (TIAC617) and STEC O145:H28 (TIAC623) for the stx2 assays, and EAggSTEC O104:Hnt (TIAC1951) and EAggSTEC O104:H4 (TIAC2003) for the aggR assays.

Repeatability calculation

As described previously (Barbau-Piednoir et al. 2013a, 2013b), to evaluate the repeatability of the assay, independent tests were performed with the same protocol, with the same samples, by the same operator using the same qPCR apparatus within a short interval of time (International Organization for Standardization (ISO) 1993). The repeatability limit (r) and the relative standard deviation of repeatability (RSDr) were calculated according to ISO 16140:2003 (ISO: International Organization for Standardization 2003). The RSDr should be ≤ 25% for all dilutions above the LOD for quantitative methods, but there is no critical value fixed for RSDr regarding qualitative qPCR methods (ENGL 2015). The RSDr and r values of the Cq values were calculated at each dilution point, while the RSDr and r values of the Tm values were calculated with all the Tm values coupled with amplification (Cq < 40).
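A simplified sketch of these two measures is given below. It assumes the common definitions r = 2.8 × s_r and RSDr = 100 × s_r / mean, with s_r estimated as the sample standard deviation of one set of replicate measurements; the exact calculation prescribed by the standard (for instance, pooling variances across runs) may differ, and the Cq values are hypothetical.

```python
import statistics

def repeatability(values):
    """Return mean, s_r, repeatability limit r and RSDr (%) for replicate values."""
    mean = statistics.mean(values)
    s_r = statistics.stdev(values)  # sample standard deviation (assumed estimator of s_r)
    return {
        "mean": mean,
        "s_r": s_r,
        "r": 2.8 * s_r,             # assumed definition of the repeatability limit
        "RSDr": 100.0 * s_r / mean,
    }

# Hypothetical Cq values for one dilution point
cq_replicates = [32.1, 32.9, 33.4, 32.5, 33.0, 32.7]
stats = repeatability(cq_replicates)
print(f"r = {stats['r']:.2f} Cq, RSDr = {stats['RSDr']:.2f}%")
```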

Reproducibility study and calculation

To evaluate the reproducibility of the assays (International Organization for Standardization (ISO) 1993), independent tests were performed with the same protocol, using the same samples, in two different laboratories, by two different operators using different apparatus, i.e., Bio-Rad iQ5 (Bio-Rad, Hercules, CA) and ABI 7300 (Applied Biosystems, Life Technologies, Foster City, CA). The tested samples consisted of gDNA extracted from STEC, EAggEC, and Shigella spp. and subsequently diluted to different concentrations between 10 and 200 genomic copies per reaction. Each sample was analyzed in duplicate by each operator. The positive controls used in this analysis, each analyzed as a single replicate, were 10⁴ copies/assay of (i) STEC O157:H7 (TIAC 1615) for uidA, eae, stx1, and stx2; (ii) Shigella dysenteriae 3 (12–1388) for ipaH as no EIEC were available; and (iii) EAggEC O104:H2 (TIAC2322) for aaiC and aggR.

Two reproducibility measures can be calculated from these results: the relative standard deviation of reproducibility (RSDR) and the uncertainty (U) (Barbau-Piednoir et al. 2013a). The RSDR should be ≤ 35% for all the tested samples (ENGL 2015). The RSDR of the Cq values was calculated for the tested samples. The RSDR of the Tm values was calculated with all the Tm values coupled with amplification (Cq < 40).

CoSYPS Path E. coli on food samples

Pathogenic E. coli inoculum preparation

STEC O91:H21 (stx2 positive, ref. TIAC1863), STEC O157:H7 (eae and stx2 positive, ref. TIAC2096), STEC O55:H12 (stx1 positive, ref. TIAC1873), and STEC O121:H19 (stx2 and eae positive, ref. TIAC1871) were used to artificially contaminate the food samples. To prepare the spike, a single colony was inoculated in 10 ml of BHI broth and cultured at 37 °C without shaking for 16–18 h. This culture was diluted in sterile BHI broth to obtain an OD600nm of 1 (approximately 5 × 10⁸ CFU/ml). This dilution, called D0, was used as the starting culture for a 10-fold serial dilution down to D-7 in buffered peptone water (BPW). Dilutions D-6 and D-7 were enumerated by plating 100 μl of each dilution in triplicate on nutrient agar plates, which were incubated for 18 ± 2 h at 37 °C (Table 4). These two dilutions were used to contaminate the food samples.
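The expected size of the inoculum follows from simple dilution arithmetic. The sketch below assumes that D0 contains exactly 5 × 10⁸ CFU/ml and that each 10-fold step is exact; the counts obtained by plating (Table 4) remain the reference values, and the function name is illustrative.

```python
D0_CFU_PER_ML = 5e8  # approximate density at OD600nm = 1, as stated in the text

def expected_cfu(dilution_step: int, volume_ml: float = 0.1) -> float:
    """Expected CFU in a given volume of the 10-fold dilution D-<dilution_step>."""
    cfu_per_ml = D0_CFU_PER_ML / 10 ** dilution_step
    return cfu_per_ml * volume_ml

# 100 μl of D-6 and D-7 are plated and used as spikes
for step in (6, 7):
    print(f"D-{step}: about {expected_cfu(step):.0f} CFU per 100 μl")
```

Under these assumptions, 100 μl of D-6 and D-7 carry about 50 and 5 CFU, respectively, i.e., a few cells per 25 g sub-sample at the lowest contamination level.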

Artificial contamination of food samples

To obtain food samples contaminated with pathogenic E. coli, artificial contamination was performed. Salami, tomatoes, red fruits, and minced meat (all free of pathogenic E. coli as confirmed by analysis of not artificially contaminated samples (Blank in Table 4)) were purchased at a retail shop. These matrices have been selected as these represent products at risk for pathogenic E. coli contamination. Three sub-samples of 25 g of each food sample (matrix) were stomached in 225 ml of buffered peptone water (BPW) medium in a filter stomacher bag. One sub-sample was kept not contaminated (Blank-“matrix name”), and the two others were contaminated with 100 μl of D-6 and D-7 (D-6-matrix name and D-7-matrix name) by adding the bacteria to the already stomached samples (after stomaching to avoid contamination of the lab material) and subsequent soft homogenization through mixing the stomacher bag by hand. Three no matrix controls were also added containing only the 225 ml of BPW, one without contamination (Blank-blank), the two other contaminated as described previously (D-6-Blank and D-7-Blank). Tomatoes were artificially contaminated with STEC O91:H21 (stx2 positive, ref. TIAC1863), salami was spiked with STEC O157:H7 (eae and stx2 positive, ref. TIAC2096), red fruits were spiked with STEC O55:H12 (stx1 positive, ref. TIAC1873), and minced meat was spiked with STEC O121:H19 (stx2 and eae positive, ref. TIAC1871).

Enrichment step

According to ISO/TS 13136:2012, samples were enriched in 225 ml of BPW for 24 ± 2 h at 37 °C without shaking.

DNA extraction from food samples

After 24 h of enrichment, 1 ml of the enriched broth was transferred into a 1.5-ml micro-centrifuge tube, centrifuged for 10 min at 6000×g at room temperature, and the supernatant was discarded. DNA was extracted from the pellet with the Nucleospin Food Kit (Macherey-Nagel®, Düren, Germany) according to the manufacturer’s recommendations.

CoSYPS Path E. coli on food samples

A 1/10 dilution of the DNA extracted from food samples was identified as the best dilution to avoid inhibition of the PCR reaction and to give a positive signal even with low contamination levels of the food matrix. Thus, the 1/10 dilution of the gDNA extract of each sample was analyzed with the 13 SYBR® Green qPCR assays of the CoSYPS Path E. coli detection system, with the same PCR program for each assay as detailed in Barbau-Piednoir et al. (2013a) and using the appropriate concentration of each primer (Table 2). The PCR positive controls used in this analysis are 10⁴ genomic copies (GC) of STEC O157:H7 (TIAC 1615) for the uidA, eae, stx1, and stx2 assays; 10⁴ GC of S. flexneri 2a (12–0081) for the ipaH assay; and 10⁴ GC of EAggEC O104:H2 (TIAC 2322) for the aggR and aaiC assays. The PCR negative control used in this analysis is an NTC using UltraPure DNase/RNase-Free Distilled Water (Life Technologies, Foster City, CA, USA) instead of the DNA template.

Results

In silico selection of the primer pairs and optimization of primer concentration

As a first step, specific genes to detect and discriminate five pathotypes of E. coli were identified.

For the detection of the E. coli species, the uidA gene was selected as it is present in approximately 97% of E. coli isolates (Feng et al. 1991; McDaniels et al. 1996). This gene encodes the β-D-glucuronidase enzyme (Feng et al. 1991; McDaniels et al. 1996). However, this uidA gene is also present in approximately 44% of Shigella spp., 29% of Salmonella spp., and in a few Yersinia, Citrobacter, Edwardsiella, Hafnia, Staphylococcus, Streptococcus, Corynebacteria, and Clostridium species (Feng et al. 1991; Tryland and Fiksdal 1998). Therefore, as an additional control to discriminate for most E. coli and Shigella, the ipaH gene was chosen for the detection and discrimination of EIEC and Shigella isolates, as it is present in all EIEC and Shigella but not in other E. coli (Ud-Din and Wahid 2014). This gene encodes a type-3 secretion system effector that is involved in the bacteria’s escape from phagosomes of the host cells and in the inhibition of the immune system of the host (Croxen and Finlay 2010; Schroeder and Hilbi 2008). For the detection and discrimination of STEC and EAggSTEC isolates, the stx1 and stx2 genes were selected as they encode the two sub-groups of shiga toxins, i.e., Stx1 and Stx2, which are the main virulence factors of STEC (Croxen and Finlay 2010). These toxins suppress an inflammatory response in the host and increase the attachment of the pathogen to the host cell (Croxen and Finlay 2010). Stx2 is more prevalent in hemorrhagic colitis and HUS than Stx1 (Nataro and Kaper 1998). As the eae gene is known as the specific marker for EPEC and EHEC strains (Croxen and Finlay 2010), it was picked for the detection and discrimination of EPEC and EHEC isolates. This gene encodes the bacterial outer membrane protein intimin, which is involved in the intimate adherence and effacement of the host cells (Croxen and Finlay 2010). 
For EAggEC and EAggSTEC isolates, the aggR and aaiC genes were selected as they are described in literature as the discriminatory genes for these pathotypes (Croxen and Finlay 2010; Dudley et al. 2006). These genes encode, respectively, the transcriptional activator of the aggregative adherence fimbriae and the AggR-activated island C which induces the adherence of EAggEC and EAggSTEC to the intestinal mucosa of the host (Croxen and Finlay 2010; Dudley et al. 2006).

The primer pairs collected in the literature and those designed during this study (see the “Materials and methods” section) to detect the selected targets were evaluated in silico for their selectivity (data not shown). During this in silico evaluation, some nucleotides were degenerated when necessary. The primer pairs passing the in silico selectivity test were then evaluated in situ. Twenty primer pairs were tested in situ with the preliminary selectivity test (data not shown). From these, 13 primer pairs were retained for 6 targets, with at least two primer pairs per target in order to reduce the risk of false negatives due to mutations in the targeted sequence: uidA-3 and uidA-7 for E. coli detection, ipaH-569 and ipaH-3 for EIEC and Shigella spp. discrimination, eae-185 and eae-EBP-1 for EPEC and EHEC discrimination, stx1and2–4, stx1–185, and stx2–81 for STEC and EAggSTEC discrimination, and aggR-185, aggR-2, aaiC-EBP1, and aaiC-EBP2 for EAggEC and EAggSTEC discrimination (Fig. 1), when applied to isolates. The optimal concentration of the 13 primer pairs was also evaluated (Table 2). These assays were subsequently experimentally evaluated for their full selectivity.

Fig. 1

Decision tree of the CoSYPS Path E. coli. The CoSYPS Path E. coli is a multi-target SYBR® Green qPCR system. Each target is a marker of a pathotype of E. coli (except uidA, which is a marker of E. coli (and Shigella)). Each level of detection is performed by two SYBR® Green qPCR assays. Abbreviations for the genes are as follows: uidA β-D-glucuronidase gene, ipaH invasion plasmid antigen H, aggR aggregative adherence fimbriae gene, aaiC aggR-activated island C, eae intimin gene, stx1 Shiga toxin 1, stx2 Shiga toxin 2
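Since the figure itself is not reproduced here, the sketch below is only one possible reading of the decision logic for isolates, inferred from the marker-to-pathotype assignments given in the text. The order of the checks, the handling of combined markers, and the function name are assumptions. A marker is taken as positive when at least one of its assays gives a specific signal.

```python
def classify_isolate(markers: dict) -> str:
    """Assign a pathotype to an isolate from its positive markers (assumed logic)."""
    def positive(name):
        return bool(markers.get(name, False))

    stx = positive("stx1") or positive("stx2")
    eagg = positive("aggR") or positive("aaiC")

    if positive("ipaH"):
        return "EIEC/Shigella"
    if eagg and stx:
        return "EAggSTEC"
    if eagg:
        return "EAggEC"
    if stx and positive("eae"):
        return "STEC (EHEC)"
    if stx:
        return "STEC"
    if positive("eae"):
        return "EPEC"
    if positive("uidA"):
        return "E. coli, no targeted pathotype marker"
    return "no target detected"

# Hypothetical marker profiles
print(classify_isolate({"uidA": True, "stx2": True, "eae": True}))
print(classify_isolate({"uidA": True, "aggR": True, "aaiC": True, "stx2": True}))
print(classify_isolate({"uidA": True, "eae": True}))
```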

Determination of SYBR® Green qPCR assays’ inclusivity, exclusivity, and accuracy calculation

The primer pairs uidA-3 and uidA-7 amplified 100% of the tested E. coli isolates, 83% of the tested Shigella spp., and none of the non-target strains or the NTC (Table 1). The primer pairs ipaH-569 and ipaH-3 amplified 100% of their target Shigella strains (no EIEC strains were available in our collection), and eae-185 and eae-EBP-1 amplified 100% of the eae-positive E. coli strains; none of these assays amplified the non-target strains or the NTC (Table 1). The stx1and2–4 assay for STEC detection gave a specific amplification with 90.8% of the tested STEC strains and 0% of the non-target strains and NTC (Table 1). The STEC strains not detected by the stx1and2–4 assay are all STEC strains containing the variant “f” of the stx2 gene. In other words, the stx1and2–4 assay is able to amplify all variants (a, b, c, d, e, g) of stx2 (Croxen et al. 2013) except the variant “f.” The stx1–185 and stx2–81 assays for STEC detection gave a specific amplification with 100% of the tested STEC strains and 0% of the non-target strains and NTC (Table 1). Thus, the stx2–81 assay amplifies all variants of the stx2 gene, including the variant f. The aggR-185, aggR-2, aaiC-EBP1, and aaiC-EBP2 assays for EAggEC and EAggSTEC detection gave a specific amplification with 100% of the tested EAggEC strains and 0% of the non-target strains and NTC (Table 1). This last result was obtained with a low number of strains (four) due to lack of availability, and a higher number of strains would be required to obtain a more accurate result. Therefore, considering the two assays per target, the detection of each target is 100% accurate.

These 13 assays applied on a positive control showed a unique band at the expected size upon agarose gel analysis (data not shown). Each amplicon was sequenced and shown to correspond to the expected sequence (data not shown). In addition, the 13 detection assays gave a unique melting peak with a specific melting temperature (Table 2).

Determination of SYBR® Green qPCR assays’ dynamic range and PCR efficiency

The 13 SYBR® Green qPCR assays of the CoSYPS Path E. coli applied to isolates performed in a linear manner between 1 and 10,000 copies, as their R2 values were between 0.972 and 0.999 (Table 3). From the dynamic range analyses, the PCR efficiency (E) of each assay was calculated. The 13 assays displayed PCR efficiencies ranging between 93.3 and 108.5% (Table 3). Although no performance values are given in the latest GMO guidelines for qualitative methods (ENGL 2015), R2 ≥ 0.98 and a PCR efficiency ranging between 80 and 120% have been indicated as good performance criteria for the validation of qualitative qPCR methods (Broeders et al. 2014).

Table 3 Standard curve, amplification efficiency, coefficient of correlation, LOD, repeatability, and reproducibility of the 13 SYBR® Green qPCR assays for detection and discrimination of pathogenic E. coli

Determination of SYBR® Green qPCR assays’ sensitivity and repeatability

The LOD of the chromosomal assays was determined to be between 1 and 10 copies (Table 3 and Supplementary Table S1), complying with the requirement “between 1 and 10 CFU” (ISO: International Organization for Standardization 2011). The r values at the LOD of the Cq values ranged between 1.7 and 3.7 Cq and those of the Tm values at all dilutions ranged between 0.4 and 0.9 °C (Table 3). The RSDr values at the LOD of the Cq values of the 13 assays were between 1.9 and 3.8%, while those of the Tm values at all dilutions ranged between 0.2 and 0.4% (Table 3). The RSDr should be ≤ 25% for all dilutions above the LOD for quantitative methods, but there is no critical value fixed for RSDr regarding qualitative qPCR methods (ENGL 2015).

Determination of SYBR® Green qPCR assays’ reproducibility

For all the developed SYBR® Green qPCR assays, the RSDR values were below 35% as requested by the ENGL guideline (ENGL 2015), i.e., between 0.09 and 0.94% for the Tm values and between 0.02 and 6.89% for the Cq values (Supplementary Table S2). The uncertainty (U) at a 99% confidence level was also calculated from the reproducibility data. U ranged between 0.10 and 1.15 for the Tm values and between 0.83 and 3.97 for the Cq values (Supplementary Table S2).
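The sketch below shows one way such reproducibility figures could be obtained. It assumes that RSDR is the relative standard deviation of measurements pooled across runs and that the uncertainty is an expanded uncertainty U = k × SD, with k taken as the two-sided 99% coverage factor of the normal distribution (about 2.58). The formula used in this study is not given in this section, so both the formula and the example Tm values are assumptions for illustration only.

```python
from statistics import NormalDist, mean, stdev

def reproducibility(values, confidence=0.99):
    """Return RSD_R (%) and expanded uncertainty U for pooled measurements.

    Assumption: U = k * SD, where k is the two-sided normal coverage
    factor for the requested confidence level (k ~ 2.58 at 99%).
    """
    m = mean(values)
    s_R = stdev(values)
    rsd_R = 100.0 * s_R / m
    k = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return rsd_R, k * s_R

# Hypothetical Tm values (in °C) obtained under reproducibility conditions
tm_values = [78.4, 78.6, 78.5, 78.3, 78.7]

rsd_R, U = reproducibility(tm_values)
print(f"RSD_R={rsd_R:.2f}%, U(99%)={U:.2f} °C")

# Criterion quoted in the text: RSD_R below 35%
within_limit = rsd_R < 35.0
```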

CoSYPS Path E. coli on food samples

After the validation of the CoSYPS Path E. coli system on pure isolates, the performance of the developed SYBR® Green qPCR assays was tested on real-life food samples. To this end, different representative food matrices were artificially contaminated at different initial concentrations with different E. coli pathotypes, followed by an enrichment step and a total DNA extraction. The negative controls (i.e., non-contaminated food samples) demonstrated the absence of natural STEC, EAggEC, EPEC, and EIEC contamination in the four matrices tested; i.e., the ipaH, eae, stx1, stx2, aaiC, and aggR markers were negative in the corresponding qPCR assays (Table 4). Natural presence of E. coli (i.e., a positive uidA marker) was shown for minced meat, and a weak positive signal was detected in the other food matrices but not in the no-matrix blank (Blank-blank) control. The low level of spiked STEC contamination in each of the four types of matrices (i.e., tomato, salami, red fruits, and minced meat) was detected after 24-h enrichment using the SYBR® Green qPCR assays of the CoSYPS Path E. coli system. Indeed, all expected markers were positive in the qPCR assays using the DNA extracted from the artificially contaminated samples as template (Table 4). The qPCR assays of the CoSYPS Path E. coli detection system were able to detect an initial contamination level as low as 2 to 7 cfu/25 g after 24 h of enrichment, giving Cq values from 13.4 to 25.5 (Table 4), which are far below the Cq values of the assays at the LOD (i.e., 32.11 to 32.88 (Supplementary Table S1)).

Table 4 Detection of spiked pathogenic E. coli on four different food matrices using the CoSYPS Path E. coli detection system

Discussion

In the EU, STEC is the fourth cause of human zoonosis, particularly the O157 serogroup (European Food Safety Authority (EFSA), European Centre for Disease Prevention and Control (ECDC) Prevention and Control 2016). Until 2013, the reference method for the detection of STEC (ISO 16654:2001 2001) was limited to the detection of E. coli O157. This method, however, does not include virulence gene detection. To deal with the detection of the new O104:H4 serotype (King et al. 2012; Wadl et al. 2011) that caused the large German and French E. coli outbreak, the EU has adapted and extended its related regulation, and a new standard for the detection of these serogroups was published. In this new standard, prior to detection of the serogroup by qPCR, a sample is screened by qPCR for the presence of the main virulence factors of STEC, i.e., stx1, stx2 (able to be transferred to other serogroups as proven by the 2011 outbreak), and eae. This is a significant change in the detection strategy of STEC compared to the previous E. coli detection method (ISO 16654:2001), where only serogroup O157 was targeted without any consideration of the presence of particular virulence genes.

In line with this new STEC screening strategy, the inclusion of other virulence factors from E. coli pathotypes other than STEC, such as EAggEC, EAggSTEC, EIEC, and EPEC, could be considered in order to better evaluate the possible presence of pathogenic E. coli in a food sample.

In this study, the combinatory SYBR® Green qPCR screening system for pathogenic E. coli (CoSYPS Path E. coli) was developed and validated on isolates and explored on food samples in order to answer this need. Although next-generation sequencing (NGS) is now becoming a standard for surveillance and typing of bacterial isolates (whole-genome sequencing), metagenomics on more complex samples is still far from becoming routine practice, as it remains too expensive and requires rather sophisticated data analysis compared to CoSYPS Path E. coli for rapid screening of pathogens in food samples. The 13 SYBR® Green qPCR assays developed and validated in this study allow the detection of six genes of interest and thereby the discrimination of E. coli, Shigella, and five pathotypes of E. coli isolates (Fig. 1). It should be noted that, if applied to food samples, the uidA gene can also be positive if Salmonella is present in the tested sample (Feng et al. 1991; Tryland and Fiksdal 1998). Thus, if uidA is the only positive gene, the CoSYPS Salmonella (Barbau-Piednoir et al. 2013b) could be run to check for the presence of Salmonella in the sample. Two SYBR® Green qPCR assays have been developed for each targeted gene in order to avoid false negatives due to polymorphisms in the primer annealing sites and to detect all variants of the targeted gene. The 13 assays have been tested for their exclusivity, inclusivity, and accuracy. For the strains available for testing, all of them showed efficient detection of the target with an accuracy of 100%, except for the assay stx1and2–4, which does not detect the variant f of the stx2 gene (accuracy of 90.5%; Table 1). This variant, first described in pigeon isolates, was rarely associated with symptomatic human infections and was therefore not included in ISO/TS 13136:2012.
However, an increase in stx2f variants was recently observed in human isolates in the Netherlands, and this was linked to mild disease (Friesema et al. 2014) and occasionally severe disease (Friesema et al. 2015). The second assay of the stx2 gene in the CoSYPS Path E. coli system, i.e., stx2–81, detects all variants of the stx2 gene including the variant f. Thus, with the two assays, all variants of the stx2 gene are detected. This is an added value of the CoSYPS Path E. coli system presented in this paper. Therefore, considering both assays of each target, the detection of the seven targets (six virulence genes and one E. coli marker) of the CoSYPS Path E. coli is 100% accurate for all the strains tested. The LOD of the chromosomal assays has been studied and is between 1 and 10 genomic copies, which complies with the foodborne PCR performance requirements of ISO 22118:2011 (2011). Detection at these levels was also confirmed for non-chromosomal assays. The dynamic range, PCR efficiency, repeatability, and reproducibility of each developed assay were also evaluated and compared with the European requirements for qPCR detection assays for GMO detection (ENGL 2015), where qPCR is the gold standard for detection. All these parameters complied with the EU requirements for the developed assays except for the R2 of the stx1 and 2–4, which is below the required 0.98. This is of low importance as the assays are used qualitatively in the CoSYPS Path E. coli system. In conclusion, this validation demonstrates that the SYBR® Green qPCR methods developed in this study are compliant with the requirements of an efficient qPCR assay. In addition to the above-mentioned advantages of being cheap and allowing melting-curve analysis, the CoSYPS Path E. coli system has other benefits. First, the CoSYPS Path E. coli system is homogeneous; i.e., all SYBR® Green qPCR assays can be run on a single 96-well plate as they are all validated with the same PCR program.
Secondly, each SYBR® Green qPCR assay constituting the CoSYPS Path E. coli system shows satisfactory inclusivity, exclusivity, repeatability, and reproducibility, as demonstrated during its validation. Thirdly, the CoSYPS Path E. coli analysis is fast. Indeed, the screening results can be obtained 1 day after receiving the suspected food samples, which already indicates the possible presence of one of the five pathotypes. Further analyses, including the isolation of the strain and subsequent confirmation of the pathotype, will however be needed, as a final conclusion requires the presence of the detected target genes within one genome (isolate). Fourthly, due to its modularity, if a new emerging hybrid strain appears, as observed in the O104:H4 outbreak (King et al. 2012; Wadl et al. 2011), new targets can be easily and rapidly added to the existing CoSYPS Path E. coli system. Last but not least, the CoSYPS Path E. coli could be combined with the 11 SYBR® Green qPCR assays previously developed and validated for the detection and discrimination of Listeria (CoSYPS Listeria) and Salmonella (CoSYPS Salmonella) (Barbau-Piednoir et al. 2013a, 2013b, 2015). These 24 assays constitute a multi-pathogen screening system, called the CoSYPS Path Food system for “Combinatory SYBR® Green qPCR Screening system for pathogen detection in food samples.” As for the individual species-specific CoSYPS systems, this screening system could be run in a single 96-well plate as all developed qPCR assays use the same PCR program. Furthermore, since this approach is also modular, selected assays could be run individually or more qPCR assays could be combined to detect a wider range of foodborne pathogens or emerging pathogens in the same sample. The only requirement to add a new assay to this modular CoSYPS Path Food system is that the assay can be run under the same conditions, allowing its use in high-throughput mode in the same 96-well plate.
Additionally, to allow a user-friendly and automated data analysis of the CoSYPS Path Food results, especially when such a large number of qPCR assays are run, a decision support system (DSS) has been previously developed (Van den Bulcke et al. 2010). Combined with this DSS, the CoSYPS Path Food detection system offers a very useful approach for high-quality screening of food samples, which makes it a remarkable food surveillance tool that can be adapted to the needs of the laboratory. It will also considerably reduce the time and the cost of a sample analysis. Such simultaneous detection may be useful when a global screening and rapid identification of foodborne pathogens is requested, as in the case of a bio-emergency or outbreak of unknown origin.
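The combinatory reading of the assays discussed above can be sketched in a few lines. The first function reflects the principle stated in the text, namely that a target is called positive if at least one of its assays is positive. The second function is only a simplified, hypothetical mapping from marker combinations to pathotype hints; it does not reproduce the decision tree of Fig. 1 or the published DSS, and the assay identifiers ("assay_a", "assay_b") are placeholders. For food samples, such hints indicate only a possible presence, since the detected genes may originate from different genomes.

```python
def call_targets(assay_results):
    """Collapse assay-level results into target-level calls.

    assay_results maps (target, assay_id) -> bool; a target is positive
    if at least one of its assays is positive.
    """
    calls = {}
    for (target, _assay_id), positive in assay_results.items():
        calls[target] = calls.get(target, False) or bool(positive)
    return calls

def suggest_pathotypes(calls):
    """Simplified, hypothetical interpretation of marker combinations."""
    stx = calls.get("stx1", False) or calls.get("stx2", False)
    eagg = calls.get("aaiC", False) or calls.get("aggR", False)
    hints = []
    if stx and eagg:
        hints.append("EAggSTEC")
    elif stx:
        hints.append("STEC")
    elif eagg:
        hints.append("EAggEC")
    if calls.get("eae", False) and not stx:
        hints.append("EPEC")
    if calls.get("ipaH", False):
        hints.append("EIEC/Shigella")
    if not hints and calls.get("uidA", False):
        hints.append("E. coli (no virulence marker; Salmonella not excluded)")
    return hints

# Hypothetical sample: one stx2 assay misses the variant, the other detects it
results = {
    ("uidA", "assay_a"): True,
    ("stx2", "assay_a"): False,
    ("stx2", "assay_b"): True,
    ("eae", "assay_a"): True,
    ("eae", "assay_b"): True,
}
calls = call_targets(results)
hints = suggest_pathotypes(calls)
print(calls, hints)
```

In this example the stx2 target is still called positive although one assay is negative, which is the situation described above for the stx2f variant.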

The present paper focuses on the targeted genes and the performance criteria of the qPCR method. The method is fully validated for use with isolates. As a proof of concept, the CoSYPS Path Food screening system was tested on four artificially contaminated representative food matrices. For full implementation of the workflow for food samples, it is recommended to extend the number of tested food matrices, to include sprouts, sprouted seeds, and the irrigation water obtained during the sprouting process, for which legislation exists, and to test more strains per matrix. This would confirm the full applicability of the system for food in the context of the EU legislation and ISO norms currently used by the EU enforcement laboratories.