Prediction of antimicrobial resistance in clinical Campylobacter jejuni isolates from whole-genome sequencing data

Campylobacter jejuni is recognised as the leading cause of bacterial gastroenteritis in industrialised countries. Although the majority of Campylobacter infections are self-limiting, antimicrobial treatment is necessary in severe cases. Therefore, the development of antimicrobial resistance (AMR) in Campylobacter is a growing public health challenge and surveillance of AMR is important for bacterial disease control. The aim of this study was to predict antimicrobial resistance in C. jejuni from whole-genome sequencing data. A total of 516 clinical C. jejuni isolates collected between 2014 and 2017 were subjected to WGS. Resistance phenotypes were determined by standard broth dilution, categorising isolates as either susceptible or resistant based on epidemiological cutoffs for six antimicrobials: ciprofloxacin, nalidixic acid, erythromycin, gentamicin, streptomycin, and tetracycline. Resistance genotypes were identified using an in-house database containing reference genes with known point mutations and the presence of resistance genes was determined using the ResFinder database and four bioinformatical methods (modified KMA, ABRicate, ARIBA, and ResFinder Batch Upload). We identified seven resistance genes including tet(O), tet(O/32/O), ant(6)-Ia, aph(2″)-If, blaOXA, aph(3′)-III, and cat as well as mutations in three genes: gyrA, 23S rRNA, and rpsL. There was a high correlation between phenotypic resistance and the presence of known resistance genes and/or point mutations. A correlation above 98% was seen for all antimicrobials except streptomycin with a correlation of 92%. In conclusion, we found that WGS can predict antimicrobial resistance with a high degree of accuracy and have the potential to be a powerful tool for AMR surveillance. Electronic supplementary material The online version of this article (10.1007/s10096-020-04043-y) contains supplementary material, which is available to authorized users.


Introduction
Campylobacter is a foodborne bacterial pathogen, which is able to cause gastroenteritis in humans and is found in the environment where it is widely distributed in the gastrointestinal tract of most warm-blooded animals [1]. Foodborne transmission is believed to be the main route of human infection [2,3]. In 2018, Campylobacter continued to be the most common foodborne pathogen causing infections in Denmark with more than 4500 registered cases; most of these were caused by C. jejuni (85-95%) [4]. Clinical symptoms during infection often include diarrhoea, fever, abdominal pain, nausea, headache, and/or vomiting [5,6] and in some cases result in Guillain-Barré syndrome (GBS) or reactive arthritis [7,8]. Most patients recover without any antimicrobial treatment, whereas more severe and immunocompromised cases may need antibiotic treatment. Due to the extensive use of antimicrobials worldwide, within the production of food animals and human medicine, antimicrobial resistance (AMR) against clinically relevant antibiotics is emerging in Campylobacter spp. presenting a serious health concern [2,9,10]. In Denmark, AMR surveillance in Campylobacter jejuni has been in place since 1996 based on phenotypic susceptibility testing against ciprofloxacin (CIP), nalidixic acid (NAL), erythromycin (ERY), gentamicin (GEN), streptomycin (STR), and tetracycline (TET). The data on clinical isolates is reported as part of The Danish Integrated Antimicrobial Resistance Monitoring and Research Programme (DANMAP) (https://www.danmap.org/).
Campylobacter has developed several mechanisms for antibiotic resistance including point mutations, acquisition of resistance genes, and efflux systems [11][12][13]. Resistance towards fluoroquinolones (FQs) in Campylobacter is mainly caused by point mutations in the DNA gyrase A (GyrA) region at multiple positions [10,14]. The mechanisms involved in erythromycin resistance include ribosomal target modification and efflux of the antibiotic from the cell and this resistance is often associated with point mutation in domain V of the 23S rRNA, which is the target of macrolides, or the ribosomal proteins L4 and L22 [10,15]. Campylobacter harbour three copies of the 23S rRNA gene, but only two copies need to be affected by point mutations to result in erythromycin resistance [16,17]. Streptomycin resist ance in Campylobacter is associated with substitution within the ribosomal RpsL protein or by expression of an aminoglycosidemodifying enzyme (ANT(6)-I) [18,19]. Resistance towards gentamicin and tetracycline is in general conferred by the acquisition of resistance-associated genes. Resistance to tetracycline in Campylobacter is mainly conferred by the ribosomal protection protein tet(O), a plasmid-encoded gene, whereas the resistance gene aph(2″)-If often is associated with gentamicin resistance [20][21][22].
The whole-genome sequencing (WGS) technology provides complete genomic DNA sequence of a bacterium, and WGS has proved to be an effective surveillance tool for foodborne pathogens [23,24]. This method provides a large amount of information generated from a single assay [25] and allows multiple tests to be performed in silico simultaneously, whereas phenotypic susceptibility tests can be time-consuming, labour intensive, and are highly dependent on standardised laboratory procedures. Several studies have demonstrated that WGS can be used to identify resistant genotypes for different foodborne pathogens [23,24,26]. The aim of this study was to evaluate the concordance between WGS-based AMR prediction and the phenotypes identified from conventional antimicrobial susceptibility in a set of clinical C. jejuni isolates. The focus of the evaluation is to allow the replacement of phenotypic testing with a WGS-based surveillance of AMR in Campylobacter.

Bacterial isolates
Clinical C. jejuni isolates representing common and less common resistance profiles were included as the study population. In total, 516 clinical C. jejuni isolates from Danish patients were analysed. The isolates were submitted to SSI from 2014 to 2017. Of the submitted isolates, 422 were from domestically acquired infections, 84 were travel related, and 10 isolates had no available travel information. At SSI, the isolates were subjected to MALDI-TOFF for species confirmation, and antimicrobial susceptibility was tested as described below.

Antimicrobial susceptibility testing
Isolates were grown on modified charcoal cefoperazone deoxycholate agar (mCCDA) plates. After overnight incubation, one colony was selected and inoculated on blood agar plates. This culture was used for AMR testing and DNA purification for WGS. Incubation was performed in 18 to 24 h under microaerophilic conditions (85% nitrogen, 10% carbon dioxide, 5% oxygen) at 41°C.

Whole-genome sequencing
Genomic DNA was purified from pure cultures using QIAGEN DNeasy Blood & Tissue Kit (https://www.qiagen. com). The DNA concentration was measured using an Invitrogen Qubit Fluorometer (ThermoFisher Scientific). After adjustment of the concentration, the isolates were sequenced using Illumina sequencing technology (https:// www.illumina.com) on either Miseq or Nextseq sequencing machines using the Nextera XT Library preparation protocol for paired-end reads of 150 bp or 250 bp. The sequence data was evaluated by an in-house quality-control pipeline (https:// github.com/ssi-dk/bifrost) to check for contaminations (maximum 5% of other genus allowed), correct species identification, and read depth and genome size. Sequences representing C. jejuni isolates with a genome size outside the range of 1.6-1.9 mbp were excluded as well as if the number of contigs exceeded 500. Coverage depth of the included genomes ranged from 32 to 506 (median 111). Data analysis was performed in BioNumerics version 7.6 (Applied Maths) where 7 locus MLST was assigned and phylogenetic minimum spanning trees were generated.

AMR prediction
The presence of acquired resistance genes was determined by four approaches: two based on prediction from methods using de novo assemblies (ResFinder Batch Upload (web tool) [28] and ARIBA [29]) and two methods using reference mapping (KMA [30] and ABRicate, available at https://github.com/ tseemann/abricate). All methods utilised the ResFinder gene database for detection and for all four approaches, hits were only considered for gene lengths ≥ 90% and identities of ≥ 60%. When alleles were not represented in the database (i.e., novel alleles of a present resistance gene), reference mapping by the use of KMA splits the mapping between two or more references in the database. To alleviate this in our setup, an additional step was added to perform another mapping against the reference gene with the highest coverage in order to obtain a result with a full-length gene variant (modified KMA; https://github.com/ssi-dk/modified_kma).
For detection of point mutations conferring antimicrobial resistance, a database containing reference genes (gyrA, rplD, rplV, rpsL, and 23S) and associated point mutations known from the literature to be coupled with antimicrobial resistance in C. jejuni was constructed from C. jejuni strain NCTC 11168 (Table 1). Point mutations were detected using a local wrapper (https://github.com/ssi-dk/punktreskma) for KMA [30]. For 23S rRNA, where multiple gene copies (3) are present in the genome, the detection of a certain nucleotide at a certain position was done by considering the particular depth for each nucleotide, specifically by assuming a uniform distribution of reads over gene copies. Thus, the count of each nucleotide was assumed to be the fraction of reads it was observed on. Of note, this method does not consider the cooccurrence of point mutations on single reads.
In addition, we included in the database the point mutations G86A and C696T in the promotor region (P cmeABC ) of the efflux system cmeABC. These point mutations have been hypothesised to increase the level of resistance towards CIP /NAL, ERY, and TET [31,40,41].
For each isolate, phenotypic susceptibility to a given antimicrobial agent was compared with the presence of acquired resistance genes and resistance-associated point mutations found by WGS. The correlation between resistant phenotypes and genotypes was calculated by dividing the number of isolates considered resistant based on their genotype by the number of isolates exhibiting resistant phenotypes.
Susceptibility against the six antimicrobials was retested for 62 isolates and new sequence data was obtained for 47 of these to consolidate observed discrepancies between phenotype and predicted genotype.
The distribution of phenotypic resistance in the study population is illustrated in Fig. 1. All phenotypic resistance profiles were widely represented in the population and the 43% quinolone-resistant isolates were found in most major sequence types (ST).

Concordance between phenotypic and genotypic AMR
In the study population, known resistance-associated point mutations were observed for gyrA, 23S, and rpsL mediating resistance to CIP/NAL, ERY, and STR, respectively. Some resistance-associated point mutations were not found in the study population (Table 1) and therefore, it was not possible to verify these. After retesting, it was found that most deviations were due to an error in the phenotypic resistance profile and only seven isolates ended up with a new sequence after WGS rerun (possible mix-up of isolates). An overview of the a Verified: point mutations were found in the study population and these correlated with the relevant phenotype b Not verified: point mutations found in study population but not possible to connect to phenotype c Not found in data: point mutations not found within study population detected resistance-associated point mutations and acquired resistance genes is shown in Table 3.
The most frequently observed point mutation was the gyrA T86I, which was detected in 215/221 (97%) of the quinoloneresistant isolates. In 13 of these isolates, the gyrA P104S point mutation was also present. However, the importance of P104S in relation to CIP/NAL resistance could not be verified in this study as P104S was also found in a CIP/NAL sensitive isolate (MIC CIP = 0.25 μg/ml, MIC NAL = 8 μg/ml). We only found one CIP/ NAL resistant isolate harbouring the gyrA D90N mutation.
Three isolates were phenotypically resistant to just one of the two quinolones. Two CIP-resistant isolates harboured gyrA T86I and showed high-level resistance towards CIP (MIC CIP = 16 μg/ml) while being well below the breakpoint for NAL (MIC NAL = 2 μg/ml and 4 μg/ml). One NALresistant isolate harboured the gyrA T86A mutation and showed high-level resistance towards NAL (MIC NAL = 64 μg/ml) while being just below the breakpoint for CIP (MIC CIP = 0.5 μg/ml). Discrepancies between the phenotype and genotypic prediction were observed for four isolates. These isolates showed a resistant phenotype against CIP/ NAL but without any gyrA point mutation present.
Eleven (2%) isolates were resistant to ERY and all carried resistance-associated point mutations in the 23S genes; eight harboured the mutation A2075G, while three harboured the A2074T mutation. Most (10/11) isolates had the same point mutation in all three copies of the 23S rRNA gene. One isolate carried a mutation in only two of the three gene copies, while still displaying high-level resistance (MIC ERY > 128 μg/ml).
It was not possible to verify the importance of two point mutations, D72N and A103V, associated with ERY resistance (Table 1). We found that 68 isolates harboured A103V but only three isolates showed phenotypic resistance towards erythromycin (MIC > 128 μg/ml) and these isolates also harboured point mutation in 23S at position 2075. Furthermore, D72N was also found in an erythromycin sensitive isolate.
Among the 516 isolates, seven different acquired resistance genes were identified. Four of these genes are known to mediate resistance towards antimicrobials represented in the phe- Phenotypically, 120 (23%) of the isolates were resistant to tetracycline and this was conferred by either tet(O) or   Figure 2 shows that TET resistance occurred widespread in the population. The less common gene, tet(O/32/O), was rare in the most common STs and the non-functional variant of tet(O) was only found in isolates of ST21 (n = 1), ST50 (n = 7), and the closely related ST8873 (n = 3).
Only two isolates in the study population were resistant to the aminoglycoside GEN, both with high-level resistance (MIC GEN ≥ 16 μg/ml) and harbouring the resistance gene aph(2″)-If.
Twelve (2%) isolates were STR resistant, and 11/12 (92%) possessed either the ant(6)-Ia (n = 6) or one of the known rpsL point mutations (n = 5) ( Table 3). The resistant isolate (MIC STR = 16 μg/ml) without any of these markers carried the aph(3′)-III gene, which is known to confer resistance to other aminoglycosides than STR. This gene was also found in three STR susceptible isolates.
We investigated the presence of the point mutations G86A and C696T in the promotor region of CmeABC (P cmeABC ) in the study population. G86A was not found in any of the isolates, but the point mutation C696T was identified in 21/516 (4%) isolates. All 21 isolates showed resistance to CIP/NAL, 19 against TET, and two against ERY, where all isolates had a known point mutation and/or acquired resistance gene facilitating the resistant phenotype. Comparing the MIC values for these 21 isolates with the same point mutation and/or resistant gene without C696T showed significantly (P < 0.05) higher MIC values to CIP/NAL.  The detection of resistance genes was performed using four different methods (KMA, ABRicate, ARIBA, and ResFinder Batch Upload). In general, there was a high conformity between the four approaches; however, the mapping-based method modified KMA resulted in more resistance genes being identified and therefore the preferred method to use. The four methods identified four genes relevant to the phenotypic panel (Table 4). All applied methods were able to identify aph(2″)-If in the two GEN-resistant isolates and ant (6) Additional resistance genes, not related to the phenotypic test panel, were also found and included blaOXA, aph(3′)-III and cat (results not shown). Modified KMA and ABRicate identified blaOXA, mediating resistance towards beta-lactam, within 458 and 456 isolates, respectively, whereas ARIBA and Resfinder found blaOXA in 459 and 429 isolates, respectively. For 420 of these isolates, all methods identified blaOXA genes and in the remaining isolates, not all methods identified a variant of the blaOXA gene or the gene length of blaOXA was < 90%. The aph(3′)-III gene was identified by all

Discussion
Resistance against clinically important antibiotics in Campylobacter continues to be a public health concern worldwide. Campylobacter is exposed to antibiotics in both animal production and human medicine making development and transmission of AMR possible. To assess the ability of using WGS to identify and predict AMR in C. jejuni, a dataset of 516 clinical isolates were analysed by comparing their resistance genotypes to their respective phenotypic resistance profile. The resistance profiles were found based on susceptibility to six antimicrobials all included in the Danish antimicrobial surveillance programme, DANMAP [2]. In this study, we evaluated and optimised the analysis of WGS data to identify-with high confidence-the relevant genetic markers that enable the prediction of the phenotypic resistance against specific antimicrobials. For the genome-based prediction of AMR, we analysed WGS data by the use of mapping-and assembly-based bioinformatics methods to detect known AMR genes listed in a curated database, ResFinder. For detection of point mutations, we constructed a database containing reference genes and associated point mutations known from the literature. For each of the six antimicrobials tested, there was a concordance between 92 and 100% between the presence of known resistance genes or mutations and MICs above the ECOFF breakpoint. To obtain this high concordance, a number of modifications were made to the raw output from the ResFinder database search as well as the detected point mutations validated before inclusion in the final output presented here.
The four approaches to identify the presence of acquired resistance genes (modified KMA, ABRicate, ARIBA, ResFinder Batch Upload) worked well and with an overall good compliance between the results obtained, however with a few important differences that made us chose a modified KMA as the preferred method. A systematic error was attained when using three of the four methods for the detection of the mosaic gene tet(O/32/O) conferring resistance to tetracycline. Although tet(O/32/O) is present in the ResFinder database, only the mapping-based KMA method was able to detect this gene in 27 isolates, whereas neither of the assembly-based methods (ARIBA and ResFinder Batch Upload) detected tet(O/32/O) with a gene length > 90% in any of the genomes. This would potentially make an under-estimation of the occurrence of TET resistance as this gene was present in 23% of the TET-resistant genomes and it was widely distributed in our C. jejuni population. For all approaches, another systematic error was found to be related to frameshift mutations resulting in non-functional genes. We discovered 11 nonfunctional tet(O) genes resulting in a disagreement between phenotype and genotype. For ten of the isolates, the frameshift was due to a nucleotide deletion, whereas for the last isolate, the tet gene harboured an extra nucleotide compared with the reference. As a result, 3% of the TET sensitive isolates were wrongly determined as resistant. Such frameshift-related errors can be avoided by an automated frameshift check of the matched open reading frames or by a tblastx search.
Verification of markers responsible for resistance is essential for building a reliable pipeline. In this study, it was not possible to verify the significance of some point mutations described in the literature. This was the case for the ribosomal protein L22 (rplV) and the substitution P104S in gyrA associated with resistance against ERY and CIP/NAL, respectively. One mutation in L22 (A103V) was equally common in resistant and sensitive isolates, whereas another mutation in L22 (D72N) was only detected in a sensitive isolate. Thus, our findings suggest that D72N and A103V in L22 do not contribute to erythromycin resistance although these substitutions were reported to be the responsible marker for ERY resistance in one and two isolates, respectively [37,42]. The P104S substitution in gyrA has been found previously in clinical isolates [43,44], but our data could not support the association to fluoroquinolone resistance, as P104S was found either together with T86I in gyrA in resistant isolates or in a phenotypically sensitive isolate. These findings, summarised in Table 1, highlight the importance of being cautious when including suggested mechanisms of resistance in a routine analysis of predicting phenotypic resistance.
The most frequent resistance profile observed among the isolates was CIP/NAL resistance. Quinolone resistance in Campylobacter is most often due to mutations in the DNA gyrase A gene, gyrA, and especially substitutions at Thr86 are associated with high-level resistance to quinolones [32]. The substitution T86I was present in 97% of the quinolone resistant isolates and none of the sensitive isolates in our study. Two isolates carrying the T86I gyrA substitution showed high-level resistance to CIP but were fully susceptible to NAL. This phenomenon is described previously [45] and shows that the exact phenotype may not be possible to predict with the current knowledge. Another mutation, the more atypical T86A gyrA substitution, was observed in one isolate thatin accordance with previous studies [45][46][47] displayed high-level resistance towards NAL, but was susceptible to CIP. Double point mutations resulting in two amino acid substitutions in gyrA were also seen. In addition to the common T86I, 13 isolates had the P104S substitution and one isolate had the D90N substitution. The D90N substitution is less common and associated with moderate resistance [31][32][33]; however, we detected D90N as the only marker of quinolone resistance in one of the CIP/NAL-resistant isolates.
Resistance to erythromycin is commonly mediated by mutations in 23S rRNA, especially A2075G [36,48]. C. coli is more often resistant to macrolides than C. jejuni, and the ermB gene has been shown to mediate resistance to erythromycin in C. coli isolates of animal origin [49,50], but recently ermB was also identified in a clinical C. jejuni isolate [51]. In this study, none of the isolates had the ermB gene. All isolates exhibiting phenotypic resistance against ERY had a mutation in the 23S rRNA genes at position 2074 or 2075, resulting in a 100% concordance between the phenotype and genotype. None of the isolates had any of the described point mutations in the ribosomal protein L4 (rplD).
When disregarding detection of the mentioned nonverified point mutations and the non-functional tet(O) genes, discrepancies between phenotype and genotypic prediction were seen for eight isolates, four for CIP/NAL, one for STR, and three for TET. In one of the TET-resistant isolates, a tet gene was clearly present as determined by the alternative methods. Otherwise, it is possible that the discrepancies were caused by unknown resistance mechanisms, sequence gaps in the genome preventing detection of known resistance genes or point mutations, or due to mixed cultures in the sample.
Based on our findings, we will establish a routine WGSbased analysis pipeline for AMR detection that we believe has a sufficiently high level of performance for replacing the phenotypic surveillance of AMR in clinical isolates of Campylobacter. Phenotypic susceptibility testing may still be optimal for clinical purposes as the WGS approach cannot deliver an MIC value. However, a sequence-based approach predicting MIC values from markers influencing the MICs may be clinically relevant and could be implemented in the database in the future. In our selected C. jejuni population, we only detected a limited number of genes and mutations that seem to be relevant for predicting the AMR profile. Thus, future routine analysis will only include the verified mutations and the acquired genes in the curated public databases with focus on avoiding the systematic errors encountered in this study, e.g. non-functional genes and detection of the mosaic tet gene. The pipeline should be maintained continuously. We were only able to detect a minority of the many reported point mutations, and therefore, it is relevant to include all potentially important mutations in an additional analysis and evaluate these by phenotypic testing of the isolates. Likewise, new genes or variants of genes may be added to the curated databases, and the performance of our detection pipeline should be evaluated for these. In addition, a subset of the isolates should be tested phenotypically to be able to detect new mechanisms, non-functional genes, and generally perform quality assurance on the analysis pipeline. This further evaluation and development can be done on small selected strain collections at regular intervals.
Funding The study has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No 773830.
Data availability Sequences are available at ENA, project numbers PRJEB40238 and PRJEB31119. The full antimicrobial resistance data and the individual accession numbers are available in the supplementary file.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Ethics approval Not applicable.
Consent to participate Not applicable.

Consent for publication Not applicable.
Code availability Code for modified KMA and detection of point mutations are available at https://github.com/ssi-dk/modified_kma and https://github.com/ssi-dk/punktreskma, respectively.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.