Introduction

Clavibacter michiganensis subsp. michiganensis (Cmm) is a seed-borne Gram-positive bacterium that causes bacterial canker of tomato (Tsiantos 1987). It was first isolated in Michigan, USA, in 1910 (Smith 1910, Davis et al. 1984) and has since spread to many parts of the world where tomato is cultivated. The disease reduces the yield and the marketability of the product and can destroy the crop if not contained. Bacterial canker is a serious threat to tomato-producing countries worldwide, having caused catastrophic outbreaks over the years with severe economic losses (Smith 1910, Kleitman et al. 2008, Blank et al. 2016). The first appearance of bacterial canker in Greece was recorded by Zachos and Georgopoulos (1957). Since then, the disease has been observed in most parts of the country where outdoor or greenhouse tomato crops are established. In the last few years, outbreaks of the disease have become more frequent due to the import of seeds and the intensification of cultivation, which favor the growth of the bacterium. From 2003 to 2018, twelve bacterial canker outbreaks were recorded in Greece, during which significant economic losses were documented.

Cmm colonizes plants through wounds or natural openings and reaches the vascular system, where it grows and can be transmitted systemically to every plant tissue, causing plant rot and death (Carlton et al. 1998). Infection of the tomato plant causes symptoms that depend on the age of the plant or the environmental conditions, such as discoloration of the stem and canker, leaf wilting that may progress to one-sided wilting of the plant (hemiplegia), or the pathogen-characteristic spots on the fruits called “bird’s eye” (de León et al. 2011, Gleason et al. 1993). The bacterium spreads easily through the commercial routes of infected seeds, covering large geographic regions worldwide in a short time (Fatmi et al. 1991, Quesada-Ocampo et al. 2012). Additionally, Cmm may spread through infected plant residues in the soil, where the bacterium may survive for 2 to 3 years (Fatmi and Schaad 2002, Hadas et al. 2005), or through irrigation and cultivation practices (Gitaitis et al. 1991, Carlton et al. 1998, Xu et al. 2010). Unfortunately, there is neither a resistant cultivar (Umesha 2006) nor an effective chemical or biological control to date (Umesha 2006, de León et al. 2008).

Cmm is a regulated pathogen that has been on the A2 list of the European and Mediterranean Plant Protection Organization (EPPO) since 1982 (Jensen 1982) because it cannot be controlled effectively and causes serious financial losses. So far, prevention through the application of good agricultural practices and the use of healthy seeds or seedlings is the only way to control the disease (de León et al. 2011).

Several DNA fingerprinting methods have been utilized to exploit the genetic diversity of Cmm strains isolated in different geographic regions or growing seasons. Among them, repetitive element polymerase chain reaction (rep-PCR) using REP, ERIC or BOX primers was very popular as it had the ability to separate strains at the subspecific level, as was shown by Louws et al. (1998). Since then, rep-PCR has often been used for genetic diversity studies, as in isolations from Northwestern Iran (Nazari et al. 2007), Israel (Kleitman et al. 2008), Japan (Kawaguchi et al. 2010, Kawaguchi and Tanina 2014), Turkey (Basım and Basım 2018), Southern Italy (Ialacci et al. 2016), Canary Islands (De León et al. 2009), and Argentina (Wassermann et al. 2017). Apart from rep-PCR, DNA fingerprinting methods used for Cmm phylogenetics were Random Amplification of Polymorphic DNA (RAPD) for strains in Lithuania (Burokiene et al. 2005) and Canary Islands (De León et al. 2009), Inter Simple Sequence Repeats (ISSR) for strains in Japan (Kawaguchi and Tanina 2014) and Turkey (Baysal et al. 2011), Amplified Fragment Length Polymorphism (AFLP) for strains in Canary Islands (De León et al. 2009) and Southern Italy (Ialacci et al. 2016), and Multiple Locus Variable number of tandem repeat Analysis (MLVA) for strains in Belgium (Zaluga et al. 2013).

The above-mentioned DNA fingerprinting methods have been widely used to study the genetic diversity of Cmm. Most of them have shown a satisfactory level of resolution and have provided important information on the structure of the studied populations. Nevertheless, many of these methods are time-consuming, relatively expensive, and show low interlaboratory accuracy and reproducibility. Multi Locus Sequence Typing and Multi Locus Sequence Analysis (MLST and MLSA, respectively) are two different approaches to the study of Cmm genetic diversity that overcome the abovementioned limitations (Maiden et al. 1998).

The genes that are exploited in a MLSA approach are not regulated, meaning that there is a plethora of genes to choose from. Jacques et al. (2012) working on isolated strains from different regions worldwide and Croce et al. (2016) working on isolated strains from Uruguay studied the genetic correlation between Cmm strains performing MLSA and MLST analysis based on six housekeeping genes (atpD, dnaK, gyrB, ppK, recA and rpoB). Similarly, Osdaghi et al. (2018) and Ansari et al. (2019) made genetic diversity studies on strains isolated from Iran but utilized five instead of six genes as was also the case for strains from Chile (Valenzuela et al. 2018). In an analysis with a different selection of housekeeping genes than those that were mentioned above (kdpA, sdhA, ligA, gyrB and bipA), Milijašević-Marčić et al. (2012) studied the genetic variability of Cmm strains isolated in Serbia. Similarly, Tancos et al. (2015) added one more housekeeping (dnaK) and three pathogenic (celA, tomA and nagA) genes to study New York strains by MLSA analysis, while Sen et al. (2018) added three pathogenic genes (ppaA, chpC and tomA) on isolates from Turkey. In Michigan, USA, 96 Cmm strains were analyzed utilizing six genes related to pathogenesis (celA, pat-1, ABC transporter, mop, perforin, and phosphatase C). The same strains were also studied simultaneously by rep-PCR (Quesada-Ocampo et al. 2012). Beyond the population analysis that detected 36 haplotypes, the population structure seemed to follow the geographical distribution of the strains.

None of the abovementioned analyses revealed a geographical or temporal pattern in the distribution of isolates, while the primary infection consistently came from infected plant residues left in the soil rather than from contaminated seeds (de León et al. 2011).

Regarding the pathogenicity of Cmm, the bacterium bears pathogenesis or virulence genes that encode serine proteases, chymotrypsin-like proteases, subtilases, xylanases, pectinases, endoglycanase, and a tomatinase, a type of beta-glucosidase. The role of these genes has been determined mainly by expression and mutational studies; however, their exact function has not been clarified (Nandi et al. 2018). Pathogenicity genes are mainly detected in a 129 kb low-GC region called the chp/tomA pathogenicity island (PAI) (Gartemann et al. 2008) and have been identified in all twelve strains that have been fully sequenced (Thapa et al. 2017). The PAI is subdivided into the chp subdomain, consisting of 16 pathogenicity genes, and the tomA subdomain, containing the tomA gene encoding tomatinase (Kaup et al. 2005, Stork et al. 2008). Usually, the bacterium carries two plasmids, pCM1 (31-59 kb) and pCM2 (64-109 kb), bearing the celA and pat-1 genes, respectively. Tomatinase is one of the major secreted pathogenic effector proteins of Cmm. It degrades the alkaloid α-tomatine, which is present in invaded tomato tissues and participates in the plant's defense mechanism against pathogenic microorganisms. Mutational studies have shown that tomatinase is not necessary for bacterial pathogenesis, but its absence reduces the intensity of the infection (Kaup et al. 2005).

The Pat-1 (or Chp) gene belongs to a family of serine proteases and was acknowledged as the most important pathogenicity gene of pCM2 by Burger et al. (2005). However, there are wild-type strains that lack the pCM2 plasmid without any adverse effect on their pathogenicity (Thapa et al. 2017). The Pat-1 family also includes the chpA to chpG genes, which are located in a 50 kb region of the chp/tomA PAI (Stork et al. 2008) and also have a role in Cmm pathogenicity; the suppression of the chpC, chpE, chpF, and chpG genes reduced the severity of symptoms in artificial inoculations on tomato plants (Stork et al. 2008, Chalupowicz et al. 2010). The chpC protease negatively affects the ethylene level, which appears to be part of the plant's defense against Cmm (Balaji et al. 2008, Savidor et al. 2012). The chpA, chpB, and chpD genes are considered pseudogenes because frameshifts or termination codons have been detected inside their open reading frames (ORFs) (Stork et al. 2008).

Furthermore, the Ppa family consists of multiple chymotrypsin-like serine proteases; six genes are located on the chp/tomA PAI (ppaA to ppaE, including ppaB1/B2), four genes are dispersed in other chromosomal regions (ppaF to ppaI), and a gene is located in the pCM1 plasmid (ppaJ) (Gartemann et al. 2008). As previously described, mutational studies with the ppaA and ppaC genes showed reduced severity of symptoms in the infected plants, though without statistical significance (Chalupowicz et al. 2017).

The third group of Cmm effectors consists of the subtilase proteases (Siezen and Leunissen 1997), which are involved in Cmm pathogenesis. While the sbtA gene is located on the chp/tomA PAI, the sbtB and sbtC genes are located elsewhere on the chromosome (Gartemann et al. 2008). A strain carrying a mutated version of the sbtA gene gave milder symptoms (Chalupowicz et al. 2017). In addition, Cmm has enzymes that break down the cell wall of the host. Among those, a cellulase gene (celA) is located on the pCM1 plasmid and is considered a significant factor for systemic infection, during which celA transcripts and proteins are detected (Chalupowicz et al. 2010, Savidor et al. 2012, Chalupowicz et al. 2017). When infections are established on leaves, transcripts are detected but at lower levels (Chalupowicz et al. 2017). When celA was introduced into the Cmm100 and CmmCASJ002 strains, which did not contain the pCM1 and pCM2 plasmids, the ability to infect was restored, as observed by the wilting of the infected plants (Meletzus et al. 1993, Thapa et al. 2017).

The ability of Cmm to infect its hosts differs from strain to strain (Strider and Lucas 1970, Ialacci et al. 2016). So far, there is no strong correlation between the above-mentioned pathogenicity genes and the ability of the bacterium to infect; several studies around the world have indicated that there are strains with increased virulence that do not have any of the pathogenicity genes and, on the other hand, less virulent strains that carry all of the genes (Kleitman et al. 2008, Milijašević-Marčić et al. 2012, Tancos et al. 2015, Croce et al. 2016, Ialacci et al. 2016, Osdaghi et al. 2018). Most probably, these situations are explained either by the existence of additional unknown genes or by the loss of functionality of the pathogenicity genes, respectively (Jacques et al. 2012, Tancos et al. 2015).

Most Cmm outbreaks in various parts of the world have originated from infected seeds and seedlings. Nevertheless, no study has been conducted so far to investigate the primary infection of outbreaks in Greece. In this work, we describe the results from a) the study of the genetic diversity of 93 Cmm strains isolated during bacterial canker outbreaks in various regions of Greece from 2003 to 2018, and b) the study of gene distribution on the chp/tomA PAI and the two pCM plasmids.

Materials and methods

Bacterial isolation and growth conditions

Ninety-eight bacterial strains from the collection of the Department of Agriculture, Hellenic Mediterranean University (HMU), isolated during the period 2003-2018 from 11 Prefectures in Greece and previously identified as Cmm, were selected for further analysis. All strains were isolated from infected tomatoes cultivated outdoors or in greenhouses, and no more than one strain was selected per area per year for the analyses. When strains came from the same Prefecture but different years (e.g., 2016 Lasithi and 2017 Lasithi), they had not been isolated from samples taken at the same location (greenhouse or outdoor field). The isolation was achieved by spreading the extract from infected plant tissues onto the semi-selective media Corynebacterium nebraskense specific medium (CNS) (Gross and Vidaver 1979, Schaad et al. 1988) and Corynebacterium nebraskense specific medium fast (SCMF) (Fatmi and Schaad 1988, EPPO 2016) and incubating at 28 °C for 5 to 9 days. Isolates were preliminarily identified as Cmm with Gram staining, immunofluorescence, and ELISA tests, and stored in 15% v/v glycerol at -80 °C. For further experimentation, the bacterial strains were grown on the solid nutrient medium Nutrient Agar with glucose (NAG, nutrient agar 23 g L⁻¹, NaCl 8 g L⁻¹, glucose 2.5 g L⁻¹) or in Lysogeny Broth (LB, tryptone 10 g L⁻¹, NaCl 10 g L⁻¹, yeast extract 5 g L⁻¹, pH 7).

DNA extraction and purification

Each isolate intended for DNA extraction was grown in LB broth at 28 °C for 48 hours. Four ml of the liquid culture (10⁸ cfu/ml) was used to purify total DNA with the DNeasy Blood and Tissue Kit (Qiagen) following the manufacturer's instructions. Quality control and quantification of isolated DNA were done with the Thermo Scientific NanoDrop 2000c spectrophotometer. DNA aliquots of 10 ng/μl were prepared and stored at -20 °C.
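The preparation of the 10 ng/μl working aliquots follows the usual C1·V1 = C2·V2 dilution relation. The short sketch below is only an illustration of that arithmetic; the function name, the stock concentration of 85 ng/μl, and the 100 μl final volume are hypothetical values chosen for the example and are not taken from this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock_ul, diluent_ul) from C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical spectrophotometer reading of 85 ng/ul, diluted to a
# 10 ng/ul working aliquot in a hypothetical final volume of 100 ul.
stock_ul, water_ul = dilution_volumes(85.0, 10.0, 100.0)
print(f"{stock_ul:.1f} ul DNA + {water_ul:.1f} ul diluent")

# Volume of a 10 ng/ul aliquot that delivers 20 ng of template DNA.
template_volume_ul = 20.0 / 10.0
print(f"{template_volume_ul:.1f} ul of working aliquot per 20 ng of template")
```

With working aliquots at 10 ng/μl, 20 ng of template corresponds to 2 μl of aliquot per reaction.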

Indirect immunofluorescence (IF) assay

The indirect immunofluorescence test was performed according to the EPPO guidelines (EPPO 2009) using the polyclonal antibodies anti-Clav 782 and anti-Clav 25 of LBBA (dilution 1/1200), the anti-Cmm polyclonal antibodies from Loewe (Cat No: 07363 Antiserum Clavibacter michiganensis subsp. michiganensis ex goat), and the secondary labeled antibody CF488A Goat anti-Rabbit IgG (H+L) (Cat No: 20012, Biotium Inc). Specimens were observed on an Olympus BH2 microscope with an epifluorescence source.

Molecular identification

Cmm strains were identified by the amplification of 268 bp and 132 bp DNA fragments in PCR reactions utilizing the species-specific primer pairs PSA-8/R (Pastrik and Rainey 1999, EPPO 2016) and RZ_ptssk 10/11 (Berendsen et al. 2011), respectively. All PCR reactions contained 20 ng of purified DNA as template and were performed on a Bio-Rad T100™ Thermal Cycler using the 2X KAPA Taq ReadyMix kit (KK1024, KAPABIOSYSTEMS) in a final volume of 25 μl. Cmm CFBP4999 and Dickeya solani IPO2222 were used as positive and negative reference strains, respectively. For the amplification of the 268 bp DNA fragment with the PSA-8/R primer pair (0.5 µM final primer concentration), the PCR program was: initial denaturation at 94 °C for 5 min, 35 cycles of denaturation at 95 °C for 30 s, primer annealing at 63 °C for 20 s, and extension at 72 °C for 45 s, with a final extension at 72 °C for 5 min. For the amplification of the 132 bp DNA fragment with the RZ_ptssk 10/11 primer pair (0.3 µM final primer concentration), the PCR program was: initial denaturation at 95 °C for 10 min, 40 cycles of denaturation at 95 °C for 15 s, primer annealing at 60 °C for 30 s, and extension at 72 °C for 45 s, with a final extension at 72 °C for 10 min. The amplified fragments were visualized in 1.7% and 2% w/v agarose gels, respectively, containing 0.5 μg/mL of ethidium bromide in 1X Tris-acetate-EDTA (TAE). The amplicon size was confirmed by comparison with the bands of a 100 bp DNA Ladder (NM016S, Enzyquest).
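The two cycling programs can be written down as plain data, which makes it easy to compare them and to estimate their programmed hold time. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, and the calculation ignores temperature ramping, so the real run time on a thermal cycler will be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """A three-step PCR program; all durations are in seconds."""
    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    annealing_s: int
    extension_s: int
    final_extension_s: int

    def hold_time_s(self) -> int:
        # Sum of programmed holds only; ramp times are not included.
        per_cycle = self.denaturation_s + self.annealing_s + self.extension_s
        return (self.initial_denaturation_s
                + self.cycles * per_cycle
                + self.final_extension_s)


# Durations as stated in the text for each primer pair.
PSA_8_R = PcrProgram(5 * 60, 35, 30, 20, 45, 5 * 60)
RZ_PTSSK = PcrProgram(10 * 60, 40, 15, 30, 45, 10 * 60)

for name, program in (("PSA-8/R", PSA_8_R), ("RZ_ptssk 10/11", RZ_PTSSK)):
    print(f"{name}: {program.hold_time_s() / 60:.1f} min of programmed holds")
```

Under these assumptions the PSA-8/R program amounts to 3925 s (about 65 min) and the RZ_ptssk 10/11 program to 4800 s (80 min) of programmed holds.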

Pathogenicity tests

The pathogenicity of isolated strains was initially evaluated with infiltrations in Mirabilis jalapa leaves. Two hundred μl of a 10⁸ cfu/ml bacterial suspension in sterile water from each Cmm strain was infiltrated with a sterile hypodermic syringe (2 replicates per strain) (Gitaitis 1990). Plants were sprayed with water and placed inside transparent bags in a chamber with a constant temperature of 26 °C and a photoperiod of 16 h of light and 8 h of darkness. Symptoms were evaluated after 36-48 hours.

Artificial inoculations were performed on forty-day-old tomato seedlings (Solanum lycopersicum cv. Ekstasi) grown in pots with black peat mixture as substrate. Tested strains were grown for 72 hours on NAG medium to form single colonies that were used to inoculate the plant with a toothpick. The central stem was pricked between the cotyledons, 1 cm below the junction, with a sterile toothpick that had been dipped in a single colony (2 plants per strain). The toothpick was left in the hole to prevent the dehydration of the wound. A toothpick soaked in clean water was used to prick the negative control plant. The inoculated plants were kept in a chamber at 26 °C and a 16:8 photoperiod. Symptoms appeared 20 days after inoculation and were recorded for another 30 days. Two independent replicates were performed, and symptoms were categorized as follows: no symptoms (-), presence of ulcers at the infection site (+), curling, yellowing, and/or wilting of one or more leaves (++), and wilting of the whole plant (+++).
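The four-level symptom scale can be expressed as a small decision rule. The sketch below is illustrative only; the function and argument names are hypothetical, and it assumes that a plant is assigned the category of its most severe symptom, which the text implies but does not state explicitly.

```python
def symptom_score(ulcer_at_site: bool,
                  leaves_affected: bool,
                  whole_plant_wilted: bool) -> str:
    """Map observed symptoms to the -, +, ++, +++ scale.

    Assumption: the most severe observed symptom determines the category.
    """
    if whole_plant_wilted:
        return "+++"
    if leaves_affected:  # curling, yellowing and/or wilting of >= 1 leaf
        return "++"
    if ulcer_at_site:
        return "+"
    return "-"


print(symptom_score(ulcer_at_site=True, leaves_affected=False,
                    whole_plant_wilted=False))
```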

Phylogenetic analysis

Ninety-three strains from the HMU collection were selected for MLSA phylogenetic analyses based on partial amplification of the atpD, kdpA, ppk and sdhA genes (Table 1). Each PCR reaction contained 20 ng of purified total DNA, a mixture of dNTPs at a final concentration of 0.2 mM each, 1.5 mM MgCl2, primers at a final concentration of 0.4 µM (atpD-F/R and ppk-F/R) or 0.5 µM (kdpA-F/R and sdhA-F/R), and 0.5 units of Taq DNA polymerase (KK1016, KAPABIOSYSTEMS) in a final volume of 25 μl. All PCR reactions were performed on a Bio-Rad T100™ Thermal Cycler. The PCR program for the amplification of the atpD and ppk DNA fragments was: initial denaturation at 94 °C for 5 min, 35 cycles of denaturation at 94 °C for 30 s, annealing at 60 °C for 30 s, and extension at 72 °C for 1 min, and a final extension at 72 °C for 10 min. The PCR program for the amplification of the kdpA and sdhA fragments was: initial denaturation at 95 °C for 3 min, 35 cycles of denaturation at 95 °C for 30 s, annealing at 58 °C (sdhA-F/R) or 60 °C (kdpA-F/R) for 30 s, and extension at 72 °C for 1 min, and a final extension at 72 °C for 5 min. The products of the PCR reactions (2 μl of a 25 μl reaction) were analyzed on a 1.5% w/v agarose gel containing 0.5 μg/mL ethidium bromide in 1X TAE.

Table 1 Primers used for the identification, phylogenetic analysis, and the distribution of pathogenicity genes of Clavibacter michiganensis subsp. michiganensis strains

Purification and Sanger sequencing of the amplified fragments were performed by Eurofins (Eurofins Genomics Germany, GmbH Anzinger Str. 7a, 85560, Ebersberg, Germany). Processing of the raw data from the forward and reverse sequencing of the DNA fragments was done with the SnapGene software (from Insightful Science; available at snapgene.com). The consensus sequences from each gene were concatenated and aligned with the MUSCLE algorithm (Edgar 2004).

Sequence polymorphism was analyzed on single genes or on the concatenation of various combinations of gene sequences with DnaSP (Rozas et al. 2017) utilizing the Maximum Likelihood (ML) method. The subspecies insidiosus, nebraskensis, and sepedonicus of Clavibacter michiganensis were used as outgroup taxa, while the Cmm strain NCPPB382 was used as the reference strain. Phylogenetic analysis was performed with raxmlGUI 2.0.0-beta.14 (Stamatakis 2014, Edler et al. 2021) implementing RAxML (Randomized Axelerated Maximum Likelihood). The individual parts of the concatenated sequences were defined as separate genes (partitions), and the nucleotide substitution model GTR with Gamma distribution (GAMMA) was applied. The statistical robustness and reliability of the dendrogram topology were assessed with the bootstrap test at 1000 replications. The dendrogram was visualized using the program MEGA version X (Kumar et al. 2018).
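The concatenation of gene sequences and the definition of each gene as a separate partition can be sketched as follows. This is a minimal illustration, not the workflow used in this study: the function name, the toy sequences, and the strain labels are hypothetical, and the sketch assumes that each gene has already been aligned so that all strains have equal-length sequences per gene. The printed lines follow the "DNA, name = start-end" layout of a RAxML partition file.

```python
GENE_ORDER = ("atpD", "kdpA", "ppk", "sdhA")


def concatenate(per_gene):
    """Concatenate aligned gene sequences per strain in a fixed gene order.

    per_gene maps gene -> {strain: aligned sequence}. Returns the
    concatenated sequences and 1-based, inclusive partition coordinates.
    """
    strains = sorted(set.intersection(*(set(per_gene[g]) for g in GENE_ORDER)))
    if not strains:
        raise ValueError("no strain has a sequence for every gene")
    partitions, start = [], 1
    for gene in GENE_ORDER:
        lengths = {len(per_gene[gene][s]) for s in strains}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences differ in length; align first")
        length = lengths.pop()
        partitions.append((gene, start, start + length - 1))
        start += length
    concatenated = {s: "".join(per_gene[g][s] for g in GENE_ORDER)
                    for s in strains}
    return concatenated, partitions


# Toy sequences, far shorter than real gene fragments (illustration only).
toy = {
    "atpD": {"strainA": "ATGC", "strainB": "ATGA"},
    "kdpA": {"strainA": "GGC", "strainB": "GGC"},
    "ppk": {"strainA": "TT", "strainB": "TA"},
    "sdhA": {"strainA": "CCGA", "strainB": "CCGA"},
}
concatenated, partitions = concatenate(toy)
partition_lines = [f"DNA, {gene} = {start}-{end}"
                   for gene, start, end in partitions]
print("\n".join(partition_lines))
```

Keeping the gene order fixed is what allows the partition coordinates to be reused for every strain in the alignment.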

Pathogenicity gene identification

The distribution of 9 genes related to the pathogenicity of the Cmm was studied in 39 out of the 93 strains used in the phylogenetic analysis. The presence of the genes chpC, celA, ppaA, tomA, chpE, pat-1, chpG, ppaC, and sbtA was confirmed by the amplification of gene-specific DNA fragments by PCR (Table 1). The program followed was: initial denaturation at 94°C for 5 min, 35 cycles with denaturation at 94 °C for 30 s, hybridization as shown in Table 1 for 30 s and extension at 72 °C for 30 s, final extension at 72 °C for 5 min. For the amplification of the chpG the reaction program was: initial denaturation at 95 °C for 3 min, 35 cycles of denaturation at 95 °C for 30 s, hybridization as shown in Table 1 for 30 s, and extension at 72 °C for 30 s, and final extension at 72 °C for 1 min. All PCR reactions were performed on a Bio-Rad T100 Thermal Cycler using the 2X KAPA Taq ReadyMix with dye (KK1024, KAPABIOSYSTEMS) in a final volume of 25 μl. The substrate used was 20 ng of purified total DNA and the primer concentration was as shown in Table 1. The Cmm strain CFBP4999 was used as a positive control, while the strain Dickeya solani IPO2222 was used as a negative control. The products of the PCR reactions (10 μl of a 25 μl reaction) were analyzed by agarose gel electrophoresis.

Results

Isolation and identification of Cmm strains

From 2003 to 2018, 12 outbreak periods of Cmm in tomato fields have been recorded in various regions of Greece, in covered as well as outdoor crops. During this time, samples of infected plants were received mainly from Crete (Lasithi, Heraklion, Chania), but also from the Prefectures of the Peloponnese (Argolis, Arcadia, Corinthia, Messinia), Central Macedonia (Imathia, Chalkidiki), Magnesia, East Macedonia and Thrace (Xanthi). From these Cmm samples, more than 150 strains were isolated and identified, which make up the Cmm collection at HMU. Colonies on NAG were mucilaginous and yellow, pale yellow, or white in color.

Ninety-eight strains from the HMU collection identified as Cmm were selected for further analyses. The identification of the isolated strains was done by collective evaluation with the hypersensitivity response reaction (HR), the 3% w/v KOH test, the indirect immunofluorescence test (IF), the two species-specific primer pairs PSA-8/R and RZ_ptssk 10/11 PCR tests, and the evaluation of artificial inoculations on tomato seedlings (Fig. 1). All strains were positive in 3% KOH (Gram positive strains), HR, IF, and PCR tests but presented variation in the reproduction of symptoms (Table 2).

Fig. 1

Bacterial canker symptoms on tomato plants. a, typical disease symptoms in a greenhouse crop at a final stage of infection with the appearance of hydrobacteriosis in the planting line. b, unripe tomato fruit injuries presented as spots (left) or marbling (right). c, Discoloration, necrosis and separation of vascular tissue from the tomato pith. d, Pustular spots on the upper surface of the tomato leaf

Table 2 Strains of the Department of Agriculture, Hellenic Mediterranean University (HMU) selected for phylogenetic analyses. The reproduction of symptoms after artificial inoculations on tomato seedlings is presented. In addition, the haplotypes resulting from the phylogenetic analysis of the strains are presented

The disease symptoms in artificially inoculated tomato seedlings appeared 20 days post-inoculation. Among the main symptoms were canker of the stem at the point of inoculation and the upward turning of one or a few of the leaves. As the systemic infection progressed, entire leaves wilted and shriveled (Fig. 1). Most of the strains caused stem canker and wilting of more than one leaf or of the whole plant, except for four that caused only canker (HMU4046, HMU4646, HMU4666, and HMU4698) and six that gave no symptoms (HMU4209, HMU4221, HMU4533, HMU4690, HMU4721, and HMU4940).

Genetic diversity analysis

The genetic diversity of 93 strains isolated from 11 Prefectures of Greece in 12 different years from 2003 to 2018 (Table 2) was studied by MLSA. The MLSA scheme was based on the concatenation of partial DNA sequences of atpD (GenBank: OP471139-OP471231), kdpA (GenBank: OP470953-OP471045), ppk (GenBank: OP471046-OP471138), and sdhA (GenBank: OP470860-OP470952). The atpD and ppk genes were chosen due to the increased number of polymorphic nucleotides they showed in Cmm strains from Uruguay (Croce et al. 2016). Similarly, the kdpA and sdhA genes were chosen as they proved polymorphic in strains from Serbia and New York (Milijašević-Marčić et al. 2012, Tancos et al. 2015). The diversity of the single-gene as well as the two-, three- and four-gene concatenated sequences was analyzed with the DnaSP software and showed relatively low genetic variability between Cmm strains (Table 3).

Table 3 Diversity analysis of 93 Clavibacter michiganensis subsp. michiganensis strains isolated from 11 Greek Prefectures. The analysis was based on single genes as well as on various gene combinations

The highest polymorphism in individual gene sequences was observed in kdpA, with 35 polymorphic nucleotides (6.1%) and the highest number of haplotypes (23). The sdhA gene followed closely (31 polymorphic nucleotides, 4.8%, and 18 haplotypes). The atpD and ppk genes had considerably lower polymorphism; the ppk gene had 3.5% polymorphic nucleotides while atpD had 1.4%. However, the latter yielded a higher number of haplotypes (10) than the former (8). Among the two-gene concatenations, the highest percentage of polymorphic nucleotides was observed in the kdpA-sdhA pair (5.4%, 34 haplotypes), while most haplotypes (35) arose from the kdpA-ppk and atpD-kdpA sequences (4.9% and 3.6%, respectively). When the analysis was based on a three-gene concatenation, the atpD-kdpA-sdhA combination gave 75 polymorphic nucleotides (4%) and 38 haplotypes, the same number of haplotypes as the analysis of all genes together. The combinations atpD-kdpA-ppk, kdpA-ppk-sdhA, and atpD-ppk-sdhA yielded 37, 36 and 26 haplotypes, respectively. Finally, the four-gene combination (atpD-kdpA-ppk-sdhA) revealed 93 nucleotide polymorphisms (3.9%) and gave 38 haplotypes. Out of the 38 haplotypes, the most frequent (Hap_27) was found in 13 strains, and the rarest was found in one strain.
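The two quantities reported above, the number of polymorphic nucleotides and the number of haplotypes, can be illustrated on a toy alignment. The sketch below is not a reimplementation of DnaSP: the function names and the sequences are hypothetical, and every character (including gaps or ambiguous bases, which dedicated software treats specially) is compared literally.

```python
from collections import Counter


def polymorphic_sites(alignment):
    """Return 0-based indices of alignment columns with more than one state."""
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def haplotype_counts(alignment):
    """Count how many strains share each distinct sequence (haplotype)."""
    return Counter(alignment.values())


# Toy alignment of four hypothetical strains, five positions long.
toy = {"s1": "ATGCA", "s2": "ATGCA", "s3": "ATGTA", "s4": "CTGTA"}

sites = polymorphic_sites(toy)
percent = 100 * len(sites) / len(next(iter(toy.values())))
hap_counts = haplotype_counts(toy)

print(f"{len(sites)} polymorphic sites ({percent:.1f}%), "
      f"{len(hap_counts)} haplotypes")
```

In this toy case two of the five positions vary and the four strains fall into three haplotypes, which shows why a gene with few polymorphic sites can still yield more haplotypes than a gene with more, depending on how the variable positions combine across strains.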

Following MLSA, a Maximum Likelihood (ML) method was utilized to generate dendrograms based on individual gene sequences of atpD (Supplementary Fig. 1), kdpA (Supplementary Fig. 2), ppk (Supplementary Fig. 3), and sdhA (Supplementary Fig. 4) as well as based on the concatenation of three-gene sequences (atpD-kdpA-sdhA, Supplementary Fig. 5) and four-gene sequences (atpD-kdpA-ppk-sdhA, Fig. 2). The atpD-kdpA-sdhA combination was selected because of the highest number of haplotypes and nucleotide polymorphisms (Table 3). No dendrograms were generated with the two-gene combinations.

Fig. 2

Maximum Likelihood phylogenetic tree based on the concatenated sequences of the atpD, kdpA, ppk and sdhA genes of Clavibacter michiganensis subsp. michiganensis strains isolated from different Prefectures of Greece between 2003 and 2018. Bootstrap values above 50% are shown at branch nodes (1000 replicates). The clades with high bootstrap values (80-100%) are highlighted with a grey shadow. Within the clades, the strains in blue-dashed rectangles were isolated from the same region in the same year or in different years

The dendrograms generated with individual genes and with the four-gene analysis were in agreement with respect to their topology. In the latter, all Cmm strains were grouped with strong bootstrap support (100%) and clearly separated from the three closely related Clavibacter michiganensis subspecies insidiosus, sepedonicus, and nebraskensis. The Cmm monophyletic group was further divided into 16 clades with strong bootstrap support (80-100%), though strains in these clades did not appear to be related by either location or year of isolation. More specifically, isolates from the same year or Prefecture were spread across different clades. For example, strains isolated in 2016 and 2017 from the Prefectures of Lasithi and Chania were found in more than five different clades, indicating the existence of different sources of infection. It is possible that those producers obtained infected seeds or seedlings from different sources, where they were infected with different strains of Cmm. Although there were cases where strains isolated in the same year and from the same Prefecture had the same haplotype (HMU4505/HMU4512, HMU4548/HMU4570, HMU4678/HMU4673), this was not the rule. There were also cases where strains with the same haplotype were isolated from different Prefectures and years, again with strong bootstrap support, indicating a common source of propagating material. That was also the case for the strains HMU4258 (Chalkidiki) and HMU4256 (Lasithi), which although isolated from areas outside of Crete, were grouped with strains from Crete.

Consequently, there was no geographical or temporal dispersion of haplotypes since no haplotype was found in more than 5 different Prefectures or years of isolation.

Pathogenicity gene distribution in isolated strains

Most of the genes linked to Cmm's ability to cause disease are found in the chp/tomA PAI and the plasmids pCM1 and pCM2. More specifically, the chpC, chpE, chpG, ppaA, ppaC, sbtA, and tomA genes of the chp/tomA PAI and the celA and pat-1 genes in pCM1 and pCM2 were studied. Thirty-nine of the 93 isolates from the MLSA analysis were selected based on the Prefecture, the year of isolation, and the symptoms on tomato seedlings. The presence of pathogenicity genes in the genomes of the selected strains was analyzed by conventional PCR (Table 4). In addition, the pathogenicity of the strains was assessed by artificial inoculations in tomato seedlings (Table 4). The investigation of all the above-mentioned genes except celA and pat-1 gave the expected DNA band (Table 1).

Table 4 Pathogenicity of isolated strains in artificially inoculated tomato seedlings and PCR tests for the presence of the main pathogenicity genes in the genome of 39 Clavibacter michiganensis subsp. michiganensis strains isolated from different Prefectures of Greece between 2003 and 2018. Cells with negative results are shaded grey. In pathogenicity tests, symptoms were categorized as follows: no symptoms (-), presence of ulcers at the site of infection (+), curling, yellowing and/or wilting of 1 or more leaves (++), wilting of the entire plant (+++)

Results from the pathogenicity assays were also not consistent, a fact that is probably not related to the distribution of the genes in the chp/tomA pathogenicity island or the respective plasmids. Twenty-four out of 39 strains gave the typical symptoms on one or more leaves (++), while 11 strains caused total wilting of the plant (+++). Of the remaining strains, HMU4209 and HMU4721 did not cause any symptoms (-), and strains HMU4046 and HMU4646 showed reduced virulence (+), causing only ulceration at the point of infection.

Reactions to detect the genes tomA, chpC, chpE, chpG, sbtA, ppaA, and ppaC gave positive signals in all tested strains (Table 4). The celA gene of pCM1 was detected in all strains except HMU4209, which did not cause any symptoms (-) in tomato seedlings. This was an indication that the lack of the celA gene may be related to the absence of pathogenicity in the strain.

The pat-1 gene was not detected in 17 strains that presented different intensities of pathogenicity: one caused no symptoms (-), two caused only an ulcer at the site of infection (+), 12 gave the typical symptoms (++), and two caused total wilting of the plant (+++). The lack of a PCR signal does not necessarily mean absence of the pat-1 gene, as it may be the result of mutations at the primer annealing sites. Thus, the pat-1 gene may contribute to the pathogenicity of Cmm but does not appear to be essential. Furthermore, strains lacking the pat-1 gene did not appear to be phylogenetically different from the rest of the Cmm strains in the dendrogram based on the atpD-kdpA-ppk-sdhA sequence. Most are on different branches, while only HMU4521/HMU4721, HMU4646/HMU4752, and HMU4330/HMU4337/HMU4349/HMU4375 had the same haplotype (Fig. 2).
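The counts given in the text can be cross-tabulated to see how the strains with and without a pat-1 signal are distributed over the symptom categories. The sketch below only rearranges numbers stated above (39 strains in total, 17 without a pat-1 signal); the variable names are hypothetical, and the result is a descriptive tally, not a statistical test.

```python
# Symptom categories of all 39 strains tested by PCR, as stated in the text.
all_strains = {"-": 2, "+": 2, "++": 24, "+++": 11}

# Symptom categories of the 17 strains without a pat-1 PCR signal.
pat1_not_detected = {"-": 1, "+": 2, "++": 12, "+++": 2}

# Strains with a pat-1 signal, obtained by subtraction per category.
pat1_detected = {score: all_strains[score] - pat1_not_detected[score]
                 for score in all_strains}

for label, counts in (("pat-1 detected", pat1_detected),
                      ("pat-1 not detected", pat1_not_detected)):
    total = sum(counts.values())
    share = 100 * counts["+++"] / total
    print(f"{label}: {counts} (n={total}, {share:.0f}% scored +++)")
```

By subtraction, 22 strains gave a pat-1 signal, and strains scored +++ make up a larger share of that group than of the group without a signal, while both groups contain strains in the ++ category.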

Discussion

Tomato bacterial canker is one of the most important tomato diseases in Greece, causing serious losses in greenhouse and outdoor crops that can extend to destruction of the crop. The disease has spread throughout the tomato-growing regions of the world, mainly through infected propagating material (Tsiantos 1987, Gleason et al. 1993). The lack of any chemical or biological agent that effectively controls Cmm makes the disease extremely dangerous for tomato-producing countries.

Identifying the sources of primary infection during disease outbreaks helps in designing a comprehensive plan to manage the importation of propagation material and to apply practices that prevent the spread of the disease. From 2003 to 2018, twelve outbreaks of bacterial canker were recorded in Greece, causing significant economic losses in tomato production. During this period, a large number of tomato plant samples were investigated, and a large number of strains were isolated. The samples, derived from greenhouse and outdoor tomato crops, were mainly infected during autumn and winter, when the short photoperiod favors the growth of the pathogen (EFSA Panel on Plant Health 2014). In a few cases, the pathogen was isolated during the summer. From the analysis of infected plants, 98 Cmm strains were isolated and identified by phenotypic, immunological, and molecular methods as well as by pathogenicity tests.

According to the EPPO classification, Cmm is regarded as an A2 quarantine pathogen in the EU. The most effective way to prevent the introduction and spread of the disease is to avoid the transport of infected propagation material, the main source of inoculum, and also to ensure proper management of already infected crops so that no new sources of inoculum arise (Yasuhara-Bell and Alvarez 2015).

In this work, 93 isolated Cmm strains were selected and genetically analyzed to determine the source of primary infections during the reference period. The MLSA method was applied to the four housekeeping genes atpD, kdpA, ppk, and sdhA, which were chosen for the nucleotide polymorphism they present. The atpD and ppk genes showed the lowest polymorphism among the Cmm strains studied (1.4% and 3.5% polymorphic loci), which separated the strains into 10 and 8 haplotypes, respectively (Table 3). These values were close to those of Uruguayan strains (1.6% and 4.3%, respectively) (Croce et al. 2016) but much lower than those derived from strains isolated from all over the world (Jacques et al. 2012). The kdpA and sdhA genes showed greater polymorphism in our study, 6.1% and 5.1%, which gave 23 and 18 haplotypes, respectively. The kdpA gene yielded a considerably higher proportion of polymorphic sites than in the studies of Cmm in Serbia and New York (2.7% and 3.7%) (Milijašević-Marčić et al. 2012, Tancos et al. 2015), whereas the polymorphism of sdhA was comparable to that reported in those studies (5.4% and 4.3%). When the phylogeny was based on the concatenated sequences of the four genes, 93 polymorphic loci (4.1%) were identified, producing 38 haplotypes. The detected polymorphism was lower than that found when a wider global collection of strains was analyzed (8.7%) (Jacques et al. 2012) but higher than that of Uruguayan strains (1.8%) (Croce et al. 2016).
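The two quantities compared above, the percentage of polymorphic loci and the number of haplotypes, can be computed directly from aligned sequences. The following Python sketch is a minimal illustration, not the pipeline used in this study: the function names are hypothetical, the toy sequences are invented, and gaps or ambiguous bases are treated as ordinary characters, which is a simplifying assumption.

```python
def polymorphic_sites(alignment):
    """Return the indices of alignment columns holding more than one character."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    columns = zip(*alignment.values())
    return [i for i, col in enumerate(columns) if len(set(col)) > 1]


def haplotypes(alignment):
    """Group strain names that share an identical sequence."""
    groups = {}
    for name, seq in alignment.items():
        groups.setdefault(seq, []).append(name)
    return list(groups.values())


def concatenate(per_gene, order):
    """Join each strain's gene fragments in a fixed gene order."""
    strains = per_gene[order[0]].keys()
    return {s: "".join(per_gene[g][s] for g in order) for s in strains}


# Toy aligned fragments, invented for illustration (not data from this study).
toy = {
    "atpD": {"s1": "ACGTAC", "s2": "ACGTAC", "s3": "ACGTAT"},
    "kdpA": {"s1": "GGCTA", "s2": "GACTA", "s3": "GGCTA"},
}

concat = concatenate(toy, ["atpD", "kdpA"])
sites = polymorphic_sites(concat)
length = len(next(iter(concat.values())))
print(f"{len(sites)} polymorphic sites of {length} "
      f"({100 * len(sites) / length:.1f}%), {len(haplotypes(concat))} haplotypes")
```

In the toy data each gene alone separates the three strains into two haplotypes, while the concatenated sequence separates them into three, which illustrates why concatenation can increase strain resolution.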

The phylogenetic analysis in this study was performed by constructing a Maximum Likelihood (ML) phylogenetic tree from the concatenated sequence consisting of the individual atpD, kdpA, ppk, and sdhA sequences (Fig. 2). Additional dendrograms were constructed from the individual gene sequences (Supplementary Figs. 1-4). In all cases, the isolated strains were consistently separated from the three related subspecies (C.m. insidiosus, C.m. nebraskensis, and C.m. sepedonicus). Consequently, the concatenation of these four gene sequences is suitable for the phylogenetic analysis of Cmm strains, since it increases bootstrap support and strain resolution.

Moreover, the Cmm group of strains comprises 16 clades with very good bootstrap support (80-100%). The separation of strains in most of these branches is not related to the area or year of isolation; i.e., there are branches consisting of a) strains isolated from different areas or in different years, b) strains isolated from the same area but in different years, and c) strains isolated from the same area in the same year. More specifically, the phylogenetic analysis showed that strains isolated from the same Prefecture in the same year span more than five different branches. In these cases, the primary contamination of the crops probably came from different sources, through infected seeds or seedlings obtained by the producers. The clustering of strains isolated from different regions and in different years on the same branch indicates a common initial source of infection. A possible explanation is the import of seeds from a country in which the specific Cmm haplotype exists, followed by its spread within Greece in subsequent years. In some branches, only strains isolated from the same region and year are grouped together. This can be explained either by a common source of contamination through the propagating material or by transmission of the bacterium between closely spaced fields where good agricultural practices were not applied.

This lack of correlation between haplotype and region or year of isolation is consistent with data from Argentina and Turkey, where rep-PCR and MLSA analyses were performed with the housekeeping genes hipA, gyrB, kdpA, ligA, and sdhA and the pathogenicity genes ppaA, chpC, and tomA (Wassermann et al. 2017, Sen et al. 2018). In contrast, a similar study in Michigan using BOX-PCR and MLSA showed a significant correlation between the genetic variability and the region of isolation of the strains (Quesada-Ocampo et al. 2012). In general, the geographical and temporal spread of individual haplotypes was limited, since no haplotype was found in more than five different Prefectures or years of isolation. This result supports the hypothesis that the disease was introduced through the import and spread of infected propagation material.
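The spread of each haplotype across Prefectures and years can be summarized by counting the distinct values with which it is associated. The Python sketch below shows one way to do this. The record layout and the function name are assumptions, and the records are hypothetical placeholders rather than strains from this study.

```python
from collections import defaultdict


def haplotype_spread(records):
    """Map each haplotype to (number of distinct prefectures, number of distinct years)."""
    prefectures = defaultdict(set)
    years = defaultdict(set)
    for haplotype, prefecture, year in records:
        prefectures[haplotype].add(prefecture)
        years[haplotype].add(year)
    return {h: (len(prefectures[h]), len(years[h])) for h in prefectures}


# Hypothetical records: (haplotype, prefecture, year of isolation).
records = [
    ("H1", "A", 2005), ("H1", "B", 2005), ("H1", "B", 2011),
    ("H2", "C", 2016), ("H2", "C", 2016),
    ("H3", "A", 2003),
]

spread = haplotype_spread(records)
for haplotype, (n_pref, n_years) in sorted(spread.items()):
    print(f"{haplotype}: {n_pref} prefecture(s), {n_years} year(s)")

# Largest spread over all haplotypes, for comparison with a chosen threshold.
max_prefectures = max(n for n, _ in spread.values())
max_years = max(n for _, n in spread.values())
```

Applied to real isolation records, the two maxima correspond to the statement that no haplotype occurred in more than five Prefectures or years.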

The chpC, chpE, chpG, ppaA, ppaC, sbtA, and tomA genes of the chp/tomA PAI and the celA and pat-1 genes of the pCM1 and pCM2 plasmids, respectively, are considered important for the pathogenicity of the bacterium (Nandi et al. 2018). The presence or absence of these genes was tested by conventional PCR in 39 of the 93 Cmm strains analyzed by MLSA, selected to include at least one strain from each region and year of isolation. The pathogenicity tests revealed marked differences in virulence, with symptoms ranging from mild to typical to complete wilting of the plants, while two strains caused no symptoms.

Based on these differences in pathogenicity, it was expected that some genes of the chp/tomA PAI would be absent from the strains that showed reduced or no virulence. Nevertheless, all seven chromosomal genes of the chp/tomA PAI were detected in all 39 Cmm strains, indicating that the intensity of pathogenicity is not related to the presence of chpC, chpE, chpG, ppaA, ppaC, sbtA, and tomA. However, we cannot state this with certainty, since some of the genes may not be fully functional. Further studies on the functionality of each gene could confirm this conclusion. A similar study on Cmm strains in Argentina showed that the ppaA and chpC genes were absent from non-pathogenic strains and concluded that these genes are essential for the virulence of the bacterium (Kleitman et al. 2008).

The celA gene, which is located on the pCM1 plasmid, was found in all studied strains except the non-pathogenic HMU4209. The absence of celA from this strain could be responsible for its lack of pathogenicity. We cannot state this with certainty, although studies of mutant strains lacking celA have shown that re-introduction of the gene restores virulence (Meletzus et al. 1993, Thapa et al. 2017, Hwang et al. 2019). In the New York study, celA was the gene least frequently absent (Tancos et al. 2015), as in this study. In the same study, of the three strains that lacked celA, one was fully virulent while the other two had reduced virulence. In the studies of Uruguayan and Iranian strains, only one strain lacked celA, and it remained pathogenic (Croce et al. 2016, Osdaghi et al. 2018). Moreover, in the global collection studied by Jacques et al. (2012), celA was not detected in four strains, which nevertheless retained their virulence. Consequently, it appears that the celA gene may contribute to Cmm virulence but is not always essential for it. Most probably, there are unknown virulence factors that interact with the celA gene.

The pat-1 gene of the pCM2 plasmid was not detected in 17 of the 39 strains studied (Table 4), although all the other pathogenicity genes were present in them. The ML phylogenetic tree (Fig. 2) shows that these strains are not closely related to each other, which suggests that the loss of the pat-1 gene is unrelated to their phylogenetic separation. Pathogenicity tests of these strains showed the full range of symptoms in infected plants. Most of them (12) caused the typical symptoms (++), two caused total wilting of the plant (+++), two caused only cankers at the site of infection (+), and one caused no symptoms. These results show that pat-1 may contribute to pathogenicity but does not seem to play an essential role in the virulence of the bacterium, and its contribution may be masked by other virulence factors. Consistent with this, several studies have reported pat-1-negative strains with strong or reduced virulence (Tancos et al. 2015, Ialacci et al. 2016, Jacques et al. 2012) or no virulence (Milijašević-Marčić et al. 2012). The heterogeneity in pathogenic ability observed among the studied strains lacking pat-1 agrees with these reports and strengthens the hypothesis that pat-1 contributes to Cmm virulence but is not essential for it.

In conclusion, the MLSA analysis of the 93 Cmm strains isolated from various areas of Greece separated the strains into 38 haplotypes, showing high genetic variability. The analysis was based on the atpD, kdpA, ppk, and sdhA genes and appeared suitable for the clustering analysis of the Cmm population in Greece. The phylogenetic separation of the strains in the dendrogram showed no regional or temporal pattern among the isolates. The high variability has probably resulted from the introduction of the bacterium into Greece from different sources; infected seeds were possibly imported by seed distribution companies, establishing the initial contamination, which was then transmitted by farmers. The wide spread of the disease was probably due to inadequate control of symptomatic plants and to favorable weather conditions, with the bacterium spreading further to neighboring crops through infected plant debris, although we cannot confirm this since the strains were isolated from unrelated geographic areas.