Introduction

Phytophthora infestans is an oomycete pathogen that causes late blight, one of the most serious diseases of potato and tomato (Nowicki et al. 2011, 2013). Late blight affects potato production worldwide and can incur devastating costs. In developed countries, fungicide spray programs for the control of late blight account for the highest proportion of fungicide use for disease control worldwide. When crop losses are considered in addition to the cost of control measures to minimise disease impact, the disease is estimated to cost US$6.7 billion worldwide annually (Chowdappa et al. 2015; Nowicki et al. 2011). Worldwide seed potato movements during the early 1800s likely spread the pathogen, resulting in the notorious Irish potato famine in the mid 1840s, when P. infestans migrated on potato germplasm across the Americas and into Europe (Fry 1993; Martin et al. 2014, 2015). Estimates of food loss based on 2009–10 world harvest statistics for rice, wheat, maize, potatoes and soybean, together with caloric assumptions, show that even low-level persistent diseases cause losses that would be enough to feed 8.5% of the 7 billion people alive in 2011 (Fisher et al. 2012).

P. infestans is heterothallic, with two mating types known as A1 and A2 (Galindo and Gallegly 1960), and is able to reproduce both asexually and sexually. Until the late 1980s, populations of P. infestans around the world were dominated by a single clone of the A1 mating type, known as US-1. Where only a single mating type is present, reproduction is entirely asexual and the pathogen requires living host tissue (usually tubers) for long-term survival. A wave of migrations in the 1970s and 1980s of new A1 strains and strains of the A2 mating type from Mexico into North America and Europe (Gómez-Alpizar et al. 2007; Grünwald and Flier 2005) ultimately resulted in the spread of new strains of both mating types throughout the world (Fry et al. 2008). The present-day resurgence of late blight is due to the displacement of “old” P. infestans populations by new, genetically more variable, aggressive and fungicide-resistant populations affecting potatoes (Cooke et al. 2012). Whereas populations of P. infestans outside Central America were asexual for more than 150 years, sexual reproduction has now become part of the life history of this organism in many parts of the world (Sjöholm et al. 2013). The opportunity for sexual reproduction created by the spread of A2 has resulted in many novel A1 and A2 strains occurring worldwide (Fry 2008). Sexually recombined spores increase the chance of incursions because of the potential for new fungicide resistance and novel combinations of effector genes that can overcome host plant resistance. An additional advantage of sexual reproduction for P. infestans is the generation of oospores, which can remain viable in soil and plant debris for several years, giving rise to the opportunity for reinfection over a prolonged period (Chowdappa et al. 2015; Nowicki et al. 2011).

Australia is one of the few countries that has not been affected by the worldwide spread of post-1980s populations of P. infestans, partly because the pathogen favours moist, cool environments for reproduction (12–18 °C) and slightly warmer environments for lesion growth (20–24 °C) (Haverkort et al. 2009). The first records of P. infestans causing late blight disease in potatoes in Australia were in 1909, when there were widespread epidemics in all six states of Australia over the period 1908–1911, although there was anecdotal evidence that the disease had been present before the 1900s (McAlpine 1911). Generally, however, severe outbreaks of late blight have been relatively infrequent in Australia and, overall, the disease has been considered to be of minor economic significance (Cox and Large 1960). The infrequent nature of the disease outbreaks is due to the sporadic occurrence of weather conditions that favour the disease. Typically, late blight will develop when there are waves of warm, moist air combined with stagnant or slow-moving depressions that give rise to prolonged periods of still, humid and often overcast weather (Harrison 1992). These conditions occur periodically (at intervals of 3–10 years) in temperate Australia, where most of the potato crop is grown, when the remnants of tropical low-pressure systems move over south-eastern Australia. Otherwise, conditions in the main production areas are too dry for late blight disease.

A study of Australian P. infestans isolates collected over 1998–2000 (Drenth et al. 2002) identified a clonal population of the A1 mating type (designated AU-3) with a DNA fingerprint (RG57, a dominant multi-locus RFLP marker) similar to that of two isolates from Tasmania (Goodwin et al. 1994), which were designated AU-1 and AU-2. This fingerprint was unique and was closely related to the old US-1 strain (Drenth et al. 2002). This ancient A1 strain of P. infestans proved to be highly sensitive to the phenylamide fungicide metalaxyl (Drenth et al. 2002). A biosecurity risk assessment undertaken in 2006 determined that if more aggressive and better adapted strains of the A1 and A2 mating types were to find their way into the country, late blight disease could become more common and more costly to control (Edwards 2006). While a minimal but informed and vigilant approach is taken to control late blight within Australia, if an outbreak of an exotic A1 or A2 strain were to occur, there is the potential for a major epidemic of highly adaptable and infectious P. infestans to cause massive losses to potato production, as there has been little emphasis on disease resistance breeding and management of locally adapted cultivars. Within the local geographic region, this is of increasing concern, as exotic strains have been identified in neighbouring countries, including Papua New Guinea and Indonesia (Edwards 2006). Presently, the Australian potato industry prepares for possible incursions of exotic strains of P. infestans through industry-standard pest risk analysis and the development of contingency plans (Edwards 2006).

With the advent of next-generation sequencing and advances in extractions of ancient DNA (aDNA) from herbarium specimens, molecular genetic epidemiology can now be applied to study the history and evolution of P. infestans. Previously, it was thought that P. infestans populations outside the centre of diversity (South America and Mexico) were dominated by a single haplotype called US-1 (Ib) (Goodwin et al. 1994), and it was assumed that this was the strain responsible for the first outbreak in North America and for the subsequent outbreak leading to the Irish potato famine (Austin Bourke 1964). Subsequent PCR analysis concluded that the strains responsible for both outbreaks belonged to the mtDNA haplotype Ia, which is distinct from the US-1 group (May and Ristaino 2004; Ristaino et al. 2001). As technology has advanced, many genomes of the various P. infestans strains have been described (Cooke et al. 2012; Haas et al. 2009; Lee et al. 2020; Raffaele et al. 2010; Yoshida et al. 2013). Yoshida et al. (2013) described shotgun sequencing of 11 historical strains and 15 modern strains of P. infestans from North America, South America, Africa and Europe (including Great Britain and Ireland). These studies found low sequence diversity amongst the historical strains, which had a single, distinct mtDNA haplotype, designated HERB-1, that could be distinguished from the modern strains. It was concluded that the Irish potato famine in the 19th century was caused by a HERB-1 genotype termed FAM-1, which, although distinct from modern strains, is a close relative of the dominant US-1 strains that exist today, indicating a common ancestor of HERB-1 and US-1 (Birch and Cooke 2013).

The first public P. infestans genome was published in 2009 by Haas et al., using strain T30-4. The genome is c. 240 Mb in size, with around 90% represented by conserved regions comprising blocks of conserved gene order in which gene density is relatively high and repeat content is relatively low (Cooke et al. 2012; Haas et al. 2009). The initial estimate of repetitive DNA present in the genome was approximately 74% (Cooke et al. 2012). Also notable was the number of genes coding for secreted effector proteins, many of which are active during infection and are assumed to have host-altering capabilities, such as immune suppression. Of most importance with regard to virulence of the pathogen, around 550 genes encode RXLR effectors, which can be recognised by host resistance genes; this represents a significant expansion in comparison to the wild relatives P. sojae and P. ramorum (Yin et al. 2017; Haas et al. 2009). This contributes to the natural selection of pathogen strains carrying mutated alleles (Vleeshouwers et al. 2008). More recently, Lee et al. (2020) completed additional short- and long-read sequencing to assist the assembly and genome annotation of two P. infestans isolates sampled in the Republic of Korea. A reduced number of RXLR effector-encoding genes was detected: 433 and 310 in the Korean strains KR_1_A1 and KR_2_A2, and 306 in T30-4 (Lee et al. 2020).

Understanding the history of plant diseases and associated epidemics greatly benefits our knowledge and our ability to future-proof crop production through informed crop disease management and targeted plant breeding. Obtaining DNA from a variety of preserved samples, or aDNA (Pääbo et al. 2004), has seen incredible advances since the first attempted studies, driven by the invention of next-generation sequencing (NGS) technologies. Previously, polymerase chain reaction (PCR) was utilised in conjunction with herbarium specimens to understand the course of past late blight epidemics (May and Ristaino 2004; Ristaino 1998; Ristaino et al. 2001) and, more recently, herbarium specimens have been investigated using NGS in order to reconstruct the genomics of ancient plant pathogens (Martin et al. 2013; Yoshida et al. 2014; Smith et al. 2020). Whilst useful, working with aDNA comes with its own set of challenges. aDNA is often present only in low quantities, is highly degraded and contains modifications not normally found in fresh samples (Pääbo et al. 2004; Yoshida et al. 2015).

In Melbourne, Australia, the Victorian Plant Pathology Herbarium (VPRI) is maintained by Agriculture Victoria. The herbarium contains a rich diversity of plant disease specimens dating back to the late 1800s. The establishment of the collection by Daniel McAlpine in 1890 allowed the collection of plant pathogens across Australia, totalling c. 43,000 dried specimens and cultures (Fish 1970; Shivas et al. 2006). The importance of such collections has been widely accepted for understanding past pathogen epidemics, reconstructing the genomics of ancient specimens to understand their evolution, re-identifying present pathogens with updated technologies and informing disease prevention strategies (May and Ristaino 2004; Pääbo et al. 2004; Ristaino 1998; Ristaino et al. 2001; Smith et al. 2020, 2021; Yoshida et al. 2014, 2015). Recently, an NGS approach was used to re-examine VPRI powdery mildew specimens that are up to 130 years old, identifying that all Australian-collected samples labelled Podosphaera spp. are Po. clandestina and not the cherry powdery mildew pathogens, Po. pruni-avium, Po. cerasi or Po. prunicola, which are presumed absent from Australia (Smith et al. 2021). Given that the VPRI contains late blight specimens collected by McAlpine during the initial outbreaks of the disease in Australia and specimens retained from further outbreaks during the 20th century, the same NGS approach used by Smith et al. (2021) could provide a snapshot of the history of P. infestans within Australia.

In this study we aim to elucidate the history and progression of Australian isolates of P. infestans across the last century while highlighting how the knowledge of genetics combined with NGS technologies can overcome the shortcomings of inferring historic epidemics through geographical distribution and small numbers of molecular markers.

Materials and methods

Samples and DNA extraction

All samples labelled P. infestans maintained by VPRI (Bundoora, Victoria, Australia) were accessed and examined for suitability for pathogen DNA extraction under clean room conditions, as described by Smith et al. (2020). The samples included specimens maintained as dried infected leaf as well as dried agar cultures of mycelium. In addition, a series of more recent samples of infected plant tissue was accessed; these had been collected in Tasmania and sampled onto FTA cards following the EuroBlight protocol (Cooke 2018). To standardise the starting material, a 6 mm leaf punch, sterilised between samples with 80% ethanol, was used to cut sections of infected material, aiming for the border of the lesion on dried leaf specimens and FTA cards and for visually dense mycelial clumps on the dried agar specimens. These samples were processed in a clean room setting, keeping each sample separate. Between each sampling, tweezers, hole punch, microscope plate and immediate workspace were sanitised with 80% ethanol. In total, 73 samples were collected, 9 from FTA cards.

DNA was extracted from all collected samples using the E.Z.N.A. forensic DNA kit (Omega Bio-Tek, Norcross, GA, USA) following the methods detailed in Smith et al. (2020). Yield, quality and degradation were assessed using a NanoDrop 2000 (Thermo Fisher, Waltham, MA, USA), a Qubit 2.0 Fluorometer (Thermo Fisher, Waltham, MA, USA) and a TapeStation 2200 (Agilent, Santa Clara, CA, USA). The total number of samples taken through to the sequencing stage was 58, comprising 50 samples taken from leaf and 8 from FTA cards.

Library preparation and sequencing

Samples were prepared for sequencing using the Ovation Ultralow system V2 (NuGen, Redwood City, CA, USA) following the manufacturer’s instructions. Samples were barcoded and then pooled in equimolar concentrations. All samples were sequenced on a HiSeq 3000 (Illumina, San Diego, CA, USA), generating 2 × 150 bp paired-end reads. Two sequencing pools were created to account for the range of DNA concentrations: a 3 nM pool and a 10 nM pool. The two pools were sequenced together once, and the 10 nM pool was then sequenced a second time. All sequencing data are available from NCBI under the BioProject ID PRJNA934104.

Sequencing data analysis and reference genome sequences

The generated sequence data were demultiplexed into individual samples and evaluated for sequence quality. Reads in which 3 or more bases had a quality score lower than 20 were trimmed using a custom Perl script (Braich et al. 2020, supporting data), or reads were trimmed from the 3’ end when the median Phred score was lower than 20. Adaptors were removed using cutadapt v1.9 (Martin 2011), and only reads longer than 75 nucleotides were retained. The resulting quality-filtered reads were aligned to the P. infestans mitochondrial genome (NC_002387.1; Paquin et al. 1997) and whole genome (T30-4, ASM14294v1, assembly accession GCA_000142945.1; Haas et al. 2009) using the Burrows-Wheeler Aligner mem algorithm (Li and Durbin 2009). If multiple BAM files existed for a sample, they were merged at this stage, after which samtools flagstat and samtools coverage outputs were generated. Single nucleotide polymorphisms (SNPs) were called for each sample using samtools mpileup and the call function of BCFtools (Li 2011), ignoring indels. Samples with less than 75% coverage of the mitochondrial genome were removed from the analysis. The VCF files were filtered and processed using VCFtools with a missing data score of 0.8 to remove low-quality SNPs, generating a genotype file and depth statistics (Danecek et al. 2011).
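The read-level filtering described above can be illustrated with a short Python sketch. This is not the custom Perl script cited in the text, which is not reproduced here; it is one possible reading of the 3’ trimming rule (remove bases from the 3’ end until the median Phred score reaches 20) followed by the length filter. The function names and the Phred+33 encoding are assumptions made for illustration.

```python
from statistics import median

MIN_QUAL = 20    # Phred threshold stated in the text
MIN_LENGTH = 75  # only reads longer than this are retained


def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_three_prime(seq, quality_string):
    """Drop bases from the 3' end until the median Phred score is >= MIN_QUAL."""
    scores = phred_scores(quality_string)
    end = len(scores)
    while end > 0 and median(scores[:end]) < MIN_QUAL:
        end -= 1
    return seq[:end], quality_string[:end]


def keep_read(seq, quality_string):
    """Return True if the trimmed read is still longer than MIN_LENGTH."""
    trimmed_seq, _ = trim_three_prime(seq, quality_string)
    return len(trimmed_seq) > MIN_LENGTH
```

In this sketch a 100-base read of uniformly high quality is kept unchanged, whereas a read whose quality collapses after the first 30 bases is trimmed below the length cut-off and discarded.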

For each sample, a separate consensus mitochondrial sequence file was generated using the BCFtools (Li 2011) consensus command by imputing the sample’s filtered SNPs onto the reference mitochondrial genome. The generated consensus whole mitochondrial genome FASTA files were aligned using Clustal Omega online (v1.2.4; Sievers et al. 2011). A maximum likelihood phylogenetic tree with 1000 bootstraps (ultrafast bootstrap) was then generated using the IQ-TREE web server (version 1.6.12; Trifinopoulos et al. 2016), and the resulting Newick tree was visualised using TreeView (http://etetoolkit.org/treeview/).
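The idea of imputing a sample’s SNPs onto the reference can be sketched in a few lines of Python. This is a simplified illustration of the concept only, not the BCFtools implementation: it handles single-base substitutions, and the function name and the dictionary input format are hypothetical.

```python
def build_consensus(reference, snps):
    """Return the reference sequence with SNP alleles substituted in.

    reference: reference sequence as a string
    snps: dict mapping 1-based position -> alternative base
    """
    bases = list(reference)
    for pos, alt in snps.items():
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the reference")
        bases[pos - 1] = alt  # convert 1-based VCF-style position to 0-based index
    return "".join(bases)


# Toy example: two substitutions on an 8 bp "reference"
consensus = build_consensus("ACGTACGT", {2: "T", 8: "A"})
```

Note that any position without a called SNP keeps the reference base in this scheme, including positions with no read coverage.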

A collection of reference genomes was used as standards to augment the analysis. Four genome sequences were added: three covering mitochondrial haplotypes Ia, IIa and IIb (GenBank accessions AY898627.1, AY898628.1 and U17009.2), collected from the Netherlands, United Kingdom and USA, and the reference mt genome AY894835.1 (collected from the USA).

Results

Sample extraction and sequencing

A total of 66 samples, with collection dates ranging from 1873 to 2019, were accessed from VPRI and a small section of tissue was removed from each for DNA extraction. The DNA extraction method was not successful for any sample maintained on dried agar (8 dried agar samples in total). From the DNA extraction, a total of 58 samples (87.88%) were deemed successful and of sufficient DNA quality (> 1 ng/µl estimated in the 50–400 bp range) to proceed with sequencing library preparation. All of these samples generated a sequencing library of sufficient quantity to be progressed to sequencing. Reads generated per sample ranged from 17,879,359 to 120,626,031, with an average of 44,658,563 reads overall. For the samples preserved on FTA cards, the number of reads generated ranged from 19,285,670 to 78,625,894. Based on the read numbers generated, no trends were observed relating to the age of the sample or the preservation medium (Table 1).

Table 1 Passport data and numbers of sequence reads generated for Phytophthora infestans specimens obtained from the Victorian Plant Pathology Herbarium and on FTA cards from Tasmania, Australia, in 2019

Sequence alignment and SNP discovery

The generated sequence reads were aligned to the complete T30-4 reference genome. However, in the alignments to the nuclear genome, 89% of samples showed less than 2x coverage, which was deemed insufficient for any form of genomic analysis (Supplementary file 1). The lack of coverage is likely the result of degradation of nuclear DNA during prolonged storage; a decision was therefore made to focus on mitochondrial alignments. The sequence alignment was repeated with only the mitochondrial genome used as the reference.

Alignments to the P. infestans mitochondrial (mt) genome yielded lower percentages of aligned reads than alignments to the nuclear genome; the highest percentage was 1.33% (VPRI 30,141) and the lowest was 0.0002% (VPRI 223). However, owing to the smaller size and higher cellular copy number of the mt genome, overall coverage was 100% for 40 samples. Of the remaining samples that did not have 100% coverage, the lowest coverage was 28.4% (VPRI 223). VPRI 223 also had the lowest depth of coverage at 0.409x; in contrast, VPRI 30,139 had the highest depth of coverage at 3578.33x (Table 2). Samples with less than 75% sequence coverage of the mt genome (specifically samples FTA 2, FTA 4, VPRI 204, VPRI 206, VPRI 211, VPRI 214, VPRI 220, VPRI 223, VPRI 225, VPRI 230 and VPRI 30,145) were removed from any further analysis, leaving mt genome assemblies of 47 P. infestans isolates in the study (41 samples from dried leaf specimens and 6 samples from FTA cards). From each individual alignment, a reference-imputed whole mt genome sequence was generated for the remaining 47 samples.
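The two coverage measures used here, the percentage of the genome covered and the depth of coverage, and the 75% cut-off can be illustrated with a minimal Python sketch. It assumes a list of per-base read depths as input (a hypothetical format chosen for illustration); the study itself took these statistics from samtools coverage output.

```python
def coverage_stats(depths):
    """Return (percent of positions covered, mean depth) for per-base depths."""
    covered = sum(1 for d in depths if d > 0)
    breadth = 100.0 * covered / len(depths)
    mean_depth = sum(depths) / len(depths)
    return breadth, mean_depth


def passes_coverage_filter(depths, min_breadth=75.0):
    """Keep a sample unless less than min_breadth percent of the genome is covered."""
    breadth, _ = coverage_stats(depths)
    return breadth >= min_breadth


# Toy example on a 4 bp "genome": half of the positions have no reads
breadth, mean_depth = coverage_stats([0, 0, 5, 10])
```

The toy sample above has 50% coverage at a mean depth of 3.75x and would be excluded, which shows that a sample can have reasonable average depth and still fail the filter if its reads are unevenly distributed.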

Table 2 Mitochondrial genome alignment statistics for Phytophthora infestans specimens obtained from the Victorian Plant Pathology Herbarium and on FTA cards from Tasmania, Australia, in 2019

The aligned reads were then assessed for variants, and a total of 106 variant sites were identified in the 37,957 bp mt genome (0.28% of the genome; Supplementary files 2 and 3). A total of 12 variant sites were fixed for the alternative nucleotide across all isolates compared to the reference sequence and were therefore uninformative in the analysis of this set of P. infestans isolates from within Australia. Of the remaining 94 informative variant sites, 73 (77.7%) were variant in only one sample in the data set generated.
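The distinction drawn above between sites fixed for the alternative nucleotide, sites variant in a single sample, and sites shared by several samples can be made concrete with a small Python sketch. The function name and the input dictionaries are hypothetical and the data are a toy example, not the study’s genotype file.

```python
def classify_variant_sites(reference, genotypes):
    """Sort variant positions into fixed, singleton and shared classes.

    reference: dict mapping position -> reference base
    genotypes: dict mapping sample name -> dict of position -> called base
    """
    classes = {"fixed": [], "singleton": [], "shared": []}
    for pos in sorted(reference):
        ref = reference[pos]
        calls = [sample_calls[pos] for sample_calls in genotypes.values()]
        n_alt = sum(1 for base in calls if base != ref)
        if n_alt == 0:
            continue  # no sample differs from the reference here
        if n_alt == len(calls) and len(set(calls)) == 1:
            classes["fixed"].append(pos)      # same alternative base in every sample
        elif n_alt == 1:
            classes["singleton"].append(pos)  # variant in exactly one sample
        else:
            classes["shared"].append(pos)     # variant in several, but not all, samples
    return classes


reference = {10: "A", 20: "C", 30: "G", 40: "T"}
genotypes = {
    "s1": {10: "G", 20: "C", 30: "A", 40: "T"},
    "s2": {10: "G", 20: "T", 30: "A", 40: "T"},
    "s3": {10: "G", 20: "C", 30: "G", 40: "T"},
}
site_classes = classify_variant_sites(reference, genotypes)
```

Fixed sites separate every sample from the reference equally, so they carry no information about relationships among the samples themselves, while singleton sites distinguish only one sample from the rest.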

Phylogenetic analysis

The set of P. infestans mt genomes was then augmented with four publicly available reference sequences and aligned (aligned sequences available in Supplementary file 4). All of the Australian isolates in this study grouped together. The three reference A1 strain isolates AY898627.1, AY898628.1 and U17009.2 (collected from the Netherlands, United Kingdom and USA) and the reference mt genome AY894835.1 (collected from the USA) grouped with the Australian samples (Fig. 1). The most genetically distant isolate was VPRI 212 from South Australia, collected in 1909. Accessions AY898627.1 and AY898628.1, which represent mitochondrial haplotypes IIa and IIb, formed their own clade, but the relatedness of all other samples was difficult to resolve.

Fig. 1

Maximum likelihood phylogenetic tree based on 1000 ultrafast bootstraps of mitochondrial alignments of Australian P. infestans samples from the Victorian Plant Pathology Herbarium and FTA cards from Tasmania, A1 strain samples AY898627.1; AY898628.1 and U17009.2 and reference mitochondrial genome AY894835.1. Substitution model used was K3Pu + F + I + G4, branch lengths represent evolutionary distances, specifically estimated number of substitutions per site, tree is unrooted. Total tree length (sum of branch lengths) is 19.30 and values displayed in red text are bootstrap supports (%)

To further refine the relatedness of the VPRI samples, the sequences for AY898627.1 and AY898628.1 were removed. The resulting maximum likelihood phylogenetic tree (Fig. 2) still shows VPRI 212 as the most distantly related P. infestans isolate. Further examination identified that 38 of the 94 variant sites (40.4%) were unique to VPRI 212, with only 56 variant sites (0.15% of the mt genome) identified among the other 46 mt genome sequences. Despite the limited diversity, the P. infestans isolates were divided into several broad clades. FTA 60, collected from Tasmania in 2019, was most closely related to the sequences originating from the UK and US. Eight of the Victorian samples from 1911 showed a high degree of similarity, with only 10 of the 56 variant sites not being in agreement by 1 or 2 genotypes between these samples (VPRI 199, 201, 202, 219, 221, 224, 226, 229; Fig. 2). These isolates included two host species: Solanum tuberosum (potato) and S. aviculare, a native Australian plant called ‘Kangaroo Apple’. Whilst these isolates do not have an exact present-day counterpart, FTA 79 from Tasmania in 2019 was closely related.

Fig. 2

Maximum likelihood phylogenetic tree based on 1000 ultrafast bootstraps of the mitochondrial alignments of Australian P. infestans samples from the Victorian Plant Pathology Herbarium and FTA cards from Tasmania with reference mitochondrial genome AY894835.1. Samples isolated from S. aviculare are indicated with a * against the name. Substitution model used was K3Pu + F + I, branch lengths represent evolutionary distances, specifically estimated number of substitutions per site, tree is unrooted. Total tree length (sum of branch lengths) is 0.004, sum of internal branch lengths is 0.0006 and values displayed in red text are bootstrap supports (%)

Discussion

The objectives of this study were to utilise next-generation sequencing of historical and current P. infestans specimens to examine the evolutionary history and journey of strains collected within Australia, in combination with potentially identifying missed historical events.

While the amount of DNA available from these sample types and the post-extraction concentrations were above expectations, several challenges of using DNA derived from herbarium samples were identified. As with most sample types, a high level of contaminating DNA (e.g. host) was present in the samples; the average percentage of reads that aligned to the nuclear genome was only 6%. In two other studies (Martin et al. 2013; Yoshida et al. 2013), P. infestans DNA made up 1–20% of the total DNA extracted, with the majority belonging to the host and a smaller percentage to other organisms. The DNA of the pathogen was highly fragmented, making nuclear de novo assembly or reference genome alignment impossible. To enable nuclear genome comparisons between historic and modern strains, novel approaches would be required, focussing on deeper sequencing or potentially more targeted approaches. These and other challenges have been documented by other studies (Besnard et al. 2014; Haas et al. 2009; Hofreiter et al. 2001; Martin et al. 2013; Pääbo et al. 2004; Yoshida et al. 2013, 2015). However, using the NGS approach of Smith et al. (2020) for herbarium specimens, enough sequence data was generated to enable comparison of the mitochondrial genomes of P. infestans collected from the UK in 1873 and 1879, the USA in 1889, and Australia in 1900 and onwards, i.e. up to 149 years old, demonstrating the benefit of plant pathogen herbaria.

From the mt genome sequence data presented, it is clear that the isolates used in this study specifically belong to the Ia subgroup of the A1 mating type and that the A2 strain is absent from Australia. Furthermore, the mt genomes did not separate into distinct groups of historic and recent P. infestans isolates. The integrated genotype data indicate that there is a series of evolving and circulating strains present in Australia that probably date from the early 20th century, potentially even earlier, and that could predate the Quarantine Act of 1908 (Fish 1970). Following the 1909–11 late blight outbreak, from which many of the samples studied originated, biosecurity measures were implemented within Australia that limited the movement of infected crops to other districts and areas, along with the application of fungicides. Formal seed potato certification schemes were then implemented and were made official a little later, in the 1940s (Philip 2018). Australia has some of the strictest phytosanitary requirements for the import of potato germplasm, requiring routine permits, certification and inspections along with pre-export treatments and post-entry requirements, and the import of bulk, fresh, unprocessed potato tubers has been banned for several decades (Eschen et al. 2015). These biosecurity measures have likely resulted in the exclusion of additional strains, potentially aided by an unfavourable climate. The genotypic diversity identified among the historic strains does indicate that there were potentially multiple introductions at that time.

The outgroup nature of VPRI 212 (collected from South Australia in 1909) is indicative of a strain that has potentially been lost from circulation within Australia, as this study did not detect any genotypically related samples among those from the following years. Interestingly, this specimen was one of five recorded on the same date from the same place (West Adelaide, SA, 2nd Oct 1909), three of which (VPRI 212, 213, 215) were included in the mitochondrial analysis but did not cluster together. The identification of one such example of a strain that was detected and subsequently eradicated likely indicates that this event could have occurred several times during the period under investigation. The absence of breakout clades among recent samples is also informative, indicating that there have been no recent introductions of exotic strains of P. infestans and further supporting the premise that all introductions occurred in the early part of the 20th century.

VPRI 12,317 and VPRI 216 were closely related (Fig. 2). Both were collected from within Victoria, but they differ in collection date by 74 years. The most extreme version of this phenomenon involves samples 13,186 and 16,892, both from Victoria and collected in 1986 and 1989 respectively, which have mt genomes identical to those of samples 205, 207, 215 and 222 from Victoria, Tasmania and South Australia collected between 1909 and 1911. Among samples from outside Australia, only VPRI 6188 (collected from the UK in 1873) and VPRI 6190 (collected from the USA in 1889) show a high degree of relatedness with FTA 60, collected from Tasmania in 2019 (Fig. 2), with these strains spanning 146 years.

VPRI 9107 was collected from the UK in 1879 but, as seen in both Figs. 1 and 2, it clusters with the majority of the Australian samples. VPRI 9107 is most closely related to VPRI 30,140, collected in 2002, putting 123 years between these two samples. The ability to obtain usable and informative mtDNA from samples stored as dried leaf tissue and preserved for over a century is not only an additional benefit of herbarium collections but also provides informative data on strain haplotypes and their introduction over recent history. Two separate clades appear to be forming among the P. infestans isolates analysed in this study, one showing relatedness between VPRI 12,317 and VPRI 216, and another showing relatedness between VPRI 6188, VPRI 6190 and FTA 60. It is interesting to note that sample FTA 60, collected in 2019, groups closely with the historic reference sequences U17009.2 and AY894835.1 as well as with VPRI 6188 and VPRI 6190 (both collected in the late 1800s). This potentially indicates a lack of evolution or selection pressure on P. infestans in Australia.

The earliest remaining samples of P. infestans in Australia date from 1909 and originate from Queensland and South Australia (VPRI 212, 213, 215 and 217). These early samples appear to have distinguishable mt haplotypes, with two of the South Australian samples being more closely related (VPRI 213 and 215). The following year, the first retained Victorian and Tasmanian samples were deposited (VPRI 205, 207, 208 and 216). VPRI samples 205 and 207 from 1910 have the same haplotype as the VPRI 215 sample from South Australia from 1909, whilst VPRI 216 is potentially more closely related to the 1909 Queensland sample VPRI 217. The VPRI 203 sample from Victoria from 1911 is certainly more similar to the 1909 Queensland sample (VPRI 217). The sequence similarity and the collection dates indicate that there was potentially a dual or mixed strain introduction of P. infestans into Australia via South Australia and Queensland; within 2 years, however, both incursions had migrated across Australia and were detectable in all states.

For the first time, the mt genomes of six specimens of P. infestans infecting kangaroo apple (S. aviculare), a native Australian Solanum species, were sequenced. The specimens dated from 1909 to 1911 and were collected by McAlpine and colleagues in Victoria. McAlpine noticed these infected native plants growing alongside infected potato fields and documented this with photographs in his publication on potato diseases (McAlpine 1911). The kangaroo apple P. infestans mt genome sequences clustered with those from potato collected from the same place and time, but also with those from potato collected in 1910 in Tasmania and Victoria, 1909 in South Australia, 1986 and 1989 in Victoria, 2003 in South Australia and 2019 in Tasmania. This lends evidence to the hypothesis that the old strains have been surviving in Australia since at least 1909. It also suggests that P. infestans may be persisting on this wild host; however, despite active searches by plant pathologists over the years, no further infected kangaroo apple plants have been observed since McAlpine’s detections.

The absence of the A2 strain demonstrates the success of the biosecurity measures used, but also presents an ongoing biosecurity and disease management challenge. As the more virulent strains have been successfully excluded from Australia, the potato cultivars under production in Australia are likely to have been developed in the absence of this disease pressure and may lack effective disease resistance genetics. Targeted, preventative breeding measures should therefore be undertaken to safeguard and de-risk Australian production, through the introgression of some of the recent resistance mechanisms identified from S. bulbocastanum (Oh et al. 2009; Park et al. 2009; Song et al. 2003; van der Vossen et al. 2003, 2005; Zhu et al. 2012).

The identification of only 56 variant sites within the 37 kb mitochondrial genome across most of the samples demonstrates the closely related nature of all of the sequences generated in this study. The identification of 12 variant sites that differ from the P. infestans mtDNA reference sequence published by Paquin et al. (1997), with the alternative nucleotide present in all samples, further supports the high level of genetic similarity and relatedness among all of the samples successfully processed. As all the mt genomes generated in this study, with the exception of VPRI 212, were at least 99.85% identical, there exists the possibility that all of the strains originated from a single historic introduction. However, the minor sequence variation identified, along with the premise that VPRI 212 represents a different strain of the pathogen that failed to establish in Australia, and the observations surrounding the 1909 samples from Queensland and South Australia, makes a single introduction unlikely and suggests that several related strains were historically imported. There is also the possibility that, because missing data were replaced with reference sequence genotypes, samples are artificially made to appear more similar to the reference genome than they are. However, only four samples had missing data.
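The 99.85% figure can be checked with a short calculation. The sketch below assumes the mt genome length of 37,957 bp reported in the Results and treats the 56 variant sites as an upper bound on the number of differences between any two of these genomes; the function name is illustrative.

```python
GENOME_LENGTH = 37_957  # bp, mt genome length reported in the Results


def percent_identity(n_differences, length=GENOME_LENGTH):
    """Percentage of positions that are identical, given a count of differing sites."""
    return 100.0 * (length - n_differences) / length


# Worst case: a pair of genomes differing at every one of the 56 variant sites
lower_bound = round(percent_identity(56), 2)
```

Under these assumptions the lower bound is 99.85%; because most variant sites occur in only one sample, typical pairs of genomes are more similar than this.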

Continual surveillance of the circulating strains of P. infestans is of critical importance to safeguard the Australian potato industry. Because of the low number of samples retained across time, the current study lacked the depth and continuity needed to describe extensively the evolution, development and movement of the pathogen in Australia. The sporadic nature of the samples available for inclusion is, however, a reflection of the disease status in Australia, where the pathogen is continually present but typically is not damaging to crop production unless environmental conditions are favourable (Drenth et al. 2002). With the current management strategies in place, low incidences of crop loss will hopefully continue. However, with prediction models of climate change showing Australian locations with desirable environments for new strains of P. infestans (Edwards 2006), the established production zones are at risk of infection by new strains, heightening the need for proactive preventative breeding, robust crop management, strict biosecurity measures and preparedness in managing the disease.