Introduction

Capsicum peppers belong to the family Solanaceae and are native to tropical and temperate regions of the Americas (Barboza et al. 2022; Eshbaugh 1993). Capsicum peppers were domesticated before 6,000 BP (Perry et al. 2007), making them one of the oldest domesticated crops in the Americas (Davenport 1970). The Olmecs, Toltecs, and Aztecs used Capsicum peppers extensively (Boswell 1949). After Columbus returned to the Old World in 1493 with Capsicum peppers, they soon spread from Spain to other European countries. In 1585, the Portuguese brought Capsicum to India (Sturtevant 1885), and it was subsequently transported via the sea to Malacca and spread to the Far East (Andrews 1995:1–10), reaching Oceania during the early European era (Whistler 1992:131–132).

In Japan, Capsicum pepper seed remains have been recovered from the Motobukuro Site (late sixteenth to seventeenth centuries) in Sendai City, Miyagi Prefecture (Sendai City Board of Education 2004:248–251), several Edo era (1603–1868) sites in Tokyo (e.g., Tokyo Metropolitan Archaeological Center 2000:255–268), and the Miyazawanakamura Site (the middle to late Edo era) in Higashiyatsushiro-gun, Yamanashi Prefecture (Yamanashi Prefectural Archaeological Center 2000:34–41). A literature survey by Yamamoto (2015) revealed several hypotheses pertaining to the introduction of Capsicum peppers to Japan: in 1542 by the Portuguese; in 1592–95 from the Korean Peninsula; and in 1596–1615 or 1605 with tobacco by European merchants. An entry in the Tamon-in Nikki (a diary written by monks at Tamon-in temple in Nara) on February 18, 1593, includes a description of chili peppers, while a note in the Korean book Jibong yuseol published in 1614 indicates that the Japanese introduced chili peppers to the Korean Peninsula. Based on archaeological findings and the literature, Capsicum peppers were introduced into Japan before the seventeenth century and were transferred between countries through trade. During the initial stage of the introduction of Capsicum peppers to Japan, they were sold as medicine with other spices and herbs at stalls in front of temples, but within 100 years they were cultivated and widely used as a spice (Matsushima 2020:20–30, 195–202).

Of the approximately 35 Capsicum species currently recognized (Carrizo García et al. 2016), five are economically important: C. annuum, C. frutescens, C. chinense, C. baccatum, and C. pubescens. The phylogenetic relationships of these species were reviewed recently; C. annuum is a close relative of C. frutescens (Barboza et al. 2022). In the Asia–Pacific region, C. annuum and C. frutescens are primarily used as food and condiments. The comprehensive population structure and distribution of C. annuum has been extensively investigated (Tripodi et al. 2021); however, many varieties have been produced and distributed worldwide, making it difficult to reconstruct its dispersal routes. Capsicum frutescens also has a global distribution (Barboza et al. 2022) and there are many local varieties in the Asia–Pacific region.

Capsicum frutescens is a semi-domesticated species characterized by seed dormancy, small deciduous fruit, and flowering inhibition under prolonged illumination (Yamamoto and Nawata 2006, 2009a; Yamamoto et al. 2007, 2008). In Japan, C. annuum, which is adapted to the temperate zone, is mainly produced on the mainland, while C. frutescens is cultivated only in the subtropical Ryukyu and Ogasawara islands. Capsicum frutescens is a short-day, late-maturing plant that needs appropriate temperatures for flower bud formation (Cochran 1942; Quagliotti 1979). In the temperate zone, the weather is not warm enough after the flowers are finally induced in autumn for the mature fruit to develop. In the Ryukyu Islands, a product called koregusu is made by soaking mature C. frutescens fruits in awamori, an Okinawan spirit, and is used to flavor noodles and other foods (Yamamoto and Nawata 2005). In the Ogasawara Islands, people place fresh mature, or sometimes immature, C. frutescens fruit in soy sauce and squash it to make a dip used for raw fish.

Capsicum frutescens is used as a condiment (fruits), vegetable (leaves), and medicine (fruits, leaves, or seeds), and plays a role in popular beliefs and rituals (mainly fruits) across the Asia–Pacific region (e.g., Yamamoto and Girsang 2021). Naturalized forms of C. frutescens are often found along forest edges, in fields or orchards, and along roadsides in the Asia–Pacific region. These naturalized plants are also used, and local people have become more strongly attached to C. frutescens than to C. annuum, which is rarely naturalized in this region (e.g., Yamamoto 2012, 2013). To track early dispersal routes, botanical approaches are especially important in areas that lack archaeological and historical records of Capsicum peppers, such as Southeast Asia and Oceania. Yamamoto and Nawata (2004, 2005, 2009b) and Yamamoto et al. (2011) studied the distribution and dispersal of C. frutescens in Southeast and East Asia using morphological characteristics and isozyme analysis. Detailed DNA analyses make it possible to clarify the dispersal and distribution of C. frutescens in the Asia–Pacific region further.

Restriction site-associated DNA sequencing (RAD-seq) is a popular technique for population genetic studies in different species (Davey et al. 2011). Tanaka et al. (2024) assembled the complete circular chloroplast genome sequences of C. frutescens and found two different haplotypes (CF types 1 and 2, referred to here as T type and TC type, respectively). For a higher-resolution analysis of the dispersal and distribution of C. frutescens in the Asia–Pacific region, we combined DNA analyses of the nuclear and chloroplast genomes.

Materials and Methods

Plant Materials

We used a total of 357 accessions of C. frutescens from Japan (3), the Americas (32, mainly Colombia), Southeast Asia (278, mainly Indonesia and Cambodia), Micronesia (36), and other areas (8). Capsicum annuum “Takanotsume” was used as a near-outgroup sample for the RAD-seq analysis because it is closely related to C. frutescens (Barboza et al. 2022). Seeds from the Americas were obtained from the Norio Yamamoto Collection; such seeds are also stored in the National Agriculture and Food Research Organization GeneBank, Japan. All other seeds were obtained from Kagoshima, Kindai, Kyoto, and Shinshu universities, Japan. Our plant materials are not inbred pure lines, but some of them have self-pollinated at least once. The materials in this study therefore maintain a degree of heterogeneity, as can exist in the native habitat. Although our analysis might not cover lost allelic diversity, the phylogenetic analysis is sufficiently robust to discuss the genetic structure of C. frutescens owing to the large number of samples and comprehensive SNP detection across the chromosomes.

RAD-Seq Analysis

A Nucleon PhytoPure Kit (GE Healthcare, Little Chalfont, UK) was used to extract DNA from young Capsicum leaves. RAD-seq libraries of 357 C. frutescens accessions were constructed as described previously (Sakaguchi et al. 2015), and sequenced using a HiSeq X sequencing system (Illumina, San Diego, CA, USA). Raw sequence reads (PE150) were preprocessed using the Trimmomatic v0.39 trimming tool (Bolger et al. 2014) with the following parameters: Trimmomatic PE -threads 16 -phred33 ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:19 TRAILING:19 SLIDINGWINDOW:30:20 AVGQUAL:20 MINLEN:51. The quality of trimmed reads was verified by FastQC v0.11.9 (Andrew 2015). The trimmed reads were mapped onto the whole-genome sequence of C. chinense (PI159236) (Kim et al. 2017) using the BWA-mem v0.7.17-r1188 aligner. Variant calling was conducted using the GATK v4.1.2.0 HaplotypeCaller tool (Poplin et al. 2017). Extracted single-nucleotide polymorphism (SNP) data were filtered using VCFtools (Danecek et al. 2011) to exclude SNPs with more than two alleles, minimum depth < 6, maximum missing rate > 0.8, and minor allele frequency < 0.03.
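The filtering criteria can be illustrated with a minimal Python sketch. It is not the VCFtools implementation used in the study. The function name keep_site, the toy genotype representation, and the reading of the thresholds (genotypes with depth below 6 treated as missing, and sites dropped when the missing rate exceeds 0.8 or the minor allele frequency falls below 0.03) are assumptions made for illustration.

```python
from collections import Counter

# Thresholds taken from the text; how they map onto VCFtools options is an assumption.
MIN_DEPTH = 6            # genotypes with lower read depth are treated as missing
MAX_MISSING_RATE = 0.8   # sites with a higher missing rate are dropped
MIN_MAF = 0.03           # sites with a lower minor allele frequency are dropped


def keep_site(n_alt_alleles, genotypes):
    """Return True if a toy SNP site passes all four filters.

    genotypes: one entry per accession, either None (no call) or a tuple
    (allele_a, allele_b, depth) with alleles coded 0 (reference) or 1 (alternate).
    """
    if n_alt_alleles != 1 or not genotypes:
        return False  # keep biallelic sites only
    called = [g for g in genotypes if g is not None and g[2] >= MIN_DEPTH]
    missing_rate = 1 - len(called) / len(genotypes)
    if missing_rate > MAX_MISSING_RATE or not called:
        return False
    counts = Counter()
    for allele_a, allele_b, _depth in called:
        counts[allele_a] += 1
        counts[allele_b] += 1
    maf = min(counts[0], counts[1]) / (counts[0] + counts[1])
    return maf >= MIN_MAF
```

For example, a site at which 99 accessions are homozygous for the reference allele and one is heterozygous has a minor allele frequency of 1/200 and would be dropped under this reading.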

Then, 8,210 biallelic SNPs were analyzed using IQ-TREE v1.6.12 (Nguyen et al. 2015) to estimate a maximum likelihood (ML) tree, with the general time-reversible substitution model and ascertainment bias correction option. A total of 1,000 bootstrap replications were performed to assess the clade support, while the population structure was assessed by Bayesian clustering in Admixture v1.30 (Alexander et al. 2009). The number of clusters (k) was set to vary from 1 to 25, and k = 6 was selected based on the resulting low cross-validation error and clustering structure.
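As a hedged illustration of the cross-validation step, the sketch below collects the per-k cross-validation errors from a clustering log and reports the k with the lowest error. It assumes log lines of the form "CV error (K=3): 0.48", which is how ADMIXTURE reports cross-validation results; the function names and the toy values are invented for illustration and are not results from this study. Because k = 6 was chosen here on the basis of both low cross-validation error and clustering structure, the minimum is only a starting point for that decision.

```python
import re

# Matches lines such as "CV error (K=3): 0.48" in a clustering log.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9]*\.?[0-9]+)")


def parse_cv_errors(log_text):
    """Map each K to the cross-validation error found in the log text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def lowest_cv_k(cv_errors):
    """Return the K with the smallest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)


# Invented values for illustration only; they are not results from this study.
toy_log = """
CV error (K=2): 0.52
CV error (K=3): 0.48
CV error (K=4): 0.50
"""
errors = parse_cv_errors(toy_log)
print(lowest_cv_k(errors))
```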

Chloroplast Genome

Our preliminary study of the complete C. frutescens chloroplast genome for four accessions (“Tabasco,” Irabu2, Jire Khursani, and Mates Sor) originating from different countries (the USA, Japan, Nepal, and Cambodia, respectively) identified two haplotypes that are distinguished by a single nucleotide insertion in the intergenic region between psaA and ycf3 (Tanaka et al. 2024). The two haplotypes have no other SNPs. Therefore, we amplified only the intergenic region between psaA and ycf3 from C. frutescens accessions using a genomic polymerase chain reaction (PCR). Total DNA was extracted from young leaves of 357 accessions of C. frutescens using a Nucleon PhytoPure kit (GE Healthcare). This sample set was exactly the same as that used for the RAD-seq analysis. The PCR was performed in a 20-μL reaction mixture containing 0.3 μL of KOD FX neo (Toyobo, Osaka, Japan), 10 μL of buffer (provided with DNA polymerase), 4 μL of dNTPs (2 mM), 0.5 μL of forward and reverse primers (forward: GATTGGGTCTTCCAAAAGCA, reverse: TTCTTCTGAAGGTGGGAAAAA; 10 μM each), and a 0.5-μL aliquot of genomic DNA. The PCR conditions were as follows: one cycle at 94°C for 2 min, followed by 35 cycles at 98°C for 10 s, 55°C for 30 s, and 68°C for 1 min. The PCR products were sequenced by Eurofins Genomics (Tokyo, Japan). The ATGC sequence assembly software (GENETYX, Tokyo, Japan) was used to analyze the nucleotide sequence.
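Because the two haplotypes differ only by a single-nucleotide insertion, assigning a sequenced amplicon to a haplotype reduces to a simple comparison. The sketch below is one possible way to express that logic and is not the GENETYX workflow used in the study. It assumes that the TC type carries the extra base relative to the T type, that reads are already trimmed to the same interval, and that a T-type reference sequence is available; the function names and the short sequence in the usage note are hypothetical placeholders, not the real psaA–ycf3 sequence.

```python
def is_single_insertion(shorter, longer):
    """True if `longer` equals `shorter` with exactly one extra base inserted."""
    if len(longer) != len(shorter) + 1:
        return False
    i = 0
    while i < len(shorter) and shorter[i] == longer[i]:
        i += 1
    # Skip the candidate inserted base and require the remainder to match.
    return shorter[i:] == longer[i + 1:]


def classify_haplotype(read, t_reference):
    """Label a trimmed amplicon read as 'T', 'TC', or 'unknown'."""
    read = read.upper()
    t_reference = t_reference.upper()
    if read == t_reference:
        return "T"
    if is_single_insertion(t_reference, read):
        return "TC"
    return "unknown"
```

With a placeholder reference such as "ACGTTAGGT", the read "ACGTTCAGGT" would be labeled TC, whereas a read carrying a substitution would be labeled unknown and would need manual inspection.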

Results and Discussion

RAD-Seq Analysis of C. frutescens in the Asia–Pacific Region

Figure 1 shows the ML tree for 357 accessions of C. frutescens inferred from the RAD-seq dataset. This provides a phylogenetic model that groups lineages from breeding populations scattered over a large part of the species distribution. Six accessions from the Americas and two from Nepal were genetically close to C. annuum. Tokuda et al. (2020) reported that one of the two Nepal accessions was a hybrid derived from a cross between C. frutescens and C. annuum; our results are consistent with this finding. The six accessions from the Americas were highly divergent and did not form a monophyletic group. They might all be hybrids between C. frutescens and C. annuum, or they might indicate a diversity within C. frutescens that is much higher in the Americas than in other regions. No hybrid was found in either the Pacific region or Southeast Asia in this study, suggesting that a hybrid could have been introduced to Nepal via Europe and India after an early transatlantic crossing (Boswell 1949; Sturtevant 1885) or that hybridization could have occurred in Nepal.

Fig. 1 The maximum likelihood tree of 357 accessions of Capsicum frutescens inferred through restriction site-associated DNA sequencing. The tree was re-rooted with C. annuum “Takanotsume” as an outgroup sample. Pie chart colors indicate genetically different clusters predicted by admixture mapping (k = 6)

Group I consisted of 17 accessions from the Americas, six from the Republic of the Marshall Islands (RMI), 19 from the Federated States of Micronesia (FSM), one from Tahiti, five from Indonesia, three from Japan, two from Cambodia, and one from Sudan. Next, we focused specifically on subgroups I-1 and I-2, which include our samples from Japan.

Subgroup I-1 includes two accessions from the Ogasawara (Bonin) Islands, Japan, which are located approximately 1,000 km south of Tokyo Bay, six from the FSM, one from the RMI, and seven from the Americas. The accessions from the Ogasawara Islands were genetically very close to those from Micronesia and the Americas, but were distant from those in both continental and insular regions of Southeast Asia. The Ogasawara plants had small fruit (length: 1–3 cm) that are green when immature (Fig. 2A); within subgroup I-1, other accessions with available morphological descriptions had the same fruit characteristics.

Fig. 2 Fruit characteristics of Ogasawara (A) and Okinawa (B) accessions of C. frutescens

Subgroup I-2 consisted of one accession from Okinawa (the Ryukyu Islands), southern Japan, four from the FSM, one from Tahiti, one from Sulawesi Island, Indonesia, and five from the Americas. This result was similar to that reported by Yamamoto et al. (2011), in which accessions from the Ryukyu Islands were found to have a rare isozyme pattern (shikimate dehydrogenase phenotype B). This phenotype is distributed throughout Taiwan, the Batanes Islands in the Philippines, Indonesia, Vanuatu, and Ecuador, but is not present in continental Southeast Asia. One accession from Ecuador also possessed the rare isozyme, and we also found it in subgroup I-2 (Fig. 1).

In this subgroup, one accession from Okinawa, two accessions from the FSM, and one from the USA (“Tabasco”) all had relatively large fruits (length: 3–4 cm) that are greenish-yellow when immature and morphologically different from the Ogasawara Islands accessions (Fig. 2B). The Okinawa accession tested here also had a combination of isozyme phenotypes different from that of the Ogasawara accessions (Yamamoto and Nawata 2005). These differences, and the significant genetic distance observed here between accessions from Okinawa and Ogasawara, suggest that there were at least two independent introductions and dispersal routes for C. frutescens into Japan.

Settlement of the Ogasawara Islands started in the early nineteenth century, first by a mixed group of Europeans, Hawaiians, and other Pacific islanders from Hawaiʻi in 1830 (Tanaka 1997:41–42, 62), and then by Japanese people mainly from Hachijojima Island, located 287 km from Tokyo (Long 2002:164–168), with some from Okinawa (Arima 1990:110, 195). When Japan ruled Micronesia between 1914 and 1945, people traveled back and forth between the two regions (Long 2002:164–168), and many farmers from Okinawa migrated to Micronesia (Suzuki 2019) via the Ogasawara Islands on ships. Capsicum frutescens does not occur on Hachijojima Island because the island belongs to the temperate zone, as explained in the Introduction. The Ogasawara Island accessions are morphologically, biochemically, and genetically different from those of Okinawa (i.e., they were not introduced from Okinawa). The accessions from the Ogasawara Islands were genetically very close to those from the Pacific. Capsicum peppers (particularly C. frutescens according to testimonies such as “it is naturalized in mountainous areas and nobody cultivates it in home gardens”) were recorded on the Ogasawara Islands in 1888 (Hattori 1888:8), although there is no way to confirm that the pepper in that book was genetically the same as the samples in this study. According to a field survey conducted in 2003, people on the islands also thought that C. frutescens was introduced before 1945. Therefore, the Ogasawara accessions used in this study were probably introduced from the Pacific in 1830–88 or 1914–45.

All three Japanese plants tested here (and subgroups I-1 and I-2) lie within a larger group comprised mainly of accessions from the Americas and Micronesia (see Fig. 1). Our sample size for Japan was very small, and there could have been many other introductions during the last few hundred years of exchange among Japan, Southeast Asia, and the Pacific islands.

A second monophyletic group, group II, comprises a much larger set of accessions from a wider area, including the Americas (9), the FSM (11), Indonesia (158), and Cambodia (98), plus small numbers from Fiji, the Philippines, Malaysia, Vietnam, Laos, Thailand, Myanmar, Nepal, and Côte d'Ivoire (19 from these nine countries; see Fig. 1). All ten of the FSM accessions in subgroup II-1, which can be defined at a single node, were from Yap State, and were closely related to accessions from Southeast Asia and distant from the other Micronesian accessions. Yamamoto (2021) found more Capsicum pepper cultivars on the Yap Islands than on other islands of the FSM. Yap International Airport was opened in 1988, and there are island-hopping flights between Guam, Yap State, Belau (Palau), and Manila, as well as between Guam, Chuuk State, Pohnpei State, Kosrae State, the RMI, and Hawaiʻi. The ten accessions from Yap State might have been introduced from Manila in recent decades (Fig. 3). However, we should also consider historical movements between Southeast Asia and Micronesia. Therefore, further studies are needed, especially on Belau, which is much closer to the Philippines. Subgroup II-2 can also be defined at a single node and contains accessions only from Southeast Asia.

Fig. 3 Distribution map of the pie charts (colors indicate genetically different clusters predicted by admixture mapping; k = 6) shown in Fig. 1 for C. frutescens accessions, except from Cambodia, Micronesia, and Indonesia (A), from Cambodia (B), from Micronesia (C), and from Indonesia (D)

The RAD-seq analysis indicated that the accessions from Japan are most closely related to those from the Americas and Micronesia, and they are distant from most of those from other islands and continental Southeast Asia (see Figs. 1, 2, and 3). These accessions were likely introduced to Japan from the Americas via the Pacific. From the mid-sixteenth century to the early nineteenth century, Spanish trading ships traversed the Pacific between Manila in the Philippines and Acapulco in Mexico. Tobacco is believed to have been introduced to Manila by these galleons in the late sixteenth century, where it was used as a medicine (Ueno 1998:96–113). Local names for Capsicum peppers, which are probably derived from chile in Spanish, are used in the insular region of Southeast Asia and Micronesia; sili and siling are used in Cebuano and Tagalog, which are the two major languages of the Philippines (Madulid 2001:59); sili, kasiri, and sini are used in the Batanes Islands, Philippines, and Taiwan, respectively (Yamamoto and Nawata 2009b); cili is used in Maluku Province, Indonesia (Yamamoto and Girsang 2021); and sele and jeli are used in Pohnpei State in the FSM (Yamamoto 2011). These local names also support the Pacific dispersal route hypothesis.

Many group II accessions found in Indonesia and Cambodia may have reached Southeast Asia via Portuguese, Dutch, and other European trade routes associated with the spice trade and colonization beginning in the early sixteenth century. Subgroup II-2 is not associated with any American accessions and might comprise recently differentiated lineages from initially diverse introductions that remain present and are associated with the American accessions in group II. Interestingly, Sumatra, in western Indonesia, might have been greatly affected by sea lanes via the Atlantic, while Sulawesi and Maluku in eastern Indonesia might have been affected by both the Atlantic and Pacific dispersal routes (see Figs. 1, 2, and 3).

Distribution of C. frutescens Chloroplast Haplotypes in the Asia–Pacific Region

The distribution of the two C. frutescens chloroplast haplotypes is shown in Table 1. Only the T type was found in the Americas and Japan, whereas both the T and TC types were distributed in other areas (8 and 31 in Oceania; 157 and 121 in Southeast Asia; 2 and 3 in other areas, respectively). We offer two interpretations for this result. First, the TC type may have originated in the Americas and later been introduced to the Asia–Pacific region, but was not detected in the Americas in this study because of the limited number of samples. Alternatively, the TC type may have arisen as a mutation somewhere outside of the Americas and spread to other regions. Further studies of the chloroplast genome of C. frutescens, especially using samples from the Americas, are needed to resolve this issue. Under either interpretation, it is apparent that only the T type of C. frutescens was introduced into Ogasawara and Okinawa, even if it arrived via independent dispersal routes.
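The regional counts quoted in the text can be tabulated and checked with a few lines of Python. This is only an arithmetic sketch: the dictionary layout and the helper tc_share are illustrative, and the entries for the Americas (32) and Japan (3) assume that every accession listed for those regions in Plant Materials was typed as T, as the statement that only the T type was found there implies.

```python
# (T, TC) counts per region as stated in the text.
haplotype_counts = {
    "Americas": (32, 0),        # assumes all 32 accessions were typed as T
    "Japan": (3, 0),            # assumes all 3 accessions were typed as T
    "Oceania": (8, 31),
    "Southeast Asia": (157, 121),
    "Other areas": (2, 3),
}


def tc_share(t_count, tc_count):
    """Fraction of accessions in a region that carry the TC type."""
    return tc_count / (t_count + tc_count)


total_t = sum(t for t, _ in haplotype_counts.values())
total_tc = sum(tc for _, tc in haplotype_counts.values())

for region, (t, tc) in haplotype_counts.items():
    print(f"{region:15s} T={t:3d} TC={tc:3d} TC share={tc_share(t, tc):.2f}")
print(f"Total T={total_t} TC={total_tc} n={total_t + total_tc}")
```

Under these assumptions the counts sum to the 357 accessions analyzed, with the TC type forming the majority only in Oceania.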

Table 1 Distribution of the two different haplotypes (T and TC) of C. frutescens in the Americas, Oceania, Japan, Southeast Asia, and other regions

As noted above, two Nepalese accessions were considered to be hybrids derived from a cross between C. frutescens and C. annuum. Both had the TC type genome, which has not been found in C. annuum (Tanaka et al. 2024), so C. frutescens must be the maternal parent assuming that the chloroplast genome is inherited from the maternal parent (the most common pattern of chloroplast inheritance).

The main groups (I and II) shown in Fig. 1 included accessions with the T and TC chloroplast types (25 and 29 in group I and 171 and 124 in group II, respectively; Fig. 4). This suggests that the two chloroplast haplotypes represent an ancient divergence and are both widespread in the species and in the American source populations. In addition to further studies of the chloroplast genome, an investigation of the intraspecific variation of the mitochondrial genome, which is also inherited from the maternal parent, would be useful to comprehensively understand the dispersal and distribution of C. frutescens in the Asia–Pacific region.
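The group-level counts can be reconciled with the regional totals using a short arithmetic sketch. The variable names are illustrative. The haplotypes of the eight accessions outside groups I and II are inferred rather than quoted: the six American accessions are assumed to be T type because only that type was found in the Americas, and the two Nepalese hybrids are TC type as stated in the text.

```python
# (T, TC) counts for the two main groups of the ML tree, as stated in the text.
group_counts = {"I": (25, 29), "II": (171, 124)}

# Accessions placed near C. annuum, outside both groups (haplotypes inferred, see lead-in).
outside_counts = {"Americas": (6, 0), "Nepal hybrids": (0, 2)}

all_counts = list(group_counts.values()) + list(outside_counts.values())
total_t = sum(t for t, _ in all_counts)
total_tc = sum(tc for _, tc in all_counts)

# Share of the TC type within each main group.
tc_share_by_group = {
    name: tc / (t + tc) for name, (t, tc) in group_counts.items()
}

for name, share in tc_share_by_group.items():
    print(f"Group {name}: TC share = {share:.2f}")
print(f"Total T={total_t} TC={total_tc} n={total_t + total_tc}")
```

Both haplotypes make up a substantial fraction of each group (roughly half of group I and about two fifths of group II), which is the pattern the text interprets as an old divergence rather than a recent, group-specific mutation.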

Fig. 4 Distribution of the two different haplotypes (T and TC) of C. frutescens overlaid on the maximum likelihood tree in Fig. 1. Numbers in blue and pink indicate T and TC types, respectively

Conclusion

The historical Pacific dispersal route contributed strongly to the spread of crops native to the Americas (Alvina and Madulid 2009; Crosby 2003). In addition to a few well-studied examples (e.g., tobacco and sweet potato), many other American crops have naturalized in the Pacific, including tomato, papaya, passionfruit, pumpkin, maize, cassava, and Capsicum peppers. The present genome-wide SNP analysis and geographical sample set also indicate a Pacific route for extant populations of C. frutescens in Micronesia and Japan. Only the T type C. frutescens chloroplast genome was found in Japan, albeit in a small sample set. Since both types are widespread in the Pacific, it is possible that the TC type will also be found in Japan in future surveys. To clarify the multiple likely routes of introduction of C. frutescens into Southeast Asia and Japan, further study is needed with many more samples from Japan and areas such as Polynesia, Melanesia, South Asia, and Africa.