Recent reviews of rice genome analysis have revealed that the domestication of rice may have been a considerably more complex process than previously suspected [26, 28]. Recent genome analysis of Oryza species and cloning of several domestication-related rice genes should have provided hints that would elucidate the rice domestication process, but unfortunately these data did not fit well with previous models or lead to a new, unified model [7, 17, 18, 26, 28]. Although understanding the complexity of the rice domestication process is meaningful, in order to treat rice as a model of plant evolution and to draw general lessons about crop domestication, it will be necessary to highlight critical points (or events) in the rice domestication process and to develop a model of that process that is as simple as possible. Since DNA changes that have occurred during the domestication process can be considered as a historical record, changes such as functional nucleotide polymorphisms (FNPs) in existing landraces, cultivars, and wild relatives provide strong hints that can be used to establish such a model. Recently, several domestication-related genes were cloned and the FNPs utilized during the domestication process were identified [16, 19, 24, 27, 30]. Therefore, in this review, I propose a new draft model to explain the process of rice domestication, with an emphasis on japonica rice, in the hope of stimulating further discussion and the research required to test the model.

Currently, two rice species are grown around the world: African rice (Oryza glaberrima) and Asian rice (Oryza sativa) [8, 32, 33]. Genome analysis suggests that O. glaberrima is closely related to an existing wild rice species, Oryza barthii, whereas O. sativa is closely related to another wild species, Oryza rufipogon [8]. In some of the literature, the perennial and annual types of O. rufipogon are considered to be distinct species, O. rufipogon and Oryza nivara, respectively [6, 28]. In the present review, both types are treated as a single species, O. rufipogon, since the distinctions between O. rufipogon and O. nivara have not been clearly established. Since the geographical distributions of the African and Asian species are distinct, the domestication processes of these species are thought to have been independent and unrelated for at least the last 10,000 years, the period over which rice domestication is believed to have taken place. Since many published studies have focused on the domestication of Asian rice, I focus on this species.

Reconsideration of rice domestication

Ancient humans are believed to have selected a limited number of plants or seeds from populations of wild relatives and to have begun cultivating them as a crop at some point in the past. This picture implicitly assumes that rice domestication was a single, simple historical event. Since recent evidence from molecular genetics and genome bioinformatics has revealed a more complex nature of rice domestication, a primary question is whether rice domestication can be defined as an event rather than a process. This question relates to the definition of “domestication” and must be discussed before the other questions. A few general questions should then be clarified: when, where, how many times, and how did rice domestication occur? Other related questions are: how many key genetic changes were involved in rice domestication? What kinds of plants were the ancestors of modern rice? From a different point of view, these questions could be rephrased as follows: when and where did key mutations occur? When and where were they selected? The DNA changes involved in rice domestication can generally be considered as targets of artificial selection by ancient humans. It is likely that they were selected either as new mutations occurring naturally in the groups of plants subjected to domestication or by introgression of standing variation (pre-existing mutations) from wild relatives. The evidence for such introgression has often been discussed in the context of a “selective sweep”, which is a kind of natural selection for short chromosomal fragments following multiple recombination events resulting from natural crossings [3, 10, 22]. Since rice is a self-fertilizing plant, one must be careful when forming hypotheses about such introgression steps, which would have required at least several back-crossings during domestication and the so-called bottleneck effect, in which rapid genome-wide fixation of unrelated loci would be observed.
Here, it is noteworthy that genome-wide DNA changes should have occurred gradually during rice domestication if natural mutations accumulated in some local populations as a result of artificial selection or if introgression occurred as a result of crossing between closely related individuals. To account for the increasing information that is available on changes in the genes and genomes of rice, these questions are carefully reconsidered later in this review.

The definition of domestication

There is debate within the scientific community over how the process of domestication works (for example, see [7, 17, 18, 26, 28]). Some researchers give credit to natural selection, in which mutations outside of human control make some members of a species more amenable to human cultivation or companionship. Others have shown that carefully controlled selective breeding is responsible for many of the collective changes associated with domestication. These categories are not mutually exclusive. Therefore, a current working definition of domestication should treat it as a process (ongoing selection) rather than as a simple event. In some cases, there may be no good way to distinguish between natural and artificial selection, even based on evidence of DNA changes. For instance, where ancient humans selected an individual plant or a group of plants with favorable agronomic traits (due to one or more mutations) and started to cultivate their progeny as crops, such selection may have had an effect equivalent to that of natural selection with a strong bottleneck effect under growing conditions that differ from those of the wild ancestors [5, 11, 12, 15].

Instead, we must consider the historical trend in how a trait changes. For example, easy seed shattering is favored under natural conditions, whereas non-shattering seeds are favored under cultivated conditions. In this situation, artificial selection clearly proceeds in the opposite direction from natural selection.

Recent genetic analyses using F2 populations of crosses between O. sativa and O. rufipogon and between O. sativa ssp. indica and ssp. japonica revealed many QTLs for various agronomic traits (which could be related to domestication) at dispersed chromosomal positions, rather than a single major locus at a single chromosomal position [1, 16, 35, 37]. In some cases, several distinct QTLs were located together on a limited number of chromosomal segments, suggesting that such loci became involved in rice domestication as a result of introgression steps [4, 18]. The scattering of agronomically useful QTLs is one piece of evidence suggesting that we should not hypothesize a single event for rice domestication: it would be difficult for artificial selection to incorporate many QTLs scattered among many chromosomal positions in a single event. This evidence suggests that it is better to hypothesize that rice domestication occurred as a process characterized by a series of events that involved several key selections.

How many times did rice domestication occur?

Recent genomic analysis has revealed the diversity of the O. sativa genome [5, 6, 12, 15]. In particular, the locations of retroelements and the diversification of long terminal repeats (LTRs) revealed that, at the genome level, the two subspecies of O. sativa (indica and japonica) shared a common ancestral species 200,000 to 400,000 years ago [21, 34]. These results clearly indicate that the indica and japonica subspecies have distinct domestication histories, since their divergence began long before the predicted start of domestication. In addition, several recent studies revealed that strong population structures exist among O. sativa cultivars, suggesting distinct histories [5, 6, 12, 15]. It is difficult to determine the number of subtypes or subspecies based only on calculated population structures, but indica has been divided into two groups (indica and aus), and japonica has been divided into three groups (tropical japonica, temperate japonica, and aromatic). On this basis alone, the number of independent rice domestications cannot be determined simply. Clustering of O. sativa and O. rufipogon using the distribution of short interspersed nuclear element (pSINE) retroelements revealed that the genomes of tropical japonica and temperate japonica are closely related and belong to a monophyletic clade with O. rufipogon as an outgroup [6]. This indicates that domestication of tropical japonica and temperate japonica can be considered as a series of events within a single domestication process. On the other hand, genomic diversification of the retroelements in indica subtypes in this clustering did not support the existence of a monophyletic clade and did not reveal clear, simple boundaries with the genome of O. rufipogon [6]. Therefore, we still need more information to discuss the indica domestication process, including the number of processes and events.
Recently, a detailed microsatellite analysis has suggested non-independent domestication of indica and japonica, although this research found a more severe bottleneck in the establishment of japonica than in that of indica [11]. Further characterization of O. rufipogon based on genome analysis will be needed to fully understand rice domestication, especially in the indica group.

Population structures and association studies in rice

Since cultivated rice, O. sativa, has a complex population structure, care is needed when studying associations between natural DNA variation and trait variation among landraces and cultivars [5, 6, 12, 15]. The population structure of rice means that various DNA polymorphisms at scattered chromosomal positions will be observed together in plants that belong to the same subpopulation, such as indica, aus, aromatic, tropical japonica, or temperate japonica. Therefore, simple association of a certain DNA polymorphism with an agronomic trait (phenotype) does not necessarily confirm that the polymorphism is responsible for the agronomic trait. In contrast, when tight genetic linkage between a phenotype and a certain naturally occurring DNA polymorphism can be confirmed, one can conclude that the polymorphism is responsible for the phenotypic diversity and refer to it as an FNP [16, 27]. If the genotype of the FNP shows an association with phenotypic variation in the corresponding trait among landraces or modern cultivars within a certain subpopulation, this strongly suggests that the FNP was involved in the rice domestication process or the modern breeding process. For instance, when an FNP behaves like a mutation that arose within a subspecies, such that only those landraces of the subspecies that carry the FNP show the characteristic trait, one can conclude that the FNP was involved in the establishment of the subspecies. In contrast, if the distribution of the FNP coincides closely with a certain rice subspecies as defined by population structure, it is possible that the FNP was not directly involved in rice domestication. In this case, existing wild species that could share common ancestors with the subspecies must be examined for the FNP before its involvement in domestication can be discussed.
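The confounding effect of population structure described above can be illustrated with a minimal sketch. The data below are entirely hypothetical toy values (not measurements from any study cited here), and the two group names are placeholders: a marker differs in frequency between two subpopulations, the trait differs between the subpopulations for unrelated reasons, and the marker has no effect on the trait within either subpopulation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy data: marker genotype (0 or 1) and a trait score per landrace.
# In group A the marker allele "1" is common and trait scores are low;
# in group B the allele is rare and trait scores are high.
group_a = {"marker": [1, 1, 1, 1, 0, 0], "trait": [1, 3, 1, 3, 1, 3]}
group_b = {"marker": [0, 0, 0, 0, 1, 1], "trait": [7, 9, 7, 9, 7, 9]}

# Association within each subpopulation (marker is not causal here).
r_within_a = pearson(group_a["marker"], group_a["trait"])
r_within_b = pearson(group_b["marker"], group_b["trait"])

# Association when the two subpopulations are pooled without correction.
r_pooled = pearson(group_a["marker"] + group_b["marker"],
                   group_a["trait"] + group_b["trait"])

print(f"within A: {r_within_a:.3f}")
print(f"within B: {r_within_b:.3f}")
print(f"pooled:   {r_pooled:.3f}")
```

In this toy case the pooled correlation is clearly non-zero even though the marker is unrelated to the trait inside each group, which is why an association observed across subpopulations cannot by itself identify an FNP.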

Some researchers may believe that FNPs that have resulted from crop domestication should be fixed in all landraces [31]. This hypothesis makes sense if a single mutation or FNP became the prerequisite for cultivation of a landrace as a crop, but only one such gene has been reported in rice: sh4 [19, 20]. All tested landraces and cultivars contain a defective allele of sh4 that reduces seed shattering. If this defective allele were the only critical requirement for domesticating wild rice species, all wild rice plants that contain the defective allele could be grown as crops. However, some lines of the wild rice species O. rufipogon also contain the defective sh4 allele, and because the stature and panicles of these plants are similar to those of typical O. rufipogon, they cannot be successfully cultivated as a crop. Therefore, even though all cultivated rice contains this defective allele, mutation of sh4 is not the only change that was required for rice domestication to begin. Recent analysis has revealed that an allele of sd1 present in the main cultivar used in the green revolution (i.e., IR8) [1] is also found in some O. rufipogon accessions. Furthermore, the effect of the defective sd1 allele appears to be somehow masked in some plants in the F2 generation of a cross between an O. rufipogon line that carries the same sd1 allele as IR8 and an O. sativa cultivar, A58, suggesting the importance of the effect of an allele in combination with a certain genetic background [23].

When did rice domestication occur?

Archeological studies have indicated that rice domestication started more than 10,000 years ago [32]. Using the DNA diversity in existing O. sativa and O. rufipogon lines, the concept of molecular clocks can be used to test this hypothesis. Unfortunately, a period of 10,000 years may be too short for the timing of rice domestication to be estimated accurately in this way, given the spontaneous mutation rate of DNA under natural conditions and the self-fertilizing nature of rice. In contrast, the order of occurrence of major DNA changes such as single nucleotide polymorphisms (SNPs) can be followed during this short period of rice domestication. In addition, insertions of certain retroelements, SNPs, and simple sequence repeats (SSRs) can be used. Konishi et al. [16] examined such DNA changes in a seed shattering gene, qSH1. The results suggested that at a single-locus level, we could work backwards through time, observing how japonica landraces coalesce into their ancestral haplotypes. It is interesting that this predicted ancestor has the same haplotype as the existing O. rufipogon line used in this study, indicating that evolutionary DNA changes in the qSH1 region occurred within a time as short as 10,000 years. On this basis, we can deduce the chronology of DNA changes using the haplotypes in existing rice landraces. Once this becomes available at a genome-wide level in the near future, it will be a useful tool to elucidate the rice domestication process. To deduce some absolute times, archeological evidence can be integrated with the chronology of DNA changes.
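A rough calculation may help to show why 10,000 years is short for a molecular clock. The sketch below assumes a simple neutral model in which substitutions accumulate at a constant rate along one lineage and follow a Poisson distribution. The rate and the locus length are illustrative assumptions chosen only for the arithmetic; they are not values reported in this review, and the model ignores the effect of self-fertilization.

```python
from math import exp

def expected_substitutions(rate_per_site_per_year, sites, years):
    """Expected number of substitutions along one lineage under a constant rate."""
    return rate_per_site_per_year * sites * years

def prob_unchanged(expected):
    """Poisson probability that no substitution at all has occurred."""
    return exp(-expected)

# Assumed values for illustration only (not taken from the review):
ASSUMED_RATE = 6.5e-9   # substitutions per site per year
LOCUS_LENGTH = 10_000   # base pairs in a hypothetical locus

# Compare the domestication timescale with the indica-japonica divergence range.
for years in (10_000, 200_000, 400_000):
    k = expected_substitutions(ASSUMED_RATE, LOCUS_LENGTH, years)
    print(f"{years:>7} years: expected substitutions = {k:.2f}, "
          f"P(no change) = {prob_unchanged(k):.3f}")
```

Under these assumptions, fewer than one substitution is expected in a 10 kb region over 10,000 years, so the count of changes is too small to date events precisely, whereas the relative order of the few changes that did occur can still be followed through haplotypes.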

Where did rice domestication occur?

The current geographical distributions of rice landraces and their wild relatives have provided critical hints about the domestication process [32]. One difficulty of using the current geographical distribution to discuss domestication is that the distribution of O. rufipogon has changed in response to long-term climate changes. Archeological evidence and old literature found in China have suggested that the northern limit of O. rufipogon's geographical distribution was once around 30°N, versus a current limit of around 25°N [14, 32]. This, together with archeological evidence such as the remains of rice paddies, is sometimes considered to be strong evidence that japonica rice was domesticated around the Yangtze River (Changjiang) region of China more than 7,000 years ago. However, since the current range of O. rufipogon indicates that adaptation to local regions has occurred during the last 10,000 years as a result of climate changes, shifts in the areas of cultivation caused by adaptation to climate changes are likely to complicate efforts to elucidate the domestication process. The current population structures of rice landraces and wild species should reflect the effects of such migration, that is, the movement of natural populations under the influence of climate. However, once a type of rice is cultivated, rapid expansion of its area of cultivation can be hypothesized to follow migrations of human populations or transfers of seeds and plants between populations. Indeed, genome-wide polymorphisms of landraces are strongly associated with their local origins [15, 27]. Furthermore, associations of genome-wide polymorphisms with local growing areas would also be expected for existing wild rice species, although no evidence to support this hypothesis has been reported yet. Note that the current population structures of landraces and their wild relatives may be affected by distinct factors.
For example, the population structures of landraces can be affected by the migration of ancient humans and by their cultivation preferences, whereas those of their wild relatives would be more strongly affected by climate changes. These genome-wide polymorphisms, which reflect local origins or a history of migrations and transfers, provide very useful information for elucidating the domestication process. In particular, it is possible to speculate, based on such information, how landraces migrated and propagated into different areas. When a single agronomic trait conferred by a specific FNP was favored and selected in early rice intermediates, plants with this FNP would be propagated via migrations and transfers from the plant's place of origin, regardless of whether the FNP originated from selection of a simple mutation or from introgression of a locus containing a certain standing variation.

Cloning of domestication-related genes in rice

Based on cloning of QTLs that are responsible for domestication-related traits and analysis of the associated haplotypes, several genes have been considered to be rice domestication-related genes, such as Waxy, sh4, qSH1, and Rc [16, 19, 25, 30]. Furthermore, other domestication-related genes, such as Gn1a (a grain number gene) and GS3 (a grain size gene), have been cloned, although their haplotypes have not yet been analyzed [2, 9]. In addition, some flowering-time genes may be involved in this process, because the ability to grow rice in more northern areas such as China, Korea, and Japan is an agronomically useful trait [14]. Hd1, the rice ortholog of the Arabidopsis thaliana CONSTANS gene, is likely to be involved, although the natural variations among landraces and wild species in this gene have not yet been analyzed [14]. Recently, Xue et al. [36] reported cloning of a flowering-time gene, Ghd7, and the accumulation of defective alleles of Ghd7 in northern areas. Interestingly, except for sh4, these domestication-related genes appear to have been fixed only in some local groups of cultivated rice [19]. This might be partly due to independent domestications of subspecies indica and japonica. In addition, since an allele of qSH1 is found only in the temperate japonica subspecies, this natural mutation is likely to have been selected during the domestication of japonica rice [16]. Even in the japonica subspecies, which experienced a severe bottleneck during the domestication process [11], none of the domestication-related genes tested so far has become fixed, except for sh4. Although some believe that domestication-related genes should become fixed in cultivated species, the definitions of terms such as “domestication-related gene” (or “domestication gene”) should be reconsidered. My experience suggests [16, 27] that many domestication traits have been conferred by exploiting natural variation such as QTLs.
The nature of domestication genes and domestication-related genes may hinder a clear definition of these terms.

Quantitative nature of domestication traits

As an example of the quantitative nature of domestication traits, I have chosen the example of seed shattering. Several major QTLs have been reported for this trait in rice [16]. Two of them, sh4 and qSH1, have been cloned [16, 19]. Since the selected sh4 allele was found in all examined landraces and modern cultivars of both indica and japonica rice, the sh4 mutation is likely to have been utilized during the very early stages of the rice domestication process. Note that the defective sh4 allele was also found in some O. rufipogon, and would thus represent standing variation. In contrast, qSH1 appears to be a mutation that originated at least 3,000 years ago and that is found only in a subset of temperate japonica [16].

The degree of seed shattering has been examined among more than 100 rice landraces (Fig. 1; [16]). Clearly, there is considerable variation in the degree of seed shattering among these landraces. In particular, the qSH1 FNP explains approximately 70% of the variation between the japonica ‘Nipponbare’ and indica ‘Kasalath’ cultivars. In contrast, sh4 explains less than 5% of the variation when the degree of seed shattering is compared between easy-shattering rice landraces and some accessions of O. rufipogon (Fig. 1). This makes sense because sh4 accompanied the change from seeds that detach spontaneously upon seed maturation (in the wild) to seeds that detach only with physical stress (such as wind or harvesting upon seed maturation). This subtle change of phenotype with the sh4 defect supports the idea that this represents standing variation in wild species. As shown in Fig. 1, natural variations within the tropical japonica subgroup cannot be explained by either sh4 or qSH1. Therefore, other loci must have changed during the domestication of tropical japonica. This clearly indicates that the domestication of a single agronomic trait has proceeded gradually by the accumulation of several QTLs. A stronger QTL was not necessarily selected earlier than a weaker one, as demonstrated by the selection of sh4 and qSH1. In addition, it is noteworthy that the masked sd1 effect (described above) could also be explained by a combination of QTLs. Therefore, to elucidate the domestication of rice, the combination of natural variations, such as multiple QTLs, should receive more attention.

Fig. 1

Natural variation among O. rufipogon and O. sativa in the degree of seed shattering. The degree of seed shattering was measured upon maturation of the rice seeds. Colors of bars indicate subgroups of lines: black for O. rufipogon, orange for indica, green for tropical japonica, dark blue for temperate japonica with the qSH1-Kas allele, and light blue for temperate japonica with the qSH1-Nip allele. A clear association of the degree of seed shattering with the qSH1 genotype can be seen, but qSH1 alone cannot explain all the variation in the degree of seed shattering in tropical japonica. Note that all O. sativa contain the defective sh4 allele. Data were obtained from Konishi et al. [16].

New model for rice domestication

Recently, my colleagues and I cloned a novel QTL for grain width, qSW5 (QTL for seed width on chromosome 5), and demonstrated that a deletion in this gene in a japonica cultivar may represent an FNP for natural variation in grain width [27]. We demonstrated that this deletion, which causes the loss of most of the predicted qSW5 gene product, increased the grain yield of paddy rice; as a result, this gene may have been involved in the domestication of japonica rice. The ancient people who domesticated these plants might have favored them for the increased yield that resulted from this deletion. To elucidate the origin of the deletion, we matched it against genome-wide restriction fragment length polymorphism (RFLP) data at 179 loci and against two other FNPs (qSH1 and Wx) that were also involved in the domestication of japonica rice [13, 16, 24]. We revealed that these three FNPs contributed to the establishment of japonica rice. In addition to a series of selections of newly induced mutations in key domestication-related genes such as Wx, qSH1, and qSW5, natural crossings and the selection of some genetic combinations between mutations (which can be considered to be standing variation resulting from crossing) might have been repeated several times during the domestication of japonica rice [27]. Again, these data strongly suggest that the quantitative nature of domestication traits should be considered more carefully in efforts to elucidate the rice domestication process. Furthermore, the RFLP patterns for these three FNPs strongly suggest unexpected local origins of japonica rice. Japonica landraces with all the original alleles (qSW5, qSH1, and Waxy (Wx)) grow mainly in Indonesia and the Philippines, and landraces selected for these alleles seem to have propagated to broader areas from these original areas. Therefore, we should also consider local adaptation during rice domestication associated with the propagation of selected alleles in certain cultivation areas.

In addition, the RFLP pattern revealed that such ancestral landraces form a local subgroup and have similar genome patterns, with a mixture of ‘Kasalath’- and ‘Nipponbare’-type RFLP polymorphisms (Fig. 2). This pattern resembles the intermediate genotypes that we observed when we created recombinant inbred lines between japonica and indica cultivars, and may imply that there was a critical crossing between relatively distant wild species when japonica rice domestication began. Considering all these data, I propose a new model of japonica domestication to stimulate discussion and guide future research (Fig. 3). In this model, four events are proposed to explain the domestication of japonica rice: (1) the first critical crossing between relatively distant wild species, followed by migration and local adaptation; (2) gradual fixation of segregating loci (standing variation) according to adaptation to local cultivation styles or climate conditions; (3) selection of new naturally occurring mutations; and (4) natural crossings and selection. Domestication-related genes such as qSW5, Rc, Wx, and qSH1 contributed much to this process. However, more domestication-related genes are likely to have been involved in local adaptation during domestication. Standing variation such as that in sh4 should also have contributed to this process, although such contributions have not yet been integrated in this model. We have already shown that selection for a Gn1a allele may have been involved in the domestication of japonica rice [27]. In particular, flowering-time genes are likely to have been involved in the domestication process to permit rice to grow in northern areas, although recent breeding steps may conceal older changes that occurred during rice domestication.

Fig. 2

Genome-wide restriction fragment length polymorphism (RFLP) patterns at 179 loci. RFLP patterns are aligned with the physical order of the chromosomes. Red indicates ‘Kasalath’-type polymorphisms; white indicates ‘Nipponbare’-type polymorphisms. Only landraces with the indicated genotypes are shown. Data were obtained from and arranged on the basis of Supplementary Figure 4 in Shomura et al. [27].

Fig. 3

A new model for the domestication of japonica rice. In this model, migration into new cultivation areas and local adaptation are considered to be part of the domestication process.

Based on this model, formerly mysterious distributions of putative standing variation in rice landraces, such as those found in sh4, and some SSR patterns, can be explained well [11, 19, 29]. Further discussion will be required, based on new data on FNPs involved in novel domestication-related genes and genome polymorphisms, to clarify this model.