Background & Summary

The genetic correlation between filament and corolla tube lengths in wild radish is very high in magnitude (0.85, ref. 1), estimated with precision, and known to be caused by pleiotropy or extremely tight linkage2. The relative lengths of these two traits determine the position of the pollen-bearing anthers relative to the opening of the corolla tube; this composite trait is called anther exsertion, which can be defined as ln-long stamen filament length minus ln-corolla tube length. The high filament-corolla tube correlation is likely due to stabilizing selection on anther exsertion by bees in the family Halictidae3,4; functionally, intermediate anther exsertion maximizes pollen removal by these bees5. Stabilizing selection on the difference between two traits is equivalent to correlational selection to increase the correlation between the traits6,7.
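The equivalence in the last sentence can be seen from a short expansion. As a sketch only, assume a quadratic fitness surface around an optimum exsertion; the symbols $x$ (ln long filament length), $y$ (ln corolla tube length), $z$, $\theta$, $\gamma$ and $w_0$ are notation introduced here for illustration and are not taken from the original studies:

```latex
\begin{align}
z &= x - y \\
w(z) &= w_0 - \tfrac{\gamma}{2}\,(z - \theta)^2, \qquad \gamma > 0 \\
     &= w_0 - \tfrac{\gamma}{2}\left(x^2 + y^2\right) + \gamma\,xy
        + \gamma\theta\,(x - y) - \tfrac{\gamma}{2}\,\theta^2
\end{align}
```

The positive coefficient $\gamma$ on the cross-product term $xy$ is a positive correlational selection term, so under this assumed surface, stabilizing selection on the difference favors a stronger positive association between the two traits.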

This paper describes six datasets derived from a series of studies designed to understand selection and genetics of anther exsertion; Figure 1 gives a flowchart of all of the experiments. All plants were derived from a single natural population (see Data Records). To create increased variance in anther exsertion to better test for stabilizing selection, as well as determine the rate of response to selection perpendicular to the major axis of variation, artificial selection for increased and decreased exsertion was performed for a total of 11 generations, with two replicates of each selection treatment8. There were also two randomly-mated control lines for a total of six selection lines; each line contained 12 outbred maternal ‘lines’. We refer to these as ‘matrilines’ because they are denoted and followed based on the maternal parent, but each generation these lines were outcrossed using pollen from a unique randomly-chosen plant in a different matriline, so that matrilines are not distinct from each other in the nuclear genome. The floral, fitness, and pedigree data for the four selected lines (not the controls) are contained in the 'ArtificialSelectionExsertion.csv' file.

Figure 1: Flowchart of the experiments that produced the datasets.

The top three boxes show the procedure used to produce the data in the ‘ArtificialSelectionExsertion.csv’ file, and the other five boxes each correspond to one of the five other datasets.

To test for correlated responses to this selection, after five (replicate 1) or six (replicate 2) generations, 571 plants evenly distributed across the two high, two low, and two control selection lines were grown and 12 floral traits were measured, as well as flowering time and aboveground biomass. These data are in 'CorrelatedResponses.csv'. To quantify floral trait variation over the lifetime of these annual plants, seven floral traits were measured five or six times, and pollen viability was scored twice, over a period of three months on seventy-two of these plants ('2001FieldFlowerMeas.csv').

An F2 mapping population was created by crossing high and low selection lines to determine the genetic basis of the rapid evolution of anther exsertion in the selection lines. The 11th generation of selection consisted of choosing the extreme anther exsertion plants from 10 matrilines in each of the four selection lines ('QTLParentalMeasurements.csv'). These 40 plants were crossed in all four high by low exsertion line combinations. The resulting F1 plants ('QTLF1measurements.csv') were outcrossed to produce 4,863 F2 plants distributed among 20 full-sibling families, five per cross type (see Methods; Fig. 2 and Table 1). Six floral traits were measured from floral photographs on each of these F2 plants ('QTL F2 Measurements.csv').

Figure 2: Crossing design for production of 20 outbred F2 Families.

Shown is the crossing design used for each of the five octets of plants, which produces four full-sib F2 families, one for each of the possible high X low exsertion line crosses. Parental plants from selection replicate 1 (Reed) are shown in cool colors (blue and purple) and those from replicate 2 (KBS) in warm colors (orange and red). The circled individuals depict the plants chosen randomly for this one example octet; arrows to the Parental generation show, as an example, how the two plants from one line were used in crosses. The crosses in this diagram were repeated five times in total across the five octets, using different randomly chosen pairs of parental plants from each line for each octet; two plants in each line were not used as parents.

Table 1 Crosses performed to produce the F2 mapping population.

Many analyses of these data are possible in addition to those in the one paper that has been published to date8, which used only one of the six datasets included here ('ArtificialSelectionExsertion.csv'). Because anther exsertion and the component traits of filament and corolla tube lengths were measured in 10 generations under selection, multiple times across the lifespan of the same plants, and in a very large outbred F2 composed of full-sibling families from reciprocal crosses, a variety of questions concerning genetic and microenvironmental causes of trait variation can be addressed. Additional unanalyzed traits are also included in some of the datasets, and photos of the flowers from top and side views are available for additional trait measurements. Novel integrated analyses across these datasets are also a possibility. Stored seeds from virtually all matrilines in virtually all generations are also available for entirely new phenotypic or genomic research.

Methods

Artificial selection

We conducted 10 generations (11th in the F2 study below) of selection for increased and decreased long-stamen anther exsertion (ln long filament length minus ln corolla tube length), with two replicate lines each for increased exsertion, decreased exsertion, and randomly-mated controls. Each of the six replicate selection lines consisted of 12 unique matrilines; the most extreme of up to 10 offspring in each matriline was mated in each generation. The matriline of each plant is noted; full pedigree information (paternity) is available starting at generation 5. The first three generations of selection were done at the University of Illinois; generations 4 and 5 of replicate 1 of the artificial selection and half of the correlated responses plants were grown at Reed College; and the rest of the greenhouse and lab work was done at Kellogg Biological Station. Thus replicate 1 is also referred to as Reed or R and replicate 2 as KBS or K. For details see ref. 8.
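The within-matriline selection rule described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the record layout (dictionaries keyed by 'Matriline', 'LongFil' and 'Tube'), the function names, and the example measurements are all assumptions made for this sketch.

```python
import math

def anther_exsertion(long_fil_mm, tube_mm):
    """Anther exsertion as ln long filament length minus ln corolla tube length."""
    return math.log(long_fil_mm) - math.log(tube_mm)

def select_extreme(plants, direction):
    """Return the most extreme plant per matriline.

    plants: list of dicts with keys 'Matriline', 'LongFil', 'Tube' (mm);
            this record layout is an assumption for illustration.
    direction: 'High' picks the largest exsertion, 'Low' the smallest.
    """
    if direction not in ("High", "Low"):
        raise ValueError("direction must be 'High' or 'Low'")
    sign = 1.0 if direction == "High" else -1.0
    chosen = {}
    for plant in plants:
        score = sign * anther_exsertion(plant["LongFil"], plant["Tube"])
        best = chosen.get(plant["Matriline"])
        if best is None or score > best[0]:
            chosen[plant["Matriline"]] = (score, plant)
    return {matriline: plant for matriline, (score, plant) in chosen.items()}

# Made-up measurements, for illustration only.
example = [
    {"Matriline": "A12", "LongFil": 11.0, "Tube": 10.0},
    {"Matriline": "A12", "LongFil": 12.5, "Tube": 10.0},
    {"Matriline": "B7", "LongFil": 10.0, "Tube": 10.5},
]
high = select_extreme(example, "High")
low = select_extreme(example, "Low")
```

Here `high["A12"]` is the plant with the 12.5 mm filament and `low["A12"]` the one with the 11.0 mm filament; a matriline with a single plant contributes that plant in either direction.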

Outbred F2 QTL design

For future QTL analysis, six plants from each of the 12 matrilines in the two high and two low exsertion selection lines were grown for a total of 288 plants. One flower from each was photographed and the lengths of the corolla tube, short and long filaments, and short and long filament anthers were measured; this was done for all F1 and F2 plants as well. The plant with the highest or lowest exsertion (matching the selection direction) within each matriline was chosen; this represents the 11th generation of artificial selection on exsertion. The most extreme plants from the 10 most extreme matrilines in each selection line were chosen for the outbred crossing design; the other two matrilines in each selection line were discarded. These 40 parental plants were then randomly paired to make five pairs within each selection line, and then each pair was randomly grouped with a pair from each of the other three lines to form five 'octets' of plants. Each octet was used to produce four outbred full-sibling F2 families, one from each of the four cross types; the design for one octet is shown in Figure 2.
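The random pairing and grouping into octets described above might be sketched as below. The function name, the plant identifiers and the use of a seeded generator are assumptions for illustration; the line codes RH, RL, KH and KL follow those given for the F2 dataset under Data Records.

```python
import random

LINES = ["RH", "RL", "KH", "KL"]  # Rep 1 High/Low, Rep 2 High/Low

def make_octets(parents_by_line, rng):
    """Randomly pair plants within each line, then group pairs into octets.

    parents_by_line: dict mapping a line code to its list of parental
                     plant IDs (10 per line in the design described).
    rng: a random.Random instance, so the draw can be reproduced.
    Returns a list of octets; each octet maps line code -> (plant, plant).
    """
    pairs_by_line = {}
    for line, plants in parents_by_line.items():
        if len(plants) % 2 != 0:
            raise ValueError("each line needs an even number of parents")
        shuffled = list(plants)
        rng.shuffle(shuffled)
        # Consecutive plants in the shuffled order form the random pairs.
        pairs_by_line[line] = [
            (shuffled[i], shuffled[i + 1]) for i in range(0, len(shuffled), 2)
        ]
    n_octets = min(len(pairs) for pairs in pairs_by_line.values())
    # Pair order is already random within each line, so matching pairs by
    # index across lines is a random grouping.
    return [
        {line: pairs_by_line[line][k] for line in pairs_by_line}
        for k in range(n_octets)
    ]

# Hypothetical plant IDs, for illustration only.
parents = {line: [f"{line}-{i}" for i in range(1, 11)] for line in LINES}
octets = make_octets(parents, random.Random(2014))
```

With 10 parents in each of four lines this yields five octets of eight plants, and every one of the 40 parents appears in exactly one octet.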

To produce the F1 generation, each plant was mated to one plant from each of the other lines within the same octet, producing four F1 families, one for each of the four possible crosses between high and low exsertion selection lines. Because there were two pairs of parental plants from each selection line, this design produced pairs of unrelated F1 plants for each of these four cross types; these pairs were then crossed reciprocally to produce one of the 20 outbred full-sibling F2 families (Figure 2). Due to the reciprocal crosses, each of the 20 F2 families is subdivided into A and B groups depending on maternal plant.

A total of 4,863 F2 plants were grown in 10 blocks of up to 500 plants each, with each full-sibling family represented by up to 25 plants per block, and each octet represented by up to 100 plants per block. Blocks alternated between consisting entirely of seeds from the A moms and entirely of seeds from the reciprocal B moms; thus all the odd-numbered blocks were A seeds, and all the even-numbered blocks B seeds.

Data Records

The six datasets are stored at Dryad (Data Citation 1). Some contents are common across datasets:

Matriline: All of the plants are descended from the Binghamton NY population (BINY; 42.184089N, 75.835319W) and most have a code with a capital letter A–E and a number up to 475. This refers to the original mothers in the seed collection, where 5 transects (A–E), one meter apart, were run across an alfalfa field and seeds were collected from one maternal plant every meter. The transects varied in length: the last plant collected in each was A368, B385, C355, D475, and E100. The numbers refer to the same grid position in each transect, i.e., B1 is one meter from A1, B2, and C1. Seeds were collected from a total of 1,575 maternal plants, although some have no seeds left. A total of eight matrilines in the high and low replicate 1 populations have different codes without the initial letter; these are descendants from the BINY population but their pedigree cannot be traced back to the original field maternal plant. In a number of cases over the generations a matriline produced no viable seeds, so two families in the next generation came from one matriline; these are denoted with decimals added to the number and/or lowercase letters at the end of the code, but in all these cases the maternal lineage can be traced back to the field maternal plant.

Floral traits: the core set are Petal Length (PetLen), Petal Width (PetWid), Corolla Tube Length (Tube), Short Filament Length (ShrtFil), Long Filament Length (LongFil), and Pistil Length. In the early generations of artificial selection these traits were measured using calipers on dissected flowers as described in Conner and Via1. In later studies, these were measured from floral photographs, and also include the length of the anther on one short and one long stamen (ShrtAnther and LongAnther). Often the ovules were counted (Ovule#). We often calculated Anther Exsertion as Long Filament minus Corolla Tube. All values are mm.

Treatment: High or H—selection for increased exsertion; Low or L—selection for decreased exsertion; Cntrl—randomly mated controls

Replicate line: 1 (= Reed=R) or 2 (= KBS=K) respectively for the two replicates nested within each Treatment.

Photo: Some files have the code from the camera denoting the image the measurement was made from, available from the first author.
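The matriline codes described under Matriline above could be split into transect, grid position and split-matriline suffix roughly as follows. The exact layout of the decimal and lowercase-letter suffixes is not specified in the text, so the pattern below is an assumption and should be checked against the actual files; the function name and example codes are also hypothetical.

```python
import re

# Assumed pattern: transect letter A-E, grid number, then an optional
# decimal and/or lowercase-letter suffix marking a split matriline.
_CODE = re.compile(r"^([A-E])(\d+)((?:\.\d+)?[a-z]*)$")

def parse_matriline(code):
    """Return (transect, grid_position, suffix), or None if the code does
    not follow the letter-plus-number form (e.g. the eight replicate 1
    matrilines whose codes lack the initial letter)."""
    match = _CODE.match(code.strip())
    if match is None:
        return None
    transect, number, suffix = match.groups()
    return transect, int(number), suffix

# Hypothetical codes, for illustration only.
plain = parse_matriline("D475")
split = parse_matriline("B12.1a")
untraceable = parse_matriline("1234")
```

Returning `None` for codes without a transect letter keeps untraceable matrilines distinguishable from field-traceable ones without guessing at their structure.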

ArtificialSelectionExsertion.csv

Offspr: the replicate offspring grown from each matriline; in later generations usually 1–10.

ID: a unique integer identifier added in later generations to track the pedigree.

MomID, DadID: the ID of the parents of that plant. In the first generation with IDs, these are lower case letters, because the parents of these individuals were not recorded.

RelFit: Relative fitness=RawFit/Mean Fitness for that line and generation; this is used to estimate selection differentials and gradients.

RawFit: Number of offspring grown and measured in the next generation from that plant. Within each matriline, typically only one will have nonzero fitness, that is, the selected plant, except when different plants within a matriline were used as males versus females due to incompatibility or where a matriline was split due to failure of a different matriline (see above).

Gen: generation of selection.
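The RelFit definition above, and its use in estimating a selection differential, can be illustrated with a small sketch. The grouping key 'Line' (standing for a treatment and replicate combination), the function names and the example values are assumptions for illustration; the selection differential is computed here as the covariance between relative fitness and the trait, which is a standard estimator but is not stated in the dataset description.

```python
from collections import defaultdict

def relative_fitness(records):
    """RelFit = RawFit / mean RawFit within each (Line, Gen) group.

    records: list of dicts with keys 'Line', 'Gen', 'RawFit'
             (a hypothetical layout). Returns RelFit in input order.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = (rec["Line"], rec["Gen"])
        sums[key] += rec["RawFit"]
        counts[key] += 1
    result = []
    for rec in records:
        key = (rec["Line"], rec["Gen"])
        mean_fit = sums[key] / counts[key]
        result.append(rec["RawFit"] / mean_fit if mean_fit > 0 else float("nan"))
    return result

def selection_differential(trait, rel_fit):
    """Covariance between relative fitness and the trait (population form)."""
    n = len(trait)
    mean_trait = sum(trait) / n
    mean_fit = sum(rel_fit) / n
    return sum((t - mean_trait) * (w - mean_fit)
               for t, w in zip(trait, rel_fit)) / n

# Made-up values: four plants in one line and generation, one selected.
records = [
    {"Line": "H1", "Gen": 6, "RawFit": 0},
    {"Line": "H1", "Gen": 6, "RawFit": 0},
    {"Line": "H1", "Gen": 6, "RawFit": 4},
    {"Line": "H1", "Gen": 6, "RawFit": 0},
]
exsertion = [1.0, 2.0, 3.0, 4.0]
rel_fit = relative_fitness(records)
s_diff = selection_differential(exsertion, rel_fit)
```

In this example the differential equals the trait value of the single selected plant (3.0) minus the group mean (2.5), which is the usual interpretation when only one plant per group has nonzero fitness.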

CorrelatedResponses.csv

Matriline, AvPetLen, AvgTube, AvShrtFil, AvgLongFil, Pistil, Ovules: See above, except the four traits with 'Av' were the average of two measurements of different structures within the same flower, i.e., two different petals, filaments, etc. The third flower was measured in most cases, but sometimes a later flower close to the third was used.

Treatment: Direction of artificial selection.

Replicate Line: the two replicates within each treatment.

Block: Plants were grown at KBS or Reed; some traits differed between sites.

CRoffspring#: up to four plants were grown at each location from each matriline

Days to flower: number of days from planting to first open flower.

Nectar vol: volume of nectar in microliters from the 5th and 6th flowers on the central inflorescence.

Nectar conc%: % sugar concentration from refractometry using the same nectar sample.

FlowerNo: The total number of flowers was counted on some plants at Reed at harvest, just over two months after planting.

Biomass: aboveground dry biomass in grams was measured at Reed at harvest.

Total pollen: The number of pollen grains produced was counted using a Coulter Counter on all six anthers from one flower at KBS, and on 3 long and 1 short stamen anthers at Reed.

LongPollen: the count for the four long stamen anthers at KBS.

ShrtPollen: the count for the short stamen anthers at KBS.

2001FieldFlowerMeas.csv

Matriline, AvPetLen, AvgTube, AvShrtFil, AvgLongFil, AvgPistil, AvgOvules: See above, except the traits with 'Av' were the average of the five or six flowers measured over the life of the plant. The individual flower measurements and the date in 2001 that they were taken are in the columns following the averages and denoted by a number 1 through 6 following the variable name.

Treatment: Direction of artificial selection.

Replicate Line: the two replicates within each treatment.

CRoffspring#: up to four plants were grown at each location from each matriline.

DNA ID: The unique code used for the tissue and DNA sample taken from each plant, available from the first author.

Array: These plants were divided into three arrays of 24 plants each; arrays were taken into the field five or six times.

FieldRow and Field Column: The grid positions used for the plants in the field.

AvgFlwr#: The number of flowers open when the plants were taken into the field, averaged over the five or six field days.

QTL ParentalMeasurements.csv and QTL F1 measurements.csv

All columns as described above except Offspr denotes the six offspring grown from each maternal plant. There are two additional Cross Types in the F1 dataset, the High X High and Low X Low; seeds from these are available, but have not been used to make F2 plants to date.

QTL F2 measurements.csv

Cross: The four possible crosses between the two replicate high and low exsertion lines—RH=Rep 1 (Reed) High, RL=Rep 1 Low, KH=Rep 2 (KBS) High, KL=Rep 2 Low.

Family: There are five outbred full sib families within each cross; these correspond to the parental 'octets'.

Mom: Crosses of the F1 to make each full-sib F2 family were done reciprocally, so there is Mom A or B depending on the direction of the cross.

F2: Replicate offspring from each cross. Note that this is redundant with Mom A or B, because all F2s within each family were given a unique number—A is mom for 1–25, 51–75 etc, and B is mom to 26–50, 76–100 etc.

Block: 1–10 for the 10 temporal blocks.

Flwr date: the date that the first flower opened on that plant.
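Because the F2 column is redundant with Mom, the numbering rule given under F2 above (A for 1–25, 51–75 and so on; B for 26–50, 76–100 and so on) can be used as a consistency check on the file. A minimal sketch, with a hypothetical function name:

```python
def mom_from_f2(f2_number):
    """Infer the reciprocal-cross mother ('A' or 'B') from the F2 number.

    Numbers run in groups of 25 within a family: the first, third, fifth
    groups (1-25, 51-75, ...) are from Mom A and the second, fourth, sixth
    groups (26-50, 76-100, ...) are from Mom B.
    """
    if f2_number < 1:
        raise ValueError("F2 numbers start at 1")
    group_index = (f2_number - 1) // 25
    return "A" if group_index % 2 == 0 else "B"

# Spot checks at the group boundaries stated in the column description.
boundary_moms = {n: mom_from_f2(n) for n in (1, 25, 26, 50, 51, 75, 76, 100)}
```

Comparing this inferred value against the recorded Mom column for each row would flag any numbering or data-entry inconsistencies.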

Technical Validation

The distributions of all floral measurements show a good fit to a normal distribution; all outliers (identified graphically as clearly outside the normal distribution) were either validated or corrected using original data or photos (Figure 3). For the artificial selection lines, the very tight fit of the data (R²=0.99 for both replicates; Figure 4 in ref. 8) to the fitted regression of response to selection on the selection differential strongly indicates that the data are precise and reliable.

Figure 3: Distributions of anther exsertion from the QTL experiment.

Shown are the parents from the low (a) and high (b) exsertion selection treatments and the F2 generation (c). In the original data in each case, one plant had values >4 s.d. from the mean; these were found to be simple errors upon remeasuring the original photo and were corrected. Upon remeasurement of the photograph, the one individual with negative exsertion in the high line parents was found to actually have a slightly positive value (0.15).

Additional information

How to cite this article: Conner, J. K. et al. Artificial selection on anther exsertion in wild radish, Raphanus raphanistrum. Sci. Data 1:140027 doi: 10.1038/sdata.2014.27 (2014).