Introduction

The seed oil of safflower (Carthamus tinctorius L.) has a high content of polyunsaturated linoleic acid and tocopherol, and it is produced for nutritional as well as medicinal uses [1, 2]. Today, safflower is mainly grown in arid and semi-arid climates and is only a minor oilseed crop in Europe. Safflower cultivars adapted to temperate climates could be grown in Europe as an alternative to oilseed rape in organic farming. Standard quality analyses such as Soxhlet extraction for oil content and gas chromatography for fatty acid composition are laborious and time-consuming [3], so a faster method would be desirable. A non-destructive method for screening seed samples would also allow high-throughput analyses in plant breeding and genotype selection, where the seeds of a genotype are limited. Near-infrared reflectance spectroscopy (NIRS) has been used to screen intact seed samples for oil content and fatty acid composition in different species, including rapeseed and sunflower [4, 5]. Studies on a worldwide safflower germplasm collection [6] and on Indian safflower genebank accessions [7] have shown that the oil content of seed meal [6] and the oil content and fatty acid composition of intact seeds [7] can be estimated by NIRS. One of the main breeding aims in safflower is an increase in seed oil content, and one strategy for this is to reduce the fraction of seed hulls. Genotypes with reduced or partial hull and increased oil content have been described [8]. The production of safflower seeds in temperate climates can be limited by disease incidence and an unfavorable climate, as shown by Kahn et al. [9], who reported that only 193 of 747 accessions planted in Germany produced seeds. Determining the seed hull fraction and the percentage of empty seeds is time-consuming and complicated. As for oil, linoleic acid and oleic acid content, a non-destructive method for these traits would benefit plant breeding. In this study, we developed NIRS calibrations for estimating oil content, linoleic and oleic acid content, and the fractions of seed hulls and empty seeds, using intact seeds and seed meal.

Experimental Procedures

NIRS Recording of the Spectra in Calibration Development

About three grams of cleaned seeds were filled into cuvettes with a diameter of 4.7 cm and scanned with a NIRSystems 6500 near-infrared reflectance spectrometer (NIRSystems Inc., Silver Spring, MD). Reflectance spectra were collected between 400 and 2,500 nm in 2-nm intervals. The intact seed samples were scanned three times in a row; the cuvette was emptied after each scan and refilled with the same seeds. For the analyses of seed meal, the previously scanned three-gram seed sample was ground in a Retsch mill (type ZM 100) to a particle size of 0.5 mm, and the ground sample was scanned once in a cuvette with a diameter of 4.7 cm.

Origin of Seed Samples for the Calibration Development and Cross Validation

The seed samples mainly originated from crosses of lines selected from the safflower varieties Saffire, Sabina and AC Sunset, which are of European and Canadian origin, with adapted lines with high oil content. Furthermore, material from the safflower breeding program of the Georg August Universität Göttingen, consisting of adapted genotypes with a European genetic background, was included. For the parameter oil content, a first calibration was developed in 2004 (oil-1) and improved in 2005 (oil-2) and 2006 (oil-3) by increasing the number of samples in the seed sample set. For oil-1, 1,060 seed samples, harvested from single plants and as bulk samples of three plots at the German locations Göttingen, Hohenheim and Wilmersdorf, were scanned, and 108 samples were selected for the calibration. Of these, 102 samples originated from Göttingen and three each from Hohenheim and Wilmersdorf. In 2005 and 2006, 300 and 950 samples, respectively, harvested at the German locations Göttingen, Hohenheim, Darzau and Wilmersdorf were scanned, and diverse genotypes were selected and included in the calibration sample sets of oil-2 and oil-3. In the calibration development of oil-2, samples harvested earlier, in 2002 and 2003 in Göttingen and Hohenheim, were added. For each of these sets, two calibrations were developed, one using intact seeds (named s-oil) and one using seed meal (m-oil).

For the development of a calibration for linoleic and oleic acid content, 534 samples with the highest variation in the NIRS spectra were selected. The seed samples originated from field trials at the German locations Göttingen, Hohenheim and Wilmersdorf in the years 2003–2006.

In the development of a calibration for the parameters fraction of hulls and empty seeds, the selection of samples was based on the assumption of a negative correlation between oil content and seed hull quantity; consequently, a sample set with a broad range in oil content was chosen. The sample set was composed of 158 seed samples with a high variation in oil content, which originated from trials at the Göttingen location in the years 2004–2006.

Reference Analyses

The oil content of small samples was determined by the Soxhlet extraction technique. Seeds were ground in a mill (Retsch, type ZM 100) to a particle size of 0.5 mm. Then 500 mg of the seed meal were transferred into a weighed cellulose extraction thimble (Whatman, 19 × 90 mm) which was sealed with cotton wool. The samples were dried in the thimbles (15 h at 60 °C) and the thimbles weighed again. The oil was extracted with petroleum ether in a 500-ml Soxhlet instrument for 10 h at 70 °C. The oil content of the samples was calculated after drying and weighing the extracted samples with the thimbles. This adaptation of the Soxhlet extraction method allowed us to use smaller sample sizes as well as reduced quantities of petroleum ether.
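The weighing scheme above implies a simple gravimetric calculation, which is not written out in the text. The following is a minimal sketch, assuming that oil content is expressed as the mass lost during extraction relative to the dried seed meal, and that the empty-thimble weight includes the cotton wool seal. The function name, argument names and example weights are hypothetical.

```python
def oil_content_percent(thimble_g, dried_before_g, dried_after_g):
    """Oil content (% of dried seed meal) from three weighings.

    thimble_g:      empty thimble (assumed to include the cotton wool seal)
    dried_before_g: thimble + dried seed meal, before extraction
    dried_after_g:  thimble + dried residue, after extraction
    """
    sample_g = dried_before_g - thimble_g
    if sample_g <= 0:
        raise ValueError("dried sample mass must be positive")
    # Mass removed by the petroleum ether extraction, taken as oil
    oil_g = dried_before_g - dried_after_g
    return 100.0 * oil_g / sample_g


# Hypothetical weighings in grams (not data from the study)
example = oil_content_percent(2.100, 2.580, 2.460)
```

With these illustrative weights, 0.48 g of dried meal loses 0.12 g during extraction, that is, one quarter of its mass.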

The determination of linoleic and oleic acids was performed by gas chromatography [10] of 200 mg of seed meal mixed with 0.5 ml sodium methylate. The samples were incubated twice in a water bath at 20 °C for 10 min and mixed in between. After adding 300 μl NaHSO₄ (5%) and 300 μl iso-octane, the sample was centrifuged for 10 min at 2,000 rpm. Subsequently, 200 μl of the upper liquid phase was analyzed by gas chromatography (Perkin Elmer Corp., model 8600).

The percentage of the fraction of hulls was calculated as the ratio of the seed hull to the total seed. Twenty seeds per seed sample were dried (5 h at 60 °C), weighed and afterwards soaked in water for 15 h. The seed hulls were then separated from the rest of the seed and dried (5 h at 60 °C). The number of empty seeds was determined in the same seed sample and is given in % of total seeds. Empty seeds are defined as completely empty, not partially filled.
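The two percentages described above can be sketched as follows. This assumes that the hull fraction is a ratio of dry weights (the text reports drying and weighing but does not state the formula explicitly); the function names and example values are hypothetical.

```python
def hull_fraction_percent(dry_seed_g, dry_hull_g):
    """Hull fraction as % of total dry seed weight (assumed weight basis)."""
    if dry_seed_g <= 0:
        raise ValueError("total seed weight must be positive")
    if not 0 <= dry_hull_g <= dry_seed_g:
        raise ValueError("hull weight must lie between 0 and the seed weight")
    return 100.0 * dry_hull_g / dry_seed_g


def empty_seed_percent(n_empty, n_total):
    """Completely empty seeds as % of all seeds counted."""
    if n_total <= 0 or not 0 <= n_empty <= n_total:
        raise ValueError("counts must satisfy 0 <= n_empty <= n_total, n_total > 0")
    return 100.0 * n_empty / n_total


# Hypothetical sample of twenty seeds (not data from the study)
hull_pct = hull_fraction_percent(dry_seed_g=0.800, dry_hull_g=0.360)
empty_pct = empty_seed_percent(n_empty=3, n_total=20)
```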

Statistical Analyses of the Spectra and Calibration Development

The intact seeds were scanned three times and the replications were averaged before further analyses. NIRS data were combined with the data of the reference analyses.

The analyses of the scanned spectra and the development of calibrations were performed using the software WinISI II (Infrasoft International LLC; Analysis, calibration and network computer software Version 1.06) and modified partial least squares regression (MPLS). The math treatment (1,4,4,1) was combined with the two spectral transformation procedures SNV (standard normal variate) and detrend. The cross validation was repeated three times for the parameters oil content, oleic and linoleic acid, and four times for the fraction of seed hulls and empty seeds, since the sample sets of the latter were smaller. Outliers were not eliminated from the calibration.
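To illustrate the preprocessing named above, the following sketch averages replicate scans and then applies SNV followed by detrending. It is not the WinISI implementation, whose internals are not described here. It assumes the common formulation in which SNV scales each spectrum by its own mean and standard deviation and detrending subtracts a second-degree polynomial fitted over the wavelength axis. The derivative and smoothing settings of the math treatment (1,4,4,1) and the MPLS regression are not reproduced. All function names, the wavelength grid and the spectra are illustrative.

```python
import numpy as np


def average_replicates(scans):
    """Mean spectrum of repeated scans; scans has shape (n_scans, n_wavelengths)."""
    return np.asarray(scans, dtype=float).mean(axis=0)


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    s = np.asarray(spectrum, dtype=float)
    return (s - s.mean()) / s.std(ddof=1)


def detrend(spectrum, wavelengths, degree=2):
    """Subtract a least-squares polynomial baseline fitted over the wavelength axis."""
    s = np.asarray(spectrum, dtype=float)
    w = np.asarray(wavelengths, dtype=float)
    # Centre and scale the wavelength axis so the polynomial fit is well conditioned
    x = (w - w.mean()) / w.std()
    coeffs = np.polyfit(x, s, degree)
    return s - np.polyval(coeffs, x)


# Synthetic demonstration data (not spectra from the study)
wavelengths = np.arange(400, 2500, 2)  # assumed grid: 400, 402, ..., 2498 nm
rng = np.random.default_rng(0)
baseline = 0.5 + 1e-4 * wavelengths + 1e-7 * wavelengths ** 2
peak = 0.2 * np.exp(-((wavelengths - 1720) / 40.0) ** 2)
scans = baseline + peak + rng.normal(0.0, 0.002, size=(3, wavelengths.size))

mean_spectrum = average_replicates(scans)  # three scans averaged into one spectrum
corrected = detrend(snv(mean_spectrum), wavelengths)
```

Averaging before the transformations mirrors the order described in the text, where the three replicate scans of intact seeds were averaged before further analyses.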

The calibration accuracy was assessed by the coefficients of determination of the calibration (R²) and of the cross validation (1-VR). Williams and Sobering [11] suggested the use of the ratio SD/SEP (SD, standard deviation of the reference analyses; SEP, standard error of prediction) for a comparison of calibrations. A value of SD/SEP >2.5 is therein defined as satisfactory for the screening in a breeding program; values of 5–10 are defined as adequate for quality control.

In contrast to Williams and Sobering [11], who worked with two sets of samples in the development and the validation of NIRS calibrations, in this study, a cross validation was performed and, therefore, the ratio SD/SECV (SECV standard error of cross validation) was used to compare the results and their meaningfulness [12].
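The ratio can be illustrated with the values reported for calibration oil-1 in the Results (SD = 3.61; SECV = 1.09 for seed meal and 1.58 for intact seeds). The sketch below is an illustration rather than the WinISI computation. It assumes SECV is the root mean squared cross-validation residual and that 1-VR is approximately 1 − (SECV/SD)²; the function names are hypothetical, and the thresholds are those quoted from Williams and Sobering [11], applied here to SD/SECV.

```python
import statistics


def secv(reference, cv_predicted):
    """Root mean squared cross-validation residual (assumed definition of SECV)."""
    if not reference or len(reference) != len(cv_predicted):
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(y - p) ** 2 for y, p in zip(reference, cv_predicted)]
    return (sum(squared) / len(squared)) ** 0.5


def sd_secv_ratio(reference, cv_predicted):
    """SD of the reference values divided by SECV, from raw data."""
    return statistics.stdev(reference) / secv(reference, cv_predicted)


def one_minus_vr(sd, secv_value):
    """Approximate cross-validation coefficient of determination."""
    return 1.0 - (secv_value / sd) ** 2


def rating(ratio):
    """Thresholds quoted in the text; values above 10 are not addressed there."""
    if ratio >= 5:
        return "adequate for quality control"
    if ratio > 2.5:
        return "satisfactory for screening"
    return "below the screening threshold"


# SD and SECV as reported in the text for calibration oil-1
ratio_meal = 3.61 / 1.09   # seed meal (m-oil-1)
ratio_seed = 3.61 / 1.58   # intact seeds (s-oil-1)
vr_meal = one_minus_vr(3.61, 1.09)
vr_seed = one_minus_vr(3.61, 1.58)
```

Under these assumptions the approximate 1-VR values round to the 0.91 and 0.81 reported for oil-1. The ratios are derived here from rounded values in the text, so they may differ slightly from those in Table 1.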

Correlations of oil content, linoleic and oleic acid, fraction of hulls and empty seeds as well as direct and indirect effects on oil content were calculated using the software PlabStat [13].

Results and Discussion

Calibration for Oil Content

The reference analyses of 108 seed samples of oil-1 varied for oil content from 10.03 to 29.57% oil with a standard deviation of 3.61 (Table 1). The standard deviation was higher in 2005 (SD = 5.28, oil-2) when samples with extremely high and low oil content were added to the seed sample set. These samples had been harvested in 2002, 2003 and 2005 in Göttingen and Hohenheim. The integration of further samples resulted in the largest range and SD in oil-3 in 2006 (Fig. 1). Soxhlet reference analyses and NIRS calibration corresponded best in calibration oil-1 for seed meal (R² = 0.92) as well as for intact seeds (R² = 0.91). The coefficient of determination of the cross validation was higher in the calibration for seed meal (1-VR = 0.91) and its SECV smaller (1.09) than for intact seeds (1-VR = 0.81, SECV = 1.58). The coefficient of determination of the cross validation of s-oil-2 and s-oil-3 for intact seeds was higher than in s-oil-1, but the corresponding standard error as well as the SD in the Soxhlet reference analyses were also higher. In contrast to this, cross validation and calibration parameters for seed meal calibrations were not improved compared to m-oil-1. The factor SD/SECV was a little higher for the intact seed calibrations s-oil-2 and s-oil-3 than in the first developed calibration s-oil-1.

Table 1 Reference analyses of the sample sets, NIRS calibration and cross validation for oil content (%), seed oleic and linoleic acid content (in % of fatty acids), seed hulls (%) and fraction of empty seeds (%) in intact seeds and in seed meal of safflower
Fig. 1 Reference analyses versus NIRS estimations for oil contents in three different calibration sets in intact seeds and seed meal

Calibration for Linoleic and Oleic Acid

The reference values for linoleic and oleic acid ranged from 67.26 to 86.02% and from 6.06 to 17.81%, respectively (Table 1; Fig. 2). The accuracy of the NIRS calibration for the linoleic acid content was higher for seed meal (1-VR = 0.76) than for intact seeds (1-VR = 0.62), and SEC and SECV were smaller for the former.

Fig. 2 Reference analyses versus NIRS estimations for linoleic and oleic acid contents of intact seeds and seed meal

The NIRS calibration for oleic acid for seed meal predicted the fatty acid content with a higher accuracy than for intact seeds. The corresponding standard errors (SEC, SECV) were smaller and the coefficients of determination (R², 1-VR) higher. The factor SD/SECV is smaller for the calibrations for oleic and linoleic acid in intact seeds than in seed meal.

Calibration for Seed Hulls and Empty Seeds

The development of a calibration for the estimation of the fraction of seed hulls was based on a set of 158 seed samples. The same sample set was used for the NIRS calibration for the fraction of empty seeds. The variation was high for both characters (Table 1; Fig. 3), as demonstrated by an SD of 13.40 and 23.34% for the fraction of seed hulls and empty seeds, respectively. The accuracy of both calibrations was higher when intact seeds instead of seed meal were scanned. Despite the high variation in the calibration sample, the calibration for the fraction of empty seeds had only a low SD/SECV and high standard errors in calibration and cross validation.

Fig. 3 Reference analyses versus NIRS estimations of the parameters fraction of seed hull and empty seeds using intact seeds or seed meal

Reliability of the NIRS Calibrations

A comparison between the statistical parameters of the developed calibrations revealed the highest reliability for the calibrations for oil content (R² = 0.92 and 0.91 for m-oil-1 and s-oil-1, respectively). This corresponds to a coefficient of determination of R² = 0.90 in NIRS calibrations for oil content in ground seeds [6] and intact seeds (R² = 0.92) [7] of safflower germplasm collections.

In this study, the calibrations for seed meal had lower standard errors and higher coefficients of determination than the calibrations for intact seeds. Pérez-Vich et al. [5] and Moschner et al. [14] analyzed the oil content of sunflower seeds by NIRS and described the calibrations for seed meal as more reliable than those for intact seeds. It was discussed earlier that the variation in seed size and form, as well as the diameter of the seed hull, could be a problem when intact seeds are scanned [15, 16].

As for oil content, the calibrations for linoleic and oleic acid predicted the fatty acid content with a higher accuracy for seed meal than for intact seeds. The group of samples with linoleic acid content below 77% and oleic acid content above 10% had higher deviations between NIRS estimated and reference values (Fig. 2). The coefficients of determination were higher and standard errors reduced when seed meal was used. In earlier studies in soybean [16] and sunflower [5, 14] the same relation was observed.

For intact seeds as well as for seed meal, the calibrations oil-2 and oil-3 revealed higher standard errors and smaller coefficients of determination in comparison to oil-1 and, at the same time, a higher variation in the set of reference values. Still, these extended calibrations are recommended for use in practice, since the calibration oil-1 was developed using a set of samples which originated mainly from one environment (Göttingen, 2004). Increasing the number of samples used for calibration led to a more robust calibration [4]. This is confirmed by comparison of calibration s-oil-2 and s-oil-3, where the standard error of the cross validation was reduced and the coefficient of determination increased (Table 1).

Under moist and humid conditions, safflower production can be hampered by poor seed set and seed development, which is further complicated by diseases [8]. This was shown in a collection of 741 safflower accessions [18], where only 25% showed a reasonable seed set under the humid conditions in Germany in the year 2002. Measuring the fraction of empty seeds is rather laborious, and for practical breeding a fast and non-destructive method is required.

The calibrations for the estimation of the fraction of seed hulls and empty seeds were developed with a set of samples which was higher in variation, but originated from the Göttingen location only (Table 1). Furthermore, it did not include genotypes with partial or reduced seed hull [8], and was smaller in size compared to the calibration set for fatty acids. Therefore, the calibration for seed hulls can be considered, despite the higher R² value, as less broad and stable than the calibration for fatty acids, and it ought to be improved before being used in breeding and selection, as described by Alomar et al. [17] for a NIRS calibration for the seed coat proportion in lupins.

Correlations of Seed Parameters

The significant and strongly negative correlation of oil content and fraction of seed hull found here has been described earlier [8], even though genotypes with partial hull were not included in this study. Selection for a reduced seed hull would therefore result in increased oil content. The strongly negative correlation of linoleic and oleic acid (r = −0.826**; Table 2) was demonstrated earlier by Fernández-Martinez et al. [1] (r = −0.98**) and Johnson et al. [19] (r = −0.98**). The inverse relationship results because oleic acid is desaturated to form linoleic acid. The positive correlation of oil content and linoleic acid content and the negative correlation of oleic acid content and oil content were unexpected. Johnson et al. [19] described a negative correlation of linoleic acid content and oil content (r = −0.20**). A positive correlation of oil and oleic acid content has been described earlier in safflower by Johnson et al. [19] as well as in rapeseed [20]. The splitting of correlation effects into direct and indirect effects, as demonstrated by Baye and Becker [21] for the Vernonia oil crop, revealed that oleic acid had a positive direct effect on oil content, which was masked by the negative indirect effects of linoleic acid (Table 2).
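The masking effect described above can be illustrated with a simplified path analysis that uses only two predictors of oil content. This is a sketch and not the PlabStat computation behind Table 2, which may include further traits. Only the correlation between oleic and linoleic acid (r = −0.826) is taken from the text; the two correlations with oil content are hypothetical values chosen to show the sign pattern, and the function name is illustrative.

```python
def two_predictor_path(r_x1_y, r_x2_y, r_x1_x2):
    """Direct and indirect effects of two correlated predictors on y.

    Solves the standardized path equations
        r_x1_y = p1 + r_x1_x2 * p2
        r_x2_y = r_x1_x2 * p1 + p2
    for the direct effects p1 and p2.
    """
    det = 1.0 - r_x1_x2 ** 2
    if det <= 0:
        raise ValueError("predictors must not be perfectly correlated")
    p1 = (r_x1_y - r_x1_x2 * r_x2_y) / det
    p2 = (r_x2_y - r_x1_x2 * r_x1_y) / det
    return {
        "x1": {"direct": p1, "indirect_via_x2": r_x1_x2 * p2},
        "x2": {"direct": p2, "indirect_via_x1": r_x1_x2 * p1},
    }


R_OLEIC_LINOLEIC = -0.826  # reported in the text
R_OLEIC_OIL = -0.25        # hypothetical, for illustration only
R_LINOLEIC_OIL = 0.40      # hypothetical, for illustration only

effects = two_predictor_path(R_OLEIC_OIL, R_LINOLEIC_OIL, R_OLEIC_LINOLEIC)
oleic = effects["x1"]
# Direct and indirect effects add up to the observed correlation
oleic_total = oleic["direct"] + oleic["indirect_via_x2"]
```

With these inputs the direct effect of oleic acid on oil content is positive, while its indirect effect through linoleic acid is negative and larger in magnitude, so the observed correlation is negative.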

Table 2 Correlation coefficients for the reference analyses results of the parameters seed oil content, linoleic and oleic acid content, fraction of seed hull and empty seeds of 146 safflower seed samples and direct and indirect effects on oil content

High oleic safflower genotypes were not included in the calibration set, since these genotypes were not as well adapted to the test environment as the high linoleic lines. Since it is possible to distinguish by NIRS between high and low oleic types in other oil crops like rapeseed [4] or sunflower [5], it should be possible in safflower as well.

Conclusions

It was demonstrated that NIRS is a fast and efficient method to screen a high number of safflower genotypes for the most important quality traits: high oil content, fatty acid composition and low seed hull fraction. The NIRS analysis of fatty acids of single seeds has been described for oleic and linoleic acid in rapeseed [22] and for stearic acid in sunflower [23]. The development of corresponding single seed NIRS calibrations in safflower would provide a valuable tool for plant breeding programs.