1 Introduction

Rising food demand from a growing population requires crop production monitoring, especially in food- and water-scarce regions. Dong et al. [1] specified that achieving food security comprises crop area estimation, yield forecasting, support for crop choice, proper monitoring and management, and prediction of how the changing climate affects crop production. Knowing which crops are growing where, from the individual-field to the global scale, is a prerequisite for all of those higher objectives. Ozdogan and Woodcock [2] indicated that precise and appropriate crop statistics are a prerequisite for studies on agricultural policy and on hydrological and biochemical processes. Specifically, researchers have highlighted the importance of crop mapping in predicting yield and estimating crop-water demand and production status. Consequently, field-to-global scale food security and resource management analyses require dependable crop-type mapping [3].

In numerous countries worldwide, traditional surveys and censuses are frequently employed to ascertain crop statistics, but these methods are often time-consuming, error-prone and laborious, and they do not provide statistics within the cropping year. Over the past three decades, however, remotely sensed satellite data at various spatial, temporal and spectral scales have been used to collect crop statistics [4]. Typically, these methods depend on the association between the crops' optical and biophysical characteristics [5]. Dynamic biomass differences are primarily due to crop type and phenological phase, so knowledge of crop type and phenology is essential to capture them. Crop mapping typically incorporates the temporal component because different crop varieties display unique and fluctuating reflectance properties throughout the growing season. Owing to the dynamic nature of cropping systems, time-series techniques surpass single-date image analyses in crop-type classification [6].

Crop categorization, however, varies depending on the location, the size and nature of the data, the techniques applied, and the accessibility of training and field datasets. Researchers have discussed various methods and techniques that have been used successfully to classify crops [7]. Odenweller and Johnson [8], Wardlow et al. [9], and Pena et al. [10] employed spectro-temporal profiles and multi-sensor data fusion techniques as tools for crop mapping.

Particularly in the more competitive crop-growing regions, approaches based on crop-specific spectro-temporal characteristics may be advantageous for classifying crops. In the 1980s, when crop characteristic profiles based on vegetation indices were revealed, Odenweller and Johnson [8] introduced the first phenology-based algorithms for classifying crops. Numerous studies at various spatial scales have examined crop mapping using phenology-based approaches [9].

In the case of spectral-temporal profiles, the preferred spatial resolution frequently depends on the size of each crop field. Furthermore, mapping and monitoring of crop types depend on temporal resolution. Many researchers have identified the optimal timing for image acquisition based on phenology to achieve the most accurate crop-type discrimination [11]. Remote sensing algorithms are tailored to data collected by different sensors such as MODIS [12], Landsat [13] and Sentinel-2 [14].

Monitoring crop development is very challenging in cloudy and diverse landscapes like Bangladesh, because competitive crops grown in the winter season have similar spectral features and responses, which further complicates crop classification based on spectro-temporal characteristics [15]. Sentinel-2 data could then be a better choice to detect the phenology of the crops more precisely. Besides, phenology-based information and knowledge acquired from ground-based sensors can help resolve issues in accurate crop mapping. The complementary use of multiple sensors could help overcome the limited availability of phenology-based data from a single sensor and help obtain high temporal-spectral data, which can differentiate phenology among the in-season competitive crops [16].

Ground-based hyperspectral spectroradiometers can obtain spectral data of crops concurrently with multispectral satellite sensors. Combining multi-sensor remotely sensed data acquired from both satellite and in-situ devices has gained popularity owing to evolving innovative algorithms [17]. Multi-sensor data-fusion practices can provide more reliable statistics than single sensors, and these fusion techniques are valuable for accurate crop mapping [18]. So far, limited multi-sensor fusion research has been done on the correlation of spectral signatures for forest plants [19], soil moisture mapping [20], wetland vegetation [21], cotton plants [22], aquatic macrophytes [23] and crops [24]. However, these studies lack phenology-based crop information collected from in-situ hyperspectral data to train multispectral satellite images for monitoring and mapping crops spatially.

The present study, therefore, investigated phenology-based spectral information using both multispectral satellite imagery and in-situ ground-based hyperspectral measurements for crop mapping and monitoring. Spectral signatures of crops were predominantly obtained through the hand-held ASD Spectroradiometer at various phenological stages of the wheat crop, aligning with the Sentinel-2 multispectral satellite overpasses on the same dates over the study area. This study aims to employ a rule-based classification technique with multispectral Sentinel-2 satellite imagery, utilizing in-situ ground-based hyperspectral reflectance of crop phenology as training data to delineate wheat from other competing winter crops in Northwestern Bangladesh as a case study.

2 Materials and methods

2.1 Study area

Bangladesh has a subtropical monsoon climate, characterized by high temperatures, high humidity and extensive seasonal variation in rainfall, which is heavy in the monsoon and low in the dry winter. Among the three distinct seasons, the dry winter is favored for most of the crops grown here. In the wet summer season and the cooler dry season, temperature ranges from 25 °C to 45 °C and from 4 °C to 25 °C, respectively. The northwest is the driest region of Bangladesh, where average yearly rainfall is limited to 1728 mm with significant seasonal and spatial variations. This region is recognized as an ecosystem susceptible to drought conditions [25]. Despite this, the northwest area of Bangladesh possesses the most significant cultivated lands, and its dry winter crop production is vital for the recent achievement of food security in Bangladesh, mainly in cereal grain production. This region produces 35 percent of the country's Boro rice (winter season cultivar) and 60 percent of its wheat.

Kaharole Upazila (3rd tier administrative level) of Dinajpur district (2nd tier administrative level), situated in this northwestern region, was selected for the innovative wheat area estimation as a case study (Fig. 1). This area was selected for data collection because it is representative of wheat cultivation in this region under favorable weather conditions. The period between November 15 and December 15 is considered the most favorable for planting winter crops like wheat, potato, maize, Boro rice, etc., in this particular area.

Fig. 1 The geographic location of the study area

In this study, twelve wheat fields were selected for collecting hyperspectral data using the Spectroradiometer. Additionally, twelve fields of maize, three fields of Boro rice and three fields of potato were also chosen to gather hyperspectral data to aid in the identification of wheat crops, particularly in determining the phenology of competitive crops.

In this region, farmers typically plant their wheat seeds between November 20th and December 5th, winter maize from November 15th to December 15th, potato from November 15th to November 30th, and rice from January 15th to February 15th. For the selection of wheat fields, consideration was given to both early and late sowing dates and the varying management practices employed by farmers so that these fields represent the overall management practices of the entire study area as shown in Table 1.

Table 1 Depiction of the wheat fields indicating their cultivation schedule and management status. Temporal information of the in-situ and satellite data collected during different phenological stages of crops

2.2 Description of dataset

2.2.1 Multispectral satellite data

For this study, Sentinel-2 data were chosen for wheat delineation because of their higher temporal and spatial resolution, which is crucial for monitoring crop phenology. The Sentinel-2 mission comprises two satellites, Sentinel-2A and Sentinel-2B (launched in 2015 and 2017, respectively), which together provide a revisit time of 5 days. The Sentinel-2 sensor is equipped with 13 bands, including three visible, two red-edge and two near-infrared bands at 10–20 m spatial resolution. These sun-synchronous satellites pass over at 10:30 a.m. local time at an altitude of 786 km with a swath width of 290 km [26]. The 5-day revisit frequency makes Sentinel-2 suitable for monitoring crop growth and phenology even in areas prone to cloud or fog during the winter season, as in the study area. We downloaded a time series of cloud-free Sentinel-2A and Sentinel-2B images (Tile Number: T45RXJ) from the USGS archives, coinciding with the wheat phenology from November 2017 to April 2018 (Table 1).

The raw Sentinel-2 data obtained in this study were atmospherically corrected using the Semi-Automatic Classification Plugin within QGIS 2.18.1. This correction is essential for removing the effects of the atmosphere on the measured signal, resulting in more consistent and reliable images for analysis. Additionally, it ensures that the corrected image data are comparable to the in-situ spectral data, enabling more accurate and meaningful comparisons between the two.

2.2.2 Hyperspectral field data

Hyperspectral reflectance of crops was gathered using the Handheld Spectroradiometer, which captures spectral information across the wavelength range of 325–1075 nm, with a resolution of 1.6 nm per interval and a field of view of 25 degrees. The Spectroradiometer was employed to measure the canopy surface reflectance spectra of selected crops during the Rabi crop growing season 2018, which extends from November 2017 to April 2018 (Fig. 2). While conducting field operations using the Spectroradiometer, white reference measurements and instrument optimization were carried out before collecting spectral data from crops at each data acquisition time. Reflectance was calculated using the following formula:

$$\% {\text{ Reflectance}} = \frac{{\text{Reflected radiance from the crop canopy}}}{{\text{Incident energy on the white reference panel}}} \times 100$$
(1)

Fig. 2 Location map displaying selected crop fields for collecting spectral signature (upper) and randomly collected ground truth points over the study area (lower)

The radiance data obtained were converted to reflectance data using the ViewSpec Pro software. During data collection, the instrument was set to capture 15 scans per dark current measurement, with an integration time of 217 ms. It was positioned with a nadir view, approximately 0.6 m above the canopies of the plants. Each crop plot was sampled ten times using the Spectroradiometer. The plots were selected to match the pure pixel size (10 m) of the Sentinel-2 image, enabling the comparison of consistent datasets acquired from both sensors. The geographic coordinates of the sample locations were recorded using the Trimble Juno SB handheld GPS device. The data collection dates were carefully chosen to correspond with the growth stages of wheat and to align with the dates of the Sentinel-2 overpass over the study area, so that data from both the ground and satellite sensors were available on the same date. Besides, special care was taken to keep the timing of spectral data collection as close as possible to the Sentinel-2 local passing time of 10:30 a.m. on every specified date.

2.2.3 Field survey data

During the wheat crop-growing season, field surveys were carried out in the study area on three dates, namely 21st December 2017, 19th February 2018, and 21st March 2018, to obtain ground truth data. For this purpose, the geographic locations of both wheat and non-wheat fields were collected using GPS devices. Additionally, information on the sowing and harvest dates, as well as phenological data, was collected from the wheat farmers at each sample site.

Information from a total of 54 randomly selected fields was recorded for further analysis (Fig. 2). While this validation sample may seem small, in crop-type mapping the validation sample is typically one-third or one-fourth the size of the training dataset. In this study, the training datasets differed from the validation data because of their distinct purposes and methods. Besides, these 54 samples were collected randomly over the entire study area. Hence, the validation sample, although apparently small, is believed to be of an acceptable size.

Besides, area statistics of wheat cultivation in Kaharole Upazila at the union level, i.e., the smallest administrative unit, were collected from the local office of the Department of Agricultural Extension (DAE) for evaluating the developed wheat map.

2.3 Preparation of training datasets

Thirty sets of spectral data (12 for wheat and maize each and 3 for rice and potato each) were available for the determination of training threshold data used for wheat crop discrimination analysis. Apparently, the sample size is small. However, considering the study's objective, it was imperative to gather spectral signatures from each crop field (prior to calibrating each spectroradiometer reading) on a specific date, aligning with the Sentinel-2 satellite overpass (Table 1). Additionally, the study had to consider optimal sunlight, available only within a four-hour window (from 10:00 am to 02:00 pm), to ensure optimum spectral signatures during the winter season. Despite intensive efforts, data collection was limited to a maximum of 30 different plots per date across the six phenological growth stages.

The spectral signatures of the crops were plotted for visual interpretation and to ensure meaningful data for further analysis. To ensure data accuracy, any inconsistencies or outliers in the Spectroradiometer and Sentinel-2 data pairs were identified and removed. The crops' spectral signatures obtained from in-situ measurements and satellite images were transformed into reflectance for comparison. Spectral data were analyzed by date and phenology in order to visually comprehend the crops' phenology and spectral characteristics. To determine the optimal narrow bands for distinguishing between wheat and other crops, discriminant analysis was conducted with SPSS-V-10. This analysis used multivariate separability measures (e.g., F-value, Wilks' lambda) to identify the most effective parameters for discriminating between samples. The K-means classification algorithm was then applied as a criterion for selecting the narrow bands [27]. Wilks' lambda measures how well each function separates cases into groups. It is the proportion of the total variance in the discriminant scores that is not explained by group differences, so smaller values of Wilks' lambda indicate better discriminatory ability. The F-value tests whether the means of three or more groups differ significantly. A high F-value indicates that the differences between group means are larger than would be expected by chance.
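The two separability measures can be illustrated for a single band. The following is a minimal sketch, not the SPSS procedure used in the study: it assumes the univariate case, where Wilks' lambda reduces to the ratio of the within-group to the total sum of squares, and the function name and input layout are hypothetical.

```python
import numpy as np

def wilks_lambda_and_f(groups):
    """Univariate Wilks' lambda and one-way ANOVA F-value for one band.

    groups: list of 1-D sequences, one per crop class, holding the
    reflectance values measured in that band (hypothetical layout).
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    k = len(groups)        # number of crop classes
    n = all_values.size    # total number of samples

    # Within-group, total, and between-group sums of squares
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ss_total = ((all_values - all_values.mean()) ** 2).sum()
    ss_between = ss_total - ss_within

    # Share of total variance NOT explained by group membership
    wilks = ss_within / ss_total
    # Ratio of between-group to within-group mean squares
    f_value = (ss_between / (k - 1)) / (ss_within / (n - k))
    return wilks, f_value
```

A band that separates the crops well gives a small lambda and a large F-value, which is the selection logic described above.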

To determine threshold values, the analysis focused on five Sentinel-2 satellite imagery bands, namely the Blue band (#2), Green band (#3), Red band (#4), Red-edge band (#6), and NIR band (#8), as these bands were found to be the most compatible with the optimal narrow bands determined through discriminant analysis. From these compatible bands, five different vegetation indices (VIs), namely NDVI, EVI, SAVI, RENDVI and GCC, were computed as described in Table 2.

Table 2 Depiction of the five vegetation indices (VIs), including their formula and references
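As an illustration of how the five indices could be computed from band reflectances, the sketch below uses commonly cited formulations. These are assumptions: the exact formulas and coefficients of Table 2 are not reproduced here and may differ, in particular for RENDVI, which is taken as the normalized difference of the NIR (B8) and red-edge (B6) bands. Reflectance is assumed to be expressed as a fraction between 0 and 1.

```python
import numpy as np

def vegetation_indices(blue, green, red, red_edge, nir, soil_factor=0.5):
    """Compute five VIs from surface reflectance (fractions, 0-1).

    Formulas are widely used textbook forms, assumed for illustration;
    they are not copied from Table 2 of the study.
    """
    blue, green, red, red_edge, nir = (
        np.asarray(b, dtype=float) for b in (blue, green, red, red_edge, nir)
    )
    return {
        "NDVI": (nir - red) / (nir + red),
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        "SAVI": (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor),
        # Assumed red-edge variant: NIR (B8) against red-edge (B6)
        "RENDVI": (nir - red_edge) / (nir + red_edge),
        "GCC": green / (red + green + blue),
    }
```

The same function works on single values or on whole image arrays, so it can be applied both to band-averaged spectroradiometer readings and to Sentinel-2 rasters.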

With the exception of data collected during the harvesting period, six sets of data corresponding to different growth stages were examined using regression analysis to determine whether the responses recorded by the two sensors differed notably. The same process was employed to evaluate the derived vegetation indices. This examination indicates the compatibility of the sensors and the potential for the Spectroradiometer to furnish training datasets for classifying crops with Sentinel-2 imagery. A linear model was fitted to the sensors' data pairs. Then, the coefficient of determination (R²) and root mean square error (RMSE) were calculated and compared. Equations 2 and 3 were applied to assess the degree of correlation between the two sensors.

$$R^{2} = \left( \frac{n\sum S_{i} O_{i} - \sum S_{i} \sum O_{i}}{\sqrt{n\sum S_{i}^{2} - \left( \sum S_{i} \right)^{2}} \times \sqrt{n\sum O_{i}^{2} - \left( \sum O_{i} \right)^{2}}} \right)^{2}$$
(2)
$${\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} \left( S_{i} - O_{i} \right)^{2} }$$
(3)

where "Si" denotes the data obtained from Sentinel-2, "Oi" refers to the data gathered from the Spectroradiometer, and n is the number of data pairs.
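The two agreement measures can be sketched in a few lines. This is an illustrative implementation, assuming R² is taken as the squared Pearson correlation coefficient (which equals the coefficient of determination of a simple linear fit); the function name is hypothetical.

```python
import numpy as np

def r_squared_and_rmse(sentinel, spectro):
    """R^2 (squared Pearson correlation) and RMSE for paired readings.

    sentinel: values from Sentinel-2 (Si)
    spectro:  values from the Spectroradiometer (Oi)
    """
    s = np.asarray(sentinel, dtype=float)
    o = np.asarray(spectro, dtype=float)
    n = s.size

    numerator = n * np.sum(s * o) - np.sum(s) * np.sum(o)
    denominator = (
        np.sqrt(n * np.sum(s ** 2) - np.sum(s) ** 2)
        * np.sqrt(n * np.sum(o ** 2) - np.sum(o) ** 2)
    )
    r2 = (numerator / denominator) ** 2
    rmse = np.sqrt(np.mean((s - o) ** 2))
    return r2, rmse
```

Note that a high R² does not imply a low RMSE: two sensors can be perfectly correlated while still differing by a constant gain or offset, which is why both measures are reported.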

The relationship between the sensors was found to be stronger for vegetation indices (VIs) than for individual bands. The VI data collected on-site were therefore employed as training datasets for wheat delineation from Sentinel-2 imagery. Normally, ground truth data are used to establish the threshold values for VIs. In this investigation, however, the threshold values were established using regression models between the VIs derived from the two sensors. A linear regression was fitted to the 12 VI value pairs of wheat for each phenological date. The threshold VI values were then determined from those regression equations using the VI values of the Spectroradiometer. The data pairs obtained from the Spectroradiometer and Sentinel-2 were evaluated for inconsistencies and outliers, and any such data were removed. In this case, vegetation index values below the 10th and above the 90th percentiles were checked and excluded as outliers. The derived regression Eqs. 4–7, for example, were used to determine threshold values for the vegetative stage (on 26 December).

$${EVI}_{Threshold} = {EVI}_{SR} \times 1.17 + 0.021$$
(4)
$${NDVI}_{Threshold} = {NDVI}_{SR} \times 0.948 + 0.049$$
(5)
$${SAVI}_{Threshold} = {SAVI}_{SR} \times 1.082 +0.044$$
(6)
$${RENDVI}_{Threshold} = {RENDVI}_{SR} \times 0.815 +0.108$$
(7)

where the suffix "Threshold" is utilized to specify a specific threshold, either minimum or maximum, while the suffix "SR" is utilized to indicate in-situ data gathered from the Spectroradiometer.
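The conversion from a Spectroradiometer VI to a Sentinel-2 threshold can be sketched as a simple lookup of the slope and intercept from Eqs. 4–7. The coefficients below are those given for 26 December; the function names are hypothetical, the red-edge index of Eq. 7 is keyed as RENDVI, and the input VI values in the usage line are illustrative only, not measurements from the study.

```python
# Slope and intercept of the regression equations (Eqs. 4-7),
# valid for the vegetative stage on 26 December only.
REGRESSION = {
    "EVI": (1.17, 0.021),
    "NDVI": (0.948, 0.049),
    "SAVI": (1.082, 0.044),
    "RENDVI": (0.815, 0.108),
}

def sentinel_threshold(index_name, vi_sr):
    """Convert one Spectroradiometer VI value to its Sentinel-2 equivalent."""
    slope, intercept = REGRESSION[index_name]
    return slope * vi_sr + intercept

def threshold_range(index_name, vi_sr_values):
    """Minimum and maximum Sentinel-2 thresholds from field VI values."""
    converted = [sentinel_threshold(index_name, v) for v in vi_sr_values]
    return min(converted), max(converted)

# Illustrative input: hypothetical field NDVI readings for wheat plots
ndvi_min, ndvi_max = threshold_range("NDVI", [0.42, 0.55, 0.61])
```

Each phenological date would need its own set of coefficients, since the regression was fitted separately per date.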

2.4 Rule-based wheat classification

The wheat crop was classified using a rule-based technique, which utilized the VIs threshold values. This approach involves integrating auxiliary datasets, including expert knowledge through spectral datasets for crop classification. Thorat et al. [33] employed knowledge-based expert methods to estimate crop data. Knowledge engineering enables the creation of decision trees that define the variables and the rules through calculated spectral data. These rules and conditions are used to generate trees. Figure 3 shows the current wheat delineation method utilizing the rule-based classifier.

Fig. 3 Flow chart depicting the rule-based classification method utilized in this study

The rule-based wheat crop delineation technique involves two classes, wheat and non-wheat, and the rule variables are based on VI thresholds. The knowledge-based classifier was initially used to delineate the wheat crop on a single date. Subsequently, a rule-based binary-numbering system was applied to classify all single-date images. The binary classification labeled each pixel as either wheat (binary 1) or non-wheat (binary 0), depending on whether its VI values fell within or outside the threshold values, respectively. Finally, the phenology-based wheat map was delineated by combining the six single-date wheat maps through spatial mathematical operations. The estimated wheat area for Kaharole Upazila was validated through an error matrix and the agreement between the comparison methods for the study year.
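The binary rule and the multi-date combination can be sketched with array operations. This is a minimal illustration under stated assumptions: the threshold bounds are treated as inclusive, and the "spatial mathematical operation" is assumed to be a logical AND across dates (a pixel is wheat only if it is flagged on every date), which the text does not specify. Function names are hypothetical.

```python
import numpy as np

def classify_single_date(vi_image, vi_min, vi_max):
    """Label pixels 1 (wheat) if the VI lies within [vi_min, vi_max], else 0."""
    vi_image = np.asarray(vi_image, dtype=float)
    inside = (vi_image >= vi_min) & (vi_image <= vi_max)
    return inside.astype(np.uint8)

def combine_dates(binary_maps):
    """Combine single-date maps; assumed rule: wheat on all dates (AND)."""
    stack = np.stack(binary_maps)
    return np.all(stack == 1, axis=0).astype(np.uint8)
```

Other combination rules, such as requiring agreement on a majority of dates, would only change the body of `combine_dates`; the per-date thresholding stays the same.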

The accuracy of the classified wheat map, including user's, producer's and overall accuracy and the kappa coefficient, was assessed using a dataset of 54 sample points obtained from both wheat and non-wheat fields (Fig. 2). Eqs. (8–11) presented below were used to compute the accuracy measures of the classified wheat maps.

$${\text{Producer's}}\,{\text{accuracy}} = \frac{{{\text{Number of truth values in column}}}}{{{\text{Sum of values in column}}}} \times 100$$
(8)
$${\text{User}}'{\text{s}}\,{\text{accuracy}} = \frac{{{\text{Number of truth values in a row}}}}{{{\text{Sum of values in a row}}}} \times 100$$
(9)
$${\text{Overall accuracy}} = \frac{{\text{Sum of truth values}}}{{\text{Total number of ground truth}}} \times 100$$
(10)
$${\text{Kappa coefficient k}} = \frac{{\left\{ \left( {\text{Total number of GT}} * \sum{}{\text{truth values}} \right) - \sum{} \left( {\text{Row\,*\,Column}} \right) \right\}}}{{\left\{ \left( {\text{Total number of GT}} \right)^2 - \sum{} \left( {\text{Row\,*\,Column}} \right) \right\}}}$$
(11)
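Eqs. (8–11) can be sketched for an error matrix as follows. The layout is an assumption consistent with the equations: rows hold the classified (map) classes and columns hold the reference (ground truth) classes. The function name is hypothetical.

```python
import numpy as np

def accuracy_metrics(matrix):
    """Producer's, user's and overall accuracy (%) and kappa coefficient.

    matrix: square error matrix, rows = classified, columns = reference.
    """
    m = np.asarray(matrix, dtype=float)
    total = m.sum()            # total number of ground truth points
    correct = np.diag(m)       # correctly classified points per class
    row_sums = m.sum(axis=1)
    col_sums = m.sum(axis=0)

    producers = correct / col_sums * 100.0
    users = correct / row_sums * 100.0
    overall = correct.sum() / total * 100.0

    # Sum over classes of (row total * column total), as in Eq. 11
    chance = (row_sums * col_sums).sum()
    kappa = (total * correct.sum() - chance) / (total ** 2 - chance)
    return producers, users, overall, kappa
```

For a hypothetical two-class matrix `[[20, 5], [10, 15]]` (not the study's results), this gives an overall accuracy of 70% and a kappa of 0.4.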

3 Results and discussions

3.1 Training datasets from sensors’ information

The training datasets were generated from both Sentinel-2 satellite imagery and in-situ spectroradiometer data in order to obtain the spectral signatures of wheat.

3.1.1 Hyperspectral-based crop phenology

Throughout the growing season, the spectral signature of the wheat crop is found within the visible and near-infrared (400–1000 nm) wavelength range (Fig. 4). A spectroradiometer was used to capture the time-series spectral reflectance of wheat during the winter crop growing period of 2017–2018, enabling the identification of this spectral signature. In both the early and late stages of the growing season, the wheat spectra exhibit decreased reflectance in the near-infrared (NIR) region and increased reflectance in the visible region. This is because, during these two stages, crop chlorophyll absorbs less in the visible (blue-red) region, while the sparse or senescing canopy reflects less in the NIR region. The results demonstrate the opposite trend for the vegetative and reproductive stages. In the vegetative stage, the crop exhibits greater vitality, leading to increased absorption of visible light and increased reflection of near-infrared (NIR) light. This trend is reversed during the reproductive stage.

Fig. 4 The spectral signature of wheat throughout the 2017–2018 growing period

As depicted in Fig. 5a, the initial seedling stage of wheat, maize, and potato occurs 9–14 days after sowing (DAS). At this stage, the reflectance is predominantly affected by soil properties rather than vegetation signals, particularly for maize. Conversely, during the booting stage of the wheat crop, which occurs at 65–70 DAS, as shown in Fig. 5b, the higher spectral reflectance of wheat in the visible green and near-infrared (NIR) regions enables its differentiation from other crops. While maize and potato share a similar pattern in the visible spectrum, they exhibit contrasting reflectance characteristics in the NIR region. The denser vegetation canopy of the potato contributed to its comparatively higher reflectance in the NIR zone, leading to the observed difference from maize. In Fig. 5c, there was a slight decrease in wheat reflectance in the NIR zone during the 89–94 DAS period. This could be attributed to the spikelet and grain formation stage of wheat. On the other hand, maize reflectance may have peaked during this stage due to its abundant leafy vegetation before the formation of the cob. Despite the presence of gaps, the canopy coverage remains dense enough to allow for a small amount of energy reflectance to reach the sensor. During the initial stages of vegetation, Boro rice displays a reduction in NIR reflectance due to the presence of standing water. On the other hand, potato exhibits a decrease in NIR reflectance as it reaches physiological maturity.

Fig. 5 The spectral signatures of wheat, maize, potato, and rice at various stages of development. a Spectral signature at 9–14 DAS for wheat, maize, and potato. b Spectral signature at 65–70 DAS for wheat, maize, and potato. c Spectral signature at 89–94 DAS for wheat, maize, potato, and rice. d Spectral signature at 104–109 DAS for wheat, maize, and rice. e The crop phenology curve, based on NDVI, obtained from the ASD sensor's data

During the 104–109 DAS period (Fig. 5d), wheat starts to senesce, which causes a sharp change in the crop structure and a rapid decrease in light reflection. Among the mentioned crops, maize and rice have the longest growing seasons, spanning approximately 140–160 days for maize and 140–150 days for rice. Wheat follows with a growing season duration of around 110–120 days, while potato has the shortest growing season, lasting approximately 70–90 days. As a result, potato tends to senesce quickly compared to wheat and maize, which undergo a gradual senescence process. Based on visual observation, the 89–94 DAS period (Fig. 5c) appears to be the most suitable for distinguishing wheat from other winter crops in this area. Based on the spectral signature, the growth curves of the four crops (rice, wheat, potato and maize) were obtained using a spectroradiometer-derived NDVI approach, as demonstrated in Fig. 5e, which presents the standard phenology patterns of these crops.

During the initial phases of the growth cycle, the crops exhibit lower VI values, primarily because of the limited canopy cover and greater exposure of the soil background. As growth progresses to the middle of the curve, the NDVI values tend to saturate, showing negligible differences between values. As the crops approach the end of their growth cycle, the NDVI values decline as the crops undergo senescence, leading to greater reflection of red light. Nonetheless, NDVI-based multi-temporal crop phenology reveals a distinct crop cycle pattern, effectively distinguishing wheat from other crops. Tian et al. [34] suggested using multi-temporal images in classification to achieve a better distinction between crops. As each crop may have distinct sowing and harvesting dates and variable rates of canopy growth, the temporal spectral signature could enable differentiation among crop types. In regions where multiple crops are grown in close proximity, a single-date image may cause ambiguity in distinguishing crops, since several crops may exhibit comparable indices during specific growth stages [35].

3.1.2 Bands and parameter for threshold values

According to the discriminant analysis, the top five bands for differentiating wheat crops are mainly located at 905, 735, 650, 558 and 530 nm. These bands were selected on the basis of a lower Wilks' lambda value (0.00108) and a higher F-value (165.2). Similar findings regarding narrowband selection have also been reported by Manjunath et al. [27]. Subsequently, these optimal bands were used to identify the corresponding bands in the Sentinel-2 imagery. As a result, Sentinel-2 bands B8 (767–908 nm), B6 (731–749 nm), B4 (646–685 nm), B3 (537–582 nm), and B2 (439–535 nm) were selected for comparison and subsequent analysis of wheat crop delineation, based on the significant differences observed in their reflectance spectra. The Spectroradiometer's hyperspectral reflectances were averaged to create broadband reflectance values akin to those of Sentinel-2's B2, B3, B4, B6, and B8 bands.
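The band-averaging step can be sketched as follows, using the Sentinel-2 band ranges quoted above. This is a simplified illustration that assumes an unweighted mean over each wavelength range; an operational workflow might instead weight by the sensor's spectral response function, which is not described in the text. The function name is hypothetical.

```python
import numpy as np

# Sentinel-2 band ranges in nm, as quoted in the text
BAND_RANGES = {
    "B2": (439, 535),
    "B3": (537, 582),
    "B4": (646, 685),
    "B6": (731, 749),
    "B8": (767, 908),
}

def simulate_broadband(wavelengths, reflectance):
    """Average a hyperspectral spectrum over each Sentinel-2 band range.

    wavelengths: 1-D array of wavelengths in nm (e.g. 325-1075)
    reflectance: 1-D array of reflectance at those wavelengths
    """
    w = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    broadband = {}
    for band, (low, high) in BAND_RANGES.items():
        in_band = (w >= low) & (w <= high)   # inclusive range (assumed)
        broadband[band] = float(r[in_band].mean())
    return broadband
```

The resulting five values per spectrum can then be compared band by band with the Sentinel-2 pixel at the same location and date.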

The reflectance values from the Spectroradiometer were compared with those of Sentinel-2 band by band. Table 3 presents the coefficients of determination for each band on the five most closely corresponding dates, calculated from the data obtained from Sentinel-2 and the Spectroradiometer. The R-squared values and standard deviations of residuals for the VI comparison are also shown in Table 3.

Table 3 The coefficients of the determination result from a comparison of individual bands and R-squared value and standard deviation of residuals for comparing VIs

On 26th Dec 2017, 25th Jan 2018, 19th Feb 2018, and 06th Mar 2018, the R² values were greater than 0.60 with a p-value of less than 0.05 for each band. This implies that there is a correlation between the two sensors, albeit a relatively weak one. The red-edge band presents a unique scenario, as it exhibits the lowest coefficient of determination among all bands, indicating the least correspondence between the two sensors. The results showed that the linear model can account for a minimum of 61% of the variability observed in the data. The remaining variability might be attributed to differences in the acquisition dates and times of the data, discrepancies in radiometric correction, and differences in sensor center wavelength. Further data analysis, shown in Fig. 6a, indicates the presence of an outlier and signs of skewness and asymmetry. Table 3 demonstrates that the coefficients of determination between the two data sources are more robust for vegetation indices (VIs) than for the band-by-band reflectance comparison.

Fig. 6 a Boxplot of the reflectance values per band for both sensors on February 19, 2018 (left). b Boxplot of the distribution of VIs obtained from the two sensors on February 19, 2018 (right)

On the other hand, the paired difference exhibited greater symmetry, indicating that the sets of samples had comparable levels of skewness. Given that the skewness was not significant, the precision of the p-values (p < 0.05) and critical values derived from the t-distribution was expected to be enhanced, and hence the t-test was considered to perform adequately. As illustrated in Fig. 6a, the correlation between the data pairs is not consistent across all pairs. Among the vegetation indices, EVI, NDVI and SAVI exhibit comparatively stronger correlations and have comparable mean values across the two sensors, whereas RENDVI and GCC display lower coefficients in all analyzed pairs (Table 3). The Spectroradiometer yielded higher VIs and larger variability in all cases, as shown in Fig. 6b.

Calculations of the SAVI and EVI using the Spectroradiometer resulted in higher values, while the NDVI captured a wider range of variability in the data from both sensors. The wheat phenology based on NDVI obtained from the Spectroradiometer was compared with that obtained from Sentinel-2 data, and the comparison revealed a more plausible pattern of phenology. Because the correlation between the Spectroradiometer and Sentinel-2 sensors at the band-by-band level was limited, VI-based values were deemed necessary for more precise knowledge-based crop classification. While the mean values of EVI, SAVI, and NDVI appear consistent across sensors, the GCC and RENDVI values show some disparity. Nevertheless, RENDVI, along with the three other VIs, remains relevant for establishing threshold values. These values serve as training data for wheat classification in Sentinel-2 imagery.
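
For reference, the five indices compared above can be computed from band reflectances as in the sketch below. The formulas are the commonly used textbook forms with their usual default coefficients. The band pairing chosen for RENDVI (near-infrared against a red-edge band) is an assumption here, and the definitions given in the methods section of this paper take precedence wherever they differ.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy_l=1.0):
    """Enhanced Vegetation Index with the commonly used default coefficients."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy_l)

def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor = 0.5 is the usual default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)

def rendvi(nir, red_edge):
    """Red-edge NDVI; the NIR/red-edge band pairing is an assumption."""
    return (nir - red_edge) / (nir + red_edge)

def gcc(red, green, blue):
    """Green Chromatic Coordinate."""
    return green / (red + green + blue)

# Hypothetical surface reflectances for one vegetated pixel (NOT study data).
blue, green, red, red_edge, nir = 0.03, 0.07, 0.04, 0.15, 0.45
for name, value in [
    ("NDVI", ndvi(nir, red)),
    ("EVI", evi(nir, red, blue)),
    ("SAVI", savi(nir, red)),
    ("RENDVI", rendvi(nir, red_edge)),
    ("GCC", gcc(red, green, blue)),
]:
    print(f"{name}: {value:.3f}")
```

Applying the same functions to spectroradiometer reflectance resampled to the satellite bands and to Sentinel-2 reflectance keeps the two sets of VIs directly comparable.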

3.1.3 Threshold values as training datasets

The imagery was classified using the appropriate threshold values to delineate crop areas. The threshold values for the time-series VIs were determined from the spectral data of a specific crop growth profile across all dates within the respective crop field. The phenology-based maximum and minimum NDVI threshold values of wheat are shown in Table 4.

Table 4 Phenology-based NDVI threshold values for the wheat crop derived from the reflectance data obtained through the Spectroradiometer

The maximum VI value indicates dense crop coverage across the fields, while the minimum value indicates sparser coverage, with bare soil exposed during the early crop stages. If multiple crops are between the vegetative and flowering stages on the same day, using a single-date threshold value during the classification process can lead to confusion. Figure 6 illustrates, for example, that on 25th January 2018 the maximum NDVI value is high for all four crops, i.e., more than 0.70. Numerous studies, including Li et al. [36], have demonstrated the significance of phenology-based classification in achieving precise crop classification.
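
The argument for multi-date thresholds can be illustrated with a small rule-based sketch. It assumes that a pixel is labeled wheat only when its NDVI lies between the minimum and maximum threshold on every acquisition date. The threshold windows and pixel values below are hypothetical and are not the values of Table 4.

```python
# Hypothetical per-date (min, max) NDVI windows for wheat; NOT the Table 4 values.
WHEAT_NDVI_WINDOWS = {
    "2017-12-26": (0.20, 0.45),
    "2018-01-25": (0.70, 0.90),
    "2018-02-19": (0.65, 0.88),
}

def in_window(pixel_series, date, windows=WHEAT_NDVI_WINDOWS):
    """Single-date test: is the pixel's NDVI inside the wheat window on that date?"""
    low, high = windows[date]
    return date in pixel_series and low <= pixel_series[date] <= high

def is_wheat(pixel_series, windows=WHEAT_NDVI_WINDOWS):
    """Phenology-based test: the pixel must pass the window on every date."""
    return all(in_window(pixel_series, date, windows) for date in windows)

# Two hypothetical pixels that look alike on 25 January but differ in phenology.
wheat_pixel = {"2017-12-26": 0.30, "2018-01-25": 0.78, "2018-02-19": 0.80}
other_crop_pixel = {"2017-12-26": 0.62, "2018-01-25": 0.76, "2018-02-19": 0.50}

for label, pixel in [("wheat", wheat_pixel), ("other crop", other_crop_pixel)]:
    single = in_window(pixel, "2018-01-25")
    multi = is_wheat(pixel)
    print(f"{label}: single-date match = {single}, multi-date match = {multi}")
```

In this toy case both pixels pass the single-date test, but only the wheat pixel passes on all dates, which is the confusion the phenology-based thresholds are meant to remove.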

3.2 Wheat classification and validation

In 2017–18, the VI-phenology of the cultivated wheat crop in the Kaharol Upazila was observed, and its distribution is illustrated in Fig. 7. According to the classification based on NDVI, EVI, SAVI, and RENDVI, the estimated areas of wheat cultivation in the Upazila were 1950.29, 1982.23, 1838.88, and 2439.45 ha, respectively, while the official reported area was 1751.00 ha.
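
To put these figures on a common scale, the short sketch below expresses each estimate as a difference from the officially reported area. The areas are the values quoted above. The helper name `relative_difference` is an illustrative choice.

```python
OFFICIAL_AREA_HA = 1751.00
ESTIMATED_AREA_HA = {
    "NDVI": 1950.29,
    "EVI": 1982.23,
    "SAVI": 1838.88,
    "RENDVI": 2439.45,
}

def relative_difference(estimate, reference):
    """Signed difference of the estimate from the reference, in percent."""
    return 100.0 * (estimate - reference) / reference

for name, area in ESTIMATED_AREA_HA.items():
    diff_ha = area - OFFICIAL_AREA_HA
    diff_pct = relative_difference(area, OFFICIAL_AREA_HA)
    print(f"{name}: {diff_ha:+.2f} ha ({diff_pct:+.1f}%)")
```

All four estimates exceed the official figure. The SAVI-based estimate is the closest (roughly 5% higher) and the RENDVI-based estimate the farthest (roughly 39% higher).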

Fig. 7

Remotely sensed cultivated wheat area in the Kaharole Upazila based on vegetation indices. a shows NDVI-based wheat map. b shows EVI-based wheat map. c shows SAVI-based wheat map. d shows RENDVI-based wheat map

To validate the produced maps, an accuracy assessment was conducted and the agreement was compared. The overall accuracies of the classified maps based on NDVI, EVI, SAVI, and RENDVI were found to be 83.33%, 85.19%, 81.48%, and 72.22%, respectively. These results, as presented in Table 5, indicate satisfactory classification outcomes (p < 0.05). Table 5 also presents the error matrix used to assess the accuracy of the NDVI-based wheat classification, as a case study. In this case, the user's, producer's and overall accuracies show relatively good results (more than 80%). However, the overall accuracies may vary due to the spatial variability of crop and land cover types, characterized by smallholdings and complex, heterogeneous and fragmented patterns. Mixed pixels containing different crop types might reduce overall accuracies as well. Thus, small wheat fields are likely to appear as mixed pixels on satellite images with a resolution of 10 m and to be missed in the classification procedure [37]. Since the study area has a high percentage of mixed pixels due to smallholder farming, accurately assigning a single crop class to those pixels becomes challenging.
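
The accuracy measures named above follow from an error matrix as sketched below. The matrix counts are invented for illustration and are not the values of Table 5. The function name and the row/column convention (rows are mapped classes, columns are reference classes) are assumptions.

```python
def accuracy_report(matrix):
    """Accuracy measures from a square error matrix.

    matrix[i][j] = number of samples mapped as class i whose reference class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(k)]
    mapped_totals = [sum(row) for row in matrix]
    reference_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    overall = sum(diagonal) / total
    # User's accuracy: share of pixels mapped as a class that truly belong to it.
    users = [diagonal[i] / mapped_totals[i] for i in range(k)]
    # Producer's accuracy: share of reference pixels of a class mapped correctly.
    producers = [diagonal[j] / reference_totals[j] for j in range(k)]
    # Kappa: agreement beyond what the marginal totals would give by chance.
    chance = sum(m * r for m, r in zip(mapped_totals, reference_totals)) / total ** 2
    kappa = (overall - chance) / (1.0 - chance)
    return {"overall": overall, "users": users, "producers": producers, "kappa": kappa}

# Hypothetical wheat / non-wheat error matrix (NOT the values of Table 5).
example_matrix = [
    [24, 4],   # mapped as wheat: 24 wheat, 4 non-wheat in the reference data
    [5, 21],   # mapped as non-wheat: 5 wheat, 21 non-wheat in the reference data
]
report = accuracy_report(example_matrix)
print(f"overall accuracy = {100 * report['overall']:.2f}%")
print(f"user's accuracy (wheat) = {100 * report['users'][0]:.2f}%")
print(f"producer's accuracy (wheat) = {100 * report['producers'][0]:.2f}%")
print(f"kappa = {report['kappa']:.2f}")
```

Reporting the kappa coefficient alongside the overall accuracy helps when one class dominates the reference samples, because a high overall accuracy can then arise largely by chance.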

Table 5 Accuracy of the VI-based wheat classification, and the error matrix used to assess the accuracy of the NDVI-based classification of wheat crops

After estimating the area, a comparison was made with the union-level acreage reported by the DAE. The results, presented in Fig. 8, show a good level of agreement (R2 = 0.98) between the two, although the remotely sensed output estimated higher wheat areas. Differences in the methodologies used for estimation might contribute to these variations. The higher remotely sensed estimates may be attributed to the finer spatial resolution, reduced sampling error, and improved detection algorithms of the remote sensing approach, compared with the traditional, broader-scale observations reflected in the official data, which might rely on coarse spatial resolution and thus yield generalized estimates. Besides, changes in land use over time, such as conversion of non-agricultural land or other crops to wheat cultivation, may not be adequately reflected in official statistics [38].

Fig. 8

Comparison of the VI-based remotely sensed union-level wheat area with the officially reported union-level area for the 2017–2018 season

The overall accuracies and kappa coefficients revealed that the classification based on EVI achieved the highest accuracy, followed by NDVI, SAVI, and RENDVI. The area estimated using EVI showed a relatively higher level of accuracy, which could be attributed to its high sensitivity to changes in vegetation density, particularly in areas with dense biomass where NDVI may reach saturation. Moreover, as the study area is located in a highly humid region, the use of EVI in the classification process may have helped to alleviate the effect of atmospheric factors, as noted by Huete et al. [39]. EVI is capable of decoupling the canopy signal from the background, which may have contributed to its ability to produce more accurate results [40]. The increase in accuracy can be attributed to the addition of the blue band, which mitigates the influence of atmospheric aerosols on the red band. Additionally, adjustment factors were applied to diminish the influence of soil reflectance. Based on these differences, Mancino et al. [41] noted that NDVI exhibits higher sensitivity to chlorophyll content, while EVI shows greater responsiveness to the structural characteristics of the vegetation cover. In contrast, Reed et al. [42] proposed that SAVI (Soil-Adjusted Vegetation Index) is more efficient in capturing the structure of the vegetation canopy and minimizing the influence of background elements and atmospheric factors. Among the VIs, RENDVI shows the lowest accuracy, which might be due to the red-edge region, where the spectral signature is very heterogeneous.
Despite the variations in the remotely sensed winter wheat acreage, we can conclude that the threshold values for VIs (NDVI, EVI or SAVI) generated by the Spectroradiometer can serve as reliable training data for wheat classification using Sentinel-2 images.

4 Conclusion

This study aimed to develop an innovative wheat crop classification methodology using multi-temporal satellite data and field-level datasets from an in-situ hand-held Spectroradiometer. The proposed knowledge-based classifier combined spectral features of crops obtained from both Sentinel-2 imagery and a ground-based hand-held Spectroradiometer to achieve accurate classification results. Additionally, the comparison with the official union-level estimated area reveals a favorable level of agreement. The proposed method can resolve uncertainties in image classification that arise from confusion among the different growth stages of the crops. The crop growth phenology profiling method used in this study effectively discriminates wheat from other crops because each vegetation index (VI)-based crop phenology has a unique pattern. The incorporation of multispectral satellite imagery and hyperspectral in-situ data could enhance the ability to delineate wheat from other crops. Moreover, the proposed methodology could address classification problems in areas with major in-season competing crops in fragmented and heterogeneous croplands. Finally, this method has the potential to be extended to the phenology and area delineation of other crops locally and globally.