1 Introduction

The agriculture sector and food production system consume nearly 30 percent of total global energy [13]. Over time, the pattern of energy consumption in this sector has been changing due to changes in production technologies, the limited supply of land, an exponentially rising population, shifting dietary and eating patterns, and more [4,5,6,7,8]. There are two types of energy sources: direct (such as fuel, machinery, gas) and indirect (such as seed, fertilizer, pesticides) [9,10,11,12]. Most of this energy is derived from non-renewable resources, which are limited and degrade the environment [2, 13,14,15,16,17,18]. The prices of these energy inputs fluctuate with global oil and gas prices, which are highly volatile, making agricultural production highly vulnerable to cost shocks and potentially unviable [13, 19,20,21,22]. Given the limited supply of energy inputs, it is imperative to forecast energy consumption in the agriculture sector to make production sustainable [6, 20, 23,24,25].

Modelling and forecasting crop yields, output energy, input energy and carbon footprints in the agriculture sector is useful for farmers, governments and agribusiness industries. It assists in making informed decisions regarding policies and market support, managing supply chains, and forecasting purchasing and storage needs [22, 25]. However, energy analysis of plant behavior is highly intricate because of genetic, environmental and soil conditions and many other direct and indirect factors [5, 14, 20, 23, 25,26,27,28,29]. Traditionally, econometric models based on the Cobb–Douglas production function were the most popular technique for investigating functional relations between input energy and crop yield for various crops [6, 25, 30,31,32,33]. But many studies have found nonlinear relationships between the covariates and input energy [25, 26, 34,35,36,37]. Energy sources also lack homogeneity [38, 39], and it is laborious to apportion the energy consumed when there are multiple outputs, such as grain and straw [40,41,42,43]. Moreover, there is little prior information about the functional form of the relation, so assuming normality can be erroneous in modeling. Here, artificial neural networks (ANNs) provide a powerful and flexible tool for modeling complex systems [43, 44].

ANNs are data driven and distribution free; therefore, they can approximate non-linear functions and solve problems where the input–output relationship is not easily computable [45, 46]. The major benefits of ANNs are their resistance to noise, efficiency in handling multiple unknown non-linear interactions in the system, ease of dealing with missing data and high predictive accuracy [5, 40, 47, 48]. Neural networks (NNs) have greater estimation and prediction accuracy, which makes them a superior modeling technique to MLR, PM and logit models [47,48,49,50,51]. NNs have been used extensively by researchers studying energy engineering in the crop sector to model input energy [25, 36, 39, 40, 52], output energy [34, 53,54,55], crop yield [35, 37, 56,57,58] and carbon emissions [59, 60].

Accurate modelling can help correct the pattern and type of energy use, save energy in agricultural activity and grow clean products, especially in developing countries like India, which is characterized by a booming population and intense competition among sectors for limited energy resources for growth and development [1, 5, 61]. Wheat is the staple food in India and is grown on nearly 29.8 million hectares. Despite concerns regarding incessant increases in energy consumption in wheat production since the green revolution [62,63,64], there is no significant study forecasting input energy consumption in wheat production in India. Thus, the aim of this research is to model and forecast input energy, using data from the wheat-dominant region of western Uttar Pradesh and a multilayer feedforward ANN (MLFANN) model based on various direct and indirect factors. It is believed that this study is the first of its kind and can therefore aid in formulating appropriate policies to manage the food-energy-environment crisis in India and achieve the Sustainable Development Goals [18].

2 Modelling Technique

2.1 Heuristics of Artificial Neural Network

The ANN technique mimics the information processing mechanism of a biological nervous system such as the brain, wherein neurons take in new information, process it together with existing information or bias, and keep transmitting the activated information to the next neuron until a situation has been learned. A simple artificial neuron is depicted in Fig. 1 below.

Fig. 1 Simple artificial neuron

Numerous such neurons are stacked across multiple layers in NNs, arranged sequentially as an input layer, one or more hidden layers and an output layer. The information is first fed into the input layer, whose nodes are the explanatory variables of the model; it is then weighted and transmitted to the neurons in the next layer. In the hidden layer, the net input of each neuron is calculated as given in Eq. 1 below before it is fired using an appropriate activation function.

$${netinput}_{i}={\sum}_{j}{w}_{ij}* {output}_{j}+{\mu }_{i}$$
(1)

In Eq. 1, \({w}_{ij}\) is the weight of the connection from neuron j in the previous layer to neuron i, \({output}_{j}\) is the output of neuron j, and \({\mu }_{i}\) is the bias of neuron i. The role of the activation function is to transform the summed information for the next layer. There are various activation functions, such as threshold, sigmoidal, hyperbolic tangent and ReLU, each used in specific contexts [43]. Using an appropriate activation function is vital to optimizing the NN architecture. For example, threshold activation is based on a binary step function, which makes it easy to apply; however, a binary classifier will not work when there are more than two classes. The sigmoidal activation function has a characteristic “S” shape and can be used in models where a probability is to be predicted as the output. The drawback of the sigmoidal function is that it can cause the NN to get stuck during training if strongly negative input is provided. In that case the hyperbolic tangent activation function is recommended, as it maps strongly negative inputs to negative outputs and only near-zero inputs to near-zero outputs, and is thus less likely to get stuck in training. The rectified linear unit (ReLU) is another nonlinear activation function; it is a better approximator than the hyperbolic tangent function and is recommended for the hidden layers only. Another important factor in ANN modelling is weight adjustment. There are two ways to adjust weights: the brute force method, which is best suited to single-layered feedforward networks, and batch gradient descent, which is a first-order iterative optimization algorithm and corrects wrong weights one at a time.

A sigmoid activation function, which is a common type of activation function, is represented as

$$g\left(netinput\right)=\frac{1}{1+{e}^{-netinput}}$$
(2)
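As a minimal sketch of Eqs. (1) and (2), the snippet below computes the net input of a single neuron and passes it through the sigmoid, with the hyperbolic tangent and ReLU shown for comparison. The weights, incoming outputs and bias are illustrative values, not taken from the study, and the function names are hypothetical.

```python
import math

def net_input(weights, outputs, bias):
    """Eq. (1): weighted sum of the previous layer's outputs plus the bias."""
    return sum(w * o for w, o in zip(weights, outputs)) + bias

def sigmoid(x):
    """Eq. (2): logistic activation, maps any real net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectified linear unit: keeps positive net input, zeroes the rest."""
    return max(0.0, x)

# Hypothetical neuron with three incoming connections (illustrative values).
weights = [0.4, -0.2, 0.1]
outputs = [1.0, 0.5, 2.0]
bias = 0.05

z = net_input(weights, outputs, bias)
activations = {
    "sigmoid": sigmoid(z),
    "tanh": math.tanh(z),
    "relu": relu(z),
}
```

Evaluating `sigmoid` and `math.tanh` at a strongly negative value such as -5 shows the contrast described in the text: the sigmoid output is close to zero, while the hyperbolic tangent output is close to -1.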

Hidden layers in an ANN separate the functional characteristics of data that must be separated nonlinearly, and the number of hidden layers required depends on the complexity of the identification decisions. According to Karsoliya [65], three hidden layers are appropriate with respect to time complexity and accuracy, whereas Panchal et al. [66] proposed a maximum of two hidden layers for model efficiency. The choice of layers also depends on the nature of the problem. Seifollahi et al. [67] suggested that a single hidden layer with a large number of neurons is accurate in simple and linear problems, whereas Villiers et al. [68] found 3 to 4 layers to work efficiently with small and large data sets. Thus, there is no predefined rule for the appropriate number of hidden layers, which varies from case to case [1]. The information passed from the input layer to the hidden layer is then processed and activated using an activation function.

Another important consideration in NN modeling is the training algorithm. Training algorithms are applied in the hidden layers, where the activated information in each neuron is processed together with that of the other neurons in the hidden layer(s) to generate the predicted output. They include gradient descent, Newton's method, conjugate gradient, the quasi-Newton method and the Levenberg–Marquardt algorithm. It is crucial to determine the most suitable NN architecture for accurately modelling and forecasting input energy, which can be done through repeated trials [45]. Back propagation (BP) is one of the most widely used training algorithms for supervised learning [35, 43], and combining it with gradient descent with momentum makes it more efficient. This forward flow of information processing is called a feed-forward ANN, which is used in this study [69].

Various topologies of ANN architectures have been used in the fields of medicine, physics, agriculture, environment, social sciences, data mining, and more [69]. Srinivasan et al. [70] used a four-layer multilayer perceptron to predict hourly load in a power system, while Nizami et al. [71] used 7-year data to predict electric energy consumed in Saudi Arabia using NN modeling with two layers. They used variables such as data, global radiation, and population to model the structure. Fang et al. [72] developed a NN model to estimate energy requirements for the reduction of cultivated wheat area. Aydinalp et al. [73] used a simple NN based energy consumption model for the Canadian residential sector. Ashhab [74] used NN to model energy demand in the transport sector. Zangeneh et al. [11] used NN to predict mechanization indices which are based on electricity (power) and energy consumption. Avami et al. [75] used a recurrent NN to forecast energy consumption as a function of the population and GDP of Iran. Nabavi et al. [34] used the LM learning algorithm to model GHG emissions with an ANN. Hosur [76] used an ANN with the feed-forward back propagation technique to model cost, where minimizing MSE can help increase the accuracy of the prediction. The parameters used in cost modeling were soil type, pH factor, nitrogen, phosphate, potassium, organic carbon, calcium, magnesium, sulphur, copper, iron and temperature. Gupta et al. [77] presented how ANNs can be used for a water quality index. Raut et al. [78] studied sustainable business performance using big data analytics and NN modeling. Coskuner [79] used NN modeling to predict the generation of domestic, commercial and construction wastes. Khushroo et al. [25] used NN to perform sensitivity analysis of energy inputs in crop production. Rohani et al. [80] used NN to predict tractor repair and maintenance cost. Jahani et al. [81] used a radial basis NN technique to model some aesthetic preference and mental restoration values in urban parks based on natural landscape characteristics. A process chart, shown in Fig. 2, was used to determine the most suitable MLFANN topology for determining energy consumption in this study.

Fig. 2 Process chart to attain optimal NN topology

2.2 Statistical Evaluation of NNs

Various statistical measures were used to compare the linear models and the NN models, such as the sum of squared errors shown in Eq. (3), the correlation coefficient “\(R\)” (given in Eq. 4) and the coefficient of goodness of fit \({R}^{2}\). The larger the value of R or \({R}^{2}\), the better the model estimation [38, 46, 82].

$$E=\frac{1}{2}{\sum}_{p=1}^{n}{({y}_{p}-{\widehat{y}}_{p})}^{2}$$
(3)
$$R=\frac{{\sum }_{i=1}^{N}\left({Y}_{i}-\overline{Y}\right)\left({\widehat{Y}}_{i}-\overline{Y}\right)}{\sqrt{{\sum }_{i=1}^{N}{\left({Y}_{i}-\overline{Y}\right)}^{2}{\sum }_{i=1}^{N}{\left({\widehat{Y}}_{i}-\overline{Y}\right)}^{2}}}$$
(4)

Further criteria for comparing the predictive accuracy of the models are the mean squared error (MSE) and the root mean squared error (RMSE), given in Eqs. (5) and (6), respectively, where \({P}_{i}\) and \({A}_{i}\) denote the predicted and actual values. Sensitivity analysis was carried out to determine which input variables are most important in the predicted model and to rank the variables by importance in energy modelling [25, 32].

$$MSE=\frac{1}{n}{\sum}_{i=1}^{n}{({P}_{i}-{A}_{i})}^{2}$$
(5)
$$RMSE=\sqrt{MSE}$$
(6)
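The evaluation measures in Eqs. (3) to (6) can be sketched in a few lines of Python. The energy values are made up for illustration, the function names are hypothetical, and the correlation function centres each series on its own mean (the usual Pearson form), which is an assumption about how Eq. (4) is applied.

```python
import math

def sse(actual, predicted):
    """Eq. (3): half the sum of squared errors."""
    return 0.5 * sum((y - p) ** 2 for y, p in zip(actual, predicted))

def correlation(actual, predicted):
    """Pearson correlation; each series is centred on its own mean (assumption)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def mse(actual, predicted):
    """Eq. (5): mean of the squared differences between predicted and actual."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Eq. (6): square root of the MSE, in the units of the target (MJ/ha)."""
    return math.sqrt(mse(actual, predicted))

# Illustrative energy values in MJ/ha, not survey data.
actual = [28000.0, 30500.0, 31000.0, 27500.0]
predicted = [28400.0, 30100.0, 31500.0, 27200.0]

r = correlation(actual, predicted)
r_squared = r ** 2
```

Because RMSE is expressed in the same units as the target, it is the easier of the two error measures to compare against typical per-hectare energy values.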

3 Survey Data and Methods to Determine Input Energy

3.1 Survey Region and Data Collection

The western region of Uttar Pradesh, India was chosen as the survey site for this study. This wheat-dominated area has 76 percent of its cultivated area under wheat and contributes 33 percent of wheat production in India. Additionally, the agricultural practices in this region are relatively more advanced than in the other regions of the state [64]. Data were collected through face-to-face interviews with 256 farmers from 4 districts (Hapur, Agra, Bareilly, Bulandshehar and Meerut), using a pretested questionnaire with open- and closed-ended questions and a random sampling technique, in the period 2017–2018. The information covered 5 main themes: farming tools and machines, farm inputs, output produced, cost of production and socio-economic information. Further information was retrieved from a literature review of district reports. The average farm size in the surveyed region was 2.8 ha and around 95% of farms were irrigated using electric motors. Almost 92% of farms were under farmer ownership and were within a 5 km radius of the market.

To calculate the energy consumed in wheat production on the surveyed farms, information was gathered for on-farm production only. Environmental sources of energy such as wind, water, rain, solar and synthesis were also excluded from the energy determination, as they would make the energy modeling intricate, and measuring the energy of such natural resources is highly scientific and beyond the scope of this study.

3.2 Methodology

Data processing was carefully undertaken to identify, determine and model energy consumption in wheat farming. Energy is consumed in farming systems in three ways, namely by source, by field operation and through indirect factors. Table 1 tabulates energy by source, energy from field operations and indirect factors affecting energy. The direct factors considered are diesel, electricity and human energy, and the indirect factors considered are fertilizers, pesticides, machinery and seeds [39]; each was multiplied by its respective energy conversion coefficient. The quantities of energy sources used on the farm, as inputs and in the various farming operations, were carefully recorded and calculated.

Table 1 Energy from sources, field operation and indirect factors
Table 2 Energy conversion coefficients from various studies

Tractors were the most important farm machinery used in the region. They were used on their own and with tractor-driven implements such as tiller, harrow, leveller, rotavator and cultivator. To understand the energy input from a tractor, it was important to understand the mass of the machine. This could be calculated using the weight of the machine, its working life span and the average surface area on which it is used [39]. The average life of the machine was drawn from Farmtech/Farmer’s guide, UP, the annual use of the machine was recorded in the survey of each farm, and the average weight of the machine was taken from Smil [14], who showed that there is a correlation between tractor mass and rated power (hp). The horsepower of the tractors used in the surveyed area was between 25 and 65 hp. Most farmers rented their tractors and hence used whichever tractor was available, compromising on horsepower. To calculate the energy from the tractor, the following formula from [24, 39, 62] was used.

$$ME=(G*E)/(T*Ca)$$
(7)

where ME is machine energy (MJ/ha); G is the weight of the implement (kg); E is the energy sequestered in agricultural machinery (MJ/kg); T is the economic life of the machine (h) and Ca is the effective field capacity (ha/h).

For calculation of Ca, the following equation was used:

$$Ca=\left(s*w\right)*FE/10$$
(8)

where s is ground speed (km/h); w is the width of the machine (m) and FE is field efficiency (%).
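Eqs. (7) and (8) can be checked with a short calculation. This is a sketch under stated assumptions: the function and parameter names are hypothetical, all numbers are placeholders rather than survey values, and field efficiency is entered as a fraction (0.8 for 80%), since 1 km/h × 1 m corresponds to 0.1 ha/h and the division by 10 then yields ha/h.

```python
def field_capacity(speed_kmh, width_m, field_efficiency):
    """Eq. (8): effective field capacity Ca in ha/h.

    field_efficiency is a fraction here (assumption), e.g. 0.8 for 80%.
    """
    return speed_kmh * width_m * field_efficiency / 10.0

def machine_energy(weight_kg, embodied_mj_per_kg, life_h, ca_ha_per_h):
    """Eq. (7): machine energy ME in MJ/ha."""
    return (weight_kg * embodied_mj_per_kg) / (life_h * ca_ha_per_h)

# Placeholder inputs for illustration only.
ca = field_capacity(speed_kmh=5.0, width_m=2.0, field_efficiency=0.8)
me = machine_energy(
    weight_kg=2000.0,
    embodied_mj_per_kg=100.0,
    life_h=10000.0,
    ca_ha_per_h=ca,
)
```

With these placeholders the implement covers 0.8 ha per hour, so the embodied energy of the machine is spread over 8,000 ha across its economic life.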

For the analysis in this research, the total tractor hours used per ha of land were calculated, inclusive of all the activities performed on the wheat crop through the season. The total number of tractor hours on wheat on each farm was then multiplied by the tractor energy coefficient per hour, which in this study was taken as 63 MJ/h [53, 57].

Various farming operations are carried out in wheat cultivation in India. Everything from land preparation to harvesting is done using various farm operations and farm machinery. The operations are tillage, planting, spraying, fertilizer distribution, harvesting and irrigation. Energy consumption in the farming operations was determined during primary data collection. Farmers were asked about the mode of each operation, the number of hours of operation, any human labor used, etc. Since most of the farm machinery was tractor driven, details of tractor use and of the diesel used to fuel the tractors were noted carefully for the energy analysis of farming operations in wheat.

The energy for irrigation came in the form of electricity-driven motors used to draw water for the crop. Hence, details of the electricity used to power the motor were noted, such as the number of hours the motor is operated, the horsepower of the motor, the frequency and duration of irrigation, the quantity of water applied and the number of times the wheat crop is watered in a season. For farms where canal-based irrigation was used, the distance from the source was also recorded. Energy consumption in irrigation was determined for water pumped to the land surface and for surface irrigation. Finally, the total energy used was the sum of the inputs used, each multiplied by its respective energy conversion coefficient, as given in Eq. (9), where \({I}_{i}\) is the quantity of input i and \({C}_{i}\) is its energy conversion coefficient.

$${E}_{j}={\sum}_{i=1}^{n}{I}_{i}{C}_{i}$$
(9)
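Eq. (9) amounts to a weighted sum over the inputs recorded for a farm. In the sketch below only the tractor coefficient of 63 MJ/h comes from the text; the other coefficients and all quantities are placeholders chosen for easy arithmetic, not the values of Table 2, and the dictionary keys are hypothetical names.

```python
def total_input_energy(quantities, coefficients):
    """Eq. (9): sum of each input quantity times its energy coefficient (MJ/ha)."""
    return sum(quantities[name] * coefficients[name] for name in quantities)

# Energy conversion coefficients in MJ per unit of input.
coefficients = {
    "tractor_h": 63.0,        # from the text: 63 MJ per tractor hour
    "diesel_l": 50.0,         # placeholder
    "electricity_kwh": 12.0,  # placeholder
    "urea_kg": 60.0,          # placeholder
}

# Placeholder per-hectare input quantities for one hypothetical farm.
quantities = {
    "tractor_h": 10.0,
    "diesel_l": 100.0,
    "electricity_kwh": 200.0,
    "urea_kg": 150.0,
}

energy_mj_per_ha = total_input_energy(quantities, coefficients)

# Share of each input in the total, as used when ranking energy sources.
shares = {
    name: quantities[name] * coefficients[name] / energy_mj_per_ha
    for name in quantities
}
```

Keeping quantities and coefficients in separate mappings makes it straightforward to swap in a different set of published coefficients without touching the survey quantities.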

Many researchers have calculated conversion coefficients for various inputs used in wheat cultivation, which have been summarized in Table 2 [26, 39, 53, 55, 58, 62].

Various indirect factors such as farmer’s attributes, social factors, geographical factors, financial factors, etc., were found to be significantly correlated with energy consumption and were thus considered here in input energy modeling [36, 39]. For energy modelling in this study, various ANN topologies were considered and finally MLFANN with two hidden layers was found to be most suitable and has been represented in Fig. 3 below.

Fig. 3 Neural Network model architecture used in this study

4 Results and Discussions

It was found that, on average, the energy consumed in wheat production is 29612.43 MJ/ha, with urea contributing almost 47%, followed by diesel at 32% and electricity at 10%. This is, however, lower than the input energy consumed in wheat production in other nations, and similar studies have found diesel and fertilizers to be among the main contributors to energy consumption in wheat production [39, 55, 62]. It was interesting to see the difference in energy consumed and in various farm attributes with respect to farm size, which is presented in Table 3. Compared to other studies, the average total energy used per hectare was relatively low; however, it was significantly higher on small farms (40,011 MJ/ha) than on large farms (31,895 MJ/ha). The major difference between small and large farms in percentage contribution to total energy intake was in electricity and tractor use. Large farms require more water for the crop and thus operate the electric motor for irrigation longer than small farms, and hence consume more kWh of electricity. Also, larger farms require larger or heavier tractors than small farms. Other inputs were proportionately equal per ha on small and large farms. As can be seen from Table 3, for both small and large farms the major portion of energy consumed came from urea, electricity and diesel (together almost 90% of energy on both farm types). Small farms consumed relatively more urea per ha than large farms, while large farms consumed relatively more diesel and electricity. Therefore, for energy conservation, it is necessary to focus more on fertilizer, electricity and fuel consumption than on other factors.

Table 3 Comparison of energy sources and energy consumed on small and large farms

4.1 ANN and Energy Modeling in Wheat Production

Principal component analysis (PCA) was used to extract the variables that explained the maximum variance from a total of 18 variables. Using varimax rotation, 17 components were extracted, accounting for almost 74.7% of cumulative variance. For the final modeling, the 9 inputs with the lowest correlation with each other were selected. The direct factors taken were urea, phosphate, potassium, diesel and electricity, and the attribute variables taken were ownership, farm size, experience and distance from the market. Out of a total of 202 respondents, 70% of the data was used for training the ANN and 30% for validation. No dataset was held out for testing because the data were insufficient. The two-hidden-layer MLFANN was trained using two algorithms, GDA and SCGD, in MATLAB and SPSS, and the covariates were preprocessed using appropriate scaling and normalized to [0, 1]. MSE and \({R}^{2}\) were used to determine the best topology [83]. The \({R}^{2}\) between actual and predicted energy for the various NN topologies is compared in Fig. 4. It can be seen that the MLFANN with 2 hidden layers, with 8 and 15 neurons in the first and second hidden layers respectively, gave the best results.
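The normalization to [0, 1] and the 70/30 split described above can be sketched as follows. The farm sizes are invented, the function names are hypothetical, and the seeded shuffle is an assumption, since the text does not say how respondents were assigned to the two sets.

```python
import random

def min_max_scale(column):
    """Rescale a list of values linearly so that min maps to 0 and max to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]

def train_validation_split(rows, train_fraction=0.7, seed=0):
    """Shuffle row indices reproducibly and cut them into two parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    valid = [rows[i] for i in indices[cut:]]
    return train, valid

# Invented farm sizes in ha, for illustration only.
farm_size_ha = [0.8, 2.8, 1.5, 6.0, 3.2, 0.5, 4.1, 2.0, 1.1, 5.0]
scaled = min_max_scale(farm_size_ha)
train, valid = train_validation_split(scaled)
```

Each covariate would be scaled column by column in the same way before being fed to the network.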

Fig. 4 MLFANN with different number of hidden layer/s, neurons and their \({R}^{2}\)

The sigmoidal activation function was used in the hidden layers and the output layer with the GDA training algorithm. The estimated energy consumption had an \({R}^{2}\) of 0.99 in the training dataset and 0.97 in the validation dataset (Fig. 5). The model was back propagated for 100 iterations to minimize the MSE to 0.0078 and the RMSE to 1328 MJ/ha in the validation dataset. The sum of squared errors was 0.06 and 0.0891 for the training and validation datasets, respectively. The plot of residuals against actual energy was horizontal, indicating a negligible relation between the residuals and the actual energy values, which supports the predictive accuracy of the MLFANN model obtained for energy consumption in wheat production.
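To make the 9-8-15-1 topology concrete, here is a minimal pure-Python sketch of a feed-forward network with sigmoid activations trained by back propagation. It is not the MATLAB or SPSS setup used in the study: it uses plain per-sample gradient descent with a fixed learning rate rather than GDA or SCGD, and the data, learning rate, initial weight range and function names are all assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_layer(n_in, n_out, rng):
    # One weight row per neuron; the last entry of each row is the bias.
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layers, x):
    # Returns the activations of every layer, starting with the input itself.
    acts = [x]
    for layer in layers:
        prev = acts[-1]
        acts.append([sigmoid(sum(w * a for w, a in zip(row, prev)) + row[-1])
                     for row in layer])
    return acts

def train_step(layers, x, target, lr):
    # One backpropagation update on one sample for E = 0.5 * (out - target)^2.
    acts = forward(layers, x)
    deltas = [(o - t) * o * (1.0 - o) for o, t in zip(acts[-1], target)]
    for k in range(len(layers) - 1, -1, -1):
        layer, prev = layers[k], acts[k]
        below = []
        if k > 0:
            # Error signal for the layer below, using weights before the update.
            below = [sum(d * row[j] for d, row in zip(deltas, layer))
                     * prev[j] * (1.0 - prev[j]) for j in range(len(prev))]
        for d, row in zip(deltas, layer):
            for j in range(len(prev)):
                row[j] -= lr * d * prev[j]
            row[-1] -= lr * d
        deltas = below

def mean_squared_error(layers, samples):
    return sum((forward(layers, x)[-1][0] - t[0]) ** 2 for x, t in samples) / len(samples)

rng = random.Random(0)
# 9 inputs, hidden layers of 8 and 15 neurons, 1 output.
layers = [init_layer(9, 8, rng), init_layer(8, 15, rng), init_layer(15, 1, rng)]

# Synthetic samples already scaled to [0, 1]; they stand in for the survey data.
samples = []
for _ in range(30):
    x = [rng.random() for _ in range(9)]
    samples.append((x, [0.6 + 0.3 * x[0]]))

mse_before = mean_squared_error(layers, samples)
for epoch in range(100):
    for x, t in samples:
        train_step(layers, x, t, lr=0.5)
mse_after = mean_squared_error(layers, samples)
```

Comparing `mse_before` with `mse_after` shows whether the 100 passes reduced the error on the synthetic data; with real data the same comparison would be made on the validation set.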

Fig. 5 \({R}^{2}\) of actual and predicted energy consumption from ANN in training data (a) and on validation data (b)

4.2 MLR and Energy Modeling in Wheat Production

In this study, ANN modelling of energy consumption in wheat production is compared with the multiple linear regression (MLR) technique. The linear model considered is given in Eq. (10) below, where the predicted output \({Y}_{i}\) is regressed on a set of predictors \({X}_{2},{X}_{3},\dots ,{X}_{k}\), with \({\beta }_{1}\) being the intercept and \({\beta }_{2},\dots ,{\beta }_{k}\) the partial coefficients.

$${Y}_{i}={\beta }_{1}+{\beta }_{2}{X}_{2}+{\beta }_{3}{X}_{3}+ \cdots \cdots \cdots +{\beta }_{k}{X}_{k}$$
(10)
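A linear model of the form in Eq. (10) can be fitted by ordinary least squares. The sketch below solves the normal equations with Gauss-Jordan elimination in plain Python; the data are generated from a known linear rule so the recovered coefficients can be checked, and the function names are hypothetical. The text does not state which estimation routine was used, so this is an assumption.

```python
def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Each row of X is one observation; a column of ones is prepended so that
    beta[0] plays the role of the intercept (beta_1 in Eq. 10).
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict_mlr(beta, x):
    """Intercept plus the sum of coefficient times predictor."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

# Illustrative data generated from y = 2 + 3*x1 - x2.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y = [2 + 3 * a - b for a, b in X]
beta = fit_mlr(X, y)
```

Predictions from `predict_mlr` can be scored with the same \({R}^{2}\) and RMSE measures as the network, which is what makes the two models directly comparable.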

Table 4 compares the ANN model and the MLR model on both the training and validation datasets, based on the \({R}^{2}\) and RMSE values attained. The \({R}^{2}\) value is greater and the RMSE is lower in both datasets for the ANN model than for the MLR model, which indicates the superiority of the ANN technique for energy modelling in wheat production.

Table 4 \({R}^{2}\) and RMSE of ANN and MLR model in training and validation

4.3 Sensitivity Analysis and Energy Prediction

It was imperative to determine which inputs are crucial in the energy modeling; for that, sensitivity analysis was performed, which measures the change in the output when an input variable is altered within a specified range. The results of the sensitivity analysis for energy consumption modeling are depicted in Fig. 6 below. The most important input in energy modeling is electricity, followed by urea and fertilizers. An important input in wheat cultivation is the timely availability of water, which in the region is provided by electric pumps that consume a large amount of electricity. Fertilizers (NPK), which boost the growth of crops, were used in more than the desired proportions, as farmers complained of stunted crop growth in the previous cycle and expected fertilizers to improve moisture and soil quality.

Fig. 6 Results of sensitivity analysis on input energy modeling
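One common way to carry out the kind of sensitivity analysis described above is a one-at-a-time sweep: each input is varied across its range while the others are held at a baseline, and the spread of the model output is recorded. The sketch below assumes that procedure; the text does not give the exact method, and `toy_model` with its coefficients is a made-up stand-in for the trained network.

```python
def sensitivity(model, baseline, ranges, steps=11):
    """Sweep each input over its range, others fixed; return output spread."""
    result = {}
    for name, (lo, hi) in ranges.items():
        outputs = []
        for s in range(steps):
            trial = dict(baseline)
            trial[name] = lo + (hi - lo) * s / (steps - 1)
            outputs.append(model(trial))
        result[name] = max(outputs) - min(outputs)
    return result

def toy_model(inputs):
    # Hypothetical stand-in for the trained network; coefficients are invented.
    return (5.0 * inputs["electricity"]
            + 3.0 * inputs["urea"]
            + 0.5 * inputs["farm_size"])

# Inputs are assumed to be normalized to [0, 1], with a mid-range baseline.
baseline = {"electricity": 0.5, "urea": 0.5, "farm_size": 0.5}
ranges = {name: (0.0, 1.0) for name in baseline}

spread = sensitivity(toy_model, baseline, ranges)
ranking = sorted(spread, key=spread.get, reverse=True)
```

The `ranking` list orders the inputs from most to least influential for the stand-in model, which is the form of result a sensitivity chart summarizes.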

4.4 Input Energy Forecasting

To forecast energy consumption in wheat production, the trained MLFANN model with two hidden layers described above was used with 95% confidence limits for various input combinations. This is shown in Fig. 7 for the training dataset and in Fig. 8 for the validation dataset. There are four lines in each plot: network output, desired output, and the high and low bounds of the confidence interval, which visualize the energy prediction of the final model with an error margin of ± 7889.83 MJ/ha on training data and around ± 3298.47 MJ/ha on validation data. The 95% prediction confidence means that there is only a 5% probability that the energy predicted by this model has an error of more than 3298.47 MJ/ha [83].

Fig. 7 ANN predicted, actual and 95% confidence interval for energy consumption based on training data

Fig. 8 ANN predicted, actual and 95% confidence interval for energy consumption based on validation data
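The text does not state how the 95% limits were computed. One simple possibility, shown here purely as an assumption, is to take 1.96 times the sample standard deviation of the residuals as the error margin and add and subtract it from each prediction, which presumes roughly normal residuals. The values and function names below are invented for illustration.

```python
import math

def error_margin(actual, predicted, z=1.96):
    """Half-width of an approximate 95% band: z times the residual std dev."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean_r = sum(residuals) / len(residuals)
    var_r = sum((r - mean_r) ** 2 for r in residuals) / (len(residuals) - 1)
    return z * math.sqrt(var_r)

def confidence_band(predicted, margin):
    """Low and high bounds around each prediction."""
    return [(p - margin, p + margin) for p in predicted]

# Invented energy values in MJ/ha, for illustration only.
actual = [28000.0, 30500.0, 31000.0, 27500.0]
predicted = [28400.0, 30100.0, 31500.0, 27200.0]

margin = error_margin(actual, predicted)
band = confidence_band(predicted, margin)
```

Plotting the two bounds in `band` alongside the network output and the desired output would give the four lines described for Figs. 7 and 8.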

This study has attempted to model the input energy consumed in wheat production in India using the ANN technique, based on several direct and indirect factors that contribute to energy consumption behavior. Accurate modeling gives an indication and direction for altering energy consumption, making it more efficient in wheat production and addressing energy poverty and food security issues in India. Some variables in the final model are fixed and cannot be changed; they reflect farming conditions such as crop area and farmer’s education. However, variables such as N, P and irrigation frequency can be optimized to achieve the desired level of input energy. The MLFANN model derived in this study has greater predictive accuracy and can be used to compare energy use on farms effectively, which can help farmers identify crucial inputs and then explore inputs that have the potential to reduce energy costs in farming. Agricultural scientists and policy makers can use this model to estimate energy consumption in other wheat regions in India, especially those with similar social attributes in farming such as farm sizes and farmer’s education.

5 Conclusion and Discussions

This study presents an application of the ANN technique to model and predict input energy consumption in wheat production in India. Compared to other studies, the average total energy used per hectare was found to be relatively low; however, it was significantly higher on small farms than on large farms. The use of electric motors, which cover 95% of irrigation in the region, contributed significantly to energy modeling. Sensitivity analysis found that the most important factor in energy consumption modeling is electricity, followed by urea, phosphate and potassium fertilizer. The results of this study showed that the ANN technique outperformed the MLR method in modelling energy input based on direct factors and certain indirect factors such as farm and farmer’s attributes. The results can be generalized to areas at the same latitude, especially in other states such as Haryana and Punjab, where wheat is an important crop and the production techniques, farmers, farms and other characteristics are similar, in order to optimize energy consumption, especially in an energy-vulnerable sector such as agriculture, and to ensure food security and environmental sustainability.