1 Introduction

Agriculture [1] is essential to the economy in many ways, including providing food, raw materials, and fibers and maintaining economic autonomy. It faces significant challenges in irrigation, fertilization, and crop rotation. Farmers have been experimenting with their crops since the dawn of agriculture [2], searching for plants with desired characteristics such as resistance to drought or pests. As populations grow, urban regions are finding it increasingly difficult to deal with water scarcity. Environmental issues such as soil erosion and inappropriate irrigation complicate the situation further, and the effects of global warming on temperature, weather, and precipitation patterns may also hurt agricultural productivity. IoT and machine learning are essential components of an agro-environment monitoring system. Sensors connected through IoT provide the crucial information needed for crop production forecasts, fertilizer management, smart irrigation, crop monitoring, crop disease diagnosis, and pest control [3, 4]. Applications of agriculture environment monitoring using IoT are shown in Fig. 1.

Fig. 1 Applications of agriculture environment monitoring

Many businesses today consider agriculture a significant part of their production activities. Businesses operating in the agriculture industry [5] often attempt to maximize the revenues generated by production. Over the last ten years, farmers have depended on soil characteristics to identify seasonal crops based on field variations. For lack of appropriate technology, however, this information could not be used successfully to target and optimize agricultural inputs, especially in large-scale farming [6].

Since the industrial revolution, manufacturers have modelled their manufacturing processes after agricultural practices. When it comes to agricultural practices, the majority of companies are only concerned with one thing: maximizing their profits. Over the last ten years, farmers have depended on soil characteristics to identify seasonal crops based on field variations. On the other hand, given the absence of the appropriate instruments, it was impossible to put this knowledge to good use to locate and optimize agricultural inputs, particularly on a large scale [7].

Precision agriculture [8] contributes to the resolution of these issues by cutting down on the amount of water consumed during agricultural production and increasing crop yields through the ongoing monitoring of soil moisture, humidity, temperature, and pH. Because precision agriculture can prescribe the appropriate ratio of water to fertilizer to paste, it may be possible to link increased crop yields to this technique [9]. In many industrialized countries, precision farming relies heavily on the use of equipment that is connected to the Internet of Things (IoT). Smart farming is represented in Figs. 2 and 3 [Source: https://www.renkeer.com/agricultural-environment-monitoring/].

Fig. 2 IoT enabled smart farming

Fig. 3 Agricultural environment monitoring system

Machine learning [10] refers to the capability of gaining knowledge via experience. The algorithms learn from the input datasets by being trained progressively on samples, with their model parameters adjusted along the way to improve performance. Industry professionals consider ML approaches preferable for addressing nonlinear problems based on data from sensors or other sources.

By modelling hypothetical outcomes on real ones, it provides individuals with the ability to make judgments with little input from other humans. Machine learning-based algorithms are increasingly being used in every sector of the economy. However, the accuracy is impacted by the data quality, which is described below. As a direct consequence of this, the techniques of machine learning substantially depend not only on the representations of datasets but also on target variables.

A number of plant diseases, together with the lack of an effective and efficient pesticide control mechanism, have a substantial influence on crop yield. IoT-enabled sensors can capture images of diseased plants so that an appropriate and timely remedy can be provided to protect the plant. Together, the Internet of Things (IoT) and image preprocessing give plant pathologists the assistance they need to diagnose disease precisely [11, 12].

The amount of fertilizer that is applied to a crop throughout the growing process has a significant bearing on the final product. When determining when and how to apply pesticides on their crops, growers of all types need to take into consideration the specific qualities of the plants they are cultivating. Despite the fact that they have a detrimental impact on plant nutrition, chemical fertilizers continue to play a key part in the development of a wide variety of illnesses that were not predicted [13]. The quality of the soil is believed to be of the utmost importance to farmers since it directly impacts the yield of their crops. As part of the process of precision agriculture, the detection and monitoring of soil moisture is done automatically. For the purpose of keeping track of the condition of the soil, some of the most cutting-edge examples of Internet of Things (IoT)-based agricultural monitoring equipment now available on the market are gamma-radiometric soil sensors, a soil moisture sensor, and an electrical conductivity sensor. In addition to recording the weather, the additional sensors that are used to monitor the environment also record the physiological state of the crops [14]. Multi linear regression along with feature selection is quite good in classification as far as accuracy and precision is concerned.

The farmer has to determine as soon as possible the rough estimate of the quantity of water that will be required for cultivation. The amount of water that a region gets is influenced by a vast number of factors, some of which include the climate, the season, the kind of soil, the variety of crops grown, and the stage of development. When a crop is being cultivated, water can frequently be lost from it due to processes such as transpiration and evaporation.

The implementation of innovative practices led to improvements in both the quantity and quality of the food that was made available. Many established practices, however, have been considered acceptable for years, decades, or even centuries. To increase agricultural yields, it is vital to focus on inputs such as fertilizers and seeds. A new approach is needed to fill a gap that cannot be filled using old techniques.

Food insecurity has been related to a broad variety of unfavorable health effects, including infertility, hastened ageing, improper insulin management, and more. The societal acceptance of such technology leads to the creation of bioorganic food products. Traditional agricultural practices have been refined over decades of research, and further improvements now have little impact on output. To overcome these obstacles and make progress toward cultural and urban farming, it is now necessary to use cutting-edge methods based on sensing technology.

This article presents an IoT-based smart fertilizer recommendation system for precision agriculture. The framework uses IoT devices and sensors to acquire agriculture-related data, and then applies machine learning to suggest the correct quantity of fertilizer at the appropriate time. The data acquisition phase collects input data including soil temperature, soil moisture, soil humidity, regional weather data, and crop details. Features are selected using the Sequential Forward Floating Selection algorithm. Data classification is performed by multiple linear regression (MLR). The performance of SFSS-MLR is compared with that of the Random Forest, C4.5, and Naïve Bayes algorithms. SFSS-MLR performs better in terms of accuracy, precision, recall, and F1.

2 Literature survey

The nutrients that plants need in order to flourish may be found in the soil. If any of these minerals in the soil are missing, growth will be significantly hampered. Regular testing of the soil's composition is essential in order to guarantee that plants are receiving the nutrients they need. If and when a nutrient deficiency is discovered in the soil, the problem may be corrected by applying a fertilizer that is high in the nutrient to the soil. Fertilizers have had a considerable positive influence on agricultural output, but their extensive usage has had a negative effect that has been destructive to the ecology. Because of this, soil nutrient testing is an important instrument in the field of agriculture. The conventional methods of soil testing, despite the fact that they provide accurate data, are not suitable for use in precision agriculture owing to the high costs involved and the protracted amount of time required for results to be obtained. Traditional testing is restricted in its ability to measure the geographic heterogeneity of a field since the expense increases with an increase in the number of samples that are tested. For this reason, it is essential to have access to methods that are not only fast and portable, but also economical and capable of producing results with a high degree of precision.

A wide variety of sensors, such as those that measure moisture levels, plant vitality, and soil texture, are used to gather information about crops and soil. One of the most prevalent applications of remote and proximal sensing technologies is the collection of high-density data that provides information about an area. Remote sensing technologies use sensors installed on aerial systems or spacecraft to collect data from a field; in contrast, proximal sensing technologies collect data by placing sensors at close range or in direct contact with the surface of the field.

The use of nonrenewable resources, such as soil, is essential to the maintenance of agricultural output. The soil is a highly dynamic and complex system that varies substantially in both space and time. These changes may be observed in a number of different ways. The condition of the soil is an essential component to the practice of sustainable agriculture. When soils have reached ecological balance, they are able to sustain plant and animal production, maintain and enhance the quality of air and water, and nurture the well-being of both plants and animals. The presence of soil is necessary for the development of plants, which are then used in the production of food for humans and animals, textiles, and fossil fuels. The quality of the soil may either deteriorate over time or improve as a result of a mix of natural and human-caused causes.

Agriculture is a vital industry that requires careful management in conjunction with expanding people. The authors [15] state that the modern agricultural system is highly mechanized but still significantly depends on human labor. As an example, between the years 1920 and 1970, a thirty percent investment generated a one hundred eighty percent return. In addition, the increase in productivity was not the result of an increase in the amount of data sources that were utilized; rather, it was the result of advancements in productive farming. Researchers have recently shown that elements such as sifting machines, mechanical innovation, and synthetic manures all have a role in determining agricultural profitability. Farmers have been more reliant on different types of technological communication and information storage over the course of the last decade. This has allowed them to better monitor their interactions with third parties and monitor their financial data. These days, having access to information is often seen as essential to normal life. As a result, the horticulture industry makes it easier for farmers to compile data and conduct statistical analyses based on field observations. Accurate information may be effectively disseminated via the use of a wide variety of sensors, as well as agricultural and meteorological machinery.

Authors [16] presented a number of proprietary approaches as potential ways to enhance agricultural monitoring. Researchers uncovered more complex frameworks in their research on the subject of keeping track of geographical regions and climatic fluctuations. Farm Management Information Systems, often known as FMIS, have developed throughout time to concentrate on the specific activities and needs that are typical of farms. At the moment, these frameworks are working their way into the contemporary age of the Internet by making use of established systems administration and replies to strengthen agricultural structures. In spite of this, it is a commonly held belief that the Internet has a number of flaws, most notably in the area of managing the enormous number of linked devices, whether they be devices associated with the internet of things (IoT) or devices used by stakeholders (stakeholder devices). At the same time, there is not a standardized solution that can guarantee reliable interoperability between the relevant authorities and interested parties. Since that time, people have been relying on the frameworks provided by Future Internet (FI) to fill up these gaps.

With the assistance of sensors, farmers can at last exercise the kind of thorough control over their agricultural yield that has long been urged. Researchers have outlined the fundamental purpose of sensor technology as well as its significant contributions to the field of agriculture. The sensor developed by Shining Li et al. is used in the Precision Agriculture Monitor System (PAMS) to keep an eye on how farming is being carried out. To increase production by reinforcing socioeconomic elements, it is advised that farmers adopt the IFarm Framework system, which may help them monitor and regulate how much water they use. Anisi M.H. et al. [17] categorized sensor technology by determining how well it performed on a number of different characteristics.

According to Csoto [18], the key sources of precision agricultural information for 14 cotton-growing southern states in the United States include farm-based equipment, crop consultants, university extension, the media, and government organizations. The proliferation of precision agriculture technology is influenced in a variety of different ways by data coming from a wide variety of sources. One such example is the widespread utilization of yield monitors that come pre-fitted with both global positioning system (GPS) and soil survey map technology. However, input from dealers has a significant effect on the degree to which technologies such as zone soil sampling and soil survey maps are implemented.

In a great number of emerging countries, the availability of agricultural water is decreasing. Even the sections of the booming economy that use the least amount of water nonetheless need a substantial amount of it. For this reason, it is vital to create effective techniques of agricultural production, generate the required amounts of food, and execute critical plans for collecting and distributing rainfall, as well as making arrangements for the use of artificial irrigation systems. Rainwater collection and efficient irrigation techniques need to get more attention if we are to preserve our available water supplies. The quantity of water that is used by each industry has been continuously growing throughout the course of time. The length of time rain continues to fall is mostly dependent on sporadic climate shifts and other factors, such as evaporation and transpiration [19]. It is very necessary to evaluate the climatic factors that are associated to the production of agricultural goods in order to prevent global water shortages and food shortages.

For a Wireless Sensor Network (WSN) [20] scenario, ZigBee [21] and [22] deploy multiple sensor nodes in a restricted location. The authors also proposed a hybrid heuristic approach that combines genetic algorithms with decision trees in order to find the optimum decision tree for modelling farmer behavior and predicting irrigation events. The strategy was applied over a whole irrigation area. According to the findings, the best models were able to accurately predict between 68 and 100% of the observed irrigation events, while also predicting between 93 and 100% of the unexpected ones.

Fertilizer use during crop growth affects the end result. All producers must consider the plant's characteristics while applying pesticides. Chemical fertilizers contribute to the development of several unexpected diseases notwithstanding their negative effects on plant nutrition. Farmers value soil quality because it affects agricultural output. Precision agriculture automatically monitors soil moisture. Gamma-radiometric soil sensors, soil moisture sensors, and electrical conductivity sensors are some of the latest IoT-based agricultural monitoring equipment for soil monitoring. The environment sensors also capture crop physiological status. Multi linear regression and feature selection classify well.

3 Methodology

This section introduces an IoT-based smart fertilizer recommendation system. The platform collects agricultural data using IoT devices and sensors, and then applies machine learning to recommend fertilizer in the right amount and at the right time. The data collection phase gathers input data such as crop information, soil temperature, soil moisture, and soil humidity. Data are acquired using moisture and humidity sensors. The Sequential Forward Floating Selection method is used to choose the features. Multiple linear regression (MLR) is used for data classification. The performance of SFSS-MLR is compared with that of Random Forest, C4.5, and the Naive Bayes method. The methodology is presented in Fig. 4.

Fig. 4 IoT based smart fertilizer recommendation system

To accelerate model building and to increase the accuracy of detection and classification, many researchers resort to feature selection techniques. The fundamental goal of any feature selection algorithm is first to determine the features most relevant to the problem at hand using a selection criterion, and then to filter out features that are unnecessary or that duplicate other features. The work presented here selects features using Sequential Forward Floating Selection (SFFS) [23].
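The floating search idea behind SFFS can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the `score` callable is a hypothetical stand-in for whatever selection criterion is chosen (the text does not state it; cross-validated accuracy would be one common option), and `toy_score` with its `WEIGHTS` is invented purely so the example runs on its own.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Subset = FrozenSet[int]


def sffs(n_features: int,
         score: Callable[[Subset], float],
         max_size: int) -> Dict[int, Tuple[Subset, float]]:
    """Return the best subset found for each size from 1 to max_size."""
    max_size = min(max_size, n_features)
    best: Dict[int, Tuple[Subset, float]] = {}
    current: Subset = frozenset()
    while len(current) < max_size:
        # Forward step: add the single feature that maximizes the criterion.
        current = max((current | {f} for f in range(n_features)
                       if f not in current), key=score)
        s = score(current)
        if len(current) not in best or s > best[len(current)][1]:
            best[len(current)] = (current, s)
        # Floating step: drop a feature as long as the reduced subset beats
        # the best subset already recorded at that smaller size.
        while len(current) > 2:
            reduced = max((current - {f} for f in current), key=score)
            r = score(reduced)
            if r <= best[len(reduced)][1]:
                break
            current = reduced
            best[len(current)] = (current, r)
    return best


# Hypothetical criterion: feature 0 is strongest alone, but features 1 and 2
# are worth more together than any pair that contains feature 0.
WEIGHTS = [10, 8, 8, 1]


def toy_score(subset: Subset) -> float:
    total = sum(WEIGHTS[f] for f in subset)
    if {1, 2} <= subset:
        total += 6  # synergy bonus, invented for the example
    return total


selected = sffs(n_features=4, score=toy_score, max_size=3)
```

In this toy case plain forward selection would settle on features {0, 1} as the best pair, whereas the floating step revisits size two after reaching {0, 1, 2} and replaces the pair with {1, 2}. That backtracking is the difference between SFFS and plain sequential forward selection.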

Although it is a simple statistical technique rather than an advanced machine learning method, linear regression is often used as a foundation for more advanced prediction models because it is so straightforward to implement. Linear regression is used across many research fields, most notably economics, and in statistical analysis more generally. The associations between the independent and dependent variables are analysed, and a prediction model is built by fitting the regression model to the collected data. The variables in the study can also be used to test structural hypotheses, in which the independent variables are assumed to affect the variable being explained. MLR [24] evaluates the statistical regression hypothesis that the value of a dependent feature in a dataset depends on a set of independent features, in order to determine whether this hypothesis holds. For the regression technique to function appropriately, the dependent variable must be continuous.

The independent variables, on the other hand, may be either continuous or discrete. Linear regression is a kind of linear modelling that gets its name from the fact that the fitted regression line is straight, so the assumed linearity of the relationships between the variables is important to keep in mind. The regression model is one of the most popular choices for predicting crop yield because of the correlation between the dependent variable, yield, and independent features such as fertilizers used, area of production, weather factors, and irrigation parameters. This correlation is what makes regression models effective for the task. In crop yield prediction models, the crop yield itself is the dependent feature of the dataset, also known as the response feature, while the other parameters are the independent features. The MLR model is given in Eq. 1.

$$ y_{i} = \beta_{0} + \beta_{1} x_{i1} + \beta_{2} x_{i2} + \beta_{3} x_{i3} + \cdots + \beta_{k} x_{ik} + \varepsilon_{i} $$
(1)

where k is the number of features used in the data set, xik is the ith observation of the kth feature, β1, β2, β3, …, βk are the regression coefficients of the independent features, β0 is the intercept, and εi is the residual (error) of the ith observation.
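To make the roles of β0, β1, …, βk concrete, the sketch below estimates them by ordinary least squares. It is an illustration under stated assumptions rather than the implementation used in this work: the function names `fit_mlr` and `predict` are hypothetical, the coefficients are obtained by solving the normal equations with Gauss-Jordan elimination, and the small data set is invented so that the true coefficients are known.

```python
from typing import List, Sequence


def fit_mlr(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return [beta_0, beta_1, ..., beta_k] minimizing the squared residuals."""
    rows = [[1.0, *x] for x in X]  # leading 1 carries the intercept beta_0
    p = len(rows[0])
    # Augmented normal-equation system [X^T X | X^T y].
    A = [[sum(r[a] * r[b] for r in rows) for b in range(p)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))]
         for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; coefficients not unique")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(beta: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate beta_0 + beta_1 * x_1 + ... + beta_k * x_k for one observation."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Invented data generated from y = 1 + 2*x1 + 3*x2 with no noise.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
beta = fit_mlr(X, y)
```

Because the invented data contain no noise, the fitted coefficients recover the intercept 1 and slopes 2 and 3 up to floating-point error. With real sensor data the residuals εi would be non-zero.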

4 Result and discussion

A data set is created by collecting related data from [25]. This data set contains soil temperature, soil moisture, soil humidity, soil type, crop type, nitrogen, potassium, phosphorus, and fertilizer name. Features are selected using the Sequential Forward Floating Selection algorithm. Data classification is performed by multiple linear regression (MLR). The performance of SFSS-MLR is compared with that of the Random Forest, C4.5, and Naïve Bayes algorithms. The Weka tool is used to implement the proposed methodology. The results achieved are shown in Table 1 and Fig. 5. SFSS-MLR performs better in terms of accuracy, precision, recall, and F1.

Table 1 Result comparison of SFSS MLR technique

Fig. 5 Result comparison of SFSS MLR technique with C4.5, NB and RF techniques

Evaluation metrics such as accuracy, precision, recall, and the F1 measure are used to determine whether a classification model is effective, and they allow the performance of different classifiers to be compared. In a binary classification task it is usual practice to calculate several distinct quantities, including the percentage of correct classifications, the number of false positives, and the number of false negatives. "Positive instances" are the examples labelled as positive, and "negative instances" are the examples labelled as negative.

True positive (TP): Positive instances that are correctly classified as positive.

False negative (FN): Positive instances that are incorrectly classified as negative.

True negative (TN): Negative instances that are correctly classified as negative.

False positive (FP): Negative instances that are incorrectly classified as positive.
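The four counts can be tallied directly from paired actual and predicted labels. The sketch below follows the usual definitions, in which false negatives and false positives are the misclassified instances. The function name `confusion_counts` and the example labels ("urea" versus "other") are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence, Tuple


def confusion_counts(actual: Sequence[Hashable],
                     predicted: Sequence[Hashable],
                     positive: Hashable) -> Tuple[int, int, int, int]:
    """Return (TP, FN, TN, FP) treating `positive` as the positive class."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1  # positive instance predicted positive
            else:
                fn += 1  # positive instance predicted negative
        else:
            if p == positive:
                fp += 1  # negative instance predicted positive
            else:
                tn += 1  # negative instance predicted negative
    return tp, fn, tn, fp


# Invented labels for illustration only.
actual = ["urea", "urea", "other", "other", "urea"]
predicted = ["urea", "other", "other", "urea", "urea"]
tp, fn, tn, fp = confusion_counts(actual, predicted, positive="urea")
```

For a multi-class target such as fertilizer name, the same tally can be repeated once per class, treating that class as positive and all other classes as negative.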

$$ \text{Accuracy} = \frac{\text{TP} + \text{TN}}{N} $$
$$ \text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}} $$
$$ \text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}} $$
$$ \text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} $$

where N = TP + TN + FP + FN is the total number of classified instances.
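The four formulas translate directly into code. The following is a minimal sketch with a hypothetical function name, `classification_metrics`, and invented counts. Returning 0.0 when a denominator is zero is an assumption made here for robustness and is not something the text specifies.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 from the four counts."""
    n = tp + fn + tn + fp  # total number of classified instances
    accuracy = (tp + tn) / n if n else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented counts: 20 instances, 17 of them classified correctly.
metrics = classification_metrics(tp=8, fn=2, tn=9, fp=1)
```

With these invented counts the accuracy is 17/20 = 0.85, the precision is 8/9, the recall is 8/10, and the F1 score is 16/19, which is the harmonic mean of precision and recall.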

The precision, recall and F1 score of SFSS MLR are also higher than other machine learning techniques for agriculture environment monitoring. The accuracy of SFSS MLR is 99.3 per cent.

5 Conclusion

Because precision agriculture can prescribe the appropriate ratio of water to fertilizer to paste, it may be possible to link increased crop yields to this technique. Precision agriculture reduces the amount of water used in agricultural production and increases crop yields through the ongoing monitoring of soil moisture, humidity, temperature, and pH. This article presents the design and development of a smart agriculture environment monitoring system. Instability in access to food has been linked to a wide range of adverse health impacts, including infertility, accelerated ageing, poor insulin regulation, and several other conditions. Both the amount and the quality of available food improved significantly as a result of the deployment of novel practices. The development of bioorganic food items results directly from society's acceptance of such technology. During the data acquisition phase, input data such as soil temperature, soil moisture, and soil humidity, as well as regional meteorological data and crop details, are collected. The Sequential Forward Floating Selection algorithm is used for feature selection, and the data are classified using multiple linear regression. The performance of SFSS-MLR is compared with that of Random Forest, C4.5, and the Naive Bayes algorithm. SFSS-MLR outperforms these techniques in accuracy, precision, recall, and F1 score for agriculture environment monitoring. The accuracy of SFSS-MLR is 99.3 percent, its precision and recall are 99 percent, and its F1 score is 99.3 percent. In future, SFSS-MLR can be tested on larger data sets and on data sets of different sizes. Its performance also needs to be tested in a federated learning environment.