Introduction

Global agricultural food production has to increase by at least 70% to meet the needs of the increasing world population (Ahirwar et al., 2019). This is a challenging goal because the agricultural sector depends largely on conditions that are not fully controlled such as the weather, soil condition and the quality and quantity of irrigation water. Therefore, it is crucial to adopt precision technologies such as drones to make optimum use of resources and improve agricultural productivity.

Drones have been used for diverse purposes in precision agriculture, and new ways of using them are still being explored. Applications have been developed for pest detection, crop yield prediction, crop spraying, yield estimation, water stress detection, land mapping, identifying nutrient deficiency in plants, weed detection, livestock control, protection of agricultural products and soil analysis (Celen et al., 2020).

Plant disease detection is one of the application areas of drones and has been investigated extensively (Veroustraete, 2015). One of the benefits of using drones is the early detection of diseases and the prevention of the spread of infection to mitigate crop loss (Kitpo and Inoue, 2018). Previous review studies on the application of drones in precision agriculture are limited in scope. Mogili & Deepak (2018) reviewed the literature on the types of drones and their use for spraying pesticides. Likewise, Devi & Priya (2021) focused on the recognition of plant diseases using images captured by drones. Decision-support systems using drones can lead to better decisions, increase production, improve the quality of products and save labor (Sinha, 2020).

Drones are used for many different crop types and diseases. Some diseases show visible symptoms, while other diseases can only be detected by measuring temperature. Although drones have proven to be promising tools for disease detection, a detailed systematic overview of the state-of-the-art on the adoption of drones for disease detection is lacking. Recently, a large number of studies have been published on the application of drones. For example, pesticide spraying and crop monitoring studies have been reviewed recently (Hafeez et al., 2022); however, the review was not conducted systematically. Similarly, a traditional review of drone technology for sustainable weed management has been published, but it is limited to weed management (Esposito et al., 2021). Some recent studies explain the design of drones for precision agriculture (de Oca and Flores, 2021; Hajare et al., 2021).

The objective of this study was to present a systematic review of the literature on disease detection using drones. The systematic literature review (SLR) presented in this paper is different from traditional reviews and aims to identify all relevant scientific papers related to the main theme of this study.

Background and related work

Disease detection

Traditional disease detection in farming relied on naked-eye observation, which is time-consuming and expensive and requires a lot of expertise (Sandhu & Kaur, 2019). Currently, there are many methods to detect diseases in agricultural crops. Two main categories can be distinguished: direct and indirect detection methods. Direct detection methods consist of polymerase chain reaction, fluorescence in-situ hybridization, enzyme-linked immunosorbent assay, immunofluorescence and flow cytometry. Indirect detection methods consist of thermography, fluorescence imaging, hyperspectral techniques and gas chromatography. The focus of this study lies within the indirect detection methods, especially thermography and hyperspectral techniques that are supported by drones. Thermography is based on differences in the surface temperature of plant leaves and canopies. Hyperspectral imaging can measure the changes in reflectance that result from changes in biophysical and biochemical characteristics upon infection. Indirect methods can be used to identify biotic, abiotic and pathogenic diseases (Fang & Ramasamy, 2015).

Related work

Many studies on disease detection using drones have been undertaken. These studies have had different foci, and different diseases and detection methods have been investigated. In Table 1, an overview of the current literature reviews and surveys on disease detection by drones is presented.

Table 1 Related review works on the use of drones for precision agriculture

Methodology

To perform this study, an SLR protocol was followed. The SLR covers the last decade of papers on this research topic and answers several research questions that were defined in this study. Data were extracted from the selected primary studies and analysed to respond to the research questions.

As there is no overview of the published papers on disease detection using drones, the goal of this research was to present an overview of the state-of-the-art. Wright et al. (2007) designed a guideline to conduct an SLR, which was followed in this study.

Research questions

The motivations for the SLR were to identify the:

  1. diseases detected by drones,
  2. drone type used in disease detection,
  3. actors/stakeholders involved in disease detection by drones,
  4. executed tasks in disease detection,
  5. main parameters for stakeholders to work with,
  6. techniques used to support decision-making in disease detection by drones,
  7. product types that drones are used for, and
  8. problems of using drones in disease detection.

Following these motivations the following research questions were formulated:

  1. What kind of diseases are detected by using a drone?
  2. What is the drone type used in disease detection?
  3. What are the actors/stakeholders involved in disease detection by drones?
  4. Which tasks are executed to support decision-making in disease detection by drones?
  5. Which data are generated by drones to support disease detection?
  6. Which techniques are used to support decision-making in disease detection by drones?
  7. For which agriculture product types are drones used for disease detection?
  8. What are the challenges to the application of drones in disease detection?

Search string

To determine the search string, a number of test searches were done on Scopus and ACM with the following terms:

(drone OR uav) AND disease AND detection.

(drone OR uav) AND “disease detection”.

(drone OR uav) AND (“bacteria*” OR “fun*” OR “vir*” OR blight OR wilt OR rot) AND detection.

Based on the insights obtained, the following search string was used to carry out the SLR:

(Drone OR UAV) AND Disease AND Detection.
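The Boolean logic of the final search string can be illustrated with a minimal sketch. The function names below are hypothetical, and the matching rule (case-insensitive whole words, with a crude tolerance for a trailing plural "s") is an assumption made for illustration; each database applies its own field restrictions and stemming rules, which this sketch does not reproduce.

```python
import re


def normalized_tokens(text: str) -> set[str]:
    """Lower-case the text, split it into words and drop one trailing 's'.

    Dropping the trailing 's' is a crude plural rule assumed for this sketch
    only; it is not how any particular database stems query terms.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return {w[:-1] if w.endswith("s") else w for w in words}


def matches_search_string(text: str) -> bool:
    """Evaluate (Drone OR UAV) AND Disease AND Detection on a piece of text."""
    tokens = normalized_tokens(text)
    has_platform = bool(tokens & {"drone", "uav"})
    return has_platform and "disease" in tokens and "detection" in tokens


if __name__ == "__main__":
    # Hypothetical titles, used only to exercise the logic.
    for title in (
        "UAV-based detection of wheat disease",
        "Drones for crop spraying",
        "Detection of grape diseases from satellite images",
    ):
        print(title, "->", matches_search_string(title))
```

The sketch shows why the query is sensitive to phrasing: a record that describes the platform with a synonym outside the first group, or that omits one of the two mandatory terms, is not retrieved.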

The databases selected were ScienceDirect, IEEE, ACM Digital Library, Springer, Wiley and Scopus.

Selection criteria

The first step of selection was filtering by reading the title, abstract and introduction in order to determine if the paper is relevant. The second step was to exclude papers using the exclusion criteria given in Table 2.

Table 2 Exclusion criteria

Quality Assessment

All of the selected papers were scored based on eight quality assessment questions given in Table 3 (Kitchenham et al., 2009). Papers were assigned either 1 (good quality), 0.5 (fair quality), or 0 (bad quality) scores. Papers with a total score lower than four were excluded from the research.
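The scoring rule described above can be sketched as follows. The function and constant names are hypothetical; the rule itself (eight questions, scores of 1, 0.5 or 0, exclusion when the total is lower than four) is taken from the text, and the example score lists are invented for illustration rather than drawn from the reviewed papers.

```python
ALLOWED_SCORES = {0.0, 0.5, 1.0}  # bad, fair and good quality
NUM_QUESTIONS = 8                 # quality assessment questions (Table 3)
THRESHOLD = 4.0                   # totals lower than this are excluded


def total_quality_score(scores: list[float]) -> float:
    """Sum the per-question scores after checking that they are well formed."""
    if len(scores) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} scores, got {len(scores)}")
    if any(score not in ALLOWED_SCORES for score in scores):
        raise ValueError("each score must be 0, 0.5 or 1")
    return sum(scores)


def is_included(scores: list[float]) -> bool:
    """A paper is kept as a primary study unless its total is lower than four."""
    return total_quality_score(scores) >= THRESHOLD


if __name__ == "__main__":
    # Invented score lists, used only to exercise the rule.
    print(is_included([1, 1, 1, 1, 0, 0, 0, 0]))            # total of exactly 4
    print(is_included([0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0]))  # total of 3.5
```

Note that a paper scoring exactly four is retained, because only totals lower than four lead to exclusion.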

Table 3 Quality assessment questions

Table 4 presents the distribution of papers based on databases where they were found at different selection stages. After the initial search, 1852 papers were retrieved, of which 58 remained after applying the selection criteria. After quality assessment, 38 papers were selected as primary studies. The 38 papers were carefully read in full and the required data for answering the research questions were extracted.
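The retention at each stage follows directly from the counts reported above. The short sketch below, with hypothetical names, computes the share of papers kept at each step from the three totals stated in the text (1852, 58 and 38); the per-database breakdown of Table 4 is not reproduced here.

```python
def retention(before: int, after: int) -> float:
    """Fraction of papers that survive a selection stage."""
    if before <= 0:
        raise ValueError("the number of papers before a stage must be positive")
    return after / before


# Stage totals as reported in the text.
STAGES = [
    ("initial search", 1852),
    ("after selection criteria", 58),
    ("after quality assessment", 38),
]

if __name__ == "__main__":
    for (_, previous_count), (name, count) in zip(STAGES, STAGES[1:]):
        share = retention(previous_count, count)
        print(f"{name}: {count} of {previous_count} kept ({share:.1%})")
    overall = retention(STAGES[0][1], STAGES[-1][1])
    print(f"overall: {overall:.1%} of the retrieved papers became primary studies")
```

Roughly 3% of the retrieved papers passed the selection criteria, about two thirds of those passed the quality assessment, and about 2% of the initial set ended up as primary studies.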

Table 4 The process of selecting primary studies

Data collection

The data extraction form is presented in Table 5. All the collected articles are listed in Table 6.

Table 5 Data extraction form
Table 6 Selected primary studies

Results

RQ 1: What kind of diseases are detected by using a drone?

A wide variety of diseases was studied (Fig. 1). Blight accounted for 8 of the 44 identified diseases and wilt for 6. All of the other diseases were more or less evenly distributed.

Fig. 1 The number of articles per disease in the 38 studies that used drones for disease detection

The identified diseases were classified into major categories according to the disease-causing pathogen: fungus, bacteria, virus, nematode and abiotic (Abdulkhadir & Alghuthaymi, 2016). A major finding is that, according to Fig. 2, fungus alone accounted for 64% of the diseases investigated. Bacteria came second with 26%. Viruses, nematodes and abiotic causes together were responsible for only the remaining 10% of the diseases.

Fig. 2 Proportion of disease-causing pathogens related to the identified diseases

According to Fig. 3, there was a wide variety of plants for which drones were used to detect disease-causing pathogens. The two crops that stand out in this figure are grape and watermelon; both have a large representation of fungus. In the reviewed studies, citrus appears only with bacterial diseases and wheat only with fungal diseases. Tomato has three disease-causing pathogens, namely fungus, bacteria and viruses.

Fig. 3 Disease-causing pathogens related to the product type

RQ 2: What is the drone type used in disease detection?

Five different drone types were distinguished. The fixed-wing drone has a very different design in comparison with the other drones; it has two wings for an aerodynamic shape that makes it look like an airplane. A single-rotor helicopter has one big rotor on top and one small rotor at the end of the tail, so it looks like a helicopter. The quadcopter has four rotors; two of them rotate in the clockwise direction and the other two in the counter-clockwise direction. The hexacopter has six rotors and the octocopter has eight rotors (Mogili & Deepak, 2018). Because the single-rotor helicopter is not mentioned in any of the extracted papers, this drone type is left out of the categorization. The drone type most used, according to Fig. 4, is the quadcopter, which appears 14 times. The hexacopter is used in 9 of the identified papers. The remainder of the papers used the fixed-wing and octocopter. This means that the quadcopter is the dominant drone type used in disease detection.

Fig. 4 Drone type used for disease detection

RQ 3: What are the actors/stakeholders involved in disease detection by drones?

According to Table 7, several stakeholders were involved in disease detection by drones. First, the farmer was always involved: he or she needed to operate the drone, interpret the data or take action after the data were analyzed. The research community was identified as a stakeholder in 29% of the papers because the studies were used for future work and especially because the data were donated to researchers. This indicates that disease detection with drones is still an active area of research. From the consumer perspective, early disease detection leads to better-quality and safer products; 24% of the papers explicitly mention this issue. In addition, 8% of the papers state that drone use has a positive impact on the environment because it leads to less usage of fertilizers. One paper was related to the tourism sector due to the relationship between plants, landscape and tourism.

Table 7 Actors/stakeholders involved or influenced by disease detection

RQ 4: Which tasks are executed to support decision-making in disease detection by drones?

Figure 5 shows that the main task of supporting decision-making in disease detection is a classification task, which was done in 27 selected papers. There are 15 papers that applied a detection task for data processing. A total of 8 studies applied a mapping technique to analyze data. Other tasks executed were categorization, monitoring, discrimination, quantification, identification and prediction.

Fig. 5 The most performed tasks among the 38 studies summarized
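The task counts reported above add up to more than 38 (27 + 15 + 8 alone is 50), which is consistent with a single paper being counted under every task it performs. A minimal sketch of such a multi-label tally is given below. The function name, the record layout and the three toy records are hypothetical and are not taken from the data extraction form or the reviewed papers.

```python
from collections import Counter


def tally(records: list[dict], field: str) -> Counter:
    """Count in how many records each label of a multi-valued field occurs.

    A record contributes at most once per label, but it may contribute to
    several labels, so the counts can sum to more than the number of records.
    """
    counts: Counter = Counter()
    for record in records:
        counts.update(set(record[field]))
    return counts


# Toy extraction records, invented purely for illustration.
TOY_RECORDS = [
    {"id": "P1", "tasks": ["classification", "mapping"]},
    {"id": "P2", "tasks": ["classification"]},
    {"id": "P3", "tasks": ["detection", "classification"]},
]

if __name__ == "__main__":
    task_counts = tally(TOY_RECORDS, "tasks")
    for task, count in task_counts.most_common():
        print(f"{task}: {count}")
    print("papers:", len(TOY_RECORDS), "label total:", sum(task_counts.values()))
```

The same tally would apply to any other multi-valued item extracted per paper, such as image type or algorithm, so counts per category should be read as "papers mentioning this category" rather than as a partition of the 38 studies.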

RQ 5: Which data are generated by drones to support disease detection?

The results show that color-infrared (CIR) images were the data most often generated by drones to support decision-making (Fig. 6); they were used in 20 of the 38 papers. RGB images were the second most used in disease detection (14 times). Other types of images used for disease detection were visible and near-infrared (V-NIR), thermal and multispectral (MS) images, which together account for 9 of the 38 selected papers.

Fig. 6 Categorization of image type used for disease detection

Three kinds of images were distinguished based on the subject of the image, namely leaf-, plant- and field-based images. Field-based images are the dominant type, as shown in Fig. 7: most of the papers (58%) made use of them. The rest of the papers made use of plant- and leaf-based images, which accounted for 28% and 14%, respectively.

Fig. 7 Overview of image (Leaf, Plant, Field)

RQ 6: Which techniques are used to support decision-making in disease detection by drones?

The results show that a wide variety of techniques were used to support decision-making in disease detection using drones. CNN-based models were the most widely applied (Fig. 8), appearing in 10 of the 38 selected papers. The CNN-based model category consists of the following models: GoogleNet, VGG16, RetinaNet, YOLO and VGG-Net.

Support vector machines (SVM) also stood out in Fig. 8 with a count of 6. Both RBF and RF have 3 counts each. Other algorithms identified were K-means clustering, AKAZE, Segnet, MLP, SDA, LSC, QSVM, LDA, unsupervised clustering, KMSVM, KMSEG and KNN.

Fig. 8 The techniques most used in the 38 summarized studies

RQ 7: For which agriculture product types are drones used for disease detection?

As shown in Fig. 9, a wide variety of agricultural crops were analyzed for disease detection using drones. Grape occurred 6 times in the 38 selected papers. Olive, citrus, cotton and wheat were all counted 3 times. The rest of the product types were more or less evenly distributed.

Fig. 9 The most investigated crop types in the 38 studies that apply drones in disease detection

RQ 8: What are the challenges of the application of drones in disease detection?

The results show that there are several challenges encountered in the application of drones for disease detection (Table 8). The challenges can be categorized into two core categories, namely dataset and model building. Challenges related to the dataset were deformations on the image dataset, the limited number of expert-labeled data, strong randomness in data and the lack of classes in the dataset. Challenges related to model building were the small size of the training sample, long training time and long processing time. As seen in Table 8, only 2 papers proposed possible solutions for the encountered challenges.

Table 8 The encountered challenges in the application of drones in disease detection

Discussion

General discussion

Blight and wilt were the two major diseases studied using drone data because both exhibit clearly visible symptoms in images. In addition, the major disease-causing pathogen that was identified using drones was fungus. This is also in line with the fact that fungal diseases show visible symptoms. It shows that drones are mainly used for detecting diseases that show visible symptoms.

The dominant drone type used was the quadcopter. According to the reviewed papers, this is mostly due to financial motives. When a large area must be covered, either multiple drones are flown at the same time as a swarm or a drone with a larger range is used. Therefore, the relation between drone type and field size must be carefully analyzed.

According to the reviewed papers, drones have been used more often to detect disease in grapes and, to a lesser extent, in olive, citrus, cotton and wheat production. Grapes, watermelon and tomatoes were mentioned often in relation to disease-causing pathogens. This indicates that drones are probably used for multiple purposes. The diversity of techniques identified indicates that either different techniques are used for different types of decisions or plants, or the researchers are still exploring diverse techniques. The results clearly show that classification is the dominant task performed in disease detection by drones. This means that a plant or part of the field is assessed as healthy or not healthy in relation to the investigated disease, without the disease itself necessarily being detected.

Farmers seem to be involved in all cases because they need to take action after the data gathered by the drones is analyzed. A significant finding is that 29% of the identified papers stated that their data is available for the research community for future work.

The data gathered in disease detection is diverse. This could be because this is an active area of research: researchers are experimenting with cameras mounted on drones that are flown at various heights and under various conditions. That is probably an additional reason why many different algorithms are applied. For several of the challenges encountered, the papers did not propose possible recommendations, which indicates the need for further investigation.

While traditional machine learning algorithms such as SVM may provide satisfactory results in many precision agriculture studies, researchers should also consider using advanced deep learning algorithms that can utilize the increasing amount of data available and have better performance (Oikonomidis et al., 2022a, b; Kaya et al., 2019). For disease detection using drones, deep learning algorithms such as transformers, long short-term memory (LSTM) and autoencoders can be investigated for various problem scenarios. More research can also be done in detecting diseases that do not have visible symptoms.

This study has also identified some challenges and potential solutions for the challenges. For example, the processing time and training time were mentioned in some studies as potential challenges. Advanced data infrastructures and techniques such as distributed machine learning and hierarchical federated learning should be considered in future studies.

Threats to validity

There are a number of threats to validity in relation to the conducted SLR, which include construct, internal, external and conclusion validity threats.

Construct validity refers to the degree to which a study measures what it claims, or purports, to be measuring. In other words, the SLR should be the right method for the goals of the study. Databases are a very effective source for literature searching, but they are highly susceptible to query phrasing. Minimal differences in a search query can result in major differences in the outcome of relevant literature. The databases also have different formats; therefore, the search method is slightly different for each. Some papers might have been missed due to the search criteria used, but the 38 primary studies helped to respond to all the research questions. Six widely used electronic databases have been covered, but there might be some papers that are not indexed by these databases. Many papers can be found using Google Scholar, but many of them are not peer-reviewed. As such, Google Scholar was excluded as a literature database in this research because the targets were only peer-reviewed and high-quality studies.

The quality assessment could be vulnerable to subjective decisions. This threat has been minimized by following the standards for this procedure. The main objective of the quality assessment is to identify low-quality papers instead of assigning a precise quality score per paper. As such, while the assessment can be considered subjective, the overall strategy has generally been adopted in SLR studies. In addition, the extraction of data could be incomplete because data are unavailable or were missed in the papers. This threat was reduced by formulating a clear, unambiguous data extraction form. Since categories were widely used, there might be some risk of incorrect categorization. However, the impact of such misinterpretation should be minimal because of the large number of items in each category.

Internal validity is the extent to which a study establishes a trustworthy cause-and-effect relationship between a treatment and an outcome. In this SLR, all of the research questions are formulated to investigate the necessary elements for disease detection by drones in precision agriculture. Because all of these elements are well-defined, the relationship between the questions and the research goal is satisfactory.

External validity is the extent to which the outcome of a study can be expected to apply to other settings. In other words, this validity refers to how generalizable the findings are. Because algorithms can be applied to other areas without major modifications, it will be possible to use these for other (new) disease detection methods by drones. Since this research field is very active and many articles are published, results might be different in a new SLR study that includes recent papers. During this research, the aim has been to cover all the papers published so far. However, due to the formal review processes that took substantial time, new papers may have been published and high-quality primary studies might not have been included by the time the study is published.

The reproducibility of the SLR is measured by reliability. The procedure of Wright et al. (2007) was followed in this study. The processes of question design, search process, screening criteria and quality evaluation all comply with the standards. The results of the collected data were analyzed with tables and graphs to formulate objective conclusions.

In addition to these potential threats, emphasis should be given to two particular exclusion criteria. One of them excludes review articles. Review articles were excluded because, in SLR studies, only primary studies are included. The other excluded papers were those that did not present experimental results. That is because the answers to research questions require studies that have experimental results. If there is valuable information in papers that are either review papers or are not experimental studies, they might have been missed in this research. However, the main objective was to respond to the research questions defined in this research instead of presenting all the information discussed in the literature.

Conclusion

Results show that blight and wilt are the most widely studied disease types. More than 10 disease types were covered by a single study. To have a better understanding of the use of drones for disease detection, the diseases were categorized into five categories and the results show that fungus accounts for 64% of the diseases for which drones were used. Virus, nematode and abiotic were studied only in 10% of the studies. These observations indicate that while researchers can perform new research for the less studied disease categories, practitioners can apply drones to detect fungus-related diseases because there is already substantial scientific evidence. Grape and watermelon have been widely studied. There are few studies on kiwi, squash, pear, lemon, onion and rice, which shows the potential of the utility of drones, but further in-depth research is needed. Most researchers apply drones for classification tasks. Most of the studies (58%) utilized field images and fewer papers used plant images (28%) or leaf images (14%). Research on small-scale objects such as leaves and plants requires higher-resolution visual inspections and this might not be possible in some cases where the available equipment and sensors do not support very precise inspections. The most used algorithm is CNN, probably because this algorithm has been the basis for complex deep learning-based models such as VGG16, GoogleNet and VGG-Net and because many researchers represented the underlying problem as a classification task. However, if the problem is represented in a different way rather than as a classification task, the appropriate algorithm could be a different one.