Introduction

Despite the large number of reports on pesticide analysis and detection in fruits and vegetables worldwide, reports on cases from Indonesia are very limited. Currently, around 50,000 articles are indexed in Scopus with pesticide and analysis in their titles, abstracts, or keywords, but only around 139 articles (about 0.3 %) remain if Indonesia and detection are added to the query filter. Moreover, to our knowledge, reports on pesticide detection in Indonesian plants or plant products mostly concern tea [1,2,3,4], cacao [5, 6], and coffee [7], with only two reports on vegetables, namely shallot [8] and potato and tomato [9], and none on rice or fruits. These limited reports may imply either a low number of pesticide detection cases or a low number of pesticide analyses.

On the other hand, as Indonesia is an agricultural country, pesticide application there is massive. The number of registered pesticides in Indonesia is increasing [10], and so is their trade [11]. Moreover, pesticides are applied not only during cultivation but also during storage after harvest [1]. Therefore, pesticide residues in agricultural products are very likely, and the low number of reports on pesticide detection in Indonesian agricultural products most likely reflects a low number of analyses rather than a low number of cases.

Furthermore, pesticide residue analysis in agricultural products is important to ensure the safety of the products. Since Indonesia exports agricultural products, pesticide residue analysis is required to comply not only with the maximum residue limits (MRLs) set by the Indonesian government but also with the regulations set by the destination countries. It is therefore important to have laboratories that can perform the analysis continually, so that pesticide concentrations in those products can be monitored and kept below the regulated MRLs. Failure to monitor pesticide concentrations in the country may result in rejection of products after they are shipped to destination countries [2, 12], in addition to the health risk associated with consuming products containing pesticide residues in Indonesia itself.

Meanwhile, pesticide residue analysis in agricultural products faces several challenges. The complexity of the matrices [13,14,15,16,17], the large number of target pesticides and samples [13, 18, 19], and low MRLs [20] are among the main ones. Sample preparation is key to the analysis [14, 15, 17], since it removes most interferences related to matrix complexity. It must be rapid and designed for multiple pesticides to cope with the large number of target pesticides and samples.

Fortunately, sample preparation for pesticide analysis has advanced from complicated conventional methods, which mostly relied on extraction in large volumes of solvent [13], to simpler methods that use small amounts of sensors, molecularly imprinted polymers, and nanotechnology-based materials [21,22,23,24,25,26]. Owing to its relatively simple procedures, however, QuEChERS (quick, easy, cheap, effective, rugged, and safe) or its modifications are now commonly applied as sample preparation in multi-residue pesticide analysis [14, 15, 17, 27,28,29].

Similarly, instrumentation for pesticide quantification has also advanced. Gas chromatography (GC) and liquid chromatography (LC), coupled with a mass spectrometer (MS) or tandem mass spectrometers (MS/MS), are replacing the conventional instruments for pesticide separation and detection. They have proven powerful in quantifying multiple pesticides [19, 30] and in achieving detection limits equal to or lower than the MRLs [2, 18, 19, 31].

Unfortunately, advances in pesticide residue analysis worldwide are not immediately applied in Indonesia. Indonesia still faces limitations in the number of laboratories and their ability to perform the analysis. Moreover, many laboratories still rely on conventional sample preparation methods and/or instruments. GC or LC coupled with electron capture, nitrogen-phosphorus, or ultraviolet detectors is still commonly used [1, 5]. Although these instruments remain relevant for pesticide detection and offer good separation and sensitivity, they struggle to deliver detection limits at MRL levels.

Meanwhile, to assess the quality of their analytical results, laboratories are required to join proficiency testing (PT) where available and relevant [32,33,34,35,36]. Pesticide residue PTs are very important in quality testing because of the increasing demand for pesticide residue tests covering different types of pesticides and various matrices. Reports on PTs of pesticides in foods, mostly vegetables and fruits, are available [34, 35, 37,38,39,40,41,42,43], but none of these PTs were organized in Indonesia. Moreover, although PTs are also important tools for demonstrating laboratories’ capabilities in performing analysis, only a few laboratories in Indonesia have joined them.

This is the first pesticide PT ever reported for Indonesia. This research summarizes the results and achievements of five years of pesticide residue PTs in Indonesia, conducted by the Testing Laboratory of the Directorate of Standardization and Quality Control, Ministry of Trade of the Republic of Indonesia. The participants are public and private sector pesticide residue testing laboratories in the Indonesian food testing laboratory network. This report discusses the main findings of the 5-year PT in Indonesian vegetable and fruit matrices, including the analytical methods used and the performance of all participants.

Materials and methods

Overview of PT

These PTs were conducted by the Directorate of Standardization and Quality Control, Ministry of Trade of the Republic of Indonesia. Since 2019, this PT provider has been accredited to organize proficiency testing based on ISO/IEC 17043 principles for fruits and vegetables. The target pesticides and test materials are listed in Table 1. Because of limitations in the scope and equipment of most participating laboratories, the test parameters were based on the possible positive list. Therefore, the PTs were not aimed at evaluating laboratory performance with respect to false positives and false negatives. All PTs were announced each year through the Indonesian food testing reference laboratory network, and laboratories could register voluntarily. Laboratories were allowed to choose the methods for the analysis and the number of target pesticides they wanted to analyze. After the registration period, samples were distributed, and the laboratory participants performed the analysis with their available methods within a set period. The laboratory participants then reported their results for data analysis.

Table 1 PT information: test materials, target pesticides, and MRLs

Test materials and PT procedures

The test materials were prepared as puree (tomato, orange, lettuce, strawberry) or powder (brown rice). The raw materials were obtained from a local market and, when available, organic materials were selected. The target pesticides were then spiked into the test materials, which were subsequently homogenized. The concentrations of the target pesticides in the test materials were all confirmed to be below 0.01 mg/L.

From 2016 to 2018, each laboratory received two sample bottles, each containing 50 mg of the same test material. Laboratories analyzed and reported these as two sets of analytical results. Therefore, the total number of sets of analytical results is twice the number of laboratory participants. However, because of the increase in laboratory participants, in 2019 and 2020 each laboratory received only one sample bottle, so the total number of sets of analytical results equals the number of laboratories.

The homogeneity of the test materials was tested for each PT: 10 bottles of test material were randomly selected and analyzed before dispatch. All target pesticides were analyzed by accredited methods, with LC–MS/MS or GC–MS/MS used for quantification. Homogeneity was assessed for each target pesticide by comparing the between-samples standard deviation with the standard deviation of the homogeneity test. The ratios of these two values for all target pesticides listed in Table 1 were less than 0.3; thus, all test materials were considered homogeneous.
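The homogeneity criterion can be illustrated with a minimal Python sketch. It assumes the duplicate-measurement scheme of ISO 13528 (two test portions per bottle), which the text does not confirm, and it treats the reference standard deviation as a user-supplied input. The function names and the example data are hypothetical.

```python
import math
import statistics

def between_sample_sd(duplicates):
    """Between-samples SD from duplicate results per bottle.

    duplicates: list of (first, second) results, one pair per bottle.
    Assumes the ISO 13528 duplicate design; this is an illustration,
    not necessarily the calculation used by the PT provider.
    """
    g = len(duplicates)
    means = [(a + b) / 2 for a, b in duplicates]
    ranges = [abs(a - b) for a, b in duplicates]
    s_x = statistics.stdev(means)                        # SD of bottle averages
    s_w = math.sqrt(sum(w * w for w in ranges) / (2 * g))  # within-bottle SD
    return math.sqrt(max(0.0, s_x ** 2 - s_w ** 2 / 2))

def is_homogeneous(duplicates, sigma_ref, limit=0.3):
    """True when the ratio of between-samples SD to sigma_ref is within the limit."""
    return between_sample_sd(duplicates) / sigma_ref <= limit

# Hypothetical data: 10 bottles, two results each (mg/kg)
bottles = [(0.101, 0.099), (0.098, 0.102), (0.100, 0.100), (0.103, 0.099),
           (0.097, 0.101), (0.100, 0.102), (0.099, 0.099), (0.102, 0.100),
           (0.100, 0.098), (0.101, 0.101)]
print(is_homogeneous(bottles, sigma_ref=0.02))
```

The 0.3 limit in the sketch mirrors the ratio threshold quoted above.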

The stability tests were performed by analyzing the same test materials at intervals to ensure that no significant change in the test materials occurred during the PTs. The analysis was performed at the beginning (initial), prior to sample distribution, and on the last day of the analysis period (final), and the two results were subsequently compared. Several target pesticides listed in Table 1 were eliminated from the evaluation because the stability tests showed statistically significant fluctuations in their concentrations. A pesticide was eliminated only from the particular PT in which its concentration fluctuated, not from all PTs.
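The text does not state which statistical comparison was applied to the initial and final results. As one possible reading, the sketch below uses the simple ISO 13528 style criterion, in which the shift between the two averages must not exceed 0.3 times a reference standard deviation. The criterion, the function name, and the data are assumptions made for illustration only.

```python
import statistics

def is_stable(initial, final, sigma_ref, limit=0.3):
    """Compare mean initial and mean final results against limit * sigma_ref.

    Assumed criterion (not confirmed by the text):
    |mean(initial) - mean(final)| <= 0.3 * sigma_ref.
    """
    shift = abs(statistics.mean(initial) - statistics.mean(final))
    return shift <= limit * sigma_ref

# Hypothetical replicate results (mg/kg) before dispatch and on the last day
print(is_stable([0.100, 0.102], [0.097, 0.099], sigma_ref=0.02))
```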

PT evaluation

Three approaches, scaled median absolute deviation (MADe), normalized IQR (nIQR), and algorithm A [47], were first used to generate assigned values for each target pesticide every year. Algorithm A was then chosen for the whole PT analysis. The consensus robust average and robust standard deviation based on algorithm A were adopted as the assigned value and the standard deviation for the PT assessment. For the algorithm A approach, the robust average and standard deviation for each target pesticide every year were calculated iteratively from all sets of analytical results reported by the laboratory participants, using the Python 3.6 programming language, until they converged.
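The three robust estimators can be sketched in Python as follows. This is a minimal illustration based on the usual ISO 13528 definitions, not the authors' script: the constants (1.483, 0.7413, 1.5, 1.134) are the standard ones, the quartile convention and the absolute convergence tolerance are assumptions, and `statistics.quantiles` requires Python 3.8 or later.

```python
import statistics

def made(values):
    """Scaled median absolute deviation: 1.483 * median(|x - median|)."""
    med = statistics.median(values)
    return 1.483 * statistics.median([abs(v - med) for v in values])

def niqr(values):
    """Normalized interquartile range: 0.7413 * (Q3 - Q1).

    Quartile conventions differ between tools; 'inclusive' is an assumption.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return 0.7413 * (q3 - q1)

def algorithm_a(values, tol=1e-6, max_iter=1000):
    """Robust average and robust SD by iterative winsorization (algorithm A)."""
    x_star = statistics.median(values)
    s_star = made(values)
    for _ in range(max_iter):
        delta = 1.5 * s_star
        # Pull results outside x* +/- delta back to the boundary
        clipped = [min(max(v, x_star - delta), x_star + delta) for v in values]
        new_x = statistics.mean(clipped)
        new_s = 1.134 * statistics.stdev(clipped)
        converged = abs(new_x - x_star) <= tol and abs(new_s - s_star) <= tol
        x_star, s_star = new_x, new_s
        if converged:
            break
    return x_star, s_star

# Hypothetical reported results (mg/kg) with one outlying laboratory
results = [0.095, 0.102, 0.098, 0.110, 0.101, 0.097, 0.250]
assigned, robust_sd = algorithm_a(results)
print(assigned, robust_sd)
```

Because outlying results are pulled back to the boundary rather than discarded, a single extreme laboratory shifts the assigned value far less than it would shift an ordinary mean.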

The performance of the laboratory participants was subsequently evaluated using z-score [47, 48] as in Eq. (1):

$$z_{i} = \frac{x_{i} - x}{\sigma} \tag{1}$$

where $z_{i}$ and $x_{i}$ represent the z-score and the analytical result of each laboratory participant, respectively, while $x$ and $\sigma$ represent the assigned value and the standard deviation for the proficiency assessment, respectively. The performance of laboratories was classified as satisfactory if |z| ≤ 2, questionable if 2 < |z| < 3, and unsatisfactory if |z| ≥ 3 [48].
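Eq. (1) and the classification limits translate directly into code. The short Python sketch below is illustrative; the function names and the example numbers are hypothetical.

```python
def z_score(result, assigned, sigma):
    """Eq. (1): difference from the assigned value in units of sigma."""
    return (result - assigned) / sigma

def classify(z):
    """Satisfactory if |z| <= 2, questionable if 2 < |z| < 3, else unsatisfactory."""
    if abs(z) <= 2:
        return "satisfactory"
    if abs(z) < 3:
        return "questionable"
    return "unsatisfactory"

# Hypothetical example: assigned value 0.100 mg/kg, sigma 0.020 mg/kg
for reported in (0.110, 0.150, 0.030):
    z = z_score(reported, assigned=0.100, sigma=0.020)
    print(reported, round(z, 2), classify(z))
```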

Results and discussions

Analytical methods and analytical results of laboratory participants

Over the years, as shown in Fig. 1, the number of laboratories participating in the PTs has tended to increase. The increase may suggest that more laboratories are engaged in pesticide testing and that awareness of joining PTs is growing. However, this trend is observed only among government laboratories and not among private laboratories, whose number is relatively stable. The slight decrease in 2020 is likely related to the COVID-19 lockdown, which affected the working hours of laboratories in Indonesia. Even so, it is important to note that the number of participants in these PTs is relatively low. To date, around 1279 testing laboratories are accredited by the Indonesian Accreditation Body [49], and the highest number of participants in these PTs is less than 2 % of that total, implying either a low number of pesticide testing laboratories in Indonesia or a low number of laboratories willing to join PTs. The Indonesian Accreditation Body requires laboratories applying for accreditation to participate in at least one PT organized based on ISO/IEC 17043 principles [50]. However, this can be replaced by participation in an interlaboratory comparison with at least 3 laboratories or by demonstrating internal performance-based data [50]. It is assumed that the latter option also contributes to the low number of participants observed in this study.

Fig. 1

Left: Number of government laboratories (black) and private laboratories (grey) which registered for PTs and number of laboratories which reported at least one pesticide (-▫-). Right: Number of laboratories which reported at least one target pesticide (□), half of target pesticides (-◦-), and all of the target pesticides (-▪-) every year

Moreover, the number of laboratories able to report results for all target pesticides is decreasing. This implies either limited capability of laboratories to perform multi-residue analysis or the unavailability of pesticide standards in the laboratories. However, it must be noted that the number of target pesticides has increased over the years, which may affect what laboratories are able to cover. Since pesticide testing increasingly means multi-residue analysis, with hundreds of target pesticides expected in a single analysis, it is important to upgrade the capabilities of laboratories to perform such analysis. Therefore, Indonesian laboratories need to increase the number of pesticides they can analyze in a single run.

Statistics on the methods used by laboratories are given in Fig. 2. For sample preparation, the majority of laboratories apply QuEChERS or its modifications. It is widely used for pesticide analysis [14, 15, 17, 27, 28], and its prepackaged kits are available and easy to use. For instrumentation, all laboratories use gas chromatography (GC) or liquid chromatography (LC) for pesticide separation. Every year, however, at least 20 % still rely on detectors other than a single (MS) or tandem mass spectrometer (MS/MS). MS detectors, both single and tandem, are relatively costly and usually not available in moderately equipped laboratories [2, 51]. However, they are powerful tools for detecting pesticides at the low concentrations encountered in most pesticide analyses [2, 18, 19, 31, 52].

Fig. 2

Statistics of (a) sample preparation, (b) instrument used, (c) quantification, and (d) number of target pesticides in one analysis over the PT years

For quantification, matrix-matched calibration and calibration with standard solutions are used to a comparable extent (Fig. 2). For pesticide analysis in complex matrices, matrix-matched calibration is currently preferred over standard solutions for more reliable results [53,54,55,56], possibly because it better compensates for matrix effects. However, this study provides no statistical evidence that matrix-matched calibration is superior to standard solutions, so further study is required, and the use of matrix-matched calibration is here merely a recommendation.

Most laboratories have performed multi-residue analysis, and in the last two PTs, 100 % of them did so (Fig. 2). However, as indicated earlier, the number of laboratories able to analyze all target pesticides is relatively low, and so is the number of target pesticides in these PTs. Moreover, around a thousand pesticides are currently known [57], and more than 100 target pesticides are commonly analyzed in a single analysis [43, 52, 58,59,60,61]. Therefore, measured against real-world pesticide analysis, the multi-residue capability demonstrated here is still limited, despite the high percentage of laboratories performing multi-residue analysis.

Assigned value and laboratory performance assessment

Scaled median absolute deviation (MADe), normalized IQR (nIQR), and algorithm A [47] were used to generate assigned values for each target pesticide every year. The assigned values were then used to calculate the z-scores, and laboratory performance was evaluated based on those scores (Fig. 3). Overall, for all approaches, the annual percentage of z-scores in the satisfactory range is higher than 80 %, except for 2017 when calculated based on MADe. Furthermore, using algorithm A, the percentage of z-scores in the satisfactory range is 90 % for all PTs. Therefore, the assigned values generated by algorithm A were chosen for further use.

Fig. 3

Laboratory performance based on z-score. The z-scores were calculated from assigned values that are generated based on MADe (left), nIQR (middle), and algorithm A (right) approaches

Over the years, the percentage of satisfactory results has remained relatively similar, and the average shares of satisfactory, questionable, and unsatisfactory results are 90.8 %, 3.3 %, and 5.9 %, respectively. These values are comparable to those of another multi-year PT with participating laboratories worldwide [43] and relatively higher than those of another PT with participating laboratories from a single country [39]. Thus, the performance of the Indonesian laboratories in this study is comparable to that of laboratories worldwide.

The assigned values, the z-scores, and laboratory performance based on the z-scores for each target pesticide are given in Figs. 4, 5, and 6, respectively. As shown in Fig. 6, the percentage of z-scores in the satisfactory range runs from 79 % for bifenthrin (2017) to 100 % for carbaryl (2016), methomyl (2018), bupirimate (2020), fenvalerate (2020), thiacloprid (2020), and methomyl (2020). Meanwhile, the percentage of z-scores in the unsatisfactory range runs from 0 % to 21 %, the latter for bifenthrin (2017). Apart from bifenthrin, several target pesticides have more than 10 % of z-scores in the unsatisfactory range: chlorpyrifos-methyl (2017: 11 %), permethrin (2017: 11 %), carbaryl (2019: 13 %), chlorpyrifos-ethyl (2019: 14 %), and iprodione (2019: 11 %). Chlorpyrifos-methyl and iprodione have been observed to give a low percentage of acceptable results in cereals as a result of adding water to the sample prior to extraction [35]. This effect may affect relatively polar pesticides [35] and is thus likely responsible for the higher percentage of unsatisfactory results observed in 2019 (brown rice) in this study. As mentioned earlier, the brown rice test material was distributed as dry material, and some laboratories might have added water to it before extraction. However, the laboratory participants submitted no information on water addition, so this assumption remains inconclusive. Moreover, the high percentages of unsatisfactory results for bifenthrin and permethrin in 2017 have not been observed in previous PTs. Instead, they were considered “easy” target pesticides [34, 35], so the result was not expected. A possible explanation is the matrix, which was orange. Orange, like other citrus fruits, can be considered a “difficult” matrix [34, 62], possibly due to the relatively high essential oil content [34] of the peel, which was also used to make the test material puree in 2017. Low satisfactory performance has previously been observed for pesticides in citrus fruits [34]; therefore, it is reasonable to assume that the low result in this study is due to the orange matrix.

Fig. 4

Assigned value for each chemical. Solid lines represent assigned values and dashed lines represent standard deviations based on algorithm A

Fig. 5

The z-score for all PTs

Fig. 6

Percentage of laboratory performance for each chemical

Further analysis was performed on the unsatisfactory z-scores in relation to how the samples were analyzed (Fig. 7). In terms of sample preparation (Fig. 7a), QuEChERS and its modifications produce more unsatisfactory z-scores. However, this does not directly indicate that QuEChERS or its modifications are responsible for unsatisfactory z-scores. As shown in Fig. 8a, 89 % of all reported z-scores come from QuEChERS or its modifications, and only 11 % come from conventional solvent extraction. Therefore, QuEChERS is expected to yield more unsatisfactory z-scores than conventional solvent extraction in absolute terms. On the contrary, the share of conventional solvent extraction among the unsatisfactory z-scores (26 %) is higher than its share among all z-scores (11 %). This may indicate that conventional solvent extraction is responsible for the unsatisfactory z-scores.

Fig. 7

Relation of z-scores and methods

Fig. 8

Relation of total z-scores, unsatisfactory z-scores and methods

Similarly, in terms of the instrument used (Figs. 7b and 8b), the share of detectors other than MS or MS/MS among the unsatisfactory z-scores (38 %) is higher than their share among all z-scores (15 %). It is well known that MS or MS/MS detectors deliver more sensitive measurements [63,64,65]. Thus, there is a chance that detectors other than MS or MS/MS are responsible for the unsatisfactory z-scores.
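The comparison of shares used in this section can be made explicit with a small calculation. The Python sketch below takes the percentages quoted in the text (26 % versus 11 % for conventional solvent extraction, 38 % versus 15 % for detectors other than MS or MS/MS) and expresses each as an over-representation ratio. The ratio and the function name are illustrative additions and are not part of the original evaluation.

```python
def overrepresentation(share_unsatisfactory, share_total):
    """Ratio of a group's share among unsatisfactory z-scores to its share overall.

    A value above 1 means the group contributes more unsatisfactory
    z-scores than its overall share of z-scores would suggest.
    """
    if share_total <= 0:
        raise ValueError("share_total must be positive")
    return share_unsatisfactory / share_total

# Percentages as quoted in the text: (share of unsatisfactory, share of total)
groups = {
    "conventional solvent extraction": (26.0, 11.0),
    "detectors other than MS or MS/MS": (38.0, 15.0),
}
ratios = {name: overrepresentation(u, t) for name, (u, t) in groups.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

Both groups come out at roughly 2.4 to 2.5 times their overall share. A ratio of this kind only describes the imbalance; testing whether it is statistically significant would require the underlying counts, which are not given here.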

The shares of total z-scores and unsatisfactory z-scores for both quantification and number of pesticides (Figs. 7c–d and 8c–d) are relatively unchanged. Thus, neither the calibration approach (with or without matrix-matched calibration) nor the scope of the analysis (multi-residue or single-residue) appears to be responsible for the unsatisfactory z-scores.

Lastly, more unsatisfactory z-scores come from non-accredited laboratories (Fig. 7d), and their share is significantly higher than that of accredited laboratories (Fig. 8e). This indicates that unsatisfactory z-scores are likely linked to non-accredited laboratories. This is reasonable, since accredited laboratories usually comply with fixed parameters set by accreditation bodies.

Conclusion

A five-year pesticide PT in Indonesia was reported. The number of participating laboratories tends to increase but is still relatively low compared with the total number of testing laboratories in Indonesia. Most laboratories used QuEChERS or its modifications for sample preparation. While all laboratories used GC or LC for pesticide separation, at least 20 % still relied on detectors other than a mass spectrometer. Both matrix-matched calibration and standard solutions were used for quantification. Overall, the laboratory participants performed well, with an average of 90.8 % of results being satisfactory and only 3.3 % and 5.9 % being questionable and unsatisfactory, respectively. However, improvement is still needed, especially in the number of target pesticides covered in multi-residue pesticide analysis. Moreover, unsatisfactory z-scores are likely to come from laboratories that use conventional solvent extraction, use detectors other than MS, and are not accredited.