1 Introduction

In recent years, agri-food supply chains (AFSCs) have witnessed a series of structural changes that have significantly altered how firms do business and deliver products to consumers (Marques Vieira et al. 2013). Aside from being complex and exposed to several risks and uncertainties, AFSCs encounter increasing volatility across a variety of business parameters, ranging from cost and raw material availability to unstable exchange rates (Christopher and Holweg 2011; Vlajic et al. 2013). Moreover, several scholars argue that AFSCs have to take into consideration specific issues associated with quality management of perishable products, harvesting methods, logistics activities and risk management (Ahumada and Villalobos 2009; Soto-Silva et al. 2016). AFSCs are sensitive to these issues, which makes it necessary to develop resilience and respond to consumer concerns for food quality and safety. Models for AFSC production and distribution should be developed to make harvesting plans more efficient and to meet customer demands as well.

With the rapid pace of globalization, AFSCs have become more sophisticated, fragmented, and scattered, requiring the introduction of new technologies as alternatives to the traditional ones (Sun 2014). The vulnerabilities arising from the limited shelf-life of food products and the variability in quality bring further difficulties. Therefore, AFSC actors need to rethink their existing business practices and adopt new technologies in order to become more efficient and productive in the agri-food industry. Although the food sector is traditionally characterized by low research intensity (Garcia Martinez and Briz 2000; Grunert 1997), several factors drive the adoption of emerging technologies in AFSCs. For instance, AFSC businesses seek to enhance customer perception of fresh food products, reduce production costs, shorten lead times, achieve a competitive advantage and optimize AFSC processes (Gašová et al. 2017). Through the application of new technologies, it would also be possible to expand the market boundary of agricultural production and accelerate the circulation speed of agricultural products (Liu 2019). A potent example of such technologies is big data.

According to Subudhi et al. (2019), big data is defined as a “conglomeration of the booming volume of heterogeneous data sets, which is so huge and intricate that processing it becomes difficult, using the existing database management tools.” The concept of big data is a relatively novel and promising field of research (Guo and Wang 2019; Kellengere Shankarnarayan and Ramakrishna 2020) that offers methods for increasing the utility of data to extract meaningful insights. As such, the essence of big data is to process and analyse sizeable parallel data sets obtained from multiple sources such as online user interactions, commercial interactions, monitoring systems, sensor devices and any other consumer tracking methods. The critical attribute of big data is the fine-grained nature of the data (George et al. 2014), which is generated by enormous computing power that monitors a variety of digital streams and is analysed using “smart” algorithms (Davenport 2014). Davenport (2014) also posits that big data represents a new era of the changing technology landscape, one that has engendered a large volume of data generated continuously from multiple data sources and in multiple data formats. The extant literature has also indicated that big data has many features such as volume, velocity, value and variety (Aljunid 2019). These characteristics have garnered much attention and captured the interest of researchers across many disciplines. Beyond the technical scope of big data, business scholars concur that big data is key to the development of more efficient and effective AFSCs (Guo and Wang 2019; Lioutas and Charatsari 2020). Big data constitutes a means for generating new insights and supporting decision-making processes across several segments in the AFSC.

The usage of big data provides real-time analytic insights for proactive data-driven decision-making in AFSCs. It feeds researchers, practitioners and policy decision-makers with timely intelligence (Lioutas et al. 2019) and guidelines necessary for the successful management of AFSCs (Sharma et al. 2018). Besides helping AFSC actors to make effective decisions, big data can assist AFSC partners to mitigate disturbances (weather hazards, market changes, etc.) by reducing the economic waste related to agricultural production (Lioutas et al. 2019), stimulating agricultural policy impacts (Coble et al. 2018) and improving the economic performance of AFSC actors (Lioutas and Charatsari 2020). The benefits of big data also involve the ability of food firms to understand consumer preferences and expectations better, develop products based on real-time market insights and enhance the overall working conditions and efficiency levels of the business. The World Bank reports that the food and agriculture sector represents 10 per cent of global gross domestic product with the potential to increase in the future as a result of population growth and changes in consumer behaviour (Coyle 2016). The contribution of big data to this trend is inevitable due to its myriad applications, many of which are tied into the agri-food industry. However, the benefits of big data for AFSCs cannot be realised easily due to data ownership, privacy, security and ethical issues in AFSCs (Klerkx et al. 2019). Likewise, Wolfert et al. (2017) argue that the implementation of big data for smart farming is challenged by the technical and governance concerns that can arise in the different stages of the AFSC. In summary, the authors posit that the technical issues linked to data formats, hardware and information standards may limit the availability of big data for further analysis, whereas the governance issues are those that restrain the performance of AFSC business processes, such as the lack of agreements on responsibilities and liabilities.

It is easy to understand that AFSCs can substantially benefit from the adoption of big data as an empowering tool within the food industry. Less clear are the challenges encountered during the transition towards smart and data-driven AFSCs, such as the lack of data standards, the digital divide between AFSC actors and the shortage of organizational capabilities and technical skills to handle data mining and perform analytical tasks. Moreover, to the best of our knowledge, few studies have focused on the role of big data in enabling the development of sustainable AFSCs. The objective of this paper is to explore big data as a critical driver for the development of sustainable AFSCs through a systematic literature review (SLR) in order to analyse the prior studies, capture the central concepts and subjects discussed on big data applications for AFSCs and identify future research directions. Hence, as detailed in the next sections, a representative sample was selected, and an SLR was conducted. The findings obtained enabled us to examine the evolutionary pattern of big data-AFSC research, the distribution of publications according to countries and journals, and the potentials of big data for AFSC sustainability. Our guiding research question is the following:

  • What are the potentials of big data for the development of sustainable AFSCs?

The insights discussed in this paper offer a sharpened understanding of the value of big data for sustainable AFSCs and attempt to benefit agri-food firms interested in using big data for their business processes. The novelty of this study consists in trying to identify what has been studied and can be concluded about the sustainable AFSCs which are supported by big data implementations. Moreover, we believe that by investigating the potentials of big data for AFSC sustainability, we can contribute to the broader discussion on the prospective opportunities of data-driven AFSCs in the agri-food industry. We also argue that the increasing complexity and fragmentation of AFSCs, and the growing difficulty of managing them, call for serious consideration of big data as a potential support to unlock the value of continuously generated data and promote more evidence-based decision-making processes (Coble et al. 2018). The results of this review, together with its limitations, helped uncover future research trajectories and questions, as discussed in Section 5. The last section briefly concludes the paper and highlights the study contributions and limitations.

2 Research method

The present study employs an SLR method to investigate the role of big data as a key driver for the development of sustainable AFSCs, identifying the evolution of big data research in the food industry since its emergence. Our focus is on the most productive countries, the influential authors and articles, and the major contributing journals to big data research in the context of AFSCs. An SLR allows researchers to identify the boundaries of existing knowledge and communicate the results of other studies that are directly related to the one being undertaken (Tranfield et al. 2003). It constitutes a broad picture of the current research trends and provides a comprehensive approach to map out the theoretical perspectives and theoretical practices emerging in a particular field (Mardani et al. 2020). A structured research methodology, consisting of adequate search terms and accompanied by a literature search and analysis, is essential to perform a useful literature review (Rowley and Slack 2004). For the review, our study used the PRISMA guidelines (Liberati et al. 2009) pursuing a 3-step methodology for data collection and analysis. The steps were as follows: (1) defining search procedure and sample, (2) initial descriptive analysis on sample and (3) data analysis.

2.1 Defining search procedure and sample

Cronin et al. (2008) discuss the need to formulate a well-defined research question for a literature review. Accordingly, our research is mainly focused on reviewing the literature on big data for sustainable AFSC management. Keyword searches are the most common method for locating the relevant literature where a keyword combination facilitates the careful selection of the research sample (Khalil et al. 2015). Therefore, guided by the research question of this study and using the common Boolean operators AND and OR, the following keyword search query was used in Scopus:

  • TITLE-ABS-KEY (“big data” AND (sustainb* OR environ* OR eco* OR green* OR social OR societal OR ethic* OR CSR OR eco- OR efficiency OR “triple bottom line” OR TBL) AND (food* OR “agri*” OR perishable* OR fruit* OR vegetable* OR “cold chain” OR fresh*)).
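The Boolean structure of this search string can be made explicit with a short script. The following is only a minimal sketch: the term lists are copied from the query reported above (with straight quotation marks), while the helper names `or_group`, `build_query` and `parentheses_balanced` are hypothetical and introduced purely for illustration. It does not contact Scopus; it only assembles the string and checks that its parentheses are balanced.

```python
# Term groups copied from the search string reported above.
# All helper names below are illustrative assumptions, not part of the study.
SUSTAINABILITY_TERMS = [
    "sustainb*", "environ*", "eco*", "green*", "social", "societal",
    "ethic*", "CSR", "eco-", "efficiency", '"triple bottom line"', "TBL",
]
FOOD_TERMS = [
    "food*", '"agri*"', "perishable*", "fruit*", "vegetable*",
    '"cold chain"', "fresh*",
]


def or_group(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(core_term, *term_groups):
    """Combine the core term and each OR group with AND inside TITLE-ABS-KEY."""
    parts = [core_term] + [or_group(group) for group in term_groups]
    return "TITLE-ABS-KEY (" + " AND ".join(parts) + ")"


def parentheses_balanced(text):
    """Return True if every opening parenthesis has a matching close."""
    depth = 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


query = build_query('"big data"', SUSTAINABILITY_TERMS, FOOD_TERMS)
print(query)
print("balanced:", parentheses_balanced(query))
```

Keeping the two term groups as separate lists makes it easier to see that a record must match "big data", at least one sustainability-related term and at least one food-related term.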

Using this keyword search query, the scope of our present study is appropriately narrowed down to our research focus, which is limited to AFSC management. The choice of Scopus as the main source of academic literature is explained by its comprehensive coverage of quality peer-reviewed scholarly journals. Scopus is a vast repository of academic articles that spans several scientific disciplines, making it widely used for the extraction of studies for literature reviews and bibliometric analyses. This study was conducted in June 2020. In particular, the search in Scopus was performed in the title, abstract, and keywords fields. No temporal restriction was applied to the search. The subject areas used for the review were Agricultural and Biological Sciences; Social Sciences; Decision Sciences; Environmental Science; Business, Management and Accounting; Economics, Econometrics and Finance. To ensure the academic nature of the retrieved data (Ramos-Rodríguez and Ruíz‐Navarro 2004), we only selected English-language journal articles for the review. The search query resulted in an initial sample of 311 papers, which were first screened for relevance using their titles and abstracts. Based on the inclusion criteria and the research question, the final number of selected articles was 128. The metadata of those articles were exported in CSV and .txt formats. This helped to ensure that all necessary information about the articles (titles, authors’ names, authors’ affiliations, abstracts, keywords, and references) was extracted.
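As an illustration of how exported metadata of this kind can be tallied for a descriptive analysis, consider the minimal sketch below. It rests on several assumptions: the three sample rows are placeholders rather than data from the study, the column names `Year` and `Source title` are assumed for the export file, and the helpers `tally` and `retention_rate` are hypothetical. Only the figures 311 and 128 come from the text above.

```python
import csv
import io
from collections import Counter

# Placeholder stand-in for an exported metadata file. The rows are invented
# for illustration, and the column names are assumptions about the export.
SAMPLE_EXPORT = """Title,Year,Source title
Example article A,2018,Journal X
Example article B,2019,Journal X
Example article C,2019,Journal Y
"""


def tally(rows, column):
    """Count how many records share each value of the given column."""
    return Counter(row[column].strip() for row in rows if row.get(column))


def retention_rate(initial, final):
    """Percentage of initially retrieved records kept after screening."""
    return 100.0 * final / initial


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
per_year = tally(rows, "Year")
per_journal = tally(rows, "Source title")

# 311 initial records and 128 retained articles, as reported in the text.
rate = retention_rate(311, 128)

print("Articles per year:", dict(per_year))
print("Articles per journal:", dict(per_journal))
print(f"Retained after screening: {rate:.1f} per cent")
```

To apply the same tally to a real export, `io.StringIO(SAMPLE_EXPORT)` would be replaced by an opened file, after checking the actual column headers.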

3 Findings

3.1 Descriptive statistics

3.1.1 Publications per year

Figure 1 depicts the trend in the number of articles published from 2014 onward. As can be observed, there is a consistent increase in the number of publications about big data applications for sustainable AFSCs. In less than a decade, the annual scientific production on big data has grown remarkably. However, research activity only took off in 2016, when attention focused on the promising potentials of big data for AFSC sustainability. The following years (2017–2020) are characterized by a noticeable upward trend in research interest in the subject. During this study period, big data research has witnessed significant growth, reflecting the rapid digitization and modernization of AFSCs.

Fig. 1 Year-wise distribution of big data research in AFSCs

3.1.2 Publications by country

From the perspective of individual researchers, Fig. 2 shows the countries with at least five publications in the reviewed literature. Scholars from the USA were the most productive, accounting for 26.56 per cent of the total contribution. In the USA, the Climate Corporation has employed big data to promote sustainable farming practices by developing a cloud-based farming information system that merges weather measurements, agronomic data modelling and high-resolution weather simulations (Orts and Spigonardo 2014; Rubens 2014). In such an initiative, big data allows farmers to adjust their working practices according to the weather forecast, especially when they need to know the climatic conditions to spray their land. Chinese scholars were the second largest group of contributors, with 30 studies. Considering that China has become the largest CO2 emitter in the world, there is a tremendous potential for big data to reduce environmental impacts, mitigate desertification and sequester carbon emissions from the environment in the country (Zhang and Huisingh 2018). Scholars from India contributed 15 papers, while researchers from Australia and the UK published 13 articles each.
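The country shares can be cross-checked against the sample size with simple arithmetic. The sketch below assumes that shares are computed over the 128 articles in the final sample. Under that assumption, the reported 26.56 per cent for the USA corresponds to 34 articles; this count is an inference and is not stated in the text. The other counts are those given above, and the function name `share_of_sample` is hypothetical.

```python
FINAL_SAMPLE_SIZE = 128  # final number of selected articles reported in Section 2.1

# Article counts per country. The values for China, India, Australia and the
# UK are stated in the text; the USA value of 34 is inferred from the reported
# 26.56 per cent share and is an assumption, not a figure given in the paper.
ARTICLES_BY_COUNTRY = {
    "USA": 34,
    "China": 30,
    "India": 15,
    "Australia": 13,
    "UK": 13,
}


def share_of_sample(count, total=FINAL_SAMPLE_SIZE):
    """Return a country's contribution as a percentage of the final sample."""
    return 100.0 * count / total


shares = {
    country: share_of_sample(count)
    for country, count in ARTICLES_BY_COUNTRY.items()
}

# Print the countries from the largest to the smallest share.
for country, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{country}: {share:.2f} per cent")
```

The shares of the listed countries need not sum to 100 per cent, since other countries also contributed and a co-authored article may presumably be counted for more than one country.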

Fig. 2 Country-wise distribution of big data research in AFSCs

3.1.3 Publications by journals

Figure 3 depicts the academic journals that published three or more papers on big data applications for sustainable AFSCs. Computers and Electronics in Agriculture is the dominant journal, with nine papers. It is followed by Revista de la Facultad de Agronomia, with six papers. NJAS - Wageningen Journal of Life Sciences and Journal of Cleaner Production published four papers each. In total, the papers in our final sample were published in ninety academic journals. The target audiences of the publication outlets include scholars from agricultural science, computer science, production operations, management and economics. We found a small number of articles that investigated the triangle of big data, AFSCs and sustainability. Considering the tremendous growth of big data and its opportunities for achieving sustainable AFSCs, this indicates a significant knowledge gap.

Fig. 3 Journal-wise distribution of big data research in AFSCs

4 Review discussion

In this section, we provide our answers to the research question of the study. The analysis of the literature is guided by the framework presented in Fig. 4. We decompose AFSCs into two main constructs: AFSC resources and AFSC management. These constructs represent the application fields of big data in the AFSC. Following several studies, soil and water are considered to be the two most essential resources for agri-food production (Jara-Rojas et al. 2013; Sarangi et al. 2004; Sarkar et al. 1995). The second construct comprises four paramount activities in AFSCs, namely, crop/plant management, animal management, waste management and traceability management. These sub-components emerge from the typical definition of AFSC, which covers a range of functions, including crop and livestock production (Munz et al. 2020). Moreover, waste is a central concept in AFSCs and its effective management can increase the profitability levels of AFSC members (Otles et al. 2015). It is equally important to consider traceability management, as it concerns all the agri-food industry stakeholders, including final consumers (Astill et al. 2020).

Fig. 4 A conceptual framework for the literature synthesis

4.1 Sustainable AFSC resources

4.1.1 Soil

Soil plays a critical role in sustaining biodiversity and providing the necessary elements for agricultural production, plant growth and survival, animal habitation, environmental quality, and carbon sequestration, which are a stepping stone towards the achievement of the United Nations’ Sustainable Development Goals (SDGs) (Hou et al. 2020). Soil is considered to be the support and sustenance of crops and forests, and represents a vital component of the ecosystem that is affected by all agricultural production activities (Fernández-Getino et al. 2018). Given that the properties of soil have an impact on the ecosystem, environmental quality, climate change, AFSC sustainability and human health (Hou et al. 2020), the degradation of these properties can damage soil structure and the quality of the crops produced. Because soil is a valuable and non-renewable natural resource, there is a need to preserve its fertility and improve its performance.

Several scholars argue that big data has the potential to promote more appropriate cultivation practices that are necessary for maintaining soil fertility. For example, a recent study by Hou et al. (2020) points out that big data and machine learning tools facilitate the collection, analysis and sharing of data related to soil. With big data implementation, it would be possible to uncover hidden patterns from soil datasets and obtain the necessary information for identifying soil conditions, such as nutrients, pH levels and soil moisture (Finger et al. 2019; Kolipaka 2020). The continuous monitoring of soil fertility measures could provide more detailed insights into the data characteristics of soil and support farmers in their crop yield predictions and decisions (Rajeswari and Suthendran 2019). Garg et al. (2019) employ big data along with machine learning methods to extract knowledge from data. The authors argue that big data could help determine fertilizer recommendation classes on the basis of the existing soil nutrient composition. Moreover, the technology can automate the actions required to ensure decontaminated, fertile and healthy soil. As a result, the automation enabled by big data systems is useful for farmers to control AFSC processes through alerts and to make evidence-based decisions (Coble et al. 2018; Hou et al. 2020; Chapman et al. 2018). The availability of big data is crucial for soil analysis in order to achieve better knowledge regarding the nutrient contents of soil and the appropriate amount of fertilizers that can be used, resulting in a more balanced soil nutrient content and better agricultural productivity (Garg et al. 2019).

4.1.2 Water

Water is the vital natural resource that determines the survival and development of the global population (Cai et al. 2019). According to Zhang and Huisingh (2018), water is considered the most essential ecological and environmental factor that can control, alleviate, and manage desertified lands. For AFSCs, water is regarded as the primary input necessary for cultivation activities and animal farming. However, the exponential growth of the world population, coupled with rising living standards, has led to an increase in the demand for AFSC products and to higher pressure on water resources (Badia-Melis et al. 2018). On the global scale, industries are encountering constraints on water accessibility, which are caused by intensified urbanization and the expansion of agricultural activities (Capmourteres et al. 2018; Fleming et al. 2018; Khanna et al. 2018). Therefore, aspects that concern water scarcity or increased consumption (wastage) should be urgently reappraised. The modernization of AFSCs through the incorporation of big data is a promising opportunity for addressing water sustainability issues. Seminal research by Wolfert et al. (2017) emphasizes the importance of big data applications in equipping farmers with the predictive capabilities necessary for saving water while maximizing crop yields.

Big data can increase water use efficiency (Ciruela-Lorenzo et al. 2020) for different decision-making units in the agri-food industry. Meanwhile, various big data-based climate models can serve to assess annual agricultural conditions, set annual agricultural production plans, and ensure the efficient use of water and the prevention of land degradation (Zhang and Huisingh 2018). The value of big data lies in the potential to create prescriptive plans for AFSC actors and assist in water audits and policy formulations (Weersink et al. 2018). For instance, Weersink et al. (2018) note that the technological development associated with big data, including the Internet of Things (IoT) sensors for monitoring water pollution of farms, can be useful to create predictive algorithms that can cope with the stochasticity of the environment. For water-stressed regions, these algorithms can provide AFSC players with specific information necessary for combatting water shortages and maintaining the sustainability of water systems. Furthermore, the analysis of big data allows monitoring the quality of water from the mass data generated in real-time from various sources (Kamilaris 2018). As a result, this helps AFSC stakeholders to achieve accurate predictions of soil water patterns, properly manage agricultural water resources, and increase crop yields (Cai et al. 2019). Most importantly, big data is useful for evaluating environmental risks and increasing AFSC awareness of water issues. To illustrate, Aqueduct is an interactive big data-based water-risk tool from the World Resources Institute that can monitor and measure water risk anywhere on the globe using various parameters like water quantity and quality (Aqueduct 2019). Therefore, the integration of big data in the AFSC can substantially optimize water consumption, minimize permanent loss of water (Reynolds et al. 2018), and overcome issues of water accessibility.

4.2 Sustainable AFSC activities

4.2.1 Crop/plant management

Crop management constitutes the art and science of controlling or directing crop production in order to ensure the delivery of safe food products to society at reasonable costs to final consumers and with a sufficient margin of profit for the producer (Wiese 1982). For the sustainability of AFSCs, integrated crop management is a top priority for achieving sustainable agriculture (Camin et al. 2010). This implies that agri-food firms should set efficient crop planning, with greater consideration for quality standards, appropriate crop varieties, and consistent supply of food products. To sustain crop management practices, digitization has transformed AFSCs into smarter and data-driven processes. For instance, drones have been used to capture plant growth data and optimize the quantity of nitrogen for the fertilization of crops, whereas robots have significantly expanded farmers’ abilities to control vast expanses of crops (Ciruela-Lorenzo et al. 2020). Besides these technologies, the reliance on big data in AFSCs opens new prospects for farmers to effectively obtain critical information on crop cultivation and improve their productivity (Carbonell 2016; Li et al. 2011). The learnings obtained from big data are conducive to effective crop management decisions, higher operational efficiencies, cost reductions, and risk minimization (Gašová et al. 2017). Big data generated from the IoT not only helps to ease crop tracking, but also helps to ensure precise dosage of crop pesticide sprays, thereby increasing the marketability of the crop, farm returns, and environmental sustainability (Dupaľ et al. 2019; Marvin et al. 2017; Saiz-Rubio and Rovira-Más 2020).

The tremendous generation of data could potentially create a gold mine of knowledge about plant performance (e.g., stress tolerance, nutritional quality, overall crop) in diverse climatic conditions, soils, and management regimes (Halewood et al. 2018). This data variety and richness could embolden breeders and farmers to engage in crop improvement programs, thereby creating effective decision support tools (Finger et al. 2019; Chapman et al. 2018; Halewood et al. 2018; Serazetdinova et al. 2019). Equivalently, big data helps farmers to overcome the harmful issues related to improper crop protection practices because the technology could lead to valuable insights into the effects of pesticides and other chemicals on crops (Carbonell 2016). Zhang and Huisingh (2018) highlight that big data is a turning point in pest control, and research in this direction would contribute to AFSC sustainability. Li (2019) finds that the uptake of a crop cultivation system based on big data could provide farmers with timely information, which is necessary to improve the economic efficiency of AFSCs and increase the added value of science and technology in the agri-food industry. Coble et al. (2018) suggest that big data plays prospective roles in addressing nutrient runoff concerns, which are the primary cause of the degradation of water quality. The authors further note that the utilization of big data tools could significantly substantiate the value of the tools used to manage nutrient concerns through better evaluations, policies, and modelling of nutrient management strategies. Overall, big data can be utilized in cropping systems to renovate management practices and foster environmental sustainability through the minimization of negative impacts and the creation of resilient AFSCs (Delgado et al. 2019).

4.2.2 Animal management

The increasing intensification of animal farming has led to many sustainability challenges (Eisler et al. 2014; Steinfeld et al. 2006) such as excessive pollution and contamination of land, water, and air (Kamilaris 2018). Within the agri-food sector, animal farming has the largest environmental footprint (Steinfeld et al. 2006). According to Gerber et al. (2013), livestock farming is responsible for 14.5 per cent of GHG emissions produced by human activities, of which cattle farming for beef and dairy accounts for 65 per cent of emissions generated by the sector, either from pasture-based or confinement systems. To reduce the environmental impacts of AFSCs, animal farming has shifted from traditional practices towards smart farming or precision agriculture (Sharma et al. 2018; Kolipaka 2020). Smart farming is conceptualized as the development that integrates information and communication technology in the cyber-physical farm management cycle (Wolfert et al. 2017). With the use of advanced technologies, farmers would be able to maximize their efficiency levels, while causing minimum disturbances to the environment. On this subject, Finger et al. (2019) and Weersink et al. (2018) contend that big data constitutes a key enabler for smart animal farming, allowing farmers to manage the needs of individual animals in real-time. Tan and Yin (2017) note that big data systems can aid animal producers in producing sufficient animal feed, optimizing the use of additives, reducing wastage, and controlling pollution from animal production. Through the analysis of big data, AFSC actors can obtain a large amount of data, which were not quantifiable in the past, to monitor their livestock and increase animal health and welfare (Eastwood et al. 2019; Ramirez et al. 2019). Real-time data that are generated from smart sensors could be used to unlock higher value from animal farming operations and assist AFSC partners in making data-driven decisions (Astill et al. 2020). Additionally, the wealth of knowledge gained from the use of big data opens opportunities for the development of individual diagnostic and herd-level management tools that are necessary for the detection of health events and monitoring of high-risk physiological periods, resulting in production and animal health benefits (Pralle and White 2020). For sustainability to be achieved, AFSC actors can rely on big data to derive actionable insights, ensure increased control over the ambient atmosphere of the farm, and maximize the economic return of their animal management operations. Therefore, big data could support precision livestock farming practices by providing a variety of farm-generated data and the necessary analytical tools for improving the care and monitoring of livestock (Astill et al. 2020).

4.2.3 Waste management

Agri-food waste is a broad concept that refers to any residue generated by agricultural activities, such as food waste streams like potato or citrus peels (Forster-Carneiro et al. 2013). Mishra and Singh (2018) estimate that approximately one-third of the food produced is discarded or lost, amounting to 1.3 billion tons per year. The impact of waste on the environmental performance of AFSCs is considerable, as scholars argue that agri-food waste can produce large amounts of toxins and malodorous gases, which pollute water and air (He et al. 2012). Beyond environmental concerns, Irani et al. (2018) emphasize that food waste is an increasingly pressing societal issue that questions the fair distribution of food products, the methods of food production, and the reasons for food waste. Therefore, AFSC players, ranging from farmers, wholesalers and logistics service providers to food retailers, should revisit their AFSC practices to minimize agri-food waste and its ensuing carbon emissions and social implications. To provide solutions for food waste, agri-food businesses can use big data applications to limit food waste and decrease their carbon impact (Mishra and Singh 2018). The massive amount of AFSC data generated by the IoT can be analyzed using data analytics techniques to identify the sources of food waste in the supply chain (Kamble et al. 2020) and support AFSC managers to reduce the potential economic waste associated with AFSC activities by making effective decisions (Lioutas et al. 2019). Because AFSC businesses are required to develop proactive practices to support resource recovery from waste (Sgarbossa and Russo 2017; Xia et al. 2016), they can use big data to produce a reliable analysis of the extensive misuse of AFSC resources and thereby reduce food waste (Kamble et al. 2020). Furthermore, by using big data, AFSCs can mitigate the waste of production inputs, such as soil, water and land, and foster environmental sustainability.

The collaborative capabilities of big data applications are a foremost step toward the acceleration of effective waste management and disposal solutions (Sharma et al. 2020). In other words, increasing collaboration with big data solutions can optimize AFSCs, support effective food waste reduction strategies, and prevent food spoilage. Related to these points, Mishra and Singh (2018) submit that AFSC retailers can leverage big data analytics to eliminate waste by utilizing consumer complaints made in retail stores or rely on social media data. Likewise, agri-food businesses can capitalize on big data to mitigate potential resource waste, even at every pre-consumer phase. The role of big data to minimize AFSC waste is emphasized by Singh et al. (2018) who develop a big data cloud-computing framework to assist farmers in measuring their carbon footprint in a cost-effective manner. While the carbon footprint of food products generates substantial GHG emissions every year, big data can be a promising solution for AFSCs to plan and direct waste prevention initiatives.

4.2.4 Traceability management

Food traceability is a central notion in AFSCs, and it refers to the ability to trace the history of a product and to identify the origin of a food product, the sources of all ingredients used, and their location in the supply chain by means of records (Opara and Mazaud 2001). As per Bosona and Gebresenbet (2013), traceability can be conceptualized as the part of logistics management that captures, stores, and transmits appropriate information about foods throughout all stages of the AFSC in order to assure quality and safety. From the AFSC sustainability perspective, the importance of traceability is evident since the primary goal of AFSC actors is to increase food safety and bring other benefits to production systems and supply chains (Giagnocavo et al. 2017). The opportunities of traceability involve the improvement of operational efficiencies and the increase of consumer trust and confidence in the ethical dimensions of values and processes in the AFSC. Recognizing that traceability has a critical role in the value chain of AFSC products, Giagnocavo et al. (2017) advocate that the use of new technologies can contribute to the sustainable development of the agri-food industry. On this basis, advances in big data technologies have the potential to redefine AFSC traceability, allowing agri-food stakeholders to verify environmental stewardship traits across the AFSC from the farmer to the final consumer (Khanna et al. 2018).

By integrating big data in AFSC traceability systems, agri-food businesses would be able to improve their process control, optimize material use, and manage production more efficiently. For example, in the grain market, Jakku et al. (2019) suggest that big data applications could improve supply chain traceability and create value for consumers, retailers, processors, and growers. In the AFSC, the combination of big data, the IoT, and blockchain is potentially able to bring increased levels of visibility, traceability, transparency, authenticity, and quality of food products (Kamble et al. 2020). The automation of data capture merged with big data analytics can substantially benefit food traceability and turn AFSCs into consumer-driven chains (Lioutas et al. 2019). The adoption of sensor-based technologies and big data analytics permits agri-food businesses to monitor their AFSCs carefully and trace any contaminated food products to their source. In turn, this could also positively affect consumer willingness to pay for sustainable AFSC products (Khanna et al. 2018). Therefore, big data radically transforms AFSCs and turns them into data-driven supply chains, supporting sustainable goals, efficient information sharing, and effective decision-making processes. All of these opportunities are echoed in the study of Kamble et al. (2020) who suggest that big data could simplify traceability, shorten AFSCs, and empower deprived communities in the agri-food industry.
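To make the idea of tracing a contaminated product to its source more concrete, the following minimal Python sketch walks a set of hand-off records backwards from a retail lot to its origin. It is an illustration only and is not drawn from any of the reviewed studies: the record structure, the lot identifiers, and the function name `trace_to_source` are hypothetical, and the sketch assumes that each lot derives from a single upstream lot.

```python
from typing import Dict, List, Optional

# Hypothetical traceability records: each lot maps to the lot it was
# derived from. None marks the origin (for example, a farm-level lot).
PARENT_OF: Dict[str, Optional[str]] = {
    "retail-001": "processor-17",
    "processor-17": "wholesale-4",
    "wholesale-4": "farm-A",
    "farm-A": None,
}


def trace_to_source(lot_id: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of lots from lot_id back to its origin."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = lot_id
    while current is not None:
        if current in seen:
            # Guard against inconsistent records that loop back on themselves.
            raise ValueError(f"cycle in records at lot {current}")
        if current not in parent_of:
            # A missing record means the chain of custody is broken.
            raise KeyError(f"no record for lot {current}")
        seen.add(current)
        chain.append(current)
        current = parent_of[current]
    return chain


if __name__ == "__main__":
    path = trace_to_source("retail-001", PARENT_OF)
    print(" -> ".join(path))
```

In practice a food product usually combines several ingredients, so the records would form a graph with multiple upstream lots per product rather than a single chain. The same backward walk would then need to visit every parent of each lot.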

Overall, Fig. 5 presents a summary framework of the potential of big data applications for the development of sustainable AFSCs. The findings of the review illustrate that the benefits of big data for AFSCs span the three foundational elements of sustainability. Practitioners can use the framework developed in the current study, as it elucidates how big data contributes to the overall efficiency of AFSCs and alleviates several challenges faced by the agri-food industry, such as degraded soil health, water misuse, and agri-food waste. The proposed framework can guide practitioners in deciding on the appropriate AFSC resources and AFSC management practices for improved sustainable performance.

Fig. 5 Summary framework of the potentials of big data for AFSC sustainability

5 Discussions, challenges and implications

5.1 Discussions

Big data applications are used in a variety of sectors, and the agri-food industry is no exception. In general, big data denotes large data sets (generally built by integrating multiple sources of related data) that can be analysed through information systems in order to reveal patterns, trends, and associations of value for a variety of decision-making purposes. In this study, we aim to review the extant literature on the intersection of big data, AFSCs, and sustainability, and to provide insight into how big data can improve the sustainable performance of AFSCs. Typically, the unique contribution of big data is to aggregate the vast amount of data from farming operations to develop alternative management strategies for desired outcomes. Consequently, agri-food organizations can leverage the aggregated data to create sustainable AFSCs, where they can enhance the marketability of their products and add specified sustainable attributes in their supply chains (Kamble et al. 2020). According to Delgado et al. (2019), big data analysis will be one of the critical tools that will accelerate the development of sustainable systems. The authors also argue that the challenges of the 21st century for AFSCs are to conserve soil and water and to guarantee food security. Therefore, the sustainability of AFSCs is urgently needed given the intensive nature of agriculture and the rapidly changing climate. Sustainable AFSCs can contribute, to some extent, to reducing environmental impacts and slowing the pace of climate change. It is widely noted that AFSCs are at the initial stages of utilizing big data tools to support farmers in achieving higher efficiencies and economic gains, enhancing process visibility, promoting trust and transparency, and reducing their environmental footprint.

From an AFSC sustainability perspective, both AFSC resources and AFSC activities require proper handling and disposal, especially in relation to end-of-life treatments, such as landfilling or incineration, which have substantial impacts on the environment. Likewise, disposal and the loss of value of wasted food products are examples of direct economic impacts. In addition, sustainable AFSCs can have significant impacts on food security and poverty. Therefore, appropriate strategies aiming at the realization of AFSC sustainability, emissions prevention, and resource valorisation can help avoid or mitigate the negative impacts of AFSCs. As argued by Östergren et al. (2017), maintaining resource efficiency is a high priority, and policymakers and different stakeholders should seek more alternatives for production and consumption activities within AFSCs so that the balance between the environmental, economic and social dimensions of sustainability can be maintained.

5.2 Further challenges

The implementation of big data in the agri-food industry helps to forge a pathway towards more sustainable AFSCs. In several ways, the use of big data can substantially benefit AFSC resources, which are a determining factor in the ability of agri-food businesses to deliver higher yields at greater efficiencies (Ryan 2020). Although it is established that big data can assist farmers in better managing their crops, livestock, agri-food waste, and traceability processes, several challenges still lie ahead on the path towards big data-enabled AFSCs. For instance, the poor quality of data and its limited availability could restrain the value of big data for AFSCs (Villa-Henriksen et al. 2020). Analysts estimate that the cost of poor data quality within a typical firm could reach between 8 per cent and 12 per cent of revenues (Sethuraman 2012). Generally, AFSCs consist of farmers who follow traditional farming practices instead of analysing data to make evidence-based decisions. The lack of information infrastructure curtails agricultural production, data analysis operations, and forward-looking guidance (Guo and Wang 2019). In the case of generating large-scale agricultural data, Li et al. (2019) argue that AFSC stakeholders would need to meet the analytical needs for rapidly mining knowledge and information and developing data warehousing capabilities.
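As a rough illustration of what the 8 to 12 per cent estimate attributed to Sethuraman (2012) would imply for a single firm, the short Python sketch below converts an annual revenue figure into a cost range. The function name `data_quality_cost_range` and the revenue value used in the example are hypothetical, and the calculation simply applies the two percentages; it is not a costing model from the reviewed literature.

```python
from typing import Tuple


def data_quality_cost_range(
    revenue: float, low_share: float = 0.08, high_share: float = 0.12
) -> Tuple[float, float]:
    """Return the (low, high) estimated annual cost of poor data quality.

    The default shares reflect the 8-12 per cent of revenues cited in the
    text; they are an external estimate, not a measured value.
    """
    if revenue < 0:
        raise ValueError("revenue must be non-negative")
    if not 0 <= low_share <= high_share <= 1:
        raise ValueError("shares must satisfy 0 <= low_share <= high_share <= 1")
    return revenue * low_share, revenue * high_share


if __name__ == "__main__":
    # Hypothetical agri-food firm with 5 million in annual revenue.
    low, high = data_quality_cost_range(5_000_000)
    print(f"Estimated cost of poor data quality: {low:,.0f} to {high:,.0f}")
```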

As per Villa-Henriksen et al. (2020), the introduction of big data poses significant issues for data analysis because of the increasing complexity and heterogeneity of datasets. This would furthermore require the use of advanced data analytical techniques and solutions to generate actionable insights in the AFSC. The transition to big data-enabled AFSCs might leave some actors without the basic mechanisms and abilities to compete on equal terms, afford the cost of big data agriculture, and effectively use and analyse big data (Lioutas and Charatsari 2020). The weak ICT infrastructure, the lack of resources, and the shortage of skilled and experienced professionals (Eastwood et al. 2019; Kamilaris 2018; Lioutas and Charatsari 2020) might hinder the ability to use big data for precision livestock farming (Ramirez et al. 2019). To sustain their operations, agri-food firms are compelled to hire highly talented data scientists (Wolfert et al. 2017) for the analysis and management of AFSC data (Kamble et al. 2020). As a result, agri-food businesses have to incur additional costs and make investments (Reynolds et al. 2018) in order to modernize their existing infrastructure and link data from a variety of on-farm systems, equipment, and records (Ramirez et al. 2019). The absence of governance modes can be a significant barrier to the implementation of big data in the AFSC (Astill et al. 2020). Specifically, a more data-driven AFSC must be developed based on data processing pipelines that could seamlessly use datasets with easy access, reusability, and interaction (Donohoe et al. 2018). This is essential as proper governance mechanisms (Rotz et al. 2019) are the key to AFSC sustainability (Astill et al. 2020; Eastwood et al. 2019; Costello and Ovando 2019). Further, the shift toward big data-enabled AFSCs is not an easy task given that processes and procedures for data governance, security, and legal compliance have to be developed so that data quality and integrity are ensured (Astill et al. 2020). Therefore, the development of organizational, ethical, and legal arrangements for data sharing is imperative to facilitate collaboration between AFSC entities and data scientists (van Evert et al. 2017).

5.3 Implications

5.3.1 Implications for practitioners

The pressing need for increasing efficiencies and sustainability has accelerated the digitization of AFSCs. The emergence of high-tech data innovations presents an opportunity for agri-food firms to renovate their practices and improve sustainability, efficiency, and competitiveness. In this study, we examined the role of big data in sustaining AFSCs through a systematic analysis of the retrieved literature. Practically, the current research offers several implications. Practitioners can employ big data to strengthen the conservation of their AFSC resources. To improve soil performance, agri-food practitioners and scientists can rely on big data to gather enormous amounts of data related to soil chemistry, conditions, and properties (e.g., humidity, temperature). The collection of such data is beneficial for boosting AFSC digitization and informing decisions on soil monitoring (Fernández-Getino et al. 2018). In an attempt to optimize their crop productivity, AFSC partners can use big data to provide a favourable soil environment while protecting soil from further damage and compaction. The conservation of water resources is a must, and agri-food practitioners should place water at the top of their sustainable development agenda. To do so, practitioners can draw on big data to ensure water accessibility, devise sustainable irrigation practices, and enhance water quality.

Gaining insights into the use and value of water resources is crucial to inform investments in the water sector and optimize the allocation of water and its pricing. Moreover, the review suggests that big data is a key enabler for successful crop/plant management. Crop maximization is realized by the efficient management of AFSC inputs, notably soil and water. Big data solutions can substantiate the utility of precision farming, thereby responding better to field variability for crops and increasing food supply. AFSC practitioners can take advantage of big data to improve the accuracy of their crop forecasts, monitor the health and growth of crops, and ensure efficient handling of crops. Here, the increase in food production is significant as it can lead to reduced hunger, improved food security, and higher overall income for the population, as highlighted by the World Bank (2020). The analysis of data captured in agriculture by the IoT and smart sensors can generate actionable insights that improve farm productivity and profitability. It is furthermore evident that practitioners will consider investment in big data technologies as a vehicle to foster AFSC sustainability while achieving higher operational efficiencies. Besides crop/plant management, big data provides tremendous support and guidance to improve the management, monitoring, and control of animals in the AFSC.

The availability of big data can bolster precision livestock farming, making sure that farmed animals receive real-time care (Astill et al. 2020). Hence, practitioners will find value in acquiring the data necessary for minimizing the effects of animal diseases, streamlining animal feeding operations, and increasing animal welfare. Without rigorous monitoring of livestock health, agri-food practitioners would incur substantial losses arising from the low productive capacity of livestock systems, the death of animals, and possible public health outbreaks. When it comes to waste management, the application of big data can enable agri-food businesses to mitigate wasteful practices, which may be caused by the unrealistically low prices of AFSC inputs (e.g., water, energy), supply chain opacity, or duplication of business processes. Thoughtless AFSC practices are destructive, and practitioners will be required to harness big data capabilities to prompt actions that promote waste minimization and seize further opportunities, such as enhancing AFSC processes, optimizing profits, and achieving cost savings. The last thing to consider is AFSC traceability. One of the primary reasons why AFSC managers should recognize the potential of big data is its ability to achieve transparency (Wang et al. 2016), which is, in the view of Skilton and Robinson (2009), a necessity to ensure effective traceability and increase food quality and safety.

5.3.2 Implications for researchers and future research trends

The present study has several implications for researchers. First of all, this study is different from past reviews (Wolfert et al. 2017; Saiz-Rubio and Rovira-Más 2020; Kamilaris et al. 2017) in that it aims to study the potentials of big data for the development of sustainable AFSCs. More precisely, the focus of the review has been narrowed down to the intersection between big data, sustainability, and AFSCs. Informed by the findings of the review, we summarized the potentials of big data in a comprehensive framework that encapsulates the major resources and activities of the AFSC. Mapping the benefits of big data in the framework can help researchers understand the application boundaries of big data for agri-food sustainability. It is important for researchers to be aware of the AFSC resources and activities that can be sustained by big data implementations. Throughout the review, we noticed a lack of studies clarifying the relationship between big data and AFSC sustainability; therefore, bridging this knowledge gap is an urgent need. It is also the primary contribution of this study to systematize big data research from the perspective of AFSCs and to unfold the conceptual development of this field.

This study argues that big data can be a potential remedy to the degradation of AFSC resources and a key enabler for the long-term performance of agri-food businesses. Thus, no solution should be expected to become a long-term success if it is not compliant with the economic, social and environmental dimensions. Consequently, the summary framework that we suggest here yields a sharpened understanding and pragmatic guidance for scholars to scrutinize the following research gaps and address them in their future research. The study contributes to the existing body of knowledge in the following ways:

  • The what: through recognizing a distinctive part of critical examination of the big data that has been previously overlooked or restrained in the literature such as:

    1. Less focus has been placed on the challenges of big data-enabled AFSCs within agri-food businesses. Therefore, future research should propose feasible solutions to overcome the barriers hampering the effective integration of big data in AFSCs.

    2. Previous studies suggest that big data could contribute to the conservation of soil and water for AFSC activities. However, the potential importance of big data in the sustainable development of clean and energy-efficient agriculture was overlooked. The use of big data to produce accurate energy estimations and predictions can increase the efficiencies of AFSCs. Investigating best practices for energy saving in the agri-food industry constitutes a promising research direction.

    3. There is a lack of knowledge relating to the techniques necessary to improve big data quality and reduce the negative impact of data inaccuracies within AFSC datasets. A likely first endeavour in this regard is to purify data through fast model algorithms in order to generate better analysis in AFSCs.

  • The how: through emphasizing the integrative approach rather than the linear nature of the process such as:

    4. Researchers should seek to pin down potential behavioural changes associated with big data implementations to increase the resilience of AFSCs and also provide agri-food industry policymakers with insights and means to assess new and existing AFSC policies.

    5. From a systems lens, a multi-disciplinary approach that considers the role of big data in developing effective policies and stimulating decisions that affect AFSC sustainability will add substantial value to big data research in the agri-food industry.

    6. Future research needs to empirically investigate the impact of big data on AFSC collaboration and performance, and provide guidelines on how to employ big data for optimizing the efficiency of AFSC systems. The framework developed in this study may be useful and inspiring for researchers to derive compatible constructs for the assessment of this impact and the formulation of guidance for policy and practice.

    7. The findings of the review reveal that big data can encourage the wise management of AFSC resources and facilitate the synergy between AFSC processes. Nevertheless, adequate and efficient analytical methods to leverage and integrate large amounts of data on soil quality, water, weather, land cultivation, and AFSC partners’ decisions should be examined in order to devise developmental agendas for a sustainable agri-food industry, strategies for environmentally sustainable land exploitation, and efforts to understand the drivers of agri-food stakeholders’ behaviours.

    8. Another interesting avenue for future research is to study the impact of big data on not only the performance of AFSCs but also the structure of agri-food businesses and the quality of farm life. As indicated in our study, both AFSC resources (i.e., soil and water) and activities (i.e., crop/plant management, animal management, waste management, and traceability management) are affected to varying degrees by the IoT technologies utilized for data collection and the application of data analytics.

    9. Researchers can reuse an extended version of our framework to analyse new conceptual and application fields of big data. This helps to advance big data research in AFSCs and to cope with the expansive evolution of technologies in the agri-food industry. Moreover, the elements of the proposed framework can be implemented to develop a big data-based recommendation system to improve AFSC management.

  • The why: through identifying a new logic underlying the process that accommodates for (a) the nature of data-driven innovations, and (b) the confusing boundaries between big data teams and their users such as:

    10. Additional studies on the validation of the technological potentials of big data are necessary to help scholars understand how exactly big data can streamline AFSC processes.

    11. There is a need to develop advanced algorithms and models for enhancing the prediction and forecast of weather conditions in order to deploy sustainable irrigation measures effectively.

    12. Future research can examine the role of organizational values in big data-supported AFSC designs, agri-food businesses, and governance of big data.

    13. The impact of institutional pressure on big data adoption in AFSC remains unaddressed. Whether market-driven and non-government incentives are underlying motives for the uptake of big data in the agri-food industry is a knowledge gap that should be closed.

6 Conclusions

This study used an SLR method in order to gain insight into the current state-of-the-art of big data research in the AFSC. The sustainability of AFSCs has been identified as an important and significant research field with a multi-disciplinary nature. The upward trend in the number of publications in this area confirms this tendency. The analysis of the literature pointed toward the development of a summary framework that encapsulates the potentials of big data for the development of sustainable AFSCs. We also recommended 13 actionable future research directions.

Although this is not the first time the environmental impacts of AFSCs have been studied, our proposed framework goes beyond current practices to improve sustainability, efficiency, and competitiveness for AFSCs. The framework is underpinned by a multi-disciplinary approach balancing agricultural and ecological issues and underscoring the power of big data to better understand the interdependency of AFSC resources and activities and to rationalize the economic, social, and environmental benefits of the technology. This framework highlights the importance of big data for sustainable AFSCs and its ability to ensure food security and satisfy environmental goals. While the environmental goals currently focus on emissions mitigation, the framework can be easily extended to consider other environmental factors. For example, careful consideration should be given to the impact of big data on soil and water resources in order to devise more comprehensive strategies for sustainable development. Full awareness of the observed elements of the agri-food system, including climate, soil properties, and related agronomic parameters for each food variety and crop, is required. Despite policymakers’ interests, most AFSC actors are currently more concerned with costs than carbon footprints, and incentive barriers and technological constraints impede the implementation of mitigation options in the food sector (Smith et al. 2007). However, this study suggests that, through the use of big data, it would be possible to minimize the environmental impacts generated throughout the different stages of the AFSC. Indeed, another objective of this study is to promote more sustainable data-driven AFSCs.

Importantly, the summary framework derived from the findings of the review provides a greater opportunity for researchers to further explore AFSC resources and activities that can be sustained by big data implementations. Given the lack of studies clarifying the relationship between big data and AFSC sustainability, this study fills an important knowledge gap. Moreover, it helps to synthesize big data research in sustainable AFSCs and confirm the role of emerging technologies in making AFSC sustainability a reality and not an illusion. The potentials of big data integration in AFSCs span disciplines as varied as agronomy, management, and environmental engineering, thereby providing insight into the development of sustainable AFSCs. From a practical perspective, the accountability of AFSC actors for improved food safety can be realized through big data applications and sound AFSC practices. Moreover, agri-food organizations need to develop big data analytics capabilities and capitalize on digitization to effectively build competitive and sustainable AFSCs, which are much needed for achieving triple-bottom-line sustainability in the agri-food industry.

Despite its scholarly contributions, this study is not without limitations. The use of Scopus as the sole database for this review may not help to capture all studies that are potentially relevant to the scope of the paper. Moreover, the review considered only journal articles and omitted other equally essential sources of knowledge such as books, chapters, and conference papers. Finally, the findings of the review were guided by the set of keywords used by the authors, and thus, they should be validated with empirical approaches such as expert interviews and surveys.