Big data in the food supply chain: a literature review

The emergence of big data (BD) offers new opportunities for food businesses to address emerging risks and operational challenges. BD denotes the integration and analysis of multiple data sets, which are inherently complex and voluminous and are often of inadequate quality and structure. While BD is a well-established method in supply chain management, academic research on its application in the food ecosystem is still lagging. To fill this knowledge gap and capture the latest developments in this field, a systematic literature review was performed. Forty-one papers were selected and thoroughly analysed to identify the enablers of BD in the food supply chain. The review primarily sought to answer the following research question: “What are the possibilities of leveraging big data in the food supply chain?” Six significant benefits of applying BD in the food industry were identified, namely, the extraction of valuable knowledge and insights, decision-making support, improvement of food chain efficiencies, reliable forecasting, waste minimization, and food safety. Finally, some challenges and future research directions were outlined.


Introduction
The food industry is an integral part of every economy and plays a critical role in supplying the necessities for human survival and in providing consumer choice (Turi et al. 2014). According to estimates, US$14 trillion of food is produced, packaged and sold worldwide every year, encompassing a multitude of transactions between suppliers, retailers and consumers. At the same time, the global food system still faces a series of serious challenges, such as world population growth, rapid urbanization, ageing populations, sustainability, and alarming global environmental change (Cerqueira et al. 2019). Similarly, the fragmented nature of global food supply chains makes it harder to respond to consumers' requirements in terms of food safety, quality, and authenticity. The food supply chain is a dynamic system encompassing food brands, primary producers, processors, regulators, third-party actors and other resources engaged in various processes and governance (Yu and Nagurney 2013). With the fast pace of technology developments, the conventional ways of managing and delivering food products to markets and consumers are evolving. Today, technology is viewed as a critical enabler, and Nambiar (2010) argued that food suppliers could use technology to enable continuous monitoring to preserve quality and provide cheaper food products to consumers. The use of technology results in increased operational efficiencies and savings throughout all the links of the food supply chain (Huscroft et al. 2013; Jayaraman et al. 2008; Jovanovic et al. 1994).
Digital technologies are constantly developed and deployed across the agro-food system, from the farmer to the consumer (Rotz et al. 2019). Over the past twenty years, advances in information and communication technologies (ICTs) have enabled new opportunities and innovations for improving the outcomes of agricultural activities (Xin and Zazueta 2016). For example, Radio Frequency IDentification (RFID) technology can be integrated into the food supply chain, allowing organizations to gain enhanced granularity in supply chain traceability for compliance and business process improvement (Attaran and Attaran 2007). RFID also enables the real-time monitoring and visibility of re-usable assets such as pallets or totes carrying food products. It facilitates the acquisition of more accurate inventory data and the tracking of food cargo at various levels of aggregation in the supply chain. The emergence of the Internet of Things (IoT) enhances the pervasive presence of 'things' or 'objects' with RFID tags, sensors and actuators interacting or participating on a network (Atzori et al. 2010). This can benefit the food industry and improve aspects such as the management of food loss (food loss occurs in pre-consumer phases) and food waste (Wen et al. 2018). The use of IoT in food chains has also intensified with billions of ubiquitous and interconnected devices ranging from mobile tools, equipment and machinery on farms to household appliances and temperature-sensing devices (Rao and Clarke 2019). When IoT is combined with other technologies, it helps to visualize food supply chain processes and the geographic mapping of supply routes (Rejeb 2018a, b; Rejeb et al. 2019). Furthermore, sophisticated tools, devices and technologies also include autonomous guided vehicles (AGV), precision farming using robotics and artificial intelligence (AI), distributed ledger technology (DLT), cloud computing and BD tools that combine to reshape agriculture at an unprecedented pace (Phillips et al. 2019). Technology is leveraged to process and handle large data streams from multiple sources and origins in the food chain.
BD is perceived to be a critical technology in food chains, agriculture, and other sectors of the economy (Sonka 2014). BD is defined as "a conglomeration of the booming volume of heterogeneous data sets, which is so huge and intricate that processing it becomes difficult, using the existing database management tools" (Subudhi et al. 2019, p.2). It can be understood as the processing and analysis of large data sets obtained from various sources such as online user interactions, consumer-generated content, commercial transactions, sensor devices, monitoring systems or any other consumer tracking tools. BD also refers to the massive amounts of digital information about human activities, which are generated by a wide range of high-throughput tools and technologies (Marchetti 2016). According to Cavanillas et al. (2016), BD is an emerging field where innovative technology offers new ways of extracting value from the volumes of data and information generated. In the context of food supply chains, BD is a fast-growing area that supports decision-making processes, differentiates and identifies final products based on market demands, and aids in food safety (Armbruster and MacDonell 2014). Research and development on crop improvement and sustainable agriculture have significantly benefitted from the usage of BD in crop modelling for targeting genotypes to different environments (Löffler et al. 2005). For instance, analyses based on consumption and crop growth data could aid farmers in determining which crop varieties to plant and which to minimize, enhancing crop yield, increasing sales, and maximizing returns on investment (Tao et al. 2021). Similarly, the use of big geospatial data (e.g., from wireless networks, farm machinery telemetry, and periodic remote sensing) enables better management practices in soil erosion, water pollution, and disaster risk management in agriculture (Řezník et al. 2017). The ability to collect and analyze data on crop variety, quantity, quality, location, weather events, market prices, and management decisions can support predictive analytics tasks and enable farmers and farming cooperatives to improve crop forecasting (Jakku et al. 2019). The use of BD also encourages the development of precision agriculture, which contributes to water conservation (O'Connor et al. 2016), soil preservation, limited carbon emissions (Ochoa et al. 2014), and optimal productivity (Mayer et al. 2015).
Furthermore, the advent of BD has the potential to improve the design of food supply chains, develop relationships among stakeholders, enhance customer service systems, and support the management of daily value-added operations (Waller et al. 2013). The application of BD can help food businesses become more profitable by increasing their operational efficiencies, improving their potential economic gains, and optimizing their resource allocation. When BD is combined with artificial intelligence (AI) tools, the risks related to the occurrence of pathogens, contaminants or adulterants used in economically motivated adulterations (EMA) in the agriculture chain can be predicted (Marvin et al. 2017; Spink et al. 2019). Although these benefits are tangible, several challenges remain.
While BD has gained remarkable attention from both scholars and practitioners, research investigating the applications of BD in food chains remains scarce (Rotz et al. 2019). Moreover, few studies use BD analytics with a focus on sustainable agriculture and food supply chains. Therefore, to fill this knowledge gap, the primary goal of this study is to explore the relevant literature and identify how BD potentially impacts food supply chains. By synthesizing the literature published in leading journals, the authors strive to demonstrate how the adoption of BD in the food supply chain can improve operational efficiencies, enhance food quality and safety, and develop a sustainable food ecosystem. In dealing with this increasingly important topic, this study aims to provide a deeper understanding of the following research question (RQ): RQ: What are the possibilities of leveraging BD in the food supply chain?
The contribution of this research to the BD literature is significant. Based on the authors' current understanding and knowledge, this study presents the first reference to the potential of BD in food supply chains. In addition, the review is among the first to capture the dynamic nature of this topic, providing a systematic review of the recent investigations on BD in the context of food supply chains from literature appearing in leading journals. The review of previous scholarly research provides a timely summary of current evidence that can be used to increase the understanding of BD among scholars focused on the food, technology and supply chain industries. Food industry practitioners and decision-makers can derive new insights into how to design sustainable food supply chains with the emerging field of BD. Thus, this study is motivated by the limited discourse about the usefulness of BD in supply chain management (Engelseth et al. 2018). This gap in the literature is what the authors explicitly intend to fill.
The remainder of the paper is structured as follows. Section 2 describes the methodology of the review. The subsequent section presents the statistical classification of publications. Section 4 provides a detailed discussion of the possibilities of BD in the food supply chain based on the findings of the reviewed literature. In Section 5, some challenges of BD are discussed. The last section concludes the paper and discusses the research contributions, limitations and future research directions.

Research protocol development
To answer the research question of the present study, the authors conducted a systematic review of published literature following the guidelines proposed by Denyer and Tranfield (2009). A systematic literature review (SLR) is a scientific activity that aims to evaluate and interpret all available research relevant to a particular research question or topic area or phenomenon of interest (Kitchenham and Charters 2007;Kitchenham 2004). An SLR is also a method that helps to consolidate and advance scientific research through locating, appraising and summarizing the existing literature. In order to survey the current state of scientific knowledge regarding the research question, an SLR is driven by prescribed steps to ensure the relevance of the retrieved literature, the minimization of research errors and bias, and the reliability of the quality assessment. The presentation and the process of the SLR in this study aim to establish a familiarity with what is already published about BD applications in the food supply chain. Along the process, care is taken in ensuring that the steps of the review are transparent, rigorous, reliable and repeatable. Furthermore, the authors developed and strictly followed a review protocol that is based on the iterative cycle of identifying adequate search keywords, selecting the relevant studies, and eventually carrying out the analysis. The review protocol is generated based on the central research question and the search string in order to extract the relevant studies. All the authors jointly specified and developed the necessary stages of the protocol. Table 1 describes in detail the selection of the search database, the collection of studies, and the eligibility criteria.

Data collection
Based on the surveyed Scopus research database, the initial result of the search queries was 131 publications. To further refine the results, the corresponding author undertook the removal of duplicates and the articles with missing bibliographic data points. The publications were also analyzed and filtered according to the eligibility criteria mentioned in Table 1. The authors screened the titles and abstracts to identify the initial relevant studies, retrieving 62 publications for full-text review. After reading the full content and assessing the quality of articles, a total number of 41 articles were selected for complete review. The final selection of articles was guided by the research question of this study. In other words, out of the 62 publications, authors only considered publications that identified the possibilities of BD from the food chain perspective. As a result, all the 41 publications were relevant to the scope of the present study, and they provided discussions on BD from the perspective of food supply chains. Figure 1 shows the process of data collection.

Table 1
Research online database: Searches were conducted in Scopus, a leading and global multidisciplinary research platform covering nearly 20,000 titles of peer-reviewed journals from over 5000 publishers (Blettler et al. 2018). Unlike other scientific databases, Scopus is known for its extensive coverage of academic literature and its international orientation. It is also a more comprehensive and representative bibliometric source than any other database.
Publication types: Limited to academic, peer-reviewed literature. Furthermore, the search was limited to highly rated journal publications.
Language: Limited to publications in the English language only.
Date range: Unlimited. No specific date range was used to conduct the research.
Search fields: Title, abstracts, and keywords
Search keywords: TITLE-ABS-KEY ( "big data" AND ( food OR "food supply chain*" OR "food chain*" ) )
Inclusion criteria: Limited to publications that studied BD applications in food supply chains.
Exclusion criteria: Publications with a deep and pure technical focus were excluded. Studies on BD applications beyond the food and agricultural supply chains were filtered out.
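The Boolean logic of the search string and the screening counts reported above can be illustrated with a short sketch. It is only an approximation under stated assumptions: the sample records are invented, the helper names (`matches_query`, `retention`) are hypothetical, and the regular expressions merely imitate how Scopus evaluates TITLE-ABS-KEY with wildcards. Only the counts 131, 62 and 41 come from the review itself.

```python
import re

# Hypothetical records; the fields mirror the search fields of Table 1
# (title, abstract, keywords), but the content is invented for illustration.
records = [
    {"title": "Big data analytics for food supply chains", "abstract": "", "keywords": ""},
    {"title": "Cold chain logistics", "abstract": "We use big data to track food cargo.", "keywords": ""},
    {"title": "Big data in automotive manufacturing", "abstract": "", "keywords": "industry 4.0"},
    {"title": "Food safety regulation", "abstract": "A policy review.", "keywords": "food chain"},
]

BIG_DATA = re.compile(r"\bbig data\b", re.IGNORECASE)
# food OR "food supply chain*" OR "food chain*" (the trailing * becomes \w*)
FOOD_TERMS = re.compile(r"\bfood\b|\bfood supply chain\w*|\bfood chain\w*", re.IGNORECASE)


def matches_query(record):
    """Approximate "big data" AND (food OR ...) over title, abstract and keywords."""
    text = " ".join([record["title"], record["abstract"], record["keywords"]])
    return bool(BIG_DATA.search(text)) and bool(FOOD_TERMS.search(text))


def retention(funnel):
    """Return (stage, count, percent of the initial search result) for each stage."""
    initial = funnel[0][1]
    return [(name, count, round(100 * count / initial, 1)) for name, count in funnel]


hits = [r for r in records if matches_query(r)]

# Screening counts as reported in the text.
funnel = [("initial search", 131), ("full-text review", 62), ("final sample", 41)]

if __name__ == "__main__":
    print(len(hits), "of", len(records), "sample records match the query")
    for stage, count, pct in retention(funnel):
        print(f"{stage}: {count} ({pct}% of initial results)")
```

Expressing the funnel as percentages of the initial result makes the selectivity of each screening step explicit, which can help when comparing this protocol with other reviews.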

Publications by year
The search was carried out in October 2019. Figure 2 presents the number of publications published by year and extracted from the execution of the research protocol.
Despite being a well-established technology, the interest in BD within the food industry has considerably increased over the recent years. Papers studying the application of BD to food supply chains were almost all published from 2013 onward. More specifically, there is an upward trend in the number of articles published on the subject from the year 2013 to 2019. The number of articles published from 2014 onward has exponentially increased, showing that the applications of BD have gained more recognition and increasing academic attention among food chain researchers. The reason is that many globalized food supply chains are currently migrating to an Industry 4.0 setting, embracing modern technological solutions that are commonly used in other industries (e.g., automotive industry). Industry 4.0 represents a milestone for the modernization and acceleration of food supply chains. As a critical technological component of this emerging paradigm, BD promises a revolutionary leap in the management of food chains among highly dispersed networks of several actors. BD contributes to the successful development of data-driven food supply chains responding to the core needs of businesses and other stakeholders. Out of the total reviewed studies, 36 papers were published in the last three years (2016-2019), reflecting that the integration of BD into food chain activities is still a nascent research area worth discussing and exploring in a much more in-depth manner.

Publications by country
In order to analyse the geographical distribution of publications concerning BD in the food supply chain, the authors' affiliations were identified at the time of publication. As shown in Fig. 3, a significant contribution to the BD literature in the context of food supply chains came primarily from the USA and the UK, with 15 and 7 papers, respectively. This finding is predictable for both countries. For example, Armbruster and MacDonell (2014) noted that several efforts are steadily underway in the US food system to harness BD to preserve the quality and safety of food products. BD applications in weather and climate have been used in the USA to establish climate predictions and disaster response in real-time network systems using satellite image data (Lee et al. 2015). According to the analytical agency Mind Commerce, the market size of BD in the US reached $20 billion in 2013, whereas in 2014 the value was $29 billion, a growth rate of 45% (Ramzaev 2015). The importance of BD is also rising in the UK, where the technology has been identified as a driver for economic growth and one of the eight key government priorities (Government 2013). The UK government invested £73 million to help public and academic projects to unlock the potential of BD in diverse sectors of the economy. Agrimetrics is one of the agricultural innovation centres recently launched in the UK to engage with food industry stakeholders and enable a detailed and collective understanding of the needs of farmers, food producers, retailers, and consumers through the use of BD and analytical tools (Agrimetrics 2015). To a lesser extent, scholars from Canada and China were each responsible for the publication of 4 articles. In this regard, Barrados and Mitchell (2017) pointed out that there is a proliferation of automated data systems in Canada. This finding is consistent with the assertion of Clarke and Margetts (2014), who noted that the government of Canada was later than the UK and the US in introducing an open data initiative, which was set up in 2011 by Tony Clement, President of the Treasury Board. Five countries, namely India, Japan, Malaysia, South Korea, and Spain, were responsible for ten articles (two each). Only one publication was identified for every remaining country within the sample of the relevant literature. When the authors considered the analysis of publications on a continental basis, researchers from North America were the central contributors to the literature, representing 37% of the total participation. To a lesser extent, contributions from Europe and Asia represented 29% and 24% of the total studies, respectively. There was an increasing international focus on BD applications to food supply chains, which is reflected in the contributions of developing countries in Africa with 8% of the total relevant studies. In comparison, Oceania represents 2% of the total studies. These findings suggest that the rise of BD is not limited to developed economies but has also extended to the food supply chains of developing economies (Fig. 3).

Publications by journal
The reputation and credibility of the journal ranking have a significant impact on how people assess the value of the publication. The classification of journals was facilitated by the use of the BibExcel tool. The reviewed publications were from 37 journals. While ranking the journals based on the citation analysis, twenty-nine (29) articles were published in journals that had an impact factor in Journal Citation Report-JCR (2019). Table 2 presents the journal titles, the number of publications, and the impact factors exceeding 4. The category "Others" includes 29 journals, of which only 18 journals have an impact factor. It should be noted that the publications spanned a wide range of fields that cover food sciences, manufacturing, computer sciences, supply chain management and logistics, and business. The variety of the scope of the journals reflects the multi-dimensional perspectives of BD and its versatile applications to several areas in the food supply chain.

Big data publications based on the type of research
Figure 4 presents the distribution of the selected 41 papers by the methodological approach used. Two main research approaches were identified for the classification of articles: conceptual and empirical. Conceptual papers review and discuss the applications, theories, capabilities, and challenges of BD based on the extant literature, without the collection of primary data. Empirical papers, in contrast, tend to present data collected through case studies and interviews, or focus on measurable and visible BD activities and processes in the food supply chain through other methodological approaches such as algorithmic analyses, prototypes, and system designs. As shown in Fig. 4, 17 papers provided a conceptual discussion or review on BD. The remaining 24 papers dealt with the topic using empirical research approaches that included case studies and interviews (7), algorithmic and mathematical analyses (4), prototype and system design (4), survey and multi-methods (3). Table 3 presents in detail the classification of these studies according to their methodological approaches. Figure 5 shows the trend of how different research approaches have been used to study BD in the context of food supply chains during the period 2013-2019. The trend depicted in Fig. 5 reveals that there is a steep increase in the conceptual and review studies. The trend also shows that the concepts applied to BD research are being tested and validated through empirical techniques and methods such as case studies, interviews, algorithms, prototypes and surveys. While there is a sharp increase in theoretical studies, the increase in studies using empirical investigations is not significant. Therefore, empirical studies are necessary in order to assess the effectiveness and efficiency of BD in the food supply chain.

Increased knowledge and insights
In highly uncertain business environments, the dynamic and globalized nature of food supply chains has created both fragmentation and complexity with a higher dependency on data and information analysis (Gereffi et al. 2012). Large unstructured data sets are now generated on a real-time basis, which challenges the current approaches for decision-making and calls for a revamped focus on advanced analytical tools (Xin and Zazueta 2016). The proliferation of new technologies has given rise to a wave of data originating from different sources such as IoT and wireless sensor networks, the web, mobile applications, and social media. The ability to effectively process these data, manage information and extract knowledge is becoming key for achieving competitive advantage (Curry 2016). Advances in information technology offer new possibilities to extract new insights and knowledge from BD (Akhtar et al. 2018). The advantage of BD tools over conventional analytics and business intelligence is their ability to process massive volumes of data more effectively (Subudhi et al. 2019; Alfian et al. 2017).
In food supply networks, BD enables companies to discover consumers' needs, create new value, and improve the management of their organizational processes. According to Engelseth et al. (2019), BD is not a pure technology per se; instead, it is a valuable method and tool set to manage, analyze, capture, search, share, store, transfer, visualize and query supply chain information. In the agriculture field, BD can help to efficiently extract value from the vast amounts of data such as environmental information, biological data, agricultural equipment information, monitoring data of production processes, sales and management data, food safety procedures, yield rates and soil health. The high capabilities to process and handle large datasets can optimize the operational decisions and coordination in the food chain. As such, the knowledge gained from the application of BD can be useful in designing adaptive processes for the optimization of the food supply chain. In this context, companies operating in the food industry would be able to optimize process steps from procurement to production to marketing by deriving new insights that were traditionally 'hidden' within data patterns. In this regard, Sonka (2014) argued that BD tools are more efficient in enabling analysts to explore massive quantities of texts and identify the relevant descriptors within the information. BD allows food retailers to adapt and become consumer-centric by providing useful analytical tools necessary for extracting relevant insights into consumer sentiments and behaviours.
In the era of BD, food supply chains are heavily dependent on the use of technology to create valuable knowledge. The mining of the data generated at each echelon of the supply chain provides an effective basis for agri-food decision-making, optimization of processes, and identification of interdependencies. For example, a BD platform is needed to handle a large amount of unstructured and continuously generated real-time sensor data (Alfian et al. 2017). The time and temperature information retrieved from the sensor network and analyzed with BD tools provides real-time insights into the product shelf-life information (Li and Wang 2017) and can help to reduce food waste. The intuitiveness of IoT networks and connected sensors across the food supply chain can be enhanced with BD to capture data related to time and temperature and to share it with exchange partners in order to dynamically manage the optimization of storage, packaging, delivery and selling according to the data drawn from the sensor networks (Li and Wang 2017). The increased data visualization capability can be applied in real-time to fresh food supply chains to improve customer value and reduce costs (Engelseth et al. 2019). Khanna et al. (2018) argued that the combination of BD, advanced information and computational technologies could improve knowledge of the processes and relationships in the agri-food sector. Tan et al. (2017) pointed out that the ability of BD to extract embedded knowledge from large amounts of data can help to solve several specific issues in the halal food industry, such as the contamination of halal food products. Therefore, food businesses, including small and medium enterprises can utilize BD to create actionable knowledge and insights, strengthen their oversight and management of data, and improve their competitiveness in the

Improved decision-making
According to Malakooti (2012), decision-making is a complex, multi-dimensional process that can take place spontaneously without any prior planning, or it may emerge after exhaustive and well-contrived analysis. The complexity of supply chain management has resulted in a lengthy decision-making process due to the time required to access information that is necessary to make business decisions. In the context of global food supply chains, strategic decision-making is essential as holistic efforts within an efficient framework could increase the profitability of an entire chain (Zhong et al. 2017). Despite the advances in technology and decision support systems, achieving responsive and adequate decision-making is a difficult task. However, leveraging BD in food supply chains can significantly improve decision-making. Moreover, BD counteracts the conventional ways of thinking and decision-making that are based on the intuition and experience of the owner or manager (O'Connor and Kelly 2017). BD enables more informed, evidence-based decision-making (Akhtar et al. 2018) by providing managers with access to explicit information and equipping them with new tools and capabilities (Sonka 2014). BD provides sophisticated tools with which farmers can assess different scenarios arising from different farming decisions (Xin and Zazueta 2016). In this regard, Kamilaris et al. (2018) developed the AgriBigCAT platform that can support farmers in their decision-making processes and administration planning to meet the challenges of increasing food production at a lower environmental impact. Moreover, BD increases the visualization of information across the food network and drives enhanced transparency, higher productivity, and informed decision-making. Decision-making would no longer be undertaken in food supply chains with insufficient or fragmented data and information. Consumers also benefit from the outputs of BD initiatives, as these can provide contextual information about the food, its origin, method of processing and other information, which aids a more informed purchasing decision. Lin and Mahalik (2019) argued that BD improves data storage and enhances the application of agrifood scientific research by providing intelligent decision-making. Tan et al. (2017) noted that halal industry players could make better and more efficient decisions using BD. Therefore, BD enables food supply chain exchange partners to be involved in interactive and consistent decision processes. BD leads to more intelligent decision-making that can improve the operational performance of food chains, reduce costs, minimize the cycle time of decisions, and mitigate potential risks. Thus, we suggest the following research proposition: RP2: BD facilitates decision-making processes in the food supply chain.

Improved efficiencies
Managing efficiencies in food supply chains is an ongoing process that requires the better utilization of available resources, the optimization of processes, and the minimization of costs (Angkiriwang et al. 2014). Hence, food supply chains are pressured to enhance efficiencies at every stage, from procurement, logistics, manufacturing, marketing and sales to after-sale services. Similarly, the agri-food sector is dynamic, diverse, and requires more sophisticated tools to improve efficiencies.
As technology is critical for improving supply chain efficiencies (Attaran 2017), the use of BD and its visualization capabilities allows firms to automate, efficiently and cost-effectively, the process of exploring hidden patterns that can occur in the food supply chain. BD allows food supply chain businesses to explore every opportunity to improve their operational efficiencies, simplify processes, and reduce transaction costs. The management and analysis of, and response to, food-related data can be facilitated through BD and automated to predict situations in real-time (Tzounis et al. 2017). For example, Kshetri (2017) argued that a system based on BD could deliver information to farmers and water service providers on a real-time basis about the current and predicted water and soil moisture levels. Alfian et al. (2017) proposed a real-time monitoring system that utilizes smartphone-based sensors and BD to handle IoT-generated sensor data and helps food operators to implement critical strategies related to the perishable supply chain. Farmers can capitalize on BD to monitor the health status of animals in the food chain. For instance, Sivamani et al. (2018) proposed a method based on BD to control the nutritional intake of livestock, improve the health and diet of animals, and support the early detection of diseases.
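To make the idea of real-time monitoring of perishable goods more concrete, the sketch below flags a shipment whose cumulative time above a temperature limit exceeds a tolerance. It is a minimal illustration and not the system proposed by Alfian et al. (2017): the `Reading` type, the 5.0 °C limit, the 20-minute tolerance and the sample trip are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    minute: int           # minutes since the start of the trip
    temperature_c: float  # sensor temperature in degrees Celsius


MAX_TEMP_C = 5.0        # assumed limit for a chilled product
MAX_MINUTES_ABOVE = 20  # assumed tolerated cumulative exposure


def minutes_above_limit(readings, limit=MAX_TEMP_C):
    """Sum the time spent above the limit.

    Each interval between two consecutive readings is attributed to the
    temperature measured at the start of that interval.
    """
    total = 0
    for current, following in zip(readings, readings[1:]):
        if current.temperature_c > limit:
            total += following.minute - current.minute
    return total


def needs_alert(readings):
    """Return True when cumulative exposure exceeds the assumed tolerance."""
    return minutes_above_limit(readings) > MAX_MINUTES_ABOVE


# Invented sample trip with one reading every ten minutes.
trip = [
    Reading(0, 3.8),
    Reading(10, 4.9),
    Reading(20, 6.2),
    Reading(30, 7.0),
    Reading(40, 5.5),
    Reading(50, 4.1),
]

if __name__ == "__main__":
    print("minutes above limit:", minutes_above_limit(trip))
    print("alert:", needs_alert(trip))
```

Tracking cumulative exposure rather than single threshold crossings is one possible design choice; it avoids raising an alert for a brief spike, for instance while a door is open during loading.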
While the applications of BD to agriculture dates back to the 1990 s (Carolan and Carolan 2017), the technology can play a substantial role in advancing modern precision agriculture. Precision agriculture is a technology-driven approach for the management of farming activities such as the monitoring, estimation and prediction of crop-related data. According to Bucci et al. (2018) precision agriculture is adopted by innovative farmers who rely on the capabilities of BD to enable the intelligent usage of precision farming data. Similarly, BD is a promising instrument for farmers wishing to develop smart agriculture, improve their productivity, and enhance their integration in the food supply chain. The constellation of technologies in the agri-food sector, such as remote sensing, satellite imagery and highspatial-resolution BD from farms, has already produced a sophisticated method of farming that increases the efficiency of agricultural production and enables site-specific crop and livestock management decisions (Khanna et al. 2018). In this respect, Li and Mahalik (2019) posit that BD can utilize data from GPS/GIS to track crop yields, determine the optimization of crops, and increase harvesting productivity. The combination of BD with IoT data can help farmers optimize their farm operation. In research by Kamilaris et al. (2018), BD is used in an online software platform to analyze geophysical information from various sources, estimate the impact of livestock on the environment, and increase resource efficiencies. Khanna et al. (2018) noted that in 2017, Great Lakes Watershed Management System brought environmental forecasting capability to precision agriculture by allowing farmers to input GIS coordinates for their fields, run tillage and fertilizer management scenarios, and to view predicted estimates of nutrient loading and soil erosion to nearby water bodies. 
The enormous potential of BD applications to enhance precision agriculture is therefore evident in the reviewed papers. BD aids in the efficient usage of scarce resources (e.g. water) and the optimization of crop cultivation and harvesting. Furthermore, BD helps to develop more accurate models for agriculture management and the monitoring of farming activities. Consequently, we introduce the following research proposition: RP3: BD has a positive impact on the operational efficiencies of food supply chains.

Reliable forecasting
Food supply chains are inherently complex to the extent that inputs cannot be completely controlled, managed, and safeguarded against uncertainties. Therefore, forecasting is a necessary activity that aims to estimate the value of uncertain future events based on patterns observed in past records (Ahmed 2004). Demand forecasting has long been a critical issue in the food industry, one that calls for considering sophisticated technologies such as BD to support more accurate and useful forecasts (Nita 2015). Hence, BD can act as a critical enabler in the food supply chain because of its power to improve forecasting accuracy and precision. The predictive capabilities of BD help to support the management of food chains, which are increasingly characterized by short life cycles and a need for fast response. Moreover, the technology enables the systematization of demand forecasting, resulting in more accurate estimates of consumer demand and in reduced distribution costs and disposal losses (Nita 2015). Farm management and operations will change dramatically because of the high resolution of BD information, real-time forecasting, and transparent prediction models. In crop management, Badr et al. (2016) noted that BD could provide the data required to run crop models under different climate and management scenarios, and that this approach is useful for mitigating some food security issues. The authors argued further that technology and BD-centric forecasting could help decision-makers, crop growers, and researchers to gain a deeper understanding, better manage supply and demand in the food chain, anticipate food-related challenges, and develop practical solutions to overcome food insecurity and price uncertainties. Testing the credibility of forecasting results, Nita (2015) found that a BD-enabled system for a food manufacturer could produce a high forecasting accuracy within 70% of the target commodities.
The benefits of proper and reliable forecasting include the optimization of food chain operations, lower product perishability, better planning and utilization of resources, and the improvement of the overall supply chain performance. BD also drives more collaborative forecasting and scheduling between the food business and its supply chain exchange partners, resulting in better inter-organizational collaborations. Thus, the following research proposition emerges: RP4: BD leads to more accurate and reliable forecasting in the food supply chain.

Waste minimization
In the context of agri-food supply chains, waste is a catch-all term that encompasses non-value-adding activities, excess inventories, additional wait times, unnecessary processing steps, and other variabilities. According to Hicks et al. (2004), waste is a strategic issue in the supply chain that forces companies to seek ways to minimize all types of waste and thus achieve cost savings. Research on food waste has established that one-third of the food produced is either wasted or lost, accounting for 1.3 billion tons per year. (Note that food loss refers to pre-consumption stages such as pre- and post-harvest loss, whereas food waste occurs when the food is consumable but discarded.)
Supply chain waste may stem from ineffective quality or process control, and large quantities of inventories can perish in agri-food supply chains. As the minimization of resource waste is a topic of paramount importance in the food supply chain, there is a high potential for BD tools to reduce waste in the food supply chain. The minimization of food waste through BD can result in increased resource utilization, better profitability and a reduced risk of food insecurity. The visualization capabilities of BD can enhance the traceability of food supply chains and the visibility of key business processes. Belaud et al. (2019) pointed out that BD leads to more sustainable food supply chain designs that valorize agricultural waste. Li and Wang (2017) developed a BD-based system that aggregates the time during which the temperature exceeds a certain threshold at each stage of a supply chain and estimates the impact of improper quality control on the perishability of food products (e.g., reduction in shelf-life, risk of spoilage). This increased control enables retailers and manufacturers to deliver satisfactory food quality and overcome the severe financial consequences of food loss and waste in the supply chain. Another benefit of BD tools is transparency, in the sense that as products pass through the supply chain, effective waste-related decisions can be made dynamically, such as pricing food products based on their current shelf-life (Li and Wang 2017). The possibility of uncovering hidden and valuable insights with BD can also help food chain actors to reduce overall waste. For example, retailers today are utilizing BD for waste reduction by analyzing consumer complaints made in retail stores. Data captured from social media (e.g., Twitter) can also be analyzed using BD in order to develop effective waste minimization policies in the food chain.
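The threshold-based aggregation attributed above to Li and Wang (2017) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Reading` record, the stage names, the sample values, and the 5 °C threshold are all hypothetical and chosen only to show how time above a temperature threshold might be summed per supply chain stage.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One hypothetical sensor reading (illustrative data model only)."""
    stage: str      # supply chain stage, e.g. "transport"
    minutes: float  # duration covered by this reading
    temp_c: float   # measured temperature in degrees Celsius


def minutes_above_threshold(readings, threshold_c):
    """Sum, per stage, the minutes during which temperature exceeded the threshold."""
    totals = {}
    for r in readings:
        # Register every stage so that stages with no violation report 0.0.
        totals.setdefault(r.stage, 0.0)
        if r.temp_c > threshold_c:
            totals[r.stage] += r.minutes
    return totals


# Assumed sample data; real deployments would stream these from IoT sensors.
readings = [
    Reading("farm", 30, 3.5),
    Reading("transport", 20, 6.0),
    Reading("transport", 40, 9.5),
    Reading("retail", 60, 4.0),
    Reading("retail", 15, 8.2),
]

exposure = minutes_above_threshold(readings, threshold_c=5.0)
for stage, minutes in exposure.items():
    print(f"{stage}: {minutes:.0f} min above threshold")
```

Per-stage totals of this kind could then feed a shelf-life or pricing model; how exposure translates into remaining shelf-life is product-specific and is deliberately left out of the sketch.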
Therefore, BD contributes to more sustainable food chains, as it can dramatically reduce spoilage in the food chain and the resulting food loss and food waste. Beyond reducing the economic losses of waste, the technology also helps to incorporate other sustainability considerations that are relevant to food safety. For example, the aggregation of food data in a BD system strengthens the trace-back and track-forward capabilities of the business. This capability enables the reduction of unnecessary food waste and the fast detection of products involved in foodborne illness outbreaks, their sources, and their current locations (if still in the supply chain). As a result, we outline the following research proposition: RP5: BD reduces waste in the food supply chain.

Food safety
Food safety represents a growing and critically important public health issue (Aung and Chang 2014). It is a joint responsibility of all actors involved in the food industry to ensure that food is safe to consume. With the increasing concerns and awareness of consumers toward food safety, food supply chain partners are obligated to secure and protect food products from any sort of contamination or adulteration, whether unintentional or intentional. The assurance of food safety means that food will not cause harm (Demartini et al. 2018). To maintain food safety, the use of technology and information systems can provide incentives and accountability measures that are critical for identifying best manufacturing practices for food operators at various stages in the food supply chain (Ahearn et al. 2016). In this regard, Marvin et al. (2017) confirmed the significant role of BD in predicting the presence of pathogens or contaminants by matching information on environmental factors with pathogen growth or hazard occurrence. Zhang et al. (2013) developed algorithms that used BD and visualized images to model contamination conditions in an IoT-based food supply chain, helping to build consumer confidence in the food ecosystem. A BD cloud-computing framework for carbon minimization has also been proposed to assist farmers in the selection of the most eco-friendly beef cattle supplier. The captured information on carbon footprint can be used by abattoirs and processors in their supplier selection decisions, so that carbon emissions are taken into account in this process. Moreover, the deployment of BD in combination with ERP, IoT and other data sources connected to logistics providers can facilitate enhanced product tracking and risk management of food.
By providing real-time information about the product, its condition (e.g., temperature), and destination routes, including traffic and weather patterns, BD may prove valuable for detecting trends in potential contamination during the delivery of food products. As stated earlier, the increased transparency gained from BD applications can provide thorough and real-time monitoring of the quality of perishable food products. In the highly complex global food chain, BD enables supply chain exchange partners to establish more effective and cooperative relationships in order to maintain food safety and enhance transparency. Li and Wang (2017) outlined that with BD applications, consumers would be able to obtain more information about the variation of product shelf-life over time. Access to a granular level of information creates a conducive environment that not only assures food safety but also establishes more trust, confidence and commitment. Such digital transformation is, according to Li and Wang (2017), a suitable framework for strategic innovation in marketing, quality management, and supply chain optimization. Therefore, BD can be viewed as a critical and value-adding element of food safety management that can respond to consumers' growing concerns about food quality and safety. Based on the previous discussion, we suggest the following research proposition: RP6: BD improves food safety management across the supply chain.

Further challenges of big data
The application of BD has tremendous potential in food supply chains. To achieve competitiveness, the food and restaurant industry could embrace BD to derive actionable business insights, make evidence-based decisions (Coble et al. 2018; Lokers et al. 2016), optimize operational efficiencies, produce reliable forecasts, minimize food waste, and ensure food quality and safety. In their study, Ma et al. (2018) argued that BD could enable restaurant owners to predict future visitors. For the service-oriented food industry, the implementation of BD has become a necessity given the ability of the technology to provide insights into customer spending habits and to help restaurants grasp market trends more accurately (Tai et al. 2020). Although the benefits of BD for food supply chain players, including those operating in the foodservice industry, are tangible, several challenges are still hampering its wide-scale implementation.

Data complexity
According to Waldherr et al. (2017), the challenges of BD stem mainly from the growing amounts of data, the high speed of data generation, and the diversity of data formats and structures. The BD ecosystem is characterized by a great variety of data sources and by the velocity of data flows, for which advanced computational methods are imperative to analyze data (Zhou 2019). Similarly, the need for these methods and techniques is pressing, as they make it possible to manage knowledge of the chemical components of foods that are important to human health (Tao et al. 2018). Moreover, the increasing interconnectedness and complexity of BD result in overlaps, multiple links between data, and growing noise. To clean BD, food businesses are required to devise new strategies, tools and technologies that can improve data quality and analysis. In BD applications, poor data quality, or so-called "dirty data", could increase concerns over the reliability and validity of BD analyses and create additional costs for food firms. For example, analysts estimate that the cost of poor data quality within a typical business is between 8% and 12% of revenues (Sethuraman 2012). Removing noise from BD is nevertheless a challenging task because data vary inconsistently over time, thereby affecting the mechanisms of effective data management (Subudhi et al. 2019).

Security and privacy issues
BD-driven food supply chains bring enormous challenges for food businesses, especially during data collection, storage, visualization, and information sharing. These include issues of data security and privacy (Sharma et al. 2018, 2020). As per Duncan et al. (2019), cybersecurity threats are problematic in the BD era because of inappropriate access to BD systems, data, or analytical technologies and the nefarious use of information for fraudulent food activities. Food supply chain partners need to secure the public and private information of individuals and businesses, including physical and digital footprints, searches, transaction histories, audio and video communications, service registrations, conversations, and messages. The BD ecosystem is fraught with data security risks, which must be carefully evaluated before food businesses engage in the adoption of BD systems. Thus, to sharpen their competitive advantage, food businesses have to ensure a high level of data security to implement BD successfully. Furthermore, the aggregation of data from different and distant information sources has also raised several privacy concerns due to so-called private information leakage (Guo and Wang 2019). As a result, BD systems might entail collecting consumers' private information without consideration of regulations, laws, and existing standards. Therefore, consumer-privacy issues could deter food businesses from shifting towards BD-enabled food supply chains.

Organizational challenges
At the organizational level, the lack of necessary capabilities and resources might hinder the application of BD in the food industry. In this context, Kshetri (2017) points out that organizations might face a shortage of BD engineers and scientists who can understand, interpret, and perform analytics. This concern is also highlighted in the study of Tan et al. (2017), who argue that the halal industry still lacks talented professionals who can work with BD tools and techniques. Besides the need for analytical and technical know-how, organizations might have to commit a sizeable initial investment to implement BD systems (Sonka 2014). For resource-constrained food businesses, BD might not be an economically feasible solution, since the incorporation of IoT-based systems and the expansion of human resources through BD corporate training programs could be a costly and risky investment. BD applications in food services can be unaffordable and are almost exclusively developed for larger food firms. Therefore, when seeking to invest in BD applications, food industry stakeholders with limited capacity, including farmers and foodservice organizations, could be skeptical of the benefits of BD for their business processes and reluctant to integrate BD systems into their organizational structure. This uncertainty could be further aggravated by the lack of interoperability (Jeppesen et al. 2018) among the technologies leveraged in the food supply chain.

Conclusions
This study investigated the current state of research on the applications of BD to food supply chains by conducting an SLR of relevant studies using an appropriate review methodology. Forty-one (41) articles were thoroughly examined and analyzed for this purpose. The findings of this SLR showed that the application of BD to food supply chains is becoming increasingly popular, as reflected in the recent growth in the number of publications. Initially, the SLR focused on identifying the types of methodologies used in the reviewed publications. Conceptual approaches were frequently used to contextualize and extend discussions on the possibilities of BD in the food chain. Empirical methodologies were employed to demonstrate and validate the effectiveness of BD in sustaining food supply chains from different aspects. A significant number of studies (n = 13) used a case study methodology and interviews to gather data. Some studies developed and proposed prototypes, applied surveys or created system designs to validate the benefits of BD to food manufacturers, retailers, and consumers. The enablers of BD in the food industry identified from the SLR contribute to the literature, concepts, and theories on the capabilities of BD in bringing effective solutions to the management of food chains. The ability to extract useful knowledge and insights from data demonstrates the enormous potential of BD and is reported in the majority of studies. However, few studies investigate the capabilities of BD in optimizing food processes and supporting food procurement, processing and marketing, which represents a potential area for further research.
The theoretical findings reveal that previous research on the application of BD to food supply chains has focused primarily on providing the basic concepts of BD and use cases demonstrating its benefits. A paucity of studies synthesizing the advantages of BD was found in the literature review. Hence, this study fills a knowledge gap and presents a contribution to the literature in the form of a detailed SLR. The findings of the SLR revealed six key enablers of BD in the food supply chain, namely:
• Improved knowledge and predictive insights
• Decision-making support
• Enhanced efficiencies
• More accurate forecasting
• Process-based waste minimization
• Food safety management
The findings of the review also revealed that BD implementations could be impeded by poor data quality, security and privacy concerns, a lack of organizational capabilities and skills, high initial investment costs, and resistance to operating with BD systems. Thus, future research studies may investigate the solutions necessary to accelerate the uptake of BD in the food industry. Research in this direction will help to provide a more balanced understanding of what enables and hinders the development of BD-based food supply chains. Further, this study identifies that BD can be combined with other technological tools such as IoT, AI, cloud computing, and decision support systems (DSS) to substantiate the value of technology in the agri-food industry. Scholars may investigate to what extent food businesses can benefit from the integration of these technologies in the supply chain. This SLR is one of the initial attempts to contribute to the understanding of BD applications and their connection to the food research area. The utilization of BD could unlock several benefits and sustain the delivery of safer food products to consumers.
Therefore, food industry practitioners and decision-makers would gain a deeper understanding of the promising role of BD in contributing to the evolution of sustainable activities in their organizations. The enablers of BD identified in this study may be considered in the formulation of guidelines necessary for BD implementations in the food chain.
Although this study provides a timely review of an increasingly emerging technological capability, we recognize several limitations. The use of Scopus as a comprehensive database does not guarantee the full coverage of the extant literature. Some articles outside of Scopus might be relevant to the scope of the study but have not been considered. Hence, we encourage the replication of review studies in the future and the use of other accessible databases such as Web of Science and Google Scholar. The findings of this study are also limited to the selected number of publications, and therefore, the theoretical inferences drawn here should be validated with other empirical research methods such as expert interviews.
Funding
Open access funding provided by Széchenyi István University (SZE).

Conflict of interests
The authors declare that there is no conflict of interest regarding the submission of this article.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.