1 Introduction

Intense competition and fluctuating demand patterns increase data generation in the supply chain (SC) (Arunachalam et al., 2018). This pushes companies to adopt supply chain analytics (SCA) to gain competitive advantage (Davenport and O’dwyer, 2011; Shafiq et al., 2020). Adopting SCA practices improves the accuracy and overall performance of the SC. However, because firms already use statistical methods for decision making in one way or another, some practitioners oppose the implementation of SCA on account of the time and cost incurred in adopting these solutions (Kusiak, 2006; Trkman et al., 2010). Implementing SCA in businesses is also difficult because of the uncertainty associated with the data and the changing requirements of the customer (Handfield and Nicholas, 2004; Liberatore & Luo, 2010; Huner et al., 2011; LaValle et al., 2011; Manyika et al., 2011).

Industries are under high pressure to improve the overall performance of the SC to gain competitive advantage over rivals (Aloysius et al., 2018; Arunachalam et al., 2018; Hazen et al., 2018). Designing and developing data-driven innovations (DDI) is the way that companies live, breathe, strive, and sustain their competitive advantage in a competitive data-driven environment (Sultana et al., 2021). Data plays a vital role, mainly for retail and e-commerce industries, in selecting strategies and increasing performance (Mishra et al., 2018; Rai et al., 2006). Sultana et al. (2022) foregrounded the significance of DDI in the strategic competitive performance of companies. The data generated by industries is increasing exponentially because of the use of technology solutions such as enterprise resource planning (ERP), radio frequency identification (RFID), and the Internet of Things (IoT) (McAfee and Brynjolfsson, 2012; Wamba et al., 2018). This exponential data growth leads to challenges such as an organization’s data literacy, and privacy and security regarding the usage of data (Sultana et al., 2021). This is a central concern in the SC. Because of the rich and abundant data in the SC, retail firms face issues such as time wasted in analysing irrelevant and inaccurate data, as well as the challenge of how to store and access massive amounts of data, all of which decrease the overall performance and efficiency of the SC and the profits of the organizations. Tiwari et al. (2018) also highlighted that there is a nascent focus on addressing the challenges of data-driven companies in the retail sector. To overcome these issues, big data analytics (BDA) helps retail and e-commerce industries to optimize and store the data in a purposeful way.

BDA, a part of SCA, is used for the analysis and storage of the data associated with the SC. Big data can be defined by the following five characteristics: (1) Velocity (how fast the data is produced), (2) Volume (how much data is generated), (3) Value (the data should be reliable), (4) Variety (types of data collection) and (5) Veracity (quality and accuracy of the data). BDA practices are implemented in the SC so that large data can be easily accessed and stored (Gupta et al., 2019; Katal et al., 2013). Practices such as Business Intelligence, RFID, and IoT can be implemented and integrated into the SC to gain a competitive advantage with which the retail firm can withstand market competition (Waller & Fawcett, 2013). The benefits of big data practices include data quality, data integration, data visualization, and data storage. However, organizations need supporting factors to implement these BDA practices in the SC, and they face challenges such as capital investment, the skill of employees, and scalability of the database. By implementing specific practices for the corresponding SCs, big data can be easily accessed, and quality, reliability, consistency, scalability, and flexibility of data can be achieved (Katal et al., 2013; Chae et al., 2014). Doing so also addresses the critical challenges that cause poor performance of the SC; these challenges can be eliminated by properly implementing the specific BDA practices. Long-term economic benefits for an organization can also be achieved by implementing BDA practices, which refine big data into small, precise, and accurate data that can be easily used to make decisions. Akter et al. (2020) highlighted that digital transformation helped companies to improve performance.
The consumer packaged goods industry has achieved improvements through digital transformation by becoming closer to its customers and forming longer-lasting relationships that equate to repeat business and higher satisfaction (Kumar et al., 2020; Zhu et al., 2010). Wamba et al. (2018) emphasized that big data practices offer huge benefits on one side and pose huge challenges on the other, which must be handled to achieve breakthrough performance. In this research, our motivation is to implement the practices in the retail and e-commerce SC because of the abundance of big data and the changing, uncertain needs of the customer. Raman et al. (2018) highlighted that BDA in industries increases operational excellence, productivity, real-time visibility, and low-cost operations, which gives added value and monetary gain for the supply chain. State-of-the-art literature in the fields of operations management, marketing and information systems neither provides a consolidated list of criteria that can be used to select the best BDA practices in the retail SC and to prioritize the criteria involved in the selection process, nor suggests a comprehensive methodology that allows decision makers to combine these criteria to select the best BDA for the retail SC in an unbiased manner. The research questions for this paper are therefore as follows:

  • RQ1 Which are the various big data practices that are more suitable for retail SC?

  • RQ2 On what criteria can we prioritize and select the best BDA for retail SC?

This paper advances the domain of big data analytics research by making the following four contributions. Firstly, this study is the first to analyse the dominance of big data practices at the retail supply chain level, and it thus helps newly emerging retail firms. Secondly, this study takes a first step towards identifying a unique set of seven retail supply chain performance measures (supplier integration, customer integration, cost, capacity utilization, flexibility, demand management, and time and value) that guide the selection of the best big data practices. This was done by reviewing the extant literature and then discussing it with industry respondents. Thirdly, this study uses the TODIM (Portuguese acronym for interactive and multicriteria decision making) methodology to select the best big data technology based on the seven retail supply chain performance measures. The proposed methodology has the following benefits over other multicriteria decision making (MCDM) methods: (1) TODIM is more attractive to decision makers because there is no need to optimize the criteria within constraints. (2) It is a simpler and faster method to solve any MCDM problem because it does not need an optimization tool. (3) It is simple to configure the TODIM method by altering the weights and adding new criteria as per the requirement of the decision maker. Lastly, this study acknowledges the necessity of a diversified pool of experts with technical and domain expertise to make the right decision as well as to achieve consensus.

The rest of the paper is structured as follows: In the next section, we present a brief background of big data practices in the retail supply chain with a specific focus on retail supply chain performance measures. Then, we provide the TODIM approach, followed by its application to a case study in India. We close with a discussion of our results, implications for research and practice, followed by conclusion with the limitations of the study.

2 Literature review

A keyword search for “Big Data Analytics” in the Web of Science database returned 5,203 relevant journal papers from more than 100 multi-disciplinary journals from 2010 to 2021. Out of these 5,203 articles, 480 relate to the manufacturing supply chain, 265 to the service supply chain and 226 to the retail supply chain. This trend clearly shows that the number of research articles on “Big Data Analytics” has been growing since 2010, as shown in Fig. 1. This pattern also indicates that researchers around the world, especially in the operations and supply chain domains, are working in this area because of its ability to disrupt the supply chain positively. Despite the growing interest in this research arena, no explicit research has been done so far to select the best BDA technologies (data science, neural networks, enterprise resource planning, cloud computing, machine learning, data mining, RFID, blockchain and IoT, and business intelligence) based on retail supply chain performance measures (supplier integration, customer integration, cost, capacity utilization, flexibility, demand management, and time and value). This motivates us to analyse the dominance of the big data practices in the retail supply chain by applying the MCDM-based TODIM approach. The purpose of each big data practice at the enterprise and supply chain levels is presented in Table 1.

Fig. 1 Number of papers published in three different supply chains and big data practices from 2010–2020

Table 1 Big data practices with reference to Enterprise and supply chain

2.1 Retail supply chain in Indian scenario

Over the last decade, the retail sector in India has been active because of the increasing power of buyers, product variety, economies of scale, and the use of modern supply and distribution management. Consumer literature reveals that 71% of the retail sales in India come from fast-moving consumer goods (FMCG) (Srivastava, 2007). However, there is serious concern about FMCG retail because of improper data management, inventory inaccuracy, and vendor management issues, which need to be addressed promptly. To overcome these issues, BDA is being introduced in the retail SC in areas such as strategic sourcing, supply chain network design, product design and development, demand planning, procurement, production, inventory, logistics and distribution, and supply chain agility and sustainability (Tiwari et al., 2018). Introducing BDA in these areas can transform conventional SCs into data-driven retail SCs, or ‘Retail 4.0’. BDA in the retail SC offers better-quality products and services to consumers and improves their shopping experience. The Indian retail sector is competitive enough to transform itself into a ‘Retail 4.0’ environment. However, to implement the best BDA in the retail SC, we need to focus on the criteria that enhance SC performance.

2.2 The Big data practices

2.2.1 Data science

In a retail supply chain, raw data is increasing drastically because of more customers and advances in technology and collection techniques. By combining fields such as statistics and computation, data science lets us extract, improve, store, and monitor meaningful data from the huge volume of raw data for decision making. This is a complicated process, and industries need considerable time to find meaningful data. It requires machine learning, data mining and artificial intelligence to get more accuracy from the data collected by RFID, sensors, etc. Though it is a costly proposition (Ruehle, 2020), more benefits can be gained by implementing data science in the retail supply chain.

2.2.2 Neural networks

Because the retail and e-commerce supply chain serves a large number of customers with differing wants, predicting demand and supply is a problem. Neural networks can be used to overcome it. A neural network consists of a series of algorithms that recognize relations in a data set through forward and backward propagation, in a way loosely similar to how the human brain operates. Sales can be predicted by changing the input data, and the technique is used for market forecasting (Loureiro et al., 2018). This practice can therefore be used effectively to predict sales, demand, and customer behavior. It can also be used to optimize information regarding customer data, and it is widely used in many industries for prediction purposes. However, this technique requires more data, more time to develop the algorithm, and more computation than traditional algorithms. It also gives no clue as to why and how a prediction arises, which reduces trust in the technique.

2.2.3 Enterprise resource planning

As the number of customers in the retail industry increases, the variety of products must increase according to customers' tastes and preferences. This in turn tends to expand retailers’ size. At the same time, problems with demand forecasting make it difficult for retailers to sell the right product at the right time to the right customers at the right price. To handle this situation, the front-end and back-end workers of every functional department must be tightly connected, so retailers use ERP to connect the systems of every department. For the mixture and allocation of products, retailers use the Distribution Requirements Planning (DRP) module, which helps with outbound and inbound logistics and with inventories (Helms & Cengage, 2006; Martin, 1995; Ngare, 2007). ERP helps retailers share information such as inventories, order status, and production rate from one department with other departments, but research in this context is limited. This may be because installing ERP is costly, it may take years to become fully functional, and its success depends on the skills of the workforce.

2.2.4 Cloud computing

Cloud computing is a trending resource system today, mainly because it operates without the direct involvement of the user and has the capacity to store a large amount of data. Its important aspects are on-demand service, broad network access, resource pooling, rapid elasticity, and measured service. These aspects simplify the operation and performance of the supply chain, which in turn reduces the market entry cost so that small enterprises can easily enter the market (Ozu et al., 2020). Cloud computing offers more benefits to retailers and e-commerce-based industries. However, there are limitations such as bandwidth issues, and data stored on the internet is outside the firm's control and not secure. Providing security in cloud computing requires additional investment. In the long run, however, cloud computing improves the financial performance of organizations.

2.2.5 Machine learning

Machine learning is mainly used as a problem-solving technique in which machines act like the human brain by combining big data and algorithms without being explicitly programmed. It learns the patterns in the data by performing iterations and produces a result that reflects what the customer needs. Product sales can also be predicted according to customer tastes and preferences by simplifying complex data into an easily accessible form (Sharma et al., 2020). In retail and e-commerce industries, predicting customer needs and product sales manually is very difficult because of the increasing number of customers and the growing volume of data. With machine learning, data patterns can be understood and future product sales can be predicted easily, which in turn allows the cost of products to be optimized. However, this technique has limitations in importing quality data.

2.2.6 Data mining

Data mining is a process in which a large amount of data is refined so that patterns can be identified for problem-solving and forecasting. Firstly, it refines the big data into simple data and identifies the trend of the data patterns. Then, it shares the information with all the departments in the industry; that is, it eliminates unwanted data and uses the data that matters most for the resources and operational performance of the industry (Senar et al., 2019). In retail and e-commerce industries, the data collected from customers can thus be refined by data mining techniques. However, there are limitations, such as the violation of users’ privacy, which makes it neither safe nor secure, and limited data accuracy.

2.2.7 RFID

RFID (Radio Frequency Identification) is the best technique used to collect data in the industry. It tracks inventories, production rates, and movements of products from one department to another, stores and shares that information, and protects products from theft. This brings benefits in warehouse management, transportation management, production scheduling, order management, inventory management, and asset management, and it also raises labor productivity, because asset visibility reduces stock losses and thus eases the labor workload for stock-keeping units (Angeles, 2005; Tsai et al., 2010). At the store level, it reduces out-of-stock situations by improving inventory control and product availability. In retail and e-commerce industries, these applications are used to improve the overall performance of the supply chain by scanning multiple items with greater security. However, there are limitations: RFID is not as accurate or reliable as barcodes, it is ten times more expensive than barcodes, installation requires more time, and its fit to the industry needs to be checked.

2.2.8 Blockchain and IoT

Blockchain in the supply chain is mainly used to track the functions of each department and to monitor every action that would otherwise be untraceable, with high authentication and without the need for auditors (Manupati et al., 2020, 2022; Raj et al., 2022). In the supply chain, every aspect needs to be known so that the process can be modified to improve performance and optimize cost. In the retail and e-commerce industry, the supply chain must be very effective to control defects and to increase performance. Every partner can track the products and defects and monitor product quality, and this is done with the help of IoT for greater efficiency. However, it will not change the critical activities in the supply chain network (Helo & Hao, 2019; Kshetri, 2018). To maintain effective supply chains in retail and e-commerce industries, blockchain technology needs to be implemented along with IoT, but there are limitations such as storage, private keys, network security, over-dependence on the internet and loss of jobs for workers.

2.2.9 Business intelligence

In this competitive world, industries face problems in data management, the marketplace, and business operations on the way to achieving a competitive advantage. To achieve it, an industry has to obtain accurate and timely information in markets that are becoming more competitive and in which product life cycles are shortening. This requires access to data sets and data management systems. Business Intelligence (BI) facilitates this by gathering, consolidating and analyzing the data using analytical software and solutions. As the number of customers increases, BI is required to understand the reports and to monitor trends, market potential and changes in customer behavior (Ranjan, 2009; Gangadharan & Swamy, 2004; Banerjee & Mishra, 2017). In this respect, BI helps in decision making so that the industry changes to the reactive-to-data model. However, there are limitations: BI is expensive for small and medium-sized enterprises (SMEs), it can create rigid techniques that throttle the business, and it is time consuming, as it takes almost 18–20 months to completely implement the data warehouse. Finally, and most importantly, BI piles up historical data: it creates models from the historical data of the industry, but industries nowadays do not concentrate on historical data because the market changes frequently, so this becomes a constraint.

2.3 Criteria for supply chain performance

2.3.1 Supplier and customer integration

Integration enhances the degree of partnership with both customers and suppliers, so that firm-level strategies, practices and processes are synchronized to achieve inter-organizational information sharing. Integration through collaborative relationships reveals the technological and managerial resources that have to be implemented, and it achieves information sharing by combining the core elements from heterogeneous data sources into a single common platform, which increases the visibility of the data (Li et al., 2018; Shafiq et al., 2020). A single platform brings transparency to information sharing among suppliers, customers and within the organization. It thus decreases the variability in the information from supplier to end user and therefore increases supply chain performance through reduced lead time, improved inventory, reliable delivery, etc. However, we need to know the extent to which the supplier and customer should be involved in the process, i.e., the degree of integration. Hence, integration improves supply chain performance by sharing real-time, accurate and reliable information across partners externally and within the organization.

2.3.2 Cost

Cost is one of the major criteria in the supply chain, and managing it across supply chain processes is a major difficulty for organizations (Manupati et al., 2019, 2021; Tarei et al., 2022). Lack of information sharing among organizations increases wastage, which in turn increases cost. The best supply chain is one with optimized cost, not minimum cost (Whicker et al., 2009). We therefore need to choose a practice whose implementation cost is optimized and which gives benefits such as fewer defective products and fewer late deliveries. Choosing the best practice with optimized cost reduces wastage from the supplier end to the end user. Hence, cost optimization improves the cost performance of the supply chain.

2.3.3 Capacity utilization

By sharing information among suppliers, customers and within organizations, we can manage inventory inaccuracy such as damaged, out-of-date and seasonal stock, incorrect incoming and outgoing deliveries, and misplaced items (Hollinger & Davis, 2001; Raman et al., 2001). Such inaccuracy decreases supply chain performance; to overcome it, information sharing has to happen transparently all over the organization. With shared information, each department knows the exact production rate, inventory rate, etc., so product wastage decreases, which in turn increases supply chain performance.

2.3.4 Flexibility

Flexibility measures the ability of the supply chain to handle volume and schedule variations from suppliers to end users, and the potential behaviour of a collaborative supply chain. It depends on the market environment in which the supply chain operates and on the practice that the supply chain uses (Angerhofer et al., 2006). Flexibility allows aggregate production to be increased or decreased in response to customer demand, which directly impacts supply chain performance. We need to identify the practice that increases flexibility so that supply chain performance increases.

2.3.5 Demand management

Demand management is one of the criteria for supply chain performance. It refers to demand forecasting and production planning, which sense the demand signal, set optimal prices and trace customer loyalty. It helps to identify new market trends as well as the root causes of failures and defects (Tiwari et al., 2018). We can thus analyse requirements from the customer’s point of view and proceed according to customers’ needs. To achieve this, we need the best practice for analysing demand so that supply chain performance increases.

2.3.6 Time and value

Time here represents every minute in the supply chain process; reducing lead time, cycle time, delivery time, etc., plays a major role in supply chain performance. Value is the property of a product or service for which the customer is willing to pay. By reducing time and optimizing the value of the product, we can increase the performance of the supply chain (Whicker et al., 2009). To achieve this, we need exact information that is transparent throughout the supply chain process, which calls for data-driven practices such as big data practices. The criteria described above are those for improving supply chain performance, and big data practices are the alternatives evaluated against them. By implementing these practices, we can achieve data quality, data security, data visibility, etc., with information circulated from suppliers to end users and within organizations to achieve performance and competitive advantage. As described above, the authors selected nine big data practices and seven supply chain performance evaluation criteria to identify the most suitable big data practices in the context of the Indian retail and e-commerce supply chain.

3 Research methodology

3.1 The TODIM method

TODIM (an acronym in Portuguese for Interactive and Multicriteria Decision Making) is a discrete multicriteria decision making method that was conceived in the early 1990s and is based on Prospect Theory (Kahneman & Tversky, 1979). In 2002, this psychological theory was the subject of the Nobel Prize in Economics (Roux, 2002). Most multicriteria methods assume that the decision maker always seeks the solution that maximizes some global measure of value (for example, the highest possible value of a multi-attribute utility function in the case of MAUT; Keeney & Raiffa, 1993), whereas the TODIM method uses the paradigm of Prospect Theory to obtain the global value. The method is thus grounded in a description, supported by empirical evidence, of how people actually make decisions in the face of risk. TODIM relies on a multi-attribute value function built from a few mathematical descriptions that reproduce the gain and loss functions, which are then aggregated over all criteria. The shape of the value function in the TODIM method is therefore the same as that of the gain/loss function of Prospect Theory, although not all multicriteria problems deal with risk.

In TODIM calculations, each specific form of the loss and gain functions is tested based on the value of one single parameter. Once the forms are empirically validated, they serve to construct the additive difference function, which is a global multi-attribute function that measures the dominance of each alternative over each other alternative (Tversky, 1969). In this way, the TODIM method is similar to outranking methods such as PROMETHEE (Brans & Mareschal, 1990), because the final global value of each alternative is relative to its dominance over the others. Although the validation process may appear complicated, and may tempt decision analysts to adopt other gain or loss functions, in reality it is not, since the two mathematical forms used since the 1990s have practical uses and have been empirically validated in different applications (Gomes & Lima, 1992a, b; Trotta et al., 1999).

The TODIM additive difference function, which plays the role of a multi-attribute value function, must have its use validated by verifying the condition of mutual preferential independence, which leads to the ordering of the global values of the alternatives (Clemen & Reilly, 2013; Keeney & Raiffa, 1993). In the TODIM method, the multi-attribute value function (or additive difference function) is built from the differences between the values of any two alternatives, perceived in relation to each criterion and expressed relative to a reference criterion.

Technically, the TODIM method uses simple resources to eliminate inconsistencies arising from the pairwise comparisons between the decision criteria. It allows value judgements to be made on a verbal scale, using a criteria hierarchy, fuzzy value judgements and interdependence relationships among alternatives. Because it is a non-compensatory method, trade-offs between criteria do not occur (Bouyssou, 1986).

About the TODIM method, Roy and Bouyssou (1993) stated that it is “a method based on the French School and the American School. It combines aspects of the Multi-attribute Utility Theory, of the AHP method and the ELECTRE methods”. Barba-Romero & Pomerol (2000) also stated of the TODIM method that “it is based on a notion extremely similar to a net flow, in the PROMETHEE sense”, because the expressions of gains and losses in its multi-attribute function are similar to those in the PROMETHEE methods, which make use of the net outranking flow.

Consider n alternatives to be ordered based on m qualitative and quantitative criteria, and assume that one criterion is taken as the reference criterion. Experts are asked to estimate the value of each criterion c and the value of each alternative i with respect to the objective of that criterion. The estimated values of the alternatives in relation to the criteria have to be numerical and are normalized; in the same way, the values of the criteria are transformed from the verbal scale to the cardinal scale. The values of quantitative criteria are obtained from the performance of the alternatives with respect to the criteria; for example, the level of noise is measured in decibels and the power of an engine in horsepower. The TODIM method thus handles both qualitative and quantitative criteria: evaluations on verbal scales are converted into cardinal scales, and both kinds are normalized. For each pair of alternatives, the relative measure of dominance of one alternative over the other is computed as the sum, over all criteria, of the relative gain and loss function values. The terms of this sum are gains, losses or zeros, depending on the performance of each alternative with respect to each criterion.

Evaluating the alternatives against all the criteria gives a numerical matrix. Normalization is performed by dividing the value of each alternative by the sum over all the alternatives for each criterion, which yields a matrix of values between zero and one. This matrix is known as the matrix of normalized alternatives’ scores against criteria and is denoted P = [Pnm], where n indicates the number of alternatives and m indicates the number of criteria, as shown in Table 2.
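The column-sum normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name `normalize_scores` and the raw scores are hypothetical, and the sketch assumes non-negative scores with no criterion column summing to zero.

```python
def normalize_scores(raw):
    """Divide each score by the sum of its criterion column.

    raw: list of rows, one row per alternative, one column per criterion.
    Returns a matrix P with entries between 0 and 1 whose columns sum to 1.
    Assumes non-negative scores and no all-zero column.
    """
    n_criteria = len(raw[0])
    col_sums = [sum(row[c] for row in raw) for c in range(n_criteria)]
    return [[row[c] / col_sums[c] for c in range(n_criteria)] for row in raw]


# Hypothetical raw scores: 3 alternatives (rows) x 2 criteria (columns).
raw_scores = [[2.0, 1.0],
              [3.0, 1.0],
              [5.0, 2.0]]
P = normalize_scores(raw_scores)
```

Each column of `P` then sums to one, so scores on criteria measured in different units become comparable.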

Table 2 Matrix of normalized alternatives’ scores against criteria

After the normalized alternative scores are evaluated, the partial matrices of dominance and the final matrix of dominance must be determined. The decision makers first indicate the reference criterion r for further calculations; it is the criterion with the highest value of relative importance among all the criteria. The decision makers rate the importance of each criterion on a numerical scale (e.g., from 1 to 5), and the ratings are then normalized. Then wrc is defined as the weight of criterion c divided by the weight of the reference criterion r. This allows all pairs of differences between performance measurements to be expressed in the dimension of the reference criterion. The measurement of dominance of each alternative Ai over each alternative Aj, which incorporates Prospect Theory, is given by the expression
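The weighting step can be illustrated with the short Python sketch below. The text does not say how the ratings are normalized, so dividing by their sum is an assumption here, and the function name `relative_weights` and the ratings are hypothetical.

```python
def relative_weights(ratings):
    """Return (r, w_rc) for a list of criterion importance ratings.

    r is the index of the reference criterion, taken as the criterion with
    the largest weight (ties go to the first one). w_rc[c] = w_c / w_r.
    Normalizing the ratings by their sum is an assumption of this sketch.
    """
    total = sum(ratings)
    weights = [x / total for x in ratings]
    r = max(range(len(weights)), key=lambda c: weights[c])
    return r, [w / weights[r] for w in weights]


# Hypothetical importance ratings on a 1-5 scale for four criteria.
ratings = [4, 5, 2, 3]
r, w_rc = relative_weights(ratings)
```

By construction the reference criterion has a relative weight of 1 and every other criterion has a relative weight of at most 1.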

$$ \delta \left( {A_{i} ,A_{j} } \right) = \mathop \sum \limits_{{c = 1}}^{m} \Phi _{c} \left( {A_{i} ,A_{j} } \right)~~~~~~~\forall \left( {i,j} \right) $$
(1)

where

$$ \Phi _{c} \left( {A_{i} ,A_{j} } \right) = \left\{ {\begin{array}{*{20}l} {\sqrt {\frac{{w_{{rc}} \left( {P_{{ic}} - P_{{jc}} } \right)}}{{\mathop \sum \nolimits_{{c = 1}}^{m} w_{{rc}} }}} } \hfill & {\text{if }\left( {P_{{ic}} - P_{{jc}} } \right) > 0,\quad \quad \quad \quad (2)} \hfill \\ 0 \hfill & {\text{if }\left( {P_{{ic}} - P_{{jc}} } \right) = 0,\quad \quad \quad \quad (3)} \hfill \\ {\frac{{ - 1}}{\theta }\sqrt {\frac{{\left( {\mathop \sum \nolimits_{{c = 1}}^{m} w_{{rc}} } \right)\left( {P_{{jc}} - P_{{ic}} } \right)}}{{w_{{rc}} }}} } \hfill & {\text{if }\left( {P_{{ic}} - P_{{jc}} } \right) < 0,\quad \quad \quad \quad (4)} \hfill \\ \end{array} } \right. $$

Here \(\delta \left( {A_{i} ,A_{j} } \right)\) denotes the measurement of dominance of alternative Ai over alternative Aj; m is the number of criteria; c is any criterion, for c = 1, …, m; wrc is equal to wc divided by wr, where r is the reference criterion; Pic and Pjc are, respectively, the performances of the alternatives Ai and Aj in relation to c; and \(\theta\) is the attenuation factor of the losses. Different choices of \(\theta\) lead to different shapes of the prospect theoretical value function in the negative quadrant.
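The dominance measure defined by Eqs. (1) to (4) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function names and the sample data are hypothetical, and the loss branch is assumed to take the magnitude of the difference under the square root so that the root is real.

```python
import math


def phi(p_ic, p_jc, w_rc, w_rc_sum, theta=1.0):
    """Contribution of one criterion c to delta(A_i, A_j), Eqs. (2)-(4)."""
    diff = p_ic - p_jc
    if diff > 0:   # gain, Eq. (2)
        return math.sqrt(w_rc * diff / w_rc_sum)
    if diff < 0:   # loss, Eq. (4); the magnitude of the difference is used (assumption)
        return -(1.0 / theta) * math.sqrt(w_rc_sum * (-diff) / w_rc)
    return 0.0     # no difference, Eq. (3)


def dominance_matrix(P, w_rc, theta=1.0):
    """Final dominance delta(A_i, A_j): sum of phi over all criteria, Eq. (1)."""
    n, m = len(P), len(w_rc)
    w_rc_sum = sum(w_rc)
    return [[sum(phi(P[i][c], P[j][c], w_rc[c], w_rc_sum, theta)
                 for c in range(m))
             for j in range(n)]
            for i in range(n)]


# Hypothetical normalized scores (3 alternatives x 2 criteria) and relative weights.
P = [[0.5, 0.2], [0.375, 0.5], [0.125, 0.3]]
w_rc = [1.0, 0.5]
delta = dominance_matrix(P, w_rc, theta=1.0)
```

Because the sum of the relative weights is at least as large as any single wrc (for positive weights), a loss with \(\theta = 1\) weighs at least as much as a gain of the same size, and increasing \(\theta\) shrinks the loss term.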

The expression \(\Phi_{c} \left( {A_{i} ,A_{j} } \right) \) denotes the contribution of criterion c to the function \( \delta \left( {A_{i} ,A_{j} } \right)\) when comparing alternative i with alternative j. If \(\left( {P_{ic} - P_{jc} } \right)\) is positive, it represents a gain and Eq. (2) is used. If \(\left( {P_{ic} - P_{jc} } \right)\) is zero, the value zero is assigned by Eq. (3). If \(\left( {P_{ic} - P_{jc} } \right)\) is negative, it represents a loss and Eq. (4) is used. The expression \(\Phi_{c} \left( {A_{i} ,A_{j} } \right)\) thus fits the data of the problem to the value function of Prospect Theory, which captures both aversion and propensity to risk. This function is S-shaped, as shown in Fig. 2. Above the horizontal axis, a concave curve represents the gains and reflects aversion to risk; below the horizontal axis, a convex curve represents the losses and reflects propensity to risk.

Fig. 2 Value function of the TODIM method (Gomes & Lina, 1992a)

After the partial matrices of dominance have been calculated, one for each criterion, the final dominance matrix with general element \(\delta \left( {A_{i} ,A_{j} } \right)\) is obtained by summing the elements of the partial matrices. Expression (5) then gives the overall value of alternative i by normalizing the corresponding dominance measurements. The alternatives are ranked by ordering these values.

$$ \xi_{i} = \frac{{\mathop \sum \nolimits_{j = 1}^{n} \delta \left( {A_{i} ,A_{j} } \right) - \min \mathop \sum \nolimits_{j = 1}^{n} \delta \left( {A_{i} ,A_{j} } \right)}}{{\max \mathop \sum \nolimits_{j = 1}^{n} \delta \left( {A_{i} ,A_{j} } \right) - \min \mathop \sum \nolimits_{j = 1}^{n} \delta \left( {A_{i} ,A_{j} } \right)}} $$
(5)

The global values computed with Eq. 5 therefore permit a complete rank ordering of all alternatives. To assess the stability of the ranking with respect to the decision makers’ preferences, a sensitivity analysis should be carried out on \(\theta \), the attenuation factor. A sensitivity analysis can also be conducted on the criteria weights and on the performance evaluations.
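The normalization in Eq. (5) and the subsequent ranking can be sketched as follows. The function names and the 3 × 3 dominance matrix are hypothetical and serve only to illustrate the arithmetic.

```python
def global_values(delta):
    """Eq. (5): min-max normalize the row sums of the final dominance matrix."""
    row_sums = [sum(row) for row in delta]
    lowest, highest = min(row_sums), max(row_sums)
    if highest == lowest:
        raise ValueError("all alternatives have the same total dominance")
    return [(s - lowest) / (highest - lowest) for s in row_sums]


def ranks(values):
    """Rank 1 goes to the alternative with the largest global value."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


# Hypothetical final dominance matrix for three alternatives.
delta = [[0.0, 0.4, -0.2],
         [-0.9, 0.0, -1.1],
         [0.3, 0.6, 0.0]]
xi = global_values(delta)
ranking = ranks(xi)
```

By construction, the best alternative receives a global value of one and the worst a value of zero; the remaining alternatives fall in between.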

3.2 Proposed methodology

TODIM is used here to address an issue that small-scale retail firms currently face. To compete with other companies in the market, these firms must cope with big data challenges such as volume and access. Many big data practices address these challenges, but implementing them requires a large investment. Small-scale retail firms that are growing slowly cannot afford such an investment, so they face a dilemma over which big data practices to adopt at a cost that fits their requirements. We therefore identified nine big data practices through which firms seek to gain control over big data, and we treat them as the alternatives. We also identified seven criteria that a company needs in order to improve the overall performance of the supply chain. Although many multi-criteria decision-making (MCDM) methods exist, TODIM was chosen for the following main reasons:

  • To combine both qualitative and quantitative data in order to provide a new path towards selection for decision makers in the companies.

  • Unlike most other MCDM approaches, this method accounts for risk, modelling gains and losses through a value function based on prospect theory.

Figure 3 shows a block diagram of the proposed methodology as a step-by-step process. Tables 3 and 4 list the criteria and the alternatives, respectively, together with the notation used for the big data practices and criteria in the analysis.

Fig. 3 Flowchart for the methodology

Table 3 Representation of Criteria
Table 4 Representation of Alternatives

3.3 Data collection

The data were collected through a structured questionnaire. The adoption and implementation of BDA practices are at a nascent stage in India, so the majority of practitioners are not fully aware of all the practices. To avoid bias arising from this lack of awareness, the authors interviewed candidate experts and selected, for the final responses, five experts who have knowledge of all the practices considered in this study. The respondent profile is given in Table 5.

Table 5 Respondent Profile

Here, the level of technology implementation represents the integration of big data practices with the criteria to improve supply chain performance. In Table 5, the level of technology implementation column indicates the extent to which big data practices are implemented in the supply chain. Rank 1–Rank 5 denote the degree of integration between big data practices and supply chain criteria. Criteria were rated through pairwise comparison on a scale of 1 to 5, where ‘1’ is ‘very unimportant’ and ‘5’ is ‘very important’; alternatives were rated on the same 1 to 5 scale, where ‘1’ is ‘very poor’ and ‘5’ is ‘very good’.

Table 6 Weights of the criteria

Assumption

From Fig. 2 we can say that \(\theta > 0\): values in the range \(0 < \theta < 1\) amplify the losses, while \( \theta > 1\) attenuates them, that is, the effect of a loss is reduced. The value of \(\theta\) is taken as 1 here; it can be changed according to company requirements to obtain different loss function values, and it is varied in the sensitivity analysis. To find the weight of each criterion and to identify the reference criterion, ratings were collected from the experts and normalized. Table 6 presents the weights of the criteria.

The criterion with the highest weight is taken as the reference criterion, and the value of wrc for each criterion is calculated relative to it. After calculating these weights, we derive the normalized matrix of alternatives with respect to criteria (see Table 8) from the scores given by the experts (see Table 7).
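The weighting step can be sketched as follows. The text states only that expert ratings are collected and normalized, so averaging the ratings across experts before scaling them is an assumption of this sketch; the function names and the sample ratings are likewise hypothetical.

```python
def criteria_weights(expert_ratings):
    """Average the experts' 1-5 ratings per criterion and scale them to sum to 1.
    Averaging across experts is an assumption of this sketch."""
    n_experts = len(expert_ratings)
    n_criteria = len(expert_ratings[0])
    means = [sum(r[c] for r in expert_ratings) / n_experts
             for c in range(n_criteria)]
    total = sum(means)
    return [mean / total for mean in means]


def relative_weights(weights):
    """Return the index r of the reference criterion (highest weight)
    and the list of relative weights w_rc = w_c / w_r."""
    r = max(range(len(weights)), key=lambda c: weights[c])
    return r, [w / weights[r] for w in weights]


# Hypothetical ratings: 2 experts x 3 criteria.
ratings = [[5, 3, 2],
           [4, 3, 1]]
w = criteria_weights(ratings)
r, w_rc = relative_weights(w)
```

The reference criterion always has a relative weight of one, and every other criterion has a relative weight of at most one.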

Table 7 Matrix of alternatives’ scores against criteria by experts
Table 8 Matrix of normalized alternatives’ scores against criteria

After calculating the normalized matrix, we compute the partial dominance of every alternative using Eqs. 2, 3 and 4.

The partial dominance values of A1 over all other alternatives are presented in Table 9; the values for the other alternatives are calculated in the same way. From the partial dominance values, we calculate the final dominance values of all the alternatives using Eq. 1, which sums over the criteria.

Table 9 Partial dominance values of A1 over all other alternatives

After obtaining the final dominance values of all the alternatives (see Table 10), we calculate the global values, normalize them, and rank the alternatives accordingly (see Table 11).

Table 10 Final dominance values of all the alternatives
Table 11 Final values and ranking of all the alternatives

According to the ranking, alternative A7, i.e., RFID, ranks first; it is the best choice for satisfying the companies’ conditions and criteria to improve the overall performance of the supply chain.

3.4 Sensitivity analysis

Following the TODIM results, a sensitivity analysis was carried out by varying \(\theta \), the attenuation factor of losses. This value was initially taken as 1; it was then set to 2.5 and 5, as shown in Tables 12 and 13. The resulting global values of all the alternatives are as follows:

Table 12 Final values and ranking of all the alternatives when \(\theta =2.5\)
Table 13 Final values and ranking of all the alternatives when \(\theta =5\)

With \(\theta =2.5\) (Table 12), alternative A9, i.e., business intelligence, ranks first, and there are no losses among the alternatives.

With \(\theta =5\) (Table 13), alternative A4, i.e., cloud computing, ranks first, and losses occur in alternatives A2, A5, A6 and A8, i.e., neural networks, machine learning, data mining, and blockchain and IoT.

Here, we performed the sensitivity analysis by varying \(\theta \) to check the consistency of the data. For \(\theta \) equal to 1, 2.5 and 5 (see Table 14), the global values of all the alternatives are plotted (see Fig. 2) and R2 values are also calculated.

Table 14 Final values for all values of \(\theta \)
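The sensitivity analysis on \(\theta \) can be sketched as a loop that recomputes the global values for each setting. This is a compact illustration under stated assumptions, not the authors' code: the data are hypothetical, the helper name is invented, and the loss branch uses the magnitude of the difference under the square root.

```python
import math


def todim_global_values(P, w_rc, theta):
    """Compact TODIM pass covering Eqs. (1)-(5) for one value of theta."""
    n, m, w_sum = len(P), len(w_rc), sum(w_rc)

    def phi(i, j, c):
        diff = P[i][c] - P[j][c]
        if diff > 0:   # gain
            return math.sqrt(w_rc[c] * diff / w_sum)
        if diff < 0:   # loss, attenuated by theta
            return -math.sqrt(w_sum * (-diff) / w_rc[c]) / theta
        return 0.0

    # Total dominance of each alternative over all others and all criteria.
    totals = [sum(phi(i, j, c) for j in range(n) for c in range(m))
              for i in range(n)]
    lowest, highest = min(totals), max(totals)
    return [(t - lowest) / (highest - lowest) for t in totals]


# Hypothetical normalized scores and relative weights (not the study's data).
P = [[0.5, 0.2], [0.375, 0.5], [0.125, 0.3]]
w_rc = [1.0, 0.5]

results = {}
for theta in (1.0, 2.5, 5.0):
    xi = todim_global_values(P, w_rc, theta)
    best = max(range(len(xi)), key=lambda i: xi[i])
    results[theta] = (best, xi)
    print(f"theta={theta}: best index {best}, xi={[round(v, 3) for v in xi]}")
```

In this toy data set the same alternative stays on top for all three settings; with the study's data the top-ranked practice changes with \(\theta \), which is why the analysis is worth running.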

From Table 15, we can say that the data are consistent because there are few losses even as the value of \(\theta \) changes. Only alternatives A2, A5, A6 and A8 show losses, and only when \(\theta =5\).

Table 15 Gains and losses for different values of \(\theta \)

4 Discussion

The Indian retail industry has seen phenomenal growth over the last decade. Despite this, the adoption of big data practices is at a nascent stage (Gawankar et al., 2020). Hence, in this study, we integrated key big data practices with the FMCG supply chain and obtained the dominance of each big data practice over the others. The final global values calculated for each practice were normalized and then ranked. The results show that RFID is strongly dominant over the other practices. In brief, RFID plays a key role in the FMCG supply chain, which suggests that supply chain managers need to place more emphasis on RFID to realise the benefits of big data practices. This result is in line with the study by Reyes et al. (2016) on RFID practices, which highlighted benefits of RFID in the supply chain such as reduced operational costs, a high level of integrity, increased efficiency and less rework through the tracking of shipped and received products. This indicates that RFID helps to control the uncertainty in the data for the retail supply chain (Gawankar et al., 2020). Further, RFID helps to realise benefits in cost, supplier and customer integration, demand management, time and value, capacity utilization and flexibility, improving the overall performance of the supply chain. Cloud computing is the next most dominant practice after RFID, because it increases scalability, efficiency and integration while decreasing redundancy and making supply chain networks more cost effective. Machine learning ranks next and offers benefits such as informed decisions, greater contextual intelligence, and better asset and inventory management decisions to improve the performance of the retail supply chain. Blockchain and IoT are ranked in the top five. 
These practices help to improve retail supply chain performance by eliminating the middleman, increasing the speed of transactions, lowering costs and improving data security. According to Shah (2016), pilferage loss in the Indian retail supply chain is approximately 1.5 percent of the total sales value. This suggests that blockchain and IoT are high-priority big data practices for the Indian retail supply chain to adopt in order to overcome challenges such as pilferage or breakage losses. Neural networks come next; they can adapt and combine easily with other technologies, so that the technologies learn from each other and compensate for their own deficiencies. Finally, data mining, data science, business intelligence and ERP follow in improving the performance of the Indian retail supply chain. The results of this study were discussed with the respondents, the majority of whom agreed with them. Three of the respondents stated that cloud computing and RFID play a vital role in performance on the cost criterion in retail and e-commerce supply chains. The use of BDA in industry will improve operational excellence, productivity and real-time visibility and lower operating costs, adding value and monetary gain for the industries.

This paper also offers small and medium industries an approach for choosing the best big data practice and the degree to which it should be integrated, in order to overcome data uncertainty and increase supply chain performance. A sensitivity analysis was performed to see how the dominance values change with the attenuation factor of losses (refer to Table 14). Initially, the attenuation factor (uncertainty) was taken as ‘1’, assuming that there is no uncertainty, and we obtained the dominance values for the practices. In this scenario, RFID has the highest priority for the e-commerce and retail supply chain. When we increased the attenuation factor to 2.5, the dominance values changed and business intelligence ranked first among the BDA alternatives. When we increased the attenuation factor further to 5, the analysis showed that cloud computing has the highest priority and plays a major role in retail SC performance (Fig. 4). We conclude that cloud computing is vital for retail and e-commerce supply chains in addressing high data uncertainty.

Fig. 4 Value function graph

4.1 Theoretical implications

This study makes the following research contributions. Although the literature acknowledges the need for a comprehensive methodology for understanding the impact of big data analytics tools on retail supply chain performance, no deliberate effort has been made in this direction to date (Gawankar, 2020). This study is the first to analyse the dominance of big data practices (data science, neural networks, enterprise resource planning, cloud computing, machine learning, data mining, RFID, Blockchain and IoT, and business intelligence) at the retail supply chain level, and it thus helps newly emerging retail firms. Secondly, this study takes a first step towards identifying a set of seven retail supply chain performance measures (supplier integration, customer integration, cost, capacity utilization, flexibility, demand management, and time and value) that guide the selection of the best big data practices. This was done by reviewing the extant literature and then holding discussions with the industry respondents. Thirdly, this study uses the TODIM (Portuguese acronym for interactive and multi-criteria decision-making) methodology to select the best big data technology based on the seven retail supply chain performance measures. Cloud computing, RFID and data science are the top three practices on the cost criterion, which is in line with similar studies conducted on retail and e-commerce supply chains.

4.2 Managerial implications

The results show that RFID has the top priority among the selected big data practices. One of the major challenges for supply chain managers in India is pilferage. According to the global retail theft barometer, 2.38 percent of sales are lost to pilferage (SDM, 2015). RFID can help supply chain managers address this problem across the supply chain. RFID also offers real-time inventory visibility in the retail shop, which helps inventory managers to monitor and control the inventory supply at all times. Moreover, in retail business, identifying the exact location of an item is challenging because the inventory comprises hundreds of thousands of stock keeping units. This problem can be solved by implementing RFID in the retail sector. Cloud computing has the second priority; it helps supply chain managers to understand customers better and to bring more innovative products to market using the enormous volume of data obtained from customers. Integrating cloud computing services in the retail sector not only significantly decreases IT costs but also streamlines the workflow and improves efficiency and the end-user experience. Predicting demand and maintaining inventory across the stages of the supply chain is a key challenge in the retail industry. With AI and automated machine learning tools, a retail outlet can plan to optimize purchases, inventory and sales. These tools help retail outlets with assortment planning according to customer demand. They further check whether the amounts are set appropriately and which outlets require certain goods in certain quantities, taking local, regional and other features into account. 
AI and automated machine learning tools can also help the retail outlet forecast the number of goods required on a given day in response to customers’ demand, thereby saving money and time. Similarly, blockchain technology will help supply chain managers to manage inventories across the chain and avoid product shortages. However, during our data collection stage we observed that blockchain technology is still at a very nascent stage in the Indian retail industry. Neural networks will help supply chain managers to manage the data and segment the customer database as required.

4.3 Limitations and future research directions

This study raises several important issues that require further research. For instance, the same research problem can be addressed using fuzzy or grey TODIM techniques to handle the vagueness of the experts’ judgments. Furthermore, the result of a multi-criteria decision-making model depends entirely on the inputs provided by the respondents of the case organization, so effort should be taken to select the right expert pool. If a single expert is selected, the results will be subject to expert bias. It is therefore worthwhile to select experts from different functional groups with relevant knowledge and expertise; otherwise, it becomes a garbage in, garbage out problem. Effort should also be taken to check the consensus of the feedback from the experts. Group decision-making techniques such as the geometric mean serve this purpose (Ramkumar & Jenamani, 2012; Ramkumar et al., 2016). The model can be further improved by performing additional field surveys, and it can be refined on the basis of widely used theoretical perspectives instead of being developed purely from the literature. It is also worthwhile to study the impact of each criterion on the final selection.
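One common way to combine several experts' scores, sketched here for illustration, is an entry-by-entry geometric mean. The function name and the two small score matrices are hypothetical, and the sketch is not claimed to reproduce the procedure of the cited studies.

```python
import math


def geometric_mean_aggregate(expert_matrices):
    """Combine several experts' score matrices entry by entry
    using the geometric mean. All scores must be positive."""
    n_experts = len(expert_matrices)
    n_rows = len(expert_matrices[0])
    n_cols = len(expert_matrices[0][0])
    return [[math.prod(m[i][c] for m in expert_matrices) ** (1.0 / n_experts)
             for c in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical 1-5 scores from two experts for 2 alternatives x 2 criteria.
expert_a = [[2, 4], [5, 1]]
expert_b = [[2, 1], [5, 4]]
combined = geometric_mean_aggregate([expert_a, expert_b])
```

Where the experts agree, the combined score equals their common score; where they disagree, the geometric mean lies below the arithmetic mean, which dampens the influence of a single high rating.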

5 Conclusions

MCDM methods support business decisions that must satisfy many criteria. The distinctive features of this study on selecting the best big data practice to improve overall supply chain performance are that qualitative and quantitative factors on different scales are combined in the same technique, that the notion of risk is taken into account in the analysis, and that the TODIM method is applied to this problem. Although many decision-making methods rely on complex calculations, the proposed framework can easily be applied by companies in various industries to make decisions based on many criteria. To obtain more accurate data and more efficient decisions, fuzzy logic can be incorporated into the TODIM method, for example as Pythagorean fuzzy TODIM, to reduce uncertainty and ambiguity in company decisions. The criteria chosen in this study are relevant to the Indian retail supply chain and can vary across industries. Hence, the dominance values change with the criteria: for some industries cost may be the main criterion, while for others it may be demand management. The priorities obtained may therefore not generalize.