1 Introduction

Rural-to-urban migration is increasing worldwide [21]. Hence, governments must provide infrastructure and basic amenities at the same pace. However, financial constraints slow infrastructural development. Therefore, the available resources must be used efficiently, which led to the development of the Smart City (SC) concept. A Smart City can be defined as the application of information and communication technologies to create a better and more sustainable living experience, environmentally, socially, and economically [47, 48].

The SC consists of six components [15]: Smart Economy (SE), Smart People, Smart Governance, Smart Mobility, Smart Environment, and Smart Living. Lately, two more components have been added: Smart Industry and Production, and Smart Healthcare [8]. SE is an economy designed to leverage technology to ensure sustainability and efficiency. Smart People refers to the people in the SC, who use the other SC components and provide them with data, which is then used to make intelligent decisions [23, 43]. Smart Governance refers to intelligent, data-driven decision-making that serves the public’s needs [43]. Smart Mobility consists of connected vehicles that provide a safe, clean, and affordable means of transportation [38]. Smart Environment involves the management of the environment through fine-grained sensing and data analysis. Smart Living, also known as smart homes and smart buildings, is the use of technology to deliver quality services to citizens [43]. Smart Industry and Production refers to technology-enabled manufacturing, while Smart Healthcare is the use of information technology to improve healthcare delivery [36].

An ideal SE must include the economic competitiveness of the labor market [15]. It must also integrate the national and global markets. With advances in Internet of Things (IoT) technology, as well as a reduction in its cost and maintenance, researchers are looking for ways to harness this technology to make cities smarter. Several works on SCs are available in the literature [3, 36, 44]. In commerce, researchers have proposed SC solutions that improve the import, export, transportation, and wholesale of goods. Kiran et al. [28] developed a water marketplace where houses can buy and sell water in their neighborhood. The system uses machine learning to estimate the daily and monthly cost of water for each household, thus making transactions easier.

Lv et al. [31] investigated a new system for SC Vertical Markets (VM). A VM is a group of enterprises or industries that use similar production methods and market similar products or services. Hence, the vertical market aims at satisfying individual professional needs. The authors investigated the security of the system. They also investigated the profits gained by both retailers and suppliers under joint-operation and self-operation modes. They found that the system is stable.

However, little work is available on the customer shopping experience. This is partly because the customer side of the market is not an organized entity, so there is nobody to bear the cost of design and development. Retailers do not want to pay for it either, since they see customers as the opposing side of the commerce game. Yet improving the customer experience rewards the retailers: it encourages customers to revisit the retailer, thus increasing the retailer’s profit.

In this paper, we propose a fog computing-based shopping recommendation system. The system takes the user’s shopping list or preferences as input and recommends products accordingly. The system is multi-objective: it considers discounts, congestion, and the relative distance of the customer from the shops in the mall. We believe shoppers (especially tourists) will enjoy the extra information they obtain from the system, which will increase both the length of their shopping sessions and the retailers’ profit.

The remainder of this paper is organized as follows: Section 2 reviews the works on Smart Shopping (SMSH) systems available in the scientific literature. Section 3 presents a detailed description of the proposed system. Section 4 shows the performance of the proposed system. Section 5 describes some of the challenges and research gaps in path planning for SMSH systems, while Section 6 concludes the paper and discusses our future work.

2 Review

This paper uses the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) technique to review the literature on SMSH [40]. Figure 1 shows the process. Our search for path planning techniques in smart shopping yielded no papers, let alone fog computing-assisted path planning. Since our system carries out recommendations in addition to path planning, we expanded the search term by adding recommendation systems. The identification stage shows that the Google Scholar database search yielded ten papers, while the search term used in the Scopus database returned 15 papers. Figure 2 shows the annual number of publications on smart shopping recommendation systems. The number of papers is low because neither retailers nor the government is willing to invest in the area. However, the number of papers published between 2019 and 2022 shows that researchers are gaining interest in the area. There is only one duplicate between the two databases. We studied the remaining 24 papers. Five of them are irrelevant to SMSH in computer science. Thus, 17 papers are reported in this section.

Fig. 1 PRISMA for smart shopping systems

Fig. 2 Annual publications in smart shopping systems according to Scopus

Customer satisfaction leads to customer loyalty [13]. The primary aim of SMSH is to leverage information technology to improve the efficiency of the smart economy’s retail sector. This paper focuses on improving the customers’ shopping experience, for which several SMSH solutions have been proposed. The SMSH solutions found in the literature can be grouped into smart shopping carts (SSCs), smart billing systems, smart inventory systems, smart customer support, and recommender systems (shops, clothing, goods, and recipes). Table 1 summarizes the technologies in this review. It references the papers and the type of technology used. It also shows the functionalities of the systems and the targeted customer demographic.

Table 1 Summary of smart shopping systems

The SSCs are designed to help customers with inconveniences like locating items and checking out. The simpler version of SSCs, which we call SSC-V1, is usually fitted with WiFi for communication and radio-frequency identification (RFID) readers for scanning products labeled with RFID stickers [26, 27, 37]. In SSC-V1, when a customer places a product in their shopping cart, it is recorded in their virtual cart on the shop’s server. Some SSC-V1s help customers track their spending to ensure they stay within their budget [6]. These carts have green and red lights. The green light turns on when the user enters their budget in the cart via a keyboard. It switches to the red light when the items in the cart exceed the customer’s budget. SSC-V1 cannot check out autonomously: the cashier validates the virtual cart while the customer pays in the traditional point of sale (POS) manner. Thus, SSC-V1 reduces the service time at the POS. However, there is a risk of shoplifting because SSC-V1 has no theft-detection system. To solve this problem, some authors proposed using a second RFID reader at the gate [51], while another work [17] used weight sensors to validate the data read by the cart’s RFID reader.
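The budget indicator and the weight-based validation described above can be sketched as follows. This is only an illustrative model: the class `VirtualCart`, its method names, and the 20 g tolerance are assumptions made for the example and are not taken from the cited systems.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualCart:
    """Hypothetical model of an SSC-V1 virtual cart (names are illustrative)."""
    budget: float
    items: dict = field(default_factory=dict)  # RFID tag -> (price, weight in grams)

    def scan(self, tag: str, price: float, weight_g: float) -> None:
        # Record a product when the cart's RFID reader detects it.
        self.items[tag] = (price, weight_g)

    def total(self) -> float:
        return sum(price for price, _ in self.items.values())

    def light(self) -> str:
        # Green while spending is within the budget, red once it is exceeded.
        return "green" if self.total() <= self.budget else "red"

    def weight_matches(self, measured_g: float, tolerance_g: float = 20.0) -> bool:
        # Compare the weight-sensor reading with the weights of the scanned items.
        # The tolerance value is an assumption for this sketch.
        expected = sum(weight for _, weight in self.items.values())
        return abs(measured_g - expected) <= tolerance_g
```

A mismatch reported by `weight_matches` would indicate that the cart holds something the RFID reader did not register, which is the situation the weight sensors in [17] are meant to catch.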

Some SSC-V1s are equipped with accessibility features. Shete et al. [45] added braille-translated buttons to the cart to improve user experience and system reliability. A similar technology with an autonomously moving cart is presented in [14]. However, these systems do not recommend items to the customer. They are also limited to a single store. Therefore, the customer must choose from the local variety.

The second version of SSCs, SSC-V2, allows online checkout but has no recommendation system. Maulana et al. [33] surveyed RFID-based self-checkout systems. SSC-V2 avoids the traditional POS altogether, which is an advantage during festivities (e.g., Eid, Christmas, and New Year) when the number of incoming customers increases sharply. Some variations use WiFi [9] or a robotic arm [4].

The third version, SSC-V3, offers recommendation systems in addition to the online billing system. The recommendation systems range from a simple recommendation of items based on their price or expiry dates [51] to systems that recommend items based on the user’s purchase history. In addition, they can help shoppers locate items within the shop [39]. Guan et al. [16] experimented with convolutional neural networks, support vector machines, and later kernel fusion (LKF). They found that machine learning techniques can recommend clothing to customers. The authors of [19] used a collaborative clustering recommendation system. It helps customers by recommending products based on the items they have already taken and the ratings provided by others. The authors of [11] used a social vector to make recommendations to customers. The social vector is a mathematical technique that defines and quantifies social relationships between entities. In [41], the SSC-V3 combines the techniques in [19] (machine learning recommendation system) and [17] (online billing system).

The more data a recommendation system gathers, the more accurate it can be. Thus, most SSC-V3s have a login system, which helps record each customer’s shopping history. The SSC-V3 in [50] captures the customer’s face and then uses face identification to access their account on the shop’s server, while [34] uses fingerprints. The cart uses the customer’s consumption interest data and the shop’s available products to generate a list of recommended products for them, which it shows on its display unit. Furthermore, the cart continues gathering information to fine-tune the user’s interests as they shop. The problem with such systems is privacy, since the shop’s server must store the customer’s biometric data.

Some researchers argue that the customer does not need an SSC because smartphones can improve their experience. The shopper uses their phone to scan their groceries’ barcodes and make payments, as proposed in [18]. These systems reduce the service time at the cashier’s table, thus improving the customers’ shopping experience. However, cybersecurity and privacy are among the systems’ limitations. In [46], customers use a mobile application to scan the items they want and then place the items back on the counter. On checkout, the customer can collect their groceries or have them delivered home. The system also uses natural language processing algorithms to recommend a recipe to the customer based on the items they bought. Other mobile applications keep and track the user’s shopping list [10, 20]. The application in [20] uses the Apriori algorithm for recommendation and possibly to remind the customer of items they may have forgotten.

Hussien et al. [22] proposed using RFID to help visually impaired customers navigate the shop. The shop has RFID stickers on both ends of each aisle. The customer inputs their desired aisle verbally through their phone’s microphone. The phone then gives the customer directions, while a cane equipped with Bluetooth and an RFID reader is used for localization. Furthermore, there are RFID labels on the shelves to identify each item, which give the customer access to information such as item name, manufacturer, expiry date, and cost. Although the system helps the visually impaired access product information and location, too many components are needed for the system to work: (1) the customer must carry a phone, (2) they must also carry a specialized cane, and (3) they must operate both gadgets while simultaneously pushing a cart. Also, the number of components may reduce the system’s reliability or hamper its accessibility.

A shopping list management system is developed in [7]. The system contains a list of items a customer wants to buy. It uses anchor beacons in the mall to locate the item at the top of the list. It then displays the route to the item’s location on the customer’s map. The item is removed from the list once the customer obtains it, and the route to the next item is then displayed. This process continues until the list is exhausted. This system helps users complete their shopping in a short period, thus reducing congestion, which is useful during the COVID-19 pandemic. Similar technology is proposed in [29]. However, in this case, the system informs the retailer of the users’ activity to let them know which goods are preferred, while preserving the customers’ privacy with the k-anonymity privacy-preserving technique. The system uses beacons rather than fog computing. Thus, all information is processed by the server, which degrades the users’ shopping experience.

We propose an SMSH system that recommends items to customers and thereby improves their shopping experience. Our system differs from the system in [7] in that it decides which item to recommend by calculating the cost of recommending each product, which gives the customer and the mall management some flexibility. The system calculates this cost as a weighted sum of sub-costs for the items’ discount, the shop’s location relative to the customer, and congestion in the shops. To the best of our knowledge, there is no such system in the literature. We hope it will show retailers the importance of such systems, especially during the COVID-19 pandemic, when shoppers are afraid of enclosed areas and crowded places.

3 Proposed system

In this research, we propose a smart shopping system to enhance shoppers’ experience. Figure 3 shows the system’s top-level use case diagrams: Fig. 3a is the top-level use case diagram of our proposed system, while Fig. 3b is the top-level use case diagram of the simulation setup.

Fig. 3 System’s top level diagrams

The shopper using the proposed system is called the Smart Shopper (SMS). The SMS accesses the system through his handheld device (e.g., smartphone or tablet). As Fig. 3a shows, the handheld device connects to the recommendation system through the fog layer as soon as the SMS enters the mall. His shopping list is connected to the recommendation system through the list manager. The list manager allows the system to interact with the customer through the list and lets the system know which items have been bought and which remain. The recommendation system uses the state of the shopping list, the customer’s preferences (if set), the administrator settings, and the items’ information in the database to determine which items to recommend for the shopper to buy next or which shop to visit. This choice is optimized using the recommendation system’s optimization algorithm discussed in Section 3.2.3.
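The role of the list manager can be illustrated with a minimal sketch. The class `ListManager` and its methods are hypothetical names chosen for this example; the paper does not specify the list manager’s interface.

```python
class ListManager:
    """Hypothetical list manager: tracks which items are bought and which remain."""

    def __init__(self, items):
        self._remaining = list(items)  # keeps the shopper's original order
        self._bought = []

    def mark_bought(self, item):
        # Raises ValueError if the item is not on the remaining list.
        self._remaining.remove(item)
        self._bought.append(item)

    def remaining(self):
        return list(self._remaining)

    def bought(self):
        return list(self._bought)

    def is_done(self):
        # True once every item on the list has been bought.
        return not self._remaining
```

Under this sketch, a recommendation component would query `remaining()` before each recommendation and stop once `is_done()` returns `True`.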

For a fair simulation, we model the types of shoppers that may be found in the mall: the Nonchalant shopper (NCS), the Naive shopper (NVS), and the Smart shopper (SMS). They are discussed in detail in Sections 3.2.1, 3.2.2, and 3.2.3, respectively. Figure 3b shows the top-level diagram of how these shoppers interact in the simulation. The NCS goes around the shops without any shopping list. He only decides to buy an item on the spur of the moment. The NVS is the reproduction of [7]; he has a shopping list and buys items according to it. Finally, the SMS is a shopper who uses our proposed system described earlier. He follows the system’s recommendations on what, when, and where to buy an item.

In this research, we used the AnyLogic simulator [5], a Java-based discrete-event simulator. Figure 4 shows the simulation setup for the proposed system. It shows the floor plan of a typical mall found in the city of Madina. Furthermore, we obtained the items and their prices from the respective shops’ websites, which improves the accuracy of the simulation when investigating the savings of the different shoppers. Table 2 shows the shops and their simulation setup. The “ID” column shows the ID of each shop, which describes the type of shop as categorized in Table 3. The “Shops” column is the market name of the shops. The “Name” column is the name used in the simulation code [1]. The “Description” column explains what the shop sells and gives its serial number in that category. The “Location” column shows the shops’ physical location in the mall, represented in two-dimensional Cartesian coordinates with the top left corner of Fig. 4 as the origin.

Fig. 4 Simulation setup for the proposed system

Table 2 Shops used in the simulation
Table 3 Types of shops

In the simulation, we deploy Fog nodes in the mall. The circles at the center of the alley in Fig. 4 represent the range of each fog node. The fog nodes communicate with the shoppers’ IoT devices (smartphones or tablets) over the mall’s Wi-Fi. The fog nodes receive the customer’s shopping list or preferences. Then, the fog layer recommends discounts and other promotions that may interest them. It also helps localize the shopper so that the recommendation system can make an optimum decision on shops and items to recommend.

3.1 Market model

This section explains how the market in the simulation works. It describes how the shoppers go about buying items in the different shops. It also explains the queuing model used in each shop after a shopper comes to the cashier for checkout. Note that our proposed system has no online billing system. It only recommends items to the customer and plans their shopping session according to the shoppers’ and administrators’ predefined preferences. The market model is as follows:

  1. The arrival of shoppers at the market is modeled as a Poisson process, so the inter-arrival times of shoppers follow an exponential distribution with intensity λ.

  2. Each arriving shopper is a nonchalant, naive, or smart shopper with probability Pnc, Pnv, or Psm, respectively, where Pnc + Pnv + Psm = 1.

  3. The model assumes an infinite quantity of goods and prices that do not change with demand or supply.

  4. When shoppers arrive at a shop, their shopping process is modeled by the queueing process shown in Fig. 5. The time it takes to serve each user at the point of sale is fixed at tpos.

  5. Table 2 shows the shops in the simulated market. Thus, 1 ≤ n ≤ 11 in (1).

  6. We developed a database containing the items in each shop and their prices in Saudi Riyals. Some items were obtained from databases publicly available on the Internet [2, 12, 32, 49], while others were found on the websites of shops in Madina [25, 30].
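The first, second, and fourth assumptions above can be sketched in a few lines of Python. This is not the AnyLogic model used in the paper: the function names, the seed, and the single first-come-first-served cashier are assumptions made to keep the example short.

```python
import random


def generate_arrivals(rate, horizon, p_nc, p_nv, p_sm, seed=0):
    """Return (arrival_time, shopper_type) pairs for arrivals in [0, horizon)."""
    if abs(p_nc + p_nv + p_sm - 1.0) > 1e-9:
        raise ValueError("shopper-type probabilities must sum to 1")
    rng = random.Random(seed)
    arrivals, t = [], 0.0
    while True:
        # Exponential inter-arrival times give a Poisson arrival process.
        t += rng.expovariate(rate)
        if t >= horizon:
            return arrivals
        kind = rng.choices(["NCS", "NVS", "SMS"], weights=[p_nc, p_nv, p_sm])[0]
        arrivals.append((t, kind))


def pos_departures(arrival_times, t_pos):
    """Departure times from one FIFO cashier with a fixed service time t_pos.

    A single cashier per shop is an assumption of this sketch.
    """
    departures, free_at = [], 0.0
    for t in arrival_times:
        start = max(t, free_at)  # wait if the cashier is still busy
        free_at = start + t_pos
        departures.append(free_at)
    return departures
```

For example, `generate_arrivals(0.5, 86400, 0.2, 0.3, 0.5)` would draw one simulated day of arrivals for a hypothetical intensity of 0.5 shoppers per second and a hypothetical mix of shopper types.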

Fig. 5 Flowchart modeling shopping

3.2 Shoppers model

This section explains the behavior of the three types of shoppers in the system. Figure 6 shows the state diagram of the simulation, which describes in detail how each shopper behaves. The simulation starts at the topmost part of the diagram when the customer enters the mall. After variable initialization, the simulation checks the type of shopper. The first decision box checks whether the shopper is an NCS. If so, the shopper’s behavior follows the right part of the state diagram, which is explained in Section 3.2.1. If the shopper is not an NCS, the second decision box checks whether they are an NVS or an SMS. If they are an NVS, they take the left part of the state diagram. Otherwise, they take the center part. Their respective behaviors are explained in detail in Sections 3.2.2 and 3.2.3. After the shoppers have finished shopping, they exit the mall, and the simulator destroys the shoppers’ instances.

Fig. 6 State chart of the shoppers in the proposed system

3.2.1 Nonchalant shopper model

As explained earlier, the market contains NCSs. Their behavior is described by the right branch of the first decision box in Fig. 6. The NCSs come for window shopping and occasionally buy items in a shop with the probability Pi, where i is the shop associated with the given probability. The probabilities Pi must satisfy (1). The presence of the NCSs serves as noise to the other types of shoppers since they increase the queue size at the Point of Sale (PoS). As such, we use them to see how well the proposed system and the system in [7] perform in their presence. Once an NCS finishes with a shop, they continue shopping in another shop with the probability Pc. Therefore, the number of shops an NCS visits follows a geometric distribution: the probability P that he stops after visiting exactly n shops is given by (2).

$$ \sum\limits^{n}_{i=1}P_{i} = 1 $$
(1)
$$ P = P_{c}^{(n-1)}(1-P_{c}) $$
(2)
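Equation (2) can be evaluated directly. The sketch below reads (2) as the probability that an NCS stops after exactly n shops, which is one interpretation of the formula; the function names are illustrative.

```python
def prob_visits_exactly(n, p_c):
    """Evaluate (2): P = p_c**(n - 1) * (1 - p_c) for n >= 1 visited shops."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return p_c ** (n - 1) * (1 - p_c)


def prob_visits_more_than(n, p_c):
    """Probability of continuing past the n-th shop, i.e. n consecutive 'continue' decisions."""
    return p_c ** n
```

Summing `prob_visits_exactly(n, p_c)` over all n gives 1, as expected of a probability distribution, and the value decays quickly with n when the continuation probability is small, such as the Pc = 0.1 used in Section 4.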

3.2.2 Naive shopper model

The left branch of the second decision box of the state diagram in Fig. 6 shows the behavior of the NVS. The NVS is akin to the system proposed in [7], where the shopper has a list and moves from one shop to another to buy items from their list. For fairness, we have made some modifications to the system:

  1. An NVS has little to no knowledge of the shops.

  2. Furthermore, this shopper has no access to any of the mall’s maps. Hence, he goes from one shop to another looking for the desired item.

  3. As the shopper moves about, he learns the location of the shops in the mall with a probability of recall (Prs). Thus, he can easily go back to those shops.

  4. Once the shopper finds the targeted shop, he can find all the items he wants with the probability Pri. This parameter also allows him to recall an item he wants to buy.

  5. The shopper leaves the current shop after reaching the bottom of his shopping list. Then he repeats the process from step 2.

  6. Once he has bought all the items on his list, the shopper leaves the mall.

3.2.3 Smart shopper model

The forward branch of the second decision box of the state diagram in Fig. 6 shows the behavior of the SMS. The SMS represents our proposed system. His behavior is as follows:

  1. We assume that the SMS has a list of the goods they want to buy. They input the list content into the system through their handheld device.

  2. On entering the mall, the shopper’s handheld device connects to the fog layer through the mall’s Wi-Fi. Thus, the fog layer has access to the shopper’s list and shopping preferences.

  3. To improve the shopper’s experience, the fog layer calculates the cost of each item with respect to each shop using (3)–(7). The cost is a value between 0 and 100. A shop with a cost of 0 is the most desirable.

  4. The component ρ is the cost due to the number of shoppers. It allows the system to control crowds in the mall. The component δ measures the cost of distance relative to the SMS’s location. The component ψ measures the optimization cost in terms of the price of goods.

  5. The system sends the SMS to the shop with the least cost.

  6. The SMS moves to the next shop once he has finished buying the item and the next item is in a different shop.

  7. When the list is exhausted, the SMS exits the mall.

$$ \begin{array}{rcl} i &=& \text{item the shopper wants to buy}\\ S_{i} &=& \text{shop where}~ i~ \text{could be purchased}\\ l &=& \text{shopper's instantaneous location} \end{array} $$
$$ \psi_{i} = \text{percentage discount from the original price of item}~ i $$
(3)
$$ \delta_{i} = \frac{\text{Distance between}~ S_{i}~ \text{and}~ l}{\text{Distance end-to-end of mall}} \times 100 $$
(4)
$$ \rho_{i} = \frac{\text{No. of Shoppers in}~ S_{i}}{\text{No. of Shoppers in each Shop}} \times 100 $$
(5)
$$ \text{Cost}_{i} = \alpha_{1} \delta_{i} + \alpha_{2} (100-\psi_{i}) + \alpha_{3} \rho_{i} $$
(6)
$$ \text{where}~ \alpha_{1} + \alpha_{2} + \alpha_{3} = 1 $$
(7)
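The weighted-sum cost and the choice of the least-cost shop can be sketched as follows. Several details are assumptions made for this example: the distance is taken as Euclidean, the denominator of ρ is taken as the total number of shoppers currently in all shops, and the names `Offer`, `cost`, and `best_offer` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Offer:
    """One (item, shop) candidate; all field names are illustrative."""
    shop: str
    location: tuple        # (x, y) position of the shop in the mall
    discount_pct: float    # psi: percentage discount on the item
    shoppers_in_shop: int  # current number of shoppers in this shop


def cost(offer, shopper_xy, mall_span, total_shoppers, a1, a2, a3):
    """Weighted-sum cost: a1*delta + a2*(100 - psi) + a3*rho, with a1 + a2 + a3 = 1."""
    if abs(a1 + a2 + a3 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # delta: distance to the shop as a percentage of the mall's end-to-end distance.
    delta = math.dist(offer.location, shopper_xy) / mall_span * 100
    # rho: share of shoppers in this shop (assumed denominator: all shoppers in shops).
    rho = offer.shoppers_in_shop / total_shoppers * 100 if total_shoppers else 0.0
    return a1 * delta + a2 * (100 - offer.discount_pct) + a3 * rho


def best_offer(offers, shopper_xy, mall_span, total_shoppers, a1, a2, a3):
    """Return the candidate with the least cost."""
    return min(
        offers,
        key=lambda o: cost(o, shopper_xy, mall_span, total_shoppers, a1, a2, a3),
    )
```

Setting one weight to 1 and the others to 0 reduces the recommendation to a single objective, which is how the weights give the mall management flexibility over what the system optimizes.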

4 Results

In this paper, we propose a fog computing-based SMSH system. It recommends goods to customers as they shop in a mall. To evaluate the system’s performance, we simulated the system using AnyLogic [5]. Figure 7 shows the factors and response variables involved in the simulations. Factors are the variables that affect the system’s performance, while the response variables show how the system responds to changes in factors. Table 4 shows the parameter settings for our experiments.

Fig. 7 Top-level block diagram for simulating the proposed system

Table 4 Constant simulation parameters

As mentioned in Section 3, we modeled three shoppers: (1) the Smart shopper (SMS), who uses the proposed system, which optimally proposes items and shops to the shopper; (2) the Naive shopper (NVS), modeled after the system proposed in [7], who has a list that he uses to go to the shops to buy products; and (3) the Nonchalant shopper (NCS), a shopper without a plan who goes to shops at random and buys items at random. The NCS allows us to add noise to the system when testing the SMS and the NVS.

In this section, two experiments were carried out. First, we investigated the performance of each system independently. We ran each simulation for 24 hours (i.e., 86,400 s) of simulation time. We ran nine simulations. Table 5 shows the settings for all nine experiments. The first column gives the notation used to refer to the simulation setting. The next column explains the notation. The next three columns, C, P, and D, represent the Crowd, Price, and Distance, respectively. They are the optimization parameters for the recommendation system. The value 1 means the parameter is enabled in the simulation, while the value 0 means it is not. The Naive notation means that only the NVS was simulated, while the Nonchalant notation means that only the NCS was simulated. Thus, in both cases, the last three columns are not needed.

Table 5 First experimental settings
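The seven smart-shopper settings can be written as (C, P, D) flags. In the sketch below the flag values are inferred from the experiment names used in the text, and the mapping from flags to the weights of (6), which splits the weight equally among the enabled objectives, is purely an assumption for illustration; the paper does not state the weights used.

```python
# (C, P, D) flags per experiment, inferred from the experiment names.
SMART_CONFIGS = {
    "Smart_C": (1, 0, 0),
    "Smart_P": (0, 1, 0),
    "Smart_D": (0, 0, 1),
    "Smart_CP": (1, 1, 0),
    "Smart_DC": (1, 0, 1),
    "Smart_DP": (0, 1, 1),
    "Smart_All": (1, 1, 1),
}


def weights_from_flags(c, p, d):
    """Return (alpha1, alpha2, alpha3) for distance, price, and crowd.

    Equal weighting of the enabled objectives is an assumption of this sketch.
    """
    enabled = c + p + d
    if enabled == 0:
        raise ValueError("at least one objective must be enabled")
    return (d / enabled, p / enabled, c / enabled)
```

For instance, `weights_from_flags(*SMART_CONFIGS["Smart_DC"])` splits the weight between the distance and crowd terms and gives the price term a weight of zero.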

Figure 8 shows the performance we obtained from the aforementioned simulations. Figure 8a shows the percentage of shoppers who completed their shopping in each simulation. This experiment is important because it shows the efficiency of the systems. The Nonchalant experiment shows the highest efficiency (98.68% of shoppers completed their shopping) because NCSs have shorter shopping lists. As explained in Section 3, after shopping in one shop, the NCS continues shopping in another shop with the probability Pc. Since Pc = 0.1, the chance of continuing is small. Figure 9 shows how quickly the probability that the NCS will continue shopping decreases as the number of shops already visited grows. Hence, 90.0% of the NCSs only went to one shop, about 10.0% only visited two shops, and only 0.9% visited more than two shops. The NVS and the SMS have 20 items on their shopping lists in the remaining experiments. Thus, we can fairly compare the other eight experiments. Among them, the Smart_All experiment shows the best performance because it considers all factors (i.e., C, P, and D) when recommending a shop or item to the SMS. The Naive experiment shows the worst performance because the NVS goes to shops according to their shopping list regardless of other factors.

Fig. 8 Performance evaluation of the individual experimental setups

Fig. 9 The probability to continue shopping

Figure 8b shows the average time spent by a shopper in minutes. The NCS has approximately \(\frac{1}{15}\) of the shopping duration of the other experiments because the majority (90.0%) of the NCSs visited only one shop, while the remaining shoppers have 20 items to buy, although some items are found in the same shop. Excluding the Nonchalant experiment, the best performance is shown by Smart_DC, which prioritizes the distance of the shop from the shopper and the crowd size. The Naive experiment has the worst performance because the shopper does not consider any factor that would help him finish shopping earlier. Rather, he just follows his list. The simulations related to price, Smart_P and Smart_CP, also show poor performance because market forces push more customers to the shops with cheaper products. Thus, those shops become crowded and the shopper spends more time in the checkout queue. Therefore, adding an online billing system to the solution would help improve their performance. The SMS experiments in Fig. 8a and b show similar performance because the earlier a customer finishes shopping, the sooner more customers can be served.

$$ Savings = \sum\limits_{j=1}^{m}\sum\limits_{i=1}^{n}p_{i,j} \times \frac{\psi_{i}}{100} $$
(8)

Figure 8c shows the shoppers’ savings. Equation (8) shows how the results were obtained. It is the sum of the money saved on all n items bought by all m shoppers who completed their shopping. The savings made by a customer j when he buys an item i can be calculated by multiplying the percentage discount ψ by the actual price of the item p. The results show that Smart_CP has the highest savings because the algorithm recommends items and shops based on price and congestion. The P parameter helps increase savings, while the congestion part helps more customers complete their shopping. The next best experiment is Smart_P, which only focuses on the price. This algorithm tends to crowd shops with more items on sale, thus lengthening the customers’ shopping sessions. The Smart_DP experiment saves \(4.44 \times 10^{3}\) SAR (Saudi Riyals) because it considers distance and price. It is worse than Smart_CP because distance and price crowd the shops closest to the door with the largest discounts. It is followed by Smart_All, which (intuitively) should have been the best because it accounts for a little of every objective. However, it performs poorly because of the P parameter. Smart_C, Smart_D, and Smart_DC lost about 1.43 SAR compared to the best performance because of the absence of the P parameter in the optimization algorithms. The Naive experiment also performs like the above trio because it does not account for price discounts. Finally, the Nonchalant experiment has the shortest shopping list, which is why it has the lowest savings.
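The savings computation in (8) amounts to a double sum over shoppers and their purchased items. The following is a minimal sketch with hypothetical data; the function name and the input layout are illustrative.

```python
def total_savings(purchases):
    """Sum price * discount / 100 over all items of all shoppers, as in (8).

    purchases: one list per shopper who completed shopping; each entry is a
    (price, discount_pct) pair for one purchased item.
    """
    return sum(
        price * discount_pct / 100
        for shopper in purchases
        for price, discount_pct in shopper
    )
```

For example, two hypothetical shoppers with purchases `[[(100.0, 10.0), (50.0, 20.0)], [(200.0, 0.0)]]` save 10 on each of the two discounted items and nothing on the undiscounted one.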

$$ J(S_{1},S_{2},\dots,S_{n}) = \frac{({\sum}_{i=1}^{n}S_{i})^{2}}{n{\sum}_{i=1}^{n}{S_{i}^{2}}} \times 100 = \frac{\bar{S}^{2}}{\overline{S^{2}}} \times 100 $$
(9)

Figure 8d shows the simulations’ Jain Fairness Index (JFI), which measures the fairness of a system [24]. A system is fair if it distributes load (i.e., shoppers) equally among its subsystems. The JFI rates the given system with a value between zero and one; values near zero indicate that the system is unfair, and one indicates that it is fair. In our research, the JFI is used to measure the uniformity of the distribution of shoppers. Equation (9) calculates the JFI of the simulations: it is the square of the average number of shoppers per shop, \(\bar{S}^{2}\), divided by the average of the squared numbers of shoppers in the shops, \(\overline{S^{2}}\). It is multiplied by 100 to get a percentage. Figure 8d shows that the system with the best fairness value is Smart_C. This result validates the simulation since Smart_C greedily selects the least congested shops. Naive is second because the shoppers follow their shopping lists. They become uniformly distributed among the shops because the lists are generated using a uniform distribution. Smart_DC is third because of the C parameter; without it (as in Smart_D), the fairness falls by 1.15%. The fourth is Smart_CP. It performs worse than Smart_DC even though both have the C parameter, because the P parameter drives shoppers towards shops with discounts, where the crowd is larger. This is also why Smart_D performed better than Smart_P. The Smart_DP configuration has the worst performance because of the absence of the C parameter and the presence of price. Thus, it disregards the crowd and looks for the closest shops with discounts, thereby creating congestion. We also see an improvement in performance when the C parameter is added to Smart_DP to form Smart_All, because it is crowd-conscious.
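As an illustration of (9), the JFI can be computed from the number of shoppers in each shop. The function name and the sample loads below are hypothetical.

```python
def jain_fairness_index(loads):
    """Jain Fairness Index as a percentage: (sum S)^2 / (n * sum S^2) * 100."""
    n = len(loads)
    sum_sq = sum(x * x for x in loads)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one shop with a non-zero load")
    return sum(loads) ** 2 / (n * sum_sq) * 100
```

A perfectly even distribution such as `[10, 10, 10, 10]` scores 100, while putting all shoppers in one of four shops, `[40, 0, 0, 0]`, scores 25, which is the minimum of 100/n for n = 4 shops.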

We use analysis of variance (ANOVA) at a 95% confidence level to investigate the SMS more closely. ANOVA is usually presented in a six-column table [35]. The first column lists the factors under investigation: C, P, and D, representing the Crowd, Price, and Distance objectives, respectively. Their combinations represent the interactions between the factors. The second column is the sum of squares (Sum Sq.) of the residual errors. It helps express the total variation of the various factors. The third column is the degree of freedom (d.f.), while the fourth column is the mean sum of squares (Mean Sq.), which is calculated by dividing the sum of squares by the degree of freedom. The fifth column is the F-ratio (F) of the factors. It is the ratio of the variation between sample means to the variation within the samples. The sixth column is the p-value of the model. It is the probability of obtaining an F-ratio as large as or larger than the one observed if the factor had no effect. Since the confidence level is 95%, the significance level of the investigation is 0.05. Thus, we infer that a factor or interaction is statistically significant if its p-value is < 0.05; otherwise, it is not.
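To make the columns concrete, the sketch below derives the Sum Sq., d.f., Mean Sq., and F entries for a single factor. It is a deliberately simplified one-way ANOVA, not the three-factor analysis with interactions used in this paper, and the function names and sample data are hypothetical. The p-value is taken as given because computing it requires the F distribution.

```python
from typing import Dict, Sequence


def one_way_anova(groups: Sequence[Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Return the Sum Sq., d.f., Mean Sq. and F entries for one factor."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]

    # Variation between the group means (the factor row).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation within the groups (the error row).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_between = ss_between / df_between   # Mean Sq. = Sum Sq. / d.f.
    ms_within = ss_within / df_within
    return {
        "factor": {"sum_sq": ss_between, "df": df_between,
                   "mean_sq": ms_between, "F": ms_between / ms_within},
        "error": {"sum_sq": ss_within, "df": df_within, "mean_sq": ms_within},
    }


def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Decision rule at the 0.05 significance level."""
    return p_value < alpha


# Hypothetical measurements for three levels of one factor.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table["factor"], table["error"])
```

A large F means that the variation between the group means dominates the variation within the groups, which is what a small p-value then confirms.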

Table 6 shows the ANOVA table for the experiments in Table 5. It investigates the percentage of shoppers that completed their shopping. The results show that this percentage is mainly affected by C, D, and their interaction (C*D). Intuitively, this is true because the number of customers in a shop lengthens both the time a customer needs to get the item they want and the checkout time. Also, the distance to the shop determines when the shopper reaches the shop, which adds to the shopping time for the item. However, none of the results is significant because the proposed system recommends shops or items to customers based on the current number of shoppers in the shop. It avoids counting shoppers that have not yet entered the shop because they might balk and go to another shop. Therefore, to improve its performance, the system must be able to predict whether a shopper will accept a recommendation, and then make recommendations to other shoppers based on that prediction.

Table 6 ANOVA table for completed shopping

Table 7 shows the ANOVA for the duration performance. It investigates the average time a customer spends to complete their shopping. The results show that the D and P parameters affect the average shopping completion time. This makes sense, since the distance adds to the time needed to reach a shop, while the price tends to attract customers to a shop, thus crowding it and making checkout queues longer. However, none of the results is statistically significant. Like the results in Table 6, these results are affected by the inability to predict whether users will accept a recommendation.

Table 7 ANOVA table for duration (min)

Table 8 shows the ANOVA for the total savings of the SMS simulations. It investigates how the parameters affect SMS savings. We found that only the P parameter is statistically significant. This result also helps validate the simulation, since the P parameter is expected to affect the savings of the SMSs significantly. The D and C parameters and the interactions are not statistically significant. Thus, the only definitive way to maximize customers’ savings is to use the P parameter. However, earlier results (and Table 9) show that P tends to crowd the shops with many items on sale. Therefore, the P parameter should be combined with C to avoid this problem. However, the two should not have equal priority, since Table 8 shows that the result is not reliable when they do.

Table 8 ANOVA table for savings (× 10³ SAR)
Table 9 ANOVA table for load balancing shops

Table 9 presents the ANOVA results for the fairness of the proposed system. The system is fair if it distributes the customers uniformly in the mall. Thus, the table shows how the parameters affect the distribution of the shoppers in the mall. The results show that the P parameter accounts for most of the variation, followed by the D parameter. We learn from this experiment that P and D are the most significant parameters for crowd control, while the others are not significant. Hence, we use these two in the second experiment. Furthermore, we know from the previous experiments that both P and D negatively affect the system’s fairness. The P parameter tends to crowd customers in shops with discounts, while D crowds them in shops closer to the entrance because the algorithm starts working immediately after the shoppers enter the mall.

The second experiment aims to determine the system’s performance in a realistic setting. A real mall also has NCSs, especially in a tourist city like Al-Madinah. The NCSs stroll through the shops without a prior plan. They are there for window shopping and thus seldom buy any product. They act as noise for both the Smart and the Naive systems because they increase the service time at the POS when they join the queue. In this experiment, we set 20, 40, 60, and 80 percent of all shoppers to be NCSs, while the remaining shoppers are SMSs or NVSs. The ANOVA tables have shown that P and D are statistically significant parameters in the proposed system. Thus, we choose Smart_P, Smart_D, and Smart_DP to compare with the Naive. The Naive shopper strictly follows a shopping list; it is based on the system in [7]. Figure 10 shows the results of the experiments.

Fig. 10 Performance evaluation of the smart and naive systems with varying nonchalant shoppers

Figure 10a shows the savings in SAR for Smart_P, Smart_D, Smart_DP, and Naive against the percentage of NCSs, with 95% confidence intervals. The results show that the Naive experiment has the poorest savings, and its savings are lowest at NCS = 80%. The performance of Smart_D is statistically the same as that of Naive. As expected, Smart_P has the highest savings, since its primary objective is to find discounts for the SMS. Hence, it is safe to say that Smart_DP performs worse than Smart_P because of the D parameter. The D parameter lures customers to shops closer to them regardless of the price, which adversely affects the savings of the SMS.

Figure 10b shows the average time shoppers take to buy the items on their shopping lists. All experiments show resistance to an increase in the number of NCSs. Therefore, both systems work well even when many shoppers are not using the system. Another possibility is that they show resistance because the NVS selects its destination uniformly. The results show that Smart_D has the shortest shopping duration because the D parameter takes the customers to the closest shops. Smart_P performs the poorest among the smart systems. The P parameter attracts SMSs to shops with discounts, which leads to long queues and makes the shoppers stay longer in POS queues. However, when P is combined with D in Smart_DP, the shoppers complete their errands 4.7% earlier than with Smart_P. Therefore, we can conclude that the D parameter helps reduce the shopping duration. The Naive simulation takes longer than all smart systems because the NVS strictly follows its shopping list.

$$ \begin{array}{@{}rcl@{}} completion~rate &=& \frac{S_{completed}}{S_{All}} \times 100 \\ S_{All} &=& S_{completed} + S_{Shopping} + S_{arriving} \end{array} $$
(10) (11)

Figure 10c shows the percentage of customers who completed shopping. As shown in (10), it is the ratio of the number of shoppers of a particular category that completed their shopping (S_{completed}) to the total number of shoppers in the same class (S_{All}). The figure shows that as the number of NCSs increases, the number of NVSs from the Naive category that completed their shopping decreases. The more NCSs there are, the longer the POS queues in the shops, which in turn increases the service time at the POS. The results for Smart_P, Smart_D, and Smart_DP do not show any meaningful trend. This result agrees with the completed shopping ANOVA table (see Table 6), in which none of the factors is statistically significant. The completion rate is affected by the arrival rate of the customers, as shown in (10) and (11). Since we used an exponential distribution to model the arrival of the shoppers, the number of new customers arriving varies greatly from time to time, and the proposed system could not cope.
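Equations (10) and (11) translate directly into a few lines of code. The sketch below is illustrative only; the function name `completion_rate` and its argument names are assumptions, chosen to mirror the subscripts in the equations.

```python
def completion_rate(completed: int, shopping: int, arriving: int) -> float:
    """Percentage of shoppers in one category who completed their shopping."""
    # Equation (11): every shopper has completed, is shopping, or is arriving.
    total = completed + shopping + arriving
    if total == 0:
        raise ValueError("completion rate is undefined without shoppers")
    # Equation (10): share of completed shoppers, as a percentage.
    return 100.0 * completed / total


print(completion_rate(completed=30, shopping=50, arriving=20))  # 30.0
```

Because the arriving shoppers appear only in the denominator, a burst of new arrivals lowers the rate even when the number of completed shoppers is unchanged, which is the sensitivity to the arrival process noted above.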

Figure 10d shows the fairness of the experiments. Fairness is the degree of equality in the number of customers across the shops. It is measured with the JFI and expressed as a percentage, which is 100% when all shops have the same number of customers. The figure shows that the Naive experiment is the fairest system. Its performance seems fair because a uniform random number generator creates the shopping list. Thus, the generated shopping list distributes the shoppers uniformly across the mall. However, its performance drops in a severe case of noise, when NCS = 80%. The Smart_D configuration follows. It performs better than Smart_P and Smart_DP because the shoppers go to the shop closest to them. The algorithm starts to work at the entrance to the mall. The simulator uses Gaussian distribution to generate the shopping list. Thus, the Smart_D shoppers appear to pack themselves into the shops closer to the mall’s entrance at the beginning of the simulation. Later on, they gradually populate the other end of the mall.

Smart_P also performs satisfactorily because it chooses only the cheaper shops at the beginning of the simulation. The shoppers then start visiting shops with no discount because they have no other option. Smart_DP performs the poorest because the SMS has two objectives: Distance and Price. Hence, its shoppers select the shops that are both closest to them and offer cheaper items as their first choice. They choose shops that satisfy either one of the conditions as their second choice. After exhausting the first and second choices, they start visiting shops that are neither close to the entrance nor offer discounts. Another observation is that the graphs of all the proposed system’s experiments have a V-shape. They assume this shape because fairness combines the SMSs and NCSs, or the NVSs and NCSs. The two groups in each pair counter each other’s performance, which causes the performance to degrade. However, once one group outnumbers the other, the performance begins to improve.

5 Challenges

This section presents the challenges we found during this research. They are from our proposed system and the papers in the literature. Table 10 shows the challenges in SMSH systems. The first column is the challenge observed in the literature, while the second column discusses how the challenge affects SMSH systems and possible research gaps.

Table 10 Challenges in smart shopping system

In our research, we found that the proposed system can perform better if the items in the shopping list are clustered according to their shops. Also, the system records a shopper only after they reach the shop because they may balk and go to another shop. Thus, the system should predict whether a shopper will go to the shop and then use that information to provide recommendations to the other shoppers. Another issue is that the shops in a real mall do not have the same capacity. Therefore, the algorithm should not aim for the same number of shoppers in all shops. Instead, there should be a threshold on the number of customers in each shop to avoid overcrowding.
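One possible reading of the per-shop threshold is sketched below. It is not part of the proposed system: the function `pick_shop`, the `max_load` parameter, and the rule of preferring the lowest occupancy-to-capacity ratio are all assumptions made for illustration.

```python
from typing import Dict, Optional


def pick_shop(occupancy: Dict[str, int],
              capacity: Dict[str, int],
              max_load: float = 0.8) -> Optional[str]:
    """Return the least-loaded shop relative to its own capacity.

    Shops at or above `max_load` (a hypothetical fraction of capacity) are
    skipped; None is returned when every shop is over the threshold.
    Capacities are assumed to be positive.
    """
    best_shop: Optional[str] = None
    best_load = max_load
    for shop, cap in capacity.items():
        load = occupancy.get(shop, 0) / cap
        if load < best_load:
            best_shop, best_load = shop, load
    return best_shop


# Shop A holds more shoppers than B but is relatively emptier (0.4 vs 0.5).
print(pick_shop({"A": 40, "B": 10}, {"A": 100, "B": 20}))  # A
```

Comparing relative load rather than raw head counts means that a large shop is allowed to hold more shoppers than a small one before it is considered crowded.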

6 Conclusion and future works

In this paper, we propose a fog computing-based SMSH system. It recommends shops and items to customers according to their preferences. We believe that an improved shopper experience will encourage repeat visits. Also, the mall administration could use the congestion parameter to reduce overcrowding. The system uses three parameters to optimize the shopper experience: (1) the distance between the shopper and the shop, (2) the number of people in the desired shop, and (3) whether the shop offers a discount. We also found that the proposed system outperforms the system in the literature. It provides up to 30% more savings and 6.12% faster shopping. However, the system is not as fair as the one in the literature.

In the future, we shall investigate the system’s performance when the shops have an uneven number of items on sale. Also, an online billing system will be introduced to see whether it improves the SMS completion rate. Finally, we will investigate how to manage the exponential nature of the arrival rate of the SMSs.