Introduction

The concept of the Circular Economy is gaining relevance in political and business thinking as a basis for developing strategies that support a transition to a more sustainable future [15]. A Circular Economy describes an economic system based on business models that replace the “end-of-life” concept [16] by reducing, reusing, recycling, and recovering materials in production/distribution and consumption processes. The Circular Economy can operate at the micro-level (products, companies, consumers), meso-level (eco-industrial parks), and macro-level (city, region, nation, and beyond). It aims to accomplish sustainable development, creating environmental quality, economic prosperity, and social equity for current and future generations.

A Circular Economy system can improve the product cycle and change the mindset surrounding both production and consumption [10, 28].

In line with these concepts, the principles and actions of circularity have been categorized and classified. The strategies are ordered from the lowest to the highest level of circularity [29, 30, 44]: R9 corresponds to the strategy with the lowest level of circularity and R1 to the highest. In turn, these strategies are framed in three groups (Table 1).

Table 1 Classification of R principles.

Research on the Circular Economy has evolved: starting from a technical perspective that valued resource efficiency and environmental impact, it has progressively added other aspects, such as value chains and business models [24, 31].

To map the state of the art, bibliometric techniques were used, searching for the terms “Circular Economy” and retail* in the Web of Science Core Collection. The search yielded 111 documents.

If the selection is further refined by document type to include only articles published up to the end of the first half of 2021, we obtain 84 articles in 47 journals.

The main research area in which the articles are framed is Environmental Sciences & Ecology (41), followed by Engineering (39) and Science & Technology – Other Topics (28), while Business & Economics is ranked fourth (23).

There are several common elements among the ten most-cited articles (Table 2): all were published within a five-year range, and in four of them the central object studied is food and the use of its surplus in commercial establishments. For example, Borrello et al. [5] evaluated the willingness of consumers to participate actively in closed loops that reduce food waste through the involvement of all actors in the supply chain. Along the same lines, Mondejar-Jimenez et al. [23] addressed how marketing and sales strategies negatively influence the waste behavior of individuals, emphasizing the critical role retailers play in preventing the generation of food waste.

Table 2 The ten most-cited articles that include the terms “Circular Economy” and “retail”

Among the 369 keywords that appear in these articles, sustainability and model are the most frequent as independent terms; however, keywords that appear in combination with others also stand out, such as management, design, supply chain, and waste.

Very few scientific publications address elements of the Circular Economy that are important to the retail sector, such as communication, transportation, retail reverse logistics operations, or legislation and regulation. There is thus no framework explaining how companies willing to become circular adapt their existing business model or create a new one [38].

Methodology

In developing this work, we use a Deep Learning technique to predict the resistance of the businesses located in the commercial premises under study based on their activity, their circularity rating, and the history of openings and closings that the premises have had over the selected periods.

We define the resistance of commercial premises and their associated businesses as the activity’s survival in a given year. This survival is conditioned by many factors [35, 47]. However, our purpose is to study whether the activity and circularity information is sufficient to determine its resistance.

To this end, we have divided the prediction of the resistance of the premises into two periods: one related to the Great Recession of 2008, and the other to the post-recession period, from 2014 to 2018.

Data on the activities of commercial premises, and on changes in those activities, were obtained by direct exploration through Google Street View®, following a methodology based on the collection of primary data [39]. This tool made it possible to obtain information from 2008 onwards and to verify the actual evolution, in images, until 2018 (Table 3). Commercial areas in cities in the UK and Spain were analyzed, resulting in a total of 658 data points: London (193), Barcelona (198), Valencia (174), and Almería (93).

Table 3 Detail of the data table; the columns correspond to the information on Activity (Axxxx), Circularity Index (Cxxxx), and Occupation (OCxxxx): open or closed

This information is structured in a table that presents, for the study carried out in this paper, the format \((a_{y_i} c_{y_i} o_{y_i} \ldots a_{y_k} c_{y_k} o_{y_k})\), where \(a\) is the activity of the premises, \(c\) is the circularity index, and \(o\) is the occupation of said premises in the years \(y_i \ldots y_k\).
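To make this format concrete, the following minimal MATLAB sketch shows how one premises could be encoded as a sequence of tokens together with its target status. The labels shown are hypothetical; the actual activity and circularity vocabularies are those of the data table.

% Hypothetical encoding of one premises over three years.
% Each year contributes three tokens: activity (a), circularity index (c),
% and occupation (o), in that order.
record = ["A_grocery" "C_R8" "OC_open" ...     % year y_i
          "A_grocery" "C_R8" "OC_open" ...     % year y_i+1
          "A_grocery" "C_R8" "OC_closed"];     % year y_k
target = categorical("closed");                % status in the prediction year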

We evaluated the resistance of the commercial premises over the period 2008 to 2011 to predict the survival of activity in 2012 and in 2014, and over the period 2014 to 2017 to predict the survival of activity in 2018.

Deep Learning and its application to prediction

The development of neural networks during the 1980s and 1990s significantly boosted the development of artificial intelligence and its applications in scientific and technical fields. The basic neural computation models consolidated their capabilities in classification and prediction for problems previously considered too difficult due to the number, typology, sample characteristics, and quality of the data involved [40, 42].

The use of pre-trained neural networks and the emergence of new models have led to a considerable improvement in the classification, recognition, and prediction of complex phenomena.

Deep Learning arises from the need to improve recognition processes by building on neural network models pre-trained for simpler tasks [32].

This technique has gained wide acceptance after displacing other approaches to identification, classification, and data analysis in image, video, and text problems. Deep Learning is widely used to analyze symbolic (non-numerical) information and in the processing of text messages [17, 19, 46].

LSTM (Long Short-Term Memory)

Many of the problems that we encounter in the real world are only understandable if we consider the evolution of their states over time. Time series are a classic case of this type of problem. A similar situation is encountered when developing systems that understand a text or the evolution of a specific market.

One of the main limitations of neural networks, in their basic architecture, is their inability to deal efficiently with problems in which the state of a phenomenon depends on a succession of states at previous times [25].

This deficiency has been resolved through recurrent neural networks. A recurrent neural network can be expressed as a function whose output depends on a set of inputs and on the output of that same network at a previous time.

An outline of a recurrent network is displayed in Fig. 1.

Fig. 1

Outline of a recurrent neural network. The input, x(t), and the output at an earlier time, y(t−1), feed the activation and output function of the neuron to produce the output y(t)

Internally, the calculation of the output is given by the expression:

$$y(t) = \tanh\left( W \cdot \overline{[x(t), y(t-1)]} + \overline{b} \right),$$

where \(W\) is a matrix of weights that assigns a weight to each element of the input and of the output at the previous instant, and that is modified through training; \(\overline{[x(t), y(t-1)]}\) is the concatenation vector of the inputs and the outputs at the previous instant; and \(\overline{b}\) is a vector of fit (bias) values.
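As a minimal numerical sketch of this expression (dimensions chosen arbitrarily for illustration; the real W and b are learned in training):

% One step of a basic recurrent network: y(t) = tanh(W*[x(t); y(t-1)] + b)
nIn = 3; nOut = 4;                  % illustrative sizes
W = randn(nOut, nIn + nOut);        % weight matrix, modified by training
b = randn(nOut, 1);                 % vector of fit (bias) values
x_t = randn(nIn, 1);                % input at time t
y_prev = zeros(nOut, 1);            % output at time t-1
y_t = tanh(W * [x_t; y_prev] + b);  % output at time t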

One such type of network is the LSTM. The LSTM provides an answer to the problem of maintaining “long-term dependencies” that other models cannot, a problem it solves thanks to its structure. These networks were introduced in Ref. [14], although their redefinition and applications have been developed in several later studies [13, 32, 33].

An LSTM network has a structure like the one shown in Fig. 2:

Fig. 2

Detail of one of the levels of an LSTM

The LSTM has a much more complex internal structure for solving long-term dependencies than the conventional recurrent network model. The network is made up of the following elements:

  • CS(t−1), CS(t): The state of the cell at each instant of time. It stores contextual information and long-term dependencies, which are propagated through the LSTM network with minimal modifications at each instant of time.

  • X(t): The input data at a given instant.

  • Y(t−1), Y(t): The network outputs at the time instants t−1 and t.

  • Concat: \(\overline{[x(t), y(t-1)]}\), the concatenation of the inputs with the output at the previous instant, forming the input vector to the network at the current instant.

  • δ: Thresholding function of the input data, following a logistic function, to which weights modifiable by training are applied.

  • Tanh: Thresholding function between −1 and 1, following a hyperbolic tangent function, to which weights modifiable by training are applied.

  • +: Element-wise sum of the components of the input vectors to the function.

  • Χ: Element-by-element (Hadamard) product of the input vectors to the function.

As can be observed, the state of the network is transmitted from one processing unit to another, with a local interaction that modifies it slightly. This signal allows the network to maintain the long-term dependencies mentioned above and, where such a relation exists, to connect the status of the premises in previous years with the current status.
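These operations can be made explicit as one LSTM time step. The sketch below uses the standard gate equations, with δ implemented as the logistic function; the variable and field names are ours, chosen for illustration:

function [y, cs] = lstmStep(x, y_prev, cs_prev, p)
% One LSTM time step. p holds the trainable weight matrices and biases.
z  = [x; y_prev];                % Concat: input with previous output
f  = logistic(p.Wf * z + p.bf);  % delta: forget gate acting on the cell state
i  = logistic(p.Wi * z + p.bi);  % delta: input gate
g  = tanh(p.Wg * z + p.bg);      % Tanh: candidate state update
cs = f .* cs_prev + i .* g;      % X and +: new cell state CS(t)
o  = logistic(p.Wo * z + p.bo);  % delta: output gate
y  = o .* tanh(cs);              % network output Y(t)
end

function s = logistic(v)
s = 1 ./ (1 + exp(-v));          % logistic thresholding function
end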

LSTM is trained by modifying the weights that appear in the different operations, within each stage, following an iterative process of reducing the error committed at the output, with the validation set taken as a reference.

Results

We carried out an experiment using an LSTM to determine whether there is a relationship between the resistance of the commercial premises under study, their status in previous periods, their type of activity, and their Circular Economy rating. The study describes each commercial premises using a series of labels that, read as a phrase, the classifier interprets as a vector of non-numerical characteristics.

This method of working is novel and, we believe, convenient, for several reasons:

  • The numerical encoding of the circularity and activity information is not natural to the problem: it induces false, artificial order relations between the different activities and the circularity classifications. The re-encoding performed within the LSTM network does not produce this effect.

  • Traditional statistical techniques do not adequately handle the long-term dependencies that arise in this problem. We want to determine whether or not that dependence is decisive for the survival of the business.

  • The data may be biased from a statistical point of view, making it challenging to use traditional statistical techniques. This issue does not affect neural networks: they can learn from any data set, regardless of its sampling distribution.

Information on each of the premises includes its activity, its R index according to the Circular Economy, and its status (open or closed) in each of the years analyzed.

We use a pre-trained network architecture with an LSTM neural network as the main element, implemented using Matlab, to build a system that “predicts” the status of a commercial premises based on the information available about it in previous years.

As there are two periods, we constructed two predictors, one for each period.

The construction of the predictor requires partitioning each of the above data sets into two parts: a training set, on which the LSTM network algorithm will build the predictor, and a validation set, which will be used to test its accuracy.

The validation set follows the statistical rule of cross-validation, thus eliminating the possible adverse effects of a poor choice of validation set that could invalidate the results obtained.

We used 70% of the total data set for the predictor construction process and the remaining 30%, selected according to the cross-validation criteria, to build the validation set, allowing us to test the result obtained.
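A minimal sketch of this 70/30 hold-out split in MATLAB, assuming the encoded records and their target labels are held in hypothetical variables sequences and labels, and using cvpartition from the Statistics and Machine Learning Toolbox:

rng(0);                                           % reproducible random split
cv = cvpartition(numel(labels), 'HoldOut', 0.30);
XTrain = sequences(training(cv));                 % 70% for predictor construction
YTrain = labels(training(cv));
XVal   = sequences(test(cv));                     % 30% for validation
YVal   = labels(test(cv));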

The idea behind this experiment is that if a predictor can be built with sufficient precision using the data referenced above, we can deduce a strong dependence between these data and the resistance level of a commercial premises over time. This dependence has been studied in Refs. [12, 18, 48].

First experiment. Development of the LSTM predictor during the Great Recession of 2008

This experiment utilized data from 2008 to 2011, following the format \((a_{y_i} c_{y_i} o_{y_i} \ldots a_{y_k} c_{y_k} o_{y_k})\) indicated above, where \(a\) is the activity of the premises, \(c\) is the circularity index, and \(o\) is the occupation of said premises in the years \(y_i \ldots y_k\) from 2008 to 2011. The status of the premises (open or closed) in 2012 and, in a similar procedure, in 2014 is used as the prediction target. The decision to use a two-year gap between target years is important for several reasons:

  • It frees the learning process from the specificity of the year we want to predict (2012 and 2014 are very different years, economically speaking).

  • As expected, the precision obtained for 2012 should be higher than for 2014, indicating the strength of the relationship that we want to test.

    Using Matlab and the Deep Learning Toolbox, a multi-level neural network architecture with an LSTM level as the main element was trained on an i7 computer with 16 GB of RAM; the data had been previously prepared for use as a training set. The experiment involved the following phases:

  • Selection of the target to be predicted: the status of the commercial premises in 2012 (or 2014, in a later experiment).

  • Partition of the sample set (658 samples) into two subsets: the training set (70%) and the validation set (30%). The samples from the latter were not used for training the LSTM network; they were taken randomly, following statistical criteria, to cross-validate the result obtained. This last set was used to determine the precision of the predictor.

  • Re-coding of the data's text values into numerical values suitable for the LSTM training system.

  • Creation of a Deep Learning architecture with the following levels (a minimal sketch follows the list):

  • An initial level that receives the training data stream.

  • A level that converted text to vectors for use in training.

  • An LSTM level, which is the network that provided the prediction.

  • A level of adaptation of the outputs and calculation of the accuracy of the classification.
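The paper does not list the exact layers used; the following is a plausible minimal sketch of such a structure with the Deep Learning Toolbox. The wordEmbeddingLayer additionally requires the Text Analytics Toolbox, and the hyperparameter values, as well as the XTrain/YTrain variables from the split sketched above, are our assumptions.

embeddingDim = 16;     % assumed embedding dimension
numWords = 100;        % assumed vocabulary size (activity, circularity, status labels)
numHiddenUnits = 80;   % assumed LSTM width
layers = [ ...
    sequenceInputLayer(1)                            % training data stream (token indices)
    wordEmbeddingLayer(embeddingDim, numWords)       % text-to-vector conversion
    lstmLayer(numHiddenUnits, 'OutputMode', 'last')  % LSTM level providing the prediction
    fullyConnectedLayer(2)                           % two classes: open / closed
    softmaxLayer
    classificationLayer];                            % output adaptation and accuracy
options = trainingOptions('adam', ...
    'MaxEpochs', 50, ...
    'ValidationData', {XVal, YVal}, ...
    'Plots', 'training-progress');                   % produces plots like Figs. 3-5
net = trainNetwork(XTrain, YTrain, layers, options);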

The operation required relabeling of activities and circularity indices to improve performance in learning processes.

The training of the networks was then carried out. Figures 3 and 4 indicate the following:

  • Training accuracy. Classification accuracy in real time; the system reports accuracy for every five samples entered for learning. During the training process, this accuracy is computed on the data used for training.

  • Smoothed training accuracy. Smoothed classification accuracy, using a moving average of the previous accuracy values. In our case, it was more useful for observing the trend of the training process.

  • Validation accuracy. Accuracy obtained by the predictor, using exclusively the data from the validation set, at each moment of the training process.

  • Training loss. Mean squared error committed during the training process, according to the cross-validation criteria, for each group of five learning samples introduced.

  • Smoothed training loss. Moving average of the error, computed using the previous values.

  • Loss in the validation set. The error computed on the set of validation samples. It is the most critical error measure.

Fig. 3

Training results for the 2008 financial crisis predictor

Fig. 4

Training results for 2014

The precision obtained for this predictor is 93.17%. This means that, on the test set, the system predicts with 93.17% accuracy the survival of a commercial premises based on the activity and circularity information prior to 2012.
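A sketch of how such an accuracy figure can be computed on the validation set, assuming the net, XVal, and YVal variables from the sketches above:

YPred = classify(net, XVal);           % predicted status (open/closed) per premises
accuracy = 100 * mean(YPred == YVal);  % percentage of correct predictions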

When the prediction was repeated for 2014, the precision obtained dropped to 84.88%, still evidencing a strong relationship between the status of the premises in the crisis years and their subsequent evolution. This is consistent with the fact that, as time progresses, the survival of the premises is affected by conditioning factors beyond what the predictor has learned.

Second experiment. Prediction of the status of the commercial premises in the post-financial crisis period

The second experiment uses data regarding the status of the premises and their Circular Economy index in the years after the Great Recession of 2008, in particular from 2012 to 2017, and links this information to their status in 2018.

The same type of predictor is used and trained under the same conditions as in the previous experiment, with data from the post-recession period.

The results of the training also show very significant precision values (Fig. 5), at 94.15%.

Fig. 5

Training results for the post-recession (2018) predictor

These results lead us to think that a business's status, activity, and circularity index in a past time interval decisively condition its survival in subsequent years.

Discussion

“Circularity” has been applied to many subject areas far removed from that of productivity [34]. It has been linked to areas as disparate as urban design [2, 44], digital technologies [6, 37], sports [27, 41], healthcare [36, 45], and retailing. This last sector, retailing, is expected to gain even greater projection in the coming decades.

An Internet search [9] showed that, of the top 25 European retailers, ten publicly addressed the Circular Economy on their websites, indicating their commitment to promoting a transition to a more Circular Economy. “Retailers have a key role to play in sharing the benefits of the Circular Economy as millions of European consumers buy their products in our stores every single day” [11]. Leading retailers must therefore play a central role in shaping the Circular Economy. However, for the models to become more sustainable, the change in current business strategies will also need to be accompanied by substantial changes in consumers' consumption behavior.

Charter [7] argued that in the “Age of Acceleration,” the Circular Economy will become increasingly essential and will experience rapid change. In the consumer society environment, the retail sector contributes significantly to waste production. For this reason, the Circular Economy has become a significant alternative to the classical economic model in recent years, and the retail sector has started to advance along these lines [20].

Although there is a growing body of literature concerning the Circular Economy, and despite the importance of this issue for the retail sector, academic studies that focus on the Circular Economy and retailing remain scarce. The first publication on this topic did not appear until 2014, with the research of Mirabella et al. [21], which focused on the use of food waste derived from food manufacturing and the goal of a zero-waste economy in retailing.

In addition to the already mentioned research by Mirabella et al. [21], other outstanding works in this field are as follows: Mondejar-Jimenez et al. [23], who emphasized the critical role of retailers in preventing the generation of food waste through their marketing and sales strategies; Borrello et al. [5], whose results show the potential participation of consumers in closed loops inspired by the principles of the Circular Economy; Weissbrod and Bocken [43], who showed how a firm pursues innovation activities for economic, social, and environmental value creation under time sensitivity; Zhong and Pearce [49], who concluded that tightening the loop of the Circular Economy benefits the environment and sustainability, as well as the economic stability of consumers/prosumers; and Corrado and Sala [8], who reviewed existing studies on the generation of food waste at the global and European scales, with the main objective of describing and comparing the different approaches adopted.

The research shows a strong relationship between occupation and activity and the survival of the premises; it also shows that the status, activity, and circularity index of a business in a past time interval decisively condition its survival in the future.

The current linear production model appears to be the cause of major environmental problems, such as climate change and plastic accumulation. The Circular Economy represents an alternative to the current linear model of production, consumption, and waste generation, which is highly unsustainable at both the environmental and economic levels.

Conclusions

Predicting the survival of commercial and business activity is an interesting and complex challenge for public entities (urban planning, purchasing, installing urban furniture, or tax planning). The same is true for private companies seeking to maximize their investment by choosing locations and activities resistant to the passage of time.

Over time, as the commercial fabric of an area consolidates, forecasts become more reliable, since the time factor helps in understanding the market better. However, for commercial activities in the development stage, predictions are more complicated: unable to rely on historical data, forecasters are forced to obtain data from agents more external than the business itself, basing predictions on data from web pages, social networks, or internet job postings.

The analysis of the short-term data in our research demonstrates how the survival of a business's activity is linked to its status and circularity strategy in previous years, showing that premises whose activities are related to the Circular Economy survive longer. This conclusion, however, is not categorical, because survival is also linked to specific events in the economic or personal sphere that can permanently disrupt the activity.

Our analysis also verifies that the status of a business's activity over time can be predicted with a high level of accuracy based on its circularity index, which is very useful for urban development agents and local agents when taking action or making urban planning decisions that strengthen and support local businesses. In addition, the technique can incorporate new characteristics adapted to the urban environment in which the activity takes place, to determine whether these are also decisive in the survival of the businesses.

Machine learning models that predict business success, and the analysis carried out with neural networks, show that predictions with a high degree of precision can be achieved with the data obtained. However, these prediction levels must be treated with due caution, interpreting the information obtained as focused on, and circumscribed to, the environment under consideration. We do not know how this technique would work at a global level. While this may seem like a problem, it is not, as the training process can be almost wholly automated once the data for an area have been obtained.

The technique used in this work should be interpreted as an application study. It is not intended to be a commercial advisory service (for now) for business survival, as there is not enough data available to train a more generic predictor. Nor is it clear how such a predictor would operate in heterogeneous urban environments (neighborhoods of different urban categories, rural areas linked to urban areas, or cities with manifest social conflict).

As a limitation of the study and a future line of research, it would be enriching to increase the success rate, to combine formal data with non-formal and qualitative information, and to add different data sources to give the commercial activity the most complete possible picture of its temporal evolution. In addition, this research was carried out in medium and large-sized cities because Google Street View (GSV) is mainly available in urban areas. For future research, it would be interesting to apply these studies in rural areas where GSV is available.