Keywords

1 Introduction

1.1 Question Background

The retail industry adopts various ways to attract more consumers to become members and tries to improve the loyalty of members. At present, the development of e-commerce is causing a continuous loss of shopping mall members, which brings great losses to retail operators. Operators therefore need to implement targeted marketing strategies to strengthen good relations with members. For example, businesses run a series of sales promotions for their members to maintain their loyalty.

Some people think that the cost of maintaining old members is too high. In fact, the investment required to develop new members is much higher than the cost of taking certain measures to maintain existing members. Improving members’ image, strengthening the detailed management of existing members, regularly pushing products and services to them, and establishing a stable relationship with members are effective ways to support the better development of the physical retail industry.

1.2 Question Related Information

We obtain member-related data from a large department store: the member information data, the sales flow table of recent years, the member consumption detailed list, the commodity information table and the data dictionary. Generally speaking, the higher the commodity price, the higher the profit. We focus on analysing the consumption characteristics of the members of the shopping mall, compare the differences between members and non-members, and explain the value that members bring to the shopping mall. We then establish a mathematical model that describes each member's purchasing power according to their consumption, so as to identify the value of each member. As an important resource in the retail industry, members have a life cycle. Between joining and quitting, a member's status, such as active or inactive, changes constantly.

Therefore, it’s necessary to try to establish a mathematical model of member life cycle and state division in a certain time window, so that the store managers can manage the members more effectively.

2 Model Hypothesis and Symbolic Description

2.1 Model Hypothesis

In the data, the cash register number and the transaction time together uniquely determine a sales receipt, and a receipt may contain several commodities of different brands. In other words, it is assumed that no two customers settle accounts at the same time at the same cash register under the same document number in the system. It is also assumed that there are only two forms of sales promotion in the market. One is a direct price reduction or discount, represented by the difference between the amount paid by the customer and the original price of the goods. The other is reward points, represented by the increase in member points.

2.2 Problem Analysis

In the first step, we compare the differences between members and non-members in terms of purchase quantity, purchase amount, return quantity and return amount. Because some of the members come from other branches, we also analyze the differences between our members and the members from other branches in terms of purchase and return behavior.

At the same time, we analyze the distribution of consumption habits of the different groups according to the consumption data, which shows more intuitively the differences between member groups and other customer groups in consumption habits and customer value. Based on the quarterly consumption amount of members, we establish a mathematical model to reflect how consumption amount and time affect members’ purchasing power. According to the purchasing power and the RMF model, we can observe the change of customer value.

For the purchasing situation of members and non-members, we choose the average unit price, the total number of purchases and the total amount of purchases as three indicators. We note that the dataset provides the return records of members. We believe that returns have an extremely important impact on the sales and personnel scheduling of the mall. A customer group with a smaller quantity and amount of returns is relatively mature and causes relatively small profit and personnel losses for the shopping mall. For the returns of members and non-members, we choose the average unit price of returned goods, the total number of returned goods and the total amount of returned goods as three indicators. Most of the members are members of our store and some are members of other branches. The members from other branches also enjoy the rights of ordinary members, such as members’ discounts and credits, but they are not the object of membership management in our store. Therefore, we conduct the same analysis on the purchasing and returning situation of our members and other branch members.

2.3 The Construction Model of Purchasing Power

We characterize the purchasing power of every member according to that member's consumption, so as to recognize the value of the membership. According to the theory of the RMF model, customer value is measured by R, M and F, where R represents retention rate, M represents the amount of consumption, and F represents consumption times. We take the consumption amount M in the RMF model as the indicator of members' purchasing power: the larger the amount of consumption, the higher the purchasing power. Furthermore, the shorter the interval between the last consumption time and the current time, the higher the value of the customer. In the RMF model, M represents the sum of a customer's historical consumption amount, which increases over time. We believe that members’ purchasing power changes over time. By considering both the recent and the historical consumption amount of members, the changing trend of purchasing power can be explained.

We denote the purchasing power of member \(i\) in quarter \(t\) by \({P}_{i,t}\):

$${P}_{i,t}= {M}_{i,t}\times \frac{2}{5}+{P}_{i,t-1}\times \frac{3}{5},\quad t=1,2,3,\dots$$

Here \({M}_{i,t}\) is the consumption in the current quarter and \({P}_{i,t-1}\) is the purchasing power of the previous quarter, with the initial value \({P}_{i,0}=0\).
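The recursion above is a form of exponential smoothing, and a minimal sketch of it for one member might look as follows. The function name `purchasing_power` and the spending figures are illustrative assumptions, not values from the dataset.

```python
def purchasing_power(quarterly_spend, weight_current=2 / 5):
    """Return [P_1, ..., P_T] for one member given [M_1, ..., M_T].

    Implements P_t = (2/5) * M_t + (3/5) * P_{t-1} with P_0 = 0.
    """
    power = 0.0  # initial value P_{i,0} = 0
    history = []
    for spend in quarterly_spend:
        power = weight_current * spend + (1 - weight_current) * power
        history.append(power)
    return history


# Illustrative spending (hypothetical): a large purchase, a quiet quarter,
# then a moderate purchase.
example = purchasing_power([1000.0, 0.0, 500.0])
print(example)
```

In this example the index decays in the quarter without consumption instead of dropping to zero, which is the effect of the 3/5 weight on the previous quarter.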

In summary, the criteria given by the model for judging membership status are as follows:

A member who has a consumption record within the last three months is considered an active member. A member who has no consumption record in the last three months but has one in the last five months is considered an inactive member. Members who have no consumption records in the last five months are invalid members.

2.4 Trend Analysis of Purchasing Power

From the analysis, it can be seen that the purchasing power of the top 10% of customers by purchasing power index has been rising over the past two years, and the gap between them and the other 90% of customers has also been widening. The purchasing power of the remaining 90% of the customer base has been declining over the same two years. From this, we can see that the shopping mall's customer group presents a long-tail phenomenon: the consumption capacity of 90% of the customers is constantly declining, and their purchase intention is gradually declining as well. The 10% customer group with the strongest purchasing power accounts for a more and more significant share of the development and profit of the shopping mall, and their purchasing power and willingness to buy are also increasing. This part of the customers has higher customer value.

2.5 Division of Membership Status

Members’ life cycle can be defined as: membership (development) -> active period -> inactive period -> invalidation (withdrawal) period. In our opinion, the critical questions are how long a member must go without buying commodities before entering the inactive period, and how much longer a member must go without buying before entering the invalidation period.

Let \({S}_{i,t}\) be the state of member \(i\) at time \(t\).

The state \({S}_{i,t}=0\) means that member \(i\) is invalid at time \(t\).

The state \({S}_{i,t}=1\) means that member \(i\) is inactive at time \(t\).

The state \({S}_{i,t}=2\) means that member \(i\) is active at time \(t\).

Let M denote the amount paid, Q the quantity purchased, and C the number of purchase visits to the shopping mall. The development state can generally be classified as an inactive state, because the activity of new members is not yet enough for them to enter the active state. We can assume that member \(i\) is considered active if, in the most recent \(\Delta {t}_{1}\) period, the member went to the mall at least \({c}_{1}\) times, paid at least \({m}_{1}\) in total, or purchased at least \({q}_{1}\) goods.

A member \(i\) who does not meet this condition but who, in the most recent \(\Delta {t}_{2}\) period, went to the mall at least \({c}_{2}\) times, paid at least \({m}_{2}\) yuan in total, or purchased at least \({q}_{2}\) goods is considered inactive. In all other cases, the membership is invalid and the member has withdrawn.

That is:

$${S}_{i,t}=\begin{cases}2, & {M}_{i,t,\Delta {t}_{1}}\ge {m}_{1}\vee {Q}_{i,t,\Delta {t}_{1}}\ge {q}_{1}\vee {C}_{i,t,\Delta {t}_{1}}\ge {c}_{1}\\ 1, & ({M}_{i,t,\Delta {t}_{1}}<{m}_{1}\wedge {Q}_{i,t,\Delta {t}_{1}}<{q}_{1}\wedge {C}_{i,t,\Delta {t}_{1}}<{c}_{1})\wedge ({M}_{i,t,\Delta {t}_{2}}\ge {m}_{2}\vee {Q}_{i,t,\Delta {t}_{2}}\ge {q}_{2}\vee {C}_{i,t,\Delta {t}_{2}}\ge {c}_{2})\\ 0, & \text{otherwise}\end{cases}$$

Currently, members' consumption data covers three years, of which the first year is incomplete. For the members' life cycle, this time span is not long enough to support the simultaneous estimation of so many thresholds, so we simplify the model appropriately. If member \(i\) purchased at least one commodity in the most recent \(\Delta {t}_{1}\) period, the member is considered active. If the member has not purchased goods in the most recent \(\Delta {t}_{1}\) period but has purchased at least one item in the most recent \(\Delta {t}_{2}\) period, the member is considered inactive. If the member has not purchased goods in the most recent \(\Delta {t}_{2}\) period, the member is considered lost.

So the simplified model is

$${S}_{i,t}=\begin{cases}2, & {C}_{i,t,\Delta {t}_{1}}\ge 1\\ 1, & {C}_{i,t,\Delta {t}_{1}}=0\wedge {C}_{i,t,\Delta {t}_{2}}\ge 1\\ 0, & \text{otherwise}\end{cases}$$
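As a minimal sketch, the simplified rule can be applied to a member's purchase history expressed as integer month indices. The helper names, the month-index representation and the convention that a window of length \(\Delta t\) covers the months \(t-\Delta t+1,\dots,t\) are assumptions made for illustration. The default window lengths of three and five months are the values stated elsewhere in this paper.

```python
ACTIVE, INACTIVE, LOST = 2, 1, 0


def purchases_in_window(purchase_months, t, window):
    """C_{i,t,window}: number of purchases in the `window` months ending at month t."""
    return sum(1 for m in purchase_months if t - window < m <= t)


def member_state(purchase_months, t, dt1=3, dt2=5):
    """Return S_{i,t} under the simplified model."""
    if purchases_in_window(purchase_months, t, dt1) >= 1:
        return ACTIVE
    if purchases_in_window(purchase_months, t, dt2) >= 1:
        return INACTIVE
    return LOST


# A hypothetical member whose only purchase was in month 10.
for month in range(10, 16):
    print(month, member_state([10], month))
```

Because \(\Delta {t}_{1}<\Delta {t}_{2}\), reaching the second test already implies \({C}_{i,t,\Delta {t}_{1}}=0\), so the sketch does not need to re-check that condition.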

We now have to determine the sizes of \(\Delta {t}_{1}\) and \(\Delta {t}_{2}\). The activation rate \({P}_{0,2}(t,\Delta {t}_{2},i)\) of invalid members is defined as the probability that member \(i\), having purchased no products during the \(\Delta {t}_{2}\) period before time \(t\), purchases at least one product between time \(t\) and time \(t+1\).

The activation rate \({P}_{1,2}(t,\Delta {t}_{1},\Delta {t}_{2},i)\) of inactive members is defined as the probability that member \(i\), having purchased no products during the \(\Delta {t}_{1}\) period before time \(t\) but at least one product between \(\Delta {t}_{2}\) and \(\Delta {t}_{1}\) before time \(t\), purchases at least one product between time \(t\) and time \(t+1\).

We assume that \({P}_{0,2}\) and \({P}_{1,2}\) are independent of the member and of the current time, that is, \({P}_{0,2}\left(t,\Delta {t}_{2},i\right)={P}_{0,2}(\Delta {t}_{2})\) and \({P}_{1,2}\left(t,\Delta {t}_{1},\Delta {t}_{2},i\right)={P}_{1,2}(\Delta {t}_{1},\Delta {t}_{2})\). The probability is estimated by the statistical frequency, so the following conclusions are drawn:

Conclusion 1: If \(\Delta {t}_{2}^{*}\) minimizes the activation rate \({P}_{0,2}(\Delta {t}_{2})\), then \(\Delta {t}_{2}^{*}\) is the inactive period of members, that is, the longest time for which a member remains inactive.

The reason is that a member who has not bought anything for the \(\Delta {t}_{2}^{*}\) period has the lowest probability of resuming shopping in the next month, that is to say, the member is most likely to become an invalid member. Therefore, any member who has not purchased goods in the most recent \(\Delta {t}_{2}^{*}\) period is considered to have moved from the inactive state to the invalid state.

Conclusion 2: If \(\Delta {t}_{1}^{*}\) minimizes the activation rate \({P}_{1,2}(\Delta {t}_{1},\Delta {t}_{2}^{*})\), then \(\Delta {t}_{1}^{*}\) is the active period of members.

Similar to Conclusion 1, members who have not shopped for a period as long as \(\Delta {t}_{1}^{*}\) (even if they did shop during the period from \(\Delta {t}_{2}^{*}\) to \(\Delta {t}_{1}^{*}\)) are least likely to resume shopping and most likely to shift from active to inactive.

First, we calculate \(\Delta {t}_{2}^{*}\). For \(\Delta {t}_{2}=j\), where \(j\) is any number in \(2,3,4,\dots ,11,12\), and for a month \(a\) in the sample, we calculate the number of members \({x}_{1}\) who did not buy in the \(j\) months up to month \(a\), then the number \({x}_{2}\) of those \({x}_{1}\) members who bought in the following month, and record the activation rate of invalid members in month \(a\) under the condition \(\Delta {t}_{2}=j\) as \({P}_{0,2}\left(j,a\right)=\frac{{x}_{2}}{{x}_{1}}\).
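The frequency estimate \({x}_{2}/{x}_{1}\) and the search for the minimizing window can be sketched as follows. Several choices here are assumptions that the text does not fix: purchases are stored as a hypothetical mapping from member id to a set of month indices, the \(j\)-month window is taken to end at month \(a\) inclusive, only members with a purchase before the window are counted, and the rates for different months \(a\) are combined by a simple mean.

```python
def activation_rate(purchases, j, a):
    """P_{0,2}(j, a) = x2 / x1, or None when no member qualifies (x1 = 0)."""
    window = range(a - j + 1, a + 1)  # the j months ending at month a
    dormant = [
        member
        for member, months in purchases.items()
        if any(m < a - j + 1 for m in months)      # already a member (assumption)
        and not any(m in months for m in window)   # no purchase inside the window
    ]
    x1 = len(dormant)
    x2 = sum(1 for member in dormant if a + 1 in purchases[member])
    return x2 / x1 if x1 else None


def best_window(purchases, months, candidates=range(2, 13)):
    """Return the window length with the lowest mean activation rate, plus all means."""
    mean_rate = {}
    for j in candidates:
        rates = [activation_rate(purchases, j, a) for a in months]
        rates = [r for r in rates if r is not None]
        if rates:
            mean_rate[j] = sum(rates) / len(rates)
    best = min(mean_rate, key=mean_rate.get) if mean_rate else None
    return best, mean_rate


# Toy data (hypothetical): member id -> months with at least one purchase.
toy = {"a": {1, 5}, "b": {1}, "c": {3, 4}, "d": {1, 2, 4}, "e": {2, 5}}
print(best_window(toy, months=[4], candidates=[2, 3]))
```

The same structure could be reused for \({P}_{1,2}\) by additionally requiring a purchase between \(\Delta {t}_{2}^{*}\) and \(\Delta {t}_{1}\) months back.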

2.6 Sensitivity Analysis

To evaluate the robustness of the active period \(\Delta {\mathrm{t}}_{1}^{*}\) and the inactive period \(\Delta {\mathrm{t}}_{2}^{*}\), we choose 18 consecutive months as a test sample from the 24-month sample covering the most recent two years, and calculate \(\Delta {\mathrm{t}}_{1}^{*}\) and \(\Delta {\mathrm{t}}_{2}^{*}\) on each test sample. From the 24-month sample period, seven 18-month test samples can be generated.
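The count of seven follows from sliding an 18-month window one month at a time over 24 months (24 - 18 + 1 = 7). A minimal sketch of generating the windows and re-running an estimator on each is given below. The `estimate` callable is a hypothetical placeholder for whatever procedure computes \(\Delta {t}_{1}^{*}\) or \(\Delta {t}_{2}^{*}\) on a window.

```python
def rolling_windows(total_months=24, window=18):
    """Return (first_month, last_month) pairs, 1-indexed and inclusive."""
    return [
        (start, start + window - 1)
        for start in range(1, total_months - window + 2)
    ]


def robustness(estimate, total_months=24, window=18):
    """Apply `estimate(first, last)` to every window and collect the results."""
    return {w: estimate(*w) for w in rolling_windows(total_months, window)}


windows = rolling_windows()
print(len(windows), windows[0], windows[-1])
```

If the estimated period is the same, or nearly the same, across the windows, the estimate can be regarded as robust to the choice of sample.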

3 Customer Life Cycle Model

In fact, customer activity is not constant. From the activation rates of customers, we can obtain a model of a customer's transitions between the inactive, active and lost states, that is, the customer life cycle model. For each user, the probabilities of being a lost, inactive and active user in month \(t\) are \({P}_{t,0}\), \({P}_{t,1}\) and \({P}_{t,2}\), with \({P}_{t,0}+{P}_{t,1}+{P}_{t,2}=1\). For new users, \({P}_{t,0}=0\), \({P}_{t,1}=0\), \({P}_{t,2}=1\).

In month \(t+1\), the probabilities that the user belongs to the three types of users are, respectively:

$${P}_{t+1,0}={k}_{0,0}{P}_{t,0}+{k}_{1,0}{P}_{t,1}$$
$${P}_{t+1,1}={k}_{1,1}{P}_{t,1}+{k}_{2,1}{P}_{t,2}$$
$${P}_{t+1,2}={k}_{0,2}{P}_{t,0}+{k}_{1,2}{P}_{t,1}+{k}_{2,2}{P}_{t,2},$$
$${k}_{0,0}=0.0401,\quad {k}_{1,0}=0.2807,$$
$${k}_{1,1}=0.6346,\quad {k}_{2,1}=0.2279,$$
$${k}_{0,2}=0.0509,\quad {k}_{1,2}=0.0847,$$
$${k}_{2,2}=0.7721$$
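A minimal sketch of iterating these update equations is shown below, using the coefficients exactly as printed. The function names are illustrative assumptions. With the printed values, the outgoing coefficients from states 1 and 2 each sum to 1, while those from state 0 sum to 0.0910, so the sketch exposes the per-state totals instead of assuming that total probability is conserved.

```python
# Transition coefficients k[(source_state, target_state)] as reported above.
K = {
    (0, 0): 0.0401, (1, 0): 0.2807,
    (1, 1): 0.6346, (2, 1): 0.2279,
    (0, 2): 0.0509, (1, 2): 0.0847, (2, 2): 0.7721,
}


def step(p):
    """Advance (P_{t,0}, P_{t,1}, P_{t,2}) by one month.

    Transitions without a reported coefficient (0 -> 1 and 2 -> 0) are
    treated as zero, matching the update equations.
    """
    return tuple(
        sum(K.get((src, dst), 0.0) * p[src] for src in range(3))
        for dst in range(3)
    )


def outgoing_total(src):
    """Sum of the reported coefficients leaving state `src`."""
    return sum(value for (s, _), value in K.items() if s == src)


p = (0.0, 0.0, 1.0)  # a new member starts in the active state
for month in range(1, 4):
    p = step(p)
    print(month, p)
```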

Based on the conclusion of the RMF model, we find that the purchasing power of the top 10% of customers increases gradually, and their purchasing willingness becomes stronger and stronger. We believe that this part of the customers has the highest customer value, so we establish a membership status partition model and a membership life cycle model. Based on the purchasing situation of members, members can be divided into active members, inactive members and lost members. Members can switch between these three states, and the probability of conversion is the activation rate.

By calculating the activation rate under different states, we find the boundaries between the three states: members with consumption in the last three months are active members, those without consumption in the last three months but with consumption in the last five months are inactive members, and those without consumption records in the last five months are invalid members. Finally, we calculate the probability of transition among the three states based on historical data.

4 Model Evaluation, Improvement and Extension

Combined with the descriptive statistics of the distribution of consumption habits, we can roughly estimate the consumption habits of the overall customer base. By establishing a purchasing power model and combining it with the RMF model, the changes in customer value can be observed. The RMF model can make up for the deficiency of a single purchasing power index and reflect customer value more comprehensively. The relationship between member activation rate and marketing activities is studied by establishing a membership status partition model and a membership life cycle model. Based on the data analysis of membership status, the differences in purchase time among active members, inactive members and lost members are clarified.

The method of determining these boundaries is justified mathematically as well as by traditional marketing theory. An analytical model for association rules of commodity portfolios is established, which not only reveals the relationship between commodities, but also shows the strength of that relationship through the confidence index, and therefore has strong explainability. At the same time, automatic mining is more efficient and more widely applicable than manual mining. The FP-growth algorithm is used to analyze the association rules of the problem. Compared with the traditional Apriori algorithm for computing association rules, the FP-growth algorithm has obvious advantages in the efficiency and accuracy of large-scale data processing. However, it is worth noting that the FP-growth algorithm can only be run on historical data and cannot operate on incremental data alone. Therefore, in actual applications, the specific needs of market analysis may not be met, and the storage space occupied is also very large.

The purchasing power and RMF model can be further deepened, and the purchasing power can be internalized as an index in the RMF model. By clustering according to the members’ retention rate and consumption frequency, dividing members into different customer groups, and comparing the customer value of each group, we can obtain a more detailed customer division and a clearer picture of customer value. By using the member life cycle model, we can not only monitor members' activity, but also promote it further. Predicting the state transitions of members’ activity is of great reference value to enterprise customer management and marketing decision-making. We assume that there are only two ways of discounting: price reduction and membership points. At the same time, we are not clear about how membership points are used. If more detailed discount information were available, we could refine the relationship between the activation rate and discount activities, and the specific discount strategy would then also have a clearer direction.

This paper is mainly based on the member information data, the sales flow table, the member consumption detailed list and the commodity information table. Through data processing and analysis, abnormal data are rejected in preparation for the subsequent processing. By analyzing the characteristics of member consumption and the difference between member and non-member consumption, we can provide marketing suggestions for the store manager. The FP-growth algorithm is designed to evaluate the purchasing power of members based on their gender, length of membership, age and consumption frequency, and each parameter of the model is explained, so as to improve the management level of the shopping mall.