1 Introduction

Remanufacturing, an advanced form of recycling, has become an increasing concern for companies as sustainability gains importance. The remanufacturing process to restore a collection of cores (see Footnote 1) to excellent condition consists of procedures that may involve advanced technology. Such procedures include disassembly, cleaning, testing, parts replacement/repairs, and reassembly operations. Examples of remanufactured products are engines, photocopiers, toner cartridges, and the like.

The remanufacturing industry is large, comprising many market sectors and providing significant economic, environmental, and societal benefits (Akçali and Çetinkaya 2011). For some manufacturers, such as Eaton Corporation, backed by Roadranger support (http://www.roadranger.com/rr/Aftermarket/CoreBuyback/index.htm), products sold to and used by consumers are actively sought back for remanufacturing. Such returned products are called buyback cores. Financial incentives are often used to encourage returns of these products for remanufacturing or traditional recycling. On the other hand, consumers also often return products that are more significantly worn out or are even damaged. We call those normal cores. A normal core is distinguished from a buyback core in that the normal core has a lower yield than the buyback core does. After undergoing the remanufacturing process, remanufactured products, which are then in good-as-new condition, can be sold to consumers. A remanufactured product and a manufactured product are treated as indistinguishable.

In this paper, we consider an inventory system that remanufactures returned products, and in which products returned as buyback cores are modelled to depend on past demands and past sales. We propose periodic review finite horizon backlog models for the system. We consider two types of cores in our models: buyback cores, which the remanufacturer purchases at a cost, and normal cores, which are likely to be damaged and returned by consumers. The remanufacturing cost for a buyback core is lower than that for a normal core because a buyback core is in better condition than a normal core is. Products are not manufactured from raw materials in our models, so all serviceable products come from remanufacturing. We consider a situation that is commonly encountered in practice, in which buyback cores are collected for products sold in the immediately previous period and earlier, and products sold too long ago, say, before a certain time, are not eligible for return. That is, products can only be returned as buyback cores within a certain period of time after they are sold. For example, the remanufacturing facilities at Caterpillar Singapore (http://www.caterpillar.com) carry out a practice whereby there is an entitlement period during which sold products can be returned, and products beyond the entitlement period are not eligible for return. To be more specific, when an end-customer buys a remanufactured product from a Caterpillar dealer, he pays a price (composed of the actual selling price of the product and a deposit) that is the same as the price he would pay for a new product. The customer is also given an entitlement period of eight months during which he can return a used product to the dealer and get back part of his deposit. The exact percentage of the deposit that he can get back depends on the quality of the returned product, ranging from “full” to “partial” to “none”.

A major assumption of many papers on managing dynamic remanufacturing inventory systems is that product returns and demands/sales across different periods are independent. This assumption can be justified when the product is widely spread out in the market or when a common component/material is recovered from different products (e.g., remanufacturing of consumer electronics); see Tao and Zhou (2014). Nevertheless, one can imagine that a correlation between demands/sales and returns is likely to exist in many remanufacturing systems. If a characteristic can be identified and used to forecast returns as part of managing a dynamic remanufacturing inventory system, it can potentially reduce system costs through better deployment of returned products. We provide empirical evidence to show the dependence of product returns on past sales in a remanufacturing system.

In Sect. 3, we introduce a way to forecast returns of buyback cores that depends on past demands and sales. By introducing a way to model returns that are forecast from past demands and sales, we study inventory policies on the resulting models. We first derive a simple, explicit remanufacturing and disposal policy (see Footnote 2) for our backlog model in which returns are forecast from past demands. We show how that policy is affected by changes in the forecasting of returns when those changes are caused by changes in past demands. Then, we consider a model in which returns are forecast from past sales, and we study a feasible inventory policy for that model that is based on the optimal policy for the earlier model. We analyze how different this feasible policy is from the optimal policy in terms of system costs, and we also provide numerical evidence that suggests that the difference tends to be small.

1.1 Data analysis

We describe and analyze a data set from a remanufacturing-based company with an international presence, in order to illustrate the dependence of returns on past sales and the returns policy offered to customers. This builds the basis for us to consider incorporating core returns that are forecast from past demands/sales into an inventory model. The data set covers information on the sales and returns of seven different core types from two of the company’s distribution centers, for the period January 2010 to January 2014. The company offers a returns policy that allows customers to return their cores within eight months. In our dataset, a total of 3084 sales transactions occurred, out of which 2447 cores were returned to the company. Of the remaining cores that were not returned, 232 had been purchased within eight months of the data being retrieved and were considered to be active cores. The other 405 observations were cores that were not returned and were considered to be attrition cores.

To examine the relationship between the number of returns in the current month and the sales figures from previous months, also known as lagged sales, we define Lag \(X\) sales as the sales quantity \(X\) months before the current month, and we measure the correlation between current returns and Lag \(X\) sales.

Fig. 1 Pearson’s R against Lag \(X\)

In Fig. 1, we have picked one of the seven core types and we show the correlation between the monthly buyback cores and their respective monthly lagged sales, including the upper and lower limits of the 95% confidence interval. In the figure, the y-axis refers to the Pearson’s R (also known as the Pearson correlation coefficient), and the x-axis records the lagged sales, Lag \(X\), against which the return data were measured. The figure shows that, with a 95% confidence level, returns are positively correlated with the sales \(X\) months ago for \(X=0, 1, 2, \ldots , 8\), but the existence of such a correlation is not clear for \(X=9\) or \(10\). This observation is interesting because the company offers a returns policy of eight months. The figure shows that the returns policy set by a company can indeed affect the return time of cores. The other six core types also display similar patterns. This observation shows the potential of using returns that are forecast from past demands/sales in managing a remanufacturing-based system.
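The lagged-correlation analysis described above can be sketched as follows. This is a minimal illustration, not the computation used for Fig. 1: the text does not state how the 95% limits were obtained, so the Fisher z-transform interval is an assumption, and the function name `lagged_pearson` and the synthetic series are hypothetical.

```python
import math
import numpy as np


def lagged_pearson(returns, sales, lag):
    """Pearson's R between returns in month t and sales in month t - lag,
    with an approximate 95% interval (Fisher z-transform, an assumption)."""
    returns = np.asarray(returns, dtype=float)
    sales = np.asarray(sales, dtype=float)
    if lag > 0:
        # Pair returns of month t with sales of month t - lag.
        r_part, s_part = returns[lag:], sales[:-lag]
    else:
        r_part, s_part = returns, sales
    n = len(r_part)
    if n < 4:
        raise ValueError("need at least 4 overlapping months")
    r = float(np.corrcoef(r_part, s_part)[0, 1])
    # Clip so that atanh stays finite when |r| is numerically 1.
    z = math.atanh(max(min(r, 0.999999), -0.999999))
    half_width = 1.96 / math.sqrt(n - 3)
    return r, math.tanh(z - half_width), math.tanh(z + half_width)


if __name__ == "__main__":
    # Synthetic monthly data (hypothetical, not the company's data set).
    rng = np.random.default_rng(0)
    sales = rng.poisson(60, size=49).astype(float)
    returns = np.zeros(49)
    # Returns echo the sales of two months earlier, plus noise.
    returns[2:] = 0.8 * sales[:-2] + rng.normal(0, 2, size=47)
    for lag in range(5):
        print(lag, lagged_pearson(returns, sales, lag))
```

A lag whose interval excludes zero corresponds to a correlation that is clear at the 95% level in the sense used above.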

2 Literature review

The literature on closed-loop supply chains is vast. Akçali and Çetinkaya (2011) presented a review of the subject that includes a comprehensive list of references. Recently, Souza (2013) provided a review of the literature and a tutorial on closed-loop supply chains, in which he discussed a wide range of topics that include results on a base model with underlying assumptions, comments on extensions, and potential research areas. Among Souza’s various topics, he discussed end-of-use returns with remanufacturing.

The literature on the study of remanufacturing-based inventory systems includes papers by de Brito and van der Laan (2009), DeCroix (2006), DeCroix and Zipkin (2005), Guo et al. (2014), Simpson (1978), van der Laan and Salomon (1997), and van der Laan and Teunter (2006). A major assumption of these papers is that product returns and demands from different periods are independent. On the other hand, a case studied in Bayiz and Tang (2004) described correlated demand and return processes of a company that sells thermoluminescent badges and then in subsequent periods collects them back for refurbishment. The number of badges returned in a particular period is forecast using a linear combination of historical demands for the badge. By using actual data, Bayiz and Tang (2004) found that the forecast was rather accurate, with an average error of 24%. Works on stochastic and correlated demands and returns are rather limited due to the subject’s complexity. In Zhou et al. (2011) (also see Li et al. 2009), the authors studied product returns for a periodic review finite horizon inventory model with backlogged demand. Those authors considered \(K\) types of cores, with different conditions of returned cores, ranging from slightly used to significantly damaged, that can be remanufactured. The system also has a manufacturing capability. Zhou et al. (2011) offered an optimal policy for deciding the optimal quantity of serviceable products to be made available to consumers, and the optimal quantity of each type of core to remanufacture and to dispose of in each period, whereas in Li et al. (2009), the authors did not provide an optimal policy. The methodology used was stochastic dynamic programming. In the main model in Zhou et al. (2011), the authors assumed that product returns and previous demands are independent. Zhou et al. (2011) then briefly considered the dependence of returns on past sales in an extension to their main model. That dependence was in terms of a Markov process, and to model the case in which only sufficiently old products can be returned, the authors considered only returns of products sold at least \(\tau \) periods previously. The authors postulated the optimal policy for the extension in Theorem \(5\) of their paper. The dependence of returns that are forecast from past demands/sales in our paper complements that of Zhou et al. (2011), in that we consider the case whereby the current returns are dependent only on immediate past demands/sales, and products that were sold too long ago are not eligible for returns. In the previous subsection, we provided an analysis of a data set from a remanufacturing system to motivate our assumption.

Tao and Zhou (2014) recently considered a single product, periodic-review inventory system with remanufacturable returned products, while assuming that demands and returns follow general stochastic processes and may be correlated. Those authors provided an efficient approximation algorithm, based on cost-balancing techniques, to compute manufacturing and remanufacturing quantities in each period, and they showed that the expected cost under that remanufacturing balancing policy was at most twice the optimal cost. In our paper, considering the fact that it is usually harder to obtain demand data than sales data, in addition to the model in which returns depend on past demands as considered in Tao and Zhou (2014), we develop a model in which returns depend on past sales. We will formulate the two models we consider in our paper in Sect. 3.

Kiesmüller and van der Laan (2001) considered a discrete-time system in which product returns in a period depend explicitly on the demand that existed some periods ago. Those authors assumed that returned products are directly added to the serviceable inventory, and that manufacturing follows a base-stock policy. We consider a different model setting from theirs, motivated by our empirical study. Among various results, Kiesmüller and van der Laan (2001) showed numerically that the dependence on past demands has a positive effect on optimal cost, compared with a situation in which product returns are independent of previous demands.

Models with returns that are dependent on past sales are considered in Kelle and Silver (1989b), Ketzenberg et al. (2006), Khawan et al. (2007), Toktay et al. (2000), and Hsueh (2011). Kelle and Silver (1989b) modelled the dependence of returns on sales by specifying deterministic probabilities for a sold product to be returned in the next period, the period after that and so on [also see Goh and Varaprasad (1986) and Kelle and Silver (1989a)]. The dependence on past sales in our paper differs from theirs, however, and the two coincide when the maximum returns period for our model is \(1\) or under certain assumptions about parameters of our model (see Remark 1). Kelle and Silver (1989b) reduced their stochastic inventory model to a deterministic, dynamic lot-sizing problem for which there are known solution methods. In our paper, we use stochastic dynamic programming in our analysis of inventory models. Ketzenberg et al. (2006) focused on the value of information in a closed-loop supply chain. In their paper, dependence of returns on past sales followed that of Kelle and Silver (1989b), and was simplified in such a way that a sold product could only be returned in the next period with a certain probability, or not at all. That approach is similar to the way we forecast returns when returns in the current period are dependent only on the immediately previous sales. Khawan et al. (2007) considered an inventory system with warranty returns. They did not explicitly specify in their paper how returns are dependent on past sales. Toktay et al. (2000) considered a closed queueing network in their paper, wherein returns were modelled to depend on sales through an unknown return probability and delay distribution. Their dependence of returns on past sales was similar to that in Kelle and Silver (1989b). Instead of a deterministic probability for a sold product to be returned in a future period, as in Kelle and Silver (1989b), however, Toktay et al. (2000) considered the product of the probability that the product will be returned and a discrete delay density. Hsueh (2011) considered an inventory system with manufacturing and remanufacturing, taking into account different demand and return rates in different phases of the product life cycle. Those demand and return rates were normally distributed, with a different mean for each different phase of the product life cycle. In addition, the mean of the demand rate and that of the return rate were related. Hsueh provided formulae for the optimal production lot size, reorder point, and safety stock of the product for each phase of the product life cycle. Unlike Hsueh’s (2011) model, ours does not assume a particular distribution for demands and returns. Relevant literature on inventory models with remanufacturing, in which optimal policies are studied, includes Zhou and Yu (2011), Gong and Chao (2013), and Tao et al. (2012). In those papers, product returns and previous demands are independent.

Jia et al. (2016) explored a remanufacturing periodic review finite horizon inventory system with lost sales. They considered a switching mechanism whereby in the first half of the planning horizon, a push mode for remanufacturing is employed to satisfy demands, while in the second half of the planning horizon, a pull mode for remanufacturing is employed to satisfy demands. Their paper provided an optimal policy for the switching strategy, which possesses a simple, multi-dimensional base-stock structure. However, the sequence of events in Jia et al. (2016) is different from that in this paper. In our paper, we make remanufacturing decisions before products are returned in the current period [just as is the case in the model of Zhou et al. (2011)], whereas in Jia et al. (2016), remanufacturing decisions are made after products are returned in the current period. Both situations can arise in practice.

Another stream of research on correlated demand and returns focuses on how to forecast returns by using appropriate statistical methods (e.g., Clottey et al. 2012; Toktay et al. 2004). The impacts of information, inventory decisions, pricing, and the use of a warranty on product-returns management have also been studied (e.g., Jing and Huang 2013; Koppius et al. 2004; Pourakbar et al. 2014; van der Laan and de Brito 2009; Xie and Ye 2016; Ye et al. 2013). More recently, Ovchinnikov et al. (2014) provided a data-driven assessment of the economic and environmental aspects of remanufacturing for product and service firms, and they presented an analytical model and a behavioral study that together incorporate demand cannibalization from multiple customer segments across a firm’s product line. Ovchinnikov et al. (2014) showed that remanufacturing frequently aligns firms’ economic and environmental goals by increasing profits and decreasing total environmental impact.

Our paper considers data-driven models, and it provides analytical results for those models that potentially can be used to analyze the impact of information on product inventory management with returns. In the next section, we shall describe our backlog models.

3 Remanufacturing models: returns forecast from past demands and past sales

In this section, we describe our periodic review (see Footnote 3) finite horizon inventory models, with one model forecasting returns from past demands (Model A), and the other model forecasting returns from past sales (Model B). The second model is more realistic as sales data is usually easier to obtain than demand data, whereas with the first model, we are able to obtain a nice structure for its optimal inventory policy. Using results derived from the first model, we then analyze the second model.

Two types of cores are considered in these models: buyback cores and normal cores. Buyback cores have better quality and usability than normal cores do. A characteristic of a buyback core is that its yield (i.e., its percentage of reusable parts) is higher than that of a normal core. On the other hand, a normal core has greater variety in its quality and usability. Unsatisfied demand is backlogged in our models, and we forecast returns of buyback cores from past demands in one model and from past sales in the second model. Returns and past demands/sales are not related in the case of normal cores.

We show in this section that the optimal policies for our models can be found by solving dynamic programs. We observe that our forecasts of returns for buyback cores affect the optimal policy only through past demands/sales, even though returns of those cores are modelled to depend on other (random) factors as well.

We now describe our backlog models, beginning with the cost parameters used in those models. We have

\(h = \) unit holding cost for serviceable products per period.

\(p = \) unit penalty cost for serviceable products per period.

By serviceable products, we mean products that are ready to be sold to consumers.

\(b = \) unit purchasing price of buyback cores.

A buyback core is purchased back from a consumer at cost \(b\). Such a core is usually usable, but has suffered wear and tear due to usage. It is in better condition than a normal core is.

\(c = \) unit purchasing price of normal cores.

A normal core can be purchased from a consumer at cost \(c\). The value of \(c\) is much smaller than the value of \(b\), because a normal core is usually in worse condition than a buyback core is. For the sake of simplicity, we set \(c = 0\).

\(r_0 = \) unit remanufacturing cost of buyback cores.

\(r_1 = \) unit remanufacturing cost of normal cores.

Let \(r_0 < r_1\). This relationship between \(r_0\) and \(r_1\) reflects that a buyback core is in a better condition than a normal core.

\(s_0 = \) unit stocking cost of buyback cores.

\(s_1 = \) unit stocking cost of normal cores.

Let \(s_1 \le s_0 \le h\).

\(u = \) unit disposal cost of normal cores.

We assume in this paper that only normal cores can be disposed of, and that buyback cores are either stocked or remanufactured. This assumption is reasonable because buyback cores are usually in better condition than normal cores are.

Note that we consider a finite horizon in this paper, where \(N\) is the number of periods in the planning horizon. In our models, only products that are purchased at most \(K\) periods before the current period, and up to the immediately previous period, are considered for returns as buyback cores. Hence, \(K\) is the maximum period for returns.

The variables in these models are:

\(x_{0,n} = \) inventory level of serviceable products at the beginning of the \(n\)th period.

\(x_{1,n} = \) aggregate inventory level of serviceable products and buyback cores at the beginning of the \(n\)th period.

\(x_{2,n} = \) aggregate inventory level of serviceable products, buyback cores and normal cores at the beginning of the \(n\)th period.

\({{\varvec{x}}_{\varvec{n}}} = (x_{0,n},x_{1,n},x_{2,n})\), \(x_{0,n} \le x_{1,n} \le x_{2,n}\).

\(y_{0,n} = \) inventory level of serviceable products in the \(n\)th period after remanufacturing, but before demand and returns occur.

\(y_{1,n} = \) aggregate inventory level of serviceable products and buyback cores in the \(n\)th period after remanufacturing, but before demand and returns occur.

\(y_{2,n} = \) aggregate inventory level of serviceable products, buyback cores and normal cores in the \(n\)th period after remanufacturing and disposal, but before demand and returns occur.

\({{\varvec{y}}_{\varvec{n}}} = (y_{0,n},y_{1,n},y_{2,n})\), \(y_{0,n} \le y_{1,n} \le y_{2,n}\).

The variables given above are aggregated. We can easily obtain actual inventories from these variables. As an example, \(x_{1,n} - x_{0,n}\) is the number of units of buyback cores on-hand at the beginning of the \(n\)th period.

\(w_{1,n} = \) quantity of buyback cores remanufactured in the \(n\)th period.

\(w_{2,n} = \) quantity of normal cores remanufactured in the \(n\)th period.

\({{\varvec{w}}_{\varvec{n}}} = (w_{1,n}, w_{2,n})\).

Randomness in the models comes from the following:

\(D_n = \) consumer demand for serviceable products in the \(n\)th period, \(n = 1, \ldots , N\).

\(D_n\) is a continuous nonnegative random variable with probability density function \(f_{D_n}(\xi ), \xi \ge 0\), and realization \(d_n\), \(n = 1, \ldots , N\). Also, we denote by \(\mu _{D_n}\) the finite mean of \(D_n\).

\(R^j_{n} = \sum _{i=1}^{k(n)} \sigma _{n,i} z^j_{n-i} + \epsilon _n = \) quantity of products returned as buyback cores in the \(n\)th period, \(n = 2, \ldots , N\), \(j = A, B\). Let \(R_1^j = 0\), \(j = A, B\).

Here \(k(n) = \left\{ \begin{array}{ll} n-1 &{} {\text{ if }}\ n \le K \\ K &{} {\text{ if }}\ n \ge K + 1 \end{array} \right. \).

Here

$$\begin{aligned} z^A_{n-i}:= & {} d_{n-i}, \\ z^B_{n-i}:= & {} \max \{\min \{d_{n-i},y_{0,n-i}\},0\} \end{aligned}$$

denote, respectively, the realized demand and the realized sales in the \((n-i)\)th period, that is, \(i\) periods before the current period. Note that \(\sigma _{n,i}, i = 1, \ldots , k(n)\), are random variables taking values between \(0\) and \(1\). Returns are therefore not a deterministic function of previous demands/sales; they are random, owing to the random coefficients \(\sigma _{n,i}\) (see Footnote 4) and the random noise term \(\epsilon _n\) (see Footnote 5). \(R_n^j\) represents the forecast of buyback-core returns and is modelled to depend explicitly on past demands/sales. This forecast in the \(n\)th period depends on the demands/sales of the immediately previous period, back to those of \(k(n)\) periods earlier. When \(j = A\), returns are forecast from past demands, which makes analysis possible. When \(j = B\), we consider the more realistic situation in which returns are forecast from past sales.
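As an illustration of the definitions of \(k(n)\), \(z^A\), \(z^B\) and \(R_n^j\) above, the following minimal sketch evaluates one realization of the returns forecast. The function names, the dictionary-based indexing and the numbers in the example are assumptions made for this illustration only.

```python
def k_of_n(n, K):
    """k(n): number of past periods entering the forecast in period n."""
    return n - 1 if n <= K else K


def realized_sales(d, y0):
    """z^B: demand met from the serviceable level y0, never negative."""
    return max(min(d, y0), 0.0)


def returns_forecast(n, K, z, sigma, eps):
    """R_n = sum_{i=1}^{k(n)} sigma[i] * z[n - i] + eps, with R_1 = 0.

    z     -- dict: period m -> z_m (demand for Model A, sales for Model B)
    sigma -- dict: lag i -> realized sigma_{n,i}, a value in [0, 1]
    eps   -- realized noise term epsilon_n
    """
    if n == 1:
        return 0.0
    return sum(sigma[i] * z[n - i] for i in range(1, k_of_n(n, K) + 1)) + eps


if __name__ == "__main__":
    demand = {1: 10.0, 2: 8.0, 3: 12.0}   # hypothetical realized demands
    y0 = {1: 9.0, 2: 10.0, 3: 11.0}       # hypothetical serviceable levels
    sales = {m: realized_sales(demand[m], y0[m]) for m in demand}
    sigma = {1: 0.5, 2: 0.25}
    print(returns_forecast(4, 2, demand, sigma, 0.0))  # Model A
    print(returns_forecast(4, 2, sales, sigma, 0.0))   # Model B
```

Because sales never exceed demand, the Model B forecast in this example cannot exceed the Model A forecast when the same \(\sigma _{n,i}\) and \(\epsilon _n\) are used.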

In the literature (for example, Kelle and Silver 1989b; Toktay et al. 2000), returns forecasting is modelled in a “forward” manner whereby, given a product sold, the probabilities that it is returned in the next period, the period after next, etc., are identified. In our case, we model returns forecasting in a “backward” manner whereby returns are modelled in the current period in terms of demands/sales in previous periods.

\(B_n = \) quantity of products returned as normal cores in the \(n\)th period, \(n = 1, \ldots , N\).

\(B_n\) is a continuous nonnegative random variable with realization \(b_n\), \(n = 1, \ldots , N\).

\(D_n, B_n, \epsilon _n, \sigma _{n,i}, 1 \le i \le k(n),\) may be correlated in the \(n\)th period, but they are independent across different periods. This assumption is needed to formulate the inventory problems as dynamic programs as discussed later in the section.

Remark 1

If we view \(\sigma _{n,i} z_{n-i}^j\) as the number of units of products returned as buyback cores in the \(n\)th period from demand/sales of these products \(i\) periods earlier (which is \(z_{n-i}^j\)), then \(\sigma _{n,i}, 1 \le i \le k(n)\), are unlikely to be independent across periods since we must have

$$\begin{aligned} \sigma _{n-i+1,1} + \cdots + \sigma _{n,i} + \cdots + \sigma _{n-i+K,K} \le 1. \end{aligned}$$

However, we still have independence across periods if \(\sigma _{n,i}, 1 \le i \le k(n)\), \(2 \le n \le N\), are fixed numbers. Also, when \(K = 1\), the above independence assumption across different periods can be enforced with this interpretation of \(\sigma _{n,i} z_{n-i}^j\). Furthermore, when \(K = 1\) and if \(\sigma _{n,1}z_{n-1}^B\) is binomially distributed with probability of success \(= p_0\) and number of trials \( = z_{n-1}^B\), and \(\epsilon _n \equiv 0\), then our returns forecasting model is the same as that of Ketzenberg et al. (2006) whereby a sold product can only be returned in the next period with probability \(p_0\) or not at all.

The sequence of events for our models follows that of Zhou et al. (2011). At the beginning of each period, the remanufacturer decides how many units of buyback and normal cores to remanufacture. Then, the remanufacturer decides how many units of normal cores to dispose of. Next, consumer demands and product returns are realized, and unsatisfied demands are fully backlogged. Finally, all costs are calculated. All lead times are assumed to be zero.

From now onwards, it is understood that the demand \(D_i\) in the \(i\)th period can also be written as \(Z_i^A\), with realized demand denoted by \(d_i\) or \(z_i^A\). On the other hand, \(Z_i^B\) stands for the sales in the \(i\)th period, that is,

$$\begin{aligned} Z_i^B = \max \{ \min \{D_i, y_{0,i}\}, 0 \}, \end{aligned}$$
(1)

with realized sales in the \(i\)th period denoted by \(z_i^B\).

We have the following straightforward observation on \(Z_i^j\):

Proposition 1

We have \(0 \le Z_i^B \le Z_i^A\) for all \(1 \le i \le N\).
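One way to see this, using only the definition (1) and the nonnegativity of \(D_i\), is the following chain of inequalities, given here as a brief sketch:

$$\begin{aligned} 0 \le \max \{ \min \{D_i, y_{0,i}\}, 0 \} \le \max \{ D_i, 0 \} = D_i = Z_i^A. \end{aligned}$$

The second inequality holds because \(\min \{D_i, y_{0,i}\} \le D_i\), and the first equality holds because \(D_i \ge 0\).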

We now write down the expected cost, due to holding/stocking, remanufacturing, disposal, purchasing and penalty, in the \(n\)th period, given \(z^j_{n-i}, 1 \le i \le k(n)\), \(j = A, B\), as

$$\begin{aligned}&U_n({{\varvec{x}}_{\varvec{n}}},{{\varvec{y}}_{\varvec{n}}},{{\varvec{w}}_{\varvec{n}}},z_{n-k(n)}^j, \ldots , z_{n-1}^j) \\&\quad = s_0 (y_{1,n} - y_{0,n} + E(R_n^j)) + s_1 (y_{2,n} - y_{1,n} + E(B_n)) + r_0 w_{1,n} + r_1 w_{2,n} \\&\qquad + \,u(x_{2,n} - x_{1,n} - y_{2,n} + y_{1,n} - w_{2,n}) \\&\qquad +\, b E(R_n^j) + h E(y_{0,n} - D_n)^+ + p E(D_n - y_{0,n})^+. \end{aligned}$$

We use the same notation for the expected cost in the \(n\)th period for when returns are forecast from past demands and when returns are forecast from past sales.

Note that in the above expected cost expression,

  • \(s_0 (y_{1,n} - y_{0,n} + E(R_n^j)) + s_1 (y_{2,n} - y_{1,n} + E(B_n)) = \) total stocking cost of cores in the \(n\)th period.

  • \(r_0 w_{1,n} + r_1 w_{2,n} = \) total remanufacturing cost of cores in the \(n\)th period.

  • \(x_{2,n} - x_{1,n} - y_{2,n} + y_{1,n} - w_{2,n} = \) number of units of normal cores disposed of in the \(n\)th period, and hence, \(u (x_{2,n} - x_{1,n}- y_{2,n} + y_{1,n} - w_{2,n}) = \) total disposal cost of normal cores in the \(n\)th period.

  • \(b E(R_n^j) = \) expected total cost to purchase buyback cores in the \(n\)th period.

  • \(h E(y_{0,n} - D_n)^+ = \) expected holding cost of serviceable products in the \(n\)th period.

  • \(p E(D_n - y_{0,n})^+ = \) expected penalty cost of serviceable products in the \(n\)th period.

Let us eliminate some variables to obtain a cost expression with fewer variables. For \(1 \le n \le N\), \(({{\varvec{y}}_{\varvec{n}}},{{\varvec{w}}_{\varvec{n}}})\) is constrained to satisfy

$$\begin{aligned} \begin{array}{l} y_{0,n} \le y_{1,n} \le y_{2,n}, \\ 0 \le w_{1,n} = x_{1,n} - x_{0,n} - y_{1,n} + y_{0,n}, \\ 0 \le w_{2,n} \le x_{2,n} - x_{1,n} - y_{2,n} + y_{1,n}, \\ w_{1,n} + w_{2,n} = y_{0,n} - x_{0,n}. \end{array} \end{aligned}$$

Solving for \(w_{1,n}\) and \(w_{2,n}\) above in terms of \(y_{0,n}, y_{1,n}\) and \(y_{2,n}\), we have

$$\begin{aligned} \begin{array}{l} w_{1,n} = x_{1,n} - x_{0,n} - y_{1,n} + y_{0,n} \\ w_{2,n} = y_{1,n} - x_{1,n}. \end{array} \end{aligned}$$
(2)

Therefore, by eliminating \({{\varvec{w}}_{\varvec{n}}}\), the expected cost in the \(n\)th period given \(z_{n-i}^j, 1 \le i \le k(n)\), can be rewritten as

$$\begin{aligned}&{\overline{U}}_n({{\varvec{x}}_{\varvec{n}}},{{\varvec{y}}_{\varvec{n}}},z^j_{n-k(n)},\ldots , z^j_{n-1}) \nonumber \\&\quad = -r_0 x_{0,n} - (r_1 - r_0)x_{1,n} + ux_{2,n} + (r_0 - s_0) y_{0,n} + (r_1 - r_0 + s_0 - s_1)y_{1,n} \nonumber \\&\qquad +\, (s_1 - u) y_{2,n} + (s_0 + b) E(R_n^j) + s_1 E(B_n) + hE(y_{0,n}-D_n)^+ + pE(D_n - y_{0,n})^+, \end{aligned}$$
(3)

where \(R_n^j = \sum _{i=1}^{k(n)} \sigma _{n,i}z_{n-i}^j + \epsilon _n\), \(j = A, B\).
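The elimination of \({{\varvec{w}}_{\varvec{n}}}\) can be checked numerically. The sketch below evaluates the period cost both before and after substituting (2) and compares the two values. The cost values, the dictionary keys, and the use of a sample average in place of the expectations \(E(\cdot )^+\) are assumptions made for this illustration.

```python
import random


def expected_pos(values):
    """Sample average of max(v, 0), standing in for E(.)^+."""
    return sum(max(v, 0.0) for v in values) / len(values)


def quantities_from_levels(x, y):
    """Remanufactured quantities (w1, w2) implied by (2)."""
    return (x[1] - x[0] - y[1] + y[0], y[1] - x[1])


def cost_U(x, y, w, ER, EB, demand, c):
    """Expected period cost U_n, written with w_n = (w1, w2)."""
    x0, x1, x2 = x
    y0, y1, y2 = y
    w1, w2 = w
    disposed = x2 - x1 - y2 + y1 - w2  # normal cores disposed of
    return (c["s0"] * (y1 - y0 + ER) + c["s1"] * (y2 - y1 + EB)
            + c["r0"] * w1 + c["r1"] * w2 + c["u"] * disposed
            + c["b"] * ER
            + c["h"] * expected_pos([y0 - d for d in demand])
            + c["p"] * expected_pos([d - y0 for d in demand]))


def cost_U_bar(x, y, ER, EB, demand, c):
    """Expected period cost after eliminating w_n, as in (3)."""
    x0, x1, x2 = x
    y0, y1, y2 = y
    return (-c["r0"] * x0 - (c["r1"] - c["r0"]) * x1 + c["u"] * x2
            + (c["r0"] - c["s0"]) * y0
            + (c["r1"] - c["r0"] + c["s0"] - c["s1"]) * y1
            + (c["s1"] - c["u"]) * y2
            + (c["s0"] + c["b"]) * ER + c["s1"] * EB
            + c["h"] * expected_pos([y0 - d for d in demand])
            + c["p"] * expected_pos([d - y0 for d in demand]))


if __name__ == "__main__":
    # Hypothetical costs satisfying r0 < r1 and s1 <= s0 <= h.
    c = {"h": 2.0, "p": 10.0, "b": 3.0, "r0": 1.0, "r1": 4.0,
         "s0": 1.0, "s1": 0.5, "u": 0.2}
    rng = random.Random(1)
    demand = [rng.uniform(0.0, 15.0) for _ in range(1000)]
    x, y = (2.0, 6.0, 11.0), (7.0, 8.0, 10.0)
    w = quantities_from_levels(x, y)
    print(cost_U(x, y, w, 3.0, 1.5, demand, c))
    print(cost_U_bar(x, y, 3.0, 1.5, demand, c))
```

The two printed values should agree up to floating-point rounding, since (3) is an algebraic rearrangement of the original cost expression.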

Before we continue, we let \(K = 1\) from now onwards, that is, we consider returns only from products purchased in the immediately previous period. Hence, we assume that the maximum returns period for products returned as buyback cores is \(1\). As discussed in Remark 1, having \(K = 1\) will enable our interpretation of \(\sigma _{n,1}z_{n-1}^B\) as returns of buyback cores from sales in the previous period to hold without violating the independence assumption on \(\sigma _{n,1}\) across periods. Results derived in this paper for \(K = 1\) are applicable to \(K \ge 2\), with the understanding that this independence assumption holds, such as when \(\sigma _{n,i}\) is a fixed number for all \(1 \le i \le k(n)\), \(2 \le n \le N\).

Now, a policy \({{\varvec{\pi }}^{\varvec{j}}} = ({\pi }_1^j, \ldots , {\pi }_N^j)\) for our model, with returns forecast from past demands when \(j = A\) and returns forecast from past sales when \(j = B\), is such that \({\pi }_1^j({{\varvec{x}}_{\varvec{1}}}) = {{\varvec{y}}_{\varvec{1}}}\), \({\pi }_2^j({{\varvec{x}}_{\varvec{2}}},z_1^j,b_{1}) = {{\varvec{y}}_{\varvec{2}}}\) and for \(3 \le n \le N\), \({\pi }_n^j({{\varvec{x}}_{\varvec{n}}},{{\varvec{z}}_{\varvec{n-1}}^{\varvec{j}}},{{\varvec{b}}_{{\varvec{n-1}}}},\sigma _{2,1},\ldots ,\sigma _{n-1,1},\epsilon _{2}, \ldots , \epsilon _{n-1}) = {{\varvec{y}}_{\varvec{n}}}\), where \({{\varvec{y}}_{\varvec{n}}}\) is constrained to satisfy

$$\begin{aligned} \begin{array}{l} y_{0,n} \le y_{1,n} \le y_{2,n}, \\ y_{1,n} - x_{1,n} \le y_{0,n} - x_{0,n}, \\ y_{2,n} \le x_{2,n}, \\ y_{1,n} \ge x_{1,n}, \end{array} \end{aligned}$$

for \(1 \le n \le N\). Note that here \({{\varvec{z}}_{{\varvec{n-1}}}^{\varvec{j}}}\) stands for \((z_1^j, \ldots , z_{n-1}^j)\) and \({{\varvec{b}}_{{\varvec{n-1}}}}\) stands for \((b_1, \ldots , b_{n-1})\).
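The constraints on \({{\varvec{y}}_{\varvec{n}}}\) translate directly into a feasibility check. The following minimal sketch is an illustration only; the function name `is_feasible` and the tolerance argument are assumptions. Each comment records which nonnegativity requirement the inequality expresses once \({{\varvec{w}}_{\varvec{n}}}\) is eliminated through (2).

```python
def is_feasible(x, y, tol=1e-12):
    """Check the constraints on y_n = (y0, y1, y2) given x_n = (x0, x1, x2)."""
    x0, x1, x2 = x
    y0, y1, y2 = y
    return (
        y0 <= y1 + tol and y1 <= y2 + tol   # aggregate levels stay ordered
        and y1 - x1 <= y0 - x0 + tol        # w1 >= 0 (buyback cores remanufactured)
        and y2 <= x2 + tol                  # disposed quantity x2 - y2 >= 0
        and y1 >= x1 - tol                  # w2 >= 0 (normal cores remanufactured)
    )


if __name__ == "__main__":
    x = (2.0, 6.0, 11.0)                     # hypothetical starting levels
    print(is_feasible(x, x))                 # doing nothing
    print(is_feasible(x, (7.0, 8.0, 10.0)))  # remanufacture 3 + 2, dispose of 1
    print(is_feasible(x, (7.0, 8.0, 12.0)))  # y2 > x2 would need extra cores
```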

For a given policy \({\varvec{\pi }}^{\varvec{j}} = ({\pi }_1^j, \ldots , {\pi }_N^j)\) and \(1 \le n \le N\), the expected total cost from the \(n\)th period to the \(N\)th period given \(({{\varvec{x}}_{\varvec{n}}},{{\varvec{z}}_{{\varvec{n-1}}}^{\varvec{j}}},{{\varvec{b}}_{{\varvec{n-1}}}},\sigma _{2,1},\ldots ,\sigma _{n-1,1},\epsilon _2, \ldots , \epsilon _{n-1})\) is

$$\begin{aligned}&V_{{{\varvec{\pi }}^{\varvec{j}}},n}({{\varvec{x}}_{{\varvec{n}}}},{{\varvec{z}}_{{\varvec{n-1}}}^{\varvec{j}}},{{\varvec{b}}_{{\varvec{n-1}}}}, \sigma _{2,1},\ldots ,\sigma _{n-1,1},\epsilon _2,\ldots ,\epsilon _{n-1}) \nonumber \\&\quad = {\overline{U}}_n({\varvec{x}}_{\varvec{n}},{\varvec{y}}_{\varvec{n}},z_{n-1}^j) + \alpha E_{D_{n},B_{n},\sigma _{n,1},\epsilon _{n}}{\overline{U}}_{n+1}({\varvec{x}}_{{\varvec{n+1}}}, {\varvec{y}}_{{\varvec{n+1}}},Z_{n}^j) \nonumber \\&\qquad + \sum _{i=n+2}^{N} \alpha ^{i-n} E_{D_{i-2},D_{i-1},B_{i-1},\sigma _{i-1,1},\epsilon _{i-1}}{\overline{U}}_{i}({\varvec{x}}_{{\varvec{i}}}, {\varvec{y}}_{{\varvec{i}}},Z_{i-1}^j), \end{aligned}$$
(4)

where

$$\begin{aligned} x_{0,n+1}= & {} y_{0,n} - D_{n}, \\ x_{1,n+1}= & {} y_{1,n} - D_{n} + R_{n}^j, \\ x_{2,n+1}= & {} y_{2,n} - D_{n} + R_{n}^j + B_{n}, \end{aligned}$$

with \(R_{n}^j=\sigma _{n,1} z_{n-1}^j + \epsilon _{n}\), and for \(n+2 \le i \le N\),

$$\begin{aligned} x_{0,i}= & {} y_{0,i-1} - D_{i-1}, \\ x_{1,i}= & {} y_{1,i-1} - D_{i-1} + R_{i-1}^j, \\ x_{2,i}= & {} y_{2,i-1} - D_{i-1} + R_{i-1}^j + B_{i-1}, \end{aligned}$$

with \(R_{i-1}^j = \sigma _{i-1,1} Z_{i-2}^j + \epsilon _{i-1}\), where \(Z_{i-2}^j\) stands for the demand in the \((i-2)\)th period when \(j = A\), and sales in the \((i-2)\)th period, defined by (1), when \(j = B\). In (4), \({{\varvec{y}}_{{\varvec{i}}}} = {\pi }_i^j({{\varvec{x}}_{{\varvec{i}}}},{{\varvec{z}}_{{\varvec{i-1}}}^{\varvec{j}}}, {{\varvec{b}}_{{\varvec{i-1}}}}, \sigma _{2,1},\ldots ,\)\(\sigma _{i-1,1},\)\(\epsilon _2, \ldots , \epsilon _{i-1})\) for \(n \le i \le N\), \(j = A, B\). We omit the superscript \(j\) from \({{\varvec{x}}_{\varvec{i}}} = (x_{0,i},x_{1,i},x_{2,i})\), \(n+1 \le i \le N\), and \({{\varvec{y}}_{\varvec{i}}} = (y_{0,i},y_{1,i},y_{2,i})\), \(n \le i \le N\) above.
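The state transition with \(K = 1\) can be sketched as follows. The function names are assumptions, and `z_prev` stands for the previous-period demand (Model A) or sales (Model B), whichever is supplied by the caller.

```python
def forecast_returns(sigma, z_prev, eps):
    """R = sigma * z_prev + eps, the buyback core returns with K = 1."""
    return sigma * z_prev + eps


def next_state(y, demand, returns, normal_returns):
    """Map post-decision levels y = (y0, y1, y2) to next-period levels x."""
    y0, y1, y2 = y
    return (y0 - demand,
            y1 - demand + returns,
            y2 - demand + returns + normal_returns)


# Hypothetical realizations for one period.
r = forecast_returns(sigma=0.5, z_prev=4, eps=0.0)
x_next = next_state(y=(3, 4, 6), demand=2, returns=r, normal_returns=1)
print(x_next)
```

Because returns are nonnegative in this example, the ordering \(x_0 \le x_1 \le x_2\) is preserved by the transition.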

Following Bertsekas (2005), an optimal policy \({\varvec{{\pi }}}^{{\varvec{j}},{\varvec{*}}}\) is a policy that minimizes the above expected cost from the \(1\)st period to the \(N\)th period over all feasible policies \({{\varvec{\pi }}^{\varvec{j}}}\), that is,

$$\begin{aligned} V_{{\varvec{{\pi }}}^{{\varvec{j}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{1}}}) = \min _{{\varvec{\pi }}^{\varvec{j}}} V_{{\varvec{\pi }}^{\varvec{j}},1}({{\varvec{x}}_{\varvec{1}}}), \end{aligned}$$

while the optimal cost \(V^*_j({{\varvec{x}}_{\varvec{1}}})\) is such that

$$\begin{aligned} V^*_j({{\varvec{x}}_{\varvec{1}}}) = \min _{{\varvec{\pi }}^{\varvec{j}}} V_{{\varvec{\pi }}^{\varvec{j}},1}({{\varvec{x}}_{\varvec{1}}}), \end{aligned}$$
(5)

\(j = A, B\). Here, \({{\varvec{\pi }}^{{\varvec{A}},{\varvec{*}}}}\) is the optimal policy for Model A, while \({{\varvec{\pi }}^{{\varvec{B}},{\varvec{*}}}}\) is the optimal policy for Model B.

The optimal policy \({\varvec{{\pi }}}^{{\varvec{j}},{\varvec{*}}}\) can be found by dynamic programming, by solving the following dynamic program:

Define \(V_1^j({{\varvec{x}}_{\varvec{1}}})\) to be the following minimization problem

$$\begin{aligned}&\min _{{{\varvec{y}}_{\varvec{1}}}} \left\{ {\overline{U}}_1({{\varvec{x}}_{\varvec{1}}},{{\varvec{y}}_{\varvec{1}}}) + \alpha E_{D_1,B_1} (V_2^j({{\varvec{x}}_{\varvec{2}}},Z_1^j)) \right\} \end{aligned}$$
(6)
$$\begin{aligned}&{\mathrm{subject}\,\mathrm{to}} \nonumber \\&\quad \begin{array}{l} y_{0,1} \le y_{1,1} \le y_{2,1}, \\ y_{1,1} - x_{1,1} \le y_{0,1} - x_{0,1}, \\ y_{2,1} \le x_{2,1}, \\ y_{1,1} \ge x_{1,1}, \end{array} \end{aligned}$$
(7)

where \({{{\varvec{x}}}_{{\varvec{2}}}} = (x_{0,2},x_{1,2},x_{2,2})\) in the \(2\)nd period is given by

$$\begin{aligned} x_{0,2}= & {} y_{0,1} - D_1, \\ x_{1,2}= & {} y_{1,1} - D_1, \\ x_{2,2}= & {} y_{2,1} - D_1 + B_1. \end{aligned}$$

Let \({{\varvec{y}}_{\varvec{1}}^{{\varvec{j}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{1}}})\) be an optimal solution to (6) subject to constraints (7).

For \(2 \le n \le N\), given \(Z^j_{n-1} = z^j_{n-1}\), we define \(V_n^j({{\varvec{x}}_{\varvec{n}}},z_{n-1}^j)\) to be

$$\begin{aligned}&\min _{{\varvec{{y}}}_{\varvec{n}}} \left\{ {\overline{U}}_n({{\varvec{x}}_{\varvec{n}}},{{\varvec{y}}_{\varvec{n}}},z_{n-1}^j) + \alpha E_{D_{n},B_n,\sigma _{n,1},\epsilon _n} (V_{n+1}^j({{\varvec{x}}_{{\varvec{n+1}}}},Z_n^j)) \right\} \end{aligned}$$
(8)
$$\begin{aligned}&{\mathrm{subject}\,\mathrm{to}} \nonumber \\&\quad \begin{array}{l} y_{0,n} \le y_{1,n} \le y_{2,n}, \\ y_{1,n} - x_{1,n} \le y_{0,n} - x_{0,n}, \\ y_{2,n} \le x_{2,n}, \\ y_{1,n} \ge x_{1,n}, \end{array} \end{aligned}$$
(9)

where \({\varvec{x}}_{{\varvec{n+1}}} = (x_{0,n+1},x_{1,n+1},x_{2,n+1})\) in the \((n+1)\)th period is given by

$$\begin{aligned} x_{0,n+1}= & {} y_{0,n} - D_{n}, \\ x_{1,n+1}= & {} y_{1,n} - D_n + R_n^j, \\ x_{2,n+1}= & {} y_{2,n} - D_n + R_n^j + B_n, \end{aligned}$$

with \(R_{n}^j = \sigma _{n,1} z_{n-1}^j + \epsilon _n\).

Let \({{\varvec{y}}_{{\varvec{n}}}^{{\varvec{j}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}^j)\) be an optimal solution to (8) subject to constraints (9).

Define \(V_{N+1}^j({{\varvec{x}}_{{\varvec{N+1}}}},z_N^j)\) to be identically equal to zero.

\(V_1^j({{\varvec{x}}_{\varvec{1}}})\), \(V_n^j({{\varvec{x}}_{\varvec{n}}},z_{n-1}^j), 2 \le n \le N\), defined above, constitute a dynamic program, with boundary condition \(V_{N+1}^j({{\varvec{x}}_{{\varvec{N+1}}}},z_N^j) \equiv 0\), for \(j = A, B\).
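To make the recursion concrete, the following is a brute-force sketch of the dynamic program for Model A (\(Z_n^A = D_n\)) on a tiny instance. It rests on several simplifying assumptions that are not part of the model: integer inventory levels and decisions, a two-point demand distribution, constant \(\sigma _{n,1}\), \(\epsilon _n \equiv 0\), constant \(B_n\), and a dummy value \(z = 0\) in the first period to represent the absence of returns. All names and numbers are hypothetical.

```python
from functools import lru_cache

# Hypothetical instance data (assumptions, not taken from the paper).
N = 3                                  # number of periods
ALPHA = 0.9                            # discount factor
R0, R1, U, S0, S1, B_COST, H, P = 1.0, 2.0, 0.5, 0.2, 0.1, 0.3, 1.0, 3.0
DEMAND_PMF = {0: 0.5, 2: 0.5}          # i.i.d. demand
SIGMA = 0.5                            # sigma_{n,1} held constant, epsilon_n = 0
NORMAL_RETURNS = 1                     # B_n held constant


def returns(z_prev):
    """R_n = sigma * z_{n-1}; an integer for the demand values above."""
    return int(SIGMA * z_prev)


def one_period_cost(x, y, z_prev):
    """Expected one-period cost (3) with K = 1."""
    x0, x1, x2 = x
    y0, y1, y2 = y
    overage = sum(q * max(y0 - d, 0) for d, q in DEMAND_PMF.items())
    underage = sum(q * max(d - y0, 0) for d, q in DEMAND_PMF.items())
    return (-R0 * x0 - (R1 - R0) * x1 + U * x2
            + (R0 - S0) * y0 + (R1 - R0 + S0 - S1) * y1 + (S1 - U) * y2
            + (S0 + B_COST) * returns(z_prev) + S1 * NORMAL_RETURNS
            + H * overage + P * underage)


def feasible_decisions(x):
    """Enumerate integer y satisfying the four constraints, given x0 <= x1 <= x2."""
    x0, x1, x2 = x
    for y1 in range(x1, x2 + 1):                         # y1 >= x1
        for y2 in range(y1, x2 + 1):                     # y1 <= y2 <= x2
            for y0 in range(x0 + (y1 - x1), y1 + 1):     # y1 - x1 <= y0 - x0, y0 <= y1
                yield (y0, y1, y2)


@lru_cache(maxsize=None)
def value(n, x, z_prev):
    """Return (V_n(x, z_prev), minimizing y); V_{N+1} is identically zero."""
    if n > N:
        return 0.0, None
    best = None
    r = returns(z_prev)
    for y in feasible_decisions(x):
        future = 0.0
        for d, q in DEMAND_PMF.items():
            nxt = (y[0] - d, y[1] - d + r, y[2] - d + r + NORMAL_RETURNS)
            future += q * value(n + 1, nxt, d)[0]        # Model A: Z_n = D_n
        total = one_period_cost(x, y, z_prev) + ALPHA * future
        if best is None or total < best[0]:
            best = (total, y)
    return best


total_cost, first_decision = value(1, (0, 2, 4), 0)
print(total_cost, first_decision)
```

Exhaustive enumeration is only practical for very small instances; it is used here because it mirrors the recursion directly.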

Note that in general \(V_1^A({{\varvec{x}}_{\varvec{1}}})\), \(V_1^B({{\varvec{x}}_{\varvec{1}}})\) and \(V_n^A({{\varvec{x}}_{\varvec{n}}},z_{n-1}^A)\), \(V_n^B({{\varvec{x}}_{\varvec{n}}},z_{n-1}^B)\), \(2 \le n \le N - 1\), are different due to the different way in which \(Z_i^j\) is defined for \(j = A\) and \(j = B\), \(1 \le i \le N\), although it is easy to observe from (8) subject to constraints (9) and \(V_{N+1}^j({{\varvec{x}}_{{\varvec{N+1}}}},z_N^j) \equiv 0\) that \(V^A_{N}({{\varvec{x}}_{{\varvec{N}}}},z_{N-1}) = V^B_{N}({{\varvec{x}}_{{\varvec{N}}}},z_{N-1})\) for all \(z_{N-1} \ge 0\).

Using our dynamic programming formulations, we have the following proposition:

Proposition 2

For every initial state \({{\varvec{x}}_{\varvec{1}}}\) and \(j = A, B\), we have \(V^*_j({{\varvec{x}}_{\varvec{1}}}) = V_1^j({{\varvec{x}}_{\varvec{1}}})\). Also,

$$\begin{aligned} {\pi }_1^{j,*}({{\varvec{x}}_{\varvec{1}}}) = {{\varvec{y}}_{\varvec{1}}^{{\varvec{j}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{1}}}),\quad {\pi }_2^{j,*}({\varvec{x}}_{\varvec{2}},z_1^j,b_1) = {{\varvec{y}}_{\varvec{2}}^{{\varvec{j}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{2}}},z_1^j), \end{aligned}$$

and for \(3 \le n \le N\),

$$\begin{aligned} {\pi }_n^{j,*}({{\varvec{x}}_{\varvec{n}}},{{\varvec{z}}_{{\varvec{n-1}}}^{\varvec{j}}},{{\varvec{b}}_{{\varvec{n-1}}}}, \sigma _{2,1},\ldots ,\sigma _{n-1,1},{\epsilon _{2}}, \ldots , \epsilon _{n-1}) = {{\varvec{y}}_{{\varvec{n}}}^{{\varvec{j}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}^j), \end{aligned}$$

where \({{\varvec{y}}_{\varvec{1}}^{{\varvec{j}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{1}}}), {{\varvec{y}}_{{\varvec{n}}}^{{\varvec{j,*}}}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}^j)\), \(2 \le n \le N\), are obtained by solving the above dynamic program for each \(j = A, B\).

By the above proposition, to find the optimal policy \({\varvec{{\pi }}}^{{\varvec{j}},{\varvec{*}}}\), we only need to find \({{\varvec{y}}_{\varvec{1}}^{{\varvec{j,*}}}}({{\varvec{x}}_{\varvec{1}}})\) and \({{\varvec{y}}_{{\varvec{n}}}^{{\varvec{j,*}}}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}^j)\), \(2 \le n \le N\).

We know that the forecast of buyback core returns is determined by past demands/sales and some random factors. We see from the above proposition that the returns forecast affects the optimal policy for the two models only through past demands for Model A and past sales for Model B.

In Sect. 4, we provide a nice structure for the optimal inventory policy for Model A, the model where returns are forecast from past demands. Based on our results in the section, in Sect. 5, we propose a feasible policy for Model B, the model where returns are forecast from past sales, and analyze the extent to which this feasible policy is close to optimality. In Sect. 5.1, we provide numerical results.

4 An optimal inventory policy for Model A

We proceed in this section to state the explicit form of the optimal policy \({{\varvec{\pi }}^{{\varvec{A}},{\varvec{*}}}}\) for our backlog model, Model A, which we formulate in Sect. 3, when returns are forecast from past demands. In each period, this policy can be described neatly in terms of optimal control parameters that are not dependent on inventories at the beginning of the period.

Theorem 1

For \(2 \le n \le N\), given \({{\varvec{x}}_{\varvec{n}}}\) and demand realization \(Z^A_{n-1} = z^A_{n-1}\), there exist optimal control parameters \(\xi _{0,n}, \xi _{1,n}(z_{n-1}^A), \eta _{2,n}(z_{n-1}^A)\), with \(\xi _{1,n}(z_{n-1}^A) \le \xi _{0,n}\) and \(\xi _{1,n}(z_{n-1}^A) \le \eta _{2,n}(z_{n-1}^A)\), such that

Remanufacturing:

  • If \(\xi _{0,n} \le x_{0,n}\), we do not remanufacture in the \(n\)th period, and stock all buyback and normal cores for the next period.

  • If \(x_{0,n} < \xi _{0,n} \le x_{1,n}\), we remanufacture up to \(\xi _{0,n}\) using only buyback cores without using any normal cores in the \(n\)th period, and stock the remaining \(x_{1,n} - \xi _{0,n}\) buyback cores for the next period.

  • If \(\xi _{1,n}(z_{n-1}^A) \le x_{1,n} < \xi _{0,n}\), we remanufacture all available buyback cores without using any normal cores in the \(n\)th period.

  • If \(x_{1,n} < \xi _{1,n}(z_{n-1}^A) \le x_{2,n}\), we remanufacture up to \(\xi _{1,n}(z_{n-1}^A)\) using all available buyback cores and additional normal cores in the \(n\)th period.

  • If \(x_{0,n} \le x_{1,n} \le x_{2,n} < \xi _{1,n}(z_{n-1}^A) \le \xi _{0,n}\), we remanufacture all available buyback cores and normal cores in the \(n\)th period.

and

Disposal:

  • If \(\eta _{2,n}(z_{n-1}^A) \le x_{1,n} \le x_{2,n}\), we dispose of all available normal cores in the \(n\)th period.

  • If \(x_{1,n} < \eta _{2,n}(z_{n-1}^A) \le x_{2,n}\), we dispose of \(x_{2,n} - \eta _{2,n}(z_{n-1}^A)\) normal cores in the \(n\)th period.

  • If \(x_{1,n} \le x_{2,n} < \eta _{2,n}(z_{n-1}^A)\), we do not dispose of any normal cores in the \(n\)th period.

Similar rules apply when \(n = 1\), using optimal control parameters \(\xi _{0,1}, \xi _{1,1}\) and \(\eta _{2,1}\), given \({{\varvec{x}}_{\varvec{1}}}\).

The above rules for \(n = 1\) and \(2 \le n \le N\) constitute the optimal policy \({{\varvec{\pi }}^{{\varvec{A,*}}}}\) for our backlog model when returns are forecast from past demands.

In Theorem 1, we describe a simply stated optimal policy for our model. Observe from the theorem that the policy is essentially a “remanufacture-up-to” and “dispose-down-to” policy with remanufacturing and disposal levels characterized by optimal control parameters that depend on past demand and do not depend on initial inventories in each period.
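The rules of Theorem 1 can be sketched as a function that returns the quantities acted upon in a period, once the control parameters are known. The sketch reads \(x_{1,n} - x_{0,n}\) as the number of buyback cores on hand and \(x_{2,n} - x_{1,n}\) as the number of normal cores on hand, which is the reading suggested by the statement of the theorem; the function name and the parameter values are assumptions, and the parameters are taken to satisfy \(\xi _{1,n} \le \xi _{0,n}\) and \(\xi _{1,n} \le \eta _{2,n}\).

```python
def theorem1_actions(x, xi0, xi1, eta2):
    """Return (buyback cores remanufactured, normal cores remanufactured,
    normal cores disposed of) for x = (x0, x1, x2)."""
    x0, x1, x2 = x
    # Remanufacturing rules, in the order listed in Theorem 1.
    if xi0 <= x0:
        buyback, normal = 0, 0
    elif xi0 <= x1:                      # x0 < xi0 <= x1
        buyback, normal = xi0 - x0, 0
    elif xi1 <= x1:                      # xi1 <= x1 < xi0
        buyback, normal = x1 - x0, 0
    elif xi1 <= x2:                      # x1 < xi1 <= x2
        buyback, normal = x1 - x0, xi1 - x1
    else:                                # x2 < xi1 <= xi0
        buyback, normal = x1 - x0, x2 - x1
    # Disposal rules.
    if eta2 <= x1:
        disposed = x2 - x1
    elif eta2 <= x2:
        disposed = x2 - eta2
    else:
        disposed = 0
    return buyback, normal, disposed


# Hypothetical control parameters and inventory levels.
print(theorem1_actions((0, 2, 5), xi0=3, xi1=1, eta2=4))
print(theorem1_actions((0, 2, 5), xi0=4, xi1=3, eta2=3))
```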

In the next subsection, we describe how we obtain the policy by solving a minimization problem (11) subject to constraints (12) (given in the subsection).

In the following, we describe properties of the optimal cost and the optimal policy that we derived. Structural properties of optimal policies are often investigated in the literature, and can be found for example in Puranam and Katehakis (2014). First, it is interesting to investigate how returns forecasted from past demands affect the optimal policy for Model A. For \(n = 1\), it is clear that optimal control parameters for our optimal policy are independent of past demands. For \(2 \le n \le N\), the following theorem describes how optimal control parameters \(\xi _{1,n}(z_{n-1}^A)\), \(\eta _{2,n}(z_{n-1}^A)\) vary with \(z_{n-1}^A\).

Theorem 2

For \(2 \le n \le N - 1\), \(\xi _{1,n}(z_{n-1}^A), \eta _{2,n}(z_{n-1}^A)\) are nonincreasing in \(z_{n-1}^A\). When \(n = N\), \(\xi _{1,n}(z_{n-1}^A), \eta _{2,n}(z_{n-1}^A)\) are not dependent on \(z_{n-1}^A\).

In our model, we consider a forecast of buyback core returns that is based on past demand for serviceable products (\(z_{n-1}^A = d_{n-1}\)). The larger (smaller) the value of \(d_{n-1}\), the larger (smaller) the forecast number of buyback cores to be returned. As \(d_{n-1}\) increases, \(\xi _{0,n}\) is unchanged while \(\xi _{1,n}(d_{n-1})\) and \(\eta _{2,n}(d_{n-1})\) are nonincreasing (by Theorem 2). We see from Theorem 1 that, as a result, if the forecast in the current period is for an increase in buyback core returns (because past realized demand has increased), we are less likely to remanufacture normal cores and more likely to dispose of them, while the remanufacturing decision on buyback cores is unchanged.

Next, we observe the following property of \(\xi _{0,n}\) and the optimal cost \(V_1^A({{\varvec{x}}_{\varvec{1}}})\):

Theorem 3

\(V_1^A({{\varvec{x}}_{\varvec{1}}})\) is decreasing in \(x_{0,1}\) for \(x_{0,1} < \xi _{0,1}\). Furthermore, if the demands \(D_n\), \(1 \le n \le N\), are identically distributed, we have \(\xi _{0,N} \le \xi _{0,n}\) for \(1 \le n \le N\).

It is clear from the above theorem that, to keep system cost down over the planning horizon, the initial inventory of serviceable products \(x_{0,1}\) cannot be too small; in particular, it should not be smaller than \(\xi _{0,1}\). Furthermore, by the above theorem, the optimal control parameter \(\xi _{0,n}\) is greater than or equal to \(\xi _{0,N}\) for \(1 \le n \le N\). A natural question to ask is whether \(\xi _{0,n}\) is monotone in \(n\). The following example illustrates that this is not the case in general:

Example 1

Let \(\alpha = 1, D \equiv d={\text{ positive } \text{ constant }},\ N = 3, \sigma _{n,1} \equiv \frac{1}{2}, \epsilon _n \equiv 0, B_n \equiv 5\), with cost parameters satisfying \(h = p\), \(2s_1 < u\), \(r_0 - s_0 > p\), \(r_0 < 2s_0\) and \(r_1 - 2s_1 > 2h\). We have \(\xi _{0,1} = d, \xi _{0,2} = 2d\) and \(\xi _{0,3} = - \infty \).

The above example shows the non-monotonicity of \(\xi _{0,n}\) in \(n\).

4.1 Verification of Theorem 1

In this subsection, we proceed in an abstract manner, analyzing a minimization problem that is an abstraction of the optimality equation in our dynamic programming formulation in Sect. 3. We obtain results by analyzing this minimization problem, and these results enable us to arrive at the optimal policy for our backlog model, Model A, in Theorem 1.

First, we abstract the expected one period cost function \({\overline{U}}_n({{\varvec{x}}_{\varvec{n}}},{{\varvec{y}}_{\varvec{n}}},z_{n-1}^A)\) by the function \(C({{\varvec{y}}},z)\), which is defined to be

$$\begin{aligned} C({{\varvec{y}}},z)= & {} (r_0 - s_0)y_0 + (r_1 - r_0 + s_0 - s_1)y_1 + (s_1-u)y_2 + \beta z \nonumber \\&+ \,hE(y_0 - D)^+ + pE(D-y_0)^+, \end{aligned}$$
(10)

where \({{\varvec{y}}} = (y_0,y_1,y_2)\), \(\beta \) is a given constant and \(D\) is a continuous nonnegative random variable.

It is easy to see that \(C({{\varvec{y}}},z)\) is a continuously differentiable convex function of \(({{\varvec{y}}},z)\) and is additively separable in \(({{\varvec{y}}},z)\).

We consider the following minimization problem, which is an abstraction of our dynamic program in Sect. 3:

$$\begin{aligned} {\overline{K}}({{\varvec{x}}},z) = \min _{{{\varvec{y}}}} \{ C({{\varvec{y}}},z) + \alpha K({{\varvec{y}}},z)\} \end{aligned}$$
(11)

subject to

$$\begin{aligned} \begin{array}{l} y_0 \le y_1 \le y_2, \\ y_1 - x_1 \le y_0 - x_0, \\ y_2 \le x_2, \\ y_1 \ge x_1. \end{array} \end{aligned}$$
(12)

Here \({{\varvec{x}}} = (x_0,x_1,x_2)\), \(x_0 \le x_1 \le x_2\), \(z \in \mathfrak {R}_+\).

\(K({{\varvec{y}}},z)\) represents the term \(E_{D_n,B_n,\sigma _{n,1},\epsilon _n}({V}^A_{n+1}({{\varvec{x}}_{{\varvec{n+1}}}},{Z}_n^A))\) in the dynamic program ((8) subject to constraints (9)) that we use to find the policy for our model, Model A. We list below essential properties that \(K({{\varvec{y}}},z)\) is assumed to satisfy. These properties reflect the term \(E_{D_n,B_n,\sigma _{n,1},\epsilon _n}({V}^A_{n+1}({{\varvec{x}}_{{\varvec{n+1}}}},{Z}_n^A))\) it represents, and are satisfied by that term, as shown in the proof of Theorem 1.

The properties that \(K({{\varvec{y}}},z)\) satisfies are as follows:

  1. \(K({{\varvec{y}}},z)\) is a continuously differentiable convex function of \(({{\varvec{y}}},z)\).

  2. \(K({{\varvec{y}}},z)\) is additively separable in \({{\varvec{y}}} = (y_0,y_1,y_2)\), that is, \(K({{\varvec{y}}},z) = K_0(y_0,z) + K_1(y_1,z) + K_2(y_2,z)\), for some function \(K_i(y_i,z)\), \(i = 0, 1, 2\).

  3. \(K({{\varvec{y}}},z)\) is additively separable in \(y_{0}\) and \(z\), that is, \(K({{\varvec{y}}},z)\) can be written as the sum of two functions \({\hat{K}}_0(y_0,y_1,y_2)\) and \({\hat{K}}_1(z,y_1,y_2)\).

  4. \(K({{\varvec{y}}},z)\) is such that

    $$\begin{aligned}&\frac{\partial K}{\partial y_1} ({{\varvec{y}}},z) \ge -(r_1 - r_0)\ \forall \ ({{\varvec{y}}},z). \end{aligned}$$

Properties 2 and 3 imply that \(K({{\varvec{y}}},z)\) can be written as \(\hat{{\hat{K}}}_0(y_0) + \hat{{\hat{K}}}_1(y_1,z) + \hat{{\hat{K}}}_2(y_2,z)\).

With the above, we then obtain in Theorem 4 (given below) the optimal solution to the minimization problem (11) subject to constraints (12). Theorem 4 allows us to obtain the explicit form of the optimal policy \({\varvec{{\pi }}}^{{\varvec{A}},{\varvec{*}}}\) for our model in Theorem 1.

Let us denote the objective function \(C({{\varvec{y}}},z) + \alpha K({{\varvec{y}}},z)\) in the minimization problem (11) subject to constraints (12) by \(\Phi ({{\varvec{y}}},z)\) for convenience.

Remark 2

Besides convexity and continuous differentiability, \(\Phi ({{\varvec{y}}},z)\) is additively separable in \({{\varvec{y}}}\), and is also additively separable in \(y_0\), \(z\), as these properties hold for \(C({{\varvec{y}}},z)\) and \(K({{\varvec{y}}},z)\). Hence, \(\Phi ({{\varvec{y}}},z) = \Phi _0(y_0) + \Phi _1(y_1,z) + \Phi _2(y_2,z)\), where \(\Phi _i(\cdot ), i = 0, 1, 2\), are continuously differentiable convex functions of their respective variables.

By Property 4, \(r_0 < r_1\) and \(s_1 \le s_0\), we have \(\frac{\partial \Phi }{\partial y_1}({{\varvec{y}}},z) > 0\), therefore \(\Phi ({{\varvec{y}}},z)\) is increasing in \(y_1\).

Following Zhou et al. (2011), let

$$\begin{aligned}&\xi _0(z) \in \text{ argmin }_{y_0} \Phi (y_0,y_1,y_2,z), \nonumber \\&\xi _1(z) \in \text{ argmin }_{y_0} \Phi (y_0,y_0,y_2,z), \nonumber \\&\eta _2(z) \in \text{ argmin }_{y_2} \Phi (y_0,y_1,y_2,z). \end{aligned}$$
(13)

The above parameters will be used to solve the minimization problem (11) subject to constraints (12). They are then used to define the optimal control parameters for our optimal policy \({\varvec{{\pi }}}^{{\varvec{A,*}}}\). By Remark 2, we see that \(\xi _0(z)\) is not dependent on \(z\). Hence, we write \(\xi _0\) for \(\xi _0(z)\) from now onwards. Note that the way we prove that \(\xi _0, \xi _1(z)\) and \(\eta _2(z)\) are optimal control parameters, which is the result of Theorem 4 that leads to Theorem 1, is not identical to that in Zhou et al. (2011). We rely on the Karush-Kuhn-Tucker (KKT) conditions to prove this.

Parameters \(\xi _0, \xi _1(z)\) and \(\eta _2(z)\) do not depend on \(y_0, y_1\) or \(y_2\) due to the additive separability of \(\Phi ({{\varvec{y}}},z)\) in \({{\varvec{y}}}\). They may, however, be equal to \(+\infty \) or \(-\infty \).
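For intuition about when \(\xi _0\) is finite, consider the special case \(K \equiv 0\), which corresponds to the boundary condition of the dynamic program. Under this assumption, the part of \(\Phi \) that depends on \(y_0\) is \((r_0 - s_0)y_0 + hE(y_0 - D)^+ + pE(D - y_0)^+\), whose derivative is \((r_0 - s_0) + (h + p)F(y_0) - p\), where \(F\) is the distribution function of \(D\). The sketch below solves the resulting first-order condition; the exponential demand distribution, the function names and the numbers are illustrative assumptions, and \(h + p > 0\) is assumed.

```python
import math


def xi0_terminal(r0, s0, h, p, quantile):
    """Minimizer of (r0 - s0) * y0 + h E(y0 - D)^+ + p E(D - y0)^+ (case K = 0).

    `quantile` is the inverse distribution function of D (an assumed input).
    """
    ratio = (p - (r0 - s0)) / (h + p)    # target value of F(y0)
    if ratio < 0:
        return -math.inf                 # cost is increasing in y0 everywhere
    if ratio >= 1:
        return math.inf                  # cost is nonincreasing in y0 everywhere
    return quantile(ratio)


def exponential_quantile(mean):
    """Inverse distribution function of an exponential demand with given mean."""
    return lambda q: -mean * math.log(1.0 - q)


# Hypothetical cost parameters.
print(xi0_terminal(1.0, 0.2, 1.0, 3.0, exponential_quantile(10.0)))
print(xi0_terminal(5.0, 0.2, 1.0, 3.0, exponential_quantile(10.0)))
```

The second call has \(r_0 - s_0 > p\), so the parameter equals \(-\infty \), in line with the remark that the parameters need not be finite.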

Observe that \(\xi _1(z) \le \xi _0\), since by definition of \(\xi _0, \xi _1(z)\), we have \(\Phi (\xi _1(z),\xi _1(z),y_2,z) \le \Phi (\xi _0,\xi _0,y_2,z) \le \Phi (\xi _1(z),\xi _0,y_2,z)\). The result then follows by the increasing property of \(\Phi ({{\varvec{y}}},z)\) in \(y_1\), by Remark 2.

Note that there is no clear relationship between \(\xi _1(z)\) and \(\eta _2(z)\). If \(\eta _2(z) < \xi _1(z)\), then redefine \(\eta _2(z)\) and \(\xi _1(z)\) to be equal and belong to \(\text{ argmin }_{y_0} \Phi (y_0,y_0,y_0,z)\). The following proposition shows that in this case, we still have \(\xi _1(z) \le \xi _0\).

Proposition 3

Suppose \(\xi _1^p(z)\) and \(\eta _2^p(z)\) defined by

$$\begin{aligned} \xi _1^p(z) \in \text{ argmin }_{y_0} \Phi (y_0,y_0,y_2,z), \nonumber \\ \eta _2^p(z) \in \text{ argmin }_{y_2} \Phi (y_0,y_1,y_2,z), \end{aligned}$$

are such that \(\eta _2^p(z) < \xi _1^p(z)\). Then \(\xi _1(z) \in \text{ argmin }_{y_0} \Phi (y_0,y_0,y_0,z)\) has the property that \(\xi _1(z) \le \xi _0\), where \(\xi _0\) is given by the first inclusion in (13).

Proof

We prove by contradiction by assuming that \(\xi _0 < \xi _1(z)\).

First note that \(\xi _1^p(z) \le \xi _0\). Then, we have \(\eta _2^p(z)< \xi _1^p(z) \le \xi _0 < \xi _1(z)\). Hence, by definition of \(\eta _2^p(z)\) and the convexity of \(\Phi _2(\cdot ,z)\), we obtain \(\Phi _2(\xi _0,z) \le \Phi _2(\xi _1(z),z)\).

Now, by definition of \(\xi _1(z)\),

$$\begin{aligned} \Phi (\xi _1(z),\xi _1(z),\xi _1(z),z) \le \Phi (\xi _0,\xi _0,\xi _0,z). \end{aligned}$$
(14)

Observe that \(\Phi (\xi _0,\xi _0,\xi _0,z) \le \Phi (\xi _0,\xi _0,\xi _1(z),z)\) holds, since \(\Phi _2(\xi _0,z) \le \Phi _2(\xi _1(z),z)\). Therefore, from (14), we have

$$\begin{aligned} \Phi (\xi _1(z),\xi _1(z),\xi _1(z),z) \le \Phi (\xi _0,\xi _0,\xi _1(z),z). \end{aligned}$$
(15)

If \(\xi _1(z) \le \xi _1^p(z)\), then \(\xi _1(z) \le \xi _0\), as \(\xi _1^p(z) \le \xi _0\). This is a contradiction to our assumption.

If \(\xi _1^p(z) < \xi _1(z)\), then, by \(\xi _1^p(z) \le \xi _0\), the convexity of \(\Phi \) in the first two variables, the definition of \(\xi _1^p(z)\) and (15), we have \(\xi _1(z) \le \xi _0\), which is again a contradiction to our assumption.

Hence, we have the required result. \(\square \)

Remark 3

In the case \(\eta _2^p(z) < \xi _1^p(z)\), where \(\eta _2(z)\) and \(\xi _1(z)\) are defined by \(\xi _1(z) = \eta _2(z) \in \ {\mathrm{argmin}}_{y_0}\ \Phi (y_0,y_0,y_0,z)\), it is easy to check that \(\eta _2^p(z) \le \xi _1(z) = \eta _2(z) \le \xi _1^p(z)\).

In any case, we have

$$\begin{aligned}&\xi _1(z) \le \xi _0, \end{aligned}$$
(16)
$$\begin{aligned}&\xi _1(z) \le \eta _2(z). \end{aligned}$$
(17)

Note that the way we define the above parameters that eventually give rise to the optimal policy \({\varvec{{\pi }}}^{{\varvec{A,*}}}\) for our backlog model when returns are forecast from past demands is similar to that in Zhou et al. (2011). These parameters are used in Theorem 4 to define an optimal solution to the minimization problem (11) subject to constraints (12). Theorem 4 is proved by using the KKT conditions.

\(\xi _0\) and \(\xi _1(z)\) may be thought of as “remanufacture-up-to” parameters, while \(\eta _2(z)\) is the “dispose-down-to” parameter. Depending on the values of \(x_0, x_1, x_2\), the system may remanufacture up to \(\xi _1(z)\) or \(\xi _0\). Similarly, depending on the values of \(x_0, x_1, x_2\), some of the normal cores may be disposed of so that the aggregate inventory level of serviceable products, buyback and normal cores is brought down to the level \(\eta _2(z)\).

Let us define \(\xi _{-1}(z) = \infty \). This definition is needed in the statement of Theorem 4 below, where we provide an optimal solution to the minimization problem (11) subject to constraints (12).

Theorem 4

Given \({{\varvec{x}}} = (x_0,x_1,x_2), x_0 \le x_1 \le x_2\), it either satisfies \(\xi _m(z) \le x_m \le \xi _{m-1}(z)\) or \(x_m \le \xi _m(z) \le x_{m+1}\) for some \(m = 0\) or \(1\). If not, then \(x_0 \le x_1 \le x_2 \le \xi _1(z) \le \xi _0(z)\). Here, we denote \(\xi _0\) by \(\xi _0(z)\).

Let \((y_0^*({{\varvec{x}}},z),y_1^*({{\varvec{x}}},z),y_2^*({{\varvec{x}}},z))\) be as defined below:

  1. (i) If \(m = 0\), let \(y_0^*({{\varvec{x}}},z) = \max \{x_0, \xi _0\}\), \(y_1^*({{\varvec{x}}},z) = x_1\).

     (ii) If \(m = 1\), let \(y_0^*({{\varvec{x}}},z) = y_1^*({{\varvec{x}}},z) = \max \{x_1, \xi _1(z)\}\).

     (iii) Otherwise, let \(y_0^*({{\varvec{x}}},z) = y_1^*({{\varvec{x}}},z) = x_2\).

  2. Let \(y_2^*({{\varvec{x}}},z) = \max \{ x_{1}, \min \{ x_2, \eta _2(z) \} \}\).

Then \((y_0^*({{\varvec{x}}},z),y_1^*({{\varvec{x}}},z),y_2^*({{\varvec{x}}},z))\) defined above is an optimal solution to (11) subject to constraints (12).

The above theorem is proved by verifying that the defined \((y_0^*({{\varvec{x}}},z),y_1^*({{\varvec{x}}},z),y_2^*({{\varvec{x}}},z))\) satisfies the KKT conditions for (11) subject to constraints (12). This is done by exhausting all the different scenarios in which \(x_0,x_1,x_2,\xi _0,\xi _1(z),\eta _2(z)\) can be arranged. Satisfying the KKT conditions is necessary and sufficient for optimality, since the minimization problem is a convex program and Slater’s condition holds trivially.

We can alternatively express the policy in Theorem 1 in terms of \({{\varvec{{y}}}_{\varvec{1}}^{{\varvec{A}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{1}}})\) and \({{\varvec{{y}}}_{{\varvec{n}}}^{{\varvec{A,*}}}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}^A)\), \(2 \le n \le N\), which have expressions similar to \((y_0^*({{\varvec{x}}},z),y_1^*({{\varvec{x}}},z),y_2^*({{\varvec{x}}},z))\) in the above theorem.
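The case analysis of Theorem 4 can be sketched as a short function, given values of the parameters that satisfy (16) and (17). The function name and the numerical inputs are assumptions; infinite parameter values may be passed as `math.inf` or `-math.inf`.

```python
import math


def theorem4_solution(x, xi0, xi1, eta2):
    """Return (y0*, y1*, y2*) from Theorem 4 for x = (x0, x1, x2),
    assuming xi1 <= xi0 and xi1 <= eta2."""
    x0, x1, x2 = x
    if xi0 <= x1:                       # m = 0: xi0 <= x0, or x0 <= xi0 <= x1
        y0, y1 = max(x0, xi0), x1
    elif xi1 <= x2:                     # m = 1: xi1 <= x1 <= xi0, or x1 <= xi1 <= x2
        y0 = y1 = max(x1, xi1)
    else:                               # otherwise: x2 < xi1 <= xi0
        y0 = y1 = x2
    y2 = max(x1, min(x2, eta2))
    return y0, y1, y2


# Hypothetical parameter values.
print(theorem4_solution((0, 2, 5), xi0=3, xi1=1, eta2=4))
print(theorem4_solution((0, 2, 5), xi0=-math.inf, xi1=-math.inf, eta2=math.inf))
```

The two branches that make up each of the cases \(m = 0\) and \(m = 1\) collapse into a single test because \(\xi _1(z) \le \xi _0\).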

We end this subsection with the following two propositions on \({\overline{K}}({{\varvec{x}}},z)\), which are needed when Theorem 4 is applied in the proof by induction to show Theorem 1.

Proposition 4

\({\overline{K}}({{\varvec{x}}},z)\) is a convex function of \(({{\varvec{x}}},z)\), where \({{\varvec{x}}} = (x_0,x_1,x_2)\), \(x_0 \le x_1 \le x_2\) and \(z \in \mathfrak {R}_+\). As a consequence, \({\overline{K}}({{\varvec{x}}},z)\) is continuously differentiable a.e. on \(\{ ({{\varvec{x}}}, z)\ ; \ x_0 \le x_1 \le x_2, z \in \mathfrak {R}_+ \}\).

Proof

Consider the following set

$$\begin{aligned} {{\mathcal {C}}}:= & {} \{ ({{\varvec{x}}},z,v) = (x_0,x_1,x_2,z,v) ; x_0 \le x_1 \le x_2, z \in \mathfrak {R}_+, \exists {{\varvec{y}}} = (y_0,y_1,y_2)\ {\text{ such } \text{ that }} \\&y_0 \le y_1 \le y_2, y_1 - x_1 \le y_0 - x_0, y_2 \le x_2, y_1 \ge x_1,\\&v \ge C({{\varvec{y}}},z) + \alpha K({{\varvec{y}}},z) \}. \end{aligned}$$

It is easy to show that \({{\mathcal {C}}}\) is a convex set in \(\mathfrak {R}^5\), since \(C({{\varvec{y}}},z)\) and \(K({{\varvec{y}}},z)\) are convex functions of \(({{\varvec{y}}},z)\). Therefore, by Theorem \(5.3\) of Rockafellar (1970), \(f({{\varvec{x}}},z) = \inf \{ v\ ; ({{\varvec{x}}},z,v) \in {{\mathcal {C}}} \}\) is a convex function of \(({{\varvec{x}}},z)\). Since \(f({{\varvec{x}}},z) = {\overline{K}}({{\varvec{x}}},z)\), we then have \({\overline{K}}({{\varvec{x}}},z)\) is a convex function of \(({{\varvec{x}}},z)\). The consequence in the proposition follows from Theorem 25.5 of Rockafellar (1970). \(\square \)

Proposition 5

\({\overline{K}}({{\varvec{x}}},z)\) is additively separable in \({{\varvec{x}}}\) and is also additively separable in \(x_0\), \(z\). Hence, \({\overline{K}}({{\varvec{x}}},z) = {\overline{K}}_0(x_0) + {\overline{K}}_1(x_1,z) + {\overline{K}}_2(x_2,z)\), for some function \({\overline{K}}_0(x_0)\), \({\overline{K}}_i(x_i,z), i = 1, 2\). Also, \(\frac{\partial {\overline{K}}}{\partial x_1}({{\varvec{x}}},z) \ge 0\), where it is defined.

The idea behind the proof of the above proposition is to use Theorem 4 to express \({\overline{K}}({\varvec{x}},z)\) explicitly in terms of expressions that are defined to be \({\overline{K}}_0(x_0)\), \({\overline{K}}_i(x_i,z), i = 1, 2\).

5 A feasible inventory policy for Model B

Since \(Z_n^B\) in (6) subject to constraints (7), and (8) subject to constraints (9), with \(j = B\), is given by (1), it is unlikely that \({{\varvec{y}}_{\varvec{1}}^{{\varvec{B,*}}}}({{\varvec{x}}_{\varvec{1}}})\) and \({{\varvec{y}}_{{\varvec{n}}}^{{\varvec{B}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}^B)\), \(2 \le n \le N\), which express the optimal policy \({{\varvec{\pi }}^{{\varvec{B}},{\varvec{*}}}}\), have easily tractable structures. Given that we have obtained a nice structure for the optimal policy for Model A in Sect. 4, we can use this policy as a feasible policy for Model B by defining the feasible policy \({{\varvec{{\overline{\pi }}}}} = ({\overline{\pi }}_1, \ldots , {\overline{\pi }}_N)\) in the following way:

Let \({{\overline{\pi }}}_1({{\varvec{x}}_{\varvec{1}}}) := {{\varvec{{y}}}_{\varvec{1}}^{{\varvec{A}},{\varvec{*}}}}({{\varvec{x}}_{\varvec{1}}})\), \({{\overline{\pi }}}_2({\varvec{x}}_{\varvec{2}}, z_1^B, b_1) := {{\varvec{{y}}}_{\varvec{2}}^{{\varvec{A,*}}}}({{\varvec{x}}_{\varvec{2}}},z_{1}^B)\), and for \(3 \le n \le N\), \({{\overline{\pi }}}_n({{\varvec{x}}_{\varvec{n}}},{{\varvec{z}}_{{\varvec{n-1}}}^{\varvec{B}}},\)\({{\varvec{b}}_{{\varvec{n-1}}}},\)\(\sigma _{2,1},\ldots ,\sigma _{n-1,1},\epsilon _{2}, \ldots , \epsilon _{n-1}) := {{\varvec{{y}}}_{{\varvec{n}}}^{{\varvec{A,*}}}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}^B)\).

It is clear that \({{\varvec{{\overline{\pi }}}}}\) defined in the above way is a feasible policy for Model B, the model where returns are forecast from past sales. Hence,

$$\begin{aligned}&\displaystyle V_1^B({\varvec{x}}_{\varvec{1}}) \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}), \\&\displaystyle V_n^B({{\varvec{x}}_{\varvec{n}}},z_{n-1}^B) \le V_{{{\varvec{{\overline{\pi }}}}},n}({{\varvec{x}}_{\varvec{n}}},{{\varvec{z}}_{{\varvec{n-1}}}^{\varvec{B}}}, {{\varvec{b}}_{{\varvec{n-1}}}},\sigma _{2,1},\ldots ,\sigma _{n-1,1},\epsilon _2,\ldots ,\epsilon _{n-1}), \ 2 \le n \le N. \end{aligned}$$

A natural question to ask is how close the feasible policy is to optimality. One way to answer this question is to compare the system cost under this feasible policy with the optimal system cost, which is what we proceed to do. We do this by using what we know so far, namely the structure of the optimal policy \({\varvec{\pi }}^{{\varvec{A}},{\varvec{*}}}\) for Model A. We use it to analyze \(V_n^A({{\varvec{x}}_{\varvec{n}}},z_{n-1}), 2 \le n \le N\).

In what follows, we write \(z_{n-1}\) without a superscript to indicate that we are not attaching any meaning to this variable as past demand or sales, but merely treating it as a generic nonnegative variable.

Let us impose the following conditions on our cost parameters:

Condition 1

  (a) \(r_1 \le u + p\).

  (b) \(r_0 \le s_0 + p\).

  (c) \(r_1 \le s_1 + p\).

  (d) \(u \le h + r_1\).

These conditions are reasonable for the model. The first three conditions encourage remanufacturing, which is the only way to obtain enough serviceable products to satisfy demand and avoid backlog, while the last condition discourages remanufacturing of normal cores in favor of disposal when there is no demand for serviceable products, so as to avoid stocking excess serviceable products obtained from remanufacturing. These conditions are needed to prove the following proposition:

Proposition 6

We have, for \(2 \le n \le N\),

  • \(-r_0 \le \frac{\partial {V}^A_n}{\partial x_{0,n}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}) \le (1 + \cdots + \alpha ^{N-n})(-s_0 + h)\).

  • \(r_0 - r_1 \le \frac{\partial {V}^A_n}{\partial x_{1,n}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}) \le s_0 - \min \{s_1,u\} + (\alpha + \cdots + \alpha ^{N-n})(s_0 + u - \min \{s_1,u\})\).

  • \(r_1 - (1 + \cdots + \alpha ^{N-n})p \le \frac{\partial {V}^A_n}{\partial x_{2,n}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}) \le u\).

wherever the partial derivatives are defined.

Note that without Condition 1, we can still obtain a result similar to Proposition 6, but the analysis becomes more complicated.

The above proposition is proved by induction using the structure of the optimal policy in Theorem 1 (see also Theorem 4) and the definitions of \(\xi _0, \xi _{1,n}(z_{n-1}), \eta _{2,n}(z_{n-1})\) as found in (13) and in Proposition 3. Also, the following holds:

Proposition 7

We have, for \(2 \le n \le N\),

$$\begin{aligned}&\frac{\partial V_n^A}{\partial z_{n-1}}({{\varvec{x}}_{\varvec{n}}},z_{n-1}) \\&\quad = E(\sigma _{n,1}) \left[ s_0 + b + \alpha E_{D_n,B_n,\sigma _{n,1},\epsilon _n}\left( \left( \frac{\partial V_{n+1}^A}{\partial x_{1,n+1}} + \frac{\partial V_{n+1}^A}{\partial x_{2,n+1}}\right) ({{\varvec{x}}_{{\varvec{n+1}}}^{{\varvec{A}},{\varvec{*}}}},D_n) \right) \right] , \end{aligned}$$

wherever the partial derivatives are defined.

Using Propositions 6 and 7, we are ready to find an upper bound for the difference between the optimal cost \(V_1^B({{\varvec{x}}_{\varvec{1}}})\) and the cost under our feasible policy \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}})\). Before we do this, we need the following two propositions, which follow from the above two propositions:

Proposition 8

We have, for \(2 \le n \le N\),

$$\begin{aligned}&V_{{{\varvec{{\overline{\pi }}}}},n}({{\varvec{x}}_{\varvec{n}}},{{\varvec{z}}_{{\varvec{n-1}}}},{{\varvec{b}}_{{\varvec{n-1}}}},\sigma _{2,1},\ldots ,\sigma _{n-1,1},\epsilon _2,\ldots ,\epsilon _{n-1}) - V_n^A({{\varvec{x}}_{\varvec{n}}},z_{n-1}) \\&\quad \le \alpha (1 + \cdots + \alpha ^{N-n}) \max \{\alpha ((1 + \cdots + \alpha ^{N-n-2})p - r_0) - s_0 - b,0\}\mu _n, \end{aligned}$$

where \(\mu _n = \max \{ \mu _{D_n}, \ldots , \mu _{D_{N-1}} \}\), \(2 \le n \le N-1\), and \(\mu _N = 0\). Consequently,

$$\begin{aligned}&V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^A({{\varvec{x}}_{\varvec{1}}}) \\&\quad \le \alpha (1 + \cdots + \alpha ^{N-1}) \max \{ \alpha ((1 + \cdots + \alpha ^{N-3})p - r_0) - s_0 - b,0\}\mu _1, \end{aligned}$$

where \(\mu _1 = \max \{ \mu _{D_1}, \ldots , \mu _{D_{N-1}} \}\).

Proposition 9

We have, for \(2 \le n \le N\),

$$\begin{aligned}&V_n^A({{\varvec{x}}_{\varvec{n}}},z_{n-1}) - V_n^B({{\varvec{x}}_{\varvec{n}}},z_{n-1}) \\&\quad \le \alpha (1 + \cdots + \alpha ^{N-n}) (s_0 + b + \alpha (1 + \cdots + \alpha ^{N-n-2})(s_0 + u - \min \{s_1,u\}))\mu _n, \end{aligned}$$

where \(\mu _n = \max \{ \mu _{D_n}, \ldots , \mu _{D_{N-1}} \}\), \(2 \le n \le N-1\), and \(\mu _N = 0\). Consequently,

$$\begin{aligned}&V_1^A({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \\&\quad \le \alpha (1 + \cdots + \alpha ^{N-1}) (s_0 + b + \alpha (1 + \cdots + \alpha ^{N-3})(s_0 + u - \min \{s_1,u\}))\mu _1, \end{aligned}$$

where \(\mu _1 = \max \{ \mu _{D_1}, \ldots , \mu _{D_{N-1}} \}\).

We have the following lemma, which follows from the above two propositions:

Lemma 1

We have, for \(0< \alpha < 1\),

  • If \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) \le V_1^A({{\varvec{x}}_{\varvec{1}}})\), then

    $$\begin{aligned} 0 \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \le \frac{\alpha \mu _1}{1 - \alpha }\left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1, u\}) \right) . \end{aligned}$$
  • If \(V_1^A({{\varvec{x}}_{\varvec{1}}}) \le V_1^B({{\varvec{x}}_{\varvec{1}}})\), then

    $$\begin{aligned} 0 \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \le \frac{\alpha \mu _1}{1 - \alpha } \max \left\{ \alpha \left( \frac{p}{1 - \alpha } - r_0 \right) - s_0 - b, 0 \right\} . \end{aligned}$$
  • Otherwise,

    $$\begin{aligned} 0\le & {} V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \le \frac{\alpha \mu _1}{1 - \alpha } \left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1, u\}) \right. \\&+ \left. \max \left\{ \alpha \left( \frac{p}{1 - \alpha } - r_0 \right) - s_0 - b, 0 \right\} \right) . \end{aligned}$$

Here, \(\mu _1 = \max \{ \mu _{D_1}, \ldots , \mu _{D_{N-1}} \}\).

Proof

Note that \(V_1^{B} ({{\varvec{x}}_{\varvec{1}}}) \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}})\). Hence, depending on how \(V_1^A({{\varvec{x}}_{\varvec{1}}})\) compares with \(V_1^{B} ({{\varvec{x}}_{\varvec{1}}})\) and \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}})\), we have the following situations:

  • If \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) \le V_1^A({{\varvec{x}}_{\varvec{1}}})\), then

    $$\begin{aligned} 0 \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \le V_1^A({{\varvec{x}}_{\varvec{1}}} )- V_1^B({{\varvec{x}}_{\varvec{1}}}). \end{aligned}$$
  • If \(V_1^A({{\varvec{x}}_{\varvec{1}}}) \le V_1^B({{\varvec{x}}_{\varvec{1}}})\), then

    $$\begin{aligned} 0 \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^A({{\varvec{x}}_{\varvec{1}}}). \end{aligned}$$
  • Otherwise,

    $$\begin{aligned} 0 \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) = (V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^A({{\varvec{x}}_{\varvec{1}}})) + (V_1^A({{\varvec{x}}_{\varvec{1}}} )- V_1^B({{\varvec{x}}_{\varvec{1}}})). \end{aligned}$$

The required results then follow by applying Propositions 8 and 9, and noting that

$$\begin{aligned} 1 + \cdots + \alpha ^k \le \frac{1}{1 - \alpha }, \end{aligned}$$

for all \(k \ge 0\). \(\square \)
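As a worked instance of this last step (a sketch for the first case only, assuming \(N \ge 3\) and nonnegative cost parameters), bounding both geometric sums in the consequence of Proposition 9 by \(1/(1-\alpha )\) gives

$$\begin{aligned} V_1^A({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}})&\le \alpha (1 + \cdots + \alpha ^{N-1}) \left( s_0 + b + \alpha (1 + \cdots + \alpha ^{N-3})(s_0 + u - \min \{s_1,u\}) \right) \mu _1 \\&\le \frac{\alpha \mu _1}{1 - \alpha } \left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1,u\}) \right) . \end{aligned}$$

The second case is handled in the same way starting from the consequence of Proposition 8, and the third case adds the two resulting bounds.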

We note the following:

Proposition 10

For \(0< \alpha < 1\), if \(p \le (1-\alpha )(r_0 + (s_0 + b)/\alpha )\), then \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) \le V_1^A({{\varvec{x}}_{\varvec{1}}})\).

We now have the main result of this section:

Theorem 5

For \(0< \alpha < 1\),

  • If \(p \le (1-\alpha )(r_0 + (s_0 + b)/\alpha )\), then

    $$\begin{aligned} 0 \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \le \frac{\alpha \mu _1}{1 - \alpha }\left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1, u\}) \right) . \end{aligned}$$
  • Otherwise,

    $$\begin{aligned} 0 \le V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}}) \le \frac{\alpha ^2 \mu _1}{1 - \alpha } \left( \frac{1}{1 - \alpha }(s_0 + u + p - \min \{s_1, u\}) - r_0 \right) . \end{aligned}$$

Here, \(\mu _1 = \max \{ \mu _{D_1}, \ldots , \mu _{D_{N-1}} \}\).

Proof

Observe that when \(p \le (1-\alpha )(r_0 + (s_0 + b)/\alpha )\), Proposition 10 gives \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) \le V_1^A({{\varvec{x}}_{\varvec{1}}})\), and the result then follows from Lemma 1. On the other hand, when \(p > (1-\alpha )(r_0 + (s_0 + b)/\alpha )\), we have

$$\begin{aligned} \alpha \left( \frac{p}{1 - \alpha } - r_0 \right) - s_0 - b > 0, \end{aligned}$$

and the result also follows from Lemma 1. \(\square \)
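To spell out the algebra behind the second bound (a sketch, assuming nonnegative cost parameters so that the third bound of Lemma 1 dominates the other two and can be used in every case): when the maximum is attained by its first argument, the terms \(s_0 + b\) cancel, and

$$\begin{aligned}&\frac{\alpha \mu _1}{1 - \alpha } \left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1, u\}) + \alpha \left( \frac{p}{1 - \alpha } - r_0 \right) - s_0 - b \right) \\&\quad = \frac{\alpha ^2 \mu _1}{1 - \alpha } \left( \frac{1}{1 - \alpha }(s_0 + u + p - \min \{s_1, u\}) - r_0 \right) , \end{aligned}$$

which is the expression appearing in the second case of Theorem 5.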

Observe from the above theorem that when the discount factor \(\alpha \) is small, we can approximate the optimal policy \({\varvec{\pi }}^{{\varvec{B}},{\varvec{*}}}\) by the feasible policy \({\varvec{{\overline{\pi }}}}\) well, as the upper bounds tend to zero as \(\alpha \) approaches zero. We also observe from the theorem that the difference between \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}})\) and \(V_1^B({{\varvec{x}}_{\varvec{1}}})\) is bounded above by constants that are independent of the planning horizon and initial inventories, but only dependent on cost parameters \(r_0, s_0, s_1, b, p, u\), discount factor \(\alpha \) and mean demands \(\mu _{D_i}\), \(1 \le i \le N-1\).

It is not surprising that the constants depend on \(s_0\), \(b\) and \(r_0\). The bounds are obtained by comparing with the feasible policy \({\varvec{{\overline{\pi }}}}\), which is related to Model A; the difference between Model A and Model B is the way in which returns of buyback cores are forecast; and \(s_0\), \(b\) and \(r_0\) are cost parameters related to buyback cores. On the other hand, the only effect normal cores have on the above bounds is through \(s_1\) and \(u\). The unit penalty cost, \(p\), only appears in a bound when it is large enough, in particular, larger than \((1-\alpha )(r_0 + (s_0 + b)/\alpha )\). It can be imagined that when the initial inventories of serviceable products, buyback cores and normal cores are low, the difference between \(Z_n^A\) and \(Z_n^B\) is likely to grow as \(n\) increases, resulting in more penalty cost being incurred for Model B than for Model A. This is so because the number of units of buyback cores returned for the former gets smaller compared to that for the latter. Hence, since the upper bound for the difference \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}})\) is obtained by comparing Model A with Model B, \(p\) appears in the upper bound when it is large. This reflects the difference in penalty costs between the two models when inventories are low. The unit holding cost, \(h\), does not play a role in these constants, since when there are more serviceable products than demand, \(Z_n^A\) is equal to \(Z_n^B\), and there is no difference between Model A and Model B in the \(n\)th period.

5.1 Numerical study

In our numerical experiments, we investigate the difference in costs, \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}})\), by varying initial inventories (at the start of the planning horizon) and parameters of our models. Note that the upper bounds for \(V_{{{\varvec{{\overline{\pi }}}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}})\) given in Theorem 5 are worst-case bounds, and we expect the actual differences to be smaller than these upper bounds. This is substantiated by the numerical results given below.

To be realistic, in our numerical experiments we set the length of the planning horizon, \(N\), to \(6\), with one set of numerical experiments having \(N\) from \(3\) to \(15\) in increments of \(3\). Because of the curse of dimensionality when solving dynamic programs, the dynamic programs for Model A and Model B (both presented in Sect. 3) are solved only for 3 periods at a time, instead of for the whole planning horizon of length \(N\), which can be greater than 3.

To implement our models with dynamic programs being solved only for 3 periods even though the length of the planning horizon can be greater than 3, in the \(1\)st period we solve the dynamic program with length \(3\) for Model A and Model B to obtain the number of units of serviceable products to remanufacture from buyback or normal cores and the number of units of normal cores to dispose of. For the same realized demand, we then obtain the initial inventories of serviceable products, buyback cores and normal cores at the beginning of the \(2\)nd period under the feasible policy \({{\varvec{{\overline{\pi }}}}}\) (which is related to Model A) and under the optimal policy \({{\varvec{\pi }}^{{\varvec{B}},{\varvec{*}}}}\) (which is related to Model B). The expected costs in the \(1\)st period under the two policies are also computed for the same realized demand in the \(1\)st period. We then solve the dynamic program for Model A and Model B again, but now for \(n = 2\), to obtain the number of units of serviceable products to remanufacture from buyback or normal cores and the number of units of normal cores to dispose of. The initial inventories for the \(3\)rd period are then updated for the same realized demand but different realized returns of buyback cores, which depend on the serviceable products available in the \(1\)st period under the feasible policy \({{\varvec{{\overline{\pi }}}}}\) (which is related to Model A) and the optimal policy \({{\varvec{\pi }}^{{\varvec{B,*}}}}\) (which is related to Model B). The expected costs in the \(2\)nd period under the feasible policy \({{\varvec{{\overline{\pi }}}}}\) and under the optimal policy \({{\varvec{\pi }}^{{\varvec{B}},{\varvec{*}}}}\) are also computed for the same realized \(2\)nd period demand. \(N\) is always a multiple of \(3\) in our numerical experiments.
If \(N\) is larger than \(3\), then at the \(4\)th period, we solve the dynamic program for Model A and Model B again, for a horizon of length \(3\), treating the different inventories we obtained at the end of the \(3\)rd period, after taking into account realized demand and returns of cores, as initial inventories for each dynamic program. We continue in this way if \(N\) is larger than \(6\). For each policy, the total expected cost for the whole planning horizon \(N\) is the sum of the appropriately discounted expected single period costs, given by (3).
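The re-solving pattern described above can be sketched as follows. This is only an illustration of the bookkeeping, not of the dynamic programs themselves: the helper names `lookahead_schedule` and `discounted_total` are hypothetical, the look-ahead of one period in the \(3\)rd period of each block is an assumption extrapolated from the description of the first two periods, and discounting period \(n\) by \(\alpha ^{n-1}\) is an assumed reading of "appropriately discounted".

```python
def lookahead_schedule(N, window=3):
    """For each period 1..N, return (period, look-ahead length).

    The dynamic program is assumed to be re-solved every period over the
    periods remaining in the current block of `window` periods.
    """
    if N % window != 0:
        raise ValueError("N is assumed to be a multiple of the window length")
    return [(t, window - (t - 1) % window) for t in range(1, N + 1)]


def discounted_total(period_costs, alpha):
    """Sum single-period costs, discounting period n by alpha**(n - 1)."""
    return sum(alpha ** n * cost for n, cost in enumerate(period_costs))


schedule = lookahead_schedule(6)
print(schedule)  # [(1, 3), (2, 2), (3, 1), (4, 3), (5, 2), (6, 1)]
print(discounted_total([1.0, 1.0, 1.0], alpha=0.5))  # 1.75
```

In an actual experiment, the single-period costs passed to `discounted_total` would come from the decisions returned by the dynamic program solved in each period.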

We consider integral inventories and demands in our numerical study. For each policy, the programs are run \(R\) times, and we obtain the average system cost under each policy (\({\hat{V}}_{{\varvec{{\overline{\pi }}}},1}\) and \({\hat{V}}_1^B\)) by summing the total system cost obtained in each run, as described in the last paragraph, and then dividing by \(R\), where \(R\) is taken to be \(100\). The rounded uniform distribution \(U_R(0,15)\), a discrete probability distribution, is considered for \(D_n\). We let the maximum returns period \(K\) be 1, and in our dynamic programs, \(\sigma _{n,1}\) is such that \(\sigma _{n,1}z^j_{n-1}\) is binomially distributed with probability of success \(p_0\) and number of trials \(z^j_{n-1}\), \(j = A, B\). We let \(\epsilon _n \equiv 0\). Hence, we consider the situation in which a product is either returned with probability \(p_0\) in the next period or not at all.
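A minimal sketch of how the random inputs of one period might be drawn is given below. Several choices here are assumptions rather than details taken from the experiments: \(U_R(0,15)\) is implemented by rounding a continuous uniform draw on \([0, 15]\) to the nearest integer, sales are simplified to the minimum of demand and a hypothetical available stock (ignoring backlog), and the function names are illustrative.

```python
import random


def sample_demand(rng, low=0, high=15):
    """Draw a demand by rounding a continuous uniform draw on [low, high]."""
    return int(rng.uniform(low, high) + 0.5)


def sample_returns(rng, z_prev, p0=0.8):
    """Binomial(z_prev, p0): each of the z_prev units returns independently."""
    return sum(1 for _ in range(z_prev) if rng.random() < p0)


rng = random.Random(0)
demand = sample_demand(rng)
available = 5                    # hypothetical stock of serviceable products
sales = min(demand, available)   # simplification: backlog is ignored here

returns_A = sample_returns(rng, demand)  # Model A: returns driven by past demand
returns_B = sample_returns(rng, sales)   # Model B: returns driven by past sales
print(demand, sales, returns_A, returns_B)
```

Since sales never exceed demand in this sketch, the pool from which returns are drawn for Model B is never larger than that for Model A, which is the source of the gap between the two models discussed above.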

In all our numerical experiments, we set \(b = 1.0\), \(r_0 = 1.0\), \(r_1 = 1.0\), \(s_0 = 1.0\), \(s_1 = 1.0\), \(u = 1.0\), \(p_0 = 0.8\) and \(B_n \equiv 5\). Our numerical results show that the percentage difference \(\frac{{\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B}{{\hat{V}}_1^B} \times 100\) is not greater than \(3.50\%\), with only two instances beyond \(3.00\%\).

Table 1 Effect of different initial inventories on average system costs under the feasible and optimal policies (\(h= 1.0, p = 2.0, \alpha = 0.5, N = 6\))

Table 1 shows how \({\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B\) varies with changes in the initial inventories of serviceable products, buyback cores and normal cores. We see from the table that there is no set pattern to how the difference varies with changes in \(x_{0,1}, x_{1,1}, x_{2,1}\). The maximum value for \({\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B\) in the table is \(0.74\), which is much smaller than the predicted upper bound

$$\begin{aligned} \frac{\alpha \mu _1}{1 - \alpha }\left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1, u\}) \right) \approx 22.5, \end{aligned}$$

which holds for all values of \((x_{0,1}, x_{1,1}, x_{2,1})\) in the table.
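The upper bound quoted above can be reproduced with a short calculation. The sketch below evaluates the two cases of Theorem 5 as stated; the function name `theorem5_bound` is hypothetical, and \(\mu _1 = 7.5\) is an assumption, namely the mean of a demand distributed symmetrically on \(0, \ldots , 15\), which is consistent with the value \(22.5\) reported here.

```python
def theorem5_bound(alpha, mu1, r0, s0, s1, u, b, p):
    """Evaluate the upper bound of Theorem 5 for 0 < alpha < 1."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    core_term = s0 + u - min(s1, u)
    threshold = (1 - alpha) * (r0 + (s0 + b) / alpha)
    if p <= threshold:
        # First case: the bound does not involve the penalty cost p.
        return alpha * mu1 / (1 - alpha) * (s0 + b + alpha / (1 - alpha) * core_term)
    # Second case: p is large enough to enter the bound.
    return alpha ** 2 * mu1 / (1 - alpha) * ((core_term + p) / (1 - alpha) - r0)


# Cost parameters of the numerical study; mu1 = 7.5 is an assumed mean demand.
params = dict(mu1=7.5, r0=1.0, s0=1.0, s1=1.0, u=1.0, b=1.0, p=2.0)
bound_half = theorem5_bound(alpha=0.5, **params)
bound_fifth = theorem5_bound(alpha=0.2, **params)
print(round(bound_half, 2), round(bound_fifth, 2))  # 22.5 4.22
```

Note that the bound depends on neither \(N\) nor the initial inventories, so a single evaluation covers every row of the table.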

Table 2 Effect of different length of planning horizon on average system costs under the feasible and optimal policies (\(x_{0,1} = 5, x_{1,1} = 10, x_{2,1} = 15, h = 1.0, p = 2.0, \alpha = 0.5\))

As shown in Table 2, the difference \({\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B\) does not vary very much as the length of the planning horizon \(N\) increases. In Theorem 5, the upper bounds provided are also independent of \(N\).

Table 3 Effect of different discount factor on average system costs under the feasible and optimal policies (\(x_{0,1} = 5, x_{1,1} = 10, x_{2,1} = 15, h = 1.0, p = 2.0, N = 6\))

In line with Theorem 5, we see from Table 3 that the difference between \({\hat{V}}_{{\varvec{{\overline{\pi }}}},1}\) and \({\hat{V}}_1^B\) increases as the discount factor \(\alpha \) increases, although the actual difference is smaller than the upper bounds provided in the theorem. For example, when \(\alpha = 0.2\), we have the theoretical upper bound, as given by Theorem 5, of

$$\begin{aligned} \frac{\alpha \mu _1}{1 - \alpha }\left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1, u\}) \right) \approx 4.22 \end{aligned}$$

and we have

$$\begin{aligned} \frac{\text{ theoretical } \text{ upper } \text{ bound }}{{\hat{V}}_1^B} \times 100 \approx 15.39\%, \end{aligned}$$

while

$$\begin{aligned} \frac{{\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B}{{\hat{V}}_1^B} \times 100 = 0.11\%. \end{aligned}$$

When \(\alpha = 0.8\), we have the theoretical upper bound (see Footnote 6), as given by Theorem 5, of

$$\begin{aligned} \frac{\alpha \mu _1}{1 - \alpha }\left( s_0 + b + \frac{\alpha }{1 - \alpha }(s_0 + u - \min \{s_1, u\}) \right) \approx 180.00 \end{aligned}$$

and we have

$$\begin{aligned} \frac{\text{ theoretical } \text{ upper } \text{ bound }}{{\hat{V}}_1^B} \times 100 \approx 180.34\%, \end{aligned}$$

while

$$\begin{aligned} \frac{{\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B}{{\hat{V}}_1^B} \times 100 = 3.15\%. \end{aligned}$$
Table 4 Effect of different unit penalty cost on average system costs under the feasible and optimal policies (\(x_{0,1} = 5, x_{1,1} = 10, x_{2,1} = 15, h = 1.0, \alpha = 0.5, N = 6\))

In Table 4, we see that as the unit penalty cost \(p\) increases from \(1.0\) to \(4.0\), the difference \({\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B\) decreases steadily from \(0.65\) to \(0.13\), except for an increase when \(p\) increases from \(1.5\) to \(2.0\). However, from Theorem 5, we would expect the difference to increase with \(p\), since the upper bound in the theorem increases with \(p\). One reason the upper bound in the theorem increases with \(p\) is that it is obtained by considering Models A and B: returns of buyback cores for Model A depend on past demands, while returns of buyback cores for Model B depend on past sales, which can be low when there are insufficient serviceable products. Remanufacturing is therefore unaffected for Model A, while for Model B there may be fewer buyback cores to remanufacture into serviceable products to satisfy current demand, owing to low past sales. In the worst case, this difference in penalty costs, due to unsatisfied demand, becomes apparent as \(p\) becomes large, leading to an upper bound for \(V_{{\varvec{{\overline{\pi }}}},1}({{\varvec{x}}_{\varvec{1}}}) - V_1^B({{\varvec{x}}_{\varvec{1}}})\) that depends on and increases with \(p\). This effect is not present when we compute \({\hat{V}}_{{\varvec{{\overline{\pi }}}},1}\) and \({\hat{V}}_1^B\), because returns of buyback cores under policies \({\varvec{{\overline{\pi }}}}\) and \({\varvec{\pi }}^{{\varvec{B,*}}}\) now both depend on past sales. Moreover, as \(p\) increases, the two policies become more similar to each other, and both act to remanufacture any available cores into serviceable products to satisfy demand.

Table 5 Effect of different unit holding cost on average system costs under the feasible and optimal policies (\(x_{0,1} = 5, x_{1,1} = 10, x_{2,1} = 15, p = 2.0, \alpha = 0.5, N = 6\))

In Table 5, we observe that \({\hat{V}}_{{\varvec{{\overline{\pi }}}},1} - {\hat{V}}_1^B\) shows no apparent dependence on the unit holding cost \(h\), which is in line with the independence of the upper bounds in Theorem 5 of \(h\).

We end this subsection by investigating whether the “myopic” policy is good enough as an approximation to the feasible policy \({\varvec{{\overline{\pi }}}}\) in deciding the number of units of buyback and normal cores to remanufacture and the number of units of normal cores to dispose of in each period. It has been shown in the literature, for example in Ignall and Veinott (1969), that under certain situations a “myopic” policy can be optimal. An advantage of the “myopic” policy over the feasible policy \({\varvec{{\overline{\pi }}}}\) for Model B is that the former only requires optimizing the single period cost function in each period, which can be implemented easily, while the latter requires solving (13) to find optimal control parameters for the remanufacturing and disposal decisions, which can be challenging.

Denote the average system cost under the “myopic” policy by \({\hat{V}}_1\). This is obtained by summing the total system cost obtained in each run under this policy and then dividing by \(R\), where \(R\) is the total number of runs and is taken to be \(100\). As shown in Tables 6 and 7, the “myopic” policy approximates the feasible policy \({{\varvec{{\overline{\pi }}}}}\) reasonably well, with the highest percentage difference in average system costs being \(17.46\%\) and the lowest being \(2.00\%\). Out of the 17 scenarios tested, 11 have a percentage difference in average system costs of less than \(10\%\).

Table 6 Effect of different initial inventories on average system costs under the “myopic” and feasible policies (\(h= 1.0, p = 2.0, \alpha = 0.5, N = 6\))
Table 7 Effect of different length of planning horizon on average system costs under the “myopic” and feasible policies (\(x_{0,1} = 5, x_{1,1} = 10, x_{2,1} = 15, h = 1.0, p = 2.0, \alpha = 0.5\))

6 Concluding remarks

In this paper, we describe two models for our remanufacturing inventory system, incorporating a forecasting method for returns of buyback cores that depends on past demands and past sales. This paper considers two types of cores: buyback cores and normal cores. In Theorem 1, through analyzing a dynamic program, we obtain optimal control parameters to describe the optimal policy for the backlog model when returns are forecast from past demands. Properties of the optimal cost and the optimal policy we obtained are also provided. Then, in Sect. 5, we study a feasible inventory policy for the model in which returns are forecast from past sales, and we show how close this feasible inventory policy is to the optimal inventory policy by studying the difference in the expected costs under each of these policies. A question that arises at this point is whether the theoretical upper bounds given in Theorem 5 are tight, and we leave this as future work. Numerical results are given in Sect. 5.1.