Interactive procedure for a multiobjective stochastic discrete dynamic problem

Multiple objectives and dynamics characterize many sequential decision problems. In this paper we consider returns in a partially ordered criteria space as a way of generalizing single-criterion dynamic programming models to the multiobjective case. In our problem, evaluations of alternatives with respect to criteria are represented by distribution functions. Thus, the overall comparison of two alternatives is equivalent to the comparison of two vectors of probability distributions. We assume that the decision maker tries to find a solution preferred to all other solutions (the most preferred solution). A new interactive procedure for stochastic, dynamic multiple criteria decision making problems is proposed. The procedure consists of two steps. First, the Bellman principle is used to identify the set of efficient solutions. Next, an interactive approach is employed to find the most preferred solution. A numerical example and a real-world application are presented to illustrate the applicability of the proposed technique.


Introduction
Multiple objectives and dynamics characterize many sequential decision problems. They led to the emergence of a research field known as multiobjective dynamic programming (MODP). The vector version of Bellman's principle of optimality [2] is a cornerstone upon which MODP is built. Decomposition methods based on the principle of optimality embed the original multiobjective dynamic optimization problem into a family of single-stage multiobjective problems, both in deterministic and stochastic cases [31,32]. Unfortunately, due to the well-known "curse of dimensionality", this approach can rarely be applied in practice. Some attempts to overcome the curse of dimensionality are worth noting; they are based on artificial intelligence, for instance ant colony [25] or genetic algorithm approaches [6].
One of the advantages of considering multiobjective problems in a partially ordered criteria space is the possibility of dealing jointly with different types of outcomes. Linking them is the topic of the paper of Zaraś [38], where three kinds of evaluations were considered: deterministic, stochastic and fuzzy criteria. Such a possibility for dynamic programming models was considered by Trzaskalik and Sitarz [34].
The most recent contributions to dynamic programming from the point of view of the approach proposed in the paper are as follows. Bazgan et al. [1] propose an approach to ordered structures in dynamic programming, which is applied in the present paper. Their contribution is to use dominance relations in dynamic programming. Another nonstandard approach to stochastic dynamic programming is presented by Cardoso et al. [5]. Instead of using the expected value of the objective function for building the optimization criterion, quantiles of the objective function value are applied, and Monte Carlo simulation of the unknown process input outcomes is conducted. The basis of Sitarz's work [24] is the same as in the present paper. However, while the main idea of Sitarz is to apply narrowing relations to reduce the number of considered Pareto optimal solutions, we apply an interactive approach to obtain the final solution.
In the present paper a multi-period problem with returns in a partially ordered criteria space is considered. The dynamic programming method can be applied to find the set of Pareto realizations [33]. We expect that such a set will usually be large, so a formal procedure for reducing it is desirable to obtain the final solution.
The problem considered here is a multiobjective decision making problem with a finite number of solutions and evaluations represented by random variables with known distributions. Keeney and Raiffa [12] suggested the multiattribute utility function for solving it. In practice, estimation of such a function is not easy. However, this approach, although indirect, is the starting point for various multiobjective methods.
Among them there are methods based on pairwise comparisons. This group includes, for example, techniques exploiting stochastic dominance rules [19,38]. As such procedures sometimes do not provide a clear ranking, alternative concepts are proposed, such as the stochastic dominance degree [7,15]. Stochastic Multicriteria Acceptability Analysis (SMAA) is another family of methods used for solving stochastic multiobjective problems [13,14,30]. They provide indices describing the multiobjective problem. In practice, Monte Carlo simulation is used to compute approximations of the values of these indices.
The approach that is used very often for solving real-world multiobjective problems is based on the interactive paradigm. It assumes that all preference information necessary for solving the problem is collected in consecutive steps. In each interactive technique the decision maker is asked to define which criteria influence her/his preferences and to provide preference information with respect to a given solution or a given set of solutions. Initially, the interactive approach was used for solving decision making problems under certainty [3,8,39]. In the stochastic context interactive methods are mainly used for solving multiobjective linear programming problems [18,29,37]. Various classical interactive concepts are adapted for problems where at least some parameters are fuzzy or stochastic, including the reference-point approach [17]. Kato and Sakawa [11] proposed an interactive fuzzy satisficing method to derive a satisficing solution of a multiobjective linear programming problem.
On the other hand, interactive procedures are also proposed for discrete problems, where the number of feasible solutions is moderate. Nowak [20] proposed a modification of the STEM procedure [3] for such a situation. In this technique a single candidate solution is proposed in each iteration. The decision maker is asked to specify the criterion that satisfies her/him and to define the limit of concessions which can be made to improve other criteria. The candidate solution is identified using stochastic dominance rules. INSDECM [21] is another technique that can be used in such cases. It assumes that the decision maker's experience with decision techniques is slightly greater, since during the dialog phase he/she can formulate his/her requirements not only in relation to means, but also with respect to other probability distribution parameters. Both these techniques use a so-called direct paradigm for collecting the preference information. The decision maker defines his/her requirements by specifying constraints on values of distribution parameters. In [22] a technique using trade-off information for identifying a candidate solution is proposed. The information that must be provided by the decision maker is limited. He/she has to specify only the criterion that should be improved, and to order the criteria that can be diminished.
The main purpose of this paper is to propose an interactive procedure for a discrete multiobjective dynamic stochastic decision making problem, which integrates our previous works. Our technique uses a dynamic programming approach [32,33,35] for identifying non-dominated solutions and the interactive procedure INSDECM [21] for selecting the final solution of the problem. An additional purpose of this work is to present a real-world application of our technique.
The paper is structured as follows. In Sect. 2 we start with defining the problem of dynamic programming with a partially ordered criteria space. Then we describe a partially ordered discrete stochastic space (Sect. 3). In the fourth section we briefly present the INSDECM procedure, and in Sect. 5 we describe how this technique can be used for dynamic problems. Next, an illustrative example (Sect. 6) and an application (Sect. 7) are presented. The last section presents conclusions.

Partially ordered criteria space
Let W be a set and v, w ∈ W. We assume that a preference relation ≤ ⊂ W × W is given. If v ≤ w, we say that w is not worse than v. Let • be a binary operator.
A partially ordered criteria space (W, ≤, •) is defined, iff: We define the relation < ⊂ W × W as follows: If v < w, we say that w is better than v. Let V ⊂ W. v is defined as a maximal element of V, iff: We denote by max V the set of all maximal elements of V, defined as follows:
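The relation < and the set max V can be illustrated with a short sketch. It assumes the usual reading in which w is better than v when v ≤ w holds and w ≤ v does not, and it uses a componentwise order on numeric vectors purely as a stand-in for ≤. The function names are hypothetical.

```python
def strictly_better(v, w, leq):
    """Return True if w is better than v: v <= w holds but w <= v does not.

    This reading of the strict relation is an assumption of the sketch.
    """
    return leq(v, w) and not leq(w, v)


def maximal_elements(V, leq):
    """Return the elements of V for which no better element exists in V."""
    items = list(V)
    return [v for v in items
            if not any(strictly_better(v, w, leq) for w in items)]


def componentwise_leq(v, w):
    """Stand-in preference relation: every coordinate of v is <= that of w."""
    return all(a <= b for a, b in zip(v, w))


points = [(1, 3), (2, 2), (1, 1), (3, 1)]
print(maximal_elements(points, componentwise_leq))
# -> [(1, 3), (2, 2), (3, 1)]
```

Any other relation can be passed as `leq`, for instance one built from stochastic dominance.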

Formulation of the problem
The presented description of a discrete dynamic decision process comes from Trzaskalik [31,32]. We consider a process which consists of T periods. We assume that for t ∈ {1, ..., T}: Y_t is the set of all feasible states at the beginning of period t, Y_{T+1} is the set of all feasible states at the end of the process, X_t(y_t) is the set of all feasible decisions for period t and state y_t, D_t(y_t) is the set of all period realizations in period t, defined as follows: D is the set of all process realizations, defined as follows: d(y_t) is the partial realization for a given realization d, which begins at y_t. We have: D(y_t) is the set of all partial realizations which begin at y_t. We have: D(Y_t) is the set of all partial realizations which begin at any y_t ∈ Y_t. We have: P denotes a discrete dynamic process. We have: We assume that the sets Y_1, ..., Y_{T+1}, X_1(y_1), ..., X_T(y_T) and the functions Ω_1, ..., Ω_T are identified.
We assume that a partially ordered criteria space (W, ≤, •) and period criteria functions f_t : D_t → W are given. Functions F_t : D(Y_t) → W are defined in the following way: F is the multi-period criteria function. We assume that F = F_1.
(P, F) denotes a discrete dynamic decision process with returns in the partially ordered criteria space, consisting of the process P and the multi-period criteria function F.
A realization d ∈ D is said to be efficient, iff: The set of all efficient realizations is denoted by D̂.
The dynamic programming problem with a partially ordered criteria space is formulated as follows [33,35]: find D̂ in the decision space and max F(D) in the criteria space.

Bellman's principle of optimality
Applying Theorems 1 and 2, given below, we are able to find the set of all maximal elements of F(D) in the criteria space and the set of all efficient realizations D̂ in the decision space.
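The backward recursion implied by the principle of optimality can be sketched as follows. This is a generic illustration, not a reproduction of Theorems 1 and 2: it assumes a deterministic transition function, uses numeric vectors with componentwise comparison and addition as stand-ins for (W, ≤, •), and relies on the operator being monotone with respect to the relation. All names and the toy data are hypothetical.

```python
def componentwise_leq(v, w):
    """Stand-in for the relation <= on W."""
    return all(a <= b for a, b in zip(v, w))


def add(v, w):
    """Stand-in for the binary operator on W."""
    return tuple(a + b for a, b in zip(v, w))


def non_dominated(pairs, leq):
    """Keep (value, realization) pairs whose value has no better value in the list."""
    return [(v, d) for v, d in pairs
            if not any(leq(v, w) and not leq(w, v) for w, _ in pairs)]


def efficient_realizations(T, states, decisions, transition, f, leq, op):
    """Backward recursion over periods T, T-1, ..., 1.

    best[t][y] holds the non-dominated (value, partial realization) pairs
    that start in state y at the beginning of period t.
    """
    best = {}
    for t in range(T, 0, -1):
        best[t] = {}
        for y in states[t]:
            candidates = []
            for x in decisions[(t, y)]:
                if t == T:
                    candidates.append((f[(t, y, x)], ((y, x),)))
                else:
                    y_next = transition[(t, y, x)]
                    for value, tail in best[t + 1][y_next]:
                        candidates.append(
                            (op(f[(t, y, x)], value), ((y, x),) + tail))
            best[t][y] = non_dominated(candidates, leq)
    starts = [pair for y in states[1] for pair in best[1][y]]
    return non_dominated(starts, leq)


# Toy two-period process (hypothetical data).
states = {1: ["a"], 2: ["b", "c"]}
decisions = {(1, "a"): [0, 1], (2, "b"): [0, 1], (2, "c"): [0, 1]}
transition = {(1, "a", 0): "b", (1, "a", 1): "c"}
f = {
    (1, "a", 0): (1, 0), (1, "a", 1): (0, 1),
    (2, "b", 0): (2, 0), (2, "b", 1): (1, 0),
    (2, "c", 0): (0, 2), (2, "c", 1): (1, 1),
}

result = efficient_realizations(2, states, decisions, transition, f,
                                componentwise_leq, add)
for value, realization in result:
    print(value, realization)
```

In the toy data, decision 1 in state "b" is discarded in period 2 already, so it never has to be combined with period 1 decisions; this pruning is what the recursion is meant to show.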

Partially ordered discrete stochastic space
As an example of a partially ordered criteria space for a multiobjective dynamic problem we consider the set of discrete random variables, which can be understood as a set of probability sequences in the following way [33]: where p_i = P(X = i), n = max {k ∈ N : p_k > 0}. Thus, for simplicity of the procedure description, we assume here that random variables take only nonnegative integer values. However, our procedure will also work for discrete random variables that take real values.

To define the relation ≤ we use FSD (First Stochastic Dominance) and SSD (Second Stochastic Dominance) relations: p ≤ q ⇔ q FSD p ∨ q SSD p, where: Let us now assume that multiple criteria are taken into account. Let N be the number of criteria, and W^N be the product of N structures W. The operator •^N and relation ≤^N are defined as follows: In our case the relation ≤^N holds if FSD or SSD relations hold for each criterion.
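A minimal sketch of these dominance checks for probability sequences over the values 0, 1, ..., n is given below. It assumes the weak textbook forms of the rules: q FSD p when the cumulative distribution of q never exceeds that of p, and q SSD p when the same holds for the running sums of the cumulative distributions. Whether strictness at some point is also required is not settled here, and the function names are hypothetical.

```python
from itertools import accumulate

EPS = 1e-12  # tolerance for floating-point comparisons


def cdf(p, length):
    """Cumulative distribution of a probability sequence, padded to `length`."""
    padded = list(p) + [0.0] * (length - len(p))
    return list(accumulate(padded))


def fsd(q, p):
    """q FSD p: the CDF of q lies on or below the CDF of p everywhere."""
    n = max(len(p), len(q))
    return all(fq <= fp + EPS for fq, fp in zip(cdf(q, n), cdf(p, n)))


def ssd(q, p):
    """q SSD p: running sums of the CDF of q lie on or below those of p."""
    n = max(len(p), len(q))
    sums_q = accumulate(cdf(q, n))
    sums_p = accumulate(cdf(p, n))
    return all(a <= b + EPS for a, b in zip(sums_q, sums_p))


def leq(p, q):
    """p <= q in the sense of the text: q FSD p or q SSD p."""
    return fsd(q, p) or ssd(q, p)


def leq_vector(ps, qs):
    """Relation on vectors of distributions: leq must hold for each criterion."""
    return all(leq(p, q) for p, q in zip(ps, qs))


risky = (0.5, 0.0, 0.5)   # value 0 or 2, each with probability 0.5
sure = (0.0, 1.0)         # value 1 with certainty
print(fsd(sure, risky), ssd(sure, risky))  # -> False True
```

The two distributions in the example have the same mean, so FSD cannot separate them, while SSD prefers the certain outcome.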

INSDECM procedure
INSDECM (INTeractive Stochastic DECision Making) is designed for problems with a finite number of feasible solutions (alternatives) and evaluations represented by random variables with known distributions. As it is an interactive technique, the decision maker's preferences are identified step by step. The main idea of the procedure comes from the interactive multiobjective goal programming (IMGP) proposed by Spronk [27]. Each iteration of INSDECM includes the following phases [21]:
- presentation of the data,
- asking the decision maker to provide preference information by specifying constraints which the solution should satisfy to be acceptable,
- generating the set of alternatives satisfying the conditions defined by the decision maker.
The set of feasible solutions is progressively reduced.The purpose is to identify a small subset of solutions satisfying the decision-maker's requirements, and finally present them to him/her to make the final choice.
As in a stochastic problem outcomes are represented by random variables, the dialog procedure must deal with some parameters of probability distributions. In INSDECM the decision maker is asked to specify the data to be presented. For each criterion he/she may choose the expected outcome measures, as well as variability characteristics. In each iteration a potency matrix is constructed. Each column corresponds to a probability distribution parameter that the decision maker considers worth analyzing. In the first row the worst value of this parameter achievable within the whole set of alternatives is placed, while in the second row its best value is presented. Thus the decision maker is able to find out the interval which includes the values of the parameter for all alternatives analyzed at the current stage of the procedure. The decision maker is asked to express his/her preferences by specifying the minimum or maximum acceptable values of distribution parameters. Such constraints are, in general, not consistent with stochastic dominance rules [23]. INSDECM verifies whether a constraint defined by the decision maker is consistent with stochastic dominance rules and, if an inconsistency is identified, suggests how the constraint can be redefined.
The procedure iterates until the decision maker accepts a particular solution as the final solution. Although the procedure does not limit the number of scalar measures to be presented, the decision maker is usually not able to analyze too many of them. If the number of criteria is large, then it is practical to limit the number of measures for each criterion to one. Usually, central tendency measures provide useful information. Measures based on the probability of getting outcomes above or below a specified target value are also useful, as they are intuitively comprehensible for the decision maker.
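The construction of a potency matrix can be sketched as follows, assuming that each alternative is described by a probability sequence over the values 0, 1, 2, ... for a single criterion. The chosen parameters (the mean and the probability of reaching a target), the data and all names are hypothetical.

```python
def mean(p):
    """Expected value of a probability sequence over 0, 1, 2, ..."""
    return sum(i * pi for i, pi in enumerate(p))


def prob_at_least(target):
    """Return a function computing P(X >= target) for a probability sequence."""
    return lambda p: sum(pi for i, pi in enumerate(p) if i >= target)


def potency_matrix(alternatives, parameters):
    """For each parameter, report its worst and best value over all alternatives.

    `parameters` maps a label to (function, maximize); `maximize` tells
    whether larger values of the parameter are preferred.
    """
    matrix = {}
    for label, (func, maximize) in parameters.items():
        values = [func(p) for p in alternatives.values()]
        if maximize:
            worst, best = min(values), max(values)
        else:
            worst, best = max(values), min(values)
        matrix[label] = {"worst": worst, "best": best}
    return matrix


alternatives = {
    "d1": (0.2, 0.3, 0.5),
    "d2": (0.5, 0.3, 0.2),
    "d3": (0.0, 1.0, 0.0),
}
parameters = {
    "mean": (mean, True),
    "P(X >= 1)": (prob_at_least(1), True),
}
matrix = potency_matrix(alternatives, parameters)
for label, row in matrix.items():
    print(label, row)
```

Each entry of `matrix` corresponds to one column of the potency matrix described above, with its worst and best rows.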

Dynamic INSDECM procedure
In a dynamic decision making problem, the decision maker is often interested not only in values of the multi-period criteria function, but also in values of period criteria functions. Thus, the problem becomes more complicated, as multiple criteria for multiple periods are considered, and additionally, multiple distribution parameters can be taken into account for each criterion. Here we propose a modification of INSDECM for dynamic problems. In order to organize the dialog procedure we propose to ask the decision maker to define a hierarchy of criteria and to consider only one criterion in each iteration.
Let us assume that criteria are ordered according to their importance: the most important is criterion no. 1, while the least important is criterion no. N.
The procedure consists of two main steps:
1. Identification of the set of non-dominated realizations D̂ (obtained by means of the procedure described in Sect. 2.3).
2. Selection of the final solution by the modified INSDECM interactive procedure.
Let D^(l) be the set of process realizations considered in iteration l. The interaction with the decision maker operates as follows:
1. Assume l = 1, D^(l) = D̂.
2. Ask the decision maker to specify the distribution parameters to be analyzed for the l-th criterion, for the period criteria functions and the multi-period criteria function.
3. Construct a potency matrix and present it to the decision maker.
4. Ask the decision maker whether he/she is satisfied with the pessimistic values of distribution parameters. If the answer is YES, go to step 8.
5. Ask the decision maker to define his/her requirements by specifying minimum or maximum values of the parameters under consideration.
6. Verify the consistency of the constraints defined by the decision maker with stochastic dominance rules. If the constraints are consistent with stochastic dominance rules, go to step 8.
7. Propose to the decision maker the ways in which an inconsistent constraint can be redefined and ask him/her to choose one of the suggestions. If he/she does not accept any proposal, go to step 5; otherwise replace the constraint with the proposal accepted by the decision maker.
8. Generate D^(l+1), the set of process realizations d ∈ D^(l) satisfying the constraints defined by the decision maker. If D^(l+1) = ∅, notify the decision maker and go to step 5. If l < N, assume l = l + 1 and go to step 2.
9. Present the set D^(N+1) to the decision maker. If he/she is able to make a final choice, end the procedure; otherwise assume l = 1, D^(1) = D^(N+1) and go back to step 2.

Fig. 1 The structure of the decision making process
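The reduction of the set of realizations over one pass through the criteria can be sketched as below. The sketch is deliberately simplified: the dialog is replaced by a scripted list of lower bounds per criterion, only multi-period distributions are used, and the consistency check against stochastic dominance rules (steps 6 and 7) is omitted. All names and data are hypothetical.

```python
def mean(p):
    """Expected value of a probability sequence over 0, 1, 2, ..."""
    return sum(i * pi for i, pi in enumerate(p))


def prob_at_least(target):
    """Return a function computing P(X >= target)."""
    return lambda p: sum(pi for i, pi in enumerate(p) if i >= target)


def reduce_realizations(realizations, constraints_per_criterion):
    """One pass over the criteria in order of importance.

    `realizations` maps a label to a list of distributions, one per criterion.
    `constraints_per_criterion[l]` is a list of (parameter function, lower
    bound) pairs standing in for the decision maker's answers for criterion l.
    """
    current = dict(realizations)
    for l, constraints in enumerate(constraints_per_criterion):
        reduced = {
            label: dists for label, dists in current.items()
            if all(func(dists[l]) >= bound for func, bound in constraints)
        }
        if not reduced:
            # In the procedure the decision maker would be notified and asked
            # to redefine the requirements; the sketch simply stops here.
            raise ValueError(f"no realization satisfies criterion {l + 1}")
        current = reduced
    return current


realizations = {
    "r1": [(0.2, 0.3, 0.5), (0.1, 0.4, 0.5)],
    "r2": [(0.5, 0.3, 0.2), (0.0, 0.2, 0.8)],
    "r3": [(0.3, 0.4, 0.3), (0.6, 0.3, 0.1)],
}
scripted_answers = [
    [(prob_at_least(1), 0.6)],  # criterion 1: P(X >= 1) of at least 0.6
    [(mean, 1.0)],              # criterion 2: mean of at least 1.0
]
remaining = reduce_realizations(realizations, scripted_answers)
print(sorted(remaining))  # -> ['r1']
```

An empty scripted list for a criterion plays the role of the YES answer in step 4, since no realization is removed for that criterion.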

Numerical example
We consider a two-period, three-criteria decision making process with profits defined by probability distributions. Each random variable takes values 0, 1 or 2 for each period and each state. The structure of the process is presented in Fig. 1, while distributions of profits are shown in Table 1.
Using a dynamic programming method we identify relations between feasible decisions for each state in period 2 (Table 2).

Iteration 1
1. The first criterion is considered. The decision maker specifies the data that he would like to analyze:
- P(f_2^(1)(d_2) ≥ 1): probability that the first criterion will take a value not less than 1 in the second period,
- P(F^(1)(d) ≥ 2): probability that the first criterion will take a value not less than 2 for the whole process.
2. The potency matrix is presented to the decision maker (Table 3).
5. Since the number of process realizations satisfying constraints is greater than 1, the procedure continues: l = 2.

Iteration 2
1. The second criterion is considered. The decision maker specifies the data that he would like to analyze:
4. The constraints defined by the decision maker are consistent with stochastic dominance relations. The set of process realizations satisfying the decision maker's constraints is identified; the following process realization does not satisfy the constraints and is removed: (3, 6, 7).
5. Since all criteria have been analyzed, the set of process realizations is presented to the decision maker (Table 6).
6. The decision maker decides to choose the process realization (1, 4, 8) as the final solution.

Example of application
The case study presented here describes a problem identified in a company that designs and manufactures a full range of rail control signalling equipment. We analyzed an actual problem that the company faced when considering whether to respond to an invitation to tender announced in a country in which the company had not completed any projects before.
When starting a project the company has to make a series of decisions. First of all, managers must decide whether the project is worthy of in-depth analysis. The experience of the company's staff and their intuition are usually used at this stage. If the answer is positive, then the company's role in project implementation must be specified. Usually it operates as a general contractor, cooperates with a local partner, or just supplies solutions, that is, provides the required equipment. Thus, it should be decided whether to look for a local partner. The analysis of a local market is usually time-consuming and expensive. Both the company's staff and consulting firms can be involved in such a study.
In addition to organizational issues, the preparation of the tender involves making a series of technical decisions. In most cases the company is able to propose various solutions to the customer. The alternative to be proposed in a particular situation is crucial for the success of the tender, as well as for the cost-effectiveness of the project. In the case analyzed here it was found that two solutions, A and B, can be proposed. The first option was a proven technology that had been applied in many projects on various markets. An additional advantage of this solution was the possibility to use equipment supplied by other manufacturers, including local partners. Option B included the use of a new technology, which had just been developed by the company's engineers. In this case the whole equipment had to be provided by the company. A local partner could be employed at most for completing a part of the installation work. There was also a risk that this modern technology would not work properly with other equipment installed on-site. Additional research was required to determine whether such a situation might occur.
The problem was to define the strategy which the company should apply while preparing the tender offer. After discussions with the company's representatives we were able to specify the scenario of the decision-making process. Figure 2 presents the structure of the problem. In order to make the description clearer, each final node represents a single strategy.

Fig. 2 The structure of a project planning problem
The states of the decision process and the decisions made in each state are listed at the end of the paper (States 1 to 10). The discussion with the company's managers resulted in specifying the following criteria:
- criterion 1: profit margin (maximized),
- criterion 2: utilization of a limited resource, namely the workforce of project managers (minimized),
- criterion 3: time required to prepare the offer (minimized).
In order to evaluate period criteria functions (probability distributions) the following data were used:
- expert estimations: the company's staff was asked to evaluate the probability of finding a consulting firm and a local partner, the distributions of the time required to complete adaptation works if a local partner is employed as a supplier of some equipment, etc.,
- a simulation model used to analyze the profitability of the contract, assuming that the company's offer is accepted by the customer.
To identify the final solution of the problem we used the procedure described above. First we used Bellman's principle to identify the set of non-dominated strategies. It consisted of 8 strategies represented by the following final nodes: 11, 13, 14, 16, 18, 19, 22, and 23.
During the dialog phase of the procedure the company's manager assigned to cooperate with us focused initially on the profit. He decided to analyze strategies with a high probability of reaching the profit specified by the company's management. Next he concentrated on the third criterion, as he stated that there was a high probability of new tenders. We found that he preferred to define his preferences by specifying the minimal acceptable probability that a particular criterion value will be reached. Finally, the strategy represented by final node no. 14 was chosen. According to it, the company should employ a consulting firm and search for a local partner responsible for completing a part of the installation work. However, in the end the company chose another solution. They decided not to search for a local partner. The reason was the lack of time for preparing the offer. It was difficult to find a consulting company that would be able to help find a local partner. As a result, the time required for preparing the offer was decreased by 30%, but the utilization of a limited resource (project managers) was increased by about 20%. The company also expected that, in the case of winning the contract, the profit margin would be reduced by 5% compared with the one that would have been achieved if the solution identified by our procedure had been implemented.

Final remarks
Most real-world decision problems involve multiple objectives and risk. Decisions are often made sequentially. Multiobjective stochastic dynamic programming is often an efficient tool for solving such problems, and the procedure proposed in this paper can be used for solving them. It combines dynamic programming and an interactive approach.
In this paper we assume that period criteria functions are stochastic, while the transition function is deterministic. This is not the only way in which stochastic dynamic decision problems can be formulated. In future research we are going to modify the problem formulation. Thus, we are going to analyze problems in which:
- period criteria functions are deterministic and the transition function is stochastic,
- both period criteria functions and transition functions are stochastic.
We also plan to employ our procedure to solve other real-world problems: project portfolio management, innovation management and supply chain management.

- State 1: the decision whether to start the preparation of the offer: decision 1A: yes, decision 1B: no.
- State 2: the decision whether the company should look for a local partner: 2A: yes, 2B: no.
- State 3: the decision whether a consulting firm should be employed to look for a local partner: 3A: yes, 3B: no.
- State 4: the decision whether a consulting company should be employed to support project implementation: 4A: yes, 4B: no.
- States 5 and 6: the decision about the role that the local partner should play in the project: 5A/6A: the local partner employed as the supplier of some equipment; 5B/6B: the local partner employed for completing a part of installation work only.
- States 7 and 9: the selection of the technology option assuming the local partner is employed as the supplier of some equipment: 7A/9A: technology option A1: use technology A, most of the equipment supplied by the company, only a small part of the equipment supplied by the local partner; 7B/9B: technology option A2: use technology A, the core equipment supplied by the company, a larger part of the equipment supplied by the local partner; 7C/9C: technology option A3: use technology A, only the most advanced equipment supplied by the company, most of the equipment supplied by the local partner.
- States 8 and 10: the selection of the technology option assuming the local partner is employed for completing a part of installation work only: 8A/10A: use technology A; 8B/10B: use technology B.

Table 1
Probability distributions for outcomes

Table 2
Stochastic dominance relations for decisions in feasible states in period 2