Termination Analysis of Probabilistic Programs Through Positivstellensatz’s
Abstract
We consider nondeterministic probabilistic programs with the most basic liveness property of termination. We present efficient methods for termination analysis of nondeterministic probabilistic programs with polynomial guards and assignments. Our approach is through the synthesis of polynomial ranking supermartingales, which on one hand significantly generalize linear ranking supermartingales, and on the other hand are a counterpart of polynomial ranking functions for proving termination of non-probabilistic programs. The approach synthesizes polynomial ranking supermartingales through Positivstellensatz’s, yielding an efficient method which is not only sound, but also semi-complete over a large subclass of programs. We present experimental results to demonstrate that our approach can handle several classical programs with complex polynomial guards and assignments, and can synthesize efficient quadratic ranking supermartingales when a linear one does not exist even for simple affine programs.
Keywords
Ranking Function · Assignment Statement · Nested Loop · Semidefinite Programming · Probabilistic Program

1 Introduction
Probabilistic Programs. Classic imperative programs extended with random-value generators give rise to probabilistic programs. Probabilistic programs provide the appropriate framework to model applications ranging from randomized algorithms [17, 38], to stochastic network protocols [5, 34], to robot planning [30, 33], etc. Nondeterminism plays a crucial role in modeling, e.g., to model behaviors over which there is no control, or for abstraction. Thus nondeterministic probabilistic programs are crucial in a huge range of problems, and hence their formal analysis has been studied across disciplines, such as probability theory and statistics [18, 28, 32, 39, 42], formal methods [5, 34], artificial intelligence [30, 31], and programming languages [10, 19, 21, 43].
Basic Termination Questions. Besides safety properties, the most basic properties for the analysis of programs are liveness properties. The most basic and widely used notion of liveness for programs is termination. In the absence of probability (i.e., for non-probabilistic programs), the synthesis of ranking functions and the proof of termination are equivalent [22], and numerous approaches exist for the synthesis of ranking functions for non-probabilistic programs [8, 13, 40, 48]. The most basic extension of the termination question to probabilistic programs is the almost-sure termination question, which asks whether a program terminates with probability 1. Another fundamental question is that of finite termination (aka positive almost-sure termination [7, 21]), which asks whether the expected termination time is finite. The next interesting question is the concentration bound computation problem, which asks to compute a bound M such that the probability that the termination time stays below M is concentrated, or in other words, the probability that the termination time exceeds the bound M decreases exponentially.
Previous Results. We discuss the relevant previous results for termination analysis of probabilistic programs.

Probabilistic Programs. First, quantitative invariants were introduced to establish termination of discrete probabilistic programs with demonic nondeterminism [35, 36]. This was extended in [10] to ranking supermartingales, resulting in a sound (but not complete) approach to prove almost-sure termination of probabilistic programs without nondeterminism, but with integer- and real-valued random variables drawn from distributions such as the uniform, Gaussian, and Poisson distributions. For probabilistic programs with countable state-space and without nondeterminism, Lyapunov ranking functions provide a sound and complete method for proving finite termination [7, 23]. Another sound method is to explore bounded termination with exponential decrease of probabilities [37] through abstract interpretation [15]. For probabilistic programs with nondeterminism, a sound and complete characterization of finite termination through ranking supermartingales is obtained in [21]. Ranking supermartingales thus provide a very powerful approach for termination analysis of probabilistic programs.

Ranking Functions/Supermartingales Synthesis. Synthesis of linear ranking functions and linear ranking supermartingales has been studied extensively [10, 12, 13, 40]. In the context of probabilistic programs, algorithmic synthesis of linear ranking supermartingales has been studied for probabilistic programs (cf. [10]) and for probabilistic programs with nondeterminism (cf. our previous result [12]). The major technique adopted in these results is Farkas’ Lemma [20], which serves as a complete reasoning method for linear inequalities. Beyond linear ranking functions, polynomial ranking functions have also been considered. Heuristic synthesis methods for polynomial ranking functions are studied in [4, 9]: Babic et al. [4] check termination of deterministic polynomial programs by detecting divergence of program variables, and Bradley et al. [9] extend this to nondeterministic programs through an analysis of finite differences over transitions. More general methods for deterministic polynomial programs are given in [14, 47], where Cousot [14] uses Lagrangian relaxation and Shen et al. [47] use Putinar’s Positivstellensatz [41]. Complete methods for synthesizing polynomial ranking functions for nondeterministic programs are studied by Yang et al. [50], where a complete method through root classification/real root isolation of semi-algebraic systems and quantifier elimination is proposed.
To summarize, while many different approaches have been studied, the algorithmic synthesis of ranking supermartingales for probabilistic programs has been limited to linear ranking supermartingales (cf. [10, 12]). Hence there is no algorithmic approach to handle nonlinear ranking supermartingales even for probabilistic programs without nondeterminism.

Our Contributions. Our main contributions are as follows:
 1.
Polynomial Ranking Supermartingales. First, we extend the notion of linear ranking supermartingales (LRSMs) to polynomial ranking supermartingales (pRSMs). We show (by a straightforward extension of LRSMs) that the existence of a pRSM implies both almost-sure and finite termination.
 2.
Positivstellensatz’s. Second, we conduct a detailed investigation of the application of Positivstellensatz’s (German for “positive-locus-theorem”, which is related to polynomials over semialgebraic sets) (cf. Sect. 5.1) to the synthesis of pRSMs over nondeterministic probabilistic programs. To the best of our knowledge, this is the first result which demonstrates the synthesis of a polynomial subclass of ranking supermartingales through Positivstellensatz’s.
 3.
New Approach for Non-probabilistic Programs. Our results also extend existing results for non-probabilistic programs. We present the first result that uses Schmüdgen’s Positivstellensatz [45] and Handelman’s Theorem [25] to synthesize polynomial ranking functions for non-probabilistic programs.
 4.
Efficient Approach. The previous complete method [50] suffers from high computational complexity due to the use of quantifier elimination. In contrast, our approach (sound but not complete) is efficient since the synthesis can be accomplished through linear or semidefinite programming, which can mostly be solved in polynomial time in the problem size [24]. In particular, our approach does not require quantifier elimination, and works for nondeterministic probabilistic programs.
 5.
Experimental Results. We demonstrate the effectiveness of our approach on several classical examples. We show that on classical examples, such as Gambler’s Ruin, and Random Walk, our approach can synthesize a pRSM efficiently. For these examples, LRSMs do not exist, and many of them cannot be analysed efficiently by previous approaches.
In summary, while Farkas’ Lemma and Motzkin’s Transposition Theorem are standard techniques for synthesizing linear ranking functions or linear ranking supermartingales, they are not sufficient for synthesizing polynomial ranking supermartingales. To address this problem, we study the use of Positivstellensatz’s, for the first time, to synthesize polynomial ranking supermartingales for probabilistic programs (for some of the theorems, for the first time even for non-probabilistic programs), and show how they can be used for efficient termination analysis of programs. Due to space restrictions, some technical details are available only in the full version [11].
2 Probabilistic Programs
2.1 Basic Notations and Concepts
For a set A, we denote by \(|A|\) the cardinality of A. We denote by \(\mathbb {N}\), \(\mathbb {N}_0\), \(\mathbb {Z}\), and \(\mathbb {R}\) the sets of all positive integers, nonnegative integers, integers, and real numbers, respectively. We use boldface notation for vectors, e.g. \({{\varvec{x}}}\), \({{\varvec{y}}}\), etc., and we denote the i-th component of a vector \({{\varvec{x}}}\) by \({{\varvec{x}}}[i]\).
Polynomial Predicates. Let X be a finite set of variables endowed with a fixed linear order under which we have \(X=\{x_1,\dots ,x_{|X|}\}\). We denote the set of real-coefficient polynomials by \({\mathfrak {R}}{\left[ x_1,\dots , x_{|X|}\right] }\) or \({\mathfrak {R}}{\left[ X\right] }\). A polynomial constraint over X is a logical formula of the form \({g_1}{\bowtie }{g_2}\), where \(g_1,g_2\) are polynomials over X and \(\bowtie \in \{<,\le ,>,\ge \}\). A propositional polynomial predicate over X is a propositional formula all of whose atomic propositions are either true, false or polynomial constraints over X. The validity of the satisfaction assertion \({{\varvec{x}}}\models \phi \) between a vector \({{\varvec{x}}}\in \mathbb {R}^{|X|}\) (interpreted in the way that the value for \(x_j\) \((1\le j\le |X|)\) is \({{\varvec{x}}}[j]\)) and a propositional polynomial predicate \(\phi \) is defined in the standard way w.r.t. polynomial evaluation and the usual semantics of logical connectives. The satisfaction set of a propositional polynomial predicate \(\phi \) is defined as \(\llbracket {\phi }\rrbracket :=\{{{\varvec{x}}}\in \mathbb {R}^{|X|}\mid {{\varvec{x}}}\models \phi \}\). For more on polynomials (e.g., polynomial evaluation and arithmetic over polynomials), we refer to the textbook [29, Chapter 3].
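To make the satisfaction relation concrete, here is a minimal sketch (our own illustration, not from the paper) of evaluating \({{\varvec{x}}}\models \phi \) for a small predicate over two variables:

```python
# Our own minimal illustration: a propositional polynomial predicate phi over
# X = {x1, x2}; a vector x is a tuple with x[0] the value of x1, x[1] of x2.

def phi(x):
    # phi := (x1^2 + x2^2 <= 4) and (x1 > 0 or x2 > 0)
    return (x[0] ** 2 + x[1] ** 2 <= 4) and (x[0] > 0 or x[1] > 0)

# Membership in the satisfaction set [[phi]] is just evaluation of phi.
inside = phi((1.0, 1.0))
outside = phi((-1.0, -1.0))
```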
Probability Space. A probability space is a triple \((\varOmega ,\mathcal {F},\mathbb {P})\), where \(\varOmega \) is a nonempty set (the so-called sample space), \(\mathcal {F}\) is a \(\sigma \)-algebra over \(\varOmega \) (i.e., a collection of subsets of \(\varOmega \) that contains the empty set \(\emptyset \) and is closed under complementation and countable union), and \(\mathbb {P}\) is a probability measure on \(\mathcal {F}\), i.e., a function \(\mathbb {P}:\mathcal {F}\rightarrow [0,1]\) such that (i) \(\mathbb {P}(\varOmega )=1\) and (ii) for every sequence \(A_1,A_2,\dots \in \mathcal {F}\) of pairwise disjoint sets (i.e., \(A_i \cap A_j = \emptyset \) whenever \(i\ne j\)) it holds that \(\sum _{i=1}^{\infty }\mathbb {P}(A_i)=\mathbb {P}\left( \bigcup _{i=1}^{\infty } A_i\right) \).
Random Variables and Filtrations. A random variable X in a probability space \((\varOmega ,\mathcal {F},\mathbb {P})\) is an \(\mathcal {F}\)-measurable function \(X:\varOmega \rightarrow \mathbb {R}\cup \{-\infty ,+\infty \}\), i.e., a function satisfying the condition that for all \(d\in \mathbb {R}\cup \{-\infty ,+\infty \}\), the set \(\{\omega \in \varOmega \mid X(\omega )\le d\}\) belongs to \(\mathcal {F}\). The expected value of a random variable X, denoted by \(\mathbb {E}(X)\), is defined as the Lebesgue integral of X with respect to \(\mathbb {P}\), i.e., \(\mathbb {E}(X):=\int X\,\mathrm {d}\mathbb {P}\); the precise definition of the Lebesgue integral is somewhat technical and is omitted here (cf. [6, Chapter 5] for a formal definition). A filtration of a probability space \((\varOmega ,\mathcal {F},\mathbb {P})\) is an infinite sequence \(\{\mathcal {F}_n \}_{n\in \mathbb {N}_0}\) of \(\sigma \)-algebras over \(\varOmega \) such that \(\mathcal {F}_n \subseteq \mathcal {F}_{n+1} \subseteq \mathcal {F}\) for all \(n\in \mathbb {N}_0\).
2.2 Probabilistic Programs
The Syntax. The class of probabilistic programs we consider encompasses basic programming mechanisms such as assignment statements (indicated by ‘:=’), while-loops and if-branches, basic probabilistic mechanisms such as probabilistic branches (indicated by ‘prob’) and random sampling, and demonic nondeterminism indicated by ‘\(\star \)’. Variables (or identifiers) of a probabilistic program are of real type, i.e., values of the variables are real numbers; moreover, variables are classified into program and sampling variables, where program variables receive their values through assignment statements and sampling variables do so through random sampling. We consider each sampling variable r to be bounded, i.e., associated with a one-dimensional cumulative distribution function \(\Upsilon _r\) and a nonempty bounded interval \(\mathrm {supp}_{r}\) such that any random variable z which respects \(\Upsilon _r\) lies in the bounded interval with probability 1. Due to space restrictions, details (e.g., the grammar) are relegated to the full version [11]. An example probabilistic program is illustrated in Example 1.
Example 1
Consider the running example depicted in Fig. 1, where r is a sampling variable with the two-point distribution \(\{-1\mapsto 0.5, 1\mapsto 0.5\}\), i.e., the probabilities to take the values \(-1\) and 1 are both 0.5. The probabilistic program models a scenario of Gambler’s Ruin where the gambler starts with money x and repeats gambling until either his money exceeds 10 or he loses all his money. The result of a single gamble is resolved nondeterministically: either the gambler wins or loses 1 with probability 0.5 each (the nondeterministic branch, through the sampling variable r), or he loses 1 with probability 0.51 and wins 1 with probability 0.49 (the probabilistic branch). The numbers 1–7 on the left are the program counters for the program, where 1 is the initial program counter and 7 the terminal program counter.
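The program’s behaviour can be explored by straightforward Monte Carlo simulation. The following sketch is our own illustration (the scheduler policy and counting of loop iterations are assumptions, not the paper’s semantics): the demonic choice is resolved by a scheduler, here instantiated to always take the fair sampling branch.

```python
import random

def run(x, scheduler, rng):
    # One run of the Gambler's Ruin program; returns the number of loop
    # iterations until termination (x < 1 or x > 10).
    steps = 0
    while 1 <= x <= 10:
        if scheduler(x):                      # demonic branch: x := x + r
            x += rng.choice((-1, 1))          # r ~ two-point {-1, +1}, p = 0.5 each
        else:                                 # probabilistic branch
            x += -1 if rng.random() < 0.51 else 1
        steps += 1
    return steps

rng = random.Random(0)
# Resolve the demonic choice by always taking the sampling branch (an assumption).
times = [run(5, lambda x: True, rng) for _ in range(2000)]
avg_steps = sum(times) / len(times)
```

With this scheduler the program behaves as a fair random walk absorbed outside [1, 10]; the empirical mean number of iterations is finite and modest, consistent with finite termination.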
The Semantics. We use control flow graphs to capture the semantics of probabilistic programs, which we define below.
Definition 1
(Control Flow Graph). A control flow graph (CFG) is a tuple \(\mathcal {G}=( L ,\bot ,(X,R),\mapsto )\) with the following components:

\( L \) is a finite set of labels partitioned into four pairwise disjoint subsets \( L _\mathrm {d}\), \( L _\mathrm {p}\), \( L _\mathrm {c}\) and \( L _\mathrm {a}\) of demonic, probabilistic, conditional-branching (branching for short) and assignment labels, respectively; and \(\bot \) is a special label not in \( L \), called the terminal label;

\(X\) and \(R\) are disjoint finite sets of realvalued program and sampling variables respectively;

\(\mapsto \) is a transition relation in which every member (called a transition) is a tuple of the form \((\ell ,\alpha ,\ell ')\) for which \(\ell \) (resp. \(\ell '\)) is the source label (resp. target label) in \( L \) and \(\alpha \) is either a real number in (0, 1) if \(\ell \in L _\mathrm {p}\), or \(\star \) if \(\ell \in L _\mathrm {d}\), or a propositional polynomial predicate if \(\ell \in L _\mathrm {c}\), or an update function \(f:\mathbb {R}^{|X|}\times \mathbb {R}^{|R|}\rightarrow \mathbb {R}^{|X|}\) if \(\ell \in L _\mathrm {a}\).
W.l.o.g., we assume that \( L \subseteq \mathbb {N}_0\). Intuitively, labels in \( L _\mathrm {d}\) correspond to demonic statements indicated by ‘\(\star \)’; labels in \( L _\mathrm {p}\) correspond to probabilistic-branching statements indicated by ‘prob’; labels in \( L _\mathrm {c}\) correspond to conditional-branching statements indicated by some propositional polynomial predicate; labels in \( L _\mathrm {a}\) correspond to assignments indicated by ‘\(:=\)’; and the terminal label \(\bot \) denotes the termination of a program. The transition relation \(\mapsto \) specifies the transitions between labels together with the additional information specific to different types of labels. The update functions are interpreted as follows: we first fix two linear orders on \(X\) and \(R\) so that \(X= \{x_1,\dots ,x_{|X|}\}\) and \(R= \{r_1,\dots ,r_{|R|}\}\), interpreting each vector \({{\varvec{x}}}\in \mathbb {R}^{|X|}\) (resp. \({{\varvec{r}}}\in \mathbb {R}^{|R|}\)) as a valuation of program (resp. sampling) variables in the sense that the value of \(x_j\) (resp. \(r_j\)) is \({{\varvec{x}}}[j]\) (resp. \({{\varvec{r}}}[j]\)); then each update function f is interpreted as a function which transforms a valuation \({{\varvec{x}}}\in \mathbb {R}^{|X|}\) before the execution of an assignment statement into \(f({{\varvec{x}}},{{\varvec{r}}})\) after the execution, where \({{\varvec{r}}}\) is the valuation on \(R\) obtained from a sampling before the execution of the assignment statement.
It is intuitively clear that any probabilistic program can be naturally transformed into a CFG. Informally, each label represents a program location in an execution of a probabilistic program for which the statement of the program location is the next to be executed (see Fig. 2).
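As an illustration, one plausible (hypothetical) encoding of such a CFG for the running example, with labels 1–7 as in Fig. 1 and label 7 playing the role of the terminal label \(\bot \):

```python
# A hypothetical encoding (our own sketch) of the CFG of the running example.
# Each transition is (source, alpha, target): alpha is a predicate at
# conditional labels, "*" at demonic labels, a probability at probabilistic
# labels, and an update function f(x, r) at assignment labels.

TRANSITIONS = [
    (1, lambda x: 1 <= x <= 10, 2),       # guard holds: enter the loop body
    (1, lambda x: x < 1 or x > 10, 7),    # guard fails: go to the terminal label
    (2, "*", 3),                          # demonic choice: sampling branch
    (2, "*", 4),                          # demonic choice: probabilistic branch
    (3, lambda x, r: x + r, 1),           # x := x + r
    (4, 0.51, 5),
    (4, 0.49, 6),
    (5, lambda x, r: x - 1, 1),           # x := x - 1
    (6, lambda x, r: x + 1, 1),           # x := x + 1
]

# Outgoing probabilities at the probabilistic label must sum to 1.
prob_mass = sum(a for (s, a, t) in TRANSITIONS if s == 4)
update = TRANSITIONS[4][1]                # the update of label 3: x := x + r
```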
In the rest of the section, we fix a probabilistic program P with the set \(X= \{x_1,\dots ,x_{|X|}\}\) of program variables and the set \(R= \{r_1,\dots ,r_{|R|}\}\) of sampling variables, and let \(\mathcal {G}=( L ,\bot ,(X,R),\mapsto )\) be its associated CFG. We also fix \(\ell _0\) to be the label corresponding to the first statement to be executed in P, and \({{\varvec{x}}}_0\) to be the initial valuation of the program variables.
The Semantics. A configuration (for P) is a tuple \((\ell ,{{\varvec{x}}})\) where \(\ell \in L \cup \{\bot \}\) and \({{\varvec{x}}}\in \mathbb {R}^{|X|}\). A finite path (of P) is a finite sequence of configurations \((\ell _0,{{\varvec{x}}}_0),\dots ,(\ell _k,{{\varvec{x}}}_k)\) such that for all \(0 \le i < k\), either (i) \(\ell _{i+1}=\ell _i=\bot \) and \({{\varvec{x}}}_i={{\varvec{x}}}_{i+1}\) (i.e., the program has terminated); or (ii) there exist \((\ell _i,\alpha ,\ell _{i+1})\in \mapsto \) and \({{\varvec{r}}}\in \{{{\varvec{r}}}'\mid \forall r\in R.\ {{\varvec{r}}}'[r]\in \mathrm {supp}_{r}\}\) such that one of the following conditions holds: (a) \(\ell _i\in L _\mathrm {p}\cup L _\mathrm {d}\) and \({{\varvec{x}}}_i={{\varvec{x}}}_{i+1}\) (probabilistic or demonic transitions), (b) \(\ell _i\in L _\mathrm {c}\), \({{\varvec{x}}}_i={{\varvec{x}}}_{i+1}\) and \({{\varvec{x}}}_i\models \alpha \) (conditional-branch transitions), (c) \(\ell _i\in L _\mathrm {a}\) and \({{\varvec{x}}}_{i+1}=\alpha ({{\varvec{x}}}_i,{{\varvec{r}}})\) (assignment transitions). A run (of P) is an infinite sequence of configurations all of whose finite prefixes are finite paths of P. A configuration \((\ell ,{{\varvec{x}}})\) is reachable from the initial configuration \((\ell _0,{{\varvec{x}}}_0)\) if there exists a finite path \((\ell _0,{{\varvec{x}}}_0),\dots ,(\ell _k,{{\varvec{x}}}_k)\) such that \((\ell ,{{\varvec{x}}})=(\ell _k,{{\varvec{x}}}_k)\).
The probabilistic feature of P can be captured by constructing a suitable probability measure over the set of all its runs. However, before this can be done, nondeterminism in P needs to be resolved by some scheduler.
Definition 2
(Scheduler). A scheduler (for P) is a function which assigns to every finite path \((\ell _0,{{\varvec{x}}}_0),\dots ,(\ell _k,{{\varvec{x}}}_k)\) with \(\ell _k\in L _\mathrm {d}\) a transition in \(\mapsto \) with source label \(\ell _k\).
The behaviour of P under a scheduler \(\sigma \) is standard: at each step, P first samples a real number for each sampling variable and then evolves to the next step according to its CFG or the scheduler choice. In this way, the scheduler and random choices/samplings produce a run over P. Moreover, each scheduler \(\sigma \) induces a unique probability measure \(\mathbb {P}^{\sigma }\) over the runs of P. In the sequel, we will use \(\mathbb {E}^{\sigma }(\cdot )\) to denote the expected values of random variables under \(\mathbb {P}^{\sigma }\).
Random Variables and Filtrations over Runs. We define the following (vectors of) random variables on the set of runs of P: \(\{\theta ^P_n\}_{n\in \mathbb {N}_0},~\{\overline{{{\varvec{x}}}}^P_{n}\}_{n\in \mathbb {N}_0}\) and \(\{\overline{{{\varvec{r}}}}^P_{n}\}_{n\in \mathbb {N}_0}\): each \(\theta ^P_n\) is the random variable representing the (integervalued) label at the nth step; each \(\overline{{{\varvec{x}}}}^P_{n}\) is the vector of random variables such that each \(\overline{{{\varvec{x}}}}^P_{n}[i]\) is the random variable representing the value of the program variable \(x_i\) at the nth step; and each \(\overline{{{\varvec{r}}}}^P_{n}[i]\) is the random variable representing the sampled value of the sampling variable \(r_i\) at the nth step. The filtration \(\{\mathcal {H}^P_n\}_{n\in \mathbb {N}_0}\) is defined such that each \(\sigma \)algebra \(\mathcal {H}^P_n\) is the smallest \(\sigma \)algebra that makes all random variables in \(\{\theta ^P_k\}_{0\le k\le n}\) and \(\{\overline{{{\varvec{x}}}}^P_{k}\}_{0\le k\le n}\) measurable. We will omit the superscript P in all the notations above if it is clear from the context.
Remark 1
Under the condition that each sampling variable is bounded, using an inductive argument it follows that each \(\overline{{{\varvec{x}}}}_{n}\) is a vector of bounded random variables. Thus \(\mathbb {E}^\sigma ({}{\overline{{{\varvec{x}}}}_n[i]}{})\) exists for each random variable \(\overline{{{\varvec{x}}}}_n[i]\).
Below we define the notion of polynomial invariants, which logically capture all reachable configurations. A polynomial invariant may be obtained through abstract interpretation [15].
Definition 3
(Polynomial Invariant). A polynomial invariant (for P) is a function \(I\) assigning a propositional polynomial predicate over \(X\) to every label in \(\mathcal {G}\) such that for all configurations \((\ell ,{{\varvec{x}}})\) reachable from \((\ell _0,{{\varvec{x}}}_0)\) in \(\mathcal {G}\), it holds that \({{\varvec{x}}}\models I(\ell )\).
3 Termination over Probabilistic Programs
In this section, we first define the notions of almostsure/finite termination and concentration bounds over probabilistic programs, and then describe the computational problems studied in this paper. Below we fix a probabilistic program P with its associated CFG \(\mathcal {G}=( L ,\bot ,(X,R),\mapsto )\) and an initial configuration \((\ell _0,{{\varvec{x}}}_0)\) for P.
Definition 4
(Termination [7, 12, 21]). A run \(\omega =\{(\ell _n,{{\varvec{x}}}_n)\}_{n\in \mathbb {N}_0}\) over P is terminating if \(\ell _n=\bot \) for some \(n\in \mathbb {N}_0\). The termination time of P is a random variable \(T_P\) such that for each run \(\omega =\{(\ell _n,{{\varvec{x}}}_n)\}_{n\in \mathbb {N}_0}\), \(T_P(\omega )\) is the least number n such that \(\ell _n=\bot \) if such n exists, and \(\infty \) otherwise. The program P is said to be almostsure terminating (resp. finitely terminating) if \(\mathbb {P}^\sigma (T_P<\infty )=1\) (resp. \(\mathbb {E}^\sigma (T_P)<\infty \)) for all schedulers \(\sigma \) (for P).
Note that \(\mathbb {E}^\sigma (T_P)<\infty \) implies that \(\mathbb {P}^\sigma (T_P<\infty )=1\), but the converse does not necessarily hold (see [10, Example 5] for an example). To measure the expected values of the termination time under all (demonic) schedulers, we further define the quantity \(\mathsf {ET}(P):=\sup _{\sigma }\mathbb {E}^{\sigma }(T_P)\).
Definition 5
(Concentration on Termination Time [12, 37]). A concentration bound for P is a nonnegative integer M such that there exist real constants \(c_1\ge 0\) and \(c_2>0\) for which, for all \(N \ge M\), we have \(\mathbb {P}(T_P>N)\le c_1\cdot e^{-c_2\cdot N}\).
Informally, a concentration bound characterizes exponential decrease of probability values of nontermination beyond the bound. On one hand, it can be used to give an upper bound on probability of nontermination beyond a large step; and on the other hand, it leads to an algorithm that approximates \(\mathsf {ET}(P)\) (cf. [12, Theorem 5]).
Computational Problems. We consider the following computational problems:

Input: a probabilistic program P, a polynomial invariant \(I\) for P and an initial configuration \((\ell _0,{{\varvec{x}}}_0)\) for P;

Output (AlmostSure/Finite Termination): “\(\text{ yes }\)” if the algorithm finds that P is almostsure/finite terminating and “\(\text{ fail }\)” otherwise;

Output (Concentration on Termination): a concentration bound if the algorithm finds one and “\(\text{ fail }\)” otherwise.
4 Polynomial Ranking Supermartingales
In this section, we develop the notion of polynomial ranking supermartingales, which extend linear ranking supermartingales [10, 12]. We fix a probabilistic program P, a polynomial invariant I for P and an initial configuration \((\ell _0,{{\varvec{x}}}_0)\) for P. Let \(\mathcal {G}=( L ,\bot ,(X,R),\mapsto )\) be the associated CFG of P, with \(X= \{x_1,\dots ,x_{|X|}\}\) and \(R= \{r_1,\dots ,r_{|R|}\}\). We first present the general notion of ranking supermartingales, and then define polynomial ranking supermartingales.
Definition 6
(Ranking Supermartingale [12, 21]). A discrete-time stochastic process \(\{X_n\}_{n\in \mathbb {N}_0}\) w.r.t. a filtration \(\{\mathcal {F}_n\}_{n\in \mathbb {N}_0}\) is a ranking supermartingale (RSM) if there exist \(K<0\) and \(\epsilon >0\) such that for all \(n\in \mathbb {N}_0\), we have \(\mathbb {E}(X_n)<\infty \) and it holds almost surely (i.e., with probability 1) that \(X_n\ge K\) and \(\mathbb {E}(X_{n+1}\mid \mathcal {F}_n)\le X_n-\epsilon \cdot \mathbf {1}_{X_n\ge 0}\), where \(\mathbb {E}(X_{n+1}\mid \mathcal {F}_n)\) is the conditional expectation of \(X_{n+1}\) given \(\mathcal {F}_n\) (cf. [49, Chapter 9]).
Informally, a polynomial ranking supermartingale over P is a polynomial instantiation of an RSM through a function \(\eta :( L \cup \{\bot \})\times \mathbb {R}^{|X|}\rightarrow \mathbb {R}\) which satisfies that each \(\eta (\ell ,\cdot )\) (for \(\ell \in L \cup \{\bot \}\)) is essentially a polynomial function over \(X\). Given such a function \(\eta \), the intuition is to impose conditions that make the stochastic process \(X_n=\eta (\theta _n,\overline{{{\varvec{x}}}}_n)\) an RSM. To ensure this, we consider the conditional expectation \(\mathbb {E}^\sigma \left( X_{n+1}\mid \mathcal {H}_n\right) \); this is captured by an extension of pre-expectation [10, 12] from the linear to the polynomial case. Below we define \( L _{\bot }:= L \cup \{\bot \}\). For a function \(g:\mathbb {R}^{|X|}\times \mathbb {R}^{|R|}\rightarrow \mathbb {R}\), we let \(\mathbb {E}_R(g,\cdot ):\mathbb {R}^{|X|}\rightarrow \mathbb {R}\) be the function such that each \(\mathbb {E}_R(g,{{\varvec{x}}})\) is the expected value \(\mathbb {E}(g({{\varvec{x}}},\hat{{{\varvec{r}}}}))\), where \(\hat{{{\varvec{r}}}}\) is any vector of independent random variables such that each \(\hat{{{\varvec{r}}}}[i]\) respects the cumulative distribution function \(\Upsilon _{r_i}\).
Definition 7
(Pre-Expectation). Let \(\eta : L _\bot \times \mathbb {R}^{|X|}\rightarrow \mathbb {R}\) be a function such that each \(\eta (\ell ,\cdot )\) is a polynomial over \(X\). The pre-expectation of \(\eta \) is the function \(\mathrm {pre}_\eta : L _\bot \times \mathbb {R}^{|X|}\rightarrow \mathbb {R}\) defined as follows:

\(\mathrm {pre}_\eta (\ell ,{{\varvec{x}}}):=\sum _{(\ell ,z,\ell ')\in \mapsto } z\cdot \eta \left( \ell ',{{\varvec{x}}}\right) \) if \(\ell \in L _\mathrm {p}\) (probabilistic transitions);

\(\mathrm {pre}_\eta (\ell ,{{\varvec{x}}}):=\max _{(\ell ,\star ,\ell ')\in \mapsto }\eta (\ell ',{{\varvec{x}}})\) if \(\ell \in L _\mathrm {d}\) (nondeterministic transitions);

\(\mathrm {pre}_\eta (\ell ,{{\varvec{x}}}):=\eta (\ell ',{{\varvec{x}}})\) if \(\ell \in L _\mathrm {c}\) and \((\ell ,\phi ,\ell ')\) is the only transition in \(\mapsto \) such that \({{\varvec{x}}}\models \phi \) (conditional transitions);

\(\mathrm {pre}_\eta (\ell ,{{\varvec{x}}}):=\mathbb {E}_{R}\left( g,{{\varvec{x}}}\right) \) if \(\ell \in L _{\mathrm {a}}\), where g is the function such that \(g({{\varvec{x}}},{{\varvec{r}}})=\eta \left( \ell ',f({{\varvec{x}}},{{\varvec{r}}})\right) \) and \((\ell ,f,\ell ')\) is the only transition in \(\mapsto \) (assignment transitions); and

\(\mathrm {pre}_\eta (\ell ,{{\varvec{x}}}):=\eta (\ell ,{{\varvec{x}}})\) if \(\ell =\bot \) (terminal location).
The following lemma establishes the relationship between preexpectation and conditional expectation.
Lemma 1
Let \(\eta : L _\bot \times \mathbb {R}^{|X|}\rightarrow \mathbb {R}\) be a function such that each \(\eta (\ell ,\cdot )\) (for all \(\ell \in L _\bot \)) is a polynomial function over \(X\), and let \(\sigma \) be any scheduler. Define the stochastic process \(\{X_n\}_{n\in \mathbb {N}_0}\) by \(X_{n}:=\eta (\theta _{n},\overline{{{\varvec{x}}}}_{n})\). Then for all \(n\in \mathbb {N}_0\), we have \(\mathbb {E}^{\sigma }(X_{n+1}\mid \mathcal {H}_n)\le \mathrm {pre}_\eta (\theta _{n},\overline{{{\varvec{x}}}}_{n})\).
Example 2
Consider the running example in Example 1 with the CFG in Fig. 2. Let \(\eta \) be the function specified in the \(\eta \) columns of Table 1, where \(g(x):=(x-1)(10-x)\). Then \(\mathrm {pre}_\eta \) is given in the \(\mathrm {pre}_\eta \) columns of Table 1. Note that the case for \(i=2\) is obtained from \(\mathrm {pre}_\eta (2, x)=\max \{g(x)+9.6,g(x)+9.6\}\), and the case for \(i=3\) from \(\mathrm {pre}_\eta (3, x)=\mathbb {E}_R(h, x)\), where h is the function \(h(y,r)= g(y)-(2y-11)\cdot r-r^2+10\).
Table 1. The functions \(\eta \) and \(\mathrm {pre}_\eta \) for the running example, where \(g(x):=(x-1)(10-x)\).

i  \(\eta (i,x)\)  \(\mathrm {pre}_\eta (i,x)\)

1  \(g(x)+10\)  \(\mathbf {1}_{1\le x\le 10}\cdot (g(x)+9.8)+\mathbf {1}_{x<1\vee x>10}\cdot (-0.2)\)
2  \(g(x)+9.8\)  \(g(x)+9.6\)
3  \(g(x)+9.6\)  \(g(x)+9\)
4  \(g(x)+9.6\)  \(g(x)+0.04x+8.98\)
5  \(g(x)+2x-1.8\)  \(g(x)+2x-2\)
6  \(g(x)-2x+20.2\)  \(g(x)-2x+20\)
7  \(-0.2\)  \(-0.2\)
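The entries of Table 1 can be checked mechanically. The following sketch (our own, assuming the two-point distribution of r from Example 1) verifies the rows for \(i=3\) and \(i=4\) numerically:

```python
def g(x):
    return (x - 1) * (10 - x)

def eta1(x):                       # eta(1, x) = g(x) + 10
    return g(x) + 10

def h(y, r):                       # h(y, r) = eta(1, y + r), expanded
    return g(y) - (2 * y - 11) * r - r * r + 10

def pre3(x):                       # E_R(h, x) with r in {-1, +1}, p = 0.5 each
    return 0.5 * h(x, 1) + 0.5 * h(x, -1)

def pre4(x):                       # 0.51 * eta(5, x) + 0.49 * eta(6, x)
    return 0.51 * (g(x) + 2 * x - 1.8) + 0.49 * (g(x) - 2 * x + 20.2)

for x in (1.0, 3.5, 7.0, 10.0):
    assert abs(h(x, 1) - eta1(x + 1)) < 1e-9              # h is eta(1, .) after x := x + r
    assert abs(pre3(x) - (g(x) + 9)) < 1e-9               # row i = 3
    assert abs(pre4(x) - (g(x) + 0.04 * x + 8.98)) < 1e-9 # row i = 4
```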
We now define the notion of a polynomial ranking supermartingale. The intuition is that we encode the RSM-difference condition as a logical formula, treat zero as the threshold between terminal and non-terminal labels, and use the invariant I to over-approximate the set of reachable configurations at each label. Below, for each \(\ell \in L _\mathrm {c}\), we define \(\mathsf {PP}(\ell )\) to be the propositional polynomial predicate \(\bigvee _{(\ell ,\phi ,\ell ')\in \mapsto , \ell '\ne \bot }\phi \); and for \(\ell \in L \backslash L _\mathrm {c}\), we let \(\mathsf {PP}(\ell ):=\text{ true }\).
Definition 8
(Polynomial Ranking Supermartingale). Let \(d\in \mathbb {N}\). A d-polynomial ranking supermartingale (d-pRSM) w.r.t (P, I) is a function \(\eta : L _\bot \times \mathbb {R}^{|X|}\rightarrow \mathbb {R}\) for which there exist constants \(\epsilon >0\) and \(K<0\) such that for all \(\ell \in L _\bot \) and all \({{\varvec{x}}}\in \mathbb {R}^{|X|}\), the following conditions hold:

C1: the function \(\eta (\ell ,\cdot ):\mathbb {R}^{|X|}\rightarrow \mathbb {R}\) is a polynomial over \(X\) of degree at most d;

C2: if \(\ell \ne \bot \) and \({{\varvec{x}}}\models I(\ell )\), then \(\eta (\ell ,{{\varvec{x}}})\ge 0\);

C3: if \(\ell =\bot \), then \(\eta (\ell , {{\varvec{x}}})=K\);

C4: if \(\ell \ne \bot \) and \({{\varvec{x}}}\models I(\ell )\wedge \mathsf {PP}(\ell )\), then \(\mathrm {pre}_\eta (\ell ,{{\varvec{x}}})\le \eta (\ell ,{{\varvec{x}}})-\epsilon \).
Note that C2 and C3 together separate non-termination and termination by the threshold 0, and C4 is the RSM-difference condition, which is intuitively related to the \(\epsilon \)-decrease in the RSM definition (cf. Definition 6). By generalizing our previous proofs in [12] (from LRSMs to pRSMs), we establish the soundness of pRSMs w.r.t. both almost-sure and finite termination.
Theorem 1
If there exists a d-pRSM \(\eta \) w.r.t (P, I) with constants \(\epsilon ,K\) (cf. Definition 8), then P is a.s. terminating and \(\mathsf {ET}(P)\le \mathsf {UB}(P):=\frac{\eta (\ell _0,{{\varvec{x}}}_0)-K}{\epsilon }\).
Example 3
Consider the running example (cf. Example 1) and the function \(\eta \) given in Example 2. Assuming that the initial valuation satisfies \(1\le x\wedge x\le 10\), we assign the invariant I such that \(I(1)=0\le x\wedge x\le 11\), \(I(j)=1\le x\wedge x\le 10\) for \(2\le j\le 6\) and \(I(7)=x<1\vee x>10\). It is straightforward to verify that \(\eta \) is a 2-pRSM with \(\epsilon =0.2\) and \(K=-0.2\) (cf. Definition 8 for \(\epsilon , K\)). Hence by Theorem 1, the program in Example 1 terminates almost-surely under any scheduler and its expected termination time is at most \(5\cdot (x_0-1)\cdot (10-x_0)+51\), given the initial value \(x_0\).
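As a numerical sanity check (our own sketch, not the paper’s synthesis procedure), conditions C2 and C4 of Definition 8 can be tested on a grid over the invariant for labels 2–6, together with the bound \(\mathsf {UB}(P)\) of Theorem 1 for \(x_0=5\):

```python
def g(x):
    return (x - 1) * (10 - x)

# eta from Example 2 (label 7 is the terminal one) and pre_eta restricted
# to the invariant 1 <= x <= 10.
eta = {1: lambda x: g(x) + 10, 2: lambda x: g(x) + 9.8,
       3: lambda x: g(x) + 9.6, 4: lambda x: g(x) + 9.6,
       5: lambda x: g(x) + 2 * x - 1.8, 6: lambda x: g(x) - 2 * x + 20.2,
       7: lambda x: -0.2}
pre = {2: lambda x: g(x) + 9.6, 3: lambda x: g(x) + 9,
       4: lambda x: g(x) + 0.04 * x + 8.98,
       5: lambda x: g(x) + 2 * x - 2, 6: lambda x: g(x) - 2 * x + 20}

EPS = 0.2
for k in range(91):                 # x = 1.0, 1.1, ..., 10.0
    x = 1 + k / 10
    for l in range(2, 7):
        assert eta[l](x) >= 0                          # C2 on the invariant
        assert pre[l](x) <= eta[l](x) - EPS + 1e-9     # C4: eps-decrease

# Upper bound of Theorem 1 for x0 = 5, with K = -0.2:
ub = (eta[1](5.0) - (-0.2)) / EPS
```

The computed `ub` matches \(5\cdot (5-1)\cdot (10-5)+51=151\) up to floating-point rounding.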
Remark 2
The running example (cf. Example 1) does not admit a linear (i.e., degree-1) pRSM since the expected value of r at label 3 is 0. This indicates that linear RSMs may not exist even over simple affine programs like Example 1, and thus motivates the study of pRSMs even for simple affine programs.
Remark 3
The non-strict inequality symbol ‘\(\ge \)’ in C2 can be replaced by its strict counterpart ‘>’ since \(\eta +c\) (\(c>0\)) remains a pRSM if \(\eta \) is a pRSM and K (in C3) is sufficiently small. (By definition, \(\mathrm {pre}_{\eta +c}=\mathrm {pre}_\eta +c\).) Moreover, the non-strict inequality symbol ‘\(\le \)’ in C4 can be replaced by ‘<’ since a pRSM \(\eta \) and a constant K (for C3) can be scaled by a constant factor (e.g. 1.1) so that strict inequalities are ensured. Moreover, one can also assume that \(K=-1\) and \(\epsilon =1\) in Definition 8. This is because one can first scale a pRSM with constants \(\epsilon , K\) by a positive scalar to ensure that \(\epsilon =1\), and then safely set \(K=-1\) due to C2.
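The normalization in Remark 3 can be checked numerically; the sketch below uses the constants of Example 3 and is only an illustration of the scaling argument:

```python
# Remark 3's normalization: scaling a pRSM eta (and with it eps, K) by a
# positive factor preserves conditions C1-C4, so one may assume eps = 1, K = -1.
eps, K = 0.2, -0.2                     # constants from Example 3
c = 1 / eps                            # scale so that eps becomes 1
eps_scaled, K_scaled = c * eps, c * K  # eta itself is scaled by the same factor c
assert eps_scaled == 1.0
assert K_scaled == -1.0                # here K lands exactly at -1; in general
                                       # a smaller K can then be set to -1 via C2
```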
Theorem 1 answers the questions of almost-sure and finite termination in a unified fashion. Generalizing our approach in [12], we show that by restricting a pRSM to have bounded difference, we also obtain concentration results.
Definition 9

for all \(\ell \in L _\mathrm {d}\cup L _\mathrm {p}\) and \((\ell ,\alpha ,\ell ')\in \mapsto \), and for all \({{\varvec{x}}}\in {\!\!}\llbracket {I(\ell )}\rrbracket {\!\!}\), it holds that \(a\le \eta (\ell ',{{\varvec{x}}})-\eta (\ell ,{{\varvec{x}}})\le b\);

for all \(\ell \in L _\mathrm {c}\) and \((\ell ,\phi ,\ell ')\in \mapsto \), and for all \({{\varvec{x}}}\in {\!\!}\llbracket {I(\ell )\wedge \phi }\rrbracket {\!\!}\), it holds that \(a\le \eta (\ell ',{{\varvec{x}}})-\eta (\ell ,{{\varvec{x}}})\le b\);

for all \(\ell \in L _\mathrm {a}\) and \((\ell ,f,\ell ')\in \mapsto \), for all \({{\varvec{x}}}\in {\!\!}\llbracket {I(\ell )}\rrbracket {\!\!}\) and for all \({{\varvec{r}}}\in \{{{\varvec{r}}}'\mid \forall r\in R.\ {{\varvec{r}}}'[r]\in \mathrm {Supp}_r\}\), it holds that \(a\le \eta (\ell ',f({{\varvec{x}}},{{\varvec{r}}}))-\eta (\ell ,{{\varvec{x}}})\le b\).
Note that if a d-pRSM \(\eta \) with constants \(\epsilon ,K\) (cf. Definition 8) is difference-bounded w.r.t. [a, b], then from the definition \(a\le -\epsilon \); one can further assume that \(\epsilon \le b\) since otherwise one can reset \(\epsilon :=b\). By definition, the stochastic process \(X_n:=\eta (\theta _n, \overline{{{\varvec{x}}}}_n)\) defined through a difference-bounded pRSM w.r.t. [a, b] satisfies \(a\le X_{n+1}-X_n\le b\); then using Hoeffding’s Inequality [12, 26], we establish a concentration bound.
Theorem 2
Let \(\eta \) be a difference-bounded d-pRSM w.r.t. [a, b] with constants \(\epsilon \) and K. For all \(n\in \mathbb {N}\), if \(\epsilon \cdot (n-1)>\eta (\ell _0,{{\varvec{x}}}_0)\), then \(\mathbb {P}(T_P > n)\le e^{-\frac{2(\epsilon \cdot (n-1)-\eta (\ell _0,{{\varvec{x}}}_0))^2}{(n-1)(b-a)^2}}\).
From Theorem 2, a difference-bounded d-pRSM \(\eta \) yields an exponentially decreasing bound on \(\mathbb {P}(T_P>n)\) for all \(n\ge \frac{\eta (\ell _0,{{\varvec{x}}}_0)}{\epsilon }+2\).
Example 4

for all \(x\in [1,10]\), \(\eta (2,x)-\eta (1,x)=-0.2\);

for all \(x\in [0,1)\cup (10,11]\), \(-10.2\le \eta (7,x)-\eta (1,x)\le -0.2\);

for all \(x\in [1,10]\) and \(i\in \{3,4\}\), \(\eta (i,x)-\eta (2,x)=-0.2\);

for all \(x\in [1,10]\) and \(i\in \{5,6\}\), \(-9.4\le \eta (i,x)-\eta (4,x)\le 8.6\);

for all \(x\in [1,10]\), \(\eta (1,x-1)-\eta (5,x)=-0.2\);

for all \(x\in [1,10]\), \(\eta (1,x+1)-\eta (6,x)=-0.2\);

for all \(x\in [1,10]\) and \(r\in \{-1,1\}\), \(-9.6\le \eta (1,x+r)-\eta (3,x)\le 8.4\).
Then by Theorem 2, assuming that the program has initial value \(x_0=5\), one can deduce that \(\mathbb {P}\left( T_P>50000\right) \le e^{-\frac{2\cdot (0.2\cdot 49999-30)^2}{49999\cdot 18.8^2}}\approx 1.3016\cdot 10^{-5}\).
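The tail bound of Theorem 2 is easy to evaluate mechanically; the following sketch recomputes the figure above from the constants of Example 4 (\(\epsilon =0.2\), \(\eta (1,5)=30\), \([a,b]=[-10.2,8.6]\)):

```python
from math import exp

def tail_bound(n, eps, eta0, a, b):
    """Theorem 2: P(T_P > n) <= exp(-2*(eps*(n-1) - eta0)^2 / ((n-1)*(b-a)^2)),
    valid when eps*(n-1) > eta0."""
    assert eps * (n - 1) > eta0
    return exp(-2 * (eps * (n - 1) - eta0) ** 2 / ((n - 1) * (b - a) ** 2))

# Example 4: eps = 0.2, eta(1,5) = 30, all differences lie in [a,b] = [-10.2, 8.6].
p = tail_bound(50000, 0.2, 30.0, -10.2, 8.6)
print(p)  # approximately 1.30e-05, matching the value in Example 4
```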
We end this section with a result stating that whether a (difference-bounded) d-pRSM exists can be decided (using quantifier elimination).
Theorem 3
For any fixed natural number \(d\in \mathbb {N}\), the problem whether a (difference-bounded) d-pRSM w.r.t. an input pair (P, I) exists is decidable.
5 The Synthesis Algorithm
In this section, we present an efficient algorithmic approach for solving the almost-sure/finite termination and concentration questions through synthesis of pRSMs. Instead of computationally expensive quantifier elimination (cf. Theorem 3), we use Positivstellensatz’s, which are sound but not complete. Note that by Theorem 1, the existence of a pRSM implies both almost-sure and finite termination of a probabilistic program.
Example 5
Consider again the program in Example 1 with its CFG, and the invariant specified in Example 3. The instances of the pattern for termination of this program are listed as follows, where each instance is represented by a pair \((\varGamma ,g)\) in which \(\varGamma \) and g correspond to \(\{g_1,\dots ,g_m\}\) and g, respectively, as described in (\(\dag \)).

(C4, label 1) \((\{x-1,10-x,x,11-x\}, \eta (1,x)-\eta (2,x)-\epsilon )\);

(C4, label 2) \((\{x-1,10-x\}, \eta (2,x)-\eta (3,x)-\epsilon )\) and \((\{x-1,10-x\}, \eta (2,x)-\eta (4,x)-\epsilon )\);

(C4, label 3) \((\{x-1,10-x\}, \eta (3,x)-\mathbb {E}_R((y,r)\mapsto \eta (1,y+r), x)-\epsilon )\);

(C4, label 4) \((\{x-1,10-x\}, \eta (4,x)-0.51\cdot \eta (5,x)-0.49\cdot \eta (6,x)-\epsilon )\);

(C4, label 5) \((\{x-1,10-x\}, \eta (5,x)-\eta (1, x-1)-\epsilon )\);

(C4, label 6) \((\{x-1,10-x\}, \eta (6,x)-\eta (1, x+1)-\epsilon )\);

(C2) \((\{x,11-x\}, \eta (1,x))\) and \((\{x-1,10-x\}, \eta (j,x))\) for \(2\le j\le 6\).
In the next part, we show that such pattern instances can be solved through Positivstellensatz’s.
5.1 Positivstellensatz’s
We fix a linearly ordered finite set X of variables and a finite set \(\varGamma =\{g_1,\dots ,g_m\}\subseteq {\mathfrak {R}}{\left[ X\right] }\) of polynomials. Let \({\!\!}\llbracket {\varGamma }\rrbracket {\!\!}\) be the set of all vectors \({{\varvec{x}}}\in \mathbb {R}^{X}\) satisfying the propositional polynomial predicate \(\bigwedge _{i=1}^m g_i\ge 0\). We first define preorderings and sums of squares as follows.
Definition 10
Definition 11
Remark 4
It is well-known that a real-coefficient polynomial g of degree 2d is a sum of squares iff there exists a k-dimensional positive semidefinite real square matrix Q such that \(g={{\varvec{y}}}^\mathrm {T} Q{{\varvec{y}}}\), where k is the number of monomials of degree no greater than d and \({{\varvec{y}}}\) is the column vector of all such monomials (cf. [27, Corollary 7.2.9]). This implies that the problem whether a given polynomial (with real coefficients) is a sum of squares can be solved by semidefinite programming [24].
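A tiny illustration of Remark 4: for the toy polynomial \(g=1+2x+x^2=(1+x)^2\) (our own example, not one of the paper’s benchmarks), a Gram matrix certificate can be checked by hand, using the fact that a symmetric \(2\times 2\) matrix is positive semidefinite iff its trace and determinant are nonnegative:

```python
# g(x) = 1 + 2x + x^2 is a sum of squares, namely (1+x)^2, because
# g = y^T Q y with y = (1, x) and a positive semidefinite Gram matrix Q.
Q = [[1.0, 1.0],
     [1.0, 1.0]]

# A symmetric 2x2 matrix is PSD iff its trace and determinant are both >= 0.
trace = Q[0][0] + Q[1][1]
det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
assert trace >= 0 and det >= 0

# Expanding y^T Q y with y = (1, x) gives
#   Q[0][0] + (Q[0][1] + Q[1][0]) * x + Q[1][1] * x^2,
# which must match the coefficients of g.
coeffs = (Q[0][0], Q[0][1] + Q[1][0], Q[1][1])
assert coeffs == (1.0, 2.0, 1.0)  # g = 1 + 2x + x^2
```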
Now we present the first Positivstellensatz, called Schmüdgen’s Positivstellensatz.
Theorem 4
(Schmüdgen’s Positivstellensatz [45]). Let \(g\in {\mathfrak {R}}{\left[ X\right] }\). If the set \({\!\!}\llbracket {\varGamma }\rrbracket {\!\!}\) is compact and \(g({{\varvec{x}}})>0\) for all \({{\varvec{x}}}\in {\!\!}\llbracket {\varGamma }\rrbracket {\!\!}\), then \(g\in \text{ PO }(\varGamma )\).
Example 6
Theorem 4 can be further refined by a weaker version of Putinar’s Positivstellensatz.
Theorem 5
Similarly to Eq. (\(\ddag \)), Eq. (\(\S \)) results in a system of linear equalities that involves the variables for synthesis of a pRSM and matrices of variables under semidefinite constraints, provided that an upper bound on the degrees of the sums of squares is enforced.
Example 7
In the following, we introduce a Positivstellensatz entitled Handelman’s Theorem, which applies when \(\varGamma \) consists of only linear (degree 1) polynomials. For Handelman’s Theorem, we assume that \(\varGamma \) consists of only linear polynomials and that \({\!\!}\llbracket {\varGamma }\rrbracket {\!\!}\) is nonempty. (Note that whether a system of linear inequalities has a solution is decidable in PTIME [46].)
Definition 12
Theorem 6
To apply Handelman’s theorem, we consider a natural number which serves as a bound on the number of multiplicands allowed to form an element of \(\text{ Monoid }(\varGamma )\); then Eq. (\(\#\)) results in a system of linear equalities involving \(a_1,\dots ,a_d\). Unlike the previous Positivstellensatz’s, the form of Handelman’s theorem allows us to construct a system of linear equalities free from semidefinite constraints.
Example 8
Consider \(X=\{x\}\) and \(\varGamma =\{1-x,1+x\}\). Fix the maximal number of multiplicands in an element of \(\text{ Monoid }(\varGamma )\) to be 2. Then the form of Eq. (\(\#\)) can be rewritten as \(g=\sum _{i=1}^6 a_i\cdot u_i\) where \(u_1=1\), \(u_2=1-x\), \(u_3=1+x\), \(u_4=1-x^2\), \(u_5=1-2x+x^2\), \(u_6=1+2x+x^2\) and each \(a_i\) (\(1\le i\le 6\)) is required to be a nonnegative real number.
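To see how Eq. (\(\#\)) becomes a plain linear system, the sketch below picks the illustrative target \(g=2-x^2\) (an assumption of ours, positive on \([-1,1]\)) and verifies one feasible choice of nonnegative multipliers \(a_i\) over the \(u_i\) of Example 8:

```python
# Example 8 made concrete: on [-1,1] (Gamma = {1-x, 1+x}), write
# g(x) = 2 - x^2 as a nonnegative combination of the products u_1..u_6.
# Polynomials are coefficient lists [c0, c1, c2] for c0 + c1*x + c2*x^2.
u = [
    [1, 0, 0],    # u1 = 1
    [1, -1, 0],   # u2 = 1 - x
    [1, 1, 0],    # u3 = 1 + x
    [1, 0, -1],   # u4 = (1-x)(1+x) = 1 - x^2
    [1, -2, 1],   # u5 = (1-x)^2
    [1, 2, 1],    # u6 = (1+x)^2
]
# One feasible choice of nonnegative multipliers: g = 1*u1 + 1*u4.
a = [1, 0, 0, 1, 0, 0]
combo = [sum(ai * ui[k] for ai, ui in zip(a, u)) for k in range(3)]
assert combo == [2, 0, -1]        # coefficients of g = 2 - x^2
assert all(ai >= 0 for ai in a)   # the Handelman side conditions
```

In the general algorithm these \(a_i\) are unknowns, and matching coefficients of g against the combination yields exactly the kind of linear (in)equality system mentioned above.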
5.2 The Algorithm for pRSM Synthesis
Based on the Positivstellensatz’s introduced in the previous part, we present our algorithm for synthesis of pRSMs. Below, we fix an input probabilistic program P, an input polynomial invariant \(I\) and an input initial configuration \((\ell _0,{{\varvec{x}}}_0)\) for P. Let \(\mathcal {G}=( L ,\bot ,(X,R),\mapsto )\) be the associated CFG of P.
Description of the Algorithm PRSMSynth. We present a succinct description of the key ideas. The description of the key steps of the algorithm is as follows.
 1.
Template \(\eta \) for a pRSM. The algorithm fixes a natural number d as the maximal degree for a pRSM, constructs \(\mathcal {M}_d\) as the set of all monomials over X of degree no greater than d, and sets up a template d-pRSM \(\eta \) such that each \(\eta (\ell ,\cdot )\) is the polynomial \(\sum _{h\in \mathcal {M}_d} a_{h,\ell }\cdot h\) where each \(a_{h,\ell }\) is a (distinct) scalar variable (cf. C1).
 2.
Bound for Sums of Squares and Monoid Multiplicands. The algorithm fixes a natural number k as the maximal degree for a sum of squares (cf. Schmüdgen’s and Putinar’s Positivstellensatz) or as the maximal number of multiplicands in a monoid element (cf. Handelman’s Theorem).
 3.
RSM-Difference and Terminating-Negativity. Following Remark 3, the algorithm fixes \(\epsilon \) to be 1 (cf. condition C4) and K to be \(-1\) (cf. condition C3).
 4.
Computation of pre-expectation \(\mathrm {pre}_\eta \). With \(\epsilon ,K\) fixed to be resp. \(1,-1\) in the previous step, the algorithm computes \(\mathrm {pre}_\eta \) by Definition 7; all of the involved coefficients are linear combinations of the \(a_{h,\ell }\)’s.
 5.
Pattern Extraction. The algorithm extracts instances conforming to pattern (\(\dag \)) from C2, C4 and formulae presented in Definition 9, and translates them into systems of linear equalities over variables among \(a_{h,\ell }\)’s, \(\epsilon \), K, and extra matrices of variables assumed to be positive semidefinite (cf. Schmüdgen’s and Putinar’s Positivstellensatz) or scalar variables assumed to be nonnegative (cf. Handelman’s Theorem) through Eqs. (\(\ddag \)), (\(\S \)) and (\(\#\)).
 6.
Solution via Semidefinite or Linear Programming. The algorithm calls semidefinite programming (for Schmüdgen’s and Putinar’s Positivstellensatz) or linear programming (for Handelman’s Theorem) in order to check feasibility or to optimize \(\mathsf {UB}(P)\) (cf. Theorem 1 for the upper bound on \(\mathsf {ET}(P)\)) over all variables among the \(a_{h,\ell }\)’s and the extra matrix/scalar variables from Eqs. (\(\ddag \)), (\(\S \)) and (\(\#\)). Note that feasibility implies the existence of a (difference-bounded) d-pRSM; the existence of a d-pRSM in turn implies finite termination, and the existence of a difference-bounded d-pRSM in turn implies a concentration bound through Theorem 2.
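Step 1 of the algorithm can be sketched as follows; the helper below merely enumerates the monomial set \(\mathcal {M}_d\) (there are \(\binom{n+d}{d}\) monomials for n variables, hence that many template coefficients \(a_{h,\ell }\) per label), and is an illustration rather than the actual implementation:

```python
from itertools import combinations_with_replacement

def monomials(variables, d):
    """All monomials over `variables` of degree at most d, as tuples of factors;
    the empty tuple stands for the constant monomial 1."""
    result = []
    for k in range(d + 1):
        result.extend(combinations_with_replacement(variables, k))
    return result

# For |X| = 2 variables and degree bound d = 2 there are C(2+2, 2) = 6 monomials:
# 1, x, y, x^2, xy, y^2 -- so the template has 6 coefficients a_{h,l} per label.
M = monomials(['x', 'y'], 2)
print(len(M))  # 6
```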
The soundness of our algorithm is as follows.
Theorem 7
(Soundness). Any function \(\eta \) synthesized through the algorithm PRSMSynth is a valid pRSM.
Remark 5
(Efficiency). It is well-known that for semidefinite programs with a positive real number R bounding the Frobenius norm of any feasible solution, an approximate solution up to precision \(\epsilon \) can be computed in time polynomial in the size of the semidefinite program (with rational numbers encoded in binary), \(\log R\) and \(\log \epsilon ^{-1}\) [24]. Thus, our sound approach presents an efficient method for the analysis of many probabilistic programs. Moreover, when each propositional polynomial predicate in the probabilistic program involves only linear polynomials, the sound form of Handelman’s theorem can be applied, resulting in feasibility checking of systems of linear inequalities rather than semidefinite constraints. By polynomial-time algorithms for solving systems of linear inequalities [46], our approach runs in polynomial time (and is thus efficient) over such programs.
Remark 6
(Semi-Completeness). Consider probabilistic programs of the following form: \(\mathbf{while}~\phi ~\mathbf{do}~\mathbf{if}~\star ~\mathbf{then}~P_1~\mathbf{else}~P_2~\mathbf{od}\), where \(P_1,P_2\) are single assignments, \({\!\!}\llbracket {\phi }\rrbracket {\!\!}\) is compact, and the invariant assigns to each label a propositional polynomial predicate in DNF that involves no strict inequality (i.e. no ‘<’ or ‘>’). Upon such inputs, our approach is semi-complete in the sense that by raising the upper bounds for the degree of a sum of squares and the number of multiplicands in a monoid element, the algorithm PRSMSynth will eventually find a pRSM if one exists. This is because Theorems 4 to 6 are “semi-complete” when \({\!\!}\llbracket {\varGamma }\rrbracket {\!\!}\) is compact, as the terminal label can be handled separately by \(\mathsf {PP}(\cdot )\) so that only compact \(\varGamma \)’s for Positivstellensatz’s may be formed, and the difference between strict and non-strict inequalities does not matter (cf. Remark 3).
6 Experimental Results
In this section, we present experimental results for our algorithm through the semidefinite programming tool SOSTOOLS [3] (which uses SeDuMi [1]) and the linear programming tool CPLEX [2]. Due to space constraints, the detailed descriptions of the input probabilistic programs are in [11].
Experimental Setup. We consider six classical examples of probabilistic programs that exhibit distinct types of nonlinear behaviours. Our examples are: Logistic Map, adopted from [14], which was previously handled by Lagrangian relaxation and semidefinite programming whereas our approach uses linear programming; Decay, which models a sequence of points converging stochastically to the origin; Random Walk, which models a random walk within a bounded region defined through nonlinear curves; Gambler’s Ruin, which is our running example (Example 1); Gambler’s Ruin Variant, a variant of Example 1; and Nested Loop, a nested loop with stochastic increments. Except for Gambler’s Ruin Variant and Nested Loop, our approach is semi-complete for all other examples (cf. Remark 6). In all the examples the invariants are straightforward and were manually integrated with the input. Since SOSTOOLS only produces numerical results, we modify “\(\eta (\ell ,{{\varvec{x}}})\ge 0\)” in C2 to “\(\eta (\ell ,{{\varvec{x}}})\ge 1\)” for Putinar’s or Schmüdgen’s Positivstellensatz and check whether the maximal numerical error of all equalities added to SOSTOOLS is sufficiently small over a bounded region. In our examples, the bounded region is \(\{(x,y)\mid x^2+y^2\le 2\}\) and the maximal numerical error should not exceed 1. Note that 1 is also our fixed \(\epsilon \) in C4, and by Remark 3, the modification of C2 is not restrictive. Alternatively, one may also pursue Sylvester’s Criterion (cf. [27, Theorem 7.2.5]) to check membership in sums of squares by checking whether a square matrix is positive semidefinite.
Experimental results

Example  Method  SOSTOOLS time  Error  \(\eta (\ell _0,\cdot )\)
Decay  Putinar  0.1248 s  \(\le 10^{-9}\)  \(5282.3435x^2 + 5282.3435y^2 + 1\)
Random Walk  Schmüdgen  0.7176 s  \(\le 10^{-7}\)  \(-300x^2 - 300y^2 + 601\)

Example  Method  CPLEX time  —  \(\eta (\ell _0,\cdot )\)
Gambler’s Ruin  Handelman  \(\le 10^{-2}\) s  —  \(33x-3x^2\)
Gambler’s Ruin V  Handelman  \(\le 10^{-2}\) s  —  \(21+100x-70y-100x^2+100xy\)
Logistic Map  Handelman  \(\le 10^{-2}\) s  —  \(1000-500.7496x\)
Nested Loop  Handelman  \(\le 2\cdot 10^{-2}\) s  —  \(48 + 160n + (m-x)(800n+240)\)
For all the examples we consider except Logistic Map, almost-sure termination cannot be established by previous approaches. For the Logistic Map example, our reduction is to linear programming whereas existing approaches [14, 47] reduce to semidefinite programming.
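As a sanity check of the Random Walk entry, reading its polynomial as \(601-300x^2-300y^2\) (our reading of the table), the modified condition C2 (\(\eta \ge 1\)) indeed holds on the region \(x^2+y^2\le 2\), with the minimum value 1 attained on the boundary:

```python
# Check eta(x, y) = 601 - 300x^2 - 300y^2 >= 1 on the disc x^2 + y^2 <= 2,
# which is the modified condition C2 over the bounded region used above.
def eta(x, y):
    return 601 - 300 * x ** 2 - 300 * y ** 2

steps = 100
vals = []
for i in range(-steps, steps + 1):
    for j in range(-steps, steps + 1):
        x, y = i / steps * 1.5, j / steps * 1.5
        if x * x + y * y <= 2:          # sample a grid over the disc
            vals.append(eta(x, y))
assert min(vals) >= 1 - 1e-9            # eta never drops below 1 on the disc
assert abs(eta(1.0, 1.0) - 1.0) < 1e-9  # and equals 1 on the boundary point (1,1)
```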
7 Conclusion and Future Work
In this paper, we extended linear ranking supermartingales (LRSMs) for probabilistic programs, proposed in [10, 12], to polynomial ranking supermartingales (pRSMs) for nondeterministic probabilistic programs. We developed the notion of (difference-bounded) pRSMs and proved that it is sound for almost-sure and finite termination, as well as for concentration bounds (Theorems 1 and 2). We then developed an efficient (sound but not complete) algorithm for synthesizing pRSMs through Positivstellensatz’s (cf. Sect. 5.1), proved its soundness (Theorem 7) and argued its semi-completeness (Remark 6) over an important class of programs. Finally, our experiments demonstrate the effectiveness of our synthesis approach over various classical probabilistic programs where LRSMs do not exist (cf. Example 1 and Remark 2). Directions of future work are to explore (a) more elegant methods for the numerical problems related to semidefinite programming, and (b) other forms of RSMs for more general classes of probabilistic programs.
Acknowledgements
We thank the anonymous referees for valuable comments. We also thank Hui Kong for his help with SOSTOOLS. The research was partly supported by Austrian Science Fund (FWF) NFN Grant No. S11407-N23 (RiSE/SHiNE), ERC Start Grant (279307: Graph Games), ERC Advanced Grant (267989: QUAREM), and Natural Science Foundation of China (NSFC) under Grant No. 61532019.
References
1. SeDuMi 1.3 (2008). http://sedumi.ie.lehigh.edu/
2. IBM ILOG CPLEX Optimizer, Interactive Optimizer Community Edition 12.6.3.0 (2010). http://www01.ibm.com/software/integration/optimization/cplexoptimizer/
3. SOSTOOLS v3.00 (2013). http://www.cds.caltech.edu/sostools/
4. Babic, D., Cook, B., Hu, A.J., Rakamaric, Z.: Proving termination of nonlinear command sequences. Form. Asp. Comput. 25(3), 389–403 (2013)
5. Baier, C., Katoen, J.P.: Principles of Model Checking. MIT Press, Cambridge (2008)
6. Billingsley, P.: Probability and Measure, 3rd edn. Wiley, New York (1995)
7. Bournez, O., Garnier, F.: Proving positive almost-sure termination. In: Giesl, J. (ed.) RTA 2005. LNCS, vol. 3467, pp. 323–337. Springer, Heidelberg (2005)
8. Bradley, A.R., Manna, Z., Sipma, H.B.: Linear ranking with reachability. In: Etessami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 491–504. Springer, Heidelberg (2005)
9. Bradley, A.R., Manna, Z., Sipma, H.B.: Termination of polynomial programs. In: Cousot [16], pp. 113–129
10. Chakarov, A., Sankaranarayanan, S.: Probabilistic program analysis with martingales. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 511–526. Springer, Heidelberg (2013)
11. Chatterjee, K., Fu, H., Goharshady, A.K.: Termination analysis of probabilistic programs through Positivstellensatz’s (2016). arXiv CoRR: http://arxiv.org/abs/1604.07169
12. Chatterjee, K., Fu, H., Novotný, P., Hasheminezhad, R.: Algorithmic analysis of qualitative and quantitative termination problems for affine probabilistic programs. In: POPL, pp. 327–342. ACM (2016)
13. Colón, M.A., Sipma, H.B.: Synthesis of linear ranking functions. In: Margaria, T., Yi, W. (eds.) TACAS 2001. LNCS, vol. 2031, pp. 67–81. Springer, Heidelberg (2001)
14. Cousot, P.: Proving program invariance and termination by parametric abstraction, Lagrangian relaxation and semidefinite programming. In: Cousot [16], pp. 1–24
15. Cousot, P., Cousot, R.: Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In: POPL, pp. 238–252. ACM (1977)
16. Cousot, R. (ed.): VMCAI 2005. LNCS, vol. 3385. Springer, Heidelberg (2005)
17. Dubhashi, D., Panconesi, A.: Concentration of Measure for the Analysis of Randomized Algorithms, 1st edn. Cambridge University Press, New York (2009)
18. Durrett, R.: Probability: Theory and Examples, 2nd edn. Duxbury Press, Belmont (1996)
19. Esparza, J., Gaiser, A., Kiefer, S.: Proving termination of probabilistic programs using patterns. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 123–138. Springer, Heidelberg (2012)
20. Farkas, J.: A Fourier-féle mechanikai elv alkalmazásai (Hungarian). Mathematikai és Természettudományi Értesitö 12, 457–472 (1894)
21. Fioriti, L.M.F., Hermanns, H.: Probabilistic termination: soundness, completeness, and compositionality. In: POPL, pp. 489–501. ACM (2015)
22. Floyd, R.W.: Assigning meanings to programs. Math. Asp. Comput. Sci. 19, 19–33 (1967)
23. Foster, F.G.: On the stochastic matrices associated with certain queuing processes. Ann. Math. Stat. 24(3), 355–360 (1953)
24. Grötschel, M., Lovász, L., Schrijver, A.: Geometric Algorithms and Combinatorial Optimization. Springer, Heidelberg (1993)
25. Handelman, D.: Representing polynomials by positive linear functions on compact convex polyhedra. Pacific J. Math. 132, 35–62 (1988)
26. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)
27. Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, Cambridge (2013)
28. Howard, H.: Dynamic Programming and Markov Processes. MIT Press, Cambridge (1960)
29. Hungerford, T.W.: Algebra. Springer, Heidelberg (1974)
30. Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1), 99–134 (1998)
31. Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: a survey. J. Artif. Intell. Res. 4, 237–285 (1996)
32. Kemeny, J., Snell, J., Knapp, A.: Denumerable Markov Chains. D. Van Nostrand Company, Princeton (1966)
33. Kress-Gazit, H., Fainekos, G.E., Pappas, G.J.: Temporal-logic-based reactive mission and motion planning. IEEE Trans. Robot. 25(6), 1370–1381 (2009)
34. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer, Heidelberg (2011)
35. McIver, A.K., Morgan, C.: Developing and reasoning about probabilistic programs in pGCL. In: Cavalcanti, A., Sampaio, A., Woodcock, J. (eds.) PSSE 2004. LNCS, vol. 3167, pp. 123–155. Springer, Heidelberg (2006)
36. McIver, A., Morgan, C.: Abstraction, Refinement and Proof for Probabilistic Systems. Monographs in Computer Science. Springer, New York (2005)
37. Monniaux, D.: An abstract analysis of the probabilistic termination of programs. In: Cousot, P. (ed.) SAS 2001. LNCS, vol. 2126, pp. 111–126. Springer, Heidelberg (2001)
38. Motwani, R., Raghavan, P.: Randomized Algorithms. Cambridge University Press, New York (1995)
39. Paz, A.: Introduction to Probabilistic Automata. Computer Science and Applied Mathematics. Academic Press, New York (1971)
40. Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear ranking functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 239–251. Springer, Heidelberg (2004)
41. Putinar, M.: Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J. 42, 969–984 (1993)
42. Rabin, M.: Probabilistic automata. Inf. Control 6, 230–245 (1963)
43. Sankaranarayanan, S., Chakarov, A., Gulwani, S.: Static analysis for probabilistic programs: inferring whole program properties from finitely many paths. In: PLDI, pp. 447–458 (2013)
44. Scheiderer, C.: Positivity and sums of squares: a guide to recent results. In: Putinar, M., Sullivant, S. (eds.) Emerging Applications of Algebraic Geometry. IMA Vol. Math. Appl., vol. 149, pp. 271–324. Springer, New York (1996)
45. Schmüdgen, K.: The \({K}\)-moment problem for compact semi-algebraic sets. Math. Ann. 289, 203–206 (1991)
46. Schrijver, A.: Theory of Linear and Integer Programming. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, New York (1999)
47. Shen, L., Wu, M., Yang, Z., Zeng, Z.: Generating exact nonlinear ranking functions by symbolic-numeric hybrid method. J. Syst. Sci. Comput. 26(2), 291–301 (2013)
48. Sohn, K., Gelder, A.V.: Termination detection in logic programs using argument sizes. In: PODS, pp. 216–226. ACM Press (1991)
49. Williams, D.: Probability with Martingales. Cambridge University Press, Cambridge (1991)
50. Yang, L., Zhou, C., Zhan, N., Xia, B.: Recent advances in program verification through computer algebra. Front. Comput. Sci. China 4(1), 1–16 (2010)