Bayesian inference for psychology, part III: Parameter estimation in nonstandard models
Abstract
We demonstrate the use of three popular Bayesian software packages that enable researchers to estimate parameters in a broad class of models that are commonly used in psychological research. We focus on WinBUGS, JAGS, and Stan, and show how they can be interfaced from R and MATLAB. We illustrate the use of the packages through two fully worked examples; the examples involve a simple univariate linear regression and fitting a multinomial processing tree model to data from a classic false-memory experiment. We conclude with a comparison of the strengths and weaknesses of the packages. Our example code, data, and this text are available via https://osf.io/ucmaz/.
Keywords
WinBUGS, JAGS, Stan, Bayesian estimation, Bayesian inference
Introduction
In this special issue, Dienes (this issue) has argued that Bayesian methods are to be preferred over classical methods, Kruschke (this issue) and Etz and Vandekerckhove (this issue) have introduced the Bayesian philosophy and associated mathematics, and Love et al. (this issue; see also Love et al., 2015) and Wagenmakers et al. (this issue) described software that implements standard hypothesis tests within the Bayesian framework. In the present paper, we demonstrate the use of three popular software packages that enable psychologists to estimate parameters in formal models of varying complexity.
The mathematical foundations of Bayesian parameter estimation are not especially difficult—all that is involved are the elementary laws of probability theory to determine the posterior distribution of parameters given the data. Once the posterior distribution has been defined, the final hurdle of Bayesian parameter estimation is to compute descriptive statistics on the posterior. In order to obtain these descriptive statistics, one widely applicable strategy is to draw random samples from the posterior distribution using Markov chain Monte Carlo methods (MCMC; van Ravenzwaaij, this issue)—with sufficient posterior samples, descriptives on the sample set can substitute for actual quantities of interest.
In this article, we describe the use of three popular, general-purpose MCMC engines that facilitate the sampling process. We will focus on WinBUGS, JAGS, and Stan, and illustrate their use for parameter estimation in two popular models in psychology. The development of these software packages has greatly contributed to the increase in the prevalence of Bayesian methods in psychology over the past decade (e.g., Lee & Wagenmakers, 2013). The packages owe their popularity to their flexibility and usability; they allow researchers to build a large number of models of varying complexity using a relatively small set of sampling statements and deterministic transformations. Moreover, the packages have a smooth learning curve, are well documented, and are supported by a large community of users both within and outside of psychology. Their popularity notwithstanding, WinBUGS, JAGS, and Stan represent only a subclass of the many avenues to Bayesian analysis; the different avenues implement a trade-off between flexibility and accessibility. At one end of the spectrum, researchers may use off-the-shelf Bayesian software packages, such as JASP (Love et al., this issue; see also Love et al., 2015). JASP has an attractive and user-friendly graphical user interface, but presently it only supports standard hypothesis tests (see also Morey et al., 2015). At the other end of the spectrum, researchers may implement their own MCMC sampler, one that is tailored to the peculiarities of the particular model at hand (e.g., van Ravenzwaaij, this issue; Rouder & Lu, 2005). This approach provides tremendous flexibility, but it is time-consuming, labor-intensive, and requires expertise in computational methods. General-purpose MCMC engines, such as WinBUGS, JAGS, and Stan, are the middle-of-the-road alternatives to Bayesian analysis that provide a large degree of flexibility at a relatively low cost.
We begin with a short introduction of formal models as generative processes using a simple linear regression as an example. We then show how this model can be implemented in WinBUGS, JAGS, and Stan, with special emphasis on how the packages can be interacted with from R and MATLAB. We then turn to a more complex model, and illustrate the basic steps of Bayesian parameter estimation in a multinomial processing tree model for a false-memory paradigm. The WinBUGS, JAGS, and Stan code for all our examples is available in the Supplemental Materials at https://osf.io/ucmaz/. The discussion presents a comparison of the strengths and weaknesses of the packages and provides useful references to hierarchical extensions and Bayesian model selection methods using general-purpose MCMC software.
An introduction with linear regression
Specification of models as generative processes
Before we continue, it is useful to consider briefly what we mean by a formal model: A formal model is a set of formal statements about how the data come about. Research data are the realizations of some stochastic process, and as such they are draws from some random number generator whose properties are unknown. In psychology, the random number generator is typically a group of randomly selected humans who participate in a study, and the properties of interest are often differences in group means between conditions or populations (say, the difference in impulsivity between schizophrenia patients and controls) or other invariances and systematic properties of the data generation process. A formal model is an attempt to emulate the unknown random number generator in terms of a network of basic distributions.
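For concreteness, the linear regression that serves as the running example in this section can be written as a small network of basic distributions. The block below is a sketch reconstructed from the surrounding description: a Gaussian likelihood parameterized by a precision \(\tau\), a linear predictor \(\mu_i\), and priors on \(\beta_1\), \(\beta_2\), and \(\tau\). The gamma family for \(\tau\) follows the later discussion of the Stan model; the Gaussian priors on the coefficients and all hyperparameter values are illustrative assumptions, not values taken from the text. The second argument of \(\mathcal{N}\) is a precision, as in the BUGS language.

```latex
\begin{align*}
y_i     &\sim \mathcal{N}(\mu_i, \tau), \qquad i = 1, \dots, N \\
\mu_i   &= \beta_1 + \beta_2 x_i \\
\beta_1 &\sim \mathcal{N}(0, 0.001) \\
\beta_2 &\sim \mathcal{N}(0, 0.001) \\
\tau    &\sim \mathrm{Gamma}(0.001, 0.001)
\end{align*}
```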
This simple model also helps to introduce the types of variables that we have at our disposal. Variables can be stochastic, meaning that they are draws from some distribution. Stochastic variables can be either observed (i.e., data) or unobserved (i.e., unknown parameters). In this model, \(y\), \(\beta_1\), \(\beta_2\), and \(\tau\) are stochastic variables. Variables can also be deterministic, which means their values are completely determined by other variables. Here, \(\mu_i\) is determined as some combination of \(\beta_1\), \(\beta_2\), and \(x_i\). \(N\) is a constant.
Taken together, a Bayesian model can be thought of as a data-generation mechanism that is conditional on parameters: Bayesian models make predictions. In particular, the sampling statements (including the priors) in Eqs. 1, 4, 5, and 6 and the deterministic transformation in Eq. 2 fully define a generative model, because they are all that is needed to generate data from the model. The generative model thus formalizes the presumed process by which the data in an empirical study were generated.
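The idea that a model is a recipe for generating data can be made concrete with a few lines of code. The following is a minimal sketch, written in Python purely for illustration (it is not one of the engines discussed in this paper); the function name `simulate_regression` and the parameter values are hypothetical, and the five predictor values are the first five expected values of the toy data set. Given parameter values, it computes the deterministic node \(\mu_i\) and then draws the stochastic node \(y_i\) from a Gaussian whose precision is \(\tau\).

```python
import math
import random


def simulate_regression(x, beta1, beta2, tau, rng):
    """Generate one data set y from the regression model, given parameters."""
    sigma = 1.0 / math.sqrt(tau)          # convert precision to standard deviation
    y = []
    for x_i in x:
        mu_i = beta1 + beta2 * x_i        # deterministic node
        y.append(rng.gauss(mu_i, sigma))  # stochastic node
    return y


rng = random.Random(2024)                 # fixed seed for reproducibility
x = [51, 44, 57, 41, 53]                  # first five expected values (toy data)
# Hypothetical parameter values, chosen only for illustration.
y_sim = simulate_regression(x, beta1=0.0, beta2=1.0, tau=0.01, rng=rng)
print([round(v, 1) for v in y_sim])
```

Drawing the parameters themselves from their priors before calling the function would turn this into a simulation from the full generative model, priors included.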
A toy data set
Table 1. Example data set for linear regression
Expected (x)    Observed (y)    S-style data format
51  24  32      33  35  32      x <- c(51, 44, 57, 41, 53, 56,
44  21  42      55  18  31             49, 58, 50, 32, 24, 21,
57  23  27      49  14  37             23, 28, 22, 30, 29, 35,
41  28  38      56  31  17             18, 25, 32, 42, 27, 38,
53  22  32      58  13  11             32, 21, 21, 12, 29, 14)
56  30  21      61  23  24      y <- c(33, 55, 49, 56, 58, 61,
49  29  21      46  15  17             46, 82, 53, 33, 35, 18,
58  35  12      82  20  5              14, 31, 13, 23, 15, 20,
50  18  29      53  20  16             20, 33, 32, 31, 37, 17,
32  25  14      33  33  7              11, 24, 17, 5, 16, 7)
Implementing a generative model
The parameter beta[1] denotes the intercept (i.e., observed number of attendees for 0 expected attendees), beta[2] denotes the slope of the regression line (i.e., the increase in the observed number of attendees associated with a one-unit increase in the expected number of attendees), and tau represents the inverse of the error variance. This short piece of code maps exactly to the generative model for linear regression that we specified. Of course, since there is much more freedom in mathematical expression than there is in computer code, the point-to-point translation will not always be perfect, but it will typically be an excellent starting point.
In the code, deterministic variables are followed by the <- assignment operator. For instance, the line mu[i] <- beta[1] + beta[2] * x[i] specifies that the mu parameters are given by a linear combination of the stochastic beta variables and the observed data x. The # symbol is used for comments. The complete lists of distributions, functions, logical operators, and other programming constructs that are available in WinBUGS, JAGS, and Stan are given in their respective user manuals. BUGS is a declarative language, which means that the order of the statements in the model file is largely irrelevant. In contrast, in Stan, the order of statements matters. With the model translated from formal assumptions to BUGS language, the next step is to interact with the software and sample from the posterior distribution of the parameters.
WinBUGS graphical user interface
WinBUGS (Bayesian inference Using Gibbs Sampling for Windows; Lunn et al., 2000, 2009, 2012; Spiegelhalter et al., 2003; for an introduction see Kruschke, 2010, and Wagenmakers, 2013) is a standalone piece of software that is freely available at http://www.mrc-bsu.cam.ac.uk/bugs/. In this section, we give a brief description of the WinBUGS graphical user interface (GUI) using the linear regression model introduced above; later we illustrate how WinBUGS can be called from other software, such as R and MATLAB. For a detailed step-by-step introduction to the WinBUGS GUI, the reader is referred to Lee and Wagenmakers (2013).
To interact with WinBUGS via the GUI, users have to create a number of files. First, there is a model file that describes the generative specification of the model, second is the data file that contains the raw data, and third is an initial values file that contains some starting values for the sampling run.
The same data format is used to store the (optional, but strongly recommended) set of initial values for the unobserved stochastic variables. If initial values are not supplied, WinBUGS will generate these automatically by sampling from the prior distribution of the parameters. Automatically generated initial values can provide poor starting points for the sampling run and may result in numerical instability. If multiple MCMC chains are run in order to diagnose convergence problems, we encourage users to create a separate file for each set of initial values. As shown in Fig. 1c, we will run three chains, each with a different set of initial values, and store these in inits1.txt, inits2.txt, and inits3.txt.
1. Load the model file and check the model specification. To open the model file, go to File > Open and select linreg_model.txt in the appropriate directory. To check the syntax of the model specification, go to Model > Specification and open the Specification Tool window (Panel D in Fig. 1), activate the model file by clicking inside linreg_model.txt, click on check model, and wait for the message “model is syntactically correct” to appear in the status bar.
2. Load the data file. To open the data file, go to File > Open and select data.txt in the appropriate directory. To load the data, activate the data file, click on load data in the Specification Tool window, and wait for the message “data loaded” to appear in the status bar.
3. Compile the model. To compile the model, specify the number of MCMC chains in the box labeled num of chains in the Specification Tool window, click on compile, and wait for the message “model compiled” to appear in the status bar. In the linear regression example, we will run three MCMC chains, so we type “3” in the num of chains box.
4. Load the initial values. To open the file that contains the initial values for the first chain, go to File > Open and select inits1.txt in the appropriate directory. To load the first set of initial values, activate inits1.txt, click on load inits in the Specification Tool window, and wait for the message “chain initialized but other chain(s) contain uninitialized variables”. Repeat these steps to load the initial values for the second and third MCMC chain. After the third set of initial values is loaded, wait for the message “model is initialized” to appear in the status bar (Fig. 1e).
5. Choose the output type. To ensure that WinBUGS pastes all requested output in a single user-friendly log file, go to Output > Output options, open the Output options window, and select the log option (Fig. 2a).
6. Specify the parameters of interest. To specify the parameters that you want to draw inference about, go to Inference > Samples, open the Sample Monitor Tool window, type the names of the parameters one by one in the box labeled node, and click on set after each name (Fig. 2b). In the linear regression example, we will monitor the beta[1], beta[2], and tau parameters. To request dynamic trace plots of the progress of the sampling run, select the name of the parameters in the drop-down menu in the Sample Monitor Tool window and click on trace. WinBUGS will start to display the dynamic trace plots once the sampling has begun.
7. Specify the number of recorded samples. To specify the number of recorded samples per chain, fill in the boxes labeled beg, end, and thin in the Sample Monitor Tool window. In our linear regression example, we will record 500 posterior samples per chain for each parameter. We will discard the first 500 samples as burn-in and start recording samples from the 501st iteration (beg = 501); we will draw a total of 1,000 samples (end = 1000); and we will record each successive sample without thinning the chains (thin = 1).
8. Sample from the posterior distribution of the parameters. To sample from the posteriors, go to Model > Update, open the Update Tool window (Fig. 2c), fill in the total number of posterior samples per chain (i.e., 1,000) in the box labeled updates, specify the degree of thinning (i.e., 1) in the box labeled thin, click on update, and wait for the message “model is updating” to appear in the status bar.
9. Obtain the results of the sampling run. To obtain summary statistics and kernel density plots of the posterior distributions, select the name of the parameters in the drop-down menu in the Sample Monitor Tool window and click on stat and density. WinBUGS will print all requested output in the log file (Panel D in Fig. 2). The figures labeled “Dynamic trace” show trace plots of the monitored parameters; the three MCMC chains have mixed well and look identical to one another, indicating that the chains have converged to the stationary distribution and that the successive samples are largely independent. The table labeled “Node statistics” shows summary statistics of the posterior distribution of the parameters computed based on the sampled values. For each monitored parameter, the table displays the mean, the median, the standard deviation, and the upper and lower bounds of the central 95% credible interval of the posterior distribution. A measure of the central tendency of the posterior, such as the mean, can be used as a point estimate for the parameter. The 95% credible interval ranges from the 2.5th to the 97.5th percentile of the posterior and encompasses a range of values that contains the true value of the parameter with 95% probability; the narrower this 95% credible interval, the more precise the parameter estimate. The figures labeled “Kernel density” show density plots of the posterior samples for each parameter.
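The bookkeeping behind steps 7 and 9 can be sketched in a few lines. The example below is an illustration in Python, not WinBUGS code: the helper names `window`, `percentile`, and `summarize` are hypothetical, and the "chains" are random draws standing in for real sampler output. It applies the beg/end/thin settings from the example and then computes the kind of summary reported in the "Node statistics" table. The percentile rule used here (linear interpolation) is one common convention; the rule WinBUGS uses may differ slightly.

```python
import random
import statistics


def window(chain, beg, end, thin):
    """Keep iterations beg..end (1-indexed, inclusive), every thin-th draw."""
    return chain[beg - 1:end:thin]


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 100]) of pre-sorted data."""
    k = (len(sorted_values) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (k - lo) * (sorted_values[hi] - sorted_values[lo])


def summarize(samples):
    """Posterior mean, sd, median, and central 95% credible interval."""
    s = sorted(samples)
    return {
        "mean": statistics.fmean(s),
        "sd": statistics.stdev(s),
        "2.5%": percentile(s, 2.5),
        "median": percentile(s, 50),
        "97.5%": percentile(s, 97.5),
    }


# Stand-in for sampler output: three chains of 1,000 draws each.
rng = random.Random(1)
chains = [[rng.gauss(1.0, 0.2) for _ in range(1000)] for _ in range(3)]

kept = [window(c, beg=501, end=1000, thin=1) for c in chains]
pooled = [draw for c in kept for draw in c]
print(len(kept[0]), len(pooled))   # 500 draws per chain, 1500 pooled
print(summarize(pooled))
```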
As the reader might have noticed by now, running analyses via the GUI is inflexible and labor-intensive; the GUI does not allow for data manipulation and visualization and requires users to click through a large number of menus and options. Later we therefore illustrate how WinBUGS can be called from standard statistical software, such as R and MATLAB.
JAGS and Stan commandline interface
Both JAGS and Stan are based on a command-line interface. Although this type of interface has fallen out of fashion, and it is strictly speaking not required to use either of these programs, we introduce this low-level interface here, using JAGS as the example, in order to provide the reader with an appreciation of the inner workings of other interfaces. Readers who are not interested in this can skip to either one of the next two sections.
Before launching the program, it is again useful to make a set of text files containing the model, data, and initial values. The model file should contain the code in the listing above; for this example, we saved the model in linreg_model.txt.
The data file should contain the data, formatted as in the right column of Table 1. The data format in Table 1 is sometimes referred to as “S-style”; each variable name is given in double quotation marks, followed by the assignment operator <- and the value to be assigned to the variable. Vectors are encapsulated in the concatenation operator c(...) and matrices are defined as structures with a dimension field: structure(c(...), .Dim=c(R,C)), where the R × C matrix is entered in column-major order. Our data file is called linreg_data.txt.
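The S-style format is simple enough to generate programmatically, which is useful when the data come from another source. The sketch below, in Python for illustration, builds such assignment lines; the helper names `s_vector`, `s_matrix`, and `s_assign` are hypothetical, and the output follows the R `dump()`-like convention of `structure(c(...), .Dim=c(R, C))` with elements listed in column-major order. Whether a given engine accepts exactly this layout should be checked against its manual.

```python
def s_vector(values):
    """Format a flat sequence as an S-style vector: c(v1, v2, ...)."""
    return "c(" + ", ".join(str(v) for v in values) + ")"


def s_matrix(rows):
    """Format a list of rows as structure(c(...), .Dim=c(R, C)), column-major."""
    n_rows, n_cols = len(rows), len(rows[0])
    # Column-major order: run down each column before moving to the next one.
    column_major = [rows[r][c] for c in range(n_cols) for r in range(n_rows)]
    return f"structure({s_vector(column_major)}, .Dim=c({n_rows}, {n_cols}))"


def s_assign(name, formatted_value):
    """One assignment line: the quoted variable name, <-, then the value."""
    return f'"{name}" <- {formatted_value}'


x = [51, 44, 57]                     # first three expected values (toy data)
lines = [
    s_assign("x", s_vector(x)),
    s_assign("N", len(x)),
    s_assign("M", s_matrix([[1, 2, 3], [4, 5, 6]])),   # hypothetical 2 x 3 matrix
]
print("\n".join(lines))
```

For the 2 × 3 matrix with rows (1, 2, 3) and (4, 5, 6), the column-major listing is 1, 4, 2, 5, 3, 6.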
The same data format is used to store the (optional, but strongly recommended) set of initial values. For at least some of the unknown nodes (i.e., nodes which in the BUGS code are followed by the sampling operator ∼), initial values should be provided. If multiple chains will be run, one unique file for each chain is recommended. Our initial values files are called inits1.txt, inits2.txt, and inits3.txt.
This will produce a set of files starting with samples_chain and an index file starting with samples_index. These files can be loaded into a spreadsheet program like Microsoft Excel or LibreOffice Calc (or command line tools like awk and perl) to compute summary statistics and do inference. However, this approach is both tedious and labor-intensive, so there exist convenient interfaces from programming languages such as R, MATLAB, and Python.
Working from MATLAB
MATLAB is a commercial software package that can be obtained via http://www.mathworks.com/. Just like Python or R, MATLAB can be used to format data, generate initial values, and visualize and save results of a sampling run. In this section we outline how users can interact with WinBUGS, JAGS, and Stan using MATLAB. R users can skip this section; in the next sections, we will describe how to use R for the same purposes.
To interact with the three computational engines from MATLAB, we will use the Trinity toolbox (Vandekerckhove, 2014), which is developed as a unitary interface to the Bayesian inference engines WinBUGS, JAGS, and Stan. Trinity is a work in progress that is (and will remain) freely available via http://tinyurl.com/matlabtrinity. The MATLAB code needed to call these three engines from Trinity is essentially identical.
It is also possible to write the model in a separate file and provide the file name here instead of the model code. One advantage of writing model code directly into the MATLAB script is that the script can be completely self-contained. Another is that the model code, when treated as a MATLAB variable, could be generated on the fly if tedious or repetitive code is required to define a model or if the model file needs to be adapted dynamically (e.g., if variable names need to change from one run to another).
Each field name (in single quotes) is the name of the variable as it is used in the model definition.^{3} Note that Trinity will not permit the model code to have variable names containing underscores, as this symbol is reserved for internal use. Following each field name is the value that this variable will take; this value is taken from the MATLAB workspace, so it can be an existing variable with any name, or it can be a MATLAB expression that generates the correct value, as we did here with N. Of course, before making this data structure, the data may need to be parsed, read into MATLAB, and possibly preprocessed (outliers removed, etc.).
In this example, anonfun (part (1)) is the name given to the new function—this can be anything that is a valid MATLAB variable name. Part (2) indicates the start of an anonymous function with the @ symbol and lists the input variables of the function between parentheses. Part (3) is a single MATLAB expression that returns the output variable, computed from inputs a and b. This anonymous function could be invoked with: anonfun(1,4), which would yield 5.
Often, many of these settings can be omitted, and Trinity will choose default values that are appropriate for the engine and operating system. The first input selects the engine. The various '*filename' inputs on lines 6–9 serve to organize the temporary files in a readable fashion, so that the user can easily access them for debugging or reproduction purposes.^{4}
The input values on lines 10–14 determine how many independent chains should be run, how many samples should be used for burn-in, how many samples should be saved per chain, which parameters should be saved, and by how much the chains should be thinned (\(n\) means that every \(n\)th sample is saved). Line 15 determines a working directory, which is currently set to a value that will work well on UNIX systems; Windows users might want to change this. Line 16 determines how much output Trinity gives while it is running. Line 17 decides whether the text output given by the engine should be saved.
Line 18 determines if parallel processing should be used; if this is set to true, all the chains requested on line 10 will be started simultaneously.^{5} Note that for complex models, this may cause computers to become overburdened as all the processing power is used up by Trinity. Users who want to run more chains than they have computing cores available can use the optional input pair 'numcores', C, ..., where C is the maximum number of cores Trinity is allowed to use. Finally, line 19 lists optional extra modules (JAGS only). By default, the dic module is called because this facilitates tracking of the model deviance as a variable. Users with programming experience can create their own modules for inclusion here (e.g., 'wiener'; see Wabersich & Vandekerckhove, 2014).
A successful callbayes call will yield up to four output arguments. stats contains summary statistics for each saved parameter (mean, median, standard deviation, and the mass of the posterior below 0). These can be used for easy access to parameter estimates. chains contains all the posterior samples saved. The usefulness of this is discussed below. diagnostics provides quick access to the convergence metric \(\hat {R}\) and the number of effective samples (Gelman & Rubin, 1999). info gives some more information, in particular the model variable and a list of all the options that were set for the analysis (combining the user-provided and automatically generated settings).
The most important output variable is chains, which contains the saved posterior samples that are the immediate goal of the MCMC procedure. This variable is used by practically all functions in Trinity that do post-processing, summary, and visualization. The default Trinity script contains the line grtable(chains, 1.05). The grtable function prints a table with a quick overview of the sampling results, such as the posterior mean, the number of samples drawn, the number of effective samples (n_eff) and the \(\hat {R}\) convergence metric. The second input to grtable can be either a number, in which case only parameters with an \(\hat {R}\) larger than that number will be printed (or a message that no such parameters exist); or it can be a string with a regular expression, in which case only parameters fitting that pattern will be shown.^{6}
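To give a sense of what the \(\hat {R}\) metric measures, the following sketch computes the classic potential scale reduction factor from several chains of equal length by comparing between-chain and within-chain variance. It is written in Python for illustration, and the function name `gelman_rubin` is hypothetical. This is the textbook formula, not necessarily the exact computation used by Trinity or by any of the engines, which may apply refinements such as splitting chains.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for m chains of equal length n."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    pooled_var = (n - 1) / n * within + between / n
    return math.sqrt(pooled_var / within)


rng = random.Random(7)
# Three chains sampling the same distribution: values close to 1 are expected.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(3)]
# Three chains stuck in different regions: values well above 1 are expected.
stuck = [[rng.gauss(shift, 1.0) for _ in range(500)] for shift in (0.0, 3.0, 6.0)]

print(round(gelman_rubin(mixed), 3))
print(round(gelman_rubin(stuck), 3))
```

A threshold such as the 1.05 passed to grtable flags parameters for which the chains disagree more than would be expected if they all sampled the same distribution.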
Note that the posterior distributions of the regression parameters contain the first bisector, that is, the line \(y = x\) (\(\beta_1 \approx 0\), \(\beta_2 \approx 1\)).
Working from R
R (R Development Core Team, 2004) is a free statistical software package that can be downloaded from http://www.r-project.org/. In this section, we outline how users can interact with WinBUGS, JAGS, and Stan using R. As with MATLAB, using R to run analyses increases flexibility compared to working with these Bayesian engines directly; users can use R to format the data, generate the initial values, and visualize and save the results of the sampling run using simple R commands.
Interacting with WinBUGS: R2WinBUGS
To interact with WinBUGS, users have to install the R2WinBUGS package (Sturtz et al., 2005). The R2WinBUGS package allows users to call WinBUGS from within R and pass on the model specification, the data, and the initial values to WinBUGS using the bugs() function. WinBUGS then samples from the posterior distribution of the parameters and returns the MCMC samples to R.

data specifies the list object that contains the data.

inits specifies the list object that contains the initial values.

parameters specifies the vector that lists the names of the parameters of interest.

model.file specifies the text file that contains the model specification. The model.file argument can also refer to an R function that contains the model specification that is written to a temporary file.

n.chains specifies the number of MCMC chains.

n.iter specifies the total number of samples per chain.

n.burnin specifies the number of samples per chain that will be discarded at the beginning of the sampling run.

n.thin specifies the degree of thinning.

DIC specifies whether WinBUGS should return the Deviance Information Criterion (DIC; Spiegelhalter et al., 2002) measure of model comparison.

bugs.directory specifies the location of WinBUGS14.exe.

codaPkg specifies the output that is returned from WinBUGS. Here codaPkg is set to FALSE to ensure that WinBUGS returns the posterior samples in the samples object. If codaPkg is set to TRUE, WinBUGS returns the paths to a set of files that contains the WinBUGS output.

debug specifies whether WinBUGS will be automatically shut down after sampling. Here debug is set to FALSE to ensure that WinBUGS shuts down immediately after sampling and returns the results to R. If debug is set to TRUE, WinBUGS will not shut down after sampling and will display summary statistics and trace plots of the monitored parameters. As the name suggests, setting debug to TRUE can also provide—often cryptic—cues for debugging purposes.
For more details on the use of bugs(), the reader is referred to the help documentation.
The posterior samples for beta[1], beta[2], and tau are stored in samples$sims.array (or samples$sims.list). The hist() function can be used to plot histograms of the posterior distribution of the parameters based on the sampled values. The print(samples) command displays a useful summary of the posterior distribution of each model parameter, including the mean, the standard deviation, and the quantiles of the posteriors, and (if multiple chains are run) the \(\hat {R}\) convergence metric.
Interacting with JAGS: R2jags
To interact with JAGS, users have to install the R2jags package (Su & Yajima, 2012). The R2jags package allows users to call JAGS from within R and pass on the model specification, the data, and the start values to JAGS using the jags() function. JAGS then samples from the posterior distribution of the parameters and returns the MCMC samples to R.

data specifies the list object that contains the data.

inits specifies the list object that contains the initial values.

parameters.to.save specifies the vector that lists the names of the parameters of interest.

model.file specifies the file that contains the model specification. The model.file argument can also refer to an R function that contains the model specification that is written to a temporary file.

n.chains specifies the number of MCMC chains.

n.iter specifies the total number of samples per chain.

n.burnin specifies the number of samples per chain that will be discarded at the beginning of the sampling run.

n.thin specifies the degree of thinning.

DIC specifies whether JAGS should return the DIC.
For more details on the use of jags(), the reader is referred to the help documentation.
Interacting with Stan: rstan
To interface R to Stan, users need to install the rstan package (Guo et al., 2015). The rstan package allows users to call Stan from within R and pass the model specification, data, and starting values to Stan using the stan() function. The MCMC samples from the posterior distribution generated by Stan are then returned and can be further processed in R.
There are a few differences between WinBUGS/JAGS and Stan that are worth noting when specifying Stan models. While JAGS and WinBUGS simply interpret the commands given in the model, Stan compiles the model specification to a C++ program. Consequently, Stan differentiates between a number of different variable types, and variables in a model need to be declared before they can be manipulated. Moreover, model code in Stan is split into a number of blocks, such as “data” and “model”, each of which serves a specific purpose. Finally, unlike in WinBUGS and JAGS, the order of statements in a Stan model matters and statements cannot be interchanged with complete liberty.
There are a number of very obvious ways in which this model specification differs from that in WinBUGS and JAGS. The model code is split into four blocks and all variables that are mentioned in the “model” block are defined in the preceding blocks. The “data” block contains the definition of all observed data that are provided by the user. The “parameters” block contains the definition of all stochastic variables, and the “transformed parameters” block contains the definition of all transformations of the stochastic variables. The difference between these latter two parts of the code is rather subtle and has to do with the number of times each variable is evaluated during the MCMC sampling process; a more elaborate explanation can be found in the Stan reference manual (Stan Development Team, 2015).
We will not discuss the specifics of all the variable definitions here (see Stan Development Team, 2015, for details) but will rather illustrate a few important points using the tau variable as an example. As in the model specification for WinBUGS and JAGS, tau is the precision of the Gaussian distribution. Defining a variable for the precision of the Gaussian is, strictly speaking, not necessary because distribution functions in Stan are parameterized in terms of their standard deviation. Nevertheless, we retain tau for easy comparability of the Stan MCMC samples with the output of WinBUGS or JAGS. The first line of the definition of tau states that it is a real number that is not smaller than 0, and Stan will return an error message should it encounter a negative value for tau during the sampling process. The next line states that tau is the inverse of the variance of the Gaussian. If we were to reverse the order of these last two lines, due to Stan’s line-by-line evaluation of the code, we would get an error message stating that the variable tau is not defined.
The specification of the actual sampling statements in the “model” block begins, in line with Stan’s line-by-line evaluation style, with the prior distributions for the regression coefficients beta[1] and beta[2] and the variance of the Gaussian. Note that the prior for sigma2 is an inverse gamma distribution; this is equivalent to the prior specification in the WinBUGS/JAGS model where the inverse of the variance was given a gamma prior. Finally, we combined Eqs. 1 and 2 into a single line, which is another way in which the Stan model specification differs from the WinBUGS/JAGS code. While WinBUGS does not allow users to nest statements within the definition of a stochastic node, Stan (and also JAGS) users can directly specify the mean of the Gaussian to be a function of the regression coefficients and observed data x, without needing to define mu[i].
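The equivalence between a gamma prior on the precision and an inverse gamma prior on the variance can be checked numerically. The sketch below, in Python for illustration, draws \(\tau\) from a gamma distribution with a given shape and rate, inverts the draws to obtain \(\sigma^2 = 1/\tau\), and compares their sample mean and variance with the closed-form moments of an inverse gamma distribution with the same shape and with scale equal to the rate. The function name and the shape and rate values are hypothetical; the values are chosen only so that these moments exist, and they are not the hyperparameters used in the paper's models. This is a sanity check under those assumptions, not a proof.

```python
import random
import statistics


def sigma2_draws_via_precision(shape, rate, n, rng):
    """Draw tau ~ Gamma(shape, rate) and return the draws of sigma2 = 1 / tau."""
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    return [1.0 / rng.gammavariate(shape, 1.0 / rate) for _ in range(n)]


shape, rate = 5.0, 4.0               # hypothetical values for illustration
rng = random.Random(11)
sigma2 = sigma2_draws_via_precision(shape, rate, 50_000, rng)

# Closed-form moments of an inverse gamma(shape, scale = rate) distribution.
expected_mean = rate / (shape - 1.0)
expected_var = rate ** 2 / ((shape - 1.0) ** 2 * (shape - 2.0))

print(round(statistics.fmean(sigma2), 3), "vs", round(expected_mean, 3))
print(round(statistics.variance(sigma2), 3), "vs", round(expected_var, 3))
```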

data specifies the list object that contains the data.

init specifies the list object that contains the initial values.

pars specifies the vector that lists the names of the parameters of interest.

model_code specifies the string vector that contains the model specification. Alternatively, the name of a .stan file that contains the model specification can be passed to Stan using the file argument.

chains specifies the number of MCMC chains.

iter specifies the total number of samples per chain.

warmup specifies the number of samples per chain that will be discarded at the beginning of the sampling run.

thin specifies the degree of thinning.
For more details on the use of stan(), we refer readers to the corresponding R help file.
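The arguments chains, iter, warmup, and thin jointly determine how many posterior samples are retained. As a rough illustration (a Python sketch, not part of rstan), the helper below assumes that iter counts warmup samples and that thinning keeps every thin-th post-warmup sample; the function name and the rounding rule for cases where the division is not exact are our assumptions, and the interface may round differently.

```python
def retained_draws(iter_per_chain, warmup, thin=1, chains=1):
    """Approximate number of post-warmup samples kept across all chains.

    Assumes iter_per_chain includes the warmup samples and that every
    thin-th post-warmup sample is kept (floor division is an assumption).
    """
    if thin < 1 or chains < 1:
        raise ValueError("thin and chains must be at least 1")
    if warmup >= iter_per_chain:
        raise ValueError("warmup must be smaller than iter_per_chain")
    per_chain = (iter_per_chain - warmup) // thin
    return per_chain * chains


# Hypothetical settings: 3 chains, 2000 samples each, 1000 discarded, thinning of 2.
total = retained_draws(2000, 1000, thin=2, chains=3)
print(total)
```

A calculation of this kind is useful when deciding whether the retained sample is large enough for stable posterior summaries.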

samples object containing the posterior samples from Stan.

pars character vector with the names of the parameters for which the posterior samples should be accessed.

inc_warmup logical value indicating whether warmup samples should be extracted too.
The posterior samples for beta[1], beta[2], and tau can be visualized and summarized using the hist() and print() functions, respectively. As the name suggests, the traceplot(samples) command displays trace plots of the model parameters, which provide useful visual aids for convergence diagnostics.
Example: Multinomial processing tree for modeling false-memory data
In this section, we illustrate the use of WinBUGS, JAGS, and Stan for Bayesian parameter estimation in the context of multinomial processing trees, popular cognitive models for the analysis of categorical data. As an example, we will use data reported in Wagenaar and Boer (1987). The data result from an experiment in which misleading information was given to participants who were asked to recall details of a studied event. The data were previously revisited by Vandekerckhove et al. (2015), and our discussion of Wagenaar and Boer’s (1987) experiment and their three possible models of the effect of misleading postevent information on memory closely follows that of Vandekerckhove et al. (2015).
The experiment proceeded in four phases. Participants were first shown a sequence of drawings involving a pedestrian-car collision. In one particular drawing, a car was shown at an intersection where a traffic light was either red, yellow, or green. In the second phase, participants were asked questions about the narrative, such as whether they remembered a pedestrian crossing the road as the car approached the “traffic light” (in the consistent-information condition), the “stop sign” (in the inconsistent-information condition) or the “intersection” (the neutral group). In the third phase, participants were given a recognition test. They were shown pairs of pictures from phase I, where one of the pair had been slightly altered (e.g., the traffic light had been replaced by a stop sign), and asked to pick out the unaltered version. In the final phase, participants were informed that there had indeed been a traffic light, and were then asked to recall the color of the light.
The first theoretical account of the effect of misleading postevent information is Loftus’ destructive-updating model. This model predicts that when conflicting information is presented, it replaces and destroys the original information. Second is the coexistence model, under which the initial memory is suppressed by an inhibition mechanism. However, the suppression is temporary and can revert. The third model is the no-conflict model, under which misleading postevent information cannot replace or suppress existing information, so that it only has an effect if the original information is somehow missing (i.e., was not encoded or is forgotten).
Multinomial processing tree models
To calculate the probability of the four possible response patterns (i.e., correct vs. error in phase III and correct vs. error in phase IV), we add together the probabilities of each branch leading to that response pattern. The probability of each branch being traversed is given by the product of the individual probabilities encountered on the path. For example, under the no-conflict model, the probability (and hence, expected proportion) of getting phase III correct but phase IV wrong is (adding the paths in Fig. 4 from left to right and starting at the bottom from those cases where phase III was correct but phase IV was not): \(\frac {2}{3} \times q \times (1-c) \times p + \frac {2}{3} \times (1-q) \times (1-c) \times p + \frac {2}{3} \times \frac {1}{2} \times (1-q) \times (1-p)\).
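The sum-of-branch-products rule for this response pattern can be written out as a short function. The sketch below is in Python, not in the modeling languages used in this tutorial; the function name and the parameter values in the usage line are hypothetical, and each complement is written out explicitly as 1 - c, 1 - q, and 1 - p.

```python
def prob_phase3_correct_phase4_wrong(p, q, c):
    """Probability of phase III correct and phase IV wrong under the no-conflict model.

    Each entry is the product of the probabilities along one branch of the tree;
    the category probability is the sum over the branches.
    """
    for name, value in (("p", p), ("q", q), ("c", c)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    branches = [
        (2 / 3) * q * (1 - c) * p,
        (2 / 3) * (1 - q) * (1 - c) * p,
        (2 / 3) * (1 / 2) * (1 - q) * (1 - p),
    ]
    return sum(branches)


# Hypothetical parameter values, for illustration only.
example = prob_phase3_correct_phase4_wrong(p=0.5, q=0.5, c=0.5)
print(example)
```

Because the first two branches differ only in q versus 1 - q, their sum simplifies to (2/3) p (1 - c), which offers a quick check on the branch bookkeeping.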
The two competing models both add one parameter to the no-conflict model. In the case of the destructive-updating model, we add one parameter d for the probability that the traffic light information is destroyed upon encoding the stop sign. In the case of the coexistence model, we instead add one parameter s for the probability that the stop sign encoding causes the traffic light information to be suppressed, not destroyed, so that it remains available in phase IV.
We will now fit the no-conflict model to the Wagenaar and Boer (1987) data using WinBUGS, JAGS and Stan in combination with both MATLAB and R. Obtaining parameter estimates for the destructive-updating and the coexistence models requires only minor modifications to the code. In particular, we would have to modify the category probabilities (10–21) to reflect the tree architecture of the alternative models and define an additional parameter (i.e., parameter d for the destructive-updating and parameter s for the coexistence model) with the corresponding uniform prior distribution. As an illustration, the Supplemental Material presents the WinBUGS, JAGS and Stan model files and the corresponding R code that allows users to estimate the parameters of the no-conflict as well as the destructive-updating and coexistence models.
Working from R using R2WinBUGS
Working from R using R2jags
Working from R using rstan
The posterior samples for the three model parameters are returned in the samples object. The third column of Fig. 6 shows estimates of the posterior densities based on the sampled values; the posteriors closely resemble those obtained with WinBUGS and JAGS.
Working from MATLAB using trinity
After the data are entered, the model definition needs to be provided as a cell string. We omit the model specification here because both the WinBUGS/JAGS and Stan versions are fully given in the previous sections.
Testing hypotheses
After the posterior samples have been drawn, and posterior distributions possibly visualized as above, there remains the issue of testing hypotheses relating to parameters. With the current false-memory data set, one hypothesis of interest might be that the probability p of encoding the traffic light is greater (versus lower) than chance (Hypothesis 1). The same question might be asked of the probability c of encoding the light color (Hypothesis 2).
Given samples from the posterior, a convenient way of computing the posterior probability that a hypothesis is true is by computing the proportion of posterior samples in which the hypothesis holds. To test Hypothesis 1, we would calculate the proportion of cases in which p > 0.5. To test Hypothesis 2, we calculate the proportion of cases in which c > 0.5.
As it turns out, the probability of Hypothesis 1 given the data is about 84% and that of Hypothesis 2 is about 45%. In other words, neither of the hypotheses is strongly supported by the data. In fact, as Fig. 7 shows, most of the posterior mass is clustered near 0.5 for all parameters.
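The proportion-of-samples computation used for these hypotheses amounts to a few lines of code in any language. The Python sketch below uses stand-in draws from an arbitrary beta distribution purely to have something to count; they are not the posterior samples from the false-memory analysis, and the function name is hypothetical.

```python
import random


def posterior_probability(samples, holds):
    """Proportion of posterior samples for which the hypothesis holds."""
    if not samples:
        raise ValueError("at least one posterior sample is required")
    return sum(1 for s in samples if holds(s)) / len(samples)


rng = random.Random(7)
# Stand-in draws only: NOT the posterior of p reported in this paper.
stand_in_p = [rng.betavariate(6, 4) for _ in range(20_000)]

# Hypothesis 1 has the form p > 0.5; Hypothesis 2 would use the samples of c.
prob_h1 = posterior_probability(stand_in_p, lambda value: value > 0.5)
print(round(prob_h1, 3))
```

With real output, the stand-in list would simply be replaced by the vector of sampled values for the parameter of interest, pooled over chains.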
Conclusions
Bayesian methods are rapidly rising from obscurity and into the mainstream of psychological science. While Bayesian equivalents of many standard analyses, such as the t test and linear regression, can be conducted in off-the-shelf software such as JASP (Love et al., 2015), custom models will continue to require a flexible programming framework and, unavoidably, some degree of software MacGyverism. To implement specialized models, researchers may write their own MCMC samplers, a process that is time-consuming and labor-intensive, and does not come easy to investigators untrained in computational methods.
Luckily, general-purpose MCMC engines, such as WinBUGS, JAGS, and Stan, provide easy-to-use alternatives to custom MCMC samplers. These software packages hit the sweet spot for most psychologists; they provide a large degree of flexibility at a relatively low time cost.
In this tutorial, we demonstrated the use of three popular Bayesian software packages in conjunction with two scientific programming languages, R and MATLAB. This combination allows researchers to implement custom Bayesian analyses from already familiar environments. As we illustrated, models as common as a linear regression can be easily implemented in this framework, but so can more complex models, such as multinomial processing trees (MPT; Batchelder & Riefer, 1980; Chechile, 1973; Riefer & Batchelder, 1988).
Although the tutorial focused exclusively on nonhierarchical models, the packages may also be used for modeling hierarchical data structures (e.g., Lee, 2011). In hierarchical modeling, rather than estimating parameters separately for each unit (e.g., participant), we model the between-unit variability of the parameters with group-level distributions. The group-level distributions are used as priors to “shrink” extreme and poorly constrained estimates to more moderate values. Hierarchical estimation can provide more precise and less variable estimates than nonhierarchical estimation, especially in data sets with relatively few observations per unit (Farrell & Ludwig, 2008; Rouder et al., 2005). Hierarchical modeling is rapidly gaining popularity in psychology, largely by virtue of the availability of accessible MCMC packages. The WinBUGS, JAGS, and Stan implementation of most hierarchical extensions is very straightforward and often does not require more than a few additional lines of code. For the hierarchical WinBUGS implementation of regression models, the reader is referred to Gelman and Hill (2007). For the hierarchical implementation of custom models, such as multinomial processing trees, signal detection, or various response time models, the reader is referred to Lee and Wagenmakers (2013), Matzke et al. (2015), Matzke and Wagenmakers (2009), Nilsson et al. (2011), Rouder et al. (2008) and Vandekerckhove et al. (2011).
Although the goal of our tutorial was to demonstrate the use of general-purpose MCMC software for Bayesian parameter estimation, our MPT example has also touched on Bayesian hypothesis testing. Various other Bayesian methods are available that rely on MCMC output to test hypotheses and formally compare the relative predictive performance of competing models. For instance, Wagenmakers et al. (2010) and Wetzels et al. (2009) discuss the use of the Savage-Dickey density ratio, a simple procedure that enables researchers to compute Bayes factors (Jeffreys, 1961; Kass and Raftery, 1995) for nested model comparison using the height of the prior and posterior distributions obtained from WinBUGS. Vandekerckhove et al. (2015) show how to use posterior distributions obtained from WinBUGS and JAGS to compute Bayes factors for non-nested MPTs using importance sampling. Lodewyckx et al. (2011) outline a WinBUGS implementation of the product-space method, a transdimensional MCMC approach for computing Bayes factors for nested and non-nested models. Most recently, Gronau et al. (2017) provide a tutorial on bridge sampling, a new, potentially very powerful method that is under active development. It is important to note, however, that these methods are almost all quite difficult to use and can be unstable, especially for high-dimensional problems.
Throughout the tutorial, we have advocated WinBUGS, JAGS, and Stan as flexible and user-friendly alternatives to homegrown sampling routines. Although the MCMC samplers implemented in these packages work well for the majority of models used in psychology, they may be inefficient and impractical for some. For instance, models of choice and response times, such as the linear ballistic accumulator (Brown and Heathcote, 2008) or the lognormal race (Rouder et al., 2015), are notoriously difficult to sample from using standard MCMC software. In these cases, custom-made MCMC routines may be the only solution. For examples of custom-made and nonstandard MCMC samplers, the reader is referred to Rouder and Lu (2005) and Turner et al. (2013), respectively.
Their general usefulness notwithstanding, the three packages all have their own set of limitations and weaknesses. WinBUGS, as the name suggests, was developed specifically for Windows operating systems. Although it is possible to run WinBUGS under OS X and Linux using emulators such as Darwine and CrossOver or compatibility layers such as Wine, user experience is often jarring. Even under Windows, software installation is a circuitous process and requires users to decode a registration key and an upgrade patch via the GUI. Once installed, users typically find the GUI inflexible and labor-intensive. In interaction with R, user experience is typically more positive. Complaints focus mostly on WinBUGS’ cryptic error messages and the limited number of built-in functions and distributions. Although the WinBUGS Development Interface (WBDev; Lunn, 2003) enables users to implement custom-made functions and distributions, it requires experience with Component Pascal and is poorly documented. Matzke et al. (2013) provide WBDev scripts for the truncated-normal and ex-Gaussian distributions; Wetzels et al. (2010) provide an excellent WBDev tutorial for psychologists, including a WBDev script for the shifted-Wald distribution. Importantly, the BUGS Project has shifted development away from WinBUGS; development now focuses on OpenBUGS (http://www.openbugs.net/w/FrontPage).
Stan comes equipped with interfaces to various programming languages, including R, Python and MATLAB, and only requires the installation of the specific interface package, which is easy and straightforward under most common operating systems. In terms of computing time, Stan seems a particularly suitable choice for complex models with many parameters and large posterior sample sizes. This advantage in computing time is due to the fact that Stan compiles the sampling model to a C++ program before carrying out the sampling process. The downside of this compilation step is that, particularly for small models as used in the present tutorial, compilation of the model might require more time than the sampling process itself, in which case WinBUGS or JAGS seem a more advantageous choice.
Finally, we will highlight two advantages of JAGS over Stan. First, as illustrated in our example code, Stan code requires variable declaration and as a result can be somewhat more complicated than JAGS code. Second, as a consequence of Stan’s highly efficient Hamiltonian Monte Carlo sampling algorithm, some model specifications are not allowed—in particular, Stan does not easily allow model specifications that require inference on discrete parameters, which reduces its usefulness if the goal is model selection rather than parameter estimation.
We demonstrated the use of three popular Bayesian software packages that enable researchers to estimate parameters in a broad class of models that are commonly used in psychological research. We focused on WinBUGS, JAGS, and Stan, and showed how they can be interfaced from R and MATLAB. We hope that this tutorial can serve to further lower the threshold to Bayesian modeling for psychological science.
Footnotes
 1.
We chose values for the parameters of the prior distributions that fit the introductory example. In general, these values should depend on the application at hand (see Vanpaemel & Lee, this issue; and Morey, this issue).
 2.
It is likely that the exact appearance of this code will vary a little over successive versions of the Trinity toolbox, but the requirements will remain broadly the same.
 3.
Because MATLAB does not differentiate between vectors and single-column or single-row matrices, but some of the computational engines do, it is sometimes convenient to pass variables explicitly as a matrix or explicitly as a vector. For this situation, Trinity allows the flags AS_MATRIX_ and AS_VECTOR_ to be prepended to any variable name. A common situation in which this is useful is when matrix multiplication is applied in the model, but one of the matrices has only one column. JAGS, for example, will treat that matrix as a vector and throw a “dimension mismatch” error unless the flag is applied. In our example, the data structure would then be defined as struct('AS_MATRIX_x', x).
 4.
When using JAGS or Stan, the working directory will contain a file with a cryptic name that starts with tp and ends in a sequence of random characters, with no file extension. This is the entry point script that Trinity uses to call the engine. It can be used to reproduce the analysis outside of MATLAB, if desired; the files in that directory that do not have the .txt extension are all that is needed for reproduction. The *.txt files are output, containing the posterior samples and the log file. When using WinBUGS, data files, initial values files, and model files will be available in the working directory where they can be accessed with the WinBUGS GUI.
 5.
On UNIX systems, this requires the installation of the free program GNU parallel (Tange, 2011). On Windows systems, it currently requires the MATLAB Parallel Computing Toolbox, but we are working to resolve this dependency.
 6.
Regular expressions are an extremely powerful and flexible programming construct. To give some examples: if the expression is 'beta', all parameters with the string beta in their name will be shown. If it is '^beta', only parameters starting with that string will be shown. 'beta$' will show only those ending in that string. '.' will match any variable, and 'be|ta' will match anything containing be or ta. A complete overview of regular expressions in MATLAB can be found via the documentation for the function regexp.
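As a rough cross-check of the patterns in footnote 6, the following Python sketch applies them to a list of hypothetical parameter names. It uses Python's re module rather than MATLAB's regexp, on the assumption that these simple patterns (anchors, the dot, and alternation written as 'be|ta') behave the same way in both.

```python
import re

# Hypothetical parameter names, for illustration only.
names = ["beta_1", "beta_2", "tau", "sigma_beta", "mu"]


def matching(pattern, candidates):
    """Return the names in which the regular expression matches somewhere."""
    return [name for name in candidates if re.search(pattern, name)]


contains_beta = matching("beta", names)      # string occurs anywhere in the name
starts_with_beta = matching("^beta", names)  # name starts with the string
ends_with_beta = matching("beta$", names)    # name ends with the string
anything = matching(".", names)              # any non-empty name
be_or_ta = matching("be|ta", names)          # name contains "be" or "ta"

for label, result in [("beta", contains_beta), ("^beta", starts_with_beta),
                      ("beta$", ends_with_beta), (".", anything),
                      ("be|ta", be_or_ta)]:
    print(label, result)
```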
Notes
Acknowledgements
The authors thank Eric-Jan Wagenmakers for helpful comments during the writing of this article. DM was supported by a Veni grant #45115010 from the Netherlands Organization of Scientific Research (NWO). UB was supported by an NWO Research Talent grant #40612125. JV was supported by NSF grants #1230118 and #1534472 from the Methods, Measurements, and Statistics panel and John Templeton Foundation grant #48192.
References
 Batchelder, W. H., & Riefer, D. M. (1980). Separation of storage and retrieval factors in free recall of clusterable pairs. Psychological Review, 87, 375–397.
 Brown, S. D., & Heathcote, A. J. (2008). The simplest complete model of choice reaction time: Linear ballistic accumulation. Cognitive Psychology, 57, 153–178.
 Chechile, R. A. (1973). The relative storage and retrieval losses in short-term memory as a function of the similarity and amount of information processing in the interpolated task (Unpublished doctoral dissertation). Pittsburgh: University of Pittsburgh.
 Farrell, S., & Ludwig, C. J. H. (2008). Bayesian and maximum likelihood estimation of hierarchical response time models. Psychonomic Bulletin & Review, 15, 1209–1217.
 Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge: Cambridge University Press.
 Gelman, A., & Rubin, D. B. (1999). Evaluating and using statistical methods in the social sciences. Sociological Methods & Research, 27, 403–410.
 Gronau, Q. F., Sarafoglou, A., Matzke, D., Ly, A., Boehm, U., Marsman, M., & Steingroever, H. (2017). A tutorial on bridge sampling. arXiv:1703.05984
 Guo, J., Lee, D., Goodrich, B., de Guzman, J., Niebler, E., Heller, T., & Goodrich, B. (2015). rstan: R interface to Stan [Computer software manual]. Retrieved from https://cran.r-project.org/web/packages/rstan/index.html
 Jeffreys, H. (1961). Theory of probability, 3rd edn. Oxford: Oxford University Press.
 Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.
 Kruschke, J. K. (2010). Doing Bayesian data analysis: A tutorial introduction with R and BUGS. Burlington: Academic Press.
 Lee, M. D. (2011). How cognitive modeling can benefit from hierarchical Bayesian models. Journal of Mathematical Psychology, 55, 1–7.
 Lee, M. D., & Wagenmakers, E.-J. (2013). Bayesian modeling for cognitive science: A practical course. Cambridge: Cambridge University Press.
 Lodewyckx, T., Kim, W., Tuerlinckx, F., Kuppens, P., Lee, M. D., & Wagenmakers, E.-J. (2011). A tutorial on Bayes factor estimation with the product space method. Journal of Mathematical Psychology, 55, 331–347.
 Love, J., Selker, R., Marsman, M., Jamil, T., Dropmann, D., Verhagen, A. J., & Wagenmakers, E.-J. (2015). JASP [Computer software]. https://jasp-stats.org/
 Lunn, D. J. (2003). WinBUGS development interface (WBDev). ISBA Bulletin, 10, 10–11.
 Lunn, D. J., Jackson, C., Best, N., Thomas, A., & Spiegelhalter, D. (2012). The BUGS book: A practical introduction to Bayesian analysis. Boca Raton: Chapman & Hall/CRC.
 Lunn, D. J., Spiegelhalter, D., Thomas, A., & Best, N. (2009). The BUGS project: Evolution, critique and future directions. Statistics in Medicine, 28, 3049–3067.
 Lunn, D. J., Thomas, A., & Best, N. (2000). WinBUGS: A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing, 10, 325–337.
 Matzke, D., Dolan, C. V., Batchelder, W. H., & Wagenmakers, E.-J. (2015). Bayesian estimation of multinomial processing tree models with heterogeneity in participants and items. Psychometrika, 80, 205–235.
 Matzke, D., Dolan, C. V., Logan, G. D., Brown, S. D., & Wagenmakers, E.-J. (2013). Bayesian parametric estimation of stop-signal reaction time distributions. Journal of Experimental Psychology: General, 142, 1047–1073.
 Matzke, D., & Wagenmakers, E.-J. (2009). Psychological interpretation of the ex-Gaussian and shifted Wald parameters: A diffusion model analysis. Psychonomic Bulletin & Review, 16, 798–817.
 Morey, R. D., Rouder, J. N., & Jamil, T. (2015). Package ‘BayesFactor’. http://cran.r-project.org/web/packages/BayesFactor/BayesFactor.pdf
 Nilsson, H., Rieskamp, J., & Wagenmakers, E.-J. (2011). Hierarchical Bayesian parameter estimation for cumulative prospect theory. Journal of Mathematical Psychology, 55, 84–93.
 R Development Core Team (2004). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from http://www.R-project.org (ISBN 3900051003).
 Riefer, D. M., & Batchelder, W. H. (1988). Multinomial modeling and the measurement of cognitive processes. Psychological Review, 95, 318–339.
 Rouder, J. N., & Lu, J. (2005). An introduction to Bayesian hierarchical models with an application in the theory of signal detection. Psychonomic Bulletin & Review, 12, 573–604.
 Rouder, J. N., Lu, J., Morey, R. D., Sun, D., & Speckman, P. L. (2008). A hierarchical process dissociation model. Journal of Experimental Psychology: General, 137, 370–389.
 Rouder, J. N., Lu, J., Speckman, P. L., Sun, D., & Jiang, Y. (2005). A hierarchical model for estimating response time distributions. Psychonomic Bulletin & Review, 12, 195–223.
 Rouder, J. N., Province, J. M., Morey, R. D., Gomez, P., & Heathcote, A. (2015). The lognormal race: A cognitive-process model of choice and latency with desirable psychometric properties. Psychometrika, 80, 491–513.
 Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society B, 64, 583–639.
 Spiegelhalter, D. J., Thomas, A., Best, N., & Lunn, D. (2003). WinBUGS version 1.4 user manual. Cambridge: Medical Research Council Biostatistics Unit.
 Stan Development Team (2015). Stan modeling language: User’s guide and reference manual. Version 2.7.0 [Computer software manual]. Retrieved from https://github.com/stan-dev/stan/releases/download/v2.7.0/stan-reference-2.7.0.pdf
 Sturtz, S., Ligges, U., & Gelman, A. (2005). R2WinBUGS: A package for running WinBUGS from R. Journal of Statistical Software, 12, 1–16.
 Su, Y. S., & Yajima, M. (2012). R2jags: A package for running JAGS from R [Computer software manual]. Retrieved from http://CRAN.R-project.org/package=R2jags
 Tange, O. (2011). GNU Parallel: The command-line power tool. ;login: The USENIX Magazine, 36(1), 42–47. Retrieved from http://www.gnu.org/s/parallel
 Turner, B. M., Sederberg, P. B., Brown, S. D., & Steyvers, M. (2013). A method for efficiently sampling from distributions with correlated dimensions. Psychological Methods, 18, 368–384.
 Vandekerckhove, J. (2014). Trinity: A MATLAB interface for Bayesian analysis. http://tinyurl.com/matlabtrinity
 Vandekerckhove, J., Matzke, D., & Wagenmakers, E.-J. (2015). Model comparison and the principle of parsimony. In Busemeyer, J. R., Townsend, J. T., Wang, Z. J., & Eidels, A. (Eds.) Oxford handbook of computational and mathematical psychology (pp. 300–317). Oxford: Oxford University Press. Retrieved from http://p.cidlab.com/vandekerckhove2014model.pdf
 Vandekerckhove, J., Tuerlinckx, F., & Lee, M. D. (2011). Hierarchical diffusion models for two-choice response times. Psychological Methods, 16, 44–62. Retrieved from http://p.cidlab.com/vandekerckhove2011hierarchical.pdf
 Wabersich, D., & Vandekerckhove, J. (2014). Extending JAGS: A tutorial on adding custom distributions to JAGS (with a diffusion model example) (Vol. 46). Retrieved from http://p.cidlab.com/wabersich2014extending.pdf
 Wagenaar, W. A., & Boer, J. P. (1987). Misleading postevent information: Testing parameterized models of integration in memory. Acta Psychologica, 66, 291–306.
 Wagenmakers, E.-J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010). Bayesian hypothesis testing for psychologists: A tutorial on the Savage–Dickey method. Cognitive Psychology, 60, 158–189.
 Wetzels, R., Lee, M. D., & Wagenmakers, E.-J. (2010). Bayesian inference using WBDev: A tutorial for social scientists. Behavior Research Methods, 42, 884–897.
 Wetzels, R., Raaijmakers, J. G. W., Jakab, E., & Wagenmakers, E.-J. (2009). How to quantify support for and against the null hypothesis: A flexible WinBUGS implementation of a default Bayesian t test. Psychonomic Bulletin & Review, 16, 752–760.