Bayesian two-part multilevel model for longitudinal media use data

Multilevel models are effective marketing analytic tools that can test for consumer differences in longitudinal data. A two-part multilevel model is a special case of a multilevel model developed for semi-continuous data, such as data that include a combination of zeros and continuous values. For repeated measures of media use data, a two-part multilevel model informs market research about consumer-specific likeliness to use media, level of use across time, and variation in use over time. These models are typically estimated using maximum likelihood. There are, however, tremendous advantages to using a Bayesian framework, including the ease at which the analyst can take into account information learned from previous investigations. This paper develops a Bayesian approach to estimating a two-part multilevel model and illustrates its use by applying the model to daily diary measures of television use in a large US sample.


Introduction
Consumer market research has long relied on surveys administered at a single point in time to measure and evaluate consumer behavior. There are numerous advantages to onetime surveys, such as ease of administration, ease of access to diverse consumer markets and low administration costs. For many market research problems, cross-sectional studies yield data from which analysts can draw valid inferences about consumer behavior. Although much of market research continues to rely on cross-sectional survey data, there are research questions that cannot be answered using a single data collection, and hence, a longitudinal study design is needed.
Longitudinal data give market researchers insight about individual consumer's changing behaviors that cannot be understood by administering a single survey (Rindfleisch et al. 2008). Collecting data for the same individuals over time is especially important for behaviors that are expected to change and where understanding that change is fundamental to the study of the behavior (Beuk et al. 2014;Blozis et al. 2019;Chaney and Martin 2017;Katakam et al. 2021;Kumar et al. 2017;Payan et al. 2010). If the analyst's goal is to understand change at the consumer level, then a statistical framework that captures these important aspects of consumer-level behavior is critical.
Multilevel models, also known as mixed-effects models and hierarchical models, are widely popular across multiple disciplines for analyzing longitudinal data. A multilevel model includes one or more random subject effects, also called random coefficients, that indicate how an individual differs from the population-level response (Jennrich and Schluchter 1986;Laird and Ware 1982;Verbeke and Molenberghs 2000;Wu 2010). For example, if media consumption is expected to generally increase with time at a constant rate but the response level and rate of change are expected to differ between individual consumers, then a linear growth model with a random intercept and random slope for time is specified. The variance of a random coefficient quantifies the degree to which individuals differ with respect to the coefficient. The variance of a random intercept, for instance, informs the analyst about the extent to which individuals differ from each other in their response levels (when all predictors, if included in a model, are equal to zero). If a model also includes a random slope for time, then the variance of a random slope quantifies the extent to which individuals differ from each other in their rates of change.
Within individuals, the occasion-specific residuals represent the deviations of an individual's observed responses from their fitted trajectory, and so the variance of the residuals describes the within-subject variation in observed responses about the individual's fitted trajectory.
An important advantage of a multilevel model is that covariates can be included to account for individual differences in a variable measured over time, and this can provide great insight into consumer behavior (Blozis et al. 2019;Katakam et al. 2021;Kumar et al. 2017). For longitudinal data, covariates may be included at the first level of the model to account for occasion-specific variation in responses and at the second level to account for individual differences in the random effects. In other words, a covariate can be time varying or time invariant. Thus, it is possible to estimate within-consumer effects of covariates, such as whether media use is higher on weekends versus weekdays, as well as between-consumer covariate effects, such as whether media use is higher for men versus women (Blozis et al. 2019).
Important to a marketing analyst is the flexibility offered in how a multilevel model can be specified. That is, these models can be applied to a wide range of response types, making this modeling framework amenable to analyzing different kinds of consumer outcomes. For example, these models can be applied to normally distributed data, count data, or categorical response data, to name a few (Vonesh 2012). The fact that these models have this kind of flexibility is particularly important for the analysis of media use data that, similar to time use data, can be described as having a subset of responses equal to zero, denoting that some consumers do not engage with the target form of media, and are otherwise continuous and likely to be positively skewed when relatively few individuals engage with media at an increasingly high level (Blozis et al. 2019;Taipale et al. 2021). Indeed, multilevel models permit the kind of flexibility to handle the complex responses that marketing analysts can encounter.
Considering measures of time spent engaged with media in particular, Blozis et al. (2019) developed a special type of multilevel model for semi-continuous data that permit heterogeneity of variance both within and between individuals in repeated measures of media consumption. A semi-continuous response is one in which the response can be equal to a discrete value, such as zero, but is otherwise continuous (Olsen and Schafer 2001). This particular type of response is characteristic of media use when the recorded response is the time spent engaged with media. As is common for measures of time use data more generally, zero is recorded on those occasions when a consumer did not engage in the behavior and a positive value is indicated otherwise if the consumer did engage, and thus, denotes the time spent engaged. Their model was a two-part multilevel location scale model that specifically targeted a semi-continuous response, but additionally, provided for a thorough study of the different sources of variation that are often observed in repeated measures of media use.

Multilevel models to understand withinand between-consumer variation
Recently, multilevel models evolved to permit a more refined examination of within-and between-person differences in repeated-measures data. That is, along with having random coefficients that make a model unique to the individual, heterogeneity of the occasion-level residual variance and heterogeneity of variance of the random coefficients can be studied using multilevel location scale model (Hedeker et al. 2008;Hedeker and Nordgren 2013). In this model, a repeated measure is assumed to follow a function that describes the general tendency of an individual's responses over time, such as an intercept-only function, a linear growth function, or quadratic growth function. This aspect of the model is shared with multilevel models as commonly applied to longitudinal data. A multilevel location scale model goes beyond this by permitting the variance of the occasion-level residuals to vary by individual, either by modeling the variance to be a function of covariates or by including a random scale effect to account for heterogeneity due to unmeasured influences. In Blozis et al. (2019), for instance, a multilevel location scale model was applied to daily reports of time spent watching TV for adults in the US who were surveyed for eight consecutive days. They found that younger consumers relative to older had greater day-today variability in their positive reports of time spent watching TV across survey days. They also reported that not all of the day-to-day variation in time spent could be accounted for by the measured covariates, leaving open questions about other variables that may explain the variation.
A multilevel location scale model can also be used to model heterogeneity of variance in the random coefficients. That is, in contrast to estimating a multilevel model to estimate the variance of a random coefficient, with a goal to understand the extent to which individuals differ in a particular feature describing a behavior over time, such as the variance of a random intercept that quantifies individual differences in conditional response levels, a multilevel location scale model provides a means to study the variance of a random coefficient as a function of covariates. In Hedeker et al. (2008), for example, a random intercept model was used to study repeated measures of positive mood in adolescent cigarette smokers. In that report, there was greater between-person variation in the random intercept for smokers who had been identified as loners relative to others. This result suggested a greater degree of individual differences in positive mood for those identified as loners relative to others.

Estimation of multilevel location scale models
A multilevel location scale model can be estimated using commercial software programs for clustered data that have the flexibility to model a variance by a function, such as an exponential function. Expressing a variance by an exponential function ensures that the estimated variance is not negative. Important for market research, an exponential model makes it convenient to model a variance as a function of covariates. For instance, Hedeker et al. (2008) illustrate how SAS PROC NLMIXED can be used for maximum likelihood estimation of a multilevel location scale model. In the special case of a semi-continuous variable measured over time, Blozis et al. (2019Blozis et al. ( , 2020 develop PROC NLMIXED syntax to fit a two-part multilevel location scale model for semi-continuous repeated measures. In Blozis et al. (2019) the model was applied to daily diary measures of time spent watching TV to model individual differences in the likeliness that an individual watched TV, the amount of time spent watching TV on days when they watched, and the dayto-day variation in time spent on days when they watched. Covariates were used to predict the likeliness that individuals watched TV, their daily time spent, and between-person differences in daily variability in time spent across days.

The current study
Although multilevel models can be estimated using maximum likelihood, there are advantages to considering a Bayesian framework. Reasons for using a Bayesian approach that are often cited (and not limited to the estimation of multilevel models) include the ability to incorporate background knowledge into an analysis (Gelman et al. 2004;Carlin and Louis 2000). For instance, past research on media use, such as an estimate of the mean time spent by a consumer group, can be included as prior information in a Bayesian analysis. In this way, the analyst is updating their current knowledge of the behavior based on past information and the information provided by the current sample. In another example, if data collection involves daily measures of media use where it is known in advance of the analysis that the maximum possible time spent in a given day is 24 h, a Bayesian framework can make use of this information. A Bayesian approach also offers great ease in the interpretation of a credible interval (i.e., the Bayesian counterpart of a confidence interval obtained by the frequentist approach). That is, assuming information based on prior studies or assumptions made about a parameter and combining that information with that provided by the observed data, the true population parameter has a 95% chance of being included within the estimated 95% credible interval. This is unlike a 95% confidence interval estimated under a frequentist approach in which the method of calculating the interval is expected to be successful 95% of the time across a large number of replications of the study. Another reason to opt for a Bayesian approach is that when fitting a model with multiple predictors, there is typically no need to adjust for multiple comparisons, as is often done to control the overall Type 1 error rate in a model estimated using a frequentist approach (Gelman et al. 2012).
This paper builds from previous work to develop a Bayesian approach to the estimation of a two-part multilevel location scale model for repeated semi-continuous measures. First, it extends Xing et al. (2017) in which a Bayesian approach is used for the estimation of a two-part multilevel model for semi-continuous repeated measures. Earlier work demonstrates the desirable properties of a Bayesian approach to the estimation of a logistic random-effects model, evidence that is relevant to the estimation of a two-part multilevel model given that one part of the model involves a binary outcome (Li et al. 2011). The work here further extends Lin et al. (2018) in which Bayesian estimation was used to fit a multilevel location scale model to continuous data, with evidence that a Bayesian estimation approach can be used to sample from the specified distributions and result in consistent estimates.
To make the developments here accessible to analysts, this paper uses publicly available data to illustrate the method. That is, the daily diary reports of reported time spent watching TV reported in Blozis et al. (2019) are presented here. A multilevel location scale model is applied to the data with additions that were not included in this previous work. Similar to Blozis et al., covariates are included to predict daily measures of engagement and time spent, in addition to accounting for the day-to-day variability in time spent watching TV. A point of departure from their analysis is that covariates are included to study the variances of the random intercepts of each model part. The first model part relates to whether or not a consumer watched TV on a given day, and so the intercept of the model represents the consumer log odds of watching TV across days. The variance of this intercept quantifies individual differences in the log odds, and so covariates are brought into a model for the variance to test if any account for between-subject heterogeneity of variance in the log odds. The second model part relates to the amount of time a consumer spent watching TV on a given day, and the intercept of the model represents the consumer mean time spent watching TV across days when TV was watched. The variance of this intercept quantifies individual differences in the mean time spent, with covariates brought into a model for the variance to test if any covariates account for between-subject heterogeneity of variance in daily mean time spent. Additionally, SAS PROC MCMC, a procedure developed for Bayesian estimation, is used for estimation of the models presented here. This procedure is highly flexible in that it may be applied to data that involve any number of levels in a data hierarchy, as well as models for which the random coefficients enter in a linear or nonlinear way.

Data collection strategies in consumer behavior research
Consumer behavior research has been successfully carried out using traditional methods of data collection, such as drawing random samples from a target consumer population or using stratified random sampling in which random samples are drawn from each strata of a population (e.g., geographic location) to ensure representation across strata. Sampling methods that use a one-time draw of consumers give a snapshot of consumer behavior and can inform the analyst about consumer characteristics that predict behavior. For example, in a study of young adults, researchers used an online survey to understand motivations for following social media influencers and the relationships between these motivations and self-report measures of advertisement clicking and buying behaviors (Croes and Bartels 2021). As the researchers noted, the use of cross-sectional data did not lend to an understanding of how these behaviors evolve over time. In fact, the data provide only an understanding of individual differences in self-reported estimates of typical advertisement clicking and buying behaviors and the degree to which the measured variables were related to each other at the one point in time.
There are many instances in which a variable is expected to change with time and understanding that change is fundamental to consumer research. Chaney and Martin (2014), for example, studied change in consumer loyalty and found that a decrease in loyalty was slower if consumers and managers had a high level of shared social and cultural values. Payan et al. (2010) relied on longitudinal data to predict the survival and dissolution of the relationships between organizations over an extended period time. In a study of impulse purchase behavior, Katakam et al. (2021) collected longitudinal data to understand predictors of impulse purchase behaviors over time, with their analysis also allowing for changes in the predictors over time. Kumar et al. (2017) studied the effects of social media and marketing on brand sales with a special focus on the time-varying effects of social media. Indeed, the need for longitudinal data is essential for gaining a deeper understanding of consumer behavior.

Daily diary measures of media use
Diary methods for data collection are a convenient way to obtain repeated measures of consumer data, especially if the data collection can be done by consumer personal devices. Advantages to diary methods include data collection over time, such as taking multiple measures within or across multiple days. When data can be collected at a high frequency, such as reporting on daily media use across a period of multiple days, this can relieve participants of the need to estimate their own 'typical' level of behavior as is required in a one-time survey. Diary methods make it easier as well to collect measures of covariates whose values may also change with time.

Media use data
Media use data share features with time use measures reported in other domains of research. That is, measures of time use generally include a spike of zeros that represents the subset of respondents who do not engage in the target behavior or who did not use at the time of measurement. For those who are users of a given medium, the amount of time spent naturally varies. In the end, a visual display of the time spent using the medium can spike at 0 and possibly have a long tail extending to the right for heavy users. Data that have this pattern of zeros combined with positive and continuous values are called semi-continuous data, and two-part models are statistical models developed to address these particular features of semi-continuous data (Duan et al. 1983;Olsen and Schafer 2001). Two-part models are gaining popularity in many fields of research as a means for understanding semi-continuous data given their special features. It is therefore worthwhile to consider two-part models as a tool for understanding media behaviors.

Empirical data example
Data from a large study of daily time spent watching TV are presented here to illustrate the methods developed in this paper. The data are used because they are publicly available, and thus, accessible to researchers interested in exploring the methods developed here and later applying them to their own data. Although the data are daily reports of time spent watching TV, the methods developed here are applicable to data collected using more time-intensive assessments, such as assessments collected by personal devices (e.g., smartphones) that can render data collection at a relatively higher frequency. The data analyzed here are from the Midlife in the US (MIDUS Refresher): Daily Diary Project (Ryff andAlmeida 2012-2014) study in which participants were selected to be nationally representative using a randomdigit-dialing method. Participants were English-speaking adults residing in the contiguous US. Descriptive statistics of the sample are reported in Blozis et al. (2019). Briefly, most (56%) participants were women and 58 years of age on average (SD = 12.5 years). Most were married (71%) and had a college degree or higher (69%). Descriptive statistics for measures of reported time spent watching TV are given in Blozis et al. (see Table 1 therein) and so are not duplicated here.

Two-part multilevel models
Two-part multilevel models were developed for semi-continuous data obtained across repeated occasions for the same individuals (Olsen and Schafer 2001). A semi-continuous variable is one for which a response can be discrete (e.g., equal to zero) or continuous. A variable that measures time spent engaged in an activity, including time spent engaged with media, is an example of a semi-continuous variable. For time use measures, the discrete value is zero and all other values are continuous and positive. A two-part multilevel model is based on a two-part model for semi-continuous data obtained at a single point in time (Duan et al. 1983). The two-part multilevel model differs from this earlier model in that it accounts for dependencies of scores within subject due to repeated measures.
In a two-part multilevel model, there are two model parts, each of which is used to model a different aspect of the semi-continuous variable that has been measured repeatedly for the same individuals. The first part models a binary response that indicates whether or not an individual engaged in the behavior, such as whether or not an individual watched TV on a given day. The second part models the continuous response, such as the time spent watching TV on a given day conditional that TV was watched. As a multilevel model, one or more of the coefficients of each model part are unique to the individual which permits individual differences in the likeliness to engage in the behavior over repeated measures and individual differences in the average amount of engagement on those occasions when individuals engaged. At the individual level, the two models are joined by a covariance that can be used to study the strength of the association between the likeliness to engage in the behavior and the degree of engagement when engaged. For example, Blozis et al. (2019) reported a positive association between the likeliness to watch TV and the mean time spent watching on days when consumers watched TV.
A two-part multilevel model has several notable features. First, by separately modeling the two aspects of a behavior it is possible to predict both why individuals engage in a behavior, and when engaged, the level of engagement, by including covariates in each of the model parts, that is, that for the binary response and that for the continuous. There is no requirement that the covariates be the same between the two model parts, making it possible to uniquely predict each aspect of the behavior. Another advantage to using a two-part model is that the zeros are separated from the continuous values. The mean of a variable is influenced by the presence of zeros, and so a mean value that includes zeros in its calculation will pull the mean down. Therefore, when the analytic goal is to summarize mean time spent engaged with media, it is important to keep this in mind and clarify whether the zeros should be included in the calculation or not. In a two-part model, the zeros are handled by separating them from the continuous values. In this way, the mean of the continuous part of the variable reflects the mean value in absence of the zeros and reflects the measure conditional that there was engagement in the measured behavior. In the context of a media use study, for example, there may be instances (e.g., days of measurement) when an individual does not engage with the media and so these instances are separated from those when the individual is engaged to calculate mean levels of engagement.

Missing data
Estimation of a multilevel model does not require complete response data for all planned occasions. If there are missing data, then statistical inference is valid if the data are missing at random, meaning that whether or not an individual has missing data, referred to as missingness, is independent of the missing values (Laird and Ware 1982). For the analyses presented here, the data are assumed to be missing at random.

Covariates
Covariates included in the analyses here are consumer age and sex (time-invariant covariates), and the day of the week when the daily survey was conducted. Age was centered about the sample mean age of 48 years, and gender was coded as Gender = 1 if a woman and 0 if a man. Indicator variables, denoted generally by Day j for j = 1,…,6, were created to denote the day of the week, with Sunday serving as the reference day. For example, the indicator Mon was equal to 1 if the survey was conducted on a Monday and was equal to 0 otherwise. Indicators were created for all other days (except for Sunday) to include the effects of the other days.

Methods
A two-part multilevel model is applied to repeated measures of daily reports of time spent (in hours) watching TV. Following Blozis et al. (2019), the binary measures were assumed to follow a logistic distribution and the continuous measures were assumed to follow a lognormal distribution (given that the distribution of the positive responses was positively skewed). Three sources of variation in TV measures were studied: an individual's log odds of watching TV, daily log mean time spent across days when TV was watched, and variation in reported values of log time spent about the individual's log mean time. Here, a two-part multilevel model without covariates was first fitted to the data to provide Bayesian estimates of a baseline model. Next, covariates were added to the model, with this model representing the typical application of a two-part multilevel model in which covariates predict the individual log odds of watching TV and the daily time spent. Last, the model was extended to a two-part multilevel location scale model where the covariates play a greater role in accounting for different sources of variation in media use. This sequence of model fitting is done to highlight the potential for added elements provided by the model and how it differs from a standard application of a two-part multilevel model. Thus, in addition to using the covariates as predictors of the individual log odds and daily time spent, the covariates predict the residual variance of the continuous model part and the variances of the random intercepts of the two model parts, thus permitting a deeper look at individual differences in media use.

Using SAS PROC MCMC for Bayesian estimation
Bayesian estimation was carried out using PROC MCMC with SAS/STAT software Version 9.4 1 whose sampling mechanism is a self-tuned random walk Metropolis algorithm (Chen, 2009). In specifying a model, prior distributions are specified for all model parameters. Here, weakly informative prior distributions were used such that the prior distributions were based on a relatively high degree of variation. By selecting prior distributions that are considered to be weakly informative, the influence of the prior on statistical inference is minimized because the likelihood component that relates to the observed data will dominate, particularly as sample size increases (Press 2003). All fixed effects were assumed to have Gaussian priors with mean equal to 0 and SD = 10 so that the prior distribution of each fixed effect had a high degree of variation and was centered primarily about 0. The prior distribution of the intercept of the continuous model, a parameter that corresponds to the individual's daily mean log time spent watching TV, was also specified to have an upper bound of log(24 h) = 1.4 log hours (rounded up to 1 decimal place); this was done to set the maximum time spent on any given day to 24 h. The prior of the variance-covariance matrix of the random coefficients was assumed to be an inverse Wishart with a small number of degrees of freedom (here, 2 df) (Gelman et al. 2014). The residual variance at the first level of the continuous model part was modeled using an exponential function (following Blozis et al. 2019). The variances of the random intercepts were also modeled using exponential functions. Using a nonlinear function to model each of the three variances (the residual variance at the occasion level and the variances of the random intercepts of the two model parts), the parameters of each variance model were assumed to have Gaussian priors with mean equal to 0 and SD = 10.
An important aspect to using Bayesian estimation is to assess model convergence (Brooks and Roberts 1998). Model convergence was evaluated by inspecting the trace and autocorrelation plots for each parameter and the effective sample size (ESS) of each parameter. A lower bound ESS for each model was calculated using the R package mcmcse (Flegal et al. 2021) assuming a 95% confidence level. This lower bound value of the ESS depends on the total number of model parameters. The ESS value for a given model was compared to the observed ESS corresponding to each model parameter to check that the minimum value was met. Markov chains were run for 20,000,000 iterations with 20,000 burn-in iterations and thinning set to 2000 (thinning was done only to reduce memory requirements). Estimates of the fixed effects are posterior means and estimates of the variance parameters are posterior modes. The 95% highest posterior density intervals (HPDI) are also reported.

Two-part multilevel model
A two-part multilevel model is described first and then is extended to form a two-part multilevel location scale model. First, let y ij be the response for individual i at time j, where i = 1,…,N and j = 1,…,n i , with N denoting the total number of individuals and n i the number of responses for the individual. To fit a two-part multilevel model, two new variables are created from the original response y ij . The first is a binary variable, denoted here by u ij , set equal to 1 if y ij > 0 and set equal to 0 if y ij = 0. The second is a positive and continuous measure, denoted here by m ij , set equal to y ij if y ij > 0 and coded as missing if y ij = 0.
The binary response is modeled using a multilevel logistic model. Let η ij denote the logit of the probability that individual i engaged in the behavior at time j, then The logit η ij is assumed to follow a two-level model to account for the repeated measures nested within person: In Eq. (1), α 0 is a fixed effect conditional on covariates and represents the population log odds of engaging in the behavior when all covariates and a i are equal to 0; ∑ K 1 k X ijk is the sum of the K effects of covariates X ijk for k = 1, … , K ; and a i is a random subject effect that indicates how an individual's log odds, conditional on the covariates, differs from the population value. Between individuals, the random effect a i is assumed to be independently and identically distributed (i.i.d.) as normal with mean = 0 and variance 2 a that quantifies the degree to which individuals differ in their log odds of engaging in the behavior, conditional on the covariates.
The continuous measure of the behavior is assumed to follow a generalized linear multilevel model. For example, Blozis et al. (2019) assumed a lognormal response distribution for positive reports of time spent watching TV. A model for the continuous response is assumed to be where 0 is a fixed effect that represents the population mean response across occasions when all covariates and s i are equal to 0; ∑ K 1 k X ijk is the sum of the K effects of covariates X ijk for k = 1, … , K ; and s i is a random subject effect that indicates how an individual's conditional mean response differs from the population value. The occasion-and individual-specific residual e ij indicates how a person's response differs from their predicted response. The random effect s i is assumed to be i.i.d. normal with mean = 0 and variance 2 s that quantifies individual differences in the positive measures. The residual is assumed to be independent between and within individuals with mean = 0 and variance 2 e . In anticipation of fitting a two-part multilevel location scale model to study the variances of the random effects at the subject level and the variance of the residuals at the occasion level, the variances of the intercepts of the logistic and continuous model parts and the residual variance at the first level of the continuous model part are each assumed to follow exponential functions: where the parameter of each variance model, such as τ 0 for the residual variance in Eq. (3c), when exponentiated, is the corresponding variance. As will be evident later, expressing each variance using an exponential function ensures that the variance is not negative and conveniently allows for the inclusion of covariates to account for heterogeneity of variance.
The binary and continuous model parts in Eqs.
(1) and (2), respectively, join at the second level by a covariance between the random intercepts of the two model parts. (2) 2 e = exp 0 , Conditional on the covariance, the parameters of the two model parts are assumed to be independent. The covariance between the random effects is denoted here by sa . At the second level of the model, the covariance matrix Φ is given as follows:

Results
Parameter estimates are provided in Table 1. Across days and individuals, the log odds of watching TV was estimated to be 2.59 (95% HPDI: [2.37, 2.81]), which corresponds to a probability of exp{2.59} (1+exp{2.59}) = .93 . Across days and individuals, the estimated mean log time spent watching TV, conditional that any positive amount of time was spent, was 0.17 (95% HPDI: [0.15, 0.19]), which corresponds to 10 17 = 1.5 h. The variances of the random effects of each model part indicate individual differences in the log odds of watching TV ( ̂ 2 a = 4.68 ) and the daily mean log time spent on days when TV was watched ( ̂ 2 s = 0.08 ). Each of these variances is assumed to be homogeneous across days and individuals, an  assumption that will be relaxed when the model is extended to a two-part multilevel location scale model. Random draws from posterior distributions were taken to generate bar plots of 95% interval estimates of the random effects of the binary and continuous model parts. The intervals are shown in Fig. 1: the upper figure displays the interval estimates from the binary model, and the lower figure displays the estimates from the continuous model. The individual's mean is indicated by the solid circle at the center of the interval. The overall mean of the random effects, equal to 0, is indicated by the solid vertical line shown centered in each display. These displays show the extent to which individuals differ from each other in the log odds and mean log time spent watching TV. The estimated correlation between the consumer-level log odds of watching TV and the mean log time spent was 0.67, indicating a moderately strong tendency for a higher likeliness of watching TV to correspond to longer times spent watching on days when TV was watched. Finally, the within-person residual variance that represents the degree to which daily log times fall about an individual's fitted mean log across days was estimated to be ̂ 2 e = 0.06 (95% HPDI: [0.06, 0.07]. Under a two-part multilevel model, the residual variance of the continuous model part is assumed to be constant across days and individuals. Similar to assumptions about the variances of the random intercepts of the two model parts, this assumption of homogeneity of the residual variance is relaxed next under a two-part multilevel location scale model.

Two-part multilevel location scale model
A two-part multilevel location scale model extends the preceding model by expressing the variances of the random coefficients of the two model parts using exponential functions which allows each variance to be a function of covariates. Specifically, the variance of the random effect in the model of the log odds of engaging in the behavior is expressed as a function of covariates. Similarly, the variance of the random effect in the model of log time spent engaged in the behavior is a function of covariates. Additionally, the variance of the occasion-level residual is modeled using an exponential function so that it is possible to include covariates of this variance to account for heterogeneity of the within-subject variance that is due to measured influences. Further, this particular variance model includes a random subject effect that accounts for heterogeneity of variance due to unmeasured sources. Thus, the variance model for the residual variance addresses heterogeneity of variance due to measured and unmeasured influences.
First, the models represented in Eqs. (3a) and (3b) in which the variance of each random effect was expressed using an exponential function were extended to include covariates: where the intercept of each model, when exponentiated, is the variance of the random intercept when all covariates X ijk for k = 1,…,K, where K is the number of covariates, are equal to 0. For the variance of the random intercept of the binary model, a positive covariate effect would indicate that a higher covariate level corresponds to greater between-person differences in the conditional log odds of the behavior, and a negative effect would indicate that a higher covariate level corresponds to a lower degree of between-person differences in the conditional log odds of the behavior. For the variance of the random intercept of the continuous model, a positive covariate effect would indicate that a higher covariate level corresponds to greater between-person differences in the conditional mean behavior, and a negative covariate effect would indicate that a higher covariate level corresponds to a lower degree of between-person differences in the conditional mean behavior. Finally, the model that expresses the variance of the occasion-level residuals of the continuous model part is extended to include covariates, and importantly, also includes a random subject effect v i : where 0 , when exponentiated, is the residual variance when the covariates X ijk for k = 1,…,K, where K is the number of covariates, and random effect v i are equal to 0. The random effect v i is assumed to be i.i.d. normal with mean = 0 and variance 2 v . The random effect v i is a random scale effect that indicates how an individual's conditional residual variance differs from the common value. A variance larger than 0 indicates differences between individuals in their variability of observed measures across time as they deviate about the individual's fitted mean after accounting for the effects of measured covariates on the residual variance. In other words, the random scale effect accounts for variation due to unmeasured influences, an aspect of the model that is particularly helpful for a data analysis in which the sources of the variation are unknown.
With the added random effect in the model for the residual variance of the continuous model part, the variance-covariance matrix of the random effects of a two-part multilevel location scale model is: where va and vb are the covariances between the random subject effect v i of the within-subject variance model in Eq. (5) and the random effects of Eqs. (1) and (2), respectively. The covariances va and vb represent the linear relationships between the random effect of the within-subject variance model with the individual-level log odds and individual conditional mean, respectively. A positive covariance would indicate that as either the individual log odds or conditional mean levels increase, there is a corresponding increase in the within-person variation in the continuous measures. A negative covariance would indicate that as either the individual log odds or conditional mean levels increase, there is a corresponding decrease in the withinsubject variation in the continuous measures.

Conditional two-part multilevel model
Two conditional models were applied to reports of TV use. The purpose of fitting these two models was to highlight the added features of the two-part multilevel location scale model over a two-part multilevel model. Both models include covariates but the first model is specified as would typically be done in fitting a two-part multilevel model with covariates used to predict the consumer-level daily reports of whether or not TV was watched and the time spent on days when TV was watched (see Blozis et al. 2019). The variances of the random intercepts of the two model parts characterize consumer differences in the log odds of watching TV and time spent, respectively. The variance of the residual of the continuous model part characterizes the typical squared deviation of an observed score and the consumer's estimated mean time spent across days, and this variance is assumed, under this model, to be homogeneous across days and individuals. As will be described in detail later, the second model is a two-part multilevel location scale model that extends the first model to include predictors of the three variances and additionally permits heterogeneity of the residual variance due to unmeasured sources. In a two-part multilevel model, the logit of the binary model was predicted by indicators for each of the days of the week, Age i and Gender i as follows: where 0 is the log odds of watching TV on Sundays for a man at the mean sample age of 49 and whose random effect a i is equal to 0; 1j for j = 1,…,6 are the effects of the days of the week (Monday through Saturday) relative to Sunday, and 2 and 3 are the effects of Age i and Gender i, respectively, on the logit. With the addition of covariates to predict the logit, the random effect a i is conditional on the covariates. As a result, a i is assumed to have an expected value of 0 and variance 2 a , where now 2 a is the between-subject variance of individual log odds conditional on the covariates.
The continuous model part for log time spent watching TV, conditional that any positive amount of time was spent, now also includes covariates: where 0 is the expected mean log time spent watching TV on Sundays for a man at the sample mean age of 49 and whose random effect s i is equal to 0. The coefficient 1j for j = 1,…,6 are the effects of the days of the week (Monday through Saturday) relative to Sunday, and 2 and 3 are the effects of Age i and Gender i , respectively. With the addition of covariates to predict the individual means of log time spent, the random effect s i is conditional on the covariates. Consequently, s i is assumed to have an expected value of 0 and variance 2 s , where 2 s is the conditional between-subject variance of the individual mean log time spent. The residual in Eq. (7) e ij is the deviation of an individual's observed score from the fitted value. The residuals are assumed to have a mean of 0 and to have constant variance across days and individuals.

Results
PROC MCMC syntax to fit the two conditional models are in "Appendices 1" and "2". Bayesian estimates of the parameters of a two-part multilevel model are shown in the first set of columns of both Tables 2 and 3. Estimates of fixed effects are in Table 2. As shown, a two-part multilevel model is used to predict the individual log odds of watching TV and mean log time spent on days when TV was watched. According to the estimates in Table 2, the log odds of watching TV tended to not differ between Sunday and all other days except for Saturday when the typical log odds was lower ( ̂ 1f = −0.36, 95%HPDI ∶ [−0.72, −0.05] . Older participants were more likely to watch TV relative to younger participants ( ̂ 2 = 0.03, 95%HPDI ∶ [0.02, 0.05] ) and women were less likely to watch than were men ( ̂ 3 = −0.40, 95%HPDI ∶ [−0.78, −0.01] . For the mean log time spent, less time was spent on all days relative to Sunday. Older participants reported more time spent watching relative to younger participants ( b 2 = 0.005, 95%HPDI ∶ [0.004, 0.007] ), but there was no statistically significant difference between men and women ( b 3 = 0.002, 95%HPDI ∶ [−0.04, 0.04]. Estimates of the variances of the random intercepts of the two model parts and the variance of the residual from the continuous model part are shown in Table 3. The values in the table correspond to the parameters of the exponential functions defined earlier in Eqs. (3a)-(3c). To convert values to variance estimates, each is exponentiated. Thus, the estimated variance of the individual log odds is ̂ 2 a = exp ̂ a0 = exp{1.52} = 4.57 , corresponding to a probability of exp{4.57} (1+exp{4.57}) = 0.99 . Thus, for men at the mean age of 49, the estimated probability of watching TV is 0.99.
The estimated variance of the daily log mean time spent is ̂ 2 s = exp ̂ s0 = exp{−2.63} = 0.07, corresponding to 10 07 = 1.2h for these individuals . The estimated residual variance of daily log time spent is exp ̂ o = exp{−2.80} = 0.06. As described earlier, a two-part multilevel model permits individual differences in each aspect of the measured variable and quantifies each by an estimated variance. Each variance is assumed to be homogeneous across days and individuals. As is shown next, this assumption is relaxed by fitting a twopart multilevel location scale model.

Two-part multilevel location scale model
In a two-part multilevel location scale model, the variances of the random effects of the binary and continuous model parts are functions of measured covariates, and the variance of the residual variance of the continuous model part is a function of measured covariates and a random subject effect to address unmeasured influences on the residual variance. Prior to reporting on the estimates from fitting this model, the unconditional model reported earlier in Table 1 was extended to include a random effect in the exponential function used to model the residual variance of the conditional model: where 0 is the common residual variance for an individual whose random effect v i is equal to 0. From this model, the estimated variance of the random scale effect was 0.40. To illustrate individual differences in this random scale effect, estimates of the 95% intervals of the random effect v i are shown in Fig. 2. This helps to illustrate the extent to which individuals differ in their day-to-day variability in time spent watching TV. This between-subject scale effect is modeled next.
In fitting a two-part multilevel location scale model, Eqs. (6) and (7) are used to predict the individual log odds of watching TV and the daily mean log time spent, as was done in the two-part multilevel model. Under this new model, the variances of the random intercepts are now functions of the covariates:  where a0 and s0 , when exponentiated, are the intercept variances of the binary and continuous models, respectively, on a Sunday for men whose mean age was equal to the sample mean of 49. The coefficients a1j for j = 1,…,6 reflect differences in the binary model's intercept between the days of the week relative to Sunday, and the coefficients s1j for j = 1,…,6 reflect differences in the continuous model's intercept between the days of the week relative to Sunday, all holding constant the effects of Age and Gender in each model. The coefficient a2 and s2 are the effects of Age on the intercepts of the binary and continuous models, respectively, holding constant the effects of Gender and the days of the week. The coefficient a3 and s3 are the effects of Gender on the intercepts of the binary and continuous models, respectively, and so reflect differences in the respective variances between men and women, holding constant the effects of age and days of the week.
To model the residual variance of the continuous model part, Eq. (8) was expanded to include the covariates: where τ 0 , when exponentiated, is the residual variance on a Sunday for a man at the sample mean age of 49. The effects of the days of the week are τ 1j for j = 1,…,6 and reflect differences in the residual variance between the days of the week

Results
Bayesian estimates of the parameters of a two-part multilevel location scale model are shown in the last set of columns of both Tables 2 and 3. Estimates of fixed effects are in Table 2. As shown in Table 2, the interpretation of how the covariates relate to the log odds of watching TV, in terms of which effects differ from zero and the direction of the effects, is the same as that provided from the two-part multilevel model. It is notable is that the standard deviations of the posterior distributions tend to be larger under the two-part multilevel location scale model, leading to wider credible intervals. In the two-part multilevel model, the variance of the individual log odds was assumed to be constant across days, age and gender. Under the two-part multilevel location scale model, this variance could vary according to these covariates. As shown in Table 3, all credible intervals include zero, making it unclear about the direction of covariate effects. From these results, it is not clear whether or not it is reasonable to assume homogeneity of the intercept variance, as is done under a two-part multilevel model, but rather, if there is heterogeneity, the results suggest that it is not attributable to the measured covariates considered in this study.
As shown in Table 2, conclusions about how the covariates relate to the mean log time spent watching TV is similar under a two-part multilevel model and a two-part location scale model. This result is to be expected because the model for predicting the daily log mean time spent is specified in the same manner under the two models. The difference between the two models is that the latter permits heterogeneity of the random coefficient variance. In contrast to the twopart multilevel model that assumes homogeneity of variance of the individual mean log time spent across days, age and gender, this variance could vary according to these covariates under the two-part multilevel location scale model. As shown in Table 3, although the intercept variance was not related to age ( ̂ b2 = 0.01, 95%HPDI ∶ [−0.004, 0.02] ) o r g e n d e r ( ̂ b3 = −0.16, 95%HPDI ∶ [−0.39, 0.02] ) , there are indications of heterogeneity of variance by the day of the week. Specifically, although the intercept variance did not differ between Sunday and Saturday ( ̂ b1f = 0.06, 95%HPDI ∶ [−0.13, 0.25] ), there were greater between-person differences in the intercept variance on the remaining days relative to Sunday. This result suggests that people differ more from each other in their mean log time spent on weekdays.
Unlike the two-part multilevel model that assumed homogeneity of the within-person variance, the two-part multilevel location scale model permits this variance to be a function of measured covariates as well as unmeasured sources. According to the estimates in Table 3 that relate to the residual variance, relative to Sunday, within-person variance was reduced

Discussion and implications
Choices in study designs are driven by the analytic goals of an investigation. For analyses that aim to study mean differences between consumers, such as consumer segmentation research, data collection at a single point in time is appropriate. There are numerous advantages to such study designs, including the ease of administration and timely data collection. If, however, the inherent interest involves questions about how consumer-level behaviors change over time, repeated measures are necessary.
Multilevel models are among the most popular statistical frameworks for analyzing repeated measures data. The flexibility of these models to handle a wide range of variable types, in addition to handling missing data, have made these models essential for marketing analytics. Not only are these models applied in regression analyses, but they have been used to advance regression-based techniques, including structural equation models for complex samples (Hirschmann and Swoboda 2017). Their applications are appropriate for both repeated measures data in which individuals are studied over time, as well as for clustered data, such as for cross-sectional studies of individuals clustered according to non-overlapping groups, including consumer segments defined by geographic location.
Multilevel models applied to repeated measures data are subject-specific models, meaning that the model is defined at the individual level. This is in contrast to standard modeling techniques that are applied to cross-sectional data, such as regression and analysis of variance, where parameters serve to characterize population-level aspects of behavior. In a regression model, for example, each coefficient of the model is the expectation of the effect of a covariate across individuals. Multilevel models, in contrast, provide information about the population in terms of the expectation of the effect of a covariate, but they also characterize the extent to which individuals differ from each other in the effect. This aspect of multilevel models has made them especially important as a tool in marketing analytics to understand individual differences in consumer research (Blozis et al. 2019;Katakam et al. 2021;Kumar et al. 2017).
Recent developments in multilevel models have greatly expanded the ways in which repeated measures are analyzed to improve understanding of individual differences. Multilevel location scale models, an extension of a standard multilevel model, offer a means to study individual differences in behavioral variation over time. This is done by modeling the withinsubject residual variance of a continuous measure as a function of covariates. Covariates may vary with time or be time invariant. The model for the residual variance may also include a random subject effect to capture unmeasured influences on the variance. This latter point is especially important in applied research in which sources of variation are either unknown or were not part of the planned measures. Multilevel location scale models also make it possible to better understand the variation in random coefficients. Whereas a standard multilevel model results in an estimate of the variance of a random coefficient, a multilevel location scale model includes a model for the variance of a random coefficient so that it may be a function of covariates, including time-varying and time-invariant covariates. These ideas have been extended to special types of data, including semi-continuous data that are defined by a combination of a discrete and continuous values (Blozis et al. 2020).
Estimation of multilevel location scale models requires specialized statistical software (Blozis et al. 2020). That is, these models involve a particular specification of the variance of the residuals at the first level of the model, as well as the variances of the random effects at the second level, thus requiring that software permit the analyst this flexibility. PROC NLMIXED was developed for maximum likelihood estimation of nonlinear mixed-effects models and is highly flexible in how a model can be specified. Importantly, the procedure includes a general model statement that permits users to specify their own log-likelihood function, a feature that is suitable to problems for which a non-standard response distribution is needed. Thus, PROC NLMIXED is a popular choice for fitting multilevel location scale models.
Although maximum likelihood approaches represent a standard for estimation across many fields of study, Bayesian estimation is increasingly being used. Among the reasons to prefer Bayesian estimation over frequentist methods is the ability to inform an analysis with information gained from prior studies. That is, the posterior estimates that result from a Bayesian analysis involve information from the observed data in combination with prior information that is specified by the analyst. As a result, analysts can continually update their knowledge about a behavior by using prior knowledge to inform current analyses. In the context of consumer research, this implies that business managers can continually update their knowledge of consumers. Relative to ML estimation, Bayesian analyses do not require large samples. Bayesian estimation can be statistically powerful and achieve precision in estimation while requiring a smaller ratio of model parameters to sample size (Lee and Song 2004;Hox et al. 2012).
Depending on the statistical software program used, a Bayesian approach can also offer greater flexibility in what the analyst assumes about the parameters of a model. In the example presented in this paper, one aspect of the behavior was a measure of time that consumers spent engaged with media. Given that each measure of time spent was restricted to a 24-h period, estimation of the multilevel model using a Bayesian approach could include this information as part of the prior information assumed about some parameters of a model.
As the field of marketing analytics continues to advance, researchers have an increasing number of ways in which to analyze data. This paper showcased two-part multilevel models and two-part multilevel location scale models for the analysis of repeated measures of semi-continuous data. As marketing analytics continue to expand by involving more complex study designs, including longitudinal investigations, these tools offer analysts greater flexibility in their analyses and present new ways of thinking about consumer behavior research. By placing these methods within a Bayesian framework, these tools offer many promising advantages.
It is important for the analyst to understand limitations to using a Bayesian approach to estimation. Among these are that the selection of priors is dependent on the choices made by the analyst, and thus, there is no certainty that a correct choice was made. Thus, it is important to realize that without careful consideration of the priors, the results generated from a given analysis can be misleading. It is especially important to realize that the results from a given analysis can be heavily influenced by the choices in the priors when sample size is small, and consequently, the prior distributions weigh more heavily in the results produced (Smid et al. 2020). Concerns about the sensitivity of results to choices in priors can be lessened by reliance on relatively large sample sizes.

Appendix 1
SAS PROC MCMC syntax for estimation of the two-part multilevel model fitted to TV use reports.