In this chapter, we will consider a more complicated form of the state equation—one that contains both a forgetting factor and an external input. We will also extend the earlier state-space model to the case where we now have one binary observation and two continuous observations. As before, however, we will first consider a scenario motivating the need for such a state-space model.

Recall the example from two chapters ago concerning the estimation of sympathetic arousal from skin conductance features. In reality, it is not just the rate of occurrence of neural impulses to the sweat glands that reflects changes in arousal. Other features in a skin conductance signal also contain arousal information. A skin conductance signal comprises a fast-varying phasic component superimposed on a slower-varying tonic component. The phasic component consists of all the SCRs. The amplitudes of these SCRs (or equivalently, the amplitudes of the neural impulses that generated them), in addition to their occurrence, also reflect changes in arousal [59]. In particular, larger SCRs reflect greater sympathetic arousal. Additionally, the tonic level also contains information regarding general arousal [60]. Thus, there are three primary sources of information in a skin conductance signal that capture arousal levels: (i) the occurrence of SCRs (or equivalently, the occurrence of the neural impulses that generated the SCRs); (ii) the amplitudes of the SCRs (or the amplitudes of the neural impulses); (iii) the tonic component. These three make up one binary feature and two amplitude (continuous) features. A state-space model based on these three features was developed in [29] for estimating arousal from skin conductance. Here, a transformed version of the SCR amplitudes was interpolated over to derive the first continuous variable, and the tonic component was considered the second continuous variable. Different algorithms are available for separating out the tonic and phasic components in a skin conductance signal (e.g., [61, 62]).

A further note is worth mentioning here. Recall the \({\mathbf {u}}_{k}\) term from our discussion of state-space models at the beginning. Thus far, we have not considered a model where an external input also drives the state \(x_{k}\). In reality, external circumstances and environmental inputs all affect the way we feel. The model in [29] included such an external input term \(I_{k}\). The model was evaluated on two experimental datasets, one of which involved a Pavlovian fear conditioning experiment. In typical fear conditioning experiments, a neutral cue is paired with an unpleasant stimulus such as a painful electric shock. Through repeated pairing, the neutral cue alone begins to elicit a physiological response that is typically seen for the unpleasant stimulus [63]. In fear conditioning experiments, the unpleasant stimulus could also take other forms such as a blast of air to the throat, an aversive image, or a loud sound [64, 65]. In [29], the neutral cues along with the unpleasant shocks were modeled as binary-valued indicator inputs \(I_{k}\) that drove the sympathetic arousal state \(x_{k}\).

5.1 Deriving the Predict Equations in the State Estimation Step

Let us now turn our attention to the state-space model itself and assume that \(x_{k}\) evolves with time as

$$\displaystyle \begin{aligned} x_{k} &= \rho x_{k - 1} + \alpha I_{k} +\varepsilon_{k}, \end{aligned} $$
(5.1)

where \(\alpha \) is a constant and \(I_{k}\) is an external input. The other terms have their usual meanings. Let us again consider how we may derive the mean and variance using basic statistical formulas. Since we know what the external input is, we do not treat it as a random variable but rather as a constant term. We first consider the mean.

$$\displaystyle \begin{aligned} \mathbb{E}[x_{k}] &= \mathbb{E}[\rho x_{k - 1} + \alpha I_{k} + \varepsilon_{k}] \end{aligned} $$
(5.2)
$$\displaystyle \begin{aligned} &= \mathbb{E}[\rho x_{k - 1}] + \mathbb{E}[\alpha I_{k}] + \mathbb{E}[\varepsilon_{k}] \enspace \text{using (2.1)} \end{aligned} $$
(5.3)
$$\displaystyle \begin{aligned} &= \mathbb{E}[\rho x_{k - 1}] + \alpha I_{k} + \mathbb{E}[\varepsilon_{k}] \enspace \text{using (2.2)} \end{aligned} $$
(5.4)
$$\displaystyle \begin{aligned} &= \rho\mathbb{E}[x_{k - 1}] + \alpha I_{k} + \mathbb{E}[\varepsilon_{k}] \enspace \text{using (2.3)} \end{aligned} $$
(5.5)
$$\displaystyle \begin{aligned} &= \rho\mathbb{E}[x_{k - 1}] + \alpha I_{k} \enspace \text{since }\mathbb{E}[\varepsilon_{k}] = 0 \end{aligned} $$
(5.6)
$$\displaystyle \begin{aligned} \therefore \mathbb{E}[x_{k}] &= \rho x_{k - 1|k - 1} + \alpha I_{k}. \end{aligned} $$
(5.7)

We next consider the variance.

$$\displaystyle \begin{aligned} V(x_{k}) &= V(\rho x_{k - 1} + \alpha I_{k} + \varepsilon_{k}) \end{aligned} $$
(5.8)
$$\displaystyle \begin{aligned} &= V(\rho x_{k - 1} + \alpha I_{k}) + V(\varepsilon_{k}) + 2 Cov(\rho x_{k - 1} + \alpha I_{k}, \varepsilon_{k}) \enspace \text{using (2.4)} \end{aligned} $$
(5.9)
$$\displaystyle \begin{aligned} &= V(\rho x_{k - 1} + \alpha I_{k}) + V(\varepsilon_{k}) \\&\quad \text{since }\varepsilon_{k}\text{ is uncorrelated with any of the }x_{k}\text{ or }I_{k}\text{ terms} \end{aligned} $$
(5.10)
$$\displaystyle \begin{aligned} &= V(\rho x_{k - 1}) + V(\varepsilon_{k}) \enspace \text{using (2.5)} \end{aligned} $$
(5.11)
$$\displaystyle \begin{aligned} &= \rho^{2} V(x_{k - 1}) + V(\varepsilon_{k}) \enspace \text{using (2.6)} \end{aligned} $$
(5.12)
$$\displaystyle \begin{aligned} \therefore V(x_{k}) &= \rho^{2} \sigma^{2}_{k - 1|k - 1} + \sigma^{2}_{\varepsilon}. \end{aligned} $$
(5.13)

When \(x_{k}\) evolves with time following \(x_{k} = \rho x_{k - 1} + \alpha I_{k} + \varepsilon _{k}\), the predict equations in the state estimation step are

$$\displaystyle \begin{aligned} x_{k|k - 1} &= \rho x_{k - 1|k - 1} + \alpha I_{k} \end{aligned} $$
(5.14)
$$\displaystyle \begin{aligned} \sigma^{2}_{k|k - 1} &= \rho^{2} \sigma^{2}_{k - 1|k - 1} + \sigma^{2}_{\varepsilon}. \end{aligned} $$
(5.15)
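To make the connection to the MATLAB examples of Sect. 5.4 concrete, here is a minimal sketch of the predict step at a general \(k > 1\), where x_pred, v_pred, x_updt, and v_updt denote \(x_{k|k - 1}\), \(\sigma ^{2}_{k|k - 1}\), \(x_{k|k}\), and \(\sigma ^{2}_{k|k}\), and ve, rho, and alpha denote \(\sigma ^{2}_{\varepsilon }\), \(\rho \), and \(\alpha \); the input vector name I (for \(I_{k}\)) is an assumption here.

% predict equations (5.14) and (5.15) at a general time index k > 1
x_pred(k) = rho * x_updt(k - 1) + alpha * I(k);
v_pred(k) = rho ^ 2 * v_updt(k - 1) + ve;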

5.2 Deriving the Update Equations in the State Estimation Step

In this state-space model, we include a second continuous variable \(s_{k}\). Similar to \(r_{k}\), we will assume that \(s_{k}\) too is linearly related to \(x_{k}\) as

$$\displaystyle \begin{aligned} s_{k} &= \delta_{0} + \delta_{1}x_{k} + w_{k}{}, \end{aligned} $$
(5.16)

where \(\delta _{0}\) and \(\delta _{1}\) are constants and \(w_{k} \sim \mathcal {N}(0, \sigma ^{2}_{w})\) is sensor noise. Similar to the case of \(r_{k}\) in (4.16), we also have

$$\displaystyle \begin{aligned} p(s_{k}|x_{k}) &= \frac{1}{\sqrt{2\pi \sigma^{2}_{w}}}e^{\frac{-(s_{k} - \delta_{0} - \delta_{1}x_{k})^{2}}{2\sigma^{2}_{w}}}. \end{aligned} $$
(5.17)

The procedure to derive the update equations in the state estimation step is now similar to what we have seen earlier. With \(s_{k}\) included, we have yet another exponent term in \(p(x_{k}|y_{1:k})\). Therefore,

$$\displaystyle \begin{aligned} p(x_{k}|y_{1:k}) &\propto p(n_{k}|x_{k})p(r_{k}|x_{k})p(s_{k}|x_{k})p(x_{k}|n_{1:k - 1}, r_{1:k - 1}, s_{1:k - 1}) {} \end{aligned} $$
(5.18)
$$\displaystyle \begin{aligned} &\propto e^{n_{k}\log(p_{k}) + (1 - n_{k})\log(1 - p_{k})} \times e^{\frac{-(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}}{2\sigma^{2}_{v}}} \times e^{\frac{-(s_{k} - \delta_{0} - \delta_{1}x_{k})^{2}}{2\sigma^{2}_{w}}}\\&\quad \times e^{\frac{-(x_{k} - x_{k|k - 1})^{2}}{2 \sigma^{2}_{k|k - 1}}}. \end{aligned} $$
(5.19)

Taking the log on both sides, we have

$$\displaystyle \begin{aligned} q &= n_{k}\log(p_{k}) + (1 - n_{k})\log(1 - p_{k}) - \frac{(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}}{2\sigma^{2}_{v}} - \frac{(s_{k} - \delta_{0} - \delta_{1}x_{k})^{2}}{2\sigma^{2}_{w}}\\ &\quad - \frac{(x_{k} - x_{k|k - 1})^{2}}{2\sigma^{2}_{k|k - 1}} + \text{constant}. \end{aligned} $$
(5.20)

Taking the first derivative of q and setting it to 0 yield

$$\displaystyle \begin{aligned} \frac{d q}{d x_{k}} &= n_{k} - p_{k} + \frac{\gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k})}{\sigma^{2}_{v}} + \frac{\delta_{1}(s_{k} - \delta_{0} - \delta_{1}x_{k})}{\sigma^{2}_{w}} \\ &\quad - \frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} = 0{}. \end{aligned} $$
(5.21)

We used a trick in the previous chapter to solve for \(x_{k}\): we added and subtracted \(\gamma _{1}x_{k|k - 1}\) within the term containing \(r_{k}\). We will do the same here and will likewise add and subtract \(\delta _{1}x_{k|k - 1}\) within the term containing \(s_{k}\).

$$\displaystyle \begin{aligned} 0 &= n_{k} - p_{k} + \frac{\gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k} + \gamma_{1}x_{k|k - 1} - \gamma_{1}x_{k|k - 1})}{\sigma^{2}_{v}}\\ &\quad + \frac{\delta_{1}(s_{k} - \delta_{0} - \delta_{1}x_{k} + \delta_{1}x_{k|k - 1} - \delta_{1}x_{k|k - 1})}{\sigma^{2}_{w}} - \frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} \end{aligned} $$
(5.22)
$$\displaystyle \begin{aligned} &= n_{k} - p_{k} + \frac{\gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1})}{\sigma^{2}_{v}} - \frac{\gamma_{1}^{2}(x_{k} - x_{k|k - 1})}{\sigma^{2}_{v}}\\ &\quad + \frac{\delta_{1}(s_{k} - \delta_{0} - \delta_{1}x_{k|k - 1})}{\sigma^{2}_{w}} - \frac{\delta_{1}^{2}(x_{k} - x_{k|k - 1})}{\sigma^{2}_{w}} - \frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} \end{aligned} $$
(5.23)
$$\displaystyle \begin{aligned} \implies \frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} &+ \Bigg(\frac{\gamma_{1}^{2}}{\sigma^{2}_{v}} + \frac{\delta_{1}^{2}}{\sigma^{2}_{w}}\Bigg)(x_{k} - x_{k|k - 1})\\ &= n_{k} - p_{k} + \frac{\gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1})}{\sigma^{2}_{v}} + \frac{\delta_{1}(s_{k} - \delta_{0} - \delta_{1}x_{k|k - 1})}{\sigma^{2}_{w}} \end{aligned} $$
(5.24)

Therefore,

$$\displaystyle \begin{aligned} &\frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} + \Bigg(\frac{\gamma_{1}^{2}\sigma^{2}_{w} + \delta_{1}^{2}\sigma^{2}_{v}}{\sigma^{2}_{v}\sigma^{2}_{w}}\Bigg)(x_{k} - x_{k|k - 1})\\ &\quad = n_{k} - p_{k} + \frac{\gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1})}{\sigma^{2}_{v}}+ \frac{\delta_{1}(s_{k} - \delta_{0} - \delta_{1}x_{k|k - 1})}{\sigma^{2}_{w}} \end{aligned} $$
(5.25)
$$\displaystyle \begin{aligned} &(x_{k} - x_{k|k - 1})\Bigg[\frac{\sigma^{2}_{v}\sigma^{2}_{w} + \sigma^{2}_{k|k - 1}(\gamma_{1}^{2}\sigma^{2}_{w} + \delta_{1}^{2}\sigma^{2}_{v})}{\sigma^{2}_{k|k - 1}\sigma^{2}_{v}\sigma^{2}_{w}}\Bigg] \\ &\quad = \frac{\sigma^{2}_{v}\sigma^{2}_{w}}{\sigma^{2}_{v}\sigma^{2}_{w}}(n_{k} - p_{k}) + \frac{\gamma_{1}\sigma^{2}_{w}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1})}{\sigma^{2}_{v}\sigma^{2}_{w}} \\&\qquad + \frac{\delta_{1}\sigma^{2}_{v}(s_{k} - \delta_{0} - \delta_{1}x_{k|k - 1})}{\sigma^{2}_{v}\sigma^{2}_{w}}. \end{aligned} $$
(5.26)

This yields the state update

$$\displaystyle \begin{aligned} x_{k|k} &= x_{k|k - 1} + \frac{\sigma^{2}_{k|k - 1}}{\sigma^{2}_{v}\sigma^{2}_{w} + \sigma^{2}_{k|k - 1}(\gamma^{2}_{1}\sigma^{2}_{w} + \delta^{2}_{1}\sigma^{2}_{v})}\\ &\quad \times\Big[\sigma^{2}_{v}\sigma^{2}_{w}(n_{k} - p_{k|k}) + \gamma_{1}\sigma^{2}_{w}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1}) \\&\quad + \delta_{1}\sigma^{2}_{v}(s_{k} - \delta_{0} - \delta_{1}x_{k|k - 1})\Big]. \end{aligned} $$
(5.27)

Likewise, the second derivative yields

$$\displaystyle \begin{aligned} \frac{d^{2}q}{dx_{k}^{2}} &= -p_{k}(1 - p_{k}) - \frac{\gamma_{1}^{2}}{\sigma^{2}_{v}} - \frac{\delta_{1}^{2}}{\sigma^{2}_{w}} - \frac{1}{\sigma^{2}_{k|k -1}}{}. \end{aligned} $$
(5.28)

And therefore,

$$\displaystyle \begin{aligned} \sigma^{2}_{k|k} &= \Bigg[\frac{1}{\sigma^{2}_{k|k - 1}} + p_{k|k}(1 - p_{k|k}) + \frac{\gamma_{1}^{2}}{\sigma^{2}_{v}} + \frac{\delta_{1}^{2}}{\sigma^{2}_{w}}\Bigg]^{-1}. \end{aligned} $$
(5.29)

When \(x_{k}\) gives rise to a binary observation \(n_{k}\) and two continuous observations \(r_{k}\) and \(s_{k}\), the update equations in the state estimation step are

$$\displaystyle \begin{aligned} x_{k|k} &= x_{k|k - 1} + \frac{\sigma^{2}_{k|k - 1}}{\sigma^{2}_{v}\sigma^{2}_{w} + \sigma^{2}_{k|k - 1}(\gamma^{2}_{1}\sigma^{2}_{w} + \delta^{2}_{1}\sigma^{2}_{v})}\\ &\quad \times\Big[\sigma^{2}_{v}\sigma^{2}_{w}(n_{k} - p_{k|k}) + \gamma_{1}\sigma^{2}_{w}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1}) \\&\quad + \delta_{1}\sigma^{2}_{v}(s_{k} - \delta_{0} - \delta_{1}x_{k|k - 1})\Big] \end{aligned} $$
(5.30)
$$\displaystyle \begin{aligned} \sigma^{2}_{k|k} &= \Bigg[\frac{1}{\sigma^{2}_{k|k - 1}} + p_{k|k}(1 - p_{k|k}) + \frac{\gamma_{1}^{2}}{\sigma^{2}_{v}} + \frac{\delta_{1}^{2}}{\sigma^{2}_{w}}\Bigg]^{-1}. \end{aligned} $$
(5.31)
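Note that \(p_{k|k}\) on the right-hand side of (5.30) is itself a function of \(x_{k|k}\) (through the sigmoid relationship \(p_{k|k} = 1/(1 + e^{-(\beta _{0} + x_{k|k})})\)), and so (5.30) is an implicit equation that must be solved numerically, just as in the previous chapter. Shown below is a minimal MATLAB sketch based on a simple fixed-point iteration (a Newton–Raphson solve is an alternative); the variable names anticipate Sect. 5.4, b0 stands for \(\beta _{0}\), and the fixed iteration count is an illustrative assumption.

% solve the implicit update (5.30) for x_updt(k); p_{k|k} depends on x_updt(k)
x_updt(k) = x_pred(k);                       % initialize at the prediction
for j = 1:50                                 % fixed-point iteration
    p = 1 / (1 + exp(-(b0 + x_updt(k))));    % p_{k|k} at the current iterate
    x_updt(k) = x_pred(k) + v_pred(k) * (vr * vs * (n(k) - p) ...
        + r1 * vs * (r(k) - r0 - r1 * x_pred(k)) ...
        + s1 * vr * (s(k) - s0 - s1 * x_pred(k))) ...
        / (vr * vs + v_pred(k) * (r1 ^ 2 * vs + s1 ^ 2 * vr));
end
p = 1 / (1 + exp(-(b0 + x_updt(k))));        % converged p_{k|k}
v_updt(k) = 1 / (1 / v_pred(k) + p * (1 - p) + r1 ^ 2 / vr + s1 ^ 2 / vs);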

5.3 Deriving the Parameter Estimation Step Equations

In this state-space model, we have the parameters \(\alpha \), \(\rho \), \(\sigma ^{2}_{\varepsilon }\), \(\gamma _{0}\), \(\gamma _{1}\), \(\delta _{0}\), \(\delta _{1}\), \(\sigma ^{2}_{v}\), and \(\sigma ^{2}_{w}\) to determine. We have already seen how \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\) were derived in the previous chapter when we had \(r_{k}\). We will not repeat those derivations here. Instead, we will only consider the derivations related to the new model parameters or changes to the way that earlier model parameters were derived. We will follow this same approach of not re-deriving earlier equations in the chapters that follow.

5.3.1 Deriving the Terms in the State Equation

We now have both \(\rho \) and \(\alpha \) in the state equation. To determine them at the parameter estimation step, we will take the partial derivatives of the log-likelihood term containing \(\rho \) and \(\alpha \). In this case, the term we are interested in is

$$\displaystyle \begin{aligned} Q_{1} &= \frac{1}{2\sigma^{2}_{\varepsilon}}\sum_{k = 1}^{K}\mathbb{E}\Big[(x_{k} - \rho x_{k - 1} - \alpha I_{k})^{2}\Big]. \end{aligned} $$
(5.32)

Again, we set \(x_{0} = x_{1}\) to permit some bias at the beginning and ignore the relationship through \(\rho \) for this boundary condition. Therefore,

$$\displaystyle \begin{aligned} Q_{1} &= \frac{1}{2\sigma^{2}_{\varepsilon}}\Bigg\{\sum_{k = 2}^{K}\mathbb{E}\Big[(x_{k} - \rho x_{k - 1} - \alpha I_{k})^{2}\Big] + \mathbb{E}\Big[(\alpha I_{1})^{2}\Big]\Bigg\}. \end{aligned} $$
(5.33)

We will now take the partial derivatives of \(Q_{1}\) with respect to \(\alpha \) and \(\rho \) and set them to 0. Let us first begin with \(\alpha \).

$$\displaystyle \begin{aligned} \frac{\partial Q_{1}}{\partial \alpha} &= \frac{1}{2\sigma^{2}_{\varepsilon}}\Bigg\{\sum_{k = 2}^{K}\mathbb{E}\big[-2I_{k}(x_{k} - \rho x_{k - 1} - \alpha I_{k})\big] + 2\alpha I_{1}^{2}\Bigg\} = 0 \end{aligned} $$
(5.34)
$$\displaystyle \begin{aligned} \implies 0 &= -\sum_{k = 2}^{K}I_{k}\mathbb{E}\big[x_{k}\big] + \rho\sum_{k = 2}^{K}I_{k}\mathbb{E}\big[x_{k - 1}\big] + \alpha\sum_{k = 1}^{K}I_{k}^{2} \\ &= -\sum_{k = 2}^{K}I_{k}x_{k|K} + \rho\sum_{k = 2}^{K}I_{k}x_{k - 1|K} + \alpha\sum_{k = 1}^{K}I_{k}^{2} \end{aligned} $$
(5.35)
$$\displaystyle \begin{aligned} \implies \rho\sum_{k = 2}^{K}I_{k}x_{k - 1|K} &+ \alpha\sum_{k = 1}^{K}I_{k}^{2} = \sum_{k = 2}^{K}I_{k}x_{k|K}. {} \end{aligned} $$
(5.36)

We will next take the partial derivative of \(Q_{1}\) with respect to \(\rho \).

$$\displaystyle \begin{aligned} \frac{\partial Q_{1}}{\partial \rho} &= \frac{1}{2\sigma^{2}_{\varepsilon}}\sum_{k = 2}^{K}\mathbb{E}\big[-2x_{k - 1}(x_{k} - \rho x_{k - 1} - \alpha I_{k})\big] = 0 \end{aligned} $$
(5.37)
$$\displaystyle \begin{aligned} \implies 0 &= -\sum_{k = 2}^{K}\mathbb{E}\big[x_{k}x_{k - 1}\big] + \rho\sum_{k = 2}^{K}\mathbb{E}\big[x_{k - 1}^{2}\big] + \alpha\sum_{k = 2}^{K}I_{k}\mathbb{E}\big[x_{k - 1}\big] \\ &= -\sum_{k = 1}^{K - 1}U_{k, k + 1} + \rho\sum_{k = 1}^{K - 1}U_{k} + \alpha\sum_{k = 2}^{K}I_{k}x_{k - 1|K} \end{aligned} $$
(5.38)
$$\displaystyle \begin{aligned} \implies \rho\sum_{k = 1}^{K - 1}U_{k} &+ \alpha\sum_{k = 2}^{K}I_{k}x_{k - 1|K} = \sum_{k = 1}^{K - 1}U_{k, k + 1}. \end{aligned} $$
(5.39)

We now have two equations with which to solve for \(\alpha \) and \(\rho \).

The parameter estimation step updates for \(\rho \) and \(\alpha \) when \(x_{k}\) evolves with time following \(x_{k} = \rho x_{k - 1} + \alpha I_{k} + \varepsilon _{k}\) are

$$\displaystyle \begin{aligned} \Bigg[\begin{array}{c} \rho \\ \alpha \end{array}\Bigg] = &\Bigg[\begin{array}{cc} \sum_{k = 1}^{K - 1}U_{k} & \sum_{k = 2}^{K}I_{k}x_{k - 1|K} \\ \sum_{k = 2}^{K}I_{k}x_{k - 1|K} & \sum_{k = 1}^{K}I_{k}^{2}\end{array}\Bigg]^{-1} \Bigg[\begin{array}{c}\sum_{k = 1}^{K - 1}U_{k, k + 1} \\ \sum_{k = 2}^{K}I_{k}x_{k|K}\end{array}\Bigg]. \end{aligned} $$
(5.40)
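In code, (5.40) is a 2 × 2 linear solve. The sketch below assumes the smoothed quantities have been collected into column vectors after the smoothing pass: x_smth for \(x_{k|K}\), U(k) for \(U_{k}\), and U_cross(k) for \(U_{k, k + 1}\) (the names U and U_cross are illustrative), with I holding the inputs \(I_{k}\).

% parameter estimation step update (5.40) for rho and alpha
M = [sum(U(1:K - 1)),                 sum(I(2:K) .* x_smth(1:K - 1));
     sum(I(2:K) .* x_smth(1:K - 1)),  sum(I .^ 2)];
b = [sum(U_cross(1:K - 1));
     sum(I(2:K) .* x_smth(2:K))];
est = M \ b;                  % solve the 2 x 2 linear system
rho = est(1);
alpha = est(2);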

5.3.2 Deriving the Process Noise Variance

We next consider the parameter estimation step update for the process noise variance \(\sigma ^{2}_{\varepsilon }\). The log-likelihood term containing \(\sigma ^{2}_{\varepsilon }\) is

$$\displaystyle \begin{aligned} Q_{2} &= \frac{-K}{2}\log\big(2\pi \sigma^{2}_{\varepsilon}\big) - \sum_{k = 1}^{K}\frac{\mathbb{E}\Big[(x_{k} - \rho x_{k - 1} - \alpha I_{k})^{2}\Big]}{2\sigma^{2}_{\varepsilon}} \\ &= \frac{-K}{2}\log\big(2\pi \sigma^{2}_{\varepsilon}\big) - \sum_{k = 2}^{K}\frac{\mathbb{E}\Big[(x_{k} - \rho x_{k - 1} - \alpha I_{k})^{2}\Big]}{2\sigma^{2}_{\varepsilon}} - \frac{\mathbb{E}\Big[(\alpha I_{1})^{2}\Big]}{2\sigma^{2}_{\varepsilon}}. \end{aligned} $$
(5.41)

We take the partial derivative of \(Q_{2}\) with respect to \(\sigma ^{2}_{\varepsilon }\) and set it to 0 to solve for the parameter estimation step update.

$$\displaystyle \begin{aligned} \frac{\partial Q_{2}}{\partial \sigma^{2}_{\varepsilon}} &= \frac{-K}{2\sigma^{2}_{\varepsilon}} + \frac{1}{2\sigma^{4}_{\varepsilon}}\sum_{k = 2}^{K}\mathbb{E}\Big[(x_{k} - \rho x_{k - 1} - \alpha I_{k})^{2}\Big] + \frac{(\alpha I_{1})^{2}}{2\sigma^{4}_{\varepsilon}} = 0 \end{aligned} $$
(5.42)
$$\displaystyle \begin{aligned} \sigma^{2}_{\varepsilon} &= \frac{1}{K}\sum_{k = 2}^{K}\Big\{\mathbb{E}\big[x_{k}^{2}\big] - 2\rho\mathbb{E}\big[x_{k}x_{k - 1}\big] + \rho^{2}\mathbb{E}\big[x_{k - 1}^{2}\big] - 2\alpha I_{k}\mathbb{E}\big[x_{k}\big]\\ &\quad + 2\alpha\rho I_{k}\mathbb{E}\big[x_{k - 1}\big]\Big\} + \frac{\alpha^{2}}{K}\sum_{k = 1}^{K}I_{k}^{2} \\ = &\frac{1}{K}\Bigg\{\sum_{k = 2}^{K}U_{k} - 2\rho \sum_{k = 1}^{K - 1}U_{k, k + 1} + \rho^{2}\sum_{k = 1}^{K - 1}U_{k} - 2\alpha\sum_{k = 2}^{K} I_{k}x_{k|K} \\ &\quad + 2\alpha\rho \sum_{k = 2}^{K}I_{k}x_{k - 1|K} + \alpha^{2}\sum_{k = 1}^{K}I_{k}^{2}\Bigg\}. \end{aligned} $$
(5.43)

The parameter estimation step update for \(\sigma ^{2}_{\varepsilon }\) when \(x_{k}\) evolves with time following \(x_{k} = \rho x_{k - 1} + \alpha I_{k} + \varepsilon _{k}\) is

$$\displaystyle \begin{aligned} \sigma^{2}_{\varepsilon} &= \frac{1}{K}\Bigg\{\sum_{k = 2}^{K}U_{k} - 2\rho \sum_{k = 1}^{K - 1}U_{k, k + 1} + \rho^{2}\sum_{k = 1}^{K - 1}U_{k} - 2\alpha\sum_{k = 2}^{K} I_{k}x_{k|K}\\ &\quad + 2\alpha\rho \sum_{k = 2}^{K}I_{k}x_{k - 1|K} + \alpha^{2}\sum_{k = 1}^{K}I_{k}^{2}\Bigg\} . \end{aligned} $$
(5.44)
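Under the same vector conventions as before (U, U_cross, x_smth, and I), (5.44) translates directly into a single MATLAB expression; a sketch:

% parameter estimation step update (5.44) for the process noise variance
ve = (sum(U(2:K)) - 2 * rho * sum(U_cross(1:K - 1)) ...
    + rho ^ 2 * sum(U(1:K - 1)) - 2 * alpha * sum(I(2:K) .* x_smth(2:K)) ...
    + 2 * alpha * rho * sum(I(2:K) .* x_smth(1:K - 1)) ...
    + alpha ^ 2 * sum(I .^ 2)) / K;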

5.3.3 Deriving the Constant Coefficient Terms and the Sensor Noise Variance

The procedure for deriving the constant coefficients \(\delta _{0}\) and \(\delta _{1}\) related to \(s_{k}\) is similar to what we have seen earlier for \(\gamma _{0}\) and \(\gamma _{1}\). The derivation of the sensor noise variance \(\sigma ^{2}_{w}\) is also similar to that for \(\sigma ^{2}_{v}\).

The parameter estimation step updates for \(\delta _{0}\), \(\delta _{1}\), and \(\sigma ^{2}_{w}\) when we observe a second continuous variable \(s_{k} = \delta _{0} + \delta _{1}x_{k} + w_{k}\) are

$$\displaystyle \begin{aligned} \Bigg[\begin{array}{c} \delta_{0} \\ \delta_{1}\end{array}\Bigg] &= \Bigg[\begin{array}{cc} K & \sum_{k = 1}^{K}x_{k|K} \\ \sum_{k = 1}^{K}x_{k|K} & \sum_{k = 1}^{K}U_{k}\end{array}\Bigg]^{-1} \Bigg[\begin{array}{c}\sum_{k = 1}^{K}s_{k} \\ \sum_{k = 1}^{K}s_{k}x_{k|K}\end{array}\Bigg] \end{aligned} $$
(5.45)
$$\displaystyle \begin{aligned} \sigma^{2}_{w} &= \frac{1}{K} \Bigg\{\sum_{k = 1}^{K}s_{k}^{2} + K\delta_{0}^{2} + \delta_{1}^{2}\sum_{k = 1}^{K}U_{k} - 2\delta_{0}\sum_{k = 1}^{K}s_{k} - 2\delta_{1}\sum_{k = 1}^{K}s_{k}x_{k|K}\\ &\quad + 2\delta_{0}\delta_{1}\sum_{k = 1}^{K}x_{k|K}\Bigg\} . \end{aligned} $$
(5.46)
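In code, with s holding the observations \(s_{k}\), the updates (5.45) and (5.46) mirror those for \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\); a sketch under the same vector conventions:

% parameter estimation step updates (5.45) and (5.46) for s0, s1, and vs
M = [K,            sum(x_smth);
     sum(x_smth),  sum(U)];
d = M \ [sum(s); sum(s .* x_smth)];
s0 = d(1);
s1 = d(2);
vs = (sum(s .^ 2) + K * s0 ^ 2 + s1 ^ 2 * sum(U) - 2 * s0 * sum(s) ...
    - 2 * s1 * sum(s .* x_smth) + 2 * s0 * s1 * sum(x_smth)) / K;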

5.4 MATLAB Examples

The code examples implementing the EM algorithm for the current state-space model are provided in the “one_bin_two_cont\sim” and “one_bin_two_cont\expm” folders. These two directories contain the following files:

  • one_bin_two_cont\

    • sim\

      • data_one_bin_two_cont.mat

      • filter_one_bin_two_cont.m

      • data_one_bin_two_cont_no_extern_stim.mat

      • filter_one_bin_two_cont_no_extern_stim.m

    • expm\

      • expm_data_one_bin_two_cont.mat

      • expm_filter_one_bin_two_cont.m

      • expm_data_one_bin_two_cont_no_extern_stim.mat

      • expm_filter_one_bin_two_cont_no_extern_stim.m

For both simulated and experimental data, we have provided examples with and without \(\alpha I_{k}\) (the external input). Results from running the code on a simulated example with \(\alpha I_{k}\) included and on an experimental data example without \(\alpha I_{k}\) are shown in Fig. 5.1. For simulated and experimental data containing \(\alpha I_{k}\), the “.m” files are named “filter_one_bin_two_cont.m” and “expm_filter_one_bin_two_cont.m,” respectively. The corresponding examples without \(\alpha I_{k}\) have the “no_extern_stim” suffix added to them.

Fig. 5.1

State estimation based on observing one binary and two continuous variables. The left sub-figure depicts estimation on simulated data, and the right sub-figure depicts the estimation of sympathetic arousal from skin conductance data. The sub-panels on the left, respectively, depict: (a) the probability of binary event occurrence \(p_{k}\) (blue) and its estimate (red) (the green and black dots at the top denote the presence or absence of binary events, respectively); (b) the first continuous variable \(r_{k}\) (blue) and its estimate (red); (c) the second continuous variable \(s_{k}\) (blue) and its estimate (red); (d) the state \(x_{k}\) (blue) and its estimate (red) (the cyan and black dots denote the presence or absence of external binary inputs, respectively); (e) the QQ plot for the residual error of \(x_{k}\). The sub-panels on the right, respectively, depict: (a) the skin conductance signal (the green and black dots on top denote the presence or absence of SCRs, respectively); (b) the phasic-derived variable \(r_{k}\) (green solid) and its estimate (dotted); (c) the tonic level \(s_{k}\) (pink solid) and its estimate (dotted); (d) the arousal state \(x_{k}\) and its 95% confidence limits; (e) the probability of SCR occurrence \(p_{k}\) and its 95% confidence limits; (f) the HAI (the regions above 90% and below 10% are shaded in red and green, respectively). The background colors on the right sub-figure correspond to the instruction period, a counting task, a color–word association task, relaxation, and watching a horror movie clip. From [32], used under Creative Commons CC-BY license

In this case, the model takes in as inputs the variables n, r, and s that denote \(n_{k}\), \(r_{k}\), and \(s_{k}\), respectively, for estimating \(x_{k}\). Since there are three different observations, the code also has more parameters to initialize. In the code, the variables r0 and r1 are used for \(\gamma _{0}\) and \(\gamma _{1}\), respectively, and s0 and s1 are used for \(\delta _{0}\) and \(\delta _{1}\). The variables vr and vs denote the corresponding sensor noise variances \(\sigma ^{2}_{v}\) and \(\sigma ^{2}_{w}\). Finally, ve, rho, and alpha denote the process noise variance \(\sigma ^{2}_{\varepsilon }\), the forgetting factor \(\rho \), and the \(\alpha \) term related to \(I_{k}\), respectively. A brief code snippet illustrating the parameter initialization is shown below.
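The snippet below is a minimal sketch of such an initialization; the starting values are illustrative, and the ones in the provided files may differ. The smoothed estimates x_smth and v_smth are also allocated here, since in this sketch the boundary condition \(x_{0} = x_{1}\) draws on the previous EM iteration's smoothed estimate.

% parameter initialization (illustrative starting values only)
K = length(n);
base_prob = sum(n) / K;        % average probability of n_k = 1
r0 = mean(r); r1 = 1;          % r_k = r0 + r1 * x_k + v_k
s0 = mean(s); s1 = 1;          % s_k = s0 + s1 * x_k + w_k
vr = var(r); vs = var(s);      % sensor noise variances
ve = 0.05;                     % process noise variance
rho = 1; alpha = 1;            % forgetting factor and input gain
[x_smth, v_smth] = deal(zeros(K, 1));   % updated at every EM iteration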

Also, base_prob (\(p_{0}\)) is still calculated based on the average probability of \(n_{k} = 1\) occurring in the data. The variables x_pred, x_updt, and x_smth for \(x_{k|k - 1}\), \(x_{k|k}\), and \(x_{k|K}\) remain the same, as do the corresponding variance variables v_pred, v_updt, and v_smth. At the state estimation step, the code progresses sequentially through \(k = 1, 2, \ldots , K\) and then back through \(k = K, (K - 1), \ldots , 1\). The terms r0, r1, s0, s1, vr, vs, ve, rho, and alpha are calculated at the parameter estimation step. A code snippet from the forward progression is shown below.
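The following is a minimal sketch of that forward progression; the data vectors n, r, and s and the input vector I are assumed to be \(K \times 1\), and b0 (for \(\beta _{0}\)) is computed from base_prob.

% forward progression through k = 1, 2, ..., K
b0 = log(base_prob / (1 - base_prob));
[x_pred, v_pred, x_updt, v_updt] = deal(zeros(K, 1));

for k = 1:K
    % predict equations (5.14) and (5.15)
    if k == 1   % boundary condition x_0 = x_1
        x_pred(k) = rho * x_smth(1) + alpha * I(1);
        v_pred(k) = rho ^ 2 * v_smth(1) + ve;
    else
        x_pred(k) = rho * x_updt(k - 1) + alpha * I(k);
        v_pred(k) = rho ^ 2 * v_updt(k - 1) + ve;
    end

    % update equations (5.30) and (5.31), solved iteratively as in Sect. 5.2
    x_updt(k) = x_pred(k);
    for j = 1:50
        p = 1 / (1 + exp(-(b0 + x_updt(k))));
        x_updt(k) = x_pred(k) + v_pred(k) * (vr * vs * (n(k) - p) ...
            + r1 * vs * (r(k) - r0 - r1 * x_pred(k)) ...
            + s1 * vr * (s(k) - s0 - s1 * x_pred(k))) ...
            / (vr * vs + v_pred(k) * (r1 ^ 2 * vs + s1 ^ 2 * vr));
    end
    p = 1 / (1 + exp(-(b0 + x_updt(k))));
    v_updt(k) = 1 / (1 / v_pred(k) + p * (1 - p) + r1 ^ 2 / vr + s1 ^ 2 / vs);
end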

The code where we proceed in the reverse direction at the state estimation step is shown below. While it is largely similar to what we saw in an earlier chapter, the variable rho is now also included.
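A sketch of this reverse pass is given below. It also assembles the smoothed second moments U(k) (\(U_{k}\)) and U_cross(k) (\(U_{k, k + 1}\)) needed at the parameter estimation step, using the standard fixed-interval smoother identities \(U_{k} = x_{k|K}^{2} + \sigma ^{2}_{k|K}\) and \(U_{k, k + 1} = A_{k}\sigma ^{2}_{k + 1|K} + x_{k|K}x_{k + 1|K}\).

% reverse direction k = K, (K - 1), ..., 1; note rho in the smoothing gain
x_smth(K) = x_updt(K);
v_smth(K) = v_updt(K);
A = zeros(K - 1, 1);
for k = K - 1:-1:1
    A(k) = rho * v_updt(k) / v_pred(k + 1);   % smoothing gain
    x_smth(k) = x_updt(k) + A(k) * (x_smth(k + 1) - x_pred(k + 1));
    v_smth(k) = v_updt(k) + A(k) ^ 2 * (v_smth(k + 1) - v_pred(k + 1));
end

% smoothed second moments for the parameter estimation step
U = x_smth .^ 2 + v_smth;
U_cross = A .* v_smth(2:K) + x_smth(1:K - 1) .* x_smth(2:K);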

Note that in the examples where an external input is absent, alpha is not calculated. The state estimation step and the parameter estimation step are performed in turn until convergence.

5.4.1 Application to Skin Conductance and Sympathetic Arousal

This state-space model with one binary and two continuous observations was used in [29] for estimating sympathetic arousal from skin conductance. In the model, the tonic component made up the continuous variable \(s_{k}\). The other continuous variable \(r_{k}\) was derived somewhat differently. SCR amplitudes can have a skewed distribution, which a log transformation can help correct. Therefore, the log of the SCR amplitudes was taken and interpolated over to generate \(r_{k}\).

A further point is worth noting with experimental data. The estimated state \(x_{k}\) can occasionally overfit to one of the continuous variables [29]. Consequently, an additional constraint was applied that allowed the parameters corresponding to \(r_{k}\) and \(s_{k}\) (i.e., \(\gamma _{0}\), \(\gamma _{1}\), \(\sigma ^{2}_{v}\), \(\delta _{0}\), \(\delta _{1}\), and \(\sigma ^{2}_{w}\)) to update only if the sensor noise variance estimates did not differ by more than a certain amount. Details can be found in [29]. This constraint prevented one of the sensor noise variance estimates from being driven down at the expense of the other (which is what occurs during overfitting).

If the external inputs are unknown, the version of the code without \(\alpha I_{k}\) can be used. The experimental results in Fig. 5.1 are from a case where \(\alpha I_{k}\) is not considered. The data come from the stress experiment in [53], which we also considered two chapters ago. The portion of the experiment considered here consists of the cognitive stressors, relaxation, and the horror movie clip. The state estimates are high during the cognitive stressors and thereafter gradually diminish. However, the increase seen in the HAI at the beginning of the horror movie clip is quite significant.

Data from the Pavlovian fear conditioning experiment in [66] are used for the experimental code example containing the \(\alpha I_{k}\) term. The results are shown in Fig. 5.2. The experiment is described in detail in [67, 68]. In a typical fear conditioning experiment, a neutral cue is paired with an unpleasant stimulus such as a painful electric shock. Through repeated pairing, a subject begins to display a response to the neutral cue alone. In the experiment in [66], two types of cues were used. One of the cues never preceded the electric shock; this is labeled the CS- cue. The second cue, labeled CS+, preceded the shock 50% of the time. The code example sets \(I_{k} = 1\) at the locations of the neutral cues and the shocks. Other types of inputs may also be considered for \(I_{k}\). Figure 5.2 depicts the averages for the CS- trials, the CS+ trials that did not contain the electric shock, and the CS+ trials that did contain the shock. As seen in Fig. 5.2, for this particular subject, the CS+ with the shock elicited the highest skin conductance and sympathetic arousal responses. The CS- trials had the lowest skin conductance and arousal levels, and the CS+ without the shock had responses in between these two.

Fig. 5.2

State estimation in Pavlovian fear conditioning. The sub-panels, respectively, depict: (a) the skin conductance signal \(z_{k}\); (b) the phasic-derived variable (green solid) and its estimate (dashed); (c) the tonic level \(s_{k}\) (mauve solid) and its estimate (dashed); (d) the probability of SCR occurrence \(p_{k}\) (the cyan and black dots on top denote the presence or absence of SCRs, respectively); (e) the arousal state \(x_{k}\) (the green and black dots denote the presence or absence of external binary inputs, respectively); (f) the averages corresponding to different trials for skin conductance (CS\(-\): green, CS+ without the shock: mauve, CS+ with the shock: red); (g) the same averages for the arousal state \(x_{k}\). © 2020 IEEE. Reprinted, with permission, from [29]