Thus far we have considered binary observations and continuous observations in our state-space models. With binary observations, we do not consider the magnitudes of the binary-valued events (since each is just a 0 or a 1) but are merely interested in the event occurrences. Consequently, we can treat the spiking-type observations in the earlier chapter as binary-valued as well. There too, our concern was primarily with the occurrence of the cardiac contractions and the accompanying spikes in an EKG signal, but not the actual amplitudes of the spikes. But what happens when we observe a point process that is not just a sequence of zeros and ones but rather is a sequence of zeros and real-valued amplitudes? Such a point process forms a marked point process (MPP). These are encountered frequently in physiological state estimation applications as well. For instance, the sequence of neural impulses underlying a skin conductance signal forms an MPP (Fig. 3.2). So do pulsatile hormone secretory events. In this chapter, we will learn how to derive the state and parameter estimation step equations when the state-space model contains MPP observations.

As in the earlier chapters, we will begin by considering a motivating example. We can build many models, ranging from simple to complex, to account for physiological phenomena. Any mathematical abstraction of a real-world system will have some imperfections and will not fully account for all of the data. Occasionally, in engineering systems, we encounter cases where a simpler model performs better than a more complex one. The estimation of sympathetic arousal from skin conductance is one such case. The state-space model with one binary and two continuous observations is quite complex [29]. Despite its complexity, it is somewhat imperfect in that it interpolates over a log-transformed version of the SCR amplitudes. A more natural way to account for phasic skin conductance variations is to model the underlying neural impulses as an MPP [32]. This eliminates the need for the two continuous variables and is simpler.

A further algorithmic detail is also worth noting here. Mathematical models of real-world systems will always have some limitations. The limitations may lie in the model itself or have to do with issues that arise during computation. This book focuses on the estimation of unobserved physiological quantities that are related (fully or partially) to point process observations. Occasionally, when both binary and continuous variables are involved, the EM algorithm tends to converge to locations where there is a near-perfect fit to one of the continuous variables (i.e., overfitting occurs). The state-space model with one binary and two continuous observations has this tendency to overfit on experimental data, and additional constraints have to be put in place to control it [29]. The same issue can also occur in the model with one binary and one continuous observation. The MPP framework circumvents the need for a separate continuous-valued observation at every time index and thus avoids the need for external overfitting control. The simpler MPP state-space model for estimating arousal from skin conductance therefore performed quite well in comparison to others [32].

7.1 Deriving the Update Equations in the State Estimation Step

In this chapter also, we will assume that \(x_{k}\) evolves with time following one of the state equations we have already seen. Thus no new predict step equations have to be derived.

Recall from (3.21) that the PDF of a single (Bernoulli-distributed) binary observation \(n_{k}\) is

$$\displaystyle \begin{aligned} p(n_{k}|x_{k}) &= p_{k}^{n_{k}}(1 - p_{k})^{1 - n_{k}}. \end{aligned} $$
(7.1)

This same density function can be written as

$$\displaystyle \begin{aligned} p(n_{k}|x_{k}) = \begin{cases} 1 - p_{k} & \text{if } n_{k} = 0 \\ p_{k} & \text{if } n_{k} = 1 {}. \end{cases} \end{aligned} $$
(7.2)

In reality, we could derive our state estimation step update equations based on (7.2) as well. For instance, if we observed \(n_{k} = 0\) at time index k, the posterior density would be

$$\displaystyle \begin{aligned} p(x_{k}|n_{1:k}) \propto e^{n_{k}\log(p_{k}) + (1 - n_{k})\log(1 - p_{k})} \times e^{\frac{-(x_{k} - x_{k|k - 1})^{2}}{2 \sigma^{2}_{k|k - 1}}} = e^{\log(1 - p_{k})} \times e^{\frac{-(x_{k} - x_{k|k - 1})^{2}}{2 \sigma^{2}_{k|k - 1}}}, \end{aligned} $$
(7.3)

where we have substituted \(n_{k} = 0\) into the exponent of the first term. We could next take the first and second derivatives of the exponent to obtain the corresponding state estimation step update equations for \(x_{k|k}\) and \(\sigma ^{2}_{k|k}\). We could also do the same for \(n_{k} = 1\). In the case of \(n_{k} = 1\), we would have

$$\displaystyle \begin{aligned} p(x_{k}|n_{1:k}) \propto e^{n_{k}\log(p_{k}) + (1 - n_{k})\log(1 - p_{k})} \times e^{\frac{-(x_{k} - x_{k|k - 1})^{2}}{2 \sigma^{2}_{k|k - 1}}} = e^{\log(p_{k})} \times e^{\frac{-(x_{k} - x_{k|k - 1})^{2}}{2 \sigma^{2}_{k|k - 1}}}. \end{aligned} $$
(7.4)

Finally, we could express the update step equations for the two different cases based on an if-else condition. This would be of the form

if \(n_{k} = 0\),

$$\displaystyle \begin{aligned} x_{k|k} &= \ldots \end{aligned} $$
(7.5)
$$\displaystyle \begin{aligned} \sigma^{2}_{k|k} &= \ldots \end{aligned} $$
(7.6)

if \(n_{k} = 1\),

$$\displaystyle \begin{aligned} x_{k|k} &= \ldots \end{aligned} $$
(7.7)
$$\displaystyle \begin{aligned} \sigma^{2}_{k|k} &= \ldots. \end{aligned} $$
(7.8)

In the case of an MPP where we have non-zero amplitudes only at the instances where point process events occur, the density function for the observations is

$$\displaystyle \begin{aligned} p(n_{k} \cap r_{k}|x_{k}) = \begin{cases} 1 - p_{k} = e^{\log(1 - p_{k})} & \text{if } n_{k} = 0 \\ p_{k} \frac{1}{\sqrt{2 \pi \sigma^{2}_{v}}}e^{\frac{-(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}}{2\sigma^{2}_{v}}} = e^{\log(p_{k})} \frac{1}{\sqrt{2 \pi \sigma^{2}_{v}}} e^{\frac{-(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}}{2\sigma^{2}_{v}}} & \text{if } n_{k} = 1, \end{cases} \end{aligned} $$
(7.9)

where the point process event amplitudes (i.e., the marks) \(r_{k}\) are assumed to be linearly related to \(x_{k}\) through \(r_{k} = \gamma _{0} + \gamma _{1}x_{k} + v_{k}\), where \(v_{k} \sim \mathcal {N}(0, \sigma ^{2}_{v})\) is sensor noise.
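
To make the observation model concrete, the following is a minimal MATLAB sketch that simulates MPP observations from (7.9). It assumes a random-walk state equation and the sigmoid relationship \(p_{k} = 1/(1 + e^{-(\beta_{0} + x_{k})})\) used for \(p_{k}\) in the earlier chapters; all parameter values here are illustrative.

```matlab
% Illustrative simulation of MPP observations (hypothetical parameter values)
rng(0);
K = 2000;                                  % number of time indices
beta0 = -4; gamma0 = 1; gamma1 = 0.5;      % example coefficients
sigma2_e = 0.005; sigma2_v = 0.05;         % process and sensor noise variances

x = zeros(K, 1); n = zeros(K, 1); r = zeros(K, 1);
for k = 2:K
    x(k) = x(k - 1) + sqrt(sigma2_e) * randn;   % random-walk state equation
end
p = 1 ./ (1 + exp(-(beta0 + x)));               % event occurrence probability
for k = 1:K
    n(k) = rand < p(k);                         % binary event occurrence
    if n(k) == 1
        r(k) = gamma0 + gamma1 * x(k) + sqrt(sigma2_v) * randn;  % mark amplitude
    end
end
```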

Let us now proceed with calculating the update step equations for the two cases \(n_{k} = 0\) and \(n_{k} = 1\), using \(y_{1:k}\) to denote the combined observations \(\{n_{1:k}, r_{1:k}\}\). First consider \(n_{k} = 0\). Based on (7.9), the posterior density is

$$\displaystyle \begin{aligned} p(x_{k}|y_{1:k}) &\propto p(n_{k} \cap r_{k}|x_{k})p(x_{k}|n_{1:k - 1}, r_{1:k - 1}) \propto e^{\log(1 - p_{k})} \times e^{\frac{-(x_{k} - x_{k|k - 1})^{2}}{2\sigma^{2}_{k|k - 1}}}. \end{aligned} $$
(7.10)

We can now take the log of the right-hand side of (7.10), which we denote \(q_{1}\), take its derivative with respect to \(x_{k}\), and set it to 0 to solve for the mean. Recalling that \(\frac{dp_{k}}{dx_{k}} = p_{k}(1 - p_{k})\), this yields

$$\displaystyle \begin{aligned} \frac{d q_{1}}{d x_{k}} &= - \frac{1}{(1 - p_{k})}p_{k}(1 - p_{k}) - \frac{2(x_{k} - x_{k|k - 1})}{2\sigma^{2}_{k|k - 1}} = 0 \end{aligned} $$
(7.11)
$$\displaystyle \begin{aligned} \implies \frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} &= -p_{k} \end{aligned} $$
(7.12)
$$\displaystyle \begin{aligned} x_{k} &= x_{k|k - 1} + \sigma^{2}_{k|k - 1}(-p_{k}) \end{aligned} $$
(7.13)
$$\displaystyle \begin{aligned} x_{k} &= x_{k|k - 1} + \sigma^{2}_{k|k - 1}(n_{k} - p_{k})\enspace\text{since } n_{k} = 0. \end{aligned} $$
(7.14)

Interestingly, this is the same as (3.38) where we only had one binary observation \(n_{k}\) in the state-space model. Let us now calculate the variance by taking the second derivative.

$$\displaystyle \begin{aligned} \frac{d^{2}q_{1}}{d x_{k}^{2}} &= \frac{-1}{\sigma^{2}_{k|k - 1}} - p_{k}(1 - p_{k}). \end{aligned} $$
(7.15)

Again, interestingly, this turns out to be the same as (3.40) where we only had one binary observation. Therefore, when a point process event does not occur (i.e., when \(n_{k} = 0\)), our state estimation step update equations are similar to the case where we only had one binary observation in the state-space model.

We will next consider the case when \(n_{k} = 1\). Note that we will then have the \(r_{k}\) amplitude term as well. Based on (7.9), the posterior is now

$$\displaystyle \begin{aligned} p(x_{k}|y_{1:k}) &\propto e^{\log(p_{k})} \times e^{\frac{-(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}}{2\sigma^{2}_{v}}} \times e^{\frac{-(x_{k} - x_{k|k - 1})^{2}}{2 \sigma^{2}_{k|k - 1}}}. \end{aligned} $$
(7.16)

Taking the log of the right-hand side of (7.16), which we denote \(q_{2}\), and proceeding to take the first derivative, we have

$$\displaystyle \begin{aligned} \frac{dq_{2}}{dx_{k}} &= \frac{1}{p_{k}}p_{k}(1 - p_{k}) + \frac{\gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k})}{\sigma^{2}_{v}} - \frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} = 0. \end{aligned} $$
(7.17)

Since \(n_{k} = 1\), we will replace \((1 - p_{k})\) with \((n_{k} - p_{k})\). Therefore,

$$\displaystyle \begin{aligned} \frac{dq_{2}}{dx_{k}} &= (n_{k} - p_{k}) + \frac{\gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k})}{\sigma^{2}_{v}} - \frac{(x_{k} - x_{k|k - 1})}{\sigma^{2}_{k|k - 1}} = 0. \end{aligned} $$
(7.18)

This is the same as (4.21) where we had both a binary variable and a continuous variable in the state-space model. Therefore, based on (4.26), the mean update for \(x_{k}\) is

$$\displaystyle \begin{aligned} x_{k} &= x_{k|k - 1} + \frac{\sigma^{2}_{k|k - 1}}{\gamma_{1}^{2}\sigma^{2}_{k|k - 1} + \sigma^{2}_{v}}\Big[\sigma^{2}_{v}(n_{k} - p_{k}) + \gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1})\Big]. \end{aligned} $$
(7.19)

Also, when we take the second derivative, we end up with

$$\displaystyle \begin{aligned} \frac{d^{2}q_{2}}{dx_{k}^{2}} &= -p_{k}(1 - p_{k}) - \frac{\gamma_{1}^{2}}{\sigma^{2}_{v}} - \frac{1}{\sigma^{2}_{k|k - 1}} \end{aligned} $$
(7.20)

just like (4.28).

This provides an interesting insight. In the case of an MPP, the state estimation step update equations switch between those derived for a single binary observation and those derived for a binary observation paired with a continuous observation, depending on whether \(n_{k} = 0\) or \(n_{k} = 1\).

When \(x_{k}\) gives rise to MPP observations consisting of the pairs \((n_{k}, r_{k})\), the update equations in the state estimation step are

if \(n_{k} = 0\),

$$\displaystyle \begin{aligned} x_{k|k} &= x_{k|k - 1} + \sigma^{2}_{k|k - 1}(n_{k} - p_{k|k}) \end{aligned} $$
(7.21)
$$\displaystyle \begin{aligned} \sigma^{2}_{k|k} &= \Bigg[\frac{1}{\sigma^{2}_{k|k - 1}} + p_{k|k}(1 - p_{k|k})\Bigg]^{-1} \end{aligned} $$
(7.22)

if \(n_{k} = 1\),

$$\displaystyle \begin{aligned} x_{k|k} &= x_{k|k - 1} + \frac{\sigma^{2}_{k|k - 1}}{\gamma_{1}^{2}\sigma^{2}_{k|k - 1} + \sigma^{2}_{v}}\Big[\sigma^{2}_{v}(n_{k} - p_{k|k}) + \gamma_{1}(r_{k} - \gamma_{0} - \gamma_{1}x_{k|k - 1})\Big] \end{aligned} $$
(7.23)
$$\displaystyle \begin{aligned} \sigma^{2}_{k|k} &= \Bigg[\frac{1}{\sigma^{2}_{k|k - 1}} + p_{k|k}(1 - p_{k|k}) + \frac{\gamma_{1}^{2}}{\sigma^{2}_{v}}\Bigg]^{-1}. \end{aligned} $$
(7.24)

7.2 Deriving the Parameter Estimation Step Equations

The only changes that occur at the parameter estimation step relate to \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\). Parameter estimates for other variables such as the process noise variance \(\sigma ^{2}_{\varepsilon }\) do not change.

7.2.1 Deriving the Constant Coefficient Terms

Recall from (4.43) that when we observed one binary variable and one continuous variable, the probability term containing \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\) required at the parameter estimation step was

$$\displaystyle \begin{aligned} p(r_{1:K}|x_{1:K}, \Theta) &= \prod_{k = 1}^{K}\frac{1}{\sqrt{2\pi \sigma^{2}_{v}}} e^{\frac{-(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}}{2\sigma^{2}_{v}}}. \end{aligned} $$
(7.25)

Notice that the product is over \(k = 1, 2, \ldots , K\). This is when we observed a non-zero \(r_{k}\) at each point in time. When we observe an MPP variable as modeled in (7.9), \(r_{k}\) shows up only at the time indices where \(n_{k} = 1\). Let us assume that the point process events occur at time indices \(\tilde {K} \subseteq \{1, 2, \ldots , K\}\). Therefore, in the case of an MPP, the probability term we are interested in at the M-step will be

$$\displaystyle \begin{aligned} \prod_{k \in \tilde{K}}\frac{1}{\sqrt{2\pi \sigma^{2}_{v}}} e^{\frac{-(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}}{2\sigma^{2}_{v}}}, \end{aligned} $$
(7.26)

where the product is only over the specific indices \(\tilde {K}\) rather than everywhere. The corresponding log-likelihood term is therefore

$$\displaystyle \begin{aligned} Q &= \frac{-|\tilde{K}|}{2}\log\big(2\pi \sigma^{2}_{v}\big) - \sum_{k \in \tilde{K}}\frac{\mathbb{E}\Big[(r_{k} - \gamma_{0} - \gamma_{1}x_{k})^{2}\Big]}{2\sigma^{2}_{v}}. \end{aligned} $$
(7.27)

We can now proceed by taking the partial derivatives with respect to \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\), setting them to 0 and solving. This yields

$$\displaystyle \begin{aligned} \sum_{k \in \tilde{K}} r_{k} &= \gamma_{0}|\tilde{K}| + \gamma_{1}\sum_{k \in \tilde{K}} x_{k|K} \end{aligned} $$
(7.28)
$$\displaystyle \begin{aligned} \sum_{k \in \tilde{K}} r_{k}x_{k|K} &= \gamma_{0}\sum_{k \in \tilde{K}} x_{k|K} + \gamma_{1}\sum_{k \in \tilde{K}} U_{k} \end{aligned} $$
(7.29)
$$\displaystyle \begin{aligned} \sigma^{2}_{v} &= \frac{1}{|\tilde{K}|} \Bigg\{\sum_{k \in \tilde{K}}r_{k}^{2} + |\tilde{K}|\gamma_{0}^{2} + \gamma_{1}^{2}\sum_{k \in \tilde{K}}U_{k} - 2\gamma_{0}\sum_{k \in \tilde{K}}r_{k}\\ &\qquad - 2\gamma_{1}\sum_{k \in \tilde{K}} r_{k}x_{k|K} + 2\gamma_{0}\gamma_{1}\sum_{k \in \tilde{K}} x_{k|K}\Bigg\}. \end{aligned} $$
(7.30)

Note that all three equations shown above are similar to the case where a continuous variable was always present. Now, however, the summations are only over \(\tilde {K}\). Thus the parameter estimation step updates for \(\gamma _{0}\), \(\gamma _{1}\), and \(\sigma ^{2}_{v}\) are very similar to what we have seen before.

The parameter estimation step updates for \(\gamma _{0}\), \(\gamma _{1}\) and \(\sigma ^{2}_{v}\) when we observe an MPP variable with the amplitudes modeled as \(r_{k} = \gamma _{0} + \gamma _{1}x_{k} + v_{k}\) are

$$\displaystyle \begin{aligned} \Bigg[\begin{array}{c} \gamma_{0} \\ \gamma_{1}\end{array}\Bigg] &= \Bigg[\begin{array}{cc} |\tilde{K}| & \sum_{k \in \tilde{K}}x_{k|K} \\ \sum_{k \in \tilde{K}}x_{k|K} & \sum_{k \in \tilde{K}}U_{k}\end{array}\Bigg]^{-1} \Bigg[\begin{array}{c}\sum_{k \in \tilde{K}}r_{k} \\ \sum_{k \in \tilde{K}}r_{k}x_{k|K}\end{array}\Bigg] \end{aligned} $$
(7.31)
$$\displaystyle \begin{aligned} \sigma^{2}_{v} &= \frac{1}{|\tilde{K}|} \Bigg\{\sum_{k \in \tilde{K}}r_{k}^{2} + |\tilde{K}|\gamma_{0}^{2} + \gamma_{1}^{2}\sum_{k \in \tilde{K}}U_{k} - 2\gamma_{0}\sum_{k \in \tilde{K}}r_{k} - 2\gamma_{1}\sum_{k \in \tilde{K}} r_{k}x_{k|K}\\ &\qquad + 2\gamma_{0}\gamma_{1}\sum_{k \in \tilde{K}} x_{k|K}\Bigg\}. \end{aligned} $$
(7.32)
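
As an illustration, these updates can be computed in a few lines of MATLAB. The sketch below assumes that the smoothed means \(x_{k|K}\) and variances \(\sigma^{2}_{k|K}\) are available in x_smth and v_smth, and that \(U_{k} = x_{k|K}^{2} + \sigma^{2}_{k|K}\) as in the earlier chapters.

```matlab
% Sketch of the parameter estimation step updates (7.31) and (7.32)
idx = find(n == 1);                         % the index set K-tilde
Kt  = numel(idx);
xs  = x_smth(idx); rs = r(idx);
U   = x_smth(idx).^2 + v_smth(idx);         % E[x_k^2] given all the data

% solve the linear system (7.31) for gamma_0 and gamma_1
gam = [Kt, sum(xs); sum(xs), sum(U)] \ [sum(rs); sum(rs .* xs)];
gamma0 = gam(1); gamma1 = gam(2);

% sensor noise variance update (7.32)
sigma2_v = (sum(rs.^2) + Kt * gamma0^2 + gamma1^2 * sum(U) ...
            - 2 * gamma0 * sum(rs) - 2 * gamma1 * sum(rs .* xs) ...
            + 2 * gamma0 * gamma1 * sum(xs)) / Kt;
```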

7.3 MATLAB Examples

The MATLAB code examples for estimating \(x_{k}\) from a set of MPP observations are provided in the following folders:

  • one_mpp

    • sim\

      • data_one_mpp.mat

      • filter_one_mpp.m

    • expm\

      • expm_data_one_mpp.mat

      • expm_filter_one_mpp.m

The code examples estimate \(x_{k}\) based on the inputs \(n_{k}\) and \(r_{k}\) denoted by the variables n and r. A few differences are to be noted in this code compared to the previous examples. In the previous MATLAB examples, we had the predict, update, predict, update, etc. steps executed repeatedly for \(k = 1, 2, \ldots , K\). However, when we have MPP observations, we have two different filter update equations depending on the value of \(n_{k}\). The \(r_{k}\) amplitudes are only taken into account when \(n_{k} = 1\). Therefore, the state estimation step contains the following:
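
A minimal sketch of this predict-update loop is given below. It assumes a random-walk state equation \(x_{k} = x_{k - 1} + \varepsilon_{k}\) and the sigmoid relationship \(p_{k} = 1/(1 + e^{-(\beta_{0} + x_{k})})\); variable names other than n, r, x_pred, x_updt, v_pred, and v_updt are illustrative and may differ from those in the provided code.

```matlab
% Sketch of the filter loop for MPP observations (illustrative variable names)
for k = 1:K
    % predict step (random-walk state equation assumed)
    if k == 1
        x_pred(k) = x0;                        % initial guesses x0, v0
        v_pred(k) = v0 + sigma2_e;
    else
        x_pred(k) = x_updt(k - 1);
        v_pred(k) = v_updt(k - 1) + sigma2_e;
    end

    % update step: mean via get_posterior_mode, variance via (7.22) or (7.24)
    x_updt(k) = get_posterior_mode(x_pred(k), v_pred(k), n(k), r(k), ...
                                   beta0, gamma0, gamma1, sigma2_v);
    p_updt(k) = 1 / (1 + exp(-(beta0 + x_updt(k))));
    if n(k) == 0
        v_updt(k) = 1 / (1 / v_pred(k) + p_updt(k) * (1 - p_updt(k)));
    else
        v_updt(k) = 1 / (1 / v_pred(k) + p_updt(k) * (1 - p_updt(k)) ...
                         + gamma1^2 / sigma2_v);
    end
end
```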

The state update, also based on an if-else depending on the value of \(n_{k}\), is calculated using the get_posterior_mode(...) function shown below:
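
A sketch of such a function is shown below. It solves (7.21) or (7.23) for the posterior mode using Newton-Raphson iterations, since \(p_{k|k}\) itself depends on \(x_{k|k}\) through the sigmoid. The argument list, iteration limit, and tolerance are illustrative and may differ from those in the provided code.

```matlab
function x_post = get_posterior_mode(x_pred, v_pred, n_k, r_k, ...
                                     beta0, gamma0, gamma1, sigma2_v)
% Newton-Raphson solution of (7.21)/(7.23) for the posterior mode
    x_post = x_pred;                             % initialize at the prediction
    for it = 1:50
        p = 1 / (1 + exp(-(beta0 + x_post)));
        if n_k == 0
            f  = x_pred + v_pred * (n_k - p) - x_post;               % (7.21)
            df = -v_pred * p * (1 - p) - 1;
        else
            C  = v_pred / (gamma1^2 * v_pred + sigma2_v);
            f  = x_pred + C * (sigma2_v * (n_k - p) + ...
                 gamma1 * (r_k - gamma0 - gamma1 * x_pred)) - x_post; % (7.23)
            df = -C * sigma2_v * p * (1 - p) - 1;
        end
        x_new = x_post - f / df;                 % Newton-Raphson step
        if abs(x_new - x_post) < 1e-10
            x_post = x_new;
            break;
        end
        x_post = x_new;
    end
end
```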

The other variables used in the code largely remain the same. For instance, we still use x_pred, x_updt, and x_smth to denote \(x_{k|k - 1}\), \(x_{k|k}\), and \(x_{k|K}\), respectively, and v_pred, v_updt, and v_smth to denote the corresponding variances \(\sigma ^{2}_{k|k - 1}\), \(\sigma ^{2}_{k|k}\), and \(\sigma ^{2}_{k|K}\).
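
For completeness, the smoothed estimates can be obtained with the usual backward pass over the filtered values. A minimal sketch, again assuming the random-walk state equation, is:

```matlab
% Sketch of the fixed-interval smoother pass over the filtered estimates
x_smth(K) = x_updt(K); v_smth(K) = v_updt(K);
for k = (K - 1):-1:1
    A_k       = v_updt(k) / v_pred(k + 1);
    x_smth(k) = x_updt(k) + A_k * (x_smth(k + 1) - x_pred(k + 1));
    v_smth(k) = v_updt(k) + A_k^2 * (v_smth(k + 1) - v_pred(k + 1));
end
```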

7.3.1 Application to Skin Conductance and Sympathetic Arousal

As stated earlier, the sequence of neural impulses underlying the phasic variations in a skin conductance signal forms an MPP. This sequence of impulses is extracted via deconvolution. In the code example, the input (i.e., the deconvolved neural impulse train) is provided through the variables n and r. The variable r(k) has a non-zero amplitude whenever n(k) is equal to 1. The r(k) amplitudes are not taken into account when n(k) is 0. Running the MATLAB examples on simulated and experimental data produces the results shown in Fig. 7.1. The filter was used in [32] for estimating sympathetic arousal from deconvolved skin conductance data. The results on experimental data shown in the figure are based on the study described in [53] (seen in the earlier chapters as well). The study involved different types of stressors interspersed by periods of relaxation. The results of using the same code on the driver stress data in [54] for a particular subject are shown in Fig. 7.2.

Fig. 7.1

State estimation based on observing one MPP variable. The left sub-figure depicts the estimation on simulated data and the right sub-figure depicts the estimation of sympathetic arousal from skin conductance data. The sub-panels on the left, respectively, depict (a) the MPP observations (blue) and the estimated \(r_{k}\) (red), (b) the point process event occurrence probability \(p_{k}\) (blue) and its estimate (red), (c) the state \(x_{k}\) (blue) and its estimate (red), and (d) the QQ plot for the residual error of \(x_{k}\). The sub-panels on the right, respectively, depict (a) the skin conductance signal, (b) the neural impulses underlying phasic variations, (c) the arousal state \(x_{k}\) and its 95% confidence limits, (d) the probability of impulse occurrence \(p_{k}\) and its 95% confidence limits, and (e) the HAI (the regions above 90% and below 10% are shaded in red and green, respectively). The background colors on the right sub-figure correspond to the instruction period, a counting task, a color-word association task, relaxation, and watching a horror movie clip. From [32], used under Creative Commons CC-BY license

Fig. 7.2

Driver stress estimation. The sub-panels, respectively, depict (a) the skin conductance signal, (b) the neural impulses, (c) the arousal state \(x_{k}\) and its 95% confidence limits, (d) the probability of impulse occurrence and its 95% confidence limits, and (e) the HAI (the regions above 90% and below 10% are shaded in red and green, respectively). The background colors in turn denote rest, city driving, toll road, highway, toll road, city driving, toll road, highway, toll road, city driving, and rest. From [32], used under Creative Commons CC-BY license