Rejoinder on ‘Marked spatial point processes: current state and extensions to point processes on linear networks’

We are grateful to all discussants for their invaluable comments, suggestions, questions, and contributions to our article. We have attentively reviewed all discussions with keen interest. In this rejoin-der, our objective is to address and engage with all points raised by the discussants in a comprehensive and considerate manner. Consistently, we identify the discussants, in alphabetical order, as follows: CJK for Cronie, Jansson, and Konstantinou, DS for Stoyan, GP for Grabarnik and Pommerening, MRS for Myllym¨aki, Rajala, and S¨arkk¨a, and MCvL for van Lieshout throughout this rejoinder.


Introduction
In our article, we considered marked spatial point processes on two frequently examined state spaces, namely, R 2 and linear networks; the latter increasingly got attention during the past two decades (Moradi, 2018;Baddeley et al., 2021).Concerning marks, we focused on situations where the marks are qualitative, quantitative, or non-scalar, which we referred to as discrete and integer-valued marks, real-valued, and objectvalued marks.For practical applications, examples of object-valued quantities include the demographic evolution for different provinces of a country as in Ghorbani et al. (2021, Figure 1) or the total crown defoliation per year for a set of trees as in Eckardt et al. (2023, Figure 4).In the province data example, the potential object of interest might be determining the heterogeneity/correlation of the curves across space, specifically whether nearby provinces exhibit greater similarity in their demographic evolution.For the tree example, a potential research question could involve determining the spatial correlation or the variation of total crown defoliation across a forest.Leaving point process models aside, given the constraints of space, we explicitly focused on mark summary characteristics, which play undeniable roles in marked point process summary characteristics, we presented such properties for the unmarked version of X, denoted as X.
Nevertheless, CJK adequately presented the appropriate notation for such classes of spatial point processes in R 2 and on linear networks in the framework of the marked spatial point process X, which shed more light on these concepts for interested readers.
MCvL asked if mark correlation functions such as those presented in Tables 1 and 3 of our article, which in practice are applicable in the case of real-valued marks, are inherently tied to stationarity.These summary characteristics, such as Stoyan's mark correlation function κmm, are, given a fixed interpoint distance r, simply functions of the marks and do not account for the spatial distribution of points.In an inhomogeneous situation, for different interpoint distances r, the proportions of points satisfying dpx, yq " r are influenced by the spatial distribution of points.Thus, one might need to take into account not only the mark distribution of nearby points but also the spatial distribution of the points in question.In an effort to do so, mark-weighted K-functions are proposed by Penttinen et al. (1992) and Ghorbani et al. (2021) as mentioned by CJK in their discussion.In our article, we have also proposed mark-weighted K-functions for point processes on linear networks.However, the interpretation of mark-weighted summary characteristics might not be easy as different sources of variation exist, including pairwise interactions between points and between marks.
Concerning IRMPS models for point processes on linear networks, Cronie et al. (2020) showed that Poisson processes are one example of such models.Additionally, they identified the essential conditions under which log-Gaussian Cox point processes on linear networks can be classified as IRMPS.In fact, they showed that for the log-Gaussian Cox point process X on linear network L with a random intensity measure ΛpAq " where Z is a Gaussian random field on L with a mean function µpuq and a covariance function Cpu, vq, u, v P L, X becomes IRMPS if for any u P L and some function C. In other words, the covariance function Cpu 1 , u 2 q should only depend on the distances between u 1 , u 2 and an arbitrary point u P L; these are recently extended to spatio-temporal settings by Moradi and Sharifi (2024).The development of IRMPS models is still in its early stages.Besides the information provided on log-Gaussian Cox point processes, to the best of our knowledge, there have been no additional findings in the literature regarding models that meet the IRMPS criteria or the fundamental conditions required for certain models to qualify as IRMPS.Thus, at present and in theoretical forms, our proposed higher-order summary characteristics can be calculated for Poisson processes and log-Gaussian Cox point processes under condition (1).However, Cronie et al. (2020) and Moradi and Sharifi (2024) showed that, in practice, their proposed J-functions are generally able to detect clustering/regularity/randomness in practice.
Lastly, GP addressed the spatial limitation beyond linear network settings by emphasizing practical scenarios where a collection of (disjoint) sub-spaces within a larger observation window is considered the state space.Practical examples they provided included nesting sites of birds on trees or rocks within a larger landscape, as well as the utilization of geostatistical covariates to impose spatial restrictions.These examples seem to us quite interesting; however, to the best of our knowledge, they have not received much attention within the literature, and determining how to proceed in such scenarios may not be immediately clear.As GP also noted, in these cases, interpreting the results may pose challenges, and in our opinion, additional insights from practical experts might be necessary.
3 Mark filtering and non-parametric estimators Throughout our article, we did not discuss estimators for the summary characteristics we presented despite having estimated them in our applications.This was highlighted by CJK, GP, and MCvL.In particular, MCvL, in her discussion, presented non-parametric estimators for cross-type inhomogeneous nearestneighbor distance distribution function H inhom ij prq and intensity function λ i pxq, x P X i .Below, for marked point processes in R 2 , we briefly go through non-parametric estimators for some of the summary characteristics presented in our article.We also comment on estimators for the presented mark summary characteristics in settings where linear networks are the state space.
The cross-type K-function, presented in equation ( 4) of our article, is usually estimated via where epx, yq is en edge-correction and |W | is the size of the window where the marked point process X is observed (Møller and Waagepetersen, 2004).An estimator for the dot-type K-function can similarly be achieved.We add that these estimators can, in practice, be evaluated via functions Kcross.inhom and Kdot.inhom in the R package spatstat (Baddeley et al., 2015); few edge corrections are accessible (Baddeley et al., 2015, Chapter 7).For cross-type inhomogeneous nearest-neighbor distance distribution function given in equation ( 6) in our article, MCvL presented an estimator for which was originally presented by Cronie and van Lieshout (2016); bpx, rq stands for a disc of radius r centered at x, and W ar is an r-reduced window defined as where BW is the border of window W .An estimator for F inhom j prq is given as where G is a fine grid defined over W (van Lieshout, 2011).Having these two estimators, one can estimate the cross-type J-function.We add that, in practical situations, one can make use of function Jinhom.crossfrom the R package spatstat (Baddeley et al., 2015).It's worth noting that the inhomogeneous J-function extends beyond pairwise interactions, and it might be more precise than the K-function in certain scenarios.
In fact, as pointed out by DS, in the example of influenza virus proteins, the cross-type J-function seems to deviate from the theoretical value for a marked Poisson process quicker than the cross-type K-function.
This might have happened due to the construction of J-functions going beyond pairwise interactions, cf.
van Lieshout (2011, Figure 1).The performance of non-parametric estimators for K-and J-functions may also play a role.
Looking at the discussed estimators, it is evident that we must first estimate the intensity functions.
Note that in real scenarios, we do not have access to the true intensity functions, and thus, one needs to estimate them in advance.We add that based on our experience with at least K-functions for point processes on planar spaces, simulation studies have revealed that utilizing the true intensity functions in the estimators of K-functions results in a reduction in performance.Kernel estimation undeniably stands out as the primary method for non-parametric intensity estimation; this was also highlighted by MCvL.
Within the literature, for a given unmarked point pattern x " tx 1 , x 2 , . . ., xnu the most frequently used kernel-based intensity estimators are and where Kσ is a symmetric density function on R 2 with bandwidth σ, and is the mass of the kernel centred at u P W , playing the role of an edge corrector to compensate for the lack of information outside W .The estimator (5), which is unbiased if the true intensity is constant, is often called uniformly-corrected, and (6), which conserves mass, is called Jones-Diggle (Baddeley et al., 2015, Chapter 6).In situations where the observed window is not regular, according to Baddeley et al. (2022), these two estimators may suffer from tunnelling mass as well as simultaneous under and over-smoothing.
Thus, Baddeley et al. (2022) proposed a kernel-based intensity estimator where intensity is estimated via a transition probability density of a Brownian motion on W that respects a boundary.Their estimator is given as where t " σ 2 , σ is the smoothing bandwidth, and K t p¨|x i q is the heat kernel.The performance of the above intensity estimators heavily depends on the smoothing bandwidth, according to which a small bandwidth leads to low bias and high variance, whereas a large bandwidth yields high bias and low variance.Different bandwidth estimators, based on different perspectives, are proposed within the literature: Diggle's approach Diggle (1985), likelihood cross-validation (Loader, 2006), Scott's rule-of-thumb (Scott, 2015), and Cronie and van Lieshout's criterion (Cronie and Van Lieshout, 2018).Since a single smoothing bandwidth may not lead to an estimate which performs equally well all over the observed window, adaptive kernel-based intensity estimators are proposed, which use a set of locally estimated bandwidth (Abramson, 1982;Davies and Baddeley, 2018;Rakshit et al., 2019;van Lieshout, 2021) Mateu and Moradi (2024).
Turning to mark correlation functions, the numerator of the t f -correlation function is estimated via ř ‰ x,yPX t f pmpxq, mpyqq Kpdpx, yq ´rqwpx, yq where K is a kernel function on the real line and wpx, yq is an edge correction factor (Baddeley et al., 2015, Chapter 15); the normalization factor c t f is the sample average of t f pmpxq, mpyqq taken over all x, y P X.It's often noted in the literature that these estimators can be computed without edge corrections, particularly when both the numerator and denominator are estimated using the same principle (Illian et al., 2008, Chapter 5).In fact, in our article, such as in Figure 6, no edge-correction was applied in computing Stoyan's mark correlation function.
As for marked point processes on linear networks, we point to Baddeley et al. (2014) for non-parametric estimators of second-order cross/dot-type summary characteristics.Non-parametric estimators for markweighted K-functions could be defined by closely following Ang et al. (2012); Rakshit et al. (2017).Regarding cross/dot-type higher-order summary characteristics such as J-functions, their non-parametric estimators could be achieved by following Cronie and van Lieshout (2016); Cronie et al. (2020).Finally, non-parametric estimators of mark correlation functions would be of a similar nature as in ( 10) but with a distance metric adapted to linear networks.We emphasise that the choice of distance metric might very well depend on the application under study.
4 Marked spatial point process models In our article, we restricted our focus to the current state-of-the-art for mark summary characteristics.CJK and MRS commented on lacking a thorough discussion on point process models when points are marked.
In particular, MRS provided a profound overview of the matter, which we greatly appreciate.Indeed, the literature covers various doubly-stochastic models such as Cox point processes, where the intensity itself is driven by a random field, and Markov point process models, which formalize the intensity as a product over clique potentials, i.e. energy functions.It would be welcomed to provide an in-depth treatment of the current state-of-the-art of models for marked spatial point processes, including all proposed estimation methods.Given the thorough and compact discussion of the subject given by MRS, we here only point to some further models and recent extensions.
Notable early contributions for marked point process models include the balanced and linked Cox models of Diggle and Milne (1983) in which a bivariate point process X " tX 1 , X 2 u is driven by a nonnegative valued bivariate random field Z " tZ 1 , Z 2 u such that given the realizations λ 1 puq " Z 1 puq and λ 2 puq " Z 1 puq for all u P R 2 the components of X are independent inhomogeneous Poisson processes with intensity functions λ i p¨q, i " 1, 2. For any ν ą 0, the points of the two components X 1 , X 2 may then show clustering/repulsion tendencies with Z 1 " νZ 2 in the case of the linked Cox process and with Z 1 `Z2 " ν for the balanced Cox process model, respectively.Less restrictive extensions of the Cox model include the marked versions of the log-Gaussian Cox process (Møller et al., 1998) and the shot-noise Cox process (Brix, 1999).Although log-Gaussian Cox processes appear most commonly in the literature, we point to the work of Jalilian et al. (2015) who proposed a multivariate product-shot-noise Cox process to model a multispecies point pattern.Allowing also for temporal variation of spatial marked processes, various spatio-temporal models can be considered, including growth-interaction, spatial birth-and-death and Hawkes processes; these were also highlighted by CJK.Note also a combination of log-Gaussian Cox processes with Hawkes processes, i.e. the so-called Cox-Hawkes process, proposed by Miscouridou et al. (2022), which could also be extended to marked cases.Moreover, we also point to the Candy (Stoica et al., 2004(Stoica et al., , 2005) ) and Bisous (Tempel et al., 2016) models, which, instead of points model line segment and, thus, might be useful tools to derive models for (marked) spatial movement data.
Lastly, we appreciated the idea of MRS treating the mark sum as a local mark characteristic.Indeed, for a fixed observed point x, the normalized mark-sum measure adjusts the number of points in a disc bpx, rq by the sum of the marks in bpx, rq and can, thus, be interpreted as the contribution of marks in a distance r centred at x (Penttinen, 2007).

Applications of mark summary characteristics
We are grateful to DS for proposing the computation of different mark correlation functions for some given data and discussing their abilities to describe mark correlations.It is important to highlight that any such mark summary characteristic addresses a particular property of the mark distribution.To illustrate the application of different mark correlation functions, we revisit the three considered models I to III of Section 4.2.2 in our article.In a similar manner, for each model, we simulate 199 patterns for which we calculate Stoyan's mark correlation function κ L mm pr L q, Beisbart and Kerscher's mark correlation function κ Bei,L mm pr L q, mark variogram γ L mm pr L q, and Shimantani's I function I Shi,L mm pr L q.We used the notation κ for, e.g.Stoyan's mark correlation function, to be better distinguished from the cross/dot-type K-functions.Based on the 199 estimated mark correlation functions per each test function, we obtain 95% pointwise critical envelopes to compare the general behaviours of the four chosen test functions; one could instead use global envelopes (Myllymäki et al., 2017;Mrkvička et al., 2020) as pointed out by MRS.Concerning DS's comment on the differences between the mark variogram and (semi-)variogram, although both are constructed through half-squared distances, they are in general different except under the geostatistical marking model where marks are generated from an underlying random field (Schlather et al., 2004;Guan et al., 2007).In this model, also referred to as a random field model due to its construction, the marks and the locations are generally assumed to be independent.We note that the geostatistical marking model can be tested against deviations from the independence assumption between marks and points, i.e. non-geostatistical marking model, by e.g.Schlather's Eprq and V prq functions and the tests proposed in (Guan, 2005) Turning to some further numerical evaluations, for model I (left column of Figure 1), in which the point configuration inherits a trend from bottom-left to top-right of the network, for nearby points (small values of r L ), we can see low variation (based on mark variogram γ L mm pr L q) with some positive association (based on Stoyan's mark correlation function κ L mm pr L q and Beisbart and Kerscher's mark correlation function κ Bei,L mm pr L q) and high positive spatial autocorrelation (based on Shimanti's I Shi,L mm pr L q).With increasing distance r L , the marks become more heterogeneous, yielding a clear increase in spatial variation as it can be seen from the mark variogram γ L mm pr L q, less association indicated by Stoyan's mark correlation function κ L mm pr L q and Beisbart and Kerscher's mark correlation function κ Bei,L mm pr L q and, correspondingly, negative spatial autocorrelation revealed by Shimantani's I function I Shi,L mm pr L q.For a more thorough exploration of how these mark correlation functions differ in their effectiveness at describing mark correlations, we add that if pairs of points have similar mark values, the variation between the marks is small, leading to small values of the mark variogram γ L mm pr L q close to zero, while the association is high, yielding values above one for Stoyan's mark correlation function κ L mm pr L q and Beisbart and Kerscher's mark correlation function κ Bei,L mm pr L q.In cases where the marks for any pair of points, deviating from the mark mean, demonstrate substantial similarity (or dissimilarity) in value, positive (negative) autocorrelation is expected to be revealed by Shimanti's I Shi,L mm pr L q.While the focus here is limited to point patterns on linear networks, we emphasise that these interpretations hold for point patterns on R 2 as well.
Turning to the central column of Figure 1, which shows the results for model II, we again notice a distinct opposite behavior between the mark variogram γ L mm pr L q and the mark correlation functions κ L mm pr L q and κ Bei,L mm pr L q, especially at short distances.Recalling the construction of the marks in model II based on the shortest-path distance of each point to the dendrite's border, nearby points are expected to have similar marks.As the distance r L increases, the marks become more dissimilar, which yields a strong variation in value and, at the same time, low association.However, the marks at very long distances, say at two distinct points at different borders, are again more similar due to the simulation design.Here, we would

Fig. 1
Fig.195% critical envelops for the Stoyan's mark correlation function κ L mm pr L q, Mark variogram γ L mm pr L q, Shimanti's I Shi,L mm pr L q, and Beisbart and Kerscher's mark correlation function κ Bei,L mm pr L q, and their averages, based on 199 simulated patterns from Models I, II, and III, from left to right, respectively.
Barr and Schoenberg (2010)sed intensity estimators are Voronoi-based estimators, which are adapted to the local variability of the underlying point process and generally outperform kernel-based estimators; seeBarr and Schoenberg (2010);Moradi et al.
(2019); Mateu et al. (2020) for details.Nevertheless, according to our experience, kernel-based estimators with less variability give rise to more reliable estimates of the summary statistics; Cronie et al. (2020) and Moradi and Sharifi (2024) used kernel-based intensity estimators with Scott's rule-of-thumb when estimating J-functions.A comprehensive practical review of non-parametric intensity estimators based on real-data scenarios, with all examples being reproducible, is presented by