Functional marked point processes: a natural structure to unify spatio-temporal frameworks and to analyse dependent functional data

This paper treats functional marked point processes (FMPPs), which are defined as marked point processes where the marks are random elements in some (Polish) function space. Such marks may represent, for example, spatial paths or functions of time. To be able to consider, for example, multivariate FMPPs, we also attach an additional, Euclidean, mark to each point. We indicate how the FMPP framework quite naturally connects the point process framework with both the functional data analysis framework and the geostatistical framework. We further show that various existing stochastic models fit well into the FMPP framework. To be able to carry out nonparametric statistical analyses for FMPPs, we study characteristics such as product densities and Palm distributions, which are the building blocks for many summary statistics. We proceed to defining a new family of summary statistics, so-called weighted marked reduced moment measures, together with their nonparametric estimators, in order to study features of the functional marks. We further show how other summary statistics may be obtained as special cases of these summary statistics. We finally apply these tools to analyse population structures, such as demographic evolution and sex ratio over time, in Spanish provinces.


Introduction
Many types of functional data, such as financial time series, animal movements, growth functions for trees in a forest stand, the spatial extensions of outbreaks of a disease over time with respect to the outbreak centres, population growth functions of towns/cities in a country, and different functions describing spatial dependence (e.g. LISA functions; see Section 11 in Supplementary Materials and the references therein), are represented as collections { f 1 (t), . . . , f n (t)}, t ∈ T ⊂ [0, ∞), n ≥ 1, of functions/paths in some k-dimensional Euclidean space R k , k ≥ 1; note that the argument t need not represent time, it could, for example, represent spatial distance. The common approach to deal with such data within the field of functional data analysis (FDA) (Ramsay and Silverman 2005) is to assume that the functions f i , i = 1, . . . , n, belong to some suitable family of functions (usually an L 2 -space) and are realisations/sample paths of some collection of independent and identically distributed (iid) random functions/stochastic processes {F 1 (t), . . . , F n (t)}, t ∈ T , with sample paths belonging to the family of functions in question.
For many applications, however, the following two adequate questions may quite naturally arise: 1. Does it make sense to assume that the random elements F 1 , . . . , F n , which have generated the functional data set { f 1 , . . . , f n }, are in fact iid? 2. Is the study designed in such a way that the sample size n is known a priori, or is n in fact unknown before the data set is realised?
The first question can be framed within the context of multivariate (Gaussian) random fields/processes, and it has been addressed quite extensively in the literature; see, e.g. Banerjee et al. (2014), Gelfand and Banerjee (2010), Genton and Kleiber (2015). The second question, however, in particular in combination with the first question, has not really been explored to any degree, and it would be beneficial to have a foundation with a proper structure for such analyses. Functional data sets (believed to be) generated in accordance with the above remarks will be referred to as functional marked point patterns, and Fig. 1 provides illustrative examples of such data sets. The top panels show two functional marked point patterns based on the centres of the provinces on the Spanish mainland. To each point, which corresponds to a centre, we have associated the demographic evolution of the population on logarithmic scale (left) and the sex ratio (right), over the years 1998-2017. In the top right panel, for each of the 47 functions/provinces, the horizontal red dashed line corresponds to y = 1, which illustrates the case where we have the same size of genders in the province in question. The bottom panels show animal movement tracks. The lower left panel shows the movement tracks of two Mongolian wolves, starting from random initial monitoring locations (red squares); the data are taken from the Movebank website. The lower right panel shows the movement tracks of 15  (Hebblewhite and Merrill 2008), starting from some random initial monitoring locations.
Another setting where these questions also naturally arise is found in spatiotemporal geostatistics (Montero et al. 2015). Assume that each of the data-generating stochastic processes F i (t) = Z (x i , t), t ∈ T , i = 1, . . . , n, is associated with a spatial location x i ∈ W ⊂ R d and that Z (x, t), (x, t) ∈ W × T , is a (Gaussian) spatio-temporal random field. Here, the functions F 1 , . . . , F n are clearly not independent (ignoring pathological cases) and one may further ask whether it would not in fact make sense to assume that the sampling/monitoring locations x 1 , . . . , x n are actually randomly generated. In addition, does it make sense to assume that the total number of such locations was fixed a priori, or did these locations, for example, appear over times (in relation to each other), thus allowing us to treat them as a randomly evolving entity with a random total number of components N ≥ 1? For instance, all the weather stations monitoring precipitation in a given country/region have (most likely) arrived over time, in relation to each other, rather than being placed at their individual locations at the same time. For example, we do not know a priori how many stations will have appeared during the period 2010-2040 and where they will be located.
Taking these remarks into account, we argue that for many functional data sets { f 1 (t), . . . , f n (t)}, t ∈ T ⊂ [0, ∞), n ≥ 1, it would make sense to assume i) that n ≥ 1 is the realisation of some discrete non-negative random variable N and ii) that (conditional on N = n) the random functions F 1 , . . . , F n are possibly dependent. A natural way to tackle the statistical analysis under such non-standard assumptions is to assume that the functional data set is generated by a point process in some space F of functions f : T → R k . This would mean that we would model the functional data set (a functional marked point pattern) as the realisation of a set of random functions {F 1 (t), . . . , F N (t)}, t ∈ T , of random size N . Note that by construction, all components F i have the same marginal distributions. Under such a setup, a so-called binomial point process (Møller and Waagepetersen 2004;van Lieshout 2000) would yield the classical FDA setup mentioned above. Note that the idea of analysing point patterns (collections of points) with attached functions has already been noted in the literature (Comas 2009;Delicado et al. 2010).
It is often the case that these functions have some sort of spatial dependence. For example, two functions f i and f j , with starting points f i (0) and f j (0) which are spatially close to each other in R k , either gain or lose from each other's vicinity. Accordingly, it seems natural to generate F 1 , . . . , F N conditionally on some collection of random spatial locations X i and some further set of random variables L i associated with the random functions F i ; conditionally on the spatial locations, the L i 's would influence the random functions F i in a non-spatial sense. We argue that the natural setting to do this is through functional marked point processes (FMPPs). More precisely, we define a FMPP Ψ = {(X i , (L i in R d to which we assign marks {(L i , F i )} N i=1 ; note that by forcing all L i to take the same value, we may reduce the FMPP to the collection {(X i , F i )} N i=1 . We here take a full grip and provide a proper framework for FMPPs, where we in particular take into account that for the standard point process machinery to go through (in particular the use of regular conditional probability distributions), one has to assume that the mark space, and thereby the function space F, is a Polish space (Daley and Vere-Jones 2008). In particular, one may then provide a reference stochastic process X F , with sample paths in F, whose distribution ν F on F acts as a reference measure which one integrates with respect to (in a Radon-Nikodym sense). We further provide a plethora of examples from the literature which fit into the FMPP framework and discuss these in some detail. Examples include geostatistics (Cressie and Kornak 2003) with random sampling locations, point processes marked with 'spatiotemporal random closed sets', e.g. spatio-temporal Boolean models (Sebastian et al. 2006), constructed functional marks, e.g. so-called LISA functions (Mateu et al. 2007), and the Renshaw-Särkkä growth-interaction model (Cronie and Särkkä 2011;Cronie et al. 2013;Renshaw and Särkkä 2001;Särkkä and Renshaw 2006). To be able to carry out statistical analyses in the context of FMPPs, various moment characteristics, such as product densities, are required and we here cover such characteristics.
A key observation here is that we, in contrast to previous works, completely move away from the (arguably unrealistic) assumption of stationarity. We then proceed to discussing various general marking structures, such as the marks having a common marginal distribution and the marks being (conditionally) independent. To study inter-actions between functional marks, we further define new types of summary statistics (of arbitrary order), which we refer to as weighted marked reduced moment measures and mark correlation functionals. These summary statistics are essentially mark-test function-weighted summary statistics which have been restricted to pre-specified mark groupings. We study them in different contexts and show how they under different assumptions reduce to different existing summary statistics. In addition, we provide nonparametric estimators for all the summary statistics and show their unbiasedness. We also show how these summary statistic estimators can be employed to carry out functional data analysis when the functional data-generating elements are spatially dependent (according to a FMPP). We finally apply our summary statistic estimators to the data illustrated in the first two panels of Fig. 1, in order to analyse population structures such as demographic evolution and sex ratio of human population over time in Spanish provinces.

Functional marked point processes
Throughout, let X be a subset of d-dimensional Euclidean space R d , d ≥ 1, which is either compact or given by all of R d . Denote by · = · d the d-dimensional Euclidean norm, by B(X ) the Borel sets of X ⊂ R d and by | · | = | · | d the Lebesgue measure on X ; dx denotes integration w.r.t. | · |. It will be clear from the context whether | · | is used for the Lebesgue measure or the absolute value. We denote by B(·) n the n-fold product of an arbitrary Borel σ -algebra B(·) with itself. Moreover, we denote by μ 1 ⊗ μ 2 the product measure generated by measures μ 1 and μ 2 and by μ n 1 the n-fold product of μ 1 with itself. Recall further that a topological space is called Polish if there is a metric/distance which generates the underlying topology and turns the space into a complete and separable metric space. A closed ball of radius r ≥ 0, centred in x ∈ S, where the space S is equipped with a metric d S (·, ·), will be denoted by Consider a point process Illian et al. 2008;Chiu et al. 2013). Throughout the paper, we refer to Ψ G as a ground/unmarked point process. To each point of Ψ G , we may attach a further random element, a so-called mark, in order to construct a marked point process Ψ . In this paper, a mark is given by a k-dimensional random function/stochastic process , a functional mark, possibly together with some further random variable L i , which we refer to as an auxiliary/latent mark. The resulting marked point process , N ∈ N 0 , will be referred to as a functional marked point process (FMPP). The main purpose of including auxiliary marks is to control the supports of the functional marks, on the one hand, and on the other hand, they may serve as indicators/labels for different types of points of the point process, in a classical multi-type point process sense.

Construction of functional marked point processes
To formally define a FMPP, we first need to specify the underlying mark space M. The general theory for marked point processes Vere-Jones 2003, 2008;van Lieshout 2000) allows us to consider any Polish space M as mark space. Here, we let the mark space be the Polish product space M = A × F given by the product of -a Borel subset A L i of some Euclidean space R k A , k A ≥ 1, referred to as the auxiliary/latent mark space, Note that due to the Polish structures of these spaces, the Borel sets of M are given by the product σ -algebra Explicit examples of auxiliary and functional mark spaces are given in Supplementary Materials, Section 13. Let Y = X × M and let N l f be the collection of all point patterns, i.e. locally finite subsets ψ = {(x 1 , l 1 , f 1 ), . . . , (x n , l n , f n )} ⊂ Y, n ≥ 0; n = 0 corresponds to ψ = ∅. Note that local finiteness means that the cardinality ψ(A) = |ψ ∩ A| is finite for any bounded Borel set A ∈ B(Y). Denote the corresponding counting measure σ -algebra on N l f by N l f (see Daley and Vere-Jones (2008, Chapter 9)); N l f is the σ -algebra generated by the mappings ψ → ψ(A) ∈ N 0 , ψ ∈ N l f , A ∈ B(Y). By construction, since point patterns here are defined as subsets, all ψ ∈ N l f are simple, i.e. ψ ({(x, l, f ) If a point process Ψ on Y is such that the ground/unmarked point process Ψ G = {x : (x, l, f ) ∈ Ψ } is a well-defined point process in X , we call Ψ a (simple) functional marked point process (FMPP) and when X ⊂ R d−1 × R, d ≥ 2, and Ψ G is a spatiotemporal point process in X , we call Ψ a spatio-temporal FMPP.
Note that Ψ may be treated either as a locally finite random subset ). In the spatio-temporal case, it may be convenient to write to emphasise that each ground process point has a spatial component, X i ∈ R d−1 , as well as a temporal component T i ∈ R.
Remark 1 Since all of the underlying spaces are Polish, we may choose a metric d(·, ·) on Y which turns Y into a complete and separable metric space, with metric topology given by the underlying Polish topology. For example, we may con- and the metrics d A (·, ·) and d F (·, ·) make A and F complete and separable metrics spaces (van Lieshout 2000); when A = R k A or A is a compact subset of R k A , we may use d A (l 1 , l 2 ) = l 1 − l 2 k A . In the spatio-temporal case, it may be natural to consider d X (( (Cronie and van Lieshout 2015), which is topologically equivalent to d X (( We will write P(R) = P Ψ (R) = P({ω ∈ Ω : Ψ (ω) ∈ R}), R ∈ N l f , for the distribution of Ψ , i.e. the probability measure that Ψ induces on (N l f , N l f ). When X = R d , for any ψ ∈ N l f and any z ∈ R d , we will write ψ + z to denote (x,l, f )∈ψ δ (x+z,l, f ) (or {(x + z, l, m) : (x, l, m) ∈ ψ}), i.e. a shift of ψ in the ground space by the vector z.
has the same distribution as Ψ for any rotation r .

Components of FMPPs
We emphasise that any collection of elements {(X 1 , L 1 , F 1 ), . . . , (X n , L n , F n )} ⊂ Ψ , n ≥ 1, consists of the combination of: -a collection of random spatial locations X 1 , . . . , X n ∈ X , -a collection L 1 , . . . , L n of random variables taking values in A, -an n-dimensional random function/stochastic process {F 1 (t), . . . , F n (t)} t∈T ∈ (R k ) n , with realisations in F n ; formally, this is an unordered collection of n stochastic processes in R k with sample paths in In particular, is a marked point process of the usual kind, with locations in R d and marks in A ⊂ R k A , i.e. each auxiliary mark L i = (L 1i , . . . , L k A i ) is given by a k A -dimensional random vector. Depending on how A and the distributions of the L i 's are specified, we are able to consider an array of different settings. For example, if A = {1, . . . , k d }, k d ≥ 2, each random variable L i has a discrete distribution on A. Since Ψ X ×A hereby becomes a multi-type/multivariate point process in R d , one may call such FMPPs multi-type/multivariate (Daley and Vere-Jones 2003;van Lieshout 2000;. In Supplementary Materials, Section 13, we look closer at specific choices for A. It is often convenient to write A = A d to emphasise when we have a discrete auxiliary mark space, such as A d = {1, . . . , k d }, and A = A c to emphasise when have a continuous space ((closure) of an open set), such as Within the current definition of FMPPs, we may also consider the scenario where the auxiliary marks play no role and thereby may be ignored. This may be obtained by, for example, setting A = {c} for some constant c ∈ R, so that all auxiliary marks attain the value c, or equivalently, setting L i = c a.s. for any i = 1, . . . , N , assuming that c ∈ A. It is worth remarking that the inclusion of the auxiliary marks allows us to impose an order on the points in the sense that we would consider a functional marked sequential point process; van Lieshout (2006b) connects sequential point processes with marked spatio-temporal point processes with mark space (0, 1).
Note that when we want to consider functional marks with realisations given by functions f (t) = ( f 1 (t), . . . , f k (t)) ∈ R k , t ∈ T , which describe spatial paths, we let k ≥ 2. Often the spatial locations X i describe the initial location of such a path, and it is then natural to assume that d = k ≥ 2 and f (t) ∈ X a.s. for any t ∈ T . An application here would be that the marks describe movements of animals, living within some spatial domain X ; recall Fig. 1.
Recall that each functional mark k ≥ 1, and U are Polish function spaces (products of Polish spaces are Polish). By conditioning Ψ on Ψ X ×A , which includes conditioning on N , we obtain the random functional which may be regarded as a stochastic process with dimension N and with the same marginal distributions for all of its components. Due to the inherent temporally evolving nature of the functional marks, one may further consider some filtration Σ T , and thus obtain a filtered probability space (Ω, Σ, Σ T , P), such that all F i = {F i (t)} t∈T , i = 1, . . . , N , are adapted to Σ T (see Supplementary Materials, Section 13.2, for more details).
Remark 2 Formally, Ψ |Ψ X ×A may be obtained as the point process generated by the family of regular conditional probabilities obtained by disintegrating P Ψ with respect to the distribution of Ψ X ×A on its point pattern space (Daley and Vere-Jones 2003, Appendix A1.5.).
We impose the Polish assumption on U in order to carry out the usual marked point process analysis Vere-Jones 2003, 2008); note that U being Polish implies that F is Polish and B(F) = B(U k ) = B(U) k . However, choosing a Polish function space U is a delicate matter; note that Comas et al. (2011) did not address this issue. In Supplementary Materials, Section 13.2, we consider functional mark spaces in more detail and there we cover the two most natural choices for U, namely Skorohod spaces and L p -spaces (Billingsley 1999;Ethier and Kurtz 1986;Jacod and Shiryaev 1987;Silvestrov 2004). Note that these two classes of functions are not mutually exclusive.
Noting that, in general, the support supp( f ) = {t ∈ T : f (t) = 0} ⊂ T of a function f ∈ F need not be given by all of T , in some contexts it may be natural to let Ψ X ×A govern the supports supp( To illustrate this idea, consider the case where d = 1 and X = T = [0, ∞), so that is a temporal point process. In addition, assume that k A = 1 and that each auxiliary mark L i is some non-negative random variable, such as an exponentially distributed one, which does not depend on Ψ G . Let us think of T i and L i as a point's birth time and lifetime, respectively. Defining the corresponding death time as D i = T i + L i , we may then, for example, let s., where 0 is the k-dimensional vector of 0s. Note further that there in addition to this may exist t ∈ [T i , D i ) such that F i (t)|Ψ X ×A = 0 in some way (e.g. absorption), which is something governed by the distribution of {F i (t)|Ψ X ×A } t∈T on F. An explicit construction to obtain this when k = 1 would, , t ∈ T , for some stochastic process Y (t), t ∈ [0, ∞), which starts in 0.

Reference measures and reference stochastic processes
For the purpose of integration, among other things, we need a reference measure on (Y, B(Y)). We let it be given by the product measure where then we may choose ν A to be some probability measure. If A = A d × A c is given by a product of a discrete and a continuous space, then ν A can be taken to be a product measure Turning to the functional mark space (F, B(F)), consider some suitable reference random function/stochastic process where each X F (ω) ∈ U k = F is commonly referred to as a sample path/realisation of X F . This random element induces a probability measure on F, which we will employ as our reference measure on F. Note that the joint distribution on (F n , B(F n )) of n independent copies of X F is given by ν n F , the nfold product measure of ν F with itself. Moreover, if there is a suitable measure ν U on U, we let ν F = ν k U . Specifically, ν F , or X F , should be chosen such that suitable absolute continuity results can be applied. More specifically, the distribution P Y on (F n , B(F n )), n ≥ 1, of some stochastic process Y = {Y (t)} t∈T ∈ F n = (U k ) n of interest should have some (functional) density/Radon-Nikodym derivative f Y with respect to ν n Kolmogorov's consistency theorem allows us to specify the (abstract) distribution P Y of Y through its finite-dimensional distributions (on (R k ) n ). In many situations, a natural choice for ν F is a Gaussian measure on B(F), i.e. one corresponding to some Gaussian process X F , or the distribution corresponding to a Markov process X F : (Ω, Σ, P) → (F, B(F)). An often natural choice, which satisfies both of these properties, is the k-dimensional standard Brownian motion/Wiener process which is generated by the corresponding Wiener measure W F on B(F). In certain cases, one speaks of an abstract Wiener space or Cameron-Martin space. Here, issues related to absolute continuity have been extensively studied, and explicit constructions of Radon-Nikodym derivatives involve, for example, the Cameron-Martin-Girsanov (change of measure) theorem. For discussions, overviews, and detailed accounts, see, e.g. Kallenberg (2006), Rajput (1972), Maniglia and Rhandi (2004), Skorohod (1967) and the references therein.
Note that integration of a measurable function h with respect to ν satisfies whenever the auxiliary marks are (partially) discrete, the integral over A is (partially) replaced by a sum.

FMPP examples
The class of FMPPs provides a framework to give structure to a series of existing models, and it allows for the construction of new important models and modelling frameworks, which have uses in different applications. In Supplementary Materials, Section 11, we provide an array of different models which fit into the FMPP structure. More specifically, we look closer at the following examples: -By letting the functional marks be random constant functions, we obtain (equivalents of) point processes with real valued marks. -Deterministic functional marks, obtained by letting A particular instance of this, which we also look at extensions for, is the growth interaction process of Renshaw and Särkkä (2001), which has been extensively employed for dynamical modelling of forest stands; here, the functional marks are governed by a set of differential equations.
-For a (spatio-temporal) FMPP Ψ , where the spatial locations X i are located in some subset of R 2 and k = 1, i.e. F = U, so that F i (t) = F i1 (t) ∈ R, t ∈ T , we look closer at the temporally evolving random closed set This provides a natural geometric interpretation for many FMPP settings (see Fig. 4 in Supplementary Materials, Section 16).
k ≥ 1, we define spatio-temporal geostatistical marking/sampling of a random field at random locations by letting Here, we, for example, look closer at spatio-temporal geostatistics under such a random monitoring location setting. -Constructed functional marks are constructed to reflect geometries of point configurations in neighbourhoods of individual points. A typical example is given by so-called LISA functions (local estimators), which we here formally define as , for some function S. -We discuss spatio-temporal intensity-dependent marking which we define to occur if, conditionally on Ψ G and the auxiliary marks, the functional marks is a spatio-temporal zero mean Gaussian noise process.
In Supplementary Materials, Section 12, we additionally provide a few (further) examples of applications, and in Supplementary Materials, Section 16, we provide examples of classical point process models which are functional marked.

Moment characteristics for FMPPs
Besides illustrating the connections above, the aim of this paper is to consider different statistical approaches which allow us to analyse point pattern data with functional marks. For a wide range of summary statistics, the core elements are intensity functions and higher-order product density functions. We next consider product densities and intensity reweighted product densities for FMPPs. In Supplementary Materials, Section 13, we look closer at what these entities look like under various auxiliary and functional mark space choices.

Product densities and intensity functionals
Let Ψ be a FMPP with ground process Ψ G . Given some n ≥ 1 and some measurable functional h : Here, = denotes summation over distinct n-tuples. We first note that the nth-order factorial moment measure α (n) (A 1 × · · · × A n ) of Ψ is retrieved by letting h be the indicator function for the set Assume next that the nth-order (functional) product density ρ (n) , i.e. the Radon-Nikodym derivative of α (n) with respect to the n-fold product of the reference measure ν in (1) with itself, exists. We have that α (n) and ρ (n) satisfy the following Campbell formula (Chiu et al. 2013): (5) is interpreted as the probability of having ground process points in the infinitesimal neighbourhoods dx 1 , . . . , dx n ⊂ X of x 1 , . . . , x n , with associated marks belonging to the infinitesimal neighbourhoods d( Turning to the ground process Ψ G , through α (n) we may define the nth-order ground factorial moment measure α (n) G with respect to the n-fold product |·| n of the Lebesgue measure |·| with itself, which is called the nth-order ground product density. Note that by letting the function h in (5) be a function on X only, we obtain a Campbell formula for the ground process Ψ G . Moreover, by the existence of ρ (n) G and ρ (n) , it follows that (Heinrich 2013) where are densities of the families as the density of the conditional joint probability distribution of n auxiliary marks in A, given that Ψ indeed has n points at the locations x 1 , . . . , is interpreted as the density of the conditional joint probability distribution of n functional marks in F, given that Ψ G has points at the n locations x 1 , . . . , x n ∈ X with attached auxiliary marks l 1 , . . . , l n ∈ A. Recalling Sects. 2.2 and 2.3, we see that P F (x 1 ,l 1 ),...,(x n ,l n ) (·) represents the probability distribution on (F n , B(F n )) of n components of Ψ |Ψ X ×A = {F 1 |Ψ X ×A , . . . , F N |Ψ X ×A }, which may be seen as an n-dimensional random function/stochastic process {F 1 (t)|Ψ X ×A , . . . , F n (t)|Ψ X ×A } t∈T ⊂ F. This distribution is absolutely continuous with respect to the reference measure ν n F , i.e. the distribution of an n-dimensional version of the reference process X F , with density given by (8).
, is a functional; here, we use the term 'functional' for any mapping which takes a function as one of its arguments. The two regular probability distribution families (9) and (10) constitute the so-called n-point mark distributions (Chiu et al. 2013): The intensity measure is given by and we refer to is the intensity of the ground process, Ψ G .

Correlation functionals
Pair correlation functions, which are not in fact correlations in the usual sense, are valuable tools for studying second-order dependence properties of point processes. These may be generalised to arbitrary orders n ≥ 2 to characterise n-point interactions between the points of a point process, and here in the FMPP context, we will refer to them as correlation functionals. Assuming that ρ and ρ (n) , n ≥ 1, exist, the nth-order correlation functional is defined as where and is the nth-order correlation function of the ground process, Ψ G . Note that γ F (x 1 ,l 1 ),...,(x n ,l n ) (·) represents the conditional joint density of n functional marks, given their associated locations and auxiliary marks, divided by the conditional marginal densities of these functional marks, given their corresponding associated locations and auxiliary marks. An analogous interpretation holds for the second term, but then regarding the auxiliary marks instead and conditioned only on the locations. The particular case , is referred to as the pair correlation functional (pcf) and we note that g is the pair correlation function of the ground process (Baddeley et al. 2000;Chiu et al. 2013). When n = 2, the first term on the righthand side in (12) may be expressed as γ A represents a conditional density on F of one functional mark, F 1 , given another functional mark, F 2 , as well as the associated locations and auxiliary marks.

FMPP model structures
We next look closer at a few structural distributional assumptions and model structures for FMPPs. In the context of the auxiliary marks, we have already highlighted some effects of imposing different independence assumptions on the marks. Here, we mainly focus on two assumptions which will play a role in the statistical analysis: common marginal mark distributions and (location-dependent) independent marking. In Supplementary Materials, Section 16, we further provide a few different functional marked classical point process models.

Common mark distributions
An assumption which may be realistic in a variety of different contexts is that the marks are not necessarily independent but they have the same marginal distributions. We next look closer at this setting and we note that the statements below should be understood in an almost everywhere (a.e.) setting.
Definition 2 Let Ψ be a FMPP with ground process Ψ G and consider the following scenarios, defined conditionally on Ψ G .
This is, for example, the case when Ψ is stationary (Schneider and Weil 2008, Thm 3.5.1.); P M (·) is then commonly referred to as the mark distribution. -Ψ has a common (marginal) functional mark distribution: each F i |Ψ X ×A ∈ Ψ |Ψ X ×A , i = 1, . . . , N , has the same marginal distribution on (F, B(F)), which neither depends on its spatial location nor its auxiliary mark. Here, Under the assumption of a common mark distribution, it may further be the case that the common mark distribution P M coincides with the reference measure ν M = ν A ⊗ν F (so ν A and ν F must be probability measures), which implies that For example, ν A may be a Bernoulli distribution with parameter p ∈ [0, 1] and A = A d = {0, 1}, and ν F a Wiener measure W F , whereby (marginally) L i is a Bernoulli random variable and F i is a Brownian motion, which are independent of each other. Under the weaker assumption that Ψ has a common functional mark distribution, recalling the reference process X F in (2), which has ν F as distribution, when additionally P F = ν F we here obtain that, marginally, each component F i |Ψ X ×A , i = 1, . . . , N , has the same distribution as X F . To provide an example for this setting, note, for example, that for the (stochastic) growth-interaction model, con- Remark 3 Note that when Ψ has a common functional mark distribution, we do not necessarily assume that there is a common (marginal) auxiliary mark distribution, i.e. that Ψ X ×A has a common mark distribution. Under such an assumption, all L i |Ψ G , i = 1, . . . , N , have the same marginal distributions, which do not depend on the spatial locations, whereby Hence, if there is a common auxiliary mark distribution as well as a common functional mark distribution, it follows that P M i.e. L i and F i are conditionally independent for any i = 1, . . . , N . This is a stronger assumption than the assumption of a common mark distribution and it holds, for example, when P M = ν M = ν A ⊗ ν F .

Location-dependent independent marking and random labelling
We next turn to two common notions of mark independence: location-dependent independent marking and random labelling.

Definition 3
We say that a FMPP Ψ is (location-dependent) independently marked if, conditional on its ground process Ψ G , all marks (L i , F i ), i = 1, . . . , N , are independent but not necessarily identically distributed (Daley and Vere-Jones 2003, Definition 6.4.III).
By further adding the assumption of a common marginal mark distribution to independent marking, so that the marks become independent and identically distributed as well as independent of the ground process Ψ G , we obtain the definition of random labelling.
Hereinafter, we will use the shorter term 'independent marking', thus leaving out the part 'location-dependent', in keeping with Daley and Vere-Jones (2003). Under independent marking, each mark (L i , F i ) may depend on its associated spatial location and it follows that for any D i × E i ∈ B(A × F), i = 1, . . . , n, and any n ≥ 1. Furthermore, under random labelling, expression (15) reduces to if the common mark distribution coincides with the reference measure ν M = ν A ⊗ ν F ; this additionally implies that the auxiliary and functional marks are (conditionally) independent of each other. Under independent marking it clearly follows that the correlation functionals satisfy g (n) Ψ ((x 1 , l 1 , f 1 ), . . . , (x n , l n , f n )) = g (n) G (x 1 , . . . , x n ), n ≥ 1.
Hence, if, for example, the pair correlation functional coincides with the pair correlation function of the ground process, then the auxiliary and functional marks are pairwise conditionally independent.
It is not always the case that one wants to have both the auxiliary and the functional marks being independent. We next turn to the case where the functional marks are independent.
Definition 4 If all the components of Ψ |Ψ X ×A = {F 1 |Ψ X ×A , . . . , F N |Ψ X ×A } are independent, we say that Ψ has (location-and auxiliary mark-dependent) independent functional marks.
When Ψ has both independent functional marks and a common marginal functional mark distribution, we say that Ψ has randomly labelled functional marks.
Further, given that Ψ has independent functional marks, if we additionally assume that the auxiliary marks are conditionally independent, so that P A , for any n ≥ 1, we retrieve the classical definition of independent marking for real valued marks (Daley and Vere-Jones 2003, Definition 6.4.III), and consequently that of random labelling by assuming that they are also identically distributed.
Remark 4 A weaker form of location-and auxiliary mark-dependent independent functional marking, conditional independent functional marking, may be obtained by assuming that for any n ≥ 1 and some family {P F (x 1 ,l 1 ),...,(x n ,l n ) (E) : (x 1 , l 1 ), . . . , (x n , l n ) ∈ X × A, E ∈ B(F)} of regular probability distributions. Note that here the distribution of a functional mark may depend on all the spatial locations and auxiliary marks.

Poisson processes
Poisson processes (Daley and Vere-Jones 2003;Chiu et al. 2013), the most well-known point process models, are the benchmark/reference models for representing lack of spatial interaction and constructing other, more sophisticated models. Given a positive locally finite measure μ on B(Y) = B(X × A × F), a functional marked Poisson process Ψ , with intensity measure μ, is simply a Poisson process on Y with the additional assumption that Ψ G is welldefined. When Ψ has a well-defined intensity functional ρ(·), i.e. when the intensity measure in (11) ((x 1 , l 1 , f 1 ), . . . , (x n , l n , f n )) ≡ 1 for any n ≥ 1. Note that, formally, not every (functional marked) Poisson process is actually a marked point process; we may not necessarily have that Ψ G is a well-defined point process in X (van Lieshout 2000, p. 8). That being said, we here clearly have an example of independent marking. When there is a common functional mark distribution, all of the functional marks are given by independent copies of the reference process X F in (2). In particular, if the reference measure ν F is given by a Wiener measure W F on F, then the functional marks are iid Brownian motions. Moreover, when Ψ has a common mark distribution, it becomes randomly labelled and ρ (n) ((x 1 , l 1 , f 1 ), . . . , (x n , l n , f n )) = ρ n G > 0 if the common mark distribution coincides with ν M .
When we condition on N = n, we obtain a Binomial point process, which is simply a random (iid) sample

Reference measure averaged reduced Palm distributions
In the statistical analysis, we will need to consider Palm conditioning with respect to a given mark set (D × E) ∈ B(A × F); we interpret this as conditioning on the null event that there is a point of Ψ G at a given location, under the assumption that the mark associated with this point belongs to (D × E). To be able to do so, we follow van Lieshout (2006a), Cronie and van Lieshout (2016) and define the ν M -averaged reduced Palm distribution with respect to (D × E) ∈ B(A × F).

Definition 5 Given a FMPP Ψ , its family
Since P !(x,l, f ) (·) is the distribution of the reduced Palm process Ψ !(x,l, f ) , heuristically, P !(x,l, f ) (·) is the conditional distribution of Ψ , given that Ψ has a point at (x, l, f ) which we neglect. Moreover, the probability measure P !x D×E (·) has expectation by Fubini's theorem.
In particular, for a Poisson process on X × A × F, by Slivnyak's theorem (Chiu et al. 2013), , which is independent of the choice of auxiliary reference measure ν A . When Ψ has a common mark distribution which coincides with the reference measure, i.e. P M x ∈ X , we obtain a non-stationary and reduced version of the Palm distribution of Ψ with respect to the mark set D × E found in Chiu et al. (2013, p. 135): .
This may now be interpreted as the conditional distribution of Ψ , given that it has a point with location x with a mark belonging to D × E. Note further that under stationarity we have that P !(x,l, f ) (·) ≡ P !(0,l, f ) (·) for any x ∈ X = R d so the reduced Palm distributions with respect to D × E all satisfy P !x D×E (·) ≡ P !0 D×E (·). In Supplementary Materials, Section 14, we mention n-point versions of the above.
To connect the above distributions to the reduced Palm distributions P !x G (·), x ∈ X , of the ground process, let h in the reduced Campbell-Mecke formula (16) depend only on the ground location and the FMPP: may be interpreted as an average Palm distribution of Ψ , given that it has a point at x with unspecified mark (Daley and Vere-Jones 2008, (13.1.13)). The measureP !x (·) is a distribution on the space (N l f , N l f ) of marked point patterns, but by projecting it onto the corresponding measurable space of unmarked point patterns, we obtain the reduced Palm distribution P !x G (·) of Ψ G at x ∈ X (Daley and Vere-Jones 2008, p. 279). For any non-negative and measurable function h on the product of the ground space and the space of all unmarked point patterns, denotes expectation under P !x G (·). Moreover, when Ψ has a common mark distribution which coincides with the reference measure, we obtain that P !x A×F (·) = P !x (·). Hence, under this assumption, the projection of P !x A×F (·) onto the space of unmarked point patterns is simply P !x G (·).

Marked intensity reweighted moment stationarity
To be able to treat the summary statistics considered in this paper, we first have to introduce the notion of kth-order marked intensity reweighted stationarity (k-MIRS) (cf. Cronie and van Lieshout 2016; Iftimi et al. 2019).
Note that, loosely speaking, this definition essentially states that after having scaled away the effects of the varying intensity, the dependence structure, which is reflected by the product densities, only depends on the distance between the points. Note further that we have implicitly assumed that the product densities up to order k exist. Further details about and examples of k-MIRS processes are mentioned in Supplementary Materials, Section 15.

Summary statistics
Having provided various moment characteristics (Sect. 4) and notions of intensity reweighted moment stationarity (Sect. 7) for FMPPs, we may now look closer at how these can be exploited to study dependence structures in FMPPs. Characterising dependence in marked point processes can, in general, be done in various different ways. There are, however, essentially two main approaches which are studied: 1. Spatial interaction between groups of points of Ψ G , based on different classifications of the marks. 2. Dependence between the marks, conditionally on the ground process.
The former approach may be carried out by means of marked second-order reduced moment measures/K -functions, marked inhomogeneous nearest neighbour distance distribution functions, marked inhomogeneous empty space functions and marked inhomogeneous J -functions, which are defined in Iftimi et al. (2019), Cronie and van Lieshout (2016), and van Lieshout (2006a). The last three of these are full-distribution summary statistics and require that the point process is k-MIRS for any order k ≥ 1, whereas the first two are second-order statistics which require SOMIRS. We here study the second approach, and to this end, we define some new summary statistics and, as we shall see, they generalise most existing finite-order (marked) inhomogeneous summary statistics.
Definition 7 Assuming that 2 ≤ n ≤ k, let Ψ be k-MIRS and consider some test function t = t n , by which we mean a measurable mapping t : , the corresponding t-weighted marked nth-order reduced moment measure is defined as . . , n − 1. We further refer to r 1 , . . . , r n−1 ≥ 0, as the t-weighted nth-order marked inhomogeneous K -function; The interpretation of (18) is essentially provided by Lemma 1. Having scaled away the individual intensity contributions of the points of Ψ , conditionally on Ψ having a point at an arbitrary location z ∈ R d with associated mark (L(z), F(z)) ∈ D × E, which is neglected, (18) provides the mean of the random variable t((L(z), F(z)), (L 1 , F 1 where the locations X 1 , . . . , X n−1 of the points associated to n − 1 other marks (L 1 , F 1 ), . . . , (L n−1 , F n−1 ) belong to the respective sets z + C i , i = 1, . . . , n − 1. (18). The current choice has been made to emphasise the connection with the summary statistics in Cronie and van Lieshout (2016) and Iftimi et al. (2019).

Remark 5 We could just as well have chosen to absorb the indicator function
In order to give a feeling for how the mark sets in (18) may be specified here in the FMPP context, consider a bivariate FMPP, i.e. A = {1, 2}, where k = 1, so that F i : T → R. Next, let n = 2 and let D = {1}, Here, we would thus restrict the t-weighted correlation provided by (18) to only be between points of different types and, moreover, to be between the two classes of functional marks which either exceed the threshold c or not (see Sect. 8.1 for examples of test functions). For instance, in the forestry context A would represent the two species under consideration while c would be the threshold diameter at breast height of the trees; if we would instead set D = D 1 = A, we would ignore the species and simply study the interaction between large and small trees, irrespective of the trees' species. Hence, we are able to study how large trees affect the survival of small trees, which is something of interest in ecology (Platt et al. 1988;. We emphasise that it should be checked that the chosen sets E i , i = 1, . . . , n − 1, are indeed measurable, given the chosen function space (F, B(F)).
We will see that (18) is closely related to the nth-order reduced moment measure of the ground process (cf. Møller and Waagepetersen (2004, Section 4.1.2)), the last two equalities follow from the Campbell formula, the imposed nth-order intensity reweighted stationarity of Ψ G (which follows from Ψ being k-MIRS) and the Campbell-Mecke formula. An n-point generalisation of the inhomogeneous Baddeley et al. (2000) to the nth-order intensity reweighted stationary setting is obtained by considering K denotes the closed origin-centred ball with radius r ≥ 0. Note further that stationarity implies that α (n) where, clearly, in this case K (n) inhom (r ), r ≥ 0, yields an n-point generalisation of the K -function of Ripley (1976). In addition, we will see in Lemma 1 that (18) is also related to the following kernel (recall (13)).

Definition 8 The (nth-order) intensity reweighted t-correlation measure
In other words, κ · t is a spatially dependent weighting of ν t (·) and we interpret it as the expectation of the random variable t((L 1 , F 1 ), . . . , (L n . . , n, having scaled away the individual mark density contributions. Note that since Ψ is simple, (19) vanishes whenever x i = x j for any i = j and, moreover, by the imposed nth-order marked intensity reweighted stationarity, we further have that κ To highlight the connections with Penttinen and Stoyan (1989), we refer to i.e. (19) with all mark sets set to A × F, as the (nth-order) intensity reweighted tcorrelation functional; note that it is interpreted as the expectation of the random variable t((L 1 , F 1 ), . . . , (L n , F n )), conditionally on X i = x i , i = 1, . . . , n, having scaled away the individual mark density contributions. Lemma 1, to which the proof is found in Supplementary Materials, Section 19, gives reduced Palm and ν M -averaged reduced Palm distribution representations of (18). It also expresses (18) through (19) and K G , and it tells us that (18) is independent of the choice W ∈ B(R d ). From a statistical point of view, the main importance of Lemma 1 is related to nonparametric estimation-instead of repeated sampling to estimate (18), we can simply estimate (18) by sampling over each point of the point pattern, which is an effect of the imposed k-MIRS.

Lemma 1 The t-weighted marked nth-order reduced moment measure in (18) satisfies
for almost every z ∈ R d , where (L(z), F(z)) denotes the mark associated with the reduced Palm conditioning under P !z D×E (·) Hence, (18) may be expressed as a spatial dependence scaling (reflected by K G ) of the spatially dependent mark-dependence function (19).
Looking closer at Lemma 1, we see that normalising (18) by K G can reveal features of the marking structure, conditionally on the locations.

Definition 9
The normalised t-weighted marked nth-order reduced moment measure is defined as where the normalisation of K G in the last term is a probability measure on C 1 × · · · × C n−1 .
In Supplementary Materials, Section 17, we look closer at how our summary statistics change when we assume the existence of a common mark distribution.
This reduces to γ A x 1 ,...,x n (l 1 , . . . , l n ) when Ψ has independent functional marks and we obtain the usual Poisson case when Ψ has independent marks. When Ψ is a FM ground Poisson process, and by additionally assuming independent marking,K Note that these observations may be used to statistically test independent (functional) marking and Poisson assumptions.

Choosing test functions: analysing dependent functional data
By choosing different test functions t(·), we may extract different features from the marks. In practice, in a statistical context, it is most likely that one will focus only on the case n = 2; note the connections with Comas et al. (2011). Note in particular that when n = 2, if we ignore the functional marks and set t((l 1 , f 1 ), (l 2 , f 2 )) = l 1 l 2 , (18) yields an intensity reweighted version of the classical mark correlation function for the auxiliary marks. If, instead, t((l 1 , f 1 ), (l 2 , f 2 )) = (l 1 − l 2 ) 2 /2, we obtain the classical mark variogram for the auxiliary marks (Illian et al. 2008). The question that remains is how we should choose sensible tests functions t(·) which include also the functional marks.
Starting with the simple case t(·) ≡ 1, we obtain ν t = ν n M and By additionally letting n = 2 in (18), we retrieve the marked second-order reduced moment measure K (D×E)(D 1 ×E 1 ) (C) of Iftimi et al. (2019), which measures intensity reweighted interactions between points with marks in D × E and points with marks in D 1 × E 1 , when their separation vectors belong to C ∈ B(R d ). We stress that this measure, and thereby also (18), is non-symmetric in the mark sets, i.e. (Iftimi et al. 2019). In particular, choosing C 1 to be the closed origin-centred ball B R d [0, r ] of radius r ≥ 0, we obtain the marked inhomogeneous K -function K (D×E)(D 1 ×E 1 ) inhom (r ) of Cronie and van Lieshout (2016), which measures pairwise intensity reweighted spatial dependence within distance r between points with marks in D×E and points with marks in D 1 ×E 1 . Moreover, setting ψ] or v ∈ [π + φ, π + ψ]} for φ ∈ [−π/2, π/2) and ψ ∈ (φ, φ + π ], we obtain a marked inhomogeneous directional version which may be used to study departures from isotropy, and setting C 1 = {(x, s) : x ≤ r and |s| ≤ t} when Ψ is a spatio-temporal FMPP, we obtain a spatio-temporal version K (Iftimi et al. 2019).
Hence, for an arbitrary n, setting t(·) ≡ 1 in (18) we would obtain a definition of a marked nth-order reduced moment measure, K (D×E) , which has an analogous interpretation; it measures intensity reweighted spatial interaction between an arbitrary point with mark in D × E and distinct (n − 1)-tuples of other points where, respectively, the separation vectors between these points and the D × E-marked point belong to C i , i = 1, . . . , n − 1, and these points have marks belonging to D i × E i , i = 1, . . . , n − 1. Moreover, it may be of particular interest to choose all C i , i = 1, . . . , n − 1, to be the same set C 1 . For example, (r ), of the marked inhomogeneous K -function of Cronie and van Lieshout (2016), which may be used to analyse intensity reweighted interactions between a point with mark in D × E and n − 1 of its r -close neighbours, which have marks belonging to the respective sets D i × E i , i = 1, . . . , n − 1.
It should be mentioned that when t(·) ≡ 1 under independent marking, t (·) ≡ 1, which may be used to statistically test for independent marking.
We next turn to test functions which include the functional marks and we here only consider the case n = 2. A natural starting point, we argue, is to consider metrics (distances) between the (functional) marks. There are various choices to be considered (see e.g. Deza and Deza (2009) and the references therein) and each may reflect different features of the functional marks' properties; although it may be natural to use the metric having generated the assumed Polish topology of the function space F, we may naturally consider different choices here. We here choose to consider the following metrics as test functions: L p -metrics as defined in (28) (see also Section 13.2), and the symmetrised Kullback-Leibler (KL) divergence, the rationale behind using the symmetrised KL divergence (as opposed to the usual KL divergence) is that the test function t(·, ·) is permutation invariant/symmetrical. A further choice is to consider angles, or rather inner products; t( f 1 , In the literature on functional clustering, a common measure of proximity between two functions is Ferraty and Vieu (2006) When, conditionally on Ψ X ×A , all the functional marks have the same meanF(t) = E[F i (t)|Ψ X ×A ], t ∈ T , which, for example, is the case when there is a common functional mark distribution, we may consider a functional mark counterpart of the test function for the classical variogram, where in practice,F(t) may be estimated by means of (1/n) n i=1 f i (t), i.e. the average functional mark at time t for the observed functional part of the point pattern. Note that for each of the above choices we may reduce the interval T to some smaller interval [a, b] ⊂ T . Moreover, we may consider combinations of them by summing them up.
When we want to consider test functions which include both functional and auxiliary marks, we may exploit metric preserving properties of certain operations (van Lieshout 2000, p. 8), such as summation and maximum, and apply these to the above mentioned test functions (metrics) for the functional marks and the metrics provided by Illian et al. (2008, p. 343) for auxiliary marks in order to define a test function for general purposes. When n = 2, one may, for example, consider the following two test functions: where d F (·, ·) is a metric on function space F as mentioned above. For general n, we will follow the same procedure.
Moving away from the classical stationary setting to the current (intensity reweighted) inhomogeneous setting makes the interpretation of the summary statistics less straightforward. This added complexity is motivated by a number of factors. Firstly, in practice it is often natural to assume that the (spatial) sampling does not occur homogeneously. Further, one may be interested in dealing with modelling, for example, spatially varying correlations among the marks, but in the stationary case this is not possible since the (higher-order) mark distributions need to be translation invariant. Whether we consider the general inhomogeneous case or the special case of stationarity, our summary statistics provide statistical insight into the link between two different dependence structures, one with respect to the mark space and one with respect to the ground space. When we use functional metrics as test functions, we obtain a rather natural interpretation of (18). If we condition on the ground process, then what we are left with is a quantity which only depends on the functional marks. Moreover, if we consider a Poisson process on Y (recall (22)) then we essentially obtain a mean distance between the random functions; note that we here have a random number of iid random functions. Recalling the above discussion about marked K -functions as special cases of (18), we see that in the general (non-Poisson) case we essentially weight this mean distance by the marked K -function, i.e. we penalise it based on the marked spatial interaction of Ψ . As an effect, when the interaction in the underlying point process is weak, the outcome of the summary statistic is mainly governed by the marking structure. It may further be noted that by letting the test function be the projection t( f 1 , f 2 ) = f 1 (s) f 2 (u) for some fixed u, s ∈ T , we would instead get a version of the mark correlation function and, consequently, we obtain weighted second moment-like arguments for the functional marks (negative values of the test function are handled by splitting quantities into positive and negative parts). Moreover, in principle, we could construct a test function such that our summary statistic is the expectation of a covariance function estimator. In conclusion, various notions of dependence/closeness of the functional marks can be obtained by considering different test functions.

Nonparametric statistical inference
We next turn to the nonparametric estimation of our summary statistics. Specifically, we here assume that we observe a FMPP Ψ within a bounded spatial domain W ∈ Theorem 1 provides a nonparametric estimator of the t-weighted marked nth-order reduced moment measure, and it provides a condition for edge corrections to render it unbiased. Its proof is found in Supplementary Materials, Section 19.

Theorem 1 Consider a k-MIRS FMPP Ψ and a test function t = t n
is an unbiased estimator of K

provided that the intensity function ρ(·) is known and that the edge correction function w(·) satisfies
Here, three relevant questions immediately arise: Which edge correction methods satisfy the condition in Theorem 1, and are there other (biased) edge correction methods which still work well in practice? How do we deal with the rather abstract reference measure ν M = ν A ⊗ν F in (24)? How should we deal with the unknown true intensity ρ(·) in (24)?
Regarding the edge correction function w(·), letting t(·) ≡ 1 as well as assuming that Ψ has a common mark distribution which coincides with the reference measure, we obtain the estimator of K G (C 1 × · · · × C n−1 ), based on Ψ G ∩ W , and by looking closer at the case n = 2 in the literature [see, e.g. Cronie and van Lieshout (2016), Gabriel (2014, Appendix 1) and Baddeley (1998)], we get guidance in identifying suitable edge corrections. We obtain that the following choices satisfy the condition of Theorem 1; the proof of Corollary 1 is given in Supplementary Materials, Section 19.

Corollary 1
The minus sampling edge correction where denotes Minkowski subtraction, and the translational edge correction both yield that the estimator in Theorem 1 is unbiased. Moreover, when the ground space is given by R d , d = 2, 3, and n = 2, also the isotropic or rotational edge correction yields an unbiased estimator (24); here denotes length in R 2 or surface area in R 3 and ∂ is used to denote the boundary of a set.
There are clearly other edge correction methods such as rigid motion correction which do not satisfy the condition in Theorem 1 but still work well in practice.
Turning to the second question, in analogy with Baddeley et al. (2000), Cronie and van Lieshout (2016), Iftimi et al. (2019), Zhao and Wang (2010), define the random measures Following a suggestion by Stoyan and Stoyan (2000), in (24) by the corresponding estimator to obtain a ratio-unbiased estimator which yields better estimates in practice. This approach is referred to as the Hamilton principle. Moreover, in the case of the minus sampling edge correction, the arguments above should be applied to |W Assuming k-MIRS for any order k ≥ 1 (see Supplementary Materials, Section 15, for details) as well as some notion of ergodicity, by following the steps of the proof of Iftimi et al. (2019, Theorem 2), one could also show that (24) is strongly consistent, i.e. that it a.s. converges to (18) when we apply an increasing domain asymptotic regime. This regime is obtained by replacing the window W in (24) by a convex averaging sequence W n , |W n | → ∞ (Daley and Vere-Jones 2008, Definition 12.2.I). The question of asymptotic normality of (24) is a more delicate matter, however, and one would likely need to consider some notion of mixing for the FMPP in question (cf. Biscio et al. 2018;Biscio and Waagepetersen 2019).
These observations directly connect to the third question, which is how we deal with the fact that the true intensity function is unknown in practice. The most common and natural approach is to replace ρ(·) in Theorem 1 by a plug-in estimator ρ(x, l, f ), (x, l, f ) ∈ W × A × F. This, however, connects back to the problem of specifying ν M because to estimate ρ(·) we need to know ν M -the intensity function is a Radon-Nikodym derivative with respect to the reference measure. A pragmatic and (we argue) not so restrictive approach is to assume that there is a common functional mark distribution which coincides with the functional mark reference measure ν F . By doing so, any intensity estimator is of the form ρ (x, l, f it does not depend on the functional mark values. In other words, we are in the land of estimating intensity functions for point processes with real valued marks or/and multivariate point processes. Hence, we may consider the estimator . Moreover, taking the Hamilton principle into account, we would here replace the reference measure related parts in (25) by the esti- . . , n − 1. This is indeed quite remarkable-we may estimate a statistic based on something as abstract as a measure on a Polish function space, as well as a Radon-Nikodym derivative with respect to it, without ever having to know or consider any of these entities. Now, it should be noted that the Hamilton principle reference measure estimators may be ignored for certain intensity estimators since these estimators already satisfy |W | = Ξ(W ; ρ G ) = |W | and ν M (D × E; ρ W ×A , ρ G ) = ν M (D × E) (Cronie and van Lieshout 2018;Moradi et al. 2019). Note finally that if we impose the stronger assumption that there is a common mark distribution P M (auxiliary and functional marks) which coincides with ν M , or if we do not consider any auxiliary marks, we simply replace ρ W ×A (·) above by ρ G (·).
In Supplementary Materials, Section 18, we briefly indicate how one could exploit the nonparametric estimators above to carry out minimum contrast-based parametric inference.

Simulation study
We next conduct a brief simulation experiment to illustrate how our summary statistics, i.e. the t-weighted marked nth-order reduced moment measures in (18), can be used to analyse FMPP data. The main idea is to show that the t-weighted nth-order marked inhomogeneous K -function can be used to test for random labelling, i.e. independence between functional marks and their associated spatial locations. To generate a point pattern with a random curve associated with each spatial location, we consider a spatiotemporal Gaussian random field which we sample spatially at the spatial coordinates generated by a spatial point process in W = [0, 1] 2 ; in practice, we have to sample the spatially referenced curves over a fine temporal grid t 1 < · · · < t k . The procedure is as follows: first we generate spatial coordinates ψ G = {x i } n i=1 according to the following two distinct scenarios: an inhomogeneous Poisson with intensity ρ G (x) = 50(x 2 1 +x 2 ), x = (x 1 , x 2 ) ∈ W , and a Thomas process with offspring dispersion parameter 0.075, parent intensity 25 and expected number of offspring for each parent being 4. The obtained point patterns are generated using the functions rpoispp() and rThomas() in the R package spatstat (Baddeley et al. 2016). To each spatial location x i of the generated point pattern ψ G , we assign a function f i (t) = z(x i , t), t ∈ T = [0, 10], which is obtained by sampling a realisation z(x, t), (x, t) ∈ W × T , of a stationary spatio-temporal Gaussian random field Z at the spatial point pattern location x i , using either a double exponential or Gneiting covariance function for Z ; the simulation is carried out using the RFsim() function of the R package CompRandFld (Padoan and Bevilacqua 2015). The double exponential covariance function is a separable model given by where θ = (τ 2 , α s , α t ), τ 2 is the variance of the spatio-temporal process, and α s and α t are positive spatial and temporal scale parameters, respectively. For this model, we set the parameters to τ 2 = 1, α s = 0.4 and α t = 0.5 in our simulations.
Separable models such as double exponential models are often chosen for convenience rather than for their ability to fit the data well and these models are rather limited, because of their lack of flexibility in modelling space-time dependencies. For such reasons, we would also like to consider a non-separable covariance model and we here choose a model from the Gneiting class of covariance functions (Gneiting 2002), which is given by where τ 2 , α t and α s are as in the double exponential model, and α and γ are the smoothness parameters which take values in (0, 2]. The non-separability parameter β takes values in [0, 1] and the model is separable if β = 0. In our simulation study, we set τ 2 = 1, α t = 1, α s = 0.4, α = 1, γ = 1 and β = 0.5. For each scenario, we use the test function (23) in the estimator in expression (25); note that we here assume that there is a common mark distribution and that there are no auxiliary marks present. Since we in practice have to sample each functional mark discretely in time (over a fine temporal grid), each observed function f i is represented by a collection f i (t 1 ), . . . , f i (t k ), i = 1, . . . , n. As a result, the distance mapping (23) for any two observed functions f 1 and f 2 is approximated bỹ where a = 0, b = 10 andf (t) = 1 n n i=1 f i (t). In all our examples, we set n = 50; note that we here assume an equidistant sampling scheme. We thus focus on pairwise interactions and we let C 1 be given by the balls B R 2 [0, r ], r ≥ 0, whereby we obtain a weighted K -function, where we use Ripley's isotropic edge correction (recall Corollary 1) to correct for edge effects. Moreover, we estimate the intensity function of the ground process nonparametrically and with local edge correction utilising the density.ppp() function of the R package spatstat (Baddeley et al. 2016). To select the bandwidth, we use the criterion of Cronie and van Lieshout (2018), i.e. the spatstat function bw.CvL().
In order to study whether there is random labelling, we create a large number of new realisations (here 499), each obtained by randomly assigning the functional marks to the spatial points x i ∈ ψ G , which are kept fixed. We then compute our summary  Figure 2 shows K t (r ) for the data and pointwise 0.05 level envelopes based on 499 permutations of the generated functional marks from a spatio-temporal Gaussian random field with a double exponential covariance function with spatial coordinates generated from the inhomogeneous Poisson process (top left) and the Thomas process (top right). The bottom panels are as in the top panels, but the Gneiting covariance function has been used instead. As one would expect, the functions go outside the envelopes and thus suggest that we are not dealing with a randomly labelled process. Hence, the subsequent analyses of the data should proceed accordingly, i.e. not assuming that the functional data in question have been generated according to an iid procedure. Looking closer at the deviations from the envelopes in Fig. 2, it becomes visibly clear that the behaviour of our summary statistics is influenced by both the spatial dependence structure of the ground process and the structure of the functional marking mechanism.

Data analysis: Spatial variation of population characteristics in Spain
Here, we numerically illustrate how our proposed setting and methods may be applied to real data. In particular, we will focus on the summary statistics and show their potential usefulness for extracting features in Spanish province population growth; see the discussion in Fig. 1. The boundary and centre coordinate data of the provinces of Spain are extracted as shapefiles from the R package raster (Hijmans 2019) and the statistical information about the population is taken from the web page of the Spanish Institute of Statistics (www.ine.es).
To better understand the structure and dynamics of populations, two key points are having information about i) the spatial distribution of and the magnitude variation in the demography and ii) the population growth rate. In anthropology and demography, demographical evolution and sex ratio are two important population characteristics which can change over time because of, for example, birth and death rates, economical situations or migration. However, it is natural to expect that these indices are much more similar in neighbouring regions/provinces than in distant regions/provinces. As highlighted in Sect. 1, one of the most important aspects of the analysis is to deduce whether the functional marks, i.e. the demographic evolution and sex ratio, are spatially dependent. Note that the emergence of the regions and their centres is something that occurred a long time ago, i.e. not in connection to the time frame of the functional marks considered. However, we believe that this does not reduce the interest in analysing the functional mark structures in the data.
Similar to the simulation study, for both the demographic evolution and sex ratio curves, we use the test function (23) in the estimator in expression (25); note that we here also assume that there is a common mark distribution and that there are no auxiliary marks present. In both cases, we observed the functions for 20 distinct years, starting from 1998. Hence, each such observed function f i can be represented as the collection f i (t 1 ), . . . , f i (t 20 ), i = 1, . . . , n. As a result, for any two observed functions f 1 and f 2 , we approximate the distance function (23) by (26), with k = 20, a = 1998 and b = 2017. Hence, as simulation study we focus on pairwise interactions and we let C 1 be given by the balls B R 2 [0, r ], r ≥ 0, whereby we obtain a weighted K -function, where we use the same procedure as in the simulation study for edge correction and estimation of the intensity of the ground process.
The analysis is illustrated in Fig. 3. The top left panel shows the spatial point pattern of the centres of 47 Spanish provinces. The other three panels show the resulting functional marked K -functions for the Spanish provinces functional marked point pattern (Fig. 1). The transformedK t (r ) for the data together with simulated pointwise 95% envelopes generated from 39 simulations of a homogeneous Poisson process, obtained by keeping the functional marks fixed, is shown in the top right panel; the (K t (r )) 1/3 for the demographic evolution in 47 provinces of Spain (solid line), average and simulated pointwise 95% envelopes under the homogeneous Poisson process for (K t (r )) 1/3 (dashed lines) (top right panel). Bottom left: as top right panel but average and simulated 95% envelopes from 39 random relabellings of the demographic evolution data (dashed lines). Bottom right: as left but for the sex ratio data. In the bottom panels the curves are shown only for r ≥ 48.27 km since for the smaller distances the estimated functional mark K -function vanishes obtained intensity estimate was quite flat, so we proceeded by assuming homogeneity. Such envelopes are obtained for each value of r by calculating the smallest and largest simulated values of (K t (r )) 1/3 ; see (Diggle 2013). This suggests that the functional marked Poisson process model does not fit the functional marked data set in the first panel of Fig. 1 well. Recalling the theoretical form of our summary statistics under Poisson assumptions, we see that the current Poisson assumption puts particular emphasis on the functional mark structures. However, some regular model intuitively makes the most sense in terms of describing the location data and as a follow-up one could fit a regular model, e.g. a Strauss model, to the ground point pattern and repeat the analysis above. Note that the transformation (K t (r )) 1/3 is just for visualisation pur-poses and plays no actual role-the reason for employing a cubic root transformation instead of the common square root for the spatial point processes is the potentiality of having negative values forK t (r ). The bottom panels show the transformed version of K t (r ) for the data and the pointwise 0.05 level envelopes based on 39 permutations of the demographic evolution on logarithmic scale (left) and sex ratio (right), holding the corresponding locations fixed. For r < 48.27 km,K t (r ) = 0 and is thus not depicted in the last two panels. These functions suggest that there is no spatial dependence between the functional marks, which points to that the way the population size and the sex ratio have evolved from 1998 to now in different provinces are spatially independent. Hence, the subsequent analyses of the data may proceed by assuming that the functional data here have been generated according to an iid procedure. Consequently, we may proceed with a classical FDA approach for the functions and a separate spatial point pattern analysis for the centres.

Discussion
In principle, the current definition of FMPPs may also accommodate situations where we want to consider locations X i ∈ S and functional marks F i (t) ∈ S, t ∈ T ⊂ [0, ∞), which live on some (Polish) space S other than some Euclidean space; for example, S could be a linear network (Baddeley et al. 2016;Dejby 2017) or a sphere (Møller and Rubak 2016). For instance, in the linear network case, each functional mark would describe the movement along S of the ith point/event/individual, whereby we would have a setup for modelling, for example, cars driving on a road network during a given time period. One could also extend the current setting to having T be an arbitrary (connected) subset of R d , for some arbitrary d ≥ 1, so that when d ≥ 2 the variable t in each F i (t) represents a 'spatial' location and F i : T → R k is a k-variate random field/process. Moreover, this would allow us to let T be any suitable interval in R, not necessarily a subset of [0, ∞), e.g. T = R.
We have proposed a general framework to analyse dependent functional data, with an emphasis on the mathematical and statistical aspects of this framework. A wealth of particular cases and models can be treated using our approach, and thus, a plethora of real problems can be analysed using this new context. Although only one specific data analytic example have been illustrated here, we believe that we have clearly indicated that many different types of data can be analysed using our framework.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.