1 Introduction

In this paper, we present and evaluate a method that supports both time-based granulation and the subsequent phase of reasoning and decision making on time-granulated information. The method builds on previous work (Loia et al. 2018) and integrates those results with three-way formal concept analysis. In Loia et al. (2018), traditional formal concept analysis (FCA) was used to perform time-based granulation aimed at discovering periodicity in data. The method presented in Loia et al. (2018) is shown in Fig. 1 and consists of the following phases: (a) a human operator establishes the time-related parameters (time unit, segment, periodic time slots, interval of interest) used to analyze a dataset; (b) this information is used to pre-process the dataset in order to obtain a formal context; and (c) formal concepts and lattices are used to perform granulation and produce a granulated view of the data. The resulting granules are evaluated using quality measures (coverage and specificity), and the decision makers decide whether to proceed or to go back and adjust the considered view by modifying some time-related parameters.

Fig. 1 FCA for time granulation (from Loia et al. (2018))

As an extension of Loia et al. (2018), we improve the observation and interpretation phase, as described in Sect. 4, by using an extended version of FCA, namely three-way FCA (Qi et al. 2014), which provides a model for making three-way decisions. In particular, we define an approach to support the human operator in reasoning on spatio-temporal events. The lattice structures resulting from the application of three-way FCA allow a tri-partitioning of both the extensional and the intensional parts of the formal concepts included in the universe of concepts learned by FCA. Such tri-partitioning supports the decision-making process by considering inclusion (e.g., an object or event possesses all the attributes), exclusion (e.g., an object or event does not possess any attribute) and a third option, deferment, which can help humans make better decisions by isolating a partition for which further analysis is needed, possibly by gathering and analysing additional information.

The overall workflow of the method proposed in this paper is shown in Fig. 2. The method starts with a pre-processing of the event dataset. Specifically, a decision maker defines a time slot and transforms time slots into time-related attributes. Next, s/he uses the approach for time-based granulation with FCA to create a formal context whose concepts include timed information granules (TIGs), i.e., formal concepts that include time-related attributes. TIGs are evaluated with granular measures such as specificity and coverage. The TIGs that are meaningful and semantically correct are then used to generate a new formal context that relates spatial information (such as places) with time information (such as time-related attributes). Places and time attributes are in relation if and only if there is a co-occurrence of events. This last formal context is used to perform spatio-temporal reasoning with three-way FCA.

Fig. 2 Overall workflow of the proposed method

The contribution of the paper is twofold.

First, we define an original method to granulate spatio-temporal information, such as events, and to reason on the created granules with a three-way approach. The adoption of granular computing (GrC) for the analysis of temporal data has been studied in the context of time series analysis and forecasting (Gupta and Kumar 2019) as well as event prediction, such as traffic flows (Chen et al. 2019). In temporal data analysis, GrC is used to create information granules starting from both the selection of a given time unit and the construction of time slots in a time interval of interest. Once created, granules can be used to make decisions; indeed, reasoning and taking decisions on the basis of granulated information is the most difficult phase for analysts. Our method is based on two phases: (i) using FCA as a guide for the time granulation process (to this purpose we also revisit traditional measures for granulation, such as coverage and specificity, so that they are applicable to information granules built on FCA formal concepts), and (ii) using three-way FCA to allow a deeper comprehension of the events and to support the decision maker in understanding when it is preferable to defer a decision on the actions to be taken.

Second, we present an application of three-way FCA. Several researchers are investigating FCA and its integration with rough sets, three-way decisions and granular computing (Yao 2020; Zhi et al. 2019; Wan et al. 2020). To the best of our knowledge, however, there are no applications of those approaches and methods to spatio-temporal reasoning on events.

The main motivation behind this work is the need for explainable results in the domain of decision making. Still today, most decision-making tools based on machine learning techniques provide their results as black boxes that are difficult for human operators to interpret. The basic idea behind our work is to combine formal representations of knowledge, such as those of FCA, with a process of granulation to obtain results that can be represented in the form of conceptual structures. These structures can be analysed at different levels of abstraction. Furthermore, because we employ a three-way version of FCA, we also gain the reduction of cognitive effort offered by three-way decision models.

The paper is organized as follows. Section 2 reports basic information on the background of our method. Section 3 presents the way we use FCA to create time based information granules, and Sect. 4 describes how we support spatial and temporal reasoning with time based information granules and three-way FCA, with an illustrative example in Sect. 4.1. Section 5 reports and discusses the evaluation on real data. Section 6 reports a comparison of our method with other approaches based on concept approximation for spatio-temporal reasoning. Section 7 draws conclusions.

2 Background

The section reports background information on the granulation process and on the principle of justified granularity (Pedrycz and Homenda 2013), on three-way decisions (Yao 2016a), on formal concept analysis (FCA) (Wille 2009) and on FCA extensions to support three-way reasoning (Qi et al. 2014).

2.1 The principle of justified granularity

Granular computing (GrC) (Pedrycz 2001) is an information processing paradigm focused on representing and processing basic chunks of information, namely, information granules. It originates from the intuition of Zadeh, who defined a granule as a clump of objects drawn together by indistinguishability, similarity, proximity, and function (Zadeh 2001).

The term granulation refers to the creation of information granules. Forming granules in a correct and appropriate way is an issue that has been investigated by several authors. Pedrycz and Homenda (2013) have proposed the principle of “justifiable” granularity as a way to design information granules which are justified on the basis of experimental evidence and specific enough to be semantically meaningful. This principle is based on a trade-off between two measures that are independent of the specific application: coverage and specificity. A correct evaluation of these two measures depends on the nature of the set created (fuzzy or crisp). Generally speaking, coverage (Tsumoto 2002) refers to the ability of a granule to cover data, whereas specificity deals with the abstraction level of the granule prototype. Coverage and specificity are, in a sense, conflicting measures: increasing one typically decreases the other.

A general formulation of coverage and specificity is defined, among other works, in Gacek and Pedrycz (2014). Given some numeric data \(\mathbf{Z } = \lbrace z_{1}, ..., z_{n} \rbrace\), the coverage of an information granule \(\varOmega\) can be defined as an increasing function of the cardinality, e.g., \(f1(card \lbrace z_{k} | z_{k} \in \varOmega \rbrace )\), where f1 is an increasing function. The specificity of \(\varOmega\) can be defined as function of the length of the interval. If \(\varOmega = [a,b]\) any continuous nonincreasing function of the length of this interval, say \(f2(m(\varOmega ))\) where \(m(\varOmega ) = |b - a|\), is an indicator of specificity. The shorter the interval (i.e., the higher the value of \(f2(m(\varOmega ))\)), the better the satisfaction of the specificity requirement. An example of f2 is \(f2 (u) = \text {exp}(-\alpha u)\), where \(\alpha\) is a parameter delivering flexibility when optimizing the information granule \(\varOmega\). A coverage-specificity plot can be used to depict the relationship, which can be also parametrized, and to evaluate the area under the curve to get a global measure of quality, namely Q.
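The two measures can be illustrated with a minimal Python sketch for an interval granule \(\varOmega = [a,b]\). It assumes that f1 is the fraction of data points falling in the interval (one possible increasing function of the cardinality) and uses the exponential f2 mentioned above; the function names and the sample data are hypothetical and serve only as an illustration.

```python
import math


def coverage(data, a, b):
    """f1: fraction of the data falling in [a, b] (an assumed choice of f1)."""
    return sum(1 for z in data if a <= z <= b) / len(data)


def specificity(a, b, alpha=1.0):
    """f2(u) = exp(-alpha * u), with u = |b - a| the interval length."""
    return math.exp(-alpha * abs(b - a))


# Hypothetical numeric data, used only to exercise the two functions.
data = [1.0, 2.0, 2.5, 3.0, 7.0]
cov = coverage(data, 2.0, 3.0)            # three of the five points lie in [2, 3]
spec = specificity(2.0, 3.0, alpha=0.5)   # interval of length 1
```

Widening the interval to cover all the data raises the coverage to 1 but lowers the specificity, which is the trade-off that the principle of justifiable granularity addresses.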

The two measures described above can be used in any setting devoted to granulation. Their definition, when FCA is used to guide the granulation process, is reported in Sect. 3.

2.2 Three-way decisions

Three-way decisions mimic a particular way of human decision making process that is based on a trisecting-and-acting model (Yao 2016a, b) involving two tasks: a division of the universal set into three pairwise disjoint regions (i.e., a trisection or a tri-partition of the universe), and the definition of actions or strategies to act upon the objects of the three regions.

Usually, the three regions are defined as positive (POS), negative (NEG) and boundary (BND). These three regions are viewed, respectively, as the regions of acceptance, rejection, and non-commitment in a ternary classification.

One of the most widely adopted formal settings for computing with three-way decisions is rough set theory (Pawlak 1982). Yao (2010) discusses three-way decisions in the classical rough set model and in the decision-theoretic rough set model. On the basis of the lower and upper approximations of a target set X, it is possible to define

$$\begin{array}{ll} \mathrm{POS}(X)= & {\underline{apr}} \left( X \right) , \\ \mathrm{BND}(X)= & {\overline{apr}} \left( X \right) - {\underline{apr}} \left( X \right) , \\ \mathrm{NEG}(X)= & U - {\overline{apr}} \left( X \right) \end{array}$$
(1)

If \(x\in \mathrm{POS}(X)\), then x certainly belongs to the target set X. If \(x\in \mathrm{NEG}(X)\), then x certainly does not belong to the target set X. As a third option, there is the non-commitment region: if \(x\in \mathrm{BND}(X)\), then it cannot be determined whether the object x belongs to the target set X or not.
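As an illustration of definition (1), the following Python sketch computes the three regions in the classical rough set model. It assumes that the indiscernibility relation is given as a partition of the universe into equivalence classes; the function name and the toy data are hypothetical.

```python
def three_regions(universe, partition, target):
    """Return (POS, BND, NEG) of `target` given a partition of `universe`."""
    lower, upper = set(), set()
    for block in partition:
        if block <= target:   # block entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper - lower, universe - upper


# Toy universe with three equivalence classes (illustrative data only).
U = {1, 2, 3, 4, 5, 6}
partition = [{1, 2}, {3, 4}, {5, 6}]
X = {1, 2, 3}
pos, bnd, neg = three_regions(U, partition, X)
```

In this toy case the class {3, 4} only partially overlaps X, so its elements end up in the boundary region, where the decision is deferred.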

However, in this context, it is important to highlight the wide sense of three-way decisions rather than to formally define the model. For a complete overview of the development and evolution of three-way decisions, readers can refer to Liu et al. (2020). As already mentioned in this section, rough sets and their variants (such as probabilistic rough sets, Ziarko 2005; Yao 2008) are the most widely adopted formal settings for three-way decisions. Three-way decisions and rough sets share a similar approach to decision making, which is based on a tri-partition of the universal set into three regions of acceptance, rejection and deferment. However, three-way decisions are not strictly tied to a specific formal model, and a decision maker can decide how to tri-partition the universe according to different strategies. For example, Liu and Liu (2015) propose a three-way decision model based on fuzzy sets for linguistic evaluation. The main idea behind three-way decisions is a trisecting-and-acting model that includes a precise definition of actions or strategies to act upon the objects of the three regions. In general, three-way decisions are built on solid cognitive foundations and offer cognitive advantages, such as reduction of cognitive load, simplicity and flexibility. These advantages enable rapid decision making, allowing decision makers to make quick and correct decisions for some cases and to focus more effort on the others. Concrete examples of three-way decisions in real life include triage systems in emergency departments and medical decision making (i.e., treatment, further test, or non-treatment).

Three-way decision models are also gaining attention as tools to explain real-world phenomena based on multi-level and multi-view abstractions. Specifically, by interpreting the three parts as three levels, it is possible to analyse phenomena and situations with tri-level conceptual models such as perception–cognition–action (PCA) (Yao 2020). Recent works on three-way decisions (Liu et al. 2020) reinforce this multi-level view by also integrating a multi-view strategy with the support of granular computing.

2.3 Formal concept analysis

FCA is used as a tool for data analysis and knowledge representation. The basis of FCA is a formal context and a hierarchical structure of concepts, i.e., a concept lattice. A formal context is a triple \(K = (O, M, I)\), where O is a set of objects, M is a set of attributes, and \(I \subseteq O \times M\) is a binary relation. Given \(A \subseteq O\) and \(B \subseteq M\), we can define two operators \(A^{'} = \lbrace m \in M | (o, m) \in I \; \forall \; o \in A \rbrace\) and \(B^{'} = \lbrace o \in O | (o, m) \in I \; \forall \; m \in B \rbrace\).

The two operators (called derivation operators) generate, respectively, the set of all attributes held by a given set of objects, and the set of all objects holding a given set of attributes. A pair \((A, B)\) is a formal concept on the formal context K if: \(A \subseteq O\), \(B \subseteq M\), \(A^{'}=B\) and \(B^{'} =A\). A and B are called extent and intent of the formal concept \((A, B)\).
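The derivation operators and the definition of formal concept can be sketched in a few lines of Python. The sketch below is a brute-force enumeration (exponential in the number of attributes) that closes every attribute subset; it is meant only for small illustrative contexts and is not the lattice construction algorithm used in the paper. The toy context and all names are hypothetical.

```python
from itertools import combinations


def up(A, M, I):
    """A': attributes shared by every object in A."""
    return {m for m in M if all((o, m) in I for o in A)}


def down(B, O, I):
    """B': objects holding every attribute in B."""
    return {o for o in O if all((o, m) in I for m in B)}


def concepts(O, M, I):
    """All formal concepts, obtained by closing each attribute subset."""
    found = set()
    for r in range(len(M) + 1):
        for B in combinations(sorted(M), r):
            extent = down(set(B), O, I)
            intent = up(extent, M, I)
            found.add((frozenset(extent), frozenset(intent)))
    return found


# Toy context: three events described by a place and a day (illustrative).
O = {"e1", "e2", "e3"}
M = {"Rome", "Milan", "Monday"}
I = {("e1", "Rome"), ("e1", "Monday"),
     ("e2", "Milan"), ("e2", "Monday"),
     ("e3", "Rome")}
lattice = concepts(O, M, I)
```

For every attribute subset B, the pair (B', B'') satisfies the two conditions above, and every formal concept is reached when B equals its intent, which is why the enumeration is complete.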

FCA has been extended to accommodate the inclusion of conditions in a formal concept and the analysis of triadic concepts (Wille 1995), fuzzy FCA (Burusco and Fuentes-González 1998), and FCA based on rough sets (Kent 1996). In the following section we introduce an extension based on three-way decisions theory (Yao 2012).

2.4 Three-way formal concept analysis

In Qi et al. (2014), three-way concept lattices and some relevant operators are defined to extend FCA to deal with three-way decisions. Authors of Qi et al. (2014) start from the assumption that traditional FCA can only support two way decisions. Basically, FCA can support only decision making procedures based on inclusion method: the two derivation operators of traditional FCA allow a decision maker to determine whether an object possesses all elements in the intension based on its membership of the extension. Exclusion methods (i.e., whether an object does not possess any elements in the intension) are also used in decision making procedures and, by combining inclusion and exclusion methods, a decision maker can also introduce a third way (non-commitment).

In this section, we report only the definitions of positive and negative operators, and their combination to define three way operators.

Given a binary information table \((U, V, R)\), where U is a set of objects, V is a set of attributes, and \(R \subseteq U \times V\) is a binary relation, the authors of Qi et al. (2014) introduce \(*\) and \(*^{-}\) as follows. Given \(X \subseteq U\), \(A \subseteq V\):

$$\begin{aligned} X^{*}= & \lbrace v \in V | \; \forall \; x \in X \; xRv \rbrace \nonumber \\ A^{*}= & \lbrace u \in U | \; \forall \; a \in A \; uRa \rbrace \end{aligned}$$
(2)
$$\begin{aligned} X^{*^{-}}= & \lbrace v \in V | \; \forall \; x \in X \; xR^{c}v \rbrace \nonumber \\ A^{*^{-}}= & \lbrace u \in U | \; \forall \; a \in A \; uR^{c}a \rbrace \end{aligned}$$
(3)

where \(R^{c} = \lbrace (u, v) | \lnot (uRv) \rbrace\) is the complement of R, that is, \(uR^{c}v\) if and only if \(\lnot (uRv)\). Definition (2) introduces the positive operators, which are the analogues of the derivation operators of Sect. 2.3, and definition (3) introduces the negative operators.

By combining \(*\) and \(*^{-}\), authors of Qi et al. (2014) define Three-Way operators, \(\lessdot : P(U) \rightarrow DP(V)\) and \(\lessdot : P(V) \rightarrow DP(U)\), as follows:

$$\begin{aligned} X^{\lessdot } = (X^{*}, X^{*^{-}}) \nonumber \\ A^{\lessdot } = (A^{*}, A^{*^{-}}) \end{aligned}$$
(4)

where \(P(\cdot )\) denotes the power set and \(DP(\cdot ) = P(\cdot ) \times P(\cdot )\). By using (3), V can be divided as follows:

$$\begin{array}{ll} \mathrm{POS}^{V}_{X} = X^{*} \nonumber \\ \mathrm{NEG}^{V}_{X} = X^{*^{-}} \nonumber \\ \mathrm{BND}^{V}_{X} = V - (X^{*} \cup X^{*^{-}}) \end{array}$$
(5)

where \(\mathrm{POS}^{V}_{X}\) is the positive region, in which every attribute is definitely shared by all objects in X, \(\mathrm{NEG}^{V}_{X}\) is the negative region, in which each attribute is definitely not possessed by any object in X, and \(\mathrm{BND}^{V}_{X}\) represents a boundary region consisting of those attributes possessed by some, but not all, objects in X. In a similar way, U can be divided into three regions:

$$\begin{array}{ll} \mathrm{POS}^{U}_{A} = A^{*} \nonumber \\ \mathrm{NEG}^{U}_{A} = A^{*^{-}} \nonumber \\ \mathrm{BND}^{U}_{A} = U - (A^{*} \cup A^{*^{-}}) \end{array}$$
(6)

and similar considerations (with respect to objects) can be done.
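The positive and negative operators, and the tri-partitions (5) and (6), can be illustrated with a short Python sketch. The binary table is represented as a dictionary from objects to the attributes they possess; the table, the helper `transpose` and all names are hypothetical and are not taken from Qi et al. (2014).

```python
def pos_op(X, V, has):
    """X*: elements of V related to every element of X."""
    return {v for v in V if all(v in has[x] for x in X)}


def neg_op(X, V, has):
    """X*-: elements of V related to no element of X."""
    return {v for v in V if all(v not in has[x] for x in X)}


def regions(X, V, has):
    """Return (POS, NEG, BND) of V induced by X."""
    pos, neg = pos_op(X, V, has), neg_op(X, V, has)
    return pos, neg, V - (pos | neg)


def transpose(has, V):
    """Attribute -> objects view of the same table, for the attribute side."""
    return {v: {x for x, attrs in has.items() if v in attrs} for v in V}


# Toy binary information table (illustrative data only).
U = {"x1", "x2", "x3"}
V = {"p", "q", "r", "s"}
has = {"x1": {"p", "q"}, "x2": {"p", "r"}, "x3": {"q", "s"}}

attr_regions = regions({"x1", "x2"}, V, has)          # partition of V, as in (5)
obj_regions = regions({"p"}, U, transpose(has, V))    # partition of U, as in (6)
```

For the objects x1 and x2, attribute p is shared by both (POS), s is possessed by neither (NEG), and q and r are possessed by only one of them (BND).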

Defining also the inverse operators, i.e., \(\gtrdot : DP(U) \rightarrow P(V)\) and \(\gtrdot : DP(V) \rightarrow P(U)\), authors of (Qi et al. 2014) introduce the Object induced three-way lattice and the Attribute induced three-way lattice, and define

  • an Object induced (OE) concept as a pair \((X, (A, B))\), where \(X \subseteq U\) and \(A, B \subseteq V\), if and only if \(X^{\lessdot } = (A, B)\) and \((A, B)^{\gtrdot } = X\)

  • an Attribute induced (AE) concept as a pair \(((X, Y), A)\), where \(X,Y \subseteq U\) and \(A \subseteq V\), if and only if \((X,Y)^{\gtrdot } = A\) and \(A^{\lessdot } = (X,Y)\).
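The definition of OE concept can be sketched in Python as follows. The text above does not spell out the inverse operator; the sketch assumes \((A, B)^{\gtrdot } = A^{*} \cap B^{*^{-}}\), which is our reading of Qi et al. (2014), and enumerates the OE concepts of a small hypothetical table by closing every object subset. AE concepts can be obtained in the same way on the transposed table.

```python
from itertools import combinations


def pos_op(X, V, has):
    """X*: elements of V related to every element of X."""
    return {v for v in V if all(v in has[x] for x in X)}


def neg_op(X, V, has):
    """X*-: elements of V related to no element of X."""
    return {v for v in V if all(v not in has[x] for x in X)}


def oe_concepts(U, V, has):
    """Brute-force OE concepts (X, (A, B)); exponential in |U|."""
    has_t = {v: {u for u in U if v in has[u]} for v in V}  # attribute -> objects
    found = set()
    for r in range(len(U) + 1):
        for X in combinations(sorted(U), r):
            A, B = pos_op(X, V, has), neg_op(X, V, has)
            # Assumed inverse operator: objects having all of A and none of B.
            closed = pos_op(A, U, has_t) & neg_op(B, U, has_t)
            found.add((frozenset(closed), frozenset(A), frozenset(B)))
    return found


# Toy binary information table (illustrative data only).
U = {"x1", "x2", "x3"}
V = {"p", "q", "r", "s"}
has = {"x1": {"p", "q"}, "x2": {"p", "r"}, "x3": {"q", "s"}}
oe = oe_concepts(U, V, has)
```

Each element of `oe` is a triple (extension, positive intension, negative intension); the boundary region of a concept is the set of attributes appearing in neither intension.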

In Sect. 4.1, we contextualize the illustrative example reported in Qi et al. (2014) to an urban planning decision-making scenario to show the added value of OE and AE concepts for temporal and spatial reasoning.

3 Time-based granulation with FCA

This section summarizes the approach for creating Timed Information Granules (TIGs) with FCA. The method is detailed in Loia et al. (2018), and starts from the assumption that a formal concept, say \((A, B)\), can be regarded as an information granule. If the objects in a formal context come with time-related information, it is therefore natural to apply FCA to perform time-related knowledge discovery and representation.

Let us give an example before formally defining TIGs. Let us consider a formal context \(K=(O,M,I)\), where the objects of O are events and the attributes of M consist of places and days, e.g. \(M=\lbrace \text {Rome, Milan, Monday, Tuesday, Wednesday, Thursday, Friday}\rbrace\). Let \(D \subseteq M\) be the subset consisting of time-related attributes, i.e. \(D=\lbrace \text {Monday, Tuesday, Wednesday, Thursday, Friday}\rbrace\). A formal concept \((A, B)\) of the formal context K is a TIG if and only if \(B \cap D \ne \emptyset\). TIGs are interpreted as periodic occurrences or co-occurrences of events.

Figure 3 shows the process for creating TIGs. The process defines a scenario, which we call \(\varphi\), consisting of a formal context, a set of time-related attributes, and a family of TIGs.

The dataset can be pre-processed to build a formal context by transforming the time slots data into time-related attributes. The analyst can decide the period of the time slots. So, with reference to Fig. 3, s/he can decide to divide a time period (such as a week) into day slots, and can transform those day slots into time-related attributes, \(d_{i}\). Thus, for instance, the attribute Tuesday in the formal context of Fig. 3 represents the union of all slots “Tuesday” for several weeks. So, the formal concept \(((e12, e21), (\text {Milan, Tuesday}))\) is a TIG giving information on the occurrence of those events in Milan on Tuesday. Such granules can be evaluated using the coverage and the specificity, mentioned in Sect. 2.1.
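The transformation of time slots into time-related attributes, and the check that a concept is a TIG, can be sketched in Python as follows. The event identifiers follow the example above, whereas the dates, the third event and the function names are hypothetical.

```python
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical raw events: identifier -> (place, date of occurrence).
events = {
    "e12": ("Milan", date(2021, 3, 2)),
    "e21": ("Milan", date(2021, 3, 9)),
    "e5": ("Rome", date(2021, 3, 1)),
}


def to_context(events):
    """Map each event to its attributes: the place and the day slot."""
    return {e: {place, DAYS[d.weekday()]} for e, (place, d) in events.items()}


def is_tig(intent, time_attrs):
    """A concept is a TIG iff its intent holds at least one time attribute."""
    return bool(set(intent) & set(time_attrs))


context = to_context(events)
# Extent and intent of the concept generated by {Milan, Tuesday}.
extent = {e for e, attrs in context.items() if {"Milan", "Tuesday"} <= attrs}
intent = set.intersection(*(context[e] for e in extent))
```

The two Milan events fall on Tuesdays of different weeks, so they are merged under the single attribute Tuesday, as described above.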

The decision maker can decide to granulate time into different time slots and to derive different time-related attributes (such as the quarters of a year). This decision results in the definition of a different scenario of analysis, say \(\varphi _{1}\).

Fig. 3 Time-based granulation (from Loia et al. (2018))

Formally, given a scenario \(\varphi = (K,D,F)\), where

  • \(K=(O,M,I)\) is a formal context,

  • \(D = \{ d_{1}, d_{2}, ..., d_{n} \}\) is a set of time related attributes, and

  • \(F= \{ (A_{i}, B_{i}) \; | \; A_{i} \subseteq O, B_{i} \subseteq M \text { s.t. } B_{i} \cap D \ne \emptyset \}_{i=1,\ldots ,n}\) is a family of TIGs

the coverage and specificity of a TIG are defined as follows:

$$\begin{aligned} \mathrm{COV}_\varphi ((A_{i}, B_{i}))= & {} \dfrac{|A_i|}{|O|} \end{aligned}$$
(7)
$$\begin{aligned} \mathrm{SPEC}_\varphi ((A_{i}, B_{i}))= & {} \dfrac{|B_{i}|}{|M|} \end{aligned}$$
(8)

When it is clear from the context, the subscript \(\varphi\) will be dropped.

The above definitions of COV and SPEC are compliant with the general formulation reported in Sect. 2.1. In fact:

  • when \(|B_{i} | = 0\) the formal concept \((A_{i}, B_{i})\) does not possess any attribute. This concept corresponds to the supremum of a formal lattice. The extent of this concept contains all the events of O and, thus, \(|A_{i}| = |O|\). In this situation, we have COV = 1 and SPEC = 0.

  • On the other hand, when \(|B_{i}| = |M|\) we have that the formal concept \((A_{i}, B_{i})\) possesses all the attributes. This concept corresponds to the infimum of a formal lattice. In this situation we have COV = 0 and SPEC = 1.

The case \(|B_{i}| = 0\) is similar to the condition we can have in data granulation when an interval, say \([a, b]\), contains all the data. In fact, it represents the situation of an improper concept containing all the objects (events). However, this concept is vague and not specific, since it does not possess attributes. The resulting TIG can be considered as an improper granule with SPEC = 0 and COV = 1.

The case \(|B_{i}| = |M|\) is similar to the condition we can have in data granulation when an interval, say \([a, b]\), collapses into a single point. In fact, it represents the situation where all the objects (events) reduce to a single very specific object, described by all attributes: it can be considered as a degenerate granule with SPEC = 1 and COV = 0.

An overall quality measure is \(Q ((A_{i}, B_{i})) = \mathrm{COV}((A_{i}, B_{i})) \times \mathrm{SPEC} ((A_{i}, B_{i}))^{\xi }\), where \(\xi\) is a parameter fixed during the analysis. If \(\xi = 0\), the decision maker is considering only coverage in the quality measure. The quality measure Q, in our case, gives us information on how meaningful and relevant a formal concept is with respect to the periodicity of the occurrences of events (e.g., fire events or traffic jams).
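Definitions (7) and (8) and the quality measure Q translate directly into code. The following Python sketch is a minimal illustration; the context sizes and the sample TIG are hypothetical.

```python
def cov(extent, objects):
    """COV = |A_i| / |O|, definition (7)."""
    return len(extent) / len(objects)


def spec(intent, attributes):
    """SPEC = |B_i| / |M|, definition (8)."""
    return len(intent) / len(attributes)


def quality(extent, intent, objects, attributes, xi=1.0):
    """Q = COV * SPEC ** xi."""
    return cov(extent, objects) * spec(intent, attributes) ** xi


# Hypothetical context with 10 events and the 7 attributes of the example M.
O = {f"e{i}" for i in range(10)}
M = {"Rome", "Milan", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday"}
tig = ({"e1", "e2"}, {"Milan", "Tuesday"})

q_balanced = quality(*tig, O, M, xi=1.0)
q_cov_only = quality(*tig, O, M, xi=0.0)   # reduces to the coverage
```

Because SPEC lies in [0, 1], values of \(\xi\) below 1 soften the effect of specificity on Q, and \(\xi = 0\) removes it entirely.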

4 Supporting spatial and temporal reasoning with three-way FCA

Once a scenario \(\varphi\) is fixed, the decision maker can use three-way FCA to reason on spatio-temporal events. The idea is to build a new formal context starting from \(\varphi\), as explained in the following.

Table 1 shows a formal context, K, which we can build with the approach described in the previous section, where \(d_{1}\), \(d_{2}\), \(d_{3}\) and \(d_{4}\) are time-related attributes, and Area1 and Area2 are places. As described in Sect. 3, we can build a family of TIGs \(F = \lbrace (A_{i}, B_{i}) \rbrace _{ i=1,\ldots ,n }\). The scenario consists of the triple \(\varphi = (K, D, F)\).

From F, we build a new formal context, \(L=(P, N, J)\), where places are objects and attributes consist only of time related attributes. We define:

  • \(P \subseteq \cup _{i} B_{i}\) a set of places,

  • \(N \subseteq \cup _{i} B_{i}\) a set of time related attributes,

  • \(J = \lbrace (p,d) \in P \times N \; | \; \exists (A_{i}, B_{i}) \in F : \; p \in B_{i}, \; d \in B_{i}, \; Q(A_{i}, B_{i}) \ge \alpha \rbrace\) a binary relation that correlates the spatial and temporal information of a formal concept (information granule) if and only if \(Q(A_{i}, B_{i}) \ge \alpha\), where \(\alpha\) is a parameter that depends on the specific analysis scenario.

P and N are disjoint sets, since \(P \cap D = \emptyset\) and \(N \subseteq D\).

Figure 4 shows the approach to derive L from the formal context of a scenario \(\varphi\), such as the one in Table 1. Looking at the lattice (left-hand side of the figure), we can see the TIGs. If the Q-value of a TIG is \(\ge \alpha\), we correlate its spatial and temporal attributes, and create the new formal context L (right-hand side of the figure).
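The derivation of L can be sketched in Python under one reading of the construction: a place and a time-related attribute are related whenever they appear together in the intent of a TIG whose Q-value reaches \(\alpha\). The TIG intents, the Q-values and the threshold below are hypothetical.

```python
def derive_context(tigs, time_attrs, alpha):
    """Build L = (P, N, J) from (intent, q_value) pairs of TIGs."""
    J = set()
    for intent, q in tigs:
        if q >= alpha:
            places = intent - time_attrs   # spatial attributes of the TIG
            times = intent & time_attrs    # time-related attributes of the TIG
            J |= {(p, d) for p in places for d in times}
    P = {p for p, _ in J}
    N = {d for _, d in J}
    return P, N, J


D = {"d1", "d2", "d3", "d4"}
# Hypothetical TIG intents with their Q-values.
tigs = [
    ({"Area1", "d1"}, 0.30),
    ({"Area2", "d2", "d3"}, 0.25),
    ({"Area1", "d4"}, 0.05),   # below the threshold, hence ignored
    ({"d1"}, 0.40),            # no place in the intent, so no pair is added
]
P, N, J = derive_context(tigs, D, alpha=0.2)
```

In this sketch P and N are restricted to the places and time slots that occur in at least one retained TIG; keeping all places and slots as rows and columns, with empty entries where no pair is related, is an equally plausible choice.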

Table 1 Formal context resulting from application of FCA for time based granulation

Fig. 4 Derivation of a new formal context from a concept lattice

The formal context L is the starting point for applying three-way FCA to reason on events. The overall methodology is formalized in the pseudo-code of Algorithm 1. In more detail, from step 1 to step 4, some variables are initialized. From step 5 to step 9, two lattices, \(\varGamma _1\) and \(\varGamma _2\), are generated by applying the FCA method to the formal context constructed from the results of TimeBsedGranulation (step 5) (Loia et al. 2018) and to its complement. From step 10 to step 17, the property concepts, applied to a lattice, returns an iterator over its formal concepts, while the properties extent and intent, applied to a given formal concept, return respectively the sets of objects and attributes included in that concept. From step 18 to step 25, OEL and AEL are generated. Finally (step 26), these lattices are returned. The following sub-section reports an illustrative example.

Algorithm 1

4.1 An illustrative example: decision making on urban traffic

We contextualize the illustrative example reported in Qi et al. (2014) to explain how we can use three-way FCA for reasoning on events. Let us consider an urban planning decision-making scenario. Let us suppose that a decision must be taken on how to improve vehicular traffic in an urban area on weekdays in the morning time slot, when the activities and educational institutes located in the area are opening. Let us suppose that the universe of discourse, U, consists of N observations related to an urban area that is divided into four adjacent areas. Each observation reports time information (e.g., date, day, hour) and information on the number of vehicles and the intensity of the traffic flow. Examples of observations are reported in Table 2.

Table 2 Observations

Following the approach described in Sects. 3 and 4, we can build a formal context, L, such as the one in Table 3 (reproduced from Qi et al. (2014)), where the objects are the four areas under investigation and the attributes are daily based. L can be regarded as an information table, and the value at the intersection of an area and a time-related attribute is 1 if the Q-value of the corresponding TIG is greater than 0.5, indicating a majority of intense traffic flow events for the specific daily TIG.

Table 3 Information table (reproduced from Qi et al. (2014))

Figure 5 shows the concept lattice (part a) and its complement (part b) induced by using traditional FCA from Table 3.

Fig. 5 Concept lattice and its complement derived from Table 3 (reproduced from Qi et al. (2014))

Besides the infimum and the supremum, both lattices contain four concepts, related to intensive traffic events in part (a) of Fig. 5 and to normal traffic events in part (b) of Fig. 5. The lattices of Fig. 5, however, give information on concepts and sub-concepts separately, supporting a sort of two-way decision. For instance, they inform that areas 1 and 3 are subject to the same conditions of intensive traffic on Thursday and of normal traffic on Wednesday. A third piece of information is implicitly available in the form of non-commitment for Monday, Tuesday and Friday, for which the decision maker can assume neither intensive nor normal traffic events for the pair of areas 1 and 3. This information, pertaining to a boundary zone, appears explicitly when we consider lattices created with three-way FCA, such as the OE and AE lattices of Fig. 6. Having the concepts of upper and lower approximations as its foundation, three-way FCA has the same potential as methods based on rough sets to model inductive reasoning schemes (Skowron and Dutta 2018). Specifically, with three-way FCA we can model inductive reasoning schemes based, respectively, on OE and AE lattices.

Fig. 6 OEL and AEL lattices (reproduced from Qi et al. (2014))

Figure 6a shows the OE lattice induced by three-way FCA. Each concept is modeled with an extension (a set of areas) and an intension that determines: (i) a POS region in which every attribute is shared by all the objects of the extension, (ii) a NEG region in which each attribute is not possessed by any object of the extension, and (iii) a BND region in which each attribute is possessed by some, but not all, objects of the extension. More formally, each concept is represented as \((X, (X^{*}, X_{*}))\), where \(X_{*}\) denotes the negative operator \(X^{*^{-}}\) of Sect. 2.4 and: \(X^{*}=\mathrm{POS}^{V}_{X}\), \(X_{*} = \mathrm{NEG}^{V}_{X}\), and \(V-(X^{*} \cup X_{*}) = \mathrm{BND}^{V}_{X}\), with V the attribute set. So, the concept (13, (d, c)) of Fig. 6a states that the pair of areas 1 and 3 shares d, does not possess c, and that nothing can be said about the other attributes. This is the same information discussed earlier in this section, expressed with a higher level of granularity.

In an AE lattice, a concept is represented as \(((A^{*}, A_{*}), A)\), where \(A_{*}\) denotes \(A^{*^{-}}\). The extension is a pair consisting of a POS region (all the objects having all the attributes in A) and a NEG region (the set of objects not possessing any attribute in A); the remaining objects form \(\mathrm{BND} = U - (A^{*} \cup A_{*})\), with U the object set.

OE and AE introduce new concepts with respect to a lattice generated with traditional FCA. These new concepts are induced by the three-way FCA operators presented in Sect. 2.4. Moreover, since we considered in Table 3 time slots as attributes and spatial areas as objects, an OE lattice is a representation better suited for a time-perspective inductive reasoning and an AE lattice for a spatial one.

With an OE lattice, the decision maker has direct information on POS and NEG regions of the attributes that, in our case, are granular information on days. Therefore, s/he has direct information on the days when areas are subject to intensive traffic or not, and can follow this perspective when analyzing the sub-concepts providing finer information. On the other hand, with an AE lattice, the decision maker has direct information on POS and NEG regions of the objects that, in our case, are areas of a city. So, s/he has direct information on the areas subject to intensive traffic or not in specific days.

Depending on the decisions s/he must take, the decision maker can select or combine the two lattices. For instance, if the decision problem concerns traffic jam reduction on Monday and Tuesday, the decision makers can analyze the concepts ((1, 3), abe) and ((24, 3), abc) of Fig. 6b to make a decision of re-routing traffic from 1, 2 and 4 to 3. Of course, the same information can be inferred from the OE lattice of Fig. 6a. An OE lattice, however, appears more intuitive if the decision problem relates to the identification of a certain time period to limit traffic into areas. For instance, analyzing the concept \((123, (ab, \emptyset ))\) it is intuitive to select Monday and Tuesday to limit traffic.

5 Experimentation on real data

The evaluation of the approach proposed in the previous sections is based on the forestfire dataset (Cortez and Morais 2007), constructed from real-world observations of fires that occurred in the Montesinho natural park in Portugal. The data used in the experiments were collected from January 2000 to December 2003 and were built by merging two databases. The first database collects observations related to fires that occurred in the aforementioned park. On a daily basis, every time a forest fire occurred, several features were registered, such as the time, date, spatial location within a 9 \(\times\) 9 grid (x and y axis), the type of vegetation involved, the six components of the FWI system and the total burned area. The second database contains several weather observations (e.g. wind speed) that were recorded with a 30-min period by a meteorological station located in the center of the Montesinho park. The final dataset used in the present work consists of 517 entries with the following features: X (x-axis coordinate), Y (y-axis coordinate), month (month of the year), day (day of the week), FFMC (Fine Fuel Moisture code), DMC (Duff Moisture code), DC (Drought code), ISI (Initial Spread Index, a fire behavior index), temp (outside temperature in Celsius degrees), RH (percentage of the outside relative humidity), wind (outside wind speed in km/h), rain (outside rain in mm/m\(^2\)), and area (amount of burned area in ha). The dataset adopts a model of the Montesinho park organized as a grid of 81 cells (9 \(\times\) 9). Some cells do not map a real zone of the park and are not used in the dataset. Features X and Y are used to localize the zone in which an event occurred. Every row in the dataset reports data related to an observation of a specific fire event.

The same zone can host several events, including events that occurred in the same month but in different years, or on the same day of the week but in different weeks. Such events represent different fires. Frequency information on fires in the same zone for the same period can be calculated by counting the number of events that fall in that period.

5.1 Data pre-processing

The data pre-processing phase has been organized in three main sequential steps. The first step removes the features that are irrelevant with respect to the goal of the work. In particular, we used only the following subset of features: X, Y, month, and area, where the last one is the decisional feature. The second step conceptually scales the multi-valued feature month (months from January to December) in order to obtain twelve boolean features, one for each month. In addition, the features X and Y have been merged to obtain the feature XY, whose values are two-digit numbers. In the third step we dropped from the dataset all the rows having a value lower than 0.20 for the feature area, which is considered as the decision attribute. In this way, it was possible to exclude all events having low significance with respect to the aims of the work. Lastly, the area feature has been dropped from the dataset.
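The three steps can be illustrated with a minimal sketch in plain Python. It is not the code used in the experiments: the function name preprocess, the lowercase month labels and the sample rows are assumptions made only for illustration, while the retained features and the 0.20 threshold on area come from the description above.

```python
# Month labels are an assumption about how the month feature is encoded.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def preprocess(rows, min_area=0.20):
    """Return one record per significant fire event: zone id plus 12 month flags."""
    events = []
    for row in rows:
        # Drop events whose burned area is below the significance threshold.
        if float(row["area"]) < min_area:
            continue
        # Merge X and Y into a two-digit zone identifier, e.g. X=7, Y=4 -> "74".
        record = {"XY": f"{int(row['X'])}{int(row['Y'])}"}
        # Conceptual scaling: one boolean feature per month.
        for m in MONTHS:
            record[m] = (row["month"] == m)
        # Only X, Y and month are kept; area is used for filtering and not copied.
        events.append(record)
    return events

# Hypothetical rows, not taken from the real dataset.
sample = [
    {"X": 7, "Y": 4, "month": "aug", "day": "fri", "area": 1.5},
    {"X": 3, "Y": 4, "month": "sep", "day": "mon", "area": 0.0},
    {"X": 6, "Y": 5, "month": "sep", "day": "tue", "area": 0.2},
]
events = preprocess(sample)
```

Each resulting record corresponds to one row of the table used as the starting point of the granulation: a zone identifier and twelve boolean month features.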

5.2 Temporal information granulation

The starting point is a table (let us call it the master table) in which rows are fire events and columns are time slots. Therefore, for each event we know only the time slot and the zone (coordinates in the grid) in which it occurred. Given the original dataset, we start with the finest temporal granularity, in the sense that we know the month (of the year) in which the fire occurred. Starting from the master table, we have constructed different time-based granulations by aggregating event data into different time slots. In particular, the chosen granularity levels (guiding granulations) are reported in Fig. 7. The temporal granulation process is described in Sect. 2 and graphically represented in Fig. 3. In more detail, for each granulation level a formal context and the corresponding lattice are created by applying the FCA operators (see Sect. 2.3). Then, for each granulation level, the function Q is calculated (with parameter \(\xi =0.7\) to give more importance to coverage with respect to specificity) for every formal concept included in the corresponding lattice. The result of this phase is a set of tables in which rows indicate zones and columns indicate time slots. Therefore, all tables have the same number of rows but a different number of columns, due to the different number of time slots. In particular, for granulation level 1 there are 3 columns (JanFebMarApr, MayJunJulAug, SepOctNovDec), for granulation level 2 there are 4 columns (JanFebMarApr, MayJun, JulAug, SepOctNovDec) and for granulation level 3 there are 5 columns (JanFebMarApr, MayJun, Jul, Aug, SepOctNovDec). A cell of such a table contains the Q-value for a specific zone in a given time slot. Note that the Q-value is a measure of the interestingness of a concept including spatio-temporal information. In our case, the interestingness is related to multiple occurrences of fire events in the same zone in the same time period (e.g., the same month, possibly in different years). In particular, the more occurrences there are, the more interesting the granule/concept is.
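The aggregation of monthly events into the time slots of the three granulation levels can be sketched as follows. The sketch only counts occurrences per zone and time slot; it does not reproduce the FCA-based granulation or the computation of Q, which are defined in Sect. 2. The names LEVELS, slot_name and granulate, as well as the sample events, are hypothetical, and level 3 is encoded under the reading that July and August form two separate slots.

```python
from collections import defaultdict

# Time slots per granulation level, as month groups (see Fig. 7).
LEVELS = {
    1: [("jan", "feb", "mar", "apr"), ("may", "jun", "jul", "aug"),
        ("sep", "oct", "nov", "dec")],
    2: [("jan", "feb", "mar", "apr"), ("may", "jun"), ("jul", "aug"),
        ("sep", "oct", "nov", "dec")],
    3: [("jan", "feb", "mar", "apr"), ("may", "jun"), ("jul",), ("aug",),
        ("sep", "oct", "nov", "dec")],
}

def slot_name(months):
    """Build a column label such as 'MayJun' from ('may', 'jun')."""
    return "".join(m.capitalize() for m in months)

def granulate(events, level):
    """Count events per zone and time slot.

    events: iterable of (zone, month) pairs; returns {zone: {slot: count}}.
    """
    slots = LEVELS[level]
    table = defaultdict(lambda: {slot_name(s): 0 for s in slots})
    for zone, month in events:
        for s in slots:
            if month in s:
                table[zone][slot_name(s)] += 1
    return dict(table)

# Hypothetical events, not taken from the real dataset.
sample = [("74", "aug"), ("74", "jul"), ("74", "sep"),
          ("43", "may"), ("43", "aug")]
level1 = granulate(sample, 1)
level2 = granulate(sample, 2)
```

In this toy input, zone 43 has two events in MayJunJulAug at level 1, which split into one event in MayJun and one in JulAug at level 2. This is the kind of dilution that a finer granulation can produce.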

Fig. 7
figure 7

Three levels of granulation

5.3 Executing the three-way decisions based on FCA

To execute the sequential analysis approach proposed in the present paper, the tables obtained by applying the Temporal Information Granulation are transformed into formal contexts (zones are objects and time slots are attributes). The discretization of the Q-values in the tables was realized as follows. If the value Q(z, ts) for zone z and time slot ts is greater than a given threshold, then we put 1 in the corresponding cell (z, ts); otherwise the value of the cell is 0. This rule was applied to all tables (one for each granulation level). The three resulting formal contexts (one for each granulation level) were then used to build OE and AE by using the operators described in Sect. 2.4. Figures 8, 9, 10, 11, 12 and 13 report the formal concepts of OE and AE for each considered granulation level.
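The discretization rule can be sketched as follows. The threshold value is not reported here, so the value 0.5 and the Q-values in the example are purely hypothetical, as is the function name to_formal_context.

```python
def to_formal_context(q_table, threshold):
    """Turn a {zone: {time_slot: Q-value}} table into a binary formal context.

    A cell is 1 when its Q-value is strictly greater than the threshold,
    and 0 otherwise.
    """
    return {
        zone: {slot: int(q > threshold) for slot, q in row.items()}
        for zone, row in q_table.items()
    }

# Hypothetical Q-values for two zones at granulation level 1.
q_level1 = {
    "74": {"JanFebMarApr": 0.10, "MayJunJulAug": 0.80, "SepOctNovDec": 0.65},
    "43": {"JanFebMarApr": 0.00, "MayJunJulAug": 0.55, "SepOctNovDec": 0.50},
}
context = to_formal_context(q_level1, threshold=0.5)
```

With a strict comparison, a cell whose Q-value equals the threshold is set to 0, as happens for zone 43 in SepOctNovDec in this example.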

Fig. 8
figure 8

OE concepts for granulation level 1

Fig. 9
figure 9

AE concepts for granulation level 1

Fig. 10
figure 10

OE concepts for granulation level 2

Fig. 11
figure 11

AE concepts for granulation level 2

Fig. 12
figure 12

OE concepts for granulation level 3

Fig. 13
figure 13

AE concepts for granulation level 3

Two types of reasoning can be executed on the results previously reported: spatial reasoning and temporal reasoning. Let us proceed with the first one (spatial). Consider the OE for granulation level 1 (Fig. 8) and, in particular, the concept including the places \(('34', '44', '45', '63', '65', '74')\). This concept has a positive region equal to \(('SepOctNovDec',)\), an empty negative region and a boundary region equal to \(('JanFebMarApr', 'MayJunJulAug')\). The interpretation of this concept is that the zones \((X=3, Y=4)\), \((X=4, Y=4)\), \((X=4, Y=5)\), \((X=6, Y=3)\), \((X=6, Y=5)\) and \((X=7, Y=4)\) are all involved in significant fire events in the period from September to December. For the periods from January to April and from May to August, instead, we can neither affirm that all the aforementioned zones are involved in fire events nor that none of them is. It is interesting to note that these zones form two clusters of neighbors. This result can be very useful for a decision maker who has to plan actions to mitigate the dangerous effects of fires or to prevent them. Granulation levels 2 and 3 allow finer time slots to be considered. In fact, the OEL for granulation level 2 (Fig. 10) provides a concept including the same places analyzed for granulation level 1. In this case we have a positive region equal to \(('SepOctNovDec',)\), a negative region equal to \(('MayJun',)\) and a boundary region equal to \(('JanFebMarApr', 'JulAug')\). The interpretation is that, after refining the time slots (i.e., by considering granulation level 2), we find that in May and June none of the aforementioned zones is involved in fire events (this was not clear when using only granulation level 1). These considerations are graphically explained in Fig. 14, where places in the positive region are red and marked with the letter P, places in the boundary region are yellow and marked with the letter B, and places in the negative region are green and marked with the letter N. Granulation level 3 provides finer results.
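The tri-partition used in the spatial reasoning above can be sketched for a set of zones as follows. The positive region collects the time slots possessed by all the zones, the negative region those possessed by none of them, and the boundary region the remaining ones. The function name oe_regions and the toy context are assumptions for illustration; the context does not contain the values of Figs. 8 and 10, and the sketch computes only the regions of a given set of zones, not the whole lattice.

```python
def oe_regions(context, zones):
    """Tri-partition the time slots with respect to a non-empty set of zones.

    context: {zone: {time_slot: 0 or 1}}
    Returns (positive, negative, boundary) as sets of time slots.
    """
    attributes = set().union(*(row.keys() for row in context.values()))
    # Positive region: slots in which every selected zone has significant fires.
    positive = {a for a in attributes if all(context[z][a] for z in zones)}
    # Negative region: slots in which no selected zone has significant fires.
    negative = {a for a in attributes if not any(context[z][a] for z in zones)}
    # Boundary region: slots shared by some, but not all, of the selected zones.
    boundary = attributes - positive - negative
    return positive, negative, boundary

# Toy formal context at granulation level 2 (hypothetical values).
toy_context = {
    "34": {"JanFebMarApr": 1, "MayJun": 0, "JulAug": 0, "SepOctNovDec": 1},
    "44": {"JanFebMarApr": 0, "MayJun": 0, "JulAug": 1, "SepOctNovDec": 1},
    "22": {"JanFebMarApr": 0, "MayJun": 1, "JulAug": 1, "SepOctNovDec": 0},
}
positive, negative, boundary = oe_regions(toy_context, ["34", "44"])
```

For the zones 34 and 44 of the toy context, the positive region is SepOctNovDec, the negative region is MayJun, and the boundary region contains JanFebMarApr and JulAug. The boundary slots are those for which a decision would be deferred until further information is gathered.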

Fig. 14
figure 14

Graphic representation of granulations at levels 1 and 2 of the places 34, 44, 45, 63, 65 and 74

The second type of reasoning (temporal) can be executed by considering the AE for the three granulation levels. In particular, if we focus on the concept including the time slot \(('MayJunJulAug',)\), it is possible to observe that the positive region is \(('22', '43', '74', '86')\), the negative region is \(('14', '15', '24', '25', '34', '44', '45', '46', '54', '63', '64', '65', '73', '75', '76', '83', '88', '94', '96')\) and the boundary region is empty. The interpretation of this concept is that, in the period from May to August, the zones \((X=2, Y=2)\), \((X=4, Y=3)\), \((X=7, Y=4)\), and \((X=8, Y=6)\) are afflicted by significant fires with a significant periodicity. If we now consider granulation level 2, we observe two concepts. The first one includes the time slot \(('JulAug',)\) and has a positive region equal to \(('22', '74', '86')\). The second one includes the time slot \(('MayJun',)\) and has an empty positive region. The interpretation is that, after splitting the original time slot (from May to August), we obtain two finer time slots, and it is possible to affirm that the zones \((X=2, Y=2)\), \((X=7, Y=4)\), and \((X=8, Y=6)\) are involved in significant fire events with a significant periodicity in the period from July to August. There are no zones involved in significant fires with a significant periodicity in the period from May to June. Furthermore, the zone \((X=4, Y=3)\), which seems to have disappeared, is now included in the boundary regions of both the above concepts. The interpretation is that the fires in zone \((X=4, Y=3)\) are distributed across the period from May to August and, once this period is divided into two time slots, the number of occurrences of fires in this zone is no longer significant in either of the periods from May to June and from July to August. Also in the case of the AEL it is possible to use granulation level 3 to obtain finer results that better support the task of decision makers.
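The dual tri-partition, in which zones are partitioned with respect to a set of time slots, can be sketched in the same way. The function name ae_regions and the toy context are hypothetical and do not reproduce the values of Figs. 9, 11 and 13. The sketch computes only the regions of a given set of time slots, not the AE lattice.

```python
def ae_regions(context, slots):
    """Tri-partition the zones with respect to a non-empty set of time slots.

    context: {zone: {time_slot: 0 or 1}}
    Returns (positive, negative, boundary) as sets of zones.
    """
    # Positive region: zones with significant fires in every selected slot.
    positive = {z for z, row in context.items() if all(row[s] for s in slots)}
    # Negative region: zones with significant fires in none of the selected slots.
    negative = {z for z, row in context.items() if not any(row[s] for s in slots)}
    # Boundary region: zones with significant fires in some, but not all, slots.
    boundary = set(context) - positive - negative
    return positive, negative, boundary

# Toy formal context at granulation level 2 (hypothetical values).
toy_level2 = {
    "22": {"MayJun": 0, "JulAug": 1},
    "86": {"MayJun": 1, "JulAug": 1},
    "14": {"MayJun": 0, "JulAug": 0},
}
single = ae_regions(toy_level2, ["JulAug"])
pair = ae_regions(toy_level2, ["MayJun", "JulAug"])
```

For a single time slot the boundary region computed by this sketch is always empty, because each zone either has the slot or does not. A non-empty boundary appears only when several slots are considered together, as for zone 22 in the second call.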

6 Discussion and comparison with other approaches

We discuss and compare our results with other methods and techniques for spatio-temporal reasoning. Specifically, we consider those methods based on concept approximation and, in general, techniques based on the Granular Computing paradigm.

The problem of spatio-temporal reasoning has been studied by Bazan (2008) and Skowron et al. (2016, 2020). These works share the idea of modeling reality as complex dynamic systems and use concept approximation to reason on real-world phenomena and situations involving spatial and temporal representations.

Bazan (2008) defines a hierarchical classifier for the approximation of complex concepts, where a complex concept can be related to real-world objects consisting of several parts. Bazan uses ontology models to represent domain knowledge, and applies the method to approximate spatio-temporal complex concepts, identify behavioural patterns of complex objects, and automate their planning. The classifier is also used to define production rules and approximate reasoning schemes (Skowron and Synak, n.d.) that support agents and humans in decision making.

Skowron et al. (2016) further develop the above-mentioned concepts in an Interactive Granular Computing (IGrC) framework based on complex granules (c-granules). The framework can support the decision maker with adaptive judgement, which combines the traditional forms of reasoning based on deduction, induction and abduction.

The papers analysed above describe frameworks offering a functional completeness which includes, among other things, the identification of behavioral patterns of complex objects and planning automation. These activities are outside the scope of our method, which is limited to the definition of a new approach to granulate events and to observe and understand the resulting granules with the support of a formal knowledge structure such as the three-way FCA.

We share with the work of Bazan the need for formal representations of knowledge; to this purpose, the first phase of our method adopts a formal context to perform granulation with TIGs. However, we limit our method to supporting the interpretation and observation of concepts related to the occurrence of events. Our TIGs are semantically different from complex objects or c-granules. TIGs are groups of events and do not represent complex phenomena. They can support the decision maker in deductive reasoning on the basis of the occurrence of events, which the decision maker can observe at different levels of abstraction. The decision maker can thus infer where and when specific events (e.g., fire events) can or cannot occur.

The adoption of three-way FCA allows the introduction of a third option, deferment, which, combined with the knowledge relating to the occurrence of events, can support the decision maker in defining strategies and actions to make an event happen or to prevent it.

To conclude, one of the most important aspects of the proposed approach with respect to existing works is the type of results. The approach provides explainable results that can be accessed and interpreted by both human and automatic agents. Representation and interpretation of Information Granules are, in fact, basic issues in GrC. After the granulation process it is necessary to give the information granule a name or a descriptive label that is representative of the information relating to the granule. As suggested in Yao and Zhong (2007), a strategy may be to assign a name to a granule such that an element in the granule is an instance of the named category, or to provide a formal description of the objects in the same granule. To simplify the interpretation of information granules, our method leverages a formal representation based on formal contexts. As the experimentation shows, the results are represented as actionable knowledge in the form of a conceptual structure, providing interesting insights able to support decision making in an environmental monitoring scenario. The provided knowledge supports the cognitive processes of the operators, who know the problem, receive insights that are represented in a formal context and are explainable (this is the main difference with the typical machine learning based approach), evaluate them and make decisions. The operators are not replaced by an automatic system, nor are they merely given a value (or a set of values) that is difficult (if not impossible) to interpret.

7 Conclusions

We have presented a method to execute time-based granulation and to support a decision maker in the interpretation of the created granules. The method performs a time-based granulation guided by knowledge structures, such as FCA, and adopts a three-way FCA to support spatio-temporal reasoning and decision making. The method has been evaluated on a real dataset consisting of observations of fire events in the north-east region of Portugal. The achieved results are promising, mainly with respect to the possibility of obtaining explainable results that can be observed and understood at different levels of granularity. We plan to evaluate the method in other applications such as situation awareness (Loia et al. 2016; D’Aniello et al. 2017), analysis and reasoning on tweet streams (Cuzzocrea et al. 2015), sign prediction in social networks (Loia et al. 2018), smart city (D’Aniello et al. 2016) and smart museum (Capuano et al. 2016), and on other kinds of events such as terrorism (“Understanding the composition and evolution of terrorist group networks: A rough set approach”, 2019) and security analysis (Fujita et al. 2019).