1 Introduction

According to the United Nations World Tourism Organization, over the last 3 years, the developed countries including Germany, the US, and China, exhibited the strongest growth rates and they had the highest international tourism expenditures. Despite this high growth rate, there are several challenges still tourism industry is dealing with. They include: offering information in several languages, providing more personalised suggestions and offers, offering more value to the user, and retaining your customers. In this regard, crowd sourced gather data, ambient intelligence (AmI) and mobile services can contribute significantly.

Most people enjoy the subject of travel, and it’s easy to find active mobile users in the crowd who are happy to participate in projects on the topic. The participation of the crowd helps to give more realistic and up-to-date information about destinations already visited. This paper proposes a framework to use the crowd-sourced data for the purpose of tourist guidance and ambient services, in particular suggestion-making services. Crowd-sourced trajectory data is analysed to extract the sites/points of interest and attractive paths and places. The use of crowd trajectories can provide more up-to-date destination suggestions minimising the direct effect of commercialised advertisement, which might not reflect the public interest.

The contribution of the ambient intelligence (AmI) in travel and tourism related an application is also significant. This is mainly because the visitors, travellers, and the tourists are not usually very familiar with the visiting areas and therefore they need an assistive, personalized, adaptive, ubiquitously aware of the location, and intelligent service (Manes 2002).

This paper shows the use of crowd-sourced trajectories of movements in an art gallery to analyse and recognise the patterns of movement, identify the point of interests (PoIs) using spatio-temporal data mining techniques and use the recognised patterns and PoI in more personalised, adaptive and intelligent recommender and suggestion making services. There are several research projects investigating the use of AmI in tourist assisted services (Huang et al. 2015; Umanets et al. 2014; Wang et al. 2016; Nakamura et al. 2015; Kabassi 2013; Sorrentino et al. 2015; Smirnov et al. 2013; Xu and Ke 2015; Tom Dieck and Jung 2015), there is still a need to use the patterns of volunteers’ trajectories of movements as there are some information hidden within the data. However as reviewed by Yuan et al. (2016), majority of data mining techniques for the application of smart tourist services, are limited to online multimedia, such as travel (Vu et al. 2015), Check-in data (Lu et al. 2012), rather than movement behaviour and trajectories of travels. (Abramson et al. 2014) used the trajectories of Flickr geo-tagged photos to identify the points of interest. And Yuan et al. (2016) used the blog data mining to summarise and categorise tourist information. This paper uses the trajectories of movements, both indoors and outdoors, to extract knowledge about the interests, preferences, and demands of tourists. For example, some individuals skip particular paintings in a gallery and spend more time in some sections, as their interest could be different from the others. If such patterns are identified, the same suggestions can be made to a new individual who starts with the same interest area. In addition to this, the use of crowd-sourced data allows us to cluster tourists based on common interest without distracting them by keep asking preference-related questions. Their needs and demands can be automatically predicted if the shared patterns are available. Also, there is some additional information that can be provided to the visitor, with respect to the PoIs, if the interest and preferences are identified. In general tourist guidance, is one of the applications where AmI can have a great contribution to the quality of the service, as the tourists, who may not be very familiar with the visiting site, need a location-aware, ubiquitous, personalised and informative service. Such services should be able to understand the preferences of the users without requiring the users to specify them, predict their interests, and provide relevant and tailored services in the most appropriate way, including audio, visual, and haptic.

To identify such patterns and clusters, spatio-temporal data mining techniques (Cressie and Wikle 2015) can be very helpful tools. The spatio-temporal data mining techniques can identify the patterns hidden in data, detect the similarities and anomalies, and find classes and clusters. This paper focuses on using spatio-temporal data mining, and in particular trajectory mining techniques, to identify PoIs using anonymous trajectories of movements and use the PoIs in some applications including Tourist guidance and suggestion making services, and navigation services.

In this regard, trajectory data, without any reference to a moving user’s identity, should be stored. It is possible to use many different anonymisers to make sure that anonymity of data is preserved; in this project, K-anonymity (Kalnis et al. 2007) has been applied. The contributors of the trajectories can simply upload their Global Positioning System (GPS) tracks or draw and store their movements’ polylines on the maps. It is also possible to capture the trajectories indirectly from other resources, including CCTVs, Global Navigation Satellite System (GNSS) embedded in mobile phones or vehicles, accelerometers and so on. Having a huge amount of input data from a variety of resources makes the data management and storage challenging. Due to the volume of the input data, and also the variety of data resources and consequently the different schema for the datasets, the trajectory analysis should be done using big data analytics techniques. Therefore the input data is stored in a Graph database, Neo4j, to have more efficient trajectory data management and analysis trajectory elements (nodes and edges) in comparison with other solutions (Amirian et al. 2015). In order to extract patterns and identify rules, an inference engine is developed to apply big data mining techniques to detect anomalies, identify clusters and classify data (Baiget et al. 2008), and then extract patterns and rules (Kuntzsch and Bohn 2013; Zhang et al. 2013). Such information and knowledge can be used in pathfinding and routing decision-making and many other navigational suggestion-making applications.

This paper is structured as following; the next section discusses the nature of the trajectory data, i.e. trajectories, data preparation and management (in graph databases), then it discusses the trajectory data mining techniques. And Sect. 3 implements the data mining techniques and inference system with application to tourist guidance.

2 Trajectory mining

Data mining, as a field at the intersection of computer science and statistics, is a field, which attempts to discover patterns from large datasets. It utilizes methods at the intersection of artificial intelligence, machine learning, statistics, and database systems (Zheng 2015). The overall goal of the data mining process is to extract information from a dataset and transform it into an understandable structure for further use. Data mining involves six common classes of tasks:

  1. 1.

    Anomaly detection (outlier/change/deviation detection)—the identification of unusual data records, that might be potential errors or anomalies

  2. 2.

    Association rule learning (dependency modelling)—searches for relationships between variables. Using association rule learning, one can determine which variable are more related to another. This is sometimes referred to as market basket analysis.

  3. 3.

    Clustering—is the task of discovering groups and structures in the data that are in some way or another “similar”, without using known structures in the data.

  4. 4.

    Classification—is the task of generalizing known structure to apply to new data.

  5. 5.

    Regression—attempts to find a function which models the data with the least error.

  6. 6.

    Summarization—providing a more compact representation of the dataset, including visualisation and report generation

In order to identify the patterns of movements among tourists’ trajectories and recognise PoIs for recommender systems, several steps to be taken to accomplish the trajectory data mining process, see Fig. 1. The earliest steps in trajectory data mining process are data capture and preparation, which are explained in the next subsection.

2.1 Input data: trajectories

A trajectory is the trace of movement of a moving object. The moving object can be a pedestrian, driver, a group of animals, a natural phenomenon like a tornado, that moves in space and time (Mazimpaka and Timpf 2016). Trajectories can have several components, however location and time are the main components of trajectory data.

There are some preconditions that should be satisfied to make sure that the results are valid and are not based on incomplete input datasets. If the input dataset is small (for example in terms of the number of samples and the extent and density/frequency of trajectories with respect to space and time), then outputs (the rules and patterns) may be valid only for a specific situation, area, and time interval. The level of robustness of output knowledge is strongly correlated with input sample size. In this regard, for successful pattern recognition and rule extraction there are some rules-of-thumb (Basiri et al. 2016b).

  1. 1.

    The sample size of trajectory data has to be large, otherwise the training and control/test process may not find all patterns contained in the data.

  2. 2.

    The sample data should be dense enough to give complete spatial coverage.

  3. 3.

    The sample data should be frequent enough to give complete temporal coverage (different days, times, weekends, seasons, and so on)

  4. 4.

    The sample data should cover all (at least most) travel behaviours and modes. This helps spatio-temporal data mining to exclude anomalies and exceptions, and also it helps to find clusters and classes, which share common patterns easier and with a greater level of certainty.

The larger input datasets may lead to more realistic results since anomalies will have less weight in pattern recognition process. However capturing input data with an acceptable level of quality is a big challenge as there is not a globally available, accurate, cheap or ideally free-to-use positioning technique, which can localise the users seamlessly indoors and outdoors. Due to this issue, tracking data can be captured from several different sources such as Global navigation Satellite Systems (GNSS) receivers, e.g. GPS receivers embedded in mobile phones and In-Vehicle Navigation Devices, RFID tags and readers, video cameras and CCTVs, Bluetooth networks (Basiri et al. 2015b, 2017). Also large input data storage, retrieval and analysis could only be efficiently handled if the trajectories are stored in a graph database. This paper uses a graph database, Neo4j, to store and retrieve trajectories. As the name implies graph databases are based on graph theory and employ nodes, properties and edges as their building blocks. The nodes and edges can have properties. In the graph databases, various nodes might have different properties. The graph databases are well suited for data, which can be modelled as networks such as road networks, social networks, biological networks and semantic webs. Their main feature is the fact that each node contains a direct pointer to its adjacent node, so no index lookups are necessary for traversing connected data. As a result, they can manage a huge amount of highly connected data since there is no need for expensive join operations. Some of the graph databases support transactions in the way that relational databases support them. In other words, the graph database allows the update of a section of the graph in an isolated environment, hiding changes from other processes until the transaction is committed. Trajectory data and in general geospatial data can be modelled as graphs, since graph databases support topology natively, topological relationship (especially connectivity) between geospatial data can be easily managed by this type of NoSQL databases (Amirian et al. 2015). In most GIS workflows, topological relationships play a major role. In addition since each edge in a graph database can have a different set of properties, they provide flexibility in the traversal of the network, based on various properties. For example, it is possible to combine time, distance, a number of points of interest, and the user preferences in finding best path and the mentioned path would be unique for each user. In summary, the storage model of graph databases is a graph and there is a need for mapping layer whenever another data structure is needed in the application layer.

In addition, the protection of the contributors/users’ privacy is one of the key challenges (Basiri et al. 2015b) as there are many pieces of private and confidential information hidden in the data and any links to the identity of the users/contributors could be damaging to their safety, security and privacy. It is very important to make the trajectories anonymised and this can be done using different anonymisers such as K-anonymity (Gruteser and Grunwald 2003). Most existing anonymisers on tracking data adopt a K-anonymity. In order to do so, the location of the user got by a query and K to the anonymiser, which is a trusted third party (Kalnis et al. 2007; Mokbel et al. 2006) in centralized systems or a peer in decentralized systems (Ghinita et al. 2007a, b). The anonymiser removes the ID of the user and cloaks the exact user location. Then anonymiser sends the location and query to the spatial database or location-based services server.

2.2 The proposed methodology

Having the data anonymised, then trajectories are ready to go through the next stage and be pre-processed. The pre-processing stage is to make them easier to store (using segmentation, compression and simplification techniques) and also semantically more understandable (by identifying the stay points and using the map matching techniques). The pre-processing step is actually based on the fact that all the points in a trajectory are not equally important and meaningful (Zheng 2015). The pre-process step, including stay point detection, trajectory segmentation, trajectory simplification and compression make the trajectories ready to be stored and retrieved in a more efficient way. In addition to the efficiency, trajectory segmentation and stay point detection make the trajectories and some of the points semantically meaningful and easier to interpret. Such stay points can be used in recommended places to visit (Basiri et al. 2014, 2015a), estimate the actual travel time and petrol consumption (Shang et al. 2014).

Stay points refer to the locations where users/contributors have stayed for a while. The stay points can be simply identified if the location of the user is not changing over a period of time. However due to positioning services’ inaccuracy and errors (Pang et al. 2013), it commonly happens that the user stays stationary for a while but the positioning technology generates different readings (Zheng 2015). In order to detect such stay points/area, there are several algorithms and methods.

Basiri et al. (2016a) proposed an approach that checks the travel speed for each segment and if it is smaller than a threshold then the average of these two points are replaced/stored as the “stay point”. It is possible to put distance and temporal interval threshold separately, instead of the speed of movement at each segment (i.e. if the distance between a point and its successor is larger than a threshold and also the time span is larger than a given value) as Li et al. (2008) proposed. Yuan et al. (2016) proposed using the density-clustering algorithm to identify the stay points. This paper uses this approach in order to identify the speed clustering algorithms to identify the stay points. This can be viewed as the combination of segments’ speed threshold and the density-clustering algorithms.

Fig. 1
figure 1

Trajectory mining process

Data mining techniques, which have been implemented in some pattern recognition projects, are based on static approaches (conventional data mining techniques), which are not fully compatible with the spatial and temporal aspect of trajectory data. Also in some research projects (Yavas et al. 2005; Monreale et al. 2009) where dynamic data mining has been applied, spatial and temporal aspects of input data were considered separately. This approach does not include spatio-temporal relationships which can help to identify some other rules.

The problems with most spatial and temporal (not spatio-temporal) data mining techniques, which have been used for pattern recognition, are:

  • Spatio-temporal topological relationships are ignored. The spatial relations, both metric (such as distance) and non-metric (such as topology, direction, shape, etc.) and the temporal relations (such as before and after) are information bearing and therefore need to be considered in the data mining methods. Also, some spatial and temporal relations are implicitly defined and they are not explicitly encoded in a database. These relations should be extracted from the data. However, there is always a trade-off between pre-computing them before the actual mining process starts (eager approach) and computing them on the fly when they are actually needed (lazy approach). Moreover, despite much formalization of space and time relations available in spatio-temporal reasoning, the extraction of spatial/temporal relations implicitly defined in the data introduces some degree of certainty that may have a large impact on the results of the data mining process.

  • Working at the level of stored data, that is, geometric representations (points, lines and regions) for spatial data or timestamps for temporal data, is often undesirable. Therefore, complex transformations are required to describe the units of analysis at higher conceptual levels, where human-interpretable properties and relations are expressed.

  • Spatial resolution or temporal granularity can have a direct impact on the strength of patterns that can be discovered in the datasets. General patterns are more likely to be discovered at the lowest resolution/granularity level. On the other hand, large support is more likely to exist at the higher levels of resolution. To have a better support and also having more clusters, a higher level of resolution and granularity are needed. In this stage it worth to mention that, lack of spatial and temporal accuracy and precision of input data can be compensated, up to some extent, by number of input trajectories. So for large enough datasets, less accurate trajectories can let us have appropriate results as if accurate data was available.

In order to consider spatio-temporal relationships in the data mining process, a spatio-temporal database, which can store all required aspect of spatio-temporal objects, should be firstly generated. This helps to use, modify and analyse different characteristics and relationships between/of trajectory data. Then using such a database and having large enough input datasets, it would be possible to find similarities, clusters, classes, anomalies, etc. and finally find rules and patterns contained within the movement trajectory clusters.

The final step of knowledge discovery from data is to verify that the patterns produced by the data mining algorithms occur in the wider dataset. Not all patterns found by the data mining algorithms are necessarily valid. It is common for the data mining algorithms to find patterns in the training set which are not present in the general dataset. This is called overfitting. To overcome this, the evaluation uses a test set of data on which the data mining algorithm was not trained. The learned patterns are applied to this test set and the resulting output is compared to the desired output. The accuracy of the patterns can then be measured from how many are correctly classified. A number of statistical methods may be used to evaluate the algorithm, such as ROC curves.

If the learned patterns do not meet the desired standards, then it is necessary to re-evaluate and change the pre-processing and data mining steps. If the learned patterns do meet the desired standards, then the final step is to interpret the learned patterns and turn them into knowledge.

However data mining techniques are very well-developed, in order to have a better pattern recognition process and consider all aspects of trajectory data, it is highly recommended to apply spatio-temporal data mining techniques not to have the problem described in the last subsection in the last subsection. In this case, spatio-temporal relationships are also considered in this process.

As discussed by Abraham and Roddick (1998), the forms that spatio-temporal rules may take are extensions of their static counterparts and at the same time are uniquely different from them. Five main types can be identified:

  • Spatio-temporal associations These are similar in concept to their static counterparts as described by Agrawal et al. (1993). Association rules are of the form X → Y (c%, s%) where the occurrence of X is accompanied by the occurrence of Y in c% of cases (while X and Y occur together in a transaction in s% of cases).

  • Spatio-temporal generalisation This is a process whereby concept hierarchies are used to aggregate data, thus allowing stronger rules to be located at the expense of specificity. Two types are discussed in the literature; spatial-data-dominant generalisation proceeds by first ascending spatial hierarchies and then generalising attributes data by region, while nonspatial-data-dominant generalisation proceeds by first ascending the spatial attribute hierarchies. For each of these different rules may result.

  • Spatio-temporal clustering While the complexity is far higher than its static, non-spatial counterpart the ideas behind spatio-temporal clustering are similar—that is, either characteristic features of objects in a spatio-temporal region or the spatio-temporal characteristics of a set of objects are sought (Ng and Han 1994; Ng 1996).

  • Evolution rules This form of rule has an explicit temporal and spatial context and describes the manner in which spatial entities change over time. Due to the exponential number of rules that can be generated, it requires the explicit adoption of sets of predicates that are usable and understandable. Example predicates include Follows, coincides, parallels and mutates (Allen 1983; Freksa 1992; Hornsby and Egenhofer 1998).

  • Meta-rules These are created when rule sets rather than datasets are inspected for trends and coincidental behaviour. They describe observations discovered amongst sets of rules. For example, the support for suggestion X is increasing. This form of rule is particularly useful for temporal and spatio-temporal knowledge discovery.

In order to extract patterns and rules of movements, all input trajectory data are randomly divided into two feature classes. The first one, which is called, training data, is used for pattern recognition and rule learning purposes. Another set of input data, which is called control data, is used to control how the learnt rules and recognised patterns fit into this set of data. After analysing and finding patterns in the training data, the inference system will apply the extracted patterns on the control data to see how similar input control data and estimated results are. If very similar, it is possible to infer that a pattern was discovered and any new data can be analysed using that pattern.

The clusters and classes which can be generated using above-mentioned techniques are used to generate rule sets. The final step of knowledge discovery from data is to verify that the patterns produced by the data mining algorithms occur in the wider dataset. Not all patterns found by the data mining algorithms are necessarily valid. It is common for the data mining algorithms to find patterns in the training set which are not present in the general dataset.

If the learned patterns do not meet the desired standards, then it is necessary to re-evaluate and change the pre-processing and data mining steps. If the learned patterns do meet the desired standards, then the final step is to interpret the learned patterns and turn them into knowledge. In the next section, implementation and the interpretation of such patterns are explained.

3 Implementation

The first step is capturing the trajectory. This can use many different positioning and tracking technologies and methods including Global Navigation Satellite Systems (GNSS), Wireless Local Area Network (WLAN), Radio Frequency Identification (RFID), cameras, mobile networks, Inertial Navigation Systems (INS), Bluetooth networks, tactile floors, and UltraWide Band (UWB) (Basiri et al. 2015b).

We implement this process using an inference engine, developed as an ArcGIS add-in. Tracking data, which has been captured over a period of two months using a mobile app installed on mobile devices, was analysed based on the algorithm. Figure 2 shows the high-level architecture of the system.

As it illustrated in the figure, data mining and machine learning functionalities are hosted in Microsoft Azure cloud computing platform. At the other hand, geospatial functionalities (such as visualization, generalization) are implemented as an ArcGIS Add-in. Communication between the cloud-hosted machine learning functionality and the ArcGIS is performed over the Web.

Fig. 2
figure 2

The high-level architecture of the system (Basiri et al. 2016c)

In order to show that there is a strong correlation between the reliability of the output from data mining and the bounding box (both spatially and temporally) in which the trajectory data are located, two sets of trajectory data for different cities (Hanover in Germany and Maynooth in Ireland) were captured. Trajectories were captured over 2 months (July 2013–August 2013) using a mobile app which can be downloaded from servers at the Institute of Cartography and Geoinformatics (IKG) at the Leibniz University of Hanover and the National University of Ireland, Maynooth (NUIM). The app, which has been used for data capture, can be downloaded from https://www.ikg.uni-hannover.de/index.php?id=635&L=1. Positional data was captured using GPS and mobile networks (usually when the user is moving outdoors) or it can be captured from QR codes affixed to most of the major turning points and important features (especially for indoor localisation) using a QR code reader app (Basiri et al. 2014; 2016a). It is also possible to capture the trajectory of movement from a network of ceiling mounted cameras (such as CCTVs), as it is shown in Figs. 3 and 4.

In camera installation phase, the final goal was maximizing the room coverage and also having a more overlapping area, which is covered by more than one camera. Having the overlapping areas is very important since all analysis is doing over anonymous data and users are identified using a random number (Object ID) assigned. So in order to follow the user roaming from one camera’s coverage area to another’s, it is very important to have an area, which is covered by two cameras to find the corresponding absolute position of a user in overlapping area, as it shown in Fig. 4.

Fig. 3
figure 3

One of the mounted cameras’ coverage areas

Fig. 4
figure 4

Minimum bounding rectangle of a user located in the overlapping area of two cameras

There are two pre-processing stages on the data: anonymity control and noise filtering/error exclusion.

Due to privacy and data protection issues, it is highly important to anonymise the data especially when the data is from CCTV cameras (Gidofalvi et al. 2007). Therefore tracking data is stored without any reference to a user’s identification. There are various anonymisers (Chow and Mokbel 2011). This paper uses the K-anonymity program (Kalnis et al. 2007), which is trusted third party software often used on tracking data (Chow and Mokbel 2011). The data are stored in centralized systems or on decentralized peer devices (Ghinita et al. 2007b). The anonymiser removes the ID of the user and cloaks the exact user location in the spatial database.

Anonymised trajectories can have some points that are not perfectly accurate or in some cases even valid. Such errors and noises should be filtered in advance to minimise the invalid results at the end of the data mining process. It is very important to remember this phase is being carried out to filter the errors and noises and it is not to exclude the abnormalities and anomalies, as they might be helpful for some applications and scenarios. The noises and errors are in the data due some reasons including poor and multipath positioning signals or tracking the reflection (for example on windows) rather than the actual location of the users by CCTV camera. In order to detect and exclude noises and errors there are some methods available, such as Kalman and Particle filtering and mean (or median) filters, described and reviewed by Lee and Krumm (2011).

This application, however, used a heuristic-based method, as the other filters replace the noise/error in the trajectory with an estimated value and this may have a significant impact on the output of trajectory mining (i.e. the recognised patterns and rules). It calculates the distance and the travel time between each consecutive couple of points in the trajectory and then the travel speed for each segment can be easily calculated. Then it is possible to find out the segments whose travel speeds are larger than a threshold (for example 360 km/h). If the travel mode for each trajectory is also identifiable (identified by the contributors or based on statistical methods which can find of the consecutive segments with the almost the same average speed classifiable into three classes of pedestrian, car/bus/train, and bicycle), then it is possible to have different thresholds depend on the travel mode (a pedestrian cannot walk faster than 20 km/h.).

This paper only uses the travel modes specified by the contributors and simply ignore the possibility of error/noise detection using the statistical methods. This is due to some transitional segments (e.g. from pedestrian to car and then again to pedestrian mode) or some anomalies (which are not due to errors or noises) that might be removed if their spatial characteristics and relationships with surrounding spatial features (i.e. map matching) are not considered. Such segments need to be carefully kept for the next steps of trajectory data mining process as they potentially can have valuable information.

An ArcGIS add-in has been developed, as shown in Fig. 5, to visualise, process and analyse the input trajectory data. As illustrated in Fig. 5, a trajectory analyser in a dockable window is available to ArcMap. The first tab creates a feature class by reading the input XML file of the recorded points (i.e. GNSS logs, scanned QR-Codes, mobile cell-ID locations). It can add two columns to the created feature class which calculates the distance between each adjacent point pair (the length of each segment) and the speed of movement of the user passing that segment. The travel speed is compared to the threshold (depending on the travel mode and if the travel mode is not specified by the contributors it is being set to 360 km/h, however this value is a user-defined value and can be easily changed). The travel speed is being stored as an attribute data to each segment as later on it is being used for rule association and also pattern recognition steps. This paper uses the ensemble methods, in addition to some rules of thumb, to identify anomalies and also predictive rules.

Fig. 5
figure 5

Trajectory analyzer dockable window (developed ArcGIS add-in)

A large number of points had labelled to produce a large training set. If a large number of classifiers with accuracy slightly better than a random guess combined together, then the accuracy of the ensemble is superior to most of the known classification algorithms. The idea of ensemble learning is to build a prediction model by combining the strengths of a collection of simpler base models (Hastie et al. 2013; Davidson-Pilon 2015).

Most of the ensemble models in off-the-shelf packages use the binary decision trees as the weak learners and produce an ensemble of hundreds or thousands of binary decision trees. However, this research uses a novel approach for combining other types of classification methods as well as binary decision trees. In MajorityVoteEnsembleClassifier it is possible to use the powerful classification algorithms like logistic regression and support vector machines and improve the accuracy of them by combining these models. Following piece of code, illustrate the code of the mentioned classifier. In summary, this algorithm can be used as anomaly detection algorithm when the training data are composed of points with two labels (valid and invalid-anomaly). Also, it is possible to provide weights for input training dataset in order to learn from the mistakes of previous classifiers in different iterations.

The python code (developed by authors) implements some machine learning algorithms (such as MajorityVoteEnsembleClassifier, is explained later) by extending the scikit-learn package. The scikit-learn package is the most widely used machine learning package in python programming language. As illustrated by code snippet by inheriting from BaseEstimator class in scikit-learn package, the implemented algorithms inherit all methods similar to the other models in scikit-learn.

The python code then hosted in AzureML service of Microsoft Azure cloud computing platform. Microsoft’s Azure Machine Learning (AzureML) (Tejada 2016; Barnes 2015) dramatically simplifies machine learning model deployment by enabling data scientists to deploy their final models as web services that can be invoked from any application on any platform, including desktop, smartphone, mobile and wearable devices. Hosting the python code of this research in AzureML allows using the same model from various platforms without worrying about scalability issues. The data for training the models are stored in SQL Azure database as trajectory database.

C# programming language was used for implementing the ArcGIS Add-in. ArcGIS Desktop Add-Ins is the preferred way to customize and extend ArcGIS for Desktop applications. With Add-Ins all the functionalities of ArcGIS for Desktop applications are available through ArcObjects, which is a set of components that constitute the ArcGIS platform (Amirian 2013). The Trajectory Add-in implemented as a Dockable window in ArcMap. The first tab creates a feature class in a local file geodatabase by reading the input XML file of the recorded points.

figure a

A majority vote ensemble classifier based on the training data (points with x, y) this model can find the points with valid or invalid positions. So, it can be used as anomaly detection algorithm (binary) as well as classification algorithm (Multi-class) for detecting the type of point. The classifier uses set of weak and strong learners to find an answer with lower variance than the single classifiers. If boosting is used, the bias is reduced as well. Parameters include:

  • classifiers: array-like, different instances of classifier objects, shape = [n_classifiers]

  • Different classifiers for the ensemble like SVM, Penalized Linear Models (Lasso, Ridge), Logistic Regression and so on.

  • decisionVote: str, {‘classlabel’, ‘probability’} (default = ‘classlabel’)

If ‘classlabel’ the prediction is based on the argmax of class labels. Else if ‘probability’, the argmax of the sum of probabilities is used to predict the class label (recommended for calibrated classifiers). If decisionVote is ‘classlabel’ and y contains just two values, this ensemble method detects the anomalies.

  • weights: array-like, shape = [n_classifiers], optional (default = None)

If a list of ‘int’ or ‘float’ values are provided, the classifiers are weighted by importance; Uses uniform weights if ‘weights = None’. weights can be used when for learning from mistakes (to set more weights to data points which are classified incorrectly to make the classifier focus on correction the classifier in less iterations.)

figure b

The parameters of the Fit classifier are

  • X : {array-like coordinates of training data},

  • shape = [n_samples,2]

  • Matrix of training samples [x, y].

  • y : array-like, shape = [n_samples]

  • Vector of target class labels.

for anomaly detection y has two unique values. The anomaly label is −1 and for ordinary points the label can be one positive value. For classification, y can be any positive value (unique value of y is equal to number of target classes.)

figure c

If decisionVote equals to ‘classlabel’ and there are just 2 values in y, then the ensemble works like anomaly detector. The parameters include

  • X : array-like coordinates of training data}

  • shape = [n_samples,2]

  • Matrix of training samples [x, y].

This returns maj_vote : array-like, shape = [n_samples], Predicted class labels. And “−1” means an anomaly. All the other values can be interpreted as class labels.

figure d

This part, Predict class probabilities for X (test data and unseen data), can have following parameters:

  • X : array-like coordinates of training data}

  • shape = [n_samples,2]

  • Matrix of training samples [x, y].

  • Training data, where n_samples is the number of data points and the number of features is 2.

This part returns:

  • avg_proba : array-like,

  • shape = [n_samples, n_classes]

  • Weighted average probability for each class per sample.

  • Average probability of a point being classified as anomaly.

In addition to the classification of segments, this function also helps to exclude redundant data. For example, it is possible to find points where the user has been stationary (speed close to zero) and replace them with a single point with a description of the time interval during which the user’s speed was zero.

By calculating the correlation between trajectory data, it is possible to discover other modes of classification. Spatio-temporal clustering helps to identify such classes and to discover underlying rules and patterns through identifying parameters which are highly correlated and extracting rules. Evolution rules are applied at this stage through functions on the selection tab to discover spatial and temporal rules. Figure 5 shows clusters and classes of vertices of trajectories depending on the area of search, buffer threshold, which limits numbers of trajectories with the same types (modes) within an area. There is a buffering since there are always inaccuracies due to positioning technologies. In addition to the buffer area, there is buffering for temporal aspect of input data, which limit time interval when trajectories can be matched. Because of the large number of input features, spatial and temporal thresholds are used to cluster and identify associations. Such thresholds depend on the density, frequency and the nature of the input data. They can be changed using this tab according to experts’ comments and recommendations.

Having the data anonymised and error/noise excluded, then trajectories are ready to go through the next stage and be pre-processed.

This paper uses this approach in order to identify the speed clustering algorithms to identify the stay points. This can be viewed as the combination of segments’ speed threshold and the density-clustering algorithms.

Abnormalities and anomalies are detected (using statistical analysis), then some clusters are identified (using association rules between average speed between two points, number of stops, duration of stops, spatio-temporal topological relationships between trajectories and available features on the maps). Then rules and patterns can be recognised using relevant parameters and criteria, including speed, spatial and temporal correlations between segments and trajectories and also some trajectory matching algorithms. Some of the examples of such rules and patterns are listed below:

If the average speed of movement is more than 50 km/h and the trajectory is matched by the street network, then the travel mode is car; if instead the trajectory is matched with bus lines and there are stops (the speed for a short period of time, becomes zero) at the bus stops then the travel mode may be bus. Such rules can be used in the phase of “recognising user’s current situation”.

The inference can be more complicated than simple if-then rules. It is possible to find the pattern over input dataset using data mining techniques. By analysing input trajectories captured from different types of users, it is possible to automatically find some similarities and patterns.

In order to extract patterns and rules of movements, all input trajectory data are randomly divided into two feature classes. The first one, which is called, training data, is used for pattern recognition and rule learning purposes. Another set of input data, which is called control data, is used to control how the learnt rules and recognised patterns fit into this set of data. After analysing and finding patterns in the training data, the inference system will apply the extracted patterns on the control data to see how similar input control data and estimated results are. If very similar, it is possible to infer that a pattern was discovered and any new data can be analysed using that pattern.

In this paper since prior knowledge about the input data was included (such as a reference map or additional spatial data), it was decided to evaluate the correctness and logic of output patterns and rules using standards, “common sense” rules-of-thumb and expert comments as well as control data test. However there is a need to compare the results of this approach with other approaches results to evaluate them. One of the early inferred rules and patterns is about identifying the travel mode using speed and behaviour of movement. Based on the speed of movement it is possible to classify data into the four categories of pedestrian, bicycle, wheelchair and vehicle. Using patterns of movement, it is possible to find some rules which distinguish between public transportation and cars—public transportation stops regularly at very specific points with very low correlation to time, that is whenever a vehicle arrives at the station, it usually stops. Such rules and patterns should be confirmed by control data. However, if no reference data is available, expert comments, logical rules and standard specifications are also part of this process.

One of the biggest challenges in this project is redundant data. It is possible to have more than one trajectory for each user at the same time interval, with different shapes and sizes, since users can be viewed by different cameras synchronously. If two or more cameras detect a person at the same time in their overlapping area, then we will have two or more trajectories stored for that person, as it is shown in Fig. 6. The simplest policy is to consider all trajectories ignoring this fact that some of them may not show different users’ movements since they are redundant data for the same user moving in a same period of time. This policy may lead to having a higher weight for trajectories located in overlapping areas. In order to handle this, we need to find only one trajectory, which shows the user’s movement as detail as possible. This becomes more complex where there is no link between user and trajectory. Because of privacy issues, there is no link between users and their trajectory. That means, it is not single user ID assigned to user and based on which the corresponding trajectories (got from different cameras) can be identified. In this regard trajectories belong to a single movement made by single user should be identified, and then they should be transformed into one polyline to show user’s movement over a period of time.

In order to find the trajectories to be matched and aggregated into one trajectory, we need to find correlated vertices, which represent one object captured by different cameras, and replace them with one vertex. In order to do this, the simplest policy can be finding vertices, which are spatially and temporally near to each other. Because the vertices, which are recorder spatially and temporally near to each other, are more likely to represent one user and the small differences between them can be because of camera synchronisation drift, calibration and instrumental errors. However this approach may be the simplest approach to find trajectories representing a single user’s movement, it has got some issues. First of all, and the most important one, is mismatching issue. It is possible to find many trajectories that satisfy the conditions, i.e. captured within quite the same temporal interval and also spatially near, while some of these trajectories belong to another user and they represent another user’s movements. In addition, finding the best time interval and also spatial buffering radios for matching process in which trajectories are analysed is quite tricky. Figure 4 shows selection tab in the developed ArcGIS-Add-in which allows selecting vertices captured spatially and temporally close to each other. Then a column will be added to the attribute table to show which segments should be matched. Spatial and temporal thresholds should be given or previously set, for example, the vertices whose distance from the surrounding vertices are less than 2 cm and also have been captured with a 100 ms time interval are selected. Finding the best temporal and spatial threshold depends on the applications, movements’ characteristics, experts’ comments, equipments’ configurations and settings, etc.

This might be the simplest approach, however because of mismatching and also being so experience and application dependent, makes it unreliable. In this regard, this paper proposes using data mining techniques to find matching segment. This means, introducing fixed grid windows in which segments whose patterns are quite same considered to be matched. In contrast with the previous approach which only considers spatial and temporal proximity to find matching segments, this approach considers the patterns of movements. Figure 6 shows how two trajectories are found as matching trajectories using this approach. The green trajectory is captured by camera1 and the red trajectory is captured by camera2. As cameras locations, confutations and settings are different they may capture different trajectories with different numbers of vertices, as it shown in Fig. 6. However the proposed approach considers pattern of movement within a temporal and spatial window. For example the green and the red trajectories in Fig. 6 can be found as matching trajectories since the general trends and the patterns of movement in the predefined window, including four time intervals and a spatial bounding box, is quite same. Those very small noises (which are differences of the trajectories from the general trend) are ignorable according to the predefined settings.

Both approaches, finding near vertices based on a spatial and temporal threshold and also the pattern-based approach, have been implemented in the developed ArcGIS add-in to make choices and also make easier to understand what would be results of both approaches on the same dataset.

Now the data has been pre-analysed and it is possible to define some rules based on which Points of Interests (PoIs) can be extracted. This may be very helpful in tourist guidance or any indoor navigation services. Since cameras (CCTVs) are usually available indoors, especially in galleries and museums. In order to find PoIs, it is possible to select vertices where many users stay for a period of time (probably to see an interesting feature, such as sculpture, painting, etc.). In order to find such points, easily one can use selection tab and select non-moving users. If the number of users who stays in this point (or very close to this point) was more than a threshold, then we can export that point (or area) as a new feature class, called point of interests. Again spatial and temporal threshold and also the number of users are selected based on the application and experts comments. These PoIs can be used to make navigational suggestions for newcomers.

Fig. 6
figure 6

Finding matched trajectories tab, simplification tab

4 Conclusion

Future mobile services will be more anticipatory of users’ needs. Users look for a mobile service/application which can monitor or understand their current situations, predict their needs and demands, and then provide the most appropriate set of services in the most comfortable way. This paper describes a method based on data-mining to detect and extract existing patterns over pedestrian trajectories. The trajectories of anonymised users tracked in surveillance cameras are considered as sequences of time-discrete observations of those users’ positions. These trajectories are processed to find the patterns within pedestrians’ movements. For example, it is possible to find attractive/important points or features in a gallery, based on the most common users’ trajectories; i.e. where there is no movement over a period of time for many tourists. These points can be considered as the point of interests to be applied in giving navigational instructions or suggestions to other users; i.e. it is possible to recommend a new user to go to that point, since many people have visited that location. In order to extract such patterns efficiently from spatio-temporal data of pedestrians, including time and position of users, an ArcGIS add-in has been developed to implement the method. This represents, stores, analyses and extracts patterns of pedestrians’ trajectories based on the spatio-temporal data mining methods described in the paper.