LSH-based missing value prediction for abnormal traffic sensors with privacy protection in edge computing

Traffic flow prediction is an important part of intelligent transportation systems (ITS). However, sensor failures or transmission distortion often occur during data acquisition, which inevitably causes the loss or abnormality of traffic flow data transmitted to the edge server. In this situation, it is necessary to share traffic flow data among different platforms. However, existing traffic flow prediction methods face two challenges in the process of traffic flow data sharing. First, user privacy is often leaked when traffic data are shared across platforms. Moreover, with the continuous updating of data, the efficiency and scalability of data sharing between different platforms steadily decrease. In view of these challenges, we propose a novel prediction method for the missing traffic flow data caused by abnormal sensors, named $ASMVP_{distr-LSH}$, based on the distributed locality-sensitive hashing (LSH) technique. Finally, a case study is presented to illustrate the feasibility and effectiveness of $ASMVP_{distr-LSH}$.


Introduction
Traffic flow prediction remains an indispensable part of intelligent transportation systems (ITS) [1,2] in this rapidly changing modern society. Accurate and reliable traffic prediction helps alleviate severe traffic congestion, which is of great significance to traffic management and social security [3][4][5]. As a crucial part of 5G, edge computing can optimize data processing performance and reduce the delay of traffic flow prediction [6,7]. Generally speaking, traffic prediction aims to predict the future traffic situation of a road network based on historical traffic data [8][9][10]; historical traffic data are therefore a key ingredient of successful prediction.
Considering that the data for traffic flow prediction come from different sensors (e.g., laser sensors and infrared sensors) [11][12][13], it is necessary for one platform (e.g., the infrared sensor platform) to cooperate with others (e.g., the laser sensor platform) and exploit the integrated traffic data to improve the prediction accuracy of missing sensor data [14]. Therefore, it is necessary to find a way to integrate traffic data [15][16][17]. However, such integration is usually not feasible in actual cooperation between manufacturers. One of the most fundamental reasons is that manufacturers rarely share their original traffic data with others owing to conflicts of interest and users' sensitive information [18][19][20]. Another reason is that the amount of raw data tends to grow over time, which lowers the efficiency of traffic data sharing, processing, mining and analysis [21][22][23].
Considering the above challenges, we propose a novel traffic prediction method named $ASMVP_{distr-LSH}$, which is based on the principle of distributed locality-sensitive hashing (LSH) [24][25][26] to protect privacy and fill in the missing traffic data [27]. LSH has the favorable property of retaining similarity, i.e., two adjacent points are likely to be assigned the same index [28][29][30][31]. Overall, our main academic contribution is twofold, as specified below.
(1) We formulate the problem of traffic flow prediction from multi-source data across different sensors and propose a distributed LSH method for traffic flow prediction that protects user privacy, i.e., $ASMVP_{distr-LSH}$. This method converts the traffic flow information into index information and then uses the index information for prediction, thereby achieving privacy protection.
(2) We specifically consider data integrity before traffic flow prediction. We use the similarity-search principle of the LSH method to fill in the missing data of a sensor, which helps ensure the prediction accuracy of missing traffic sensor data.
The rest of this paper is organized as follows. In "Related work", we summarize the related work in the current traffic data prediction domain. In "Problem description and formulation", we introduce the motivation of the study through a real-world traffic scenario and formalize the problem of missing value prediction in traffic. In "A distributed LSH-based missing value prediction approach: $ASMVP_{distr-LSH}$", the proposed method (i.e., $ASMVP_{distr-LSH}$) is described in detail to achieve privacy-preserving and time-efficient traffic data prediction. In "Case study", a case study is used to introduce the concrete steps of $ASMVP_{distr-LSH}$. In "Evaluation and further discussion", we evaluate the effectiveness of our method and analyze its shortcomings. Finally, we summarize the conclusions of this paper and point out future work in "Conclusions and future work".

Related work
Next, we briefly review the research progress of traffic flow prediction from the following two aspects, i.e., missing data prediction and user privacy protection.

Missing traffic flow data prediction
The data collected by different types of sensors on the road are the key ingredient of traffic flow prediction models. However, the data collected by these sensors are occasionally missing for a variety of reasons, e.g., hardware failure or transmission error. Tian et al. [32] address the problem of long-term missing data from two different perspectives and propose two machine learning methods to update missing data without gap length limitation. Laña et al. [33] utilize the periodicity of traffic flow data to infer missing values and put forward a method based on the Long Short-Term Memory (LSTM) [34] model. Li et al. [35] propose a Multi-View Learning Method (MVLM) to estimate the missing values of traffic flow in the database.
Although the influence of missing values on prediction performance is often ignored in deep neural network methods, Boquet et al. [36] put forward an online unsupervised data imputation method to tackle this issue. Wang et al. [10] use the fact that the traffic flow in the network follows spatial-temporal patterns to restrain the influence of missing data. Zhang et al. [37] propose a method called FNNTEL, which is a missing data estimation method of tensor heterogeneous ensemble learning based on Fuzzy Neural Network (FNN).
At present, research on missing traffic flow data mostly focuses on models based on temporal information, spatial information, spatiotemporal information or tensor decomposition. However, these models often lack the ability to protect users' private information contained in traffic flow data.

Privacy protection in traffic flow data
In ITS, traffic flow data are the key factor and the foundation of many prediction models and analysis methods. However, using these data inevitably risks disclosing the user privacy information they contain. In order to protect the release of real-time vehicle trajectory data, Ma et al. [38] propose a method named RPTR, which is an effective privacy protection mechanism based on differential privacy. In the RPTR mechanism, an ensemble Kalman filter based on the user location transition probability matrix is used to ensure data availability.
To solve the problem of privacy leakage caused by data sharing between different private operators and public institutions, He et al. [39] propose a privacy control algorithm based on k-anonymity diffusion to realize reliable data sharing. Facing the challenge of protecting the privacy of a single vehicle when collecting point-to-point data for geographical location, Zhou et al. [40] propose a privacy-preserving traffic flow measurement method by using bit array to collect data that need to be protected and maximum likelihood estimation to obtain measurement results. In order to protect the location privacy data of users in GPS, Yang et al. [41] propose a virtual travel route system with stealth technology, which promotes the design of distributed architecture considerably.
Wu et al. [42] first used a homomorphic encryption function in compressed sensing data collection and proposed an efficient data collection method with privacy protection to prevent traffic analysis and tracking in wireless sensor networks. Liu et al. [43] propose a gated recurrent unit neural network algorithm based on federated learning (FL), a privacy-preserving machine learning technology.
Qi et al. [44] put forward an LSH-based service recommendation approach $SerRec_{distr-LSH}$ to secure the sensitive information of users hidden in historical user ratings. However, the authors do not take the time context factor into consideration and neglect the dynamic influence of time on the unstable service quality. Therefore, the accuracy of the proposed $SerRec_{distr-LSH}$ approach is reduced accordingly.
Meng et al. [45] propose an "optimal publishing" strategy that reveals only the optimal service quality records instead of publishing all the sensitive service quality data observed by users. In this way, most private information contained in service quality data can be protected well. However, such partial publishing significantly decreases the availability of historical service quality data, since there is a tradeoff between data availability and data privacy.
Moreover, the above algorithms have not been applied to traffic flow prediction scenarios where the traffic flow data generated by multiple sensors are distributed across multiple platforms. Motivated by this fact, we propose a novel traffic flow prediction method based on distributed LSH, as elaborated in the following sections.

Problem description and formulation
In this section, we first describe the LSH-based traffic flow prediction problem with privacy preservation. Then, we formalize the problem for ease of understanding. The symbols used in this paper can be found in Table 1.

Problem description
We use Fig. 1 to help describe the traffic flow prediction problem with missing data. Concretely, suppose there are three kinds of sensors that record the traffic situation of a certain road. In the figure, the three different sensors are represented by three different base stations. To predict the missing data in one of the sensors, it is necessary to use the LSH technique to find the dates that are similar to the date with missing data. Finally, the missing values in the traffic flow data are predicted from these similar dates.
However, there are two challenges in the process of traffic flow prediction. (1) As the above traffic data contain user privacy information and involve the interests of various companies, these sensor companies do not want to share their collected data with other companies. (2) With the increase of sensor types and traffic flow, the amount of traffic flow data keeps growing. As a result, the efficiency and scalability of data sharing among companies are significantly reduced, and it is difficult to meet the requirements of real-time traffic prediction.
To address these challenges, we propose a new distributed LSH-based approach named $ASMVP_{distr-LSH}$, which has the characteristics of privacy protection and scalability. Details are given in the next section.

Problem formulation
In this paper, we focus on the problem of missing traffic flow prediction from multi-source data. For the convenience of understanding and the following discussion, we formulate the traffic data prediction problem as a five-tuple $LSH\_MVP(TFS, TP, Day, day_{target}, TD)$, where
(1) $TFS$: the set of traffic sensors. $TFS_d$ denotes the $d$-th traffic sensor, which supplies the $d$-th part of the traffic flow prediction data collected every day.
(2) $TP$: the time periods of a day for which each sensor records traffic flow.
(3) $Day$: the set of days. $day_i$ represents the $i$-th day of the month. Here, the traffic flow prediction data collected every day come from many sensors in the set $TFS$, so they are multi-source.
(4) $day_{target}$: a target day for which we need to predict the missing data of a traffic sensor. Here, $day_{target} \in Day$ holds.
(5) $TD$: the length of a time period, e.g., if we adopt 15 min as a time period, then each day contains 96 time periods.

A distributed LSH-based missing value prediction approach: $ASMVP_{distr-LSH}$
Our traffic flow prediction method can not only protect privacy and fill in the missing data of abnormal sensors, but also make distributed predictions across a variety of sensor platforms [46,47]. In "LSH: locality-sensitive hashing", we briefly introduce the locality-sensitive hashing technique. In "$ASMVP_{distr-LSH}$: traffic flow missing value prediction based on distributed LSH", we describe the proposed method in detail.

LSH: locality-sensitive hashing
The LSH technique was proposed and improved by Aristides Gionis et al. to achieve high-speed information retrieval. Specifically, the algorithm allows a hash bucket to store more than one point. In other words, (1) two data points that are adjacent in the original space are likely to remain neighbors after hashing; and (2) two data points that are not adjacent in the original space are unlikely to be neighbors after hashing. This is the main idea of the LSH algorithm. A hash function satisfying these two conditions is called an LSH function. LSH has been shown to handle distributed applications such as distributed information retrieval effectively, e.g., the multi-source cloud service recommendation method based on distributed LSH in [20].
Specifically, let $f(\cdot)$ be an LSH function and $F(\cdot)$ a family of LSH functions. Assume that $x_1$ and $x_2$ are two variables in the original data space, $r(x_1, x_2)$ represents the distance between the two variables, $f(x)$ represents the index or hash value of variable $x$, $P(Y)$ represents the probability that condition $Y$ holds, and $(r_x, r_y, p_x, p_y)$ is a set of thresholds. If both conditions (1) and (2) hold, then $f(\cdot)$ is called $(r_x, r_y, p_x, p_y)$-sensitive.
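In the symbols defined above, the two conditions can be written in the form that is standard in the LSH literature. The typesetting below is an assumption made for illustration, with the usual side conditions $r_x < r_y$ and $p_x > p_y$:

```latex
\begin{align}
r(x_1, x_2) \le r_x \;&\Rightarrow\; P\bigl(f(x_1) = f(x_2)\bigr) \ge p_x, \tag{1}\\
r(x_1, x_2) \ge r_y \;&\Rightarrow\; P\bigl(f(x_1) = f(x_2)\bigr) \le p_y. \tag{2}
\end{align}
```

Read this way, condition (1) bounds from below the collision probability of close points, and condition (2) bounds from above the collision probability of distant points.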
We use an example to illustrate the general procedure of LSH-based similar-day search. First, assume that the original data space contains $m$ data points $(data_1, data_2, \ldots, data_m)$; they can be mapped into $n$ containers (buckets) $b_1, \ldots, b_n$, each holding data points with similar neighbor characteristics.
As described above, if a target date (i.e., $X$) is to find its similar dates among $(data_1, data_2, \ldots, data_m)$, we compute the corresponding hash value $f(X)$ through the hash function $f(\cdot)$ and then find the corresponding container (assume $b_k$, $1 \le k \le n$, here). According to the main idea of LSH, the $m_i$ data points contained in bucket $b_k$ are most likely the similar days of the target day $X$. In this case, once $m_i \ll m$ holds, the size of the search shrinks from $m$ to $m_i$, and the search efficiency improves significantly.
It can be seen from the above examples that the method based on LSH search has three advantages. First, this method uses the hash value or index generated by hash function f(*) to find the similar data points of the target data. In this situation, LSH protects the privacy information in the data. Second, distributed data points (data 1 , data 2 ,..., data m ) can be centralized into a hash table through LSH, and then unified calculation can be carried out. Third, LSH can establish hash table offline and reduce search space, which can not only improve search efficiency, but also increase search scalability. Therefore, the extended LSH method is applied to the missing value prediction of traffic flow to achieve privacy protection and scalable traffic flow prediction in distributed multi-sensor environment.
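The bucket-based search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a random-projection hash (one bit per random vector, set to 1 when the dot product is positive), and the names `make_hash` and `build_buckets` as well as the toy data are hypothetical.

```python
import random
from collections import defaultdict

def make_hash(dim, n_bits, seed):
    """Return a hash function built from n_bits random vectors in [-1, 1]."""
    rng = random.Random(seed)
    planes = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def f(x):
        # Assumed rule: one bit per random vector, 1 if the dot product is positive.
        return tuple(int(sum(a * b for a, b in zip(x, p)) > 0) for p in planes)

    return f

def build_buckets(points, f):
    """Group data point names by their hash value (one bucket per hash value)."""
    buckets = defaultdict(list)
    for name, x in points.items():
        buckets[f(x)].append(name)
    return buckets

points = {
    "data_1": [1.0, 2.0, 3.0],
    "data_2": [2.0, 4.0, 6.0],     # same direction as data_1
    "data_3": [-1.0, -2.0, -3.0],  # opposite direction
}
f = make_hash(dim=3, n_bits=4, seed=0)
buckets = build_buckets(points, f)

# Only the target's bucket is searched, and only hash values are compared.
candidates = buckets[f([4.0, 8.0, 12.0])]
```

Here the target is compared against the contents of a single bucket rather than all $m$ points, and the lookup itself touches hash values only, which is the property the privacy argument relies on.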

$ASMVP_{distr-LSH}$: traffic flow missing value prediction based on distributed LSH
In this section, we introduce our $ASMVP_{distr-LSH}$ method in detail. Generally speaking, our method consists of four steps, as follows.
Step 1 (Establish date sub-indices offline): For each sensor $TFS_k$ ($1 \le k \le SN$), we choose a family of LSH functions $F_k(\cdot)$ to create a sub-index for each $day \in Day$ offline, based on the known traffic flow data collected by the sensor. Because the Pearson Correlation Coefficient (PCC) is often used to reflect the degree of linear correlation between two variables $X$ and $Y$, in this paper we use the LSH function family corresponding to PCC to build the index. In addition, the selection of the LSH function family $F_k(\cdot)$ also needs to satisfy conditions (1) and (2) described in the previous section.
First, according to Eq. (3), the traffic flow vector $\vec{x}$ of a sensor in one day is hashed through a dot product with a vector $\vec{p}$. Here, if two vectors $\vec{x}_1$ and $\vec{x}_2$ receive the same hash value, then $\vec{x}_1$ and $\vec{x}_2$ are likely to be considered similar. Second, since the elements in vector $\vec{p}$ are randomly generated from the interval $[-1, 1]$, the above hashing and mapping process can be repeated $SF_k$ times using different vectors $\vec{p}$. Then, the sub-index of the sensor in one day, i.e., $F_k(day) = (f_k^1(day), \ldots, f_k^{SF_k}(day))$, can be obtained, in which $f_k^j(day)$ ($1 \le j \le SF_k$) is calculated by Eq. (3). In particular, the sub-index $F_k(day)$ here is a 0-1 vector with $SF_k$ dimensions.
In addition, we can use the following pseudo code to represent the above process (see Algorithm 1).
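Step 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the paper's Algorithm 1: the bit rule (1 when the dot product with $\vec{p}$ is positive, otherwise 0) is assumed, and the function names and sample flows are hypothetical.

```python
import random

def make_function_family(dim, sf_k, seed):
    """Draw sf_k random vectors p with entries in [-1, 1], one per LSH function."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(sf_k)]

def sub_index(flow, family):
    """Return the 0-1 sub-index F_k(day) for one sensor's daily flow vector."""
    # Assumption: a bit is 1 when the dot product with p is positive, else 0.
    return tuple(int(sum(x * p_i for x, p_i in zip(flow, p)) > 0) for p in family)

def build_sub_indices(sensor_data, family):
    """sensor_data maps day -> flow vector for a single sensor TFS_k."""
    return {day: sub_index(flow, family) for day, flow in sensor_data.items()}

# Hypothetical flows of one sensor over three days and four time periods.
sensor_k = {
    1: [3.0, 5.0, 2.0, 4.0],
    2: [6.0, 10.0, 4.0, 8.0],  # day 1 scaled by 2, i.e. perfectly correlated
    3: [9.0, 1.0, 7.0, 2.0],
}
family_k = make_function_family(dim=4, sf_k=4, seed=1)
sub_indices_k = build_sub_indices(sensor_k, family_k)
```

Because only the sign of each dot product is kept, days whose flow vectors point in the same direction receive the same sub-index, and the raw flow values never leave the sensor platform.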
Step 2 (Establish date index by amalgamating sub-indices offline): In the previous step, based on the traffic data of different sensors, we obtained the $SN$ sub-indices $F_1(day), \ldots, F_{SN}(day)$. In this step, we amalgamate the $SN$ sub-indices offline into an integrated date index $F(day) = (F_1(day), \ldots, F_{SN}(day))$ with dimension $\sum_{i=1}^{SN} R_i$, where $R_i$ is the dimension of the $i$-th sub-index. Subsequently, for each $day \in Day$, we repeat the above process until the mapping relationships "$day \rightarrow F(day)$" are established. Next, we record the mapping relationships "$day \rightarrow F(day)$" in a pre-defined hash table $FTab$.
In addition, we can use the following pseudo code to represent the above process (see Algorithm 2).

Algorithm 2 Establish date index by amalgamating sub-indices offline

Step 3 (Find similar days of $day_{target}$ online): Following the selection of the hash function families $F_m(\cdot)$ ($1 \le m \le SN$) to obtain sub-indices in Step 1 and the amalgamation of sub-indices in Step 2, we can compute the index $F(day_{target})$ of $day_{target}$ online. Next, we look up the bucket with the value $F(day_{target})$ in the $FTab$ produced in Step 2. If a valid bucket is found, each date contained in the bucket is regarded as a similar day of $day_{target}$ and put into a dataset named DS-Set. If no qualified bucket is found, we cannot simply conclude that $day_{target}$ has no similar days, because LSH is probabilistic: it cannot guarantee that all similar days are found every time, i.e., some qualified results may be missed. Therefore, we create $T$ hash tables $FTab_1, \ldots, FTab_T$ by repeating Step 1 and Step 2 to relax the condition for similar-day search. Next, if the condition in Eq. (4) holds, we regard $day_{target}$ as having similar days, and the dates in the bucket whose index values equal $F(day_{target})_x$ are the similar days of $day_{target}$. Moreover, we put these similar days into a new dataset named DS-Set.
$\exists\, day \in Day$ and $x \in \{1, \ldots, T\}$ such that $F(day)_x = F(day_{target})_x$ (4)
In addition, we can use the following pseudo code to represent the above process (see Algorithm 3).

Algorithm 3 Find similar days of $day_{target}$ online
Require: $T$, $day_{target}$
Ensure: DS-Set
DS-Set = φ
for y = 1 to T do
    repeat Steps 1 and 2 to build $FTab_y$
    find container C corresponding to $F(day_{target})_y$
    if C ≠ φ then
        put dates included in C in DS-Set
    end if
end for
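Steps 2 and 3 can be sketched together as follows. This is a hedged illustration, not the authors' code: it assumes the same sign-of-dot-product bit rule as the sketch for Step 1, draws fresh random vectors for each of the $T$ tables, and uses hypothetical function names and flow values.

```python
import random
from collections import defaultdict

def random_family(dim, n_funcs, rng):
    """One hash function family: n_funcs random vectors with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_funcs)]

def sub_index(flow, family):
    # Assumed hash rule: a bit is 1 when the dot product is positive, else 0.
    return tuple(int(sum(x * p for x, p in zip(flow, vec)) > 0) for vec in family)

def day_index(day, sensors, families):
    """Step 2: concatenate the sub-indices of all sensors into F(day)."""
    index = ()
    for data, family in zip(sensors, families):
        index += sub_index(data[day], family)
    return index

def build_table(days, sensors, families):
    """Hash table FTab: F(day) -> list of days sharing that index."""
    table = defaultdict(list)
    for day in days:
        table[day_index(day, sensors, families)].append(day)
    return table

def find_similar_days(target, days, sensors, dim, n_funcs, n_tables, seed=0):
    """Step 3: union of the target's buckets over n_tables independent tables."""
    rng = random.Random(seed)
    ds_set = set()
    for _ in range(n_tables):
        families = [random_family(dim, n_funcs, rng) for _ in sensors]
        table = build_table(days, sensors, families)
        bucket = table[day_index(target, sensors, families)]
        ds_set.update(d for d in bucket if d != target)
    return ds_set

# Hypothetical data: two sensors, three days, three time periods per day.
sensors = [
    {1: [12.0, 30.0, 18.0], 2: [24.0, 60.0, 36.0], 3: [40.0, 8.0, 25.0]},
    {1: [10.0, 28.0, 16.0], 2: [20.0, 56.0, 32.0], 3: [35.0, 9.0, 22.0]},
]
similar = find_similar_days(1, [1, 2, 3], sensors, dim=3, n_funcs=4, n_tables=3)
```

A day is accepted as soon as it collides with the target in at least one of the $T$ tables, which mirrors the relaxed condition of Eq. (4): more tables raise the chance of finding similar days at the cost of more offline work.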

Step 4 (Top-K missing value prediction):
In the previous step, we obtained a similar date set (i.e., DS-Set) for $day_{target}$. Next, we use DS-Set to predict the missing values in $day_{target}$ (here, we can set a threshold for |DS-Set|). Specifically, we use Eq. (5) to predict the missing values of abnormal sensors in traffic over the time period $TD$.
Here, $TP.F_j$ represents the traffic flow of the corresponding time period in the sensor $TFS$ on a day $day_j$ that is similar to $day_{target}$ (i.e., a day with abnormal sensor values). Finally, we rank all the time periods of the sensor according to the prediction results of Eq. (5) and take the traffic flow data corresponding to the first $k$ time periods as the final prediction results.
In addition, we can use the following pseudo code to represent the above process (see Algorithm 4).
After the above four steps of the $ASMVP_{distr-LSH}$ approach, we can predict the missing values of abnormal sensors while preserving privacy.

Algorithm 4 Top-K missing value prediction
Require: $TFS$, $TP$
Ensure: Fore-Set to $day_{target}$
Fore-Set = φ
for each time period TP in TFS do
    if $TP.F_{target}$ = 0 then
        number = 0
        for $day_i$ ∈ DS-Set do
            if $TP.F_i$ ≠ 0 then
                number++
                $TP.F_{target}$ = $TP.F_{target}$ + $TP.F_i$
            end if
        end for
        $TP.F_{target}$ = $TP.F_{target}$ / number
    end if
end for
put Top-K traffic prediction results with $TP.F_{target}$ in Fore-Set
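The averaging loop of Algorithm 4 can be sketched in Python as follows. The sketch assumes, as the pseudo code does, that a missing value is stored as 0; the function name `predict_missing` and the sample flows are hypothetical, the final Top-K ranking is omitted, and the guard against an empty set of usable values is an addition not present in the pseudo code.

```python
def predict_missing(target_flow, similar_flows):
    """Fill zero (missing) entries of target_flow with the mean of the
    non-missing values of the same time period on the similar days."""
    predicted = list(target_flow)
    for t, value in enumerate(target_flow):
        if value != 0:
            continue  # the value was observed, so nothing is predicted
        known = [flow[t] for flow in similar_flows if flow[t] != 0]
        if known:  # added guard: skip when no similar day has data for period t
            predicted[t] = sum(known) / len(known)
    return predicted

# Hypothetical flows: the third period of the target day is missing.
target = [3.0, 5.0, 0.0, 4.0]
similar = [[3.0, 6.0, 2.0, 4.0], [2.0, 5.0, 4.0, 0.0]]
filled = predict_missing(target, similar)  # third entry becomes (2.0 + 4.0) / 2
```

Only the missing periods are changed; observed values of the target day are returned as they are.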

Case study
To illustrate the feasibility of our method, in this section we use a case study to demonstrate its execution process. Suppose there are 2 different sensors that collect traffic flow data. If we adopted 60 min as a time period, there would be 24 time periods in each day. For the convenience of readers' understanding and ease of calculation, the traffic flow data of the sensors used here contain only 5 days, each of which is divided into 10 time periods (each period is equal to 2.4 h).

Step 1 (Establish date sub-indices offline):
In this step, we use 4 hash functions to form a family of hash functions (i.e., $F_k(day) = (f_k^1(day), \ldots, f_k^4(day))$) for better illustration. Concretely, the hash function family is shown in (6).
First, according to Eq. (3), a dot product is computed between a hash function and the vector corresponding to a sensor. Then, the above process is repeated four times with different hash functions to obtain the sub-index of the sensor. Here, the sub-index of the sensor is a 0-1 vector. To facilitate the subsequent understanding and analysis, we transform the 0-1 vector of the sensor into the corresponding decimal number. The sub-indices of the two sensors are shown in (7). Here, $f_k(day)$ represents the sub-index of the $k$-th traffic sensor.
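The conversion of a 0-1 sub-index into a decimal number can be sketched as below. The bit order (most significant bit first) and the function name `bits_to_decimal` are assumptions made for illustration; the sample vector is not taken from the case study data.

```python
def bits_to_decimal(bits):
    """Interpret a 0-1 sub-index as a binary number, most significant bit first."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value

# A 4-bit sub-index such as (1, 0, 1, 1) reads as binary 1011.
example = bits_to_decimal((1, 0, 1, 1))  # 8 + 0 + 2 + 1 = 11
```

With 4 hash functions per family, every sub-index therefore maps to an integer between 0 and 15.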
Step 2 (Establish date index by amalgamating sub-indices offline): In Step 1, we obtained the sub-indices of the two traffic sensors. Next, we merge the sub-indices of the two sensors, and the final merging results are shown in Eq. (8). At the same time, the combined index is sent to each sensor platform.
Step 3 (Find similar days of $day_{target}$ online): We repeat Step 1 and Step 2 four times to get four different hash tables. Here, the sub-indices obtained by four different groups of hash function families are presented in (9) and (10), respectively, and the combined indices of the four groups of sub-indices are shown in (11). Here, $f_k^T(day)$ represents the sub-index of the $k$-th sensor obtained through the $T$-th group of hash function families.
Next, according to Eq. (4) and the above index values in (11), we can get a similar date matrix, and the results are shown in (12). In the similarity matrix in (12), the rows correspond to the days of the first sensor, and the columns correspond to the days of the second sensor.
Step 4 (Top-K missing value prediction): According to Eq. (5) and the similarity matrix obtained in Step 3 (i.e., the matrix in (12)), we can predict the missing values of the abnormal sensors. Specifically, we assume that the original data of the two abnormal sensors are as presented in (13). It can be seen from (13) that the data in the 3rd period of the 1st day of the second abnormal sensor are missing. According to (12), the 1st day is similar to the 2nd day and the 3rd day, so the missing value in the 1st day is predicted to be 1 according to Eq. (5). By analogy, the complete data of the two sensors after prediction are shown in (14).

Evaluation and further discussion
Next, we measure the performance of our proposed $ASMVP_{distr-LSH}$ method and compare it with two existing methods: $SerRec_{distr-LSH}$ [44] and Optimal-Pub [45]. The dataset used is WS-DREAM [48]. Experiments are run on a laptop with a 2.50 GHz CPU and 8.0 GB RAM, and repeated 50 times.

Profile 1: Accuracy comparison
In this profile, we test and compare the prediction accuracy (RMSE, smaller is better) of our method with that of the other methods. Parameters are as follows: $SN$ = 300, $TN$ is varied from 1000 to 4000, threshold of |DS-Set| = 4. Experimental results are reported in Fig. 2. As the results indicate, the accuracy of $ASMVP_{distr-LSH}$ is the highest (i.e., RMSE is the smallest) among the three methods because $ASMVP_{distr-LSH}$ can [...]
[...] This can be explained as follows: a larger threshold means "more similar" but "fewer" sensors based on the monitored data at more time periods. Therefore, the prediction results are better in both accuracy and time cost.
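For reference, the RMSE metric used in these profiles follows the usual root-mean-square-error definition. The helper below is a minimal sketch with hypothetical names and values; it is not the evaluation script behind the reported figures.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between two equally long sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical predicted and observed flows for two time periods.
error = rmse([1.0, 2.0], [2.0, 4.0])  # sqrt((1 + 4) / 2)
```

A smaller value means the predicted flows are closer to the observed ones.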

Profile 4: RMSE convergence of three methods.
The RMSE convergence of different methods is presented in Fig. 6. Parameters are as follows: $SN$ = 300, $TN$ is varied from 1000 to 4000, threshold of |DS-Set| = 4. Experimental results show that it is reasonable to execute the experiments 50 times, since the RMSE of all three methods becomes approximately stable. This means that the convergence of our proposal is relatively satisfactory.
Next, we briefly analyze the limitations of our proposal and point out possible improvement directions for upcoming studies.
(1) In our prediction method for missing traffic flow data caused by abnormal sensors, the LSH technique is employed to achieve the privacy protection goal. Overall, our method can secure sensitive user information while predicting missing traffic values for abnormal sensors. However, it is still difficult to measure or evaluate the degree of privacy preservation that the proposed method provides. This is because LSH is essentially a hash-based technique, and we cannot measure its privacy-preservation effects directly and quantitatively.
(2) Traffic flow data are heavily dependent on the time factor, because users' daily driving behaviors show an obvious time-varying fluctuation tendency. Therefore, this paper takes the time factor into consideration when predicting traffic data. However, traffic flow data are also related to other influencing factors besides time, such as location, weather and climate. Therefore, it is beneficial to extend the current traffic flow data prediction method by incorporating more influencing factors. Such an extension is helpful for creating a comprehensive prediction framework for missing traffic flow data, especially in complex city management.
(3) In the current time-based traffic flow prediction method, each day is divided into 96 time periods, each of which corresponds to 15 min. Such a time interval segmentation is fixed and lacks flexibility. For example, during busy hours of a day, traffic conditions vary frequently with time; in this situation, a smaller time period better depicts the traffic flow condition of the city. On the contrary, during free hours of a day, traffic conditions seldom vary with time; in this situation, a larger time period is better for describing the traffic condition of the city. Therefore, a flexible setting of the time period in time-aware traffic flow data prediction is necessary and beneficial to prediction accuracy and efficiency.
(4) LSH is in practice a probability-based similar object search technique; therefore, there is some uncertainty in traffic flow data prediction. In other words, it is possible that the prediction performance is not as good as expected, especially in terms of prediction accuracy. In view of this limitation, we need to optimize the traffic data prediction accuracy by improving the traditional LSH technique. One promising optimization is to use multiple hash functions instead of only one hash function when creating traffic sensor indexes or time slot indexes with LSH. Thus, by repeating the LSH process multiple times, we can achieve a good tradeoff between traffic prediction accuracy and efficiency.

Conclusions and future work
Missing data from abnormal sensors are common in the traffic domain and pose a big challenge for accurate traffic flow prediction and traffic routine scheduling in smart city management. Motivated by this challenge, this paper presents a distributed traffic flow missing value prediction method with privacy preservation for abnormal traffic sensors, i.e., $ASMVP_{distr-LSH}$. First, our method integrates known traffic flow data from different sensors offline and sends these data to the edge server [49,50]. Then, similar dates with close traffic conditions are filtered out based on the LSH technique. Finally, the Top-k dates as well as their traffic flow data are used for predicting the missing traffic data of abnormal sensors on a certain day. To verify the feasibility of $ASMVP_{distr-LSH}$, we provide a case study to demonstrate the concrete process of missing value prediction with privacy preservation.
In the future, we will use a set of real traffic sensor data to test the performance of our method and compare it with other related methods. Compared with cloud AI, edge AI has lower latency, which will greatly improve the performance of traffic flow prediction [51][52][53]. Therefore, we will consider applying edge AI technology in future research. For abnormal sensors, it is still a challenging task to consider user privacy, prediction accuracy and scalability simultaneously [54,55]. We will consider a more complex traffic flow prediction scenario in the upcoming study [56,57]. We will also further study how missing values in traffic flow are generated and how our method compares with contemporary methods. In addition to missing values in traffic flow, we will also consider sensor-induced anomalous data through anomaly detection.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.