1 Introduction

The growth of satellite and space technology has driven a revolution in science and technology.

Satellite images captured from space offer insights into climate change, town planning, urban development patterns, traffic flow patterns, wildlife migration, environmental change, vegetation index fragmentation, and so on [1]. Geospatial images contain spatial and temporal patterns, which require highly efficient mining techniques to derive insights. Among satellite images, Serial Remote Sensing Images (SRSI) have particular potential for earth observation. SRSI are captured over the same area at different time periods, which helps to identify habitat changes [2, 3]. Events that occur at certain time intervals, such as seasonal changes, deforestation, the growth of buildings in metropolitan areas, and changes in vegetation patterns across seasons, can be readily explored using SRSI. The images are explored by mining frequent patterns, that is, patterns whose count exceeds a threshold value. Changes between events can be identified at the pixel level [4].

SRSI contain sequential patterns of spatial data from which further knowledge can be mined. In general, a sequential pattern mining algorithm is used to explore frequent items in conventional databases. Geospatial data, however, has complex characteristics, such as geospatial features and large volume, which hinder exploration by traditional sequential pattern mining algorithms. The major complexities of geospatial datasets are the multifarious relationships between objects and the critical role of spatial autocorrelation [5, 6]. To overcome these complexities, several researchers have proposed techniques that apply conventional sequential pattern mining algorithms after converting the geospatial database into a transaction database. The conversion treats each pixel as an object and the pixel values as the events in its transaction [7, 8]. However, because of the huge volume of data, converting the images into vertical tables takes a large amount of time. This challenge can be resolved by analyzing the data at the area level rather than at the pixel level, since grouping pixels reduces the time and computational complexity. In this paper, Quantized ternary pattern based pixel grouping and Singular Value Decomposition (SVD)-Run Length Coding (RLC) based pattern mining are proposed. The pixel grouping algorithm groups the pixels block by block, thereby reducing the computational overhead and the time consumption. The SVD based pattern mining algorithm is used to increase the robustness of the proposed approach; it resists image processing attacks, which improves the visual quality [34, 35]. It also satisfies the threshold support values and helps in generating highly accurate sequence patterns [9].

The remaining sections of the paper are organized as follows. Section 2 reviews the applications and challenges of various spatial pattern mining algorithms. Section 3 gives a detailed explanation of the proposed Quantized ternary pattern based pixel grouping and Singular Value Decomposition (SVD)-Run Length Coding (RLC) based pattern mining algorithms. Section 4 presents the dataset used for experimentation and discusses the results of the proposed algorithm in terms of mining time and support values of pattern sequences, together with a performance analysis. Section 5 concludes the paper by highlighting the features of the proposed work.

2 Literature review

In this section, existing works on spatio-temporal pattern mining algorithms for different applications are discussed, along with their issues and the corresponding solutions derived by different researchers.

In a fuel cell, the damages were identified using the sequence of events that occur during an acoustic emission using the Cluster Sequence Mining (CSM) algorithm. The correlated cluster pairs were extracted using the CSM algorithm from data that consist of a sequence of events. The spatial and temporal proximity patterns of the events were considered to pair the clusters. The intervals between the sequences were distributed in a periodic manner and the events were ordered in a proper sequence by the Co-occurring Cluster Mining (CCM) algorithm [10].

Bayesian inference was used to infer the robustness based on the probability of time intervals. The events were correlated according to their correspondence with the help of dynamic programming. The challenge in CCM was that it requires both spatial and temporal proximity for evaluation but does not consider the order of event occurrence. The probability density functions were assumed, based on the inferences made by Bayesian inference, to follow a specific order. The accuracy of the inference was enhanced using Dynamic Time Warping (DTW). These challenges were addressed by the CSM algorithm in the identification of damage in fuel cells [11].

Local spatial autocorrelation indices were applied to explore the patterns of fragmentation in vegetated land. Vegetated land in urban areas is cultivated with different crops, which results in a high classification error when it is classified from land cover data in a discrete format [12, 13]. This issue was addressed using Local Indicators of Spatial Association (LISA), which help in clustering the patterns of spatial data efficiently. The authors used spatial data on cultivation patterns in Zimbabwe to identify the hot and cold clusters in the region. Z scores were used to highlight the statistical measure of each parameter in the dataset. The moving window method was adopted to analyze the vegetation pattern. LISA helped in increasing the accuracy of classifying the spatial patterns [14, 24].

In wireless sensor networks, sequential patterns were discovered using an incremental mining algorithm to make the mining process more efficient. A traditional mining algorithm, namely PrefixSpan, was enhanced to reduce the complexity that recursion introduces in the conventional algorithm. The proposed algorithm generated the sequence of frequent patterns and computed the support number using a reticular sequence tree, which speeds up counting. The enhanced PrefixSpan was found to outperform the conventional algorithm in terms of space, time, and other sensitivity factors [15].

A Vague Grid Sequence (VGS) based frequent pattern mining algorithm was proposed to overcome drawbacks in Euclidean space and road networks, such as the difficulty of identifying appropriate positions. The vertical projection distance was used to partition the zones into explicit and vague zones. The grid sequence in the vague zones was obtained by transforming its trajectories. The grid boundary problem was resolved in this approach using the vertical projection distance. Trajectory pattern mining has applications in identifying wildlife migration, planning traffic, detecting vegetation fragments, recommending paths, and so on [16, 17].

The spatio-temporal patterns of a seismic dataset were used to explore frequent items in the transactional database. A group of spatial data was collected, and the continuous values were converted to discrete values to mine patterns more effectively. Exact patterns were matched using an adapted sequential pattern detection algorithm. The data sequences obtained were verified against the min-support constraints [18].

Pattern mining involves both negative and positive sequential patterns, and the negative patterns play a critical role when compared to the positive patterns. The negative sequence patterns in a dataset are generally numerous, and hence they are difficult to mine [19, 20]. The exploration of negative sequence patterns was made easier by proposing set theory based negative sequence pattern mining. The negative patterns were recognized using the set theory approach, which converts the problem into a positive containment problem. This reduces the time spent rescanning the database and makes use of the positive sequence patterns. Hence, the available positive patterns were used to mine the negative patterns from a large dataset [21].

Two networks, namely the residual U-Net and the Combined Differential Image network, were integrated to detect urban patterns using Synthetic Aperture Radar (SAR) sensors. The conventional log ratio method results in a poor compression ratio; hence the difference between the images was computed using the weighted difference method, which is used to generate the weight function. The log ratio difference and singular decomposition were combined using neighbor constraints. Finally, the residual U-Net was applied to detect the changes [22, 23].

A frequent pattern list was introduced to eliminate invalid candidates for frequent patterns. Only the valid candidates are approved, thereby reducing the processing time. In comparison with the 9DSPA miner approach, the frequent pattern list reduces the processing time [25]. Various denoising, image enhancement, and restoration techniques can be used on remote sensing images [26]. The spatial overlay operation was integrated with Run Length Encoding for the efficient clustering of pixels, and the PrefixSpan algorithm eliminates the scanning of unnecessary pixels [27]. The spatial resolution of remote sensing images can be improved by super resolution techniques [28].

3 Proposed methodology

In this section, the proposed Quantized ternary pattern based identical spatial sequential prediction and Singular Value Decomposition (SVD)-Run Length Coding (RLC) based clustering for analyzing vegetation fragmentation are explained in detail. The SRSI obtained from remote sensing satellites have high-intensity pixels that carry information about the same area at different points in time. A pixel, the elementary unit of an image, takes its value from the color at the corresponding location. The values of a pixel across the images are treated as a sequence of events captured at successive time intervals, termed a spatial sequence. The proposed system aims to reduce the mining time by considering only the non-repeated pixels, which are identified with the quantized ternary pattern. This avoids encoding repeated pixels, which would otherwise add to the mining time. Hybridizing RLC with SVD concentrates most of the energy content of the matrix in a few singular values and also reduces the operation time.

The sequence of spatial data in the image is represented as follows

$$m\, \times \,n\, \times \,\sum\limits_{j = 0}^{N} {P_{n}^{j} }$$
(1)

where N is the total number of images with height ‘m’ and width ‘n’.
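As a rough illustration of this representation, the sketch below stacks N co-registered images of size m × n and reads off one temporal sequence per pixel, giving m·n spatial sequences of length N. The function name `to_spatial_sequences` and the toy label images are hypothetical and are not part of the proposed system.

```python
import numpy as np

def to_spatial_sequences(images):
    """Stack N images of shape (m, n) and return an (m*n, N) table.

    Row r holds the temporal sequence of the pixel at
    (r // n, r % n), ordered from the first image to the last.
    """
    stack = np.stack(images, axis=0)          # shape (N, m, n)
    n_images, m, n = stack.shape
    # Move the time axis last, then flatten the two spatial axes.
    return stack.transpose(1, 2, 0).reshape(m * n, n_images)

# Three hypothetical 2 x 2 label images of the same area at three dates.
images = [
    np.array([[1, 1], [2, 3]]),
    np.array([[1, 1], [2, 2]]),
    np.array([[5, 5], [2, 3]]),
]
sequences = to_spatial_sequences(images)
```

Each row of `sequences` is the kind of per-pixel event sequence that a sequential pattern miner would consume; the table grows linearly with the number of pixels, which is the cost that pixel grouping is meant to reduce.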

The images in the dataset are represented in sequential order as \(imageset = \left\{ {I_{1} ,I_{2} ,I_{3} \ldots I_{n} } \right\}\), from which the sequences that occur frequently are identified. An automatic threshold \(T_{h}\) is used to recognize the frequent items in the sequence: an image value greater than the threshold is considered a frequent pattern. To apply a sequential pattern mining algorithm, the spatial dataset has to be converted to a table format that holds the pixel values, but this conversion is costly in both time and space. The flow of the proposed system is shown in Fig. 1. These challenges are addressed by proposing

  • Quantized ternary pattern based pixel grouping

  • Singular Value Decomposition (SVD)—Run Length Coding based pattern mining
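To make the role of the support threshold concrete, the following is a minimal sketch of a plain PrefixSpan-style miner over per-pixel sequences. It is not the Group PrefixSpan variant used in this work, and it assumes that a pattern is frequent when its support is at least `min_support` (the text above uses a strict "greater than" comparison against \(T_{h}\)); the toy sequences are hypothetical.

```python
from collections import Counter

def prefixspan(sequences, min_support):
    """Return {pattern: support} for all frequent subsequences.

    `sequences` is a list of lists of hashable events; support is the
    number of sequences that contain the pattern as a subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # Count each event at most once per projected sequence.
        counts = Counter()
        for seq in projected:
            counts.update(set(seq))
        for item, support in counts.items():
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project: keep the suffix after the first occurrence of item.
            suffixes = [seq[seq.index(item) + 1:]
                        for seq in projected if item in seq]
            mine(pattern, suffixes)

    mine((), [list(s) for s in sequences])
    return results

# Hypothetical per-pixel sequences (one list per pixel, one event per date).
pixel_sequences = [[1, 1, 5], [1, 1, 5], [2, 2, 2], [3, 2, 3]]
frequent = prefixspan(pixel_sequences, min_support=2)
relaxed = prefixspan(pixel_sequences, min_support=1)
```

On this toy input, lowering the threshold from 2 to 1 strictly increases the number of reported patterns, which matches the general behaviour of support-based mining.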

Fig. 1 Flow diagram for the proposed system

3.1 Quantized ternary pattern based Pixel Grouping

Pixels that have similar spatial sequences are grouped along with their neighbors using the quantized ternary patterns. The SRSI images are divided into blocks of dimension 7 \(\times\) 7 as follows

$$B_{p} = \mathop {\lim }\limits_{x \to 1\,\,to\,\,m - 7} \,\,\,\mathop {\lim }\limits_{y \to 1\,\,to\,\,n - 7} \left[ {I(x:x + 7,y:y + 7)} \right]$$
(2)

Here, ‘x’ and ‘y’ represent the row and column index of the pattern, and ‘m’ and ‘n’ denote the number of rows and columns, respectively. \(B_{p}\) is the pattern matrix formed from the image. The center pixel of the pattern matrix is the value in its 4th row and 4th column. The neighborhood pixels are subtracted from the center pixel to predict the non-repeating values in the image. The identification of non-repeated pixel values is continued over the entire image using an automatic threshold approach. For each block in the image, a unique value \(U_{j} = unique\left( {d_{j} } \right)\) is obtained as the non-repeating pixel.
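The block-wise step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the blocks are taken as non-overlapping 7 × 7 tiles, the threshold `t` is a fixed hypothetical value rather than the automatic threshold, and the ternary code follows the usual local ternary pattern convention (+1, 0, −1 for differences above `t`, within ±`t`, and below −`t`).

```python
import numpy as np

def quantized_ternary_blocks(image, block=7, t=2):
    """Return one (ternary_codes, unique_differences) pair per block."""
    m, n = image.shape
    c = block // 2                       # index 3, i.e. 4th row and column
    out = []
    for x in range(0, m - block + 1, block):
        for y in range(0, n - block + 1, block):
            patch = image[x:x + block, y:y + block].astype(np.int64)
            d = patch[c, c] - patch      # centre pixel minus each pixel
            # Quantize the differences into three levels (assumed coding).
            codes = np.where(d > t, 1, np.where(d < -t, -1, 0))
            out.append((codes, np.unique(d)))
    return out

# Hypothetical 7 x 14 image holding two blocks of constant value 10.
demo = np.full((7, 14), 10)
demo[0, 0] = 20      # much brighter than the centre of the first block
demo[6, 13] = 3      # much darker than the centre of the second block
blocks = quantized_ternary_blocks(demo)
```

In this toy case each block yields only two distinct differences, so a 49-pixel block is summarized by a handful of values instead of being encoded pixel by pixel.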

figure a

3.2 Singular value decomposition (SVD)—Run length coding (RLC) based pattern mining

RLC is an image compression technique that processes the image pixel by pixel, which increases the time complexity. SVD-RLC is applied to reduce the time complexity of encoding the pixels: decomposing the neighboring pixel values yields a single value for processing. Figure 2a and b show the input SRSI images and the singular value decomposed images obtained from the datasets, respectively. The input image, which holds the unique pixels, is processed using SVD-RLC based clustering, which produces encoded vectors. SVD is applied to every row and column of the image. An orthogonal eigenvector of the image is computed and multiplied with the image. The RLC process is then carried out on the SVD decomposed image. The intensity of each pixel from position 0 to the ith position is obtained for all the unique values. Runs of pixels with the same intensity are identified and the number of occurrences is counted. The process is repeated for each row of the image to obtain the encoded image vector.
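A compact way to picture the two stages is a low-rank SVD reconstruction followed by row-wise run-length coding. The sketch below assumes a truncated SVD that keeps the `k` largest singular values and rounds the result to integers before coding; the exact decomposition and multiplication steps of the proposed method may differ, and the names `svd_approximation`, `run_length_encode`, and `svd_rlc` are illustrative.

```python
import numpy as np

def svd_approximation(image, k):
    """Rank-k reconstruction from the k largest singular values."""
    u, s, vt = np.linalg.svd(image.astype(float), full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]

def run_length_encode(row):
    """Encode a 1-D sequence as [(value, run_length), ...]."""
    runs = []
    for value in row:
        value = int(value)
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1             # extend the current run
        else:
            runs.append([value, 1])      # start a new run
    return [tuple(r) for r in runs]

def svd_rlc(image, k):
    """Round the rank-k image to integers and encode it row by row."""
    approx = np.rint(svd_approximation(image, k)).astype(int)
    return [run_length_encode(row) for row in approx]

# Hypothetical 3 x 5 image of rank 2, so k=2 reconstructs it exactly.
image = np.array([[4, 4, 4, 9, 9],
                  [4, 4, 4, 9, 9],
                  [7, 7, 7, 7, 7]])
encoded = svd_rlc(image, k=2)
```

Because most of the energy sits in the first few singular values, a small `k` tends to smooth minor intensity variations, which lengthens the runs and shortens the encoded vectors.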

figure b

3.3 Bisect images for encoded vectors

The image bisecting process takes the encoded vectors as input and returns the grouped pixels in a text file. The length of the encoded vector is initialized in order to intersect two rows of two different images. The numbers of pixels in the two images are initialized and represented as \(num^{1}\) and \(num^{2}\), respectively. The two images are compared pixel by pixel and the intersection operation is performed. The sequences from the images are added to form a new row of items along with their sequences, which replaces the current row. The encoded vectors are added to the new sequence to return the grouped pixels using the Group PrefixSpan algorithm.
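One plausible reading of the intersection step is sketched below: the run-length-encoded versions of the same row in two images are merged so that every output segment is a group of adjacent pixels sharing the same two-date sequence. This is an interpretation under stated assumptions, not the authors' procedure; the function name `intersect_runs` and the input rows are hypothetical, and both rows are assumed to cover the same number of pixels.

```python
def intersect_runs(runs_a, runs_b):
    """Intersect two run-length-encoded rows of equal total length.

    Returns [((value_a, value_b), length), ...]; each entry is a group
    of adjacent pixels that share the same two-date sequence.
    """
    result = []
    i = j = 0
    left_a = runs_a[0][1] if runs_a else 0   # pixels left in current run
    left_b = runs_b[0][1] if runs_b else 0
    while i < len(runs_a) and j < len(runs_b):
        step = min(left_a, left_b)           # overlap of the two runs
        result.append(((runs_a[i][0], runs_b[j][0]), step))
        left_a -= step
        left_b -= step
        if left_a == 0:
            i += 1
            if i < len(runs_a):
                left_a = runs_a[i][1]
        if left_b == 0:
            j += 1
            if j < len(runs_b):
                left_b = runs_b[j][1]
    return result

row_date1 = [(4, 3), (9, 2)]      # decodes to 4 4 4 9 9
row_date2 = [(4, 2), (7, 3)]      # decodes to 4 4 7 7 7
groups = intersect_runs(row_date1, row_date2)
```

Working on runs rather than individual pixels means the cost of the comparison depends on the number of runs in each row, not on the row width.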

figure c

4 Results and discussion

In this section, the proposed SVD pixel grouping based clustering algorithm is validated in terms of time consumption and mining time (Table 1). The Cropland Data Layer dataset is used to measure the performance of the algorithm. The dataset is available at https://nassgeodata.gmu.edu/CropScape/ [29]. Table 2 shows the dataset details. The dataset consists of geospatial raster images captured by a satellite at a reasonable resolution. The images cover a region of the continental United States, namely the state of Iowa. The region spans over 150,000 square kilometers, and each pixel in the image covers about 900 square meters. The different crops cultivated in the region are identified by the colors on the map. The data are split into four datasets, D1, D2, D3, and D4, based on the period in which they were captured. A comparison with existing methodologies is given in Table 3, and the performance analysis of the proposed method is explained. Figures 3 and 4 compare the mining time of the existing algorithms, namely PrefixSpan, Group PrefixSpan, GFS pattern, and Group PrefixSpan AC, with that of the proposed SVD pixel grouping algorithm. The values in Tables 4 and 5 are plotted in Figs. 3 and 4. Figure 3 shows that the SVD pixel grouping algorithm outperformed the PrefixSpan and Group PrefixSpan algorithms. The mining time of PrefixSpan is high because it cannot project the database vertically. The support values of the sequence patterns generated by the different sequential pattern mining algorithms are compared in Fig. 5, which plots the values in Table 6 and shows that the proposed SVD Group PrefixSpan generated more accurate sequence patterns. PrefixSpan skips a larger number of rows during scanning and hence fails to generate patterns with the required support values. In comparison with the PrefixSpan and Group PrefixSpan algorithms, the SVD Group PrefixSpan generated support values greater than the threshold value. In general, SRSI are used to analyze the growth and development of a particular area over time. The experiment shows that the threshold value plays an important role: more patterns are obtained with low threshold values. These patterns are used in the prediction of crop yields.
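As a back-of-the-envelope check on the data volume, taking the stated coverage of about 150,000 square kilometers and about 900 square meters per pixel at face value, a single image of the region holds roughly

$$\frac{150{,}000\ \text{km}^{2} \times 10^{6}\ \text{m}^{2}/\text{km}^{2}}{900\ \text{m}^{2}/\text{pixel}} \approx 1.67 \times 10^{8}\ \text{pixels}$$

so a pixel-level transaction table would contain on the order of \(10^{8}\) sequences per dataset, which illustrates why grouping pixels before mining matters for the mining time.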

Table 1 Parameter settings
Table 2 Dataset details
Table 3 Comparison with existing methodologies
Fig. 2 a Input images of datasets D1, D2, D3 and D4. b Singular value decomposed images of datasets D1, D2, D3 and D4

Fig. 3 a Evaluation of time consumption on test dataset (3.57% decreased). b Evaluation of time consumption on test dataset (5.4% decreased). c Evaluation of time consumption on test dataset (4.83% decreased). d Evaluation of time consumption on test dataset (3.86% decreased)

Table 4 Mining time of prefix span, group-prefix span and SVD group prefix span algorithm based on the threshold values
Table 5 Mining time of GFS pattern, group prefix span-AC and SVD grouping algorithm based on the threshold values
Fig. 4 a Evaluation of mining time on test dataset. b Evaluation of mining time on test dataset (29.41% decreased). c Evaluation of mining time on test dataset (48.57% decreased). d Evaluation of mining time on test dataset (45.67% decreased)

Fig. 5 Comparison of the support values of sequential pattern mining methods

Table 6 Support values of sequential pattern methods for each pattern

4.1 Performance analysis

The proposed Quantized ternary pattern based identical spatial sequential prediction and Singular Value Decomposition (SVD)-Run Length Coding (RLC) based pattern mining is compared with related methodologies and analyzed in terms of memory storage, encoding and decoding of information, compression, and security. Table 3 compares the existing related methods and shows that compression can be performed using SVD without large memory storage. Encoding and decoding can be done easily by sequential operations. A high level of information is retained when the non-repeated pixel based approach is applied. The structure and texture of the information are not lost during compression by the SVD-RLC algorithm. Because pixel grouping is performed pattern by pattern, attackers cannot reorder the colors of the vector in the palette. The idea behind this work is to propose an efficient algorithm that compresses a remote sensing image set using SVD Group PrefixSpan, that scales to big datasets in terms of time, and that overcomes the challenges in SRSI.

5 Conclusion

Quantized ternary pattern based frequent spatial sequence and Singular Value Decomposition (SVD)-Run Length Coding (RLC) based pattern mining, along with Group PrefixSpan for pixel grouping, are proposed to overcome the challenges in SRSI, such as the computational overhead caused by the huge volume of images. The images are compressed by the Quantized ternary pattern (QTP) based SVD-RLC algorithm, which reduces the overhead by processing the images block by block. Group PrefixSpan sequence pattern mining outperformed PrefixSpan by scanning the rows without any skips. The algorithms are validated using the Cropland Data Layer dataset, and the results show that they require less mining time and generate appropriate sequence patterns above the threshold support values. The proposed method is highly efficient, as it compresses the remote sensing image set using SVD Group PrefixSpan even in the case of big datasets. The method can generate all sequential patterns in reasonable time.

In the future, the mining time and compression ratio can be enhanced using the fraction based run length encoding technique. In addition, the computational overhead in processing the SRSI images can be reduced using compression based classification.