Introduction

Lithological identification is a crucial concern for understanding geological history, prospecting for mineral deposits, and assessing numerous environmental hazards. Remote sensing datasets are considered an efficient, rapid, relatively cheap, and readily available source for lithological mapping (Gad and Kusky 2006; Rajendran et al. 2014; Emam et al. 2016; Ge et al. 2018; Shebl et al. 2021a). Besides saving time and effort, they yield effective and accurate mapping results, especially for inaccessible areas where fieldwork is challenging. Moreover, structural mapping (Kusky et al. 2011; Abd El-Wahed et al. 2019), hydrothermal alteration mapping (Pour and Hashim 2015; Shebl and Csámer 2021a), and mineral discrimination (Amer et al. 2010; Gabr et al. 2010; Ninomiya and Fu 2019) can all be fulfilled efficiently. In all of these studies, wide areas were mapped using digital remote sensing datasets and image processing techniques without exhausting effort or huge amounts of time.

Similarly, using machine learning algorithms (MLAs) as an automatic inductive approach to recognize data patterns (Cracknell and Reading 2014), large numbers of pixels can be classified depending on smaller numbers of labeled pixels, generally referred to as training data. Thus, a significant amount of time and effort can be saved through the utilization of MLAs for lithological mapping. Once trained and learned, MLAs can predict a value and thus create and assign a label to the unknown pixels efficiently. This process is simply a kind of artificial intelligence and can be categorized as a supervised classification because pixels are transformed from unknown to labeled based on previously selected pixels seen by the algorithm. Consequently, supervised classification is premised mainly on the presence of training data (Inzana et al. 2003; Kotsiantis 2007). Alternatively, unsupervised techniques classify pixels (e.g., different rock units) via clustering, depending mainly on spectral characteristics without being fed by training areas (Kumar and Sahoo 2017). This is considered the main base in rock identification, where rocks (with various mineralogical constituents) respond variously to different wavelengths and thus have various responses and appearances in remote sensing data and can be discriminated from each other. Notable improvements in lithological mapping using remote sensing data have been made by using classifiers such as maximum likelihood classifier (MLC) (Yu et al. 2012a; Ge et al. 2018), naïve Bayes (NB) (Cracknell and Reading 2014), artificial neural networks (ANNs) (He et al. 2015; Latifovic et al. 2018), k-nearest neighbors (K-NN) (Cracknell and Reading 2014; Ge et al. 2018), support vector machines (SVMs), and random forests (RF) (Kuhn et al. 2018; Cardoso-Fernandes et al. 2020; Shebl and Csámer 2021b). 
Besides geological discrimination, mineral exploration programs have been significantly enhanced by applying various machine learning algorithms and classification schemes to different remote sensing datasets, e.g., band ratio matrix transformation (Askari et al. 2018; Noori et al. 2019), spectral angle mapper and spectral information divergence (Hadigheh and Ranjbar 2013; El-Magd et al. 2015; Ahmadirouhani et al. 2018; Sheikhrahimi et al. 2019), fuzzy logic modeling (Sekandari et al. 2020), linear spectral unmixing (Pour and Hashim 2012; Pour et al. 2019; Takodjou Wambo et al. 2020), constrained energy minimization (Zhang et al. 2007; Aboelkhair et al. 2021; Shebl et al. 2021a), and mixture tuned matched filtering (Pour and Hashim 2012; Mehr et al. 2013; Pour et al. 2018; Noori et al. 2019).

Notwithstanding the proven effectiveness of Advanced Land Imager (ALI) data in geological and hydrothermal alteration mapping (Pour and Hashim 2014), ALI is rarely utilized in lithological classification using MLAs. Thus, the novelty in the current contribution could be outlined in assessing ALI efficiency in delivering accurate lithological allocation and comparing the results of a test case with the widely used datasets (ASTER, Landsat 8, and Sentinel 2) and accepted classifiers (e.g., ANN, MLC, SVM). Moreover, due to the economic importance of the study area because of the presence of several ore deposits including REEs (EGSMA 1983), the current study aims to enhance a recent geological map of the study area depending mainly on objectivity introduced with MLAs, instead of subjectivity that could be evident with traditional mapping, as noticed by some differences among the previous geological maps of the investigated areas.

Materials and methods

Study area description

The Um Salatit–Mueilha area is located in the Central Eastern Desert (CED) of Egypt, as shown in Fig. 1a. Precisely, the investigated area lies between latitudes 24° 49′ and 25° 18′ N and longitudes 33° 50′ and 34° 05′ E, covering an area of about 1400 km². The study area is well known for ancient mining activities and has a recently published geological map (Zoheir et al. 2019), which is useful for comparison with our results and verification. The area is covered by a widely distributed stretch of Neoproterozoic ophiolitic mélange consisting mainly of allochthonous ophiolitic fragments mingled in a sheared matrix (Zoheir et al. 2019). Mélange assemblages extend widely within the area along with other mappable units, such as metavolcanics, metagabbro-diorites, and granitic rocks, as shown in Fig. 1b.

Fig. 1

(a) Location of the study area (small red rectangle) and (b) geological map of the study area showing its lithological units after Zoheir et al. (2019)

Data characteristics and preprocessing

Landsat-8 (L8) carries two sensors, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), which acquire spectral data in the visible and near-infrared (VNIR), short-wave infrared (SWIR), and thermal infrared (TIR) regions. OLI data are recorded in nine spectral bands, while TIRS data are recorded in only two bands, as shown in Table 1. The whole study area is covered by a scene that was acquired on 25 October 2019. The Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) is commonly used in lithological discrimination (Pour and Hashim 2014; Pour et al. 2018; Kumar et al. 2020; Cardoso-Fernandes et al. 2020) and detects radiance in fourteen bands covering the VNIR, SWIR, and TIR regions (Yamaguchi et al. 2001), as shown in Table 1. A cloud-free ASTER scene (AST_L1A_00303062007083043) acquired on 6 March 2007 is utilized for this study. The Earth Observing-1 (EO-1) satellite carried three instruments: the Advanced Land Imager (ALI), Hyperion, and the Linear Etalon Imaging Spectrometer Array (LEISA) Atmospheric Corrector (LAC) (Franks et al. 2017). The sensors onboard EO-1 produced robust products for scientific analysis of the Earth throughout its 16-year mission (Franks et al. 2017). ALI recorded image data in ten spectral bands (Czapla-Myers et al. 2016), as shown in Table 1. The ALI scene used for the current study (EO1A1740422003070110PZ) was acquired in 2003. L8, ASTER, and ALI data were obtained through the U.S. Geological Survey, https://earthexplorer.usgs.gov/.

Table 1 Characteristics of the utilized optical datasets

Sentinel 2 (S2) was developed by the European Space Agency (ESA) to provide spectral data in 13 bands (Drusch et al. 2012), as shown in Table 1. For the current study, a cloud-free S2A MSI scene was downloaded from ESA as an L1C product. It should be emphasized that these datasets were carefully selected after checking their metadata files and technical reports. Based on the solar zenith angle, which depends on local overpass time, latitude, and date, we found that ALI data was best recorded in 2003 (Franks et al. 2017). As declared by NASA, all ASTER SWIR data collected after 1 April 2008 have been marked as unusable; therefore, we selected ASTER data acquired in 2007. For L8 and S2, as there are no reported technical errors, we decided to use recently acquired cloud-free scenes. We therefore consider the data to be of high quality for achieving the desired lithological classification. The utilized data are georeferenced to UTM zone 36 N on the WGS 84 datum. Subsequently, we performed the fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction (Shebl and Csámer 2021c) and subset the data to the investigated study area. All of these operations were carried out using the Environment for Visualizing Images (ENVI) software version 5.6. For the Sentinel-2 dataset, bands were georeferenced to the UTM zone 36 N projection using the WGS 84 datum and then radiometrically corrected using the Sen2Cor processor in the Sentinel Application Platform (SNAP).

Finer spatial resolution is known to increase within-class variability; thus, it does not significantly enhance the classification accuracy (Hsieh et al. 2001). To remove the effect of differing spatial resolutions, all TIR and panchromatic bands were excluded, and the remaining bands were resampled to 20 m using cubic convolution. The 20 m pixel size is a reasonable compromise between the coarsest (30 m) and the finest (10 m) spatial resolutions of the implemented data. In this way, an unbiased classification that preserves the relative superiority of each dataset (expressed by differences in the number of participating bands) is ensured. The resampled bands for each sensor were then stacked in files named S2, AST, L8, and ALI, having 12, 9, 7, and 9 bands, respectively. Two main combinations were also created: S2 + AST + L8 and S2 + AST + L8 + ALI. In total, six inputs were tested for their potential in lithological classification.
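The bookkeeping behind the six inputs can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that each combination is a plain layer stack that keeps every band of its member files, which the text implies but does not state explicitly, and the names `BANDS`, `COMBINATIONS`, and `build_inputs` are hypothetical.

```python
# Band counts per stacked file as given in the text
# (TIR and panchromatic bands already excluded).
BANDS = {"S2": 12, "AST": 9, "L8": 7, "ALI": 9}

# The two multi-sensor combinations described in the text.
COMBINATIONS = {
    "S2+AST+L8": ("S2", "AST", "L8"),
    "S2+AST+L8+ALI": ("S2", "AST", "L8", "ALI"),
}


def build_inputs(bands, combinations):
    """Map every classification input to its total number of bands.

    Assumption: a combination keeps all bands of each member file.
    """
    inputs = dict(bands)
    for name, members in combinations.items():
        inputs[name] = sum(bands[member] for member in members)
    return inputs


inputs = build_inputs(BANDS, COMBINATIONS)
for name, band_count in inputs.items():
    print(f"{name}: {band_count} bands")
```

Under that assumption, the two combinations would carry 28 and 37 bands, respectively, which helps when interpreting how much spectral information each classifier receives.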

Training and testing samples

With reference to the geologic map (Zoheir et al. 2019), well-distributed training pixels were carefully delineated for nine main classes. A total of 18,567 training pixels were selected. Then, 3776 ground truth pixels were selected from the nine classes to test and evaluate the model performance after training, as presented in Table 2. To ensure unbiased results, the locations of the ground truth data were carefully selected based on the geological map and kept constant for all the classifiers and datasets. Moreover, the number of testing pixels for each class was chosen according to its area: wadi deposits and syn-orogenic granite are tested by 879 and 579 pixels, respectively, owing to their larger areas, whereas ophiolitic metagabbro, which occupies the smallest area, is tested by only 188 pixels.

Table 2 Areas, training and testing pixels, and abbreviations of the lithological classes

Machine learning classifiers

Artificial neural network

The artificial neural network (ANN) is a widely known MLA that is frequently used for pattern recognition. As the name suggests, it tries to imitate the human brain in solving problems after training and learning; thus, ANN’s main processing units are named neurons or sometimes nodes. The network is formed by connecting the nodes, which are arranged in three main layer types: an input layer, a hidden (middle) layer, and an output layer. A suitable configuration is usually reached through iterative experimentation (Haykin 2010). For this study, we used a multi-layer feed-forward ANN with the logistic activation function. To achieve optimum parameter settings for the ANN, several empirical trials were made, starting from values previously used in similar studies, to minimize generalization errors. Settings that converged to poorer local minima were discarded until the lowest error was reached. The best results were obtained by setting the training root mean square (RMS) exit criterion to 0.1, the training threshold contribution to 0.9, the training rate to 0.2, and the training momentum to 0.9.
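To make the role of these parameters more concrete, the following Python sketch trains a single logistic neuron by gradient descent with momentum. It is a toy illustration, not the software implementation used in the study: the one-input neuron, the two-sample dataset, and the function names are all assumptions, and the training threshold contribution is not modeled. Only the rate (0.2), momentum (0.9), and RMS exit criterion (0.1) come from the text.

```python
import math

TRAINING_RATE = 0.2      # value reported in the text
TRAINING_MOMENTUM = 0.9  # value reported in the text
RMS_EXIT = 0.1           # training stops once the RMS error drops below this


def logistic(z):
    """Logistic (sigmoid) activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def rms_error(targets, outputs):
    """Root mean square difference between targets and network outputs."""
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets))


def momentum_step(weight, gradient, previous_delta,
                  rate=TRAINING_RATE, momentum=TRAINING_MOMENTUM):
    """One gradient-descent update with momentum; returns (new_weight, delta)."""
    delta = -rate * gradient + momentum * previous_delta
    return weight + delta, delta


def train_neuron(samples, max_epochs=5000):
    """Train one logistic neuron on (input, target) pairs (hypothetical toy case)."""
    w, b = 0.0, 0.0
    dw_prev, db_prev = 0.0, 0.0
    targets = [t for _, t in samples]
    for _ in range(max_epochs):
        outputs = [logistic(w * x + b) for x, _ in samples]
        if rms_error(targets, outputs) < RMS_EXIT:
            break  # RMS exit criterion reached
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum((o - t) * o * (1 - o) * x
                     for (x, t), o in zip(samples, outputs)) / len(samples)
        grad_b = sum((o - t) * o * (1 - o)
                     for (_, t), o in zip(samples, outputs)) / len(samples)
        w, dw_prev = momentum_step(w, grad_w, dw_prev)
        b, db_prev = momentum_step(b, grad_b, db_prev)
    return w, b


w, b = train_neuron([(0.0, 0.0), (1.0, 1.0)])
print(f"weight={w:.3f}, bias={b:.3f}")
```

The momentum term reuses a fraction of the previous update, which can help the search move past shallow local minima, while the RMS exit criterion bounds how closely the network is fitted to the training pixels.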

Maximum likelihood classifier

The maximum likelihood classifier (MLC) is a classical classifier widely used for remote sensing data. As the name suggests, each unknown pixel is assigned to the class to which it has the highest probability of belonging. The probability density function estimated for each class is therefore the basis of MLC (Scott and Symons 1971). For the current study, lithological generalization using MLC was carried out in ENVI 5.6.
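For reference, the decision rule is commonly written as a discriminant function. The expression below is the standard textbook form and assumes that each class follows a multivariate normal distribution. The symbols are introduced here for illustration and are not taken from the study.

$$g_{i}(\mathbf{x}) = \ln p(\omega_{i}) - \frac{1}{2}\ln\left|\Sigma_{i}\right| - \frac{1}{2}\left(\mathbf{x} - \mathbf{m}_{i}\right)^{T}\Sigma_{i}^{-1}\left(\mathbf{x} - \mathbf{m}_{i}\right)$$

where $\mathbf{x}$ is the spectral vector of a pixel, $\mathbf{m}_{i}$ and $\Sigma_{i}$ are the mean vector and covariance matrix estimated from the training pixels of class $\omega_{i}$, and $p(\omega_{i})$ is the prior probability of that class. The pixel is assigned to the class with the largest $g_{i}(\mathbf{x})$.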

Support vector machine

SVM has become one of the most important models in remote sensing and machine learning studies. Statistical learning theory (Ougiaroglou et al. 2018), first introduced by Vapnik in 1963 (Cortes and Vapnik 1995), is the basis of SVM. In this technique, the known samples are treated as points distributed in an n-dimensional space and separated by a hyperplane. The best hyperplane is the one that achieves the maximum separation between the classes. The distance between this hyperplane and the nearest training samples of each class is termed the margin. In addition to this optimal separating hyperplane, a penalty for misclassifications is introduced to achieve the most efficient results. Because the classes are often not linearly separable in the original feature space, a kernel function is frequently required (Wang et al. 2017). To specify the optimum parameters for the SVM classifier, linear, polynomial, radial basis function, and sigmoid kernels were tested to identify the best-performing kernel. In this study, the radial basis function kernel delivered the most efficient results. The penalty parameter was set to 100, the best value found after several trials, to manage the training errors. The gamma parameter of the kernel function was set to the inverse of the number of bands (Othman and Gloaguen 2014) to reasonably control the degree of non-linearity of the SVM model.
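The effect of the gamma setting can be illustrated with a short Python sketch of the radial basis function kernel. This is a minimal illustration rather than the software routine used in the study: the function names and the two 9-band pixel vectors are hypothetical, and only the penalty value (100) and the rule gamma = 1 / number of bands come from the text.

```python
import math

PENALTY_C = 100.0  # penalty parameter reported in the text


def gamma_for(n_bands):
    """Gamma set to the inverse of the number of input bands."""
    return 1.0 / n_bands


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical reflectance vectors for two pixels of a 9-band input (e.g., ALI).
pixel_a = [0.12, 0.15, 0.18, 0.22, 0.25, 0.27, 0.30, 0.33, 0.31]
pixel_b = [0.10, 0.14, 0.20, 0.24, 0.24, 0.29, 0.35, 0.36, 0.30]

gamma = gamma_for(len(pixel_a))
print(f"gamma = {gamma:.4f}, K(a, b) = {rbf_kernel(pixel_a, pixel_b, gamma):.6f}")
```

Because gamma shrinks as the number of bands grows, the kernel stays comparably smooth for the 7-band L8 input and for the larger stacked combinations. In scikit-learn, an equivalent configuration would presumably be `SVC(kernel="rbf", C=100, gamma=1 / n_bands)`, although the text does not specify which SVM implementation was used.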

Accuracy assessment methods

To assess the classification outputs and the performance of the classifiers, the accuracy of each class was assessed using the confusion matrix. The producer’s accuracy is the proportion of reference pixels of a class that the classifier allocates correctly, while the user’s accuracy is the probability that a pixel labeled as a given class on the thematic map actually belongs to that class (Congalton 1991; Ge et al. 2018). In this study, we evaluate the results using the average accuracy (the mean of the producer’s and user’s accuracies) as well as the overall accuracy (OA), that is, the number of pixels labeled correctly by the MLAs as a fraction of the total number of testing pixels. Moreover, the well-known kappa coefficient, which measures the agreement of the resultant thematic maps with the reference data, is used to evaluate the consistency of the results (Cohen 1960) according to the following equation.

$$\kappa = \frac{N\sum_{i=1}^{n} m_{i,i} - \sum_{i=1}^{n} G_{i} C_{i}}{N^{2} - \sum_{i=1}^{n} G_{i} C_{i}}$$

where i is the class index, n is the number of classes, N is the total number of classified values compared to truth values, and $m_{i,i}$ is the number of correctly classified values of truth class i. $C_{i}$ and $G_{i}$ are the total numbers of predicted and truth values belonging to class i, respectively.
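The measures defined above can be computed directly from a confusion matrix, as in the following Python sketch. The function name and the two-class example matrix are hypothetical and serve only to illustrate the arithmetic. Rows are assumed to hold the truth classes and columns the predicted classes.

```python
def accuracy_metrics(matrix):
    """Compute accuracy measures from a square confusion matrix.

    matrix[i][j] is the number of testing pixels of truth class i
    that were predicted as class j (assumed layout).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)                      # N
    diagonal = [matrix[i][i] for i in range(n)]                  # m_{i,i}
    truth_totals = [sum(matrix[i]) for i in range(n)]            # G_i
    predicted_totals = [sum(matrix[i][j] for i in range(n))      # C_i
                        for j in range(n)]

    producers = [d / g if g else 0.0 for d, g in zip(diagonal, truth_totals)]
    users = [d / c if c else 0.0 for d, c in zip(diagonal, predicted_totals)]
    average = [(p + u) / 2 for p, u in zip(producers, users)]
    overall = sum(diagonal) / total

    chance = sum(g * c for g, c in zip(truth_totals, predicted_totals))
    kappa = (total * sum(diagonal) - chance) / (total ** 2 - chance)

    return {
        "producers": producers,
        "users": users,
        "average": average,
        "overall": overall,
        "kappa": kappa,
    }


# Hypothetical two-class example: 100 testing pixels, 85 labeled correctly.
example = [[40, 10],
           [5, 45]]
metrics = accuracy_metrics(example)
print(f"OA = {metrics['overall']:.2f}, kappa = {metrics['kappa']:.2f}")
```

For this example, 85 of the 100 testing pixels lie on the diagonal, so OA is 0.85, while kappa is (100 × 85 − 5000) / (100² − 5000) = 0.70, which shows how kappa discounts the agreement expected by chance.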

Results

Classification accuracy results of the nine classes reveal that Osp and Wdp are correctly classified from all the datasets and by the three classifiers, with average accuracy always above 90% (Fig. 2a–c). This is attributed to their distinct spectral signatures: the abundance of antigorite, lizardite, clinopyroxenite, and magnetite in the mineral composition of serpentinite (Gad and Kusky 2006) sets it apart from the other rock units, while wadi deposits have a bright, distinctive tone and fine texture. Nss is also well classified, with an average accuracy exceeding 90% by MLC and SVM for all the datasets except S2 and L8, for which its average accuracy was around 80% (Fig. 2a–c). Higher accuracies for ASTER in discriminating sandstone are attributed to significant SWIR absorption features of silicate minerals in ASTER band-passes (Mars and Rowan 2010). Syn-tectonic (Sog) and post-tectonic (Pog) granites are reasonably well classified from all the datasets and classifiers. A slightly lower average accuracy (60–80%) is recorded for metavolcanics (Mvs) (Fig. 2a–c). The reason for this decrease is the wide range of chemical and mineralogical compositions of the metavolcanics in the study area (acidic to intermediate metavolcanics with their related pyroclastics). As reported by Zoheir and Weihed (2014), metavolcanics comprise a series of weakly metamorphosed calc-alkaline volcanics of andesite-dacite composition forming a mixture of lithofacies with interrelated pyroclastic volcanic tuffs and breccias.

Fig. 2

Average accuracies of the lithologic units classified by (a) ANN, (b) MLC, and (c) SVM. Overall accuracies (d) and kappa coefficients (e) for the utilized datasets after SVM classification. (f) Comparison showing that SVM outperforms ANN and MLC and that OA increases when the different sensors are merged. (g) ALI raised the classification accuracy of Ome and Omg when added to the S2 + AST + L8 combination

Similarly, misclassifications are always associated with the ophiolitic mélange (Fig. 2a–c), and this could be explained by the definition of the term mélange itself. It describes mappable geological units or bodies of mixed rocks consisting of blocks of different ages and origins (Kusky et al. 2020). Since the mixing may occur at multiple scales (Kusky and Bradley 1999), including below our pixel resolution, it is difficult to correctly classify this unit. Mixed spectral signatures are therefore often included within the mélange, and confusion is evident for all the classifiers and datasets. This is especially prominent with S2, whose larger number of spectral bands leads to model overfitting that adversely affects the average accuracies for ophiolitic mélange (Ome) (Fig. 2a–c) and the OAs in all S2 generalization processes (Fig. 2d,e). Moreover, classification errors are always present with ophiolitic metagabbro (Omg), with an average accuracy ranging from 40 to 78%, because it is sometimes misclassified as metagabbro-diorite due to the similarity in chemical and mineralogical compositions between the two classes. In the extreme case, MLC totally misclassifies ophiolitic metagabbro (0% accuracy) (Fig. 2b). Also, MLC and SVM distinguished metagabbro-diorite more efficiently than ANN. Metagabbro-diorite plutons, as well as the granitic rocks, intruded the intermediate-acidic metavolcanics. In addition, several acidic dykes, granitic sheets, and quartz veins cut through different rock types (Zoheir and Weihed 2014), which sometimes affects the overall accuracy. However, considerable matching with the geologic map is observed, especially when SVM is the classifier used. The results revealed the superiority of ALI over S2, ASTER, and L8 regardless of the implemented classifier, as shown in Fig. 2d,e, and described in Table 3. SVM is the most efficient classifier, delivering the highest accuracy percentages in all the applied generalization processes (Fig. 2f).
For the S2 + AST + L8 combination, a significant rise in the OA is observed when using MLC (86.73%) and SVM (87.79%); however, the ANN classifier cannot raise the OA beyond 77.09% (Fig. 2f).

Table 3 Overall accuracies (OA in %) and kappa coefficients (K) for the utilized datasets and classifiers

By enhancing the previous combination with ALI (S2 + AST + L8 + ALI), a robust boost in the OA for the classifiers is observed, giving 79.21% for ANN, 89.40% for MLC, and exceeding 90% for SVM (Fig. 2f, g), confirming the role of ALI in enhancing the classification accuracy using MLAs. Consequently, SVM proved able to classify the rock units more reliably (Fig. 3) than MLC and ANN in all the classification processes performed in this study, as shown in Table 3. ALI also proved its worth in the generalization process, as shown by a decrease in the salt-and-pepper effect that commonly accompanies lithological classifications (compare the metavolcanics, represented in blue, in Fig. 4). These results are confirmed by comparing the results (with slight magnification of the southwestern corner of the study area) produced by Sentinel 2, ASTER, Landsat OLI, and ALI separately, utilizing the three classifiers to produce 12 thematic maps (i.e., 4 thematic maps for each classifier). Figure 4 clearly shows the reduction in error pixels in metavolcanics when ALI is used in the allocation process.

Fig. 3

SVM lithological classification outputs utilizing (a) S2, (b) L8, (c) ASTER, (d) ALI, (e) S2 + AST + L8, and (f) S2 + AST + L8 + ALI

Fig. 4

A comparison between the performance of the classifiers: (a–d) ANN, (e–h) MLC, and (i–l) SVM, from S2, L8, ASTER, and ALI, respectively, with reference to the geologic map (m) drawn after Zoheir et al. (2019)

Discussion

In this study, reasonable results for classifying the rock units of the Um Salatit area are reported using ANN, MLC, and SVM. After a comprehensive survey of widely accepted MLAs used for reliable lithologic mapping over the last decade (Grebby et al. 2011; Amer et al. 2012; Yu et al. 2012b; Mehr et al. 2013; Hadigheh and Ranjbar 2013; He et al. 2015; Jellouli et al. 2016; Othman and Gloaguen 2017; Manap and San 2018; Ge et al. 2018; Bachri et al. 2019; Bentahar and Raji 2021; Karimzadeh and Tangestani 2021; Shebl et al. 2021b), we found that these classifiers are among the most widely recommended. Moreover, these classifiers employ different mechanisms for data generalization and cover the two main categories of parametric and non-parametric algorithms. In agreement with Ge et al. (2018) and Bachri et al. (2019), SVM outperformed ANN and MLC. Furthermore, the SVM classifier has been reported to outperform other machine learning methods (e.g., random forest) in lithological classifications (Kumar et al. 2020). For the used datasets, we noticed several variations in generalization accuracies, which can be explained by the different spectral characteristics of the sensors over rock units that in turn display wide ranges of chemical and mineralogical compositions. For instance, electronic processes produce absorption features in the visible and near-infrared (0.4 to 1.1 μm) due to the presence of transition elements such as Fe²⁺ and Fe³⁺ (Hunt and Ashley 1979). In this study, serpentinites and rocks containing Fe²⁺ and Fe³⁺ can be distinguished due to the spectral advantages in the VNIR ranges of S2, L8, and ASTER. Also, ferric-iron-bearing minerals can be discriminated using the six unique wavelength bands of ALI spanning the visible and near-infrared (Hubbard and Crowley 2005).
Moreover, due to strong hydroxyl group absorption, serpentinites are rarely misclassified (92% as the lowest OA) for all the sensors. Sog, Pog, Wdp, and Nss are also well distinguished by all the data types, with slight differences in the accuracies of classifying these rocks. These differences are attributed to the performance of the classifiers, as well as to mineral absorption features caused by vibrational overtones, electronic transition, charge transfer, and conduction processes (Cloutis 1996) in the reflected solar region covered by the sensors (0.325 to 2.5 μm).

However, even though S2 has the highest spectral characteristics and the largest number of bands compared to the other sensors, S2 cannot correctly classify Ome and Omg. This can be interpreted as overfitting of the MLAs, caused by the high sensitivity offered by the 12 bands of S2 in classifying 9 classes. Overfitting is considered a main defect of MLAs and can be defined as excessive sensitivity to the details and noise of the training data, which negatively impacts model performance on testing data (i.e., low bias and high variance) (Dietterich 1995). This is supported by the fact that poor classification results occur only with the mélange (which has several spectral classes, noise, or fluctuations): the model picked up these mixed signatures, which degraded the generalization process and resulted in lower accuracies on the testing data. Thus, it is recommended to use S2 in mapping many information classes rather than a lower number of spectral classes, which coincides with the results from Ge et al. (2018). On the other hand, underfitting may explain the lower L8 accuracies for Ome and Omg because this wide range of reflected wavelengths (0.325 to 2.5 μm) is covered by only 7 bands. The best fit for the classification of Ome and Omg is achieved by ASTER and ALI, which yield considerably higher average classification accuracies for these two classes. The higher overall accuracy of ALI is attributed to its improved signal-to-noise ratio (SNR), which is considered one of the most significant performance aspects of ALI for increasing data quality (Mendenhall et al. 2000; Lobell and Asner 2003). Also, combining the spectral characteristics of more than one sensor boosts the classification accuracies, as noted for S2 + AST + L8 and S2 + AST + L8 + ALI (Fig. 2d–f). In the latter combination, the classification improvement caused by adding ALI is mainly due to more accurate generalization for Ome and Omg (Fig. 2g).
Consequently, it is recommended to use ALI, especially in identifying mélange rocks or, more generally, when an information class includes many spectral subclasses (which is a common case in several remote sensing applications), as the output thematic maps from ALI and its combination (S2 + AST + L8 + ALI) fit well with the reference geologic map. ALI can therefore be used in several geological classifications (as several spectral classes are always included within an information class because of weathering, vegetation, or other environmental conditions) and possibly in other similar applications.

We strongly recommend increasing the training data size, especially when Sentinel 2 data is implemented in the generalization process. Furthermore, applying regularization methods, k-fold cross-validation, and ensemble learning algorithms (Parsa 2021) is also strongly recommended to reduce overfitting and help achieve optimal prediction. It should be emphasized, however, that the recommended SVM classifier has been reported to outperform other machine learning methods (e.g., random forest) in lithological classifications (Kumar et al. 2020). The current study opens the door for the use of ALI data (which is rarely utilized in lithological generalization) in future lithological allocations, not only with transfer learning methods but also with deep learning algorithms (Shi et al. 2021; Dong et al. 2021; Parsa 2021) that have proven their efficiency in delivering reliable results. Our future research will focus mainly on feeding deep learning algorithms with ALI data (which has proven its potency in the current study) for better lithological and hydrothermal alteration mapping.
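As an illustration of the recommended k-fold cross-validation, the Python sketch below splits a set of training pixels into k folds so that each fold serves once as validation data. It is only a sketch under stated assumptions: the function name, the choice of k = 5, and the random seed are hypothetical, and only the training set size (18,567 pixels) comes from this study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices into k (training, validation) pairs.

    Indices are shuffled once, then cut into k nearly equal folds;
    each fold is used exactly once as the validation set.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        folds.append((training, validation))
        start += size
    return folds


# 18,567 training pixels as in this study; k = 5 is an arbitrary example.
folds = k_fold_indices(18567, 5)
for number, (training, validation) in enumerate(folds, start=1):
    print(f"fold {number}: {len(training)} training, {len(validation)} validation pixels")
```

Averaging the accuracy over the k validation folds gives a less optimistic estimate than a single split, which can help reveal the kind of overfitting discussed above before the final testing pixels are used.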

Conclusions

This study investigated the potential of ALI, S2, ASTER, and L8 data in mapping rock units of Um Salatit-Mueilha area, utilizing ANN, MLC, and SVM. The study concluded the following.

1. SVM outperforms MLC and ANN in delivering an object-based geological map that could be used for future studies over the investigated area.

2. We were able to better discriminate all the lithological classes studied, but ophiolitic metagabbro and ophiolitic mélange always have lower accuracies in the produced thematic maps, especially with S2. This result may be interpreted by model overfitting with the higher spectral characteristics of S2. The best results from ALI are attributed to improved data quality by enhancing the signal-to-noise ratio.

3. Two additional combinations (S2 + ASTER + L8 and S2 + ASTER + L8 + ALI) show higher OA resulting mainly from boosting Ome and Omg accuracies.

4. Increasing the applied datasets from different sensors significantly enhances the predictive mapping.

5. ALI is recommended for usage in lithological classifications, especially when the number of classes is ten or lower. ALI is much better in generalizing an information class containing spectral subclasses than S2, which is recommended for allocating a higher number of classes.