1 Introduction

Petroleum, a volatile liquid formed by the decomposition of living organisms, is one of the most valuable building blocks of the energy market. Many of the vehicles used in daily life run on products derived from petroleum, the two most widely used being diesel and gasoline. Gasoline is a toxic, highly flammable liquid that contains varying amounts of hydrogen and carbon depending on its type and specifications, and it is commonly used as fuel in internal combustion engines. Gasoline and diesel are obtained by the thermal treatment of petroleum to varying degrees. Both fuels are heavily used in vehicles. Therefore, proper storage and transportation of fuel are crucial to ensure the regular operation of vehicles.

Gasoline and diesel products are received from the refinery and stored in the storage tanks of fuel carriers before being transported to fuel stations and made available to consumers. There is a continuous circulation process at the fuel stations, where these products are sold. The fuel provided to vehicles from fuel stations is sourced from the underground tanks within the station's premises. As the fuel level in these tanks decreases, the fuel obtained from the refinery is separately unloaded into the underground gasoline and diesel tanks at the fuel station using tanker trucks. These unloading operations are carried out by gas station personnel. People working continuously are at a higher risk of making mistakes due to factors such as carelessness, sleep deprivation, and fatigue. As a result of an incorrect tanker filling, diesel and gasoline fuels inside the tank can mix and lose their usability. In addition to the wastage of existing fuels, improper cleaning and maintenance of the tanker can lead to financial damages and, in some cases, environmental pollution [1].

Ultrasonic technologies are becoming increasingly prevalent in various fields. One of these technologies involves using ultrasonic velocity measurements to analyze the content of liquid mixtures [2, 3]. The use of ultrasonic waves allows the composition of liquid mixtures to be monitored from outside a pipe or container. This makes it possible to obtain real-time information about the liquid composition in order to ensure product quality and improve the process. The technique provides measurements related to physical properties but cannot identify chemical species in the mixture. Measuring the speed of sound and density in mixtures also plays an important role in understanding molecular interactions [4]. Ultrasonic measurement characterizes a material or liquid by observing how low-energy ultrasonic waves propagate and attenuate in it as a function of frequency. Because the method is non-invasive, it can be applied without damaging the existing system [5, 6]. The ultrasonic technique has been tested for various applications in the oil, chemical, meat, and construction industries [7,8,9].

In this study, an innovative system was developed to prevent possible incorrect filling during transportation and storage of fuels. An experimental setup was designed to analyze the fuel types. First, gasoline, diesel, ethanol, and water were identified using an ultrasonic sensor. Time of flight (TOF) values of the liquids were measured at different temperatures. Then, the dielectric constants of these liquids were measured using a dielectric sensor. The types of liquid were determined using the measured data. The results obtained from the ultrasonic and the dielectric methods were compared, and the advantages and disadvantages of these methods were discussed. The results showed that ultrasonic sensors are an effective tool for accurately determining fuel type. The proposed system provides a solution to prevent the incorrect type of fuel from being transferred to the wrong storage tank during refueling. This allows more efficient and safer fuel storage and transportation. Furthermore, a new approach that uses machine learning techniques to determine fuel type is presented. Fuel types were classified using 33 machine learning algorithms such as support vector machines (SVM), artificial neural networks (ANN), and K-nearest neighbors (KNN). A classification success rate of 94% was achieved with the ANN and quadratic discriminant classifiers. The results promote the use of artificial intelligence methods in ultrasound-based measurement systems. Studies conducted so far have demonstrated that ultrasonic waves are commonly employed in fields such as medicine, food, and environmental health. As far as we know, the TOF method has not yet been used for the detection of oil or its products. The approach combining ultrasonic and artificial intelligence techniques is therefore particularly innovative in distinguishing fuel types.

2 Related Works

2.1 Ultrasonic Methods

Ultrasonic waves are used to determine various parameters such as density of liquids, mixing ratios, speed of sound, pressure, and even wind speed and direction. First, ultrasonic waves have been successfully applied in liquid testing and analysis. The density, composition, and temperature of liquids have been determined using the speed of sound [10]. The successful application of the ultrasonic method has also been reported using sound velocity and attenuation coefficient in the characterization of different types of edible oils [11]. This technology has also been used to measure ammonium nitrate concentration in real-time [12]. In another study, a microcontroller-based ultrasonic system was successfully tested to detect very low component concentrations and determine the amount of ethanol linearly and with high precision over a wide range, without requiring extra equipment [3].

Ultrasonic waves are also used for quality control in industrial processes such as polymerization reactions [13]. Similarly, this approach has been highlighted as a non-invasive technique to control polymerization rates [14].

The ultrasonic waves have been suggested in physical therapy for micro-massage and heating effects. Additionally, an ultrasound-based automatic patient movement tracking device is reported, which detects the patient's movements during treatment and stops the treatment [15].

A sensor that can detect the presence of hydrocarbons has been developed for environmental applications [16]. Additionally, it has been reported that ultrasonic waves can be used to determine the properties of different fuel types and mixtures, with sensitive and accurate results [17]. In another study, the speed of sound was examined in motor fuels, diesel, and some vegetable oils [18]. A phononic crystal (PnC)-based sensor platform has also been introduced to optimize engine performance and obtain real-time gasoline characteristics [19].

The characteristics of five different vegetable oils were assessed using the TOF method and an ultrasonic transducer [20]. Research has been conducted on the speed of sound in heterogeneous mixtures, vegetable oils, and pure water using an ultrasonic sensor. It has been observed that the speed of sound in pure water remains constant at a certain temperature, while in mixed states, there is a linear relationship with the inverse of the ultrasonic speed [21].

A fuzzy logic controller and an ultrasonic sensor were used instead of traditional calibration methods to design an intelligent liquid flow measurement technique. Whereas traditional calibration processes can be time-consuming, an optimized fuzzy logic controller can provide a faster and more effective calibration [22]. A new algorithm based on a modified neural network architecture has been proposed to improve the accuracy of ultrasonic measurement systems in storage tanks with varying depths and environmental conditions. This expands the sensor's usage area and allows it to be used in a wider variety of applications. The system can also adapt to changes in the measurement environment [23]. Finally, ultrasonic waves have also been used to measure wind speed and direction. The reported system includes an ultrasonic transmitter–receiver, time-to-digital converters (TDC), a multiplexer, an FPGA, and a user interface. The FPGA communicates with the TDC and the multiplexer using the SPI protocol and plays a critical role in this system. The proposed system offers a more economical solution by reducing the cost by 75% compared to existing ultrasonic anemometers [24].

Studies conducted so far have demonstrated that ultrasonic waves are commonly employed in fields such as medicine, food, and environmental health. Although the application areas differ, the way the waves are used has changed little, and the TOF parameter has been taken as the basis in many studies. As far as we know, the TOF method has not yet been used for the detection of oil or its products. In this study, a system was developed to determine the fuel type using an ultrasonic sensor, in order to catch errors that may occur during the transfer of petroleum products.

2.2 Dielectric Methods

Dielectric sensors detect environmental changes by measuring changes in electrical properties and are used in many application areas, for example, industrial automation, medicine and biomedicine, the food and beverage industry, environmental monitoring, electronics, and telecommunications. Methods based on electrical conductivity have been developed for determining the purity level of petroleum products and for quality control of fuels. A phase-sensitive capacitance meter with a lock-in amplifier has been employed to determine the total water content in the human body [25]. A new method has been described to detect liquids with different dielectric properties using the electromagnetic band gap (EBG) structure. The designed detector can be used in environments such as laboratories or clean rooms where liquids may be poured into channels [26].

Another area is its use to describe the behavior of materials when exposed to high-frequency or microwave electric fields for dielectric heating applications and to quickly determine their moisture content. The absorption of energy through dielectric heating and its effect on heating materials have long been well known, and many potential applications have been investigated [27, 28]. Various techniques and circuits are used for transmittance measurements in low, medium, and high-frequency ranges. It is important to take into account electrode polarization phenomena at low frequencies and the frequency is affected depending on the nature and conductivity of the measured material [29, 30].

It is possible to use dielectric methods effectively in the food industry, such as in the analysis of the quality of soybean oil. Dielectric constant values can be associated with the degradation state of the oil sample, and by increasing these values, degradation measurement at high temperatures becomes easier [31]. Furthermore, the dielectric constant method has been used to detect the addition of water to raw milk [32].

Studies based on machine learning techniques using dielectric properties have been conducted in the automotive, agriculture, and health sectors. Artificial neural networks (ANNs) are used for monitoring electrical properties of oil samples such as mechanical particles and dielectric properties. The traditional method, the spectral analyzer (SA) has disadvantages such as non-real-time measurements, high cost, and time consumption [33]. Long short-term memory (LSTM), a deep learning model, is employed for predicting the dielectric properties of citrus leaves. In contrast to the Nicolson-Ross-Weir (NRW) algorithm, the study innovatively concentrates on calculating dielectric parameters while taking into account the moisture content (MC) effect [34]. An experimental study is conducted to classify wounds and normal skin with machine learning techniques using a dielectric spectroscopy approach and a network analyzer. It was observed that the dielectric constants of various wound types could be differentiated within the 1–2 GHz frequency range by extracting the optimal frequency. Using supervised learning classification tools, it has been shown that different tissue types can be classified with nearly 100% accuracy on a variety of samples. Additionally, the challenges posed by broadband dielectric sensing systems are highlighted [35].

3 Fuel Type Detection Using Ultrasonic Sensor

Ultrasonic sensors are piezoelectric transducers that can convert an electrical signal into mechanical vibrations and mechanical vibrations into an electrical signal. They can measure distance and detect the presence of an object without the need for physical contact. They achieve this by producing an ultrasonic echo and monitoring that echo. That is, the ultrasonic sensor produces ultrasonic pulses, and these pulses are reflected by an object within the sensor's field of view. The effective range varies from a few centimeters to several meters depending on the sensor and the objects of interest [36, 37].

Sound is a mechanical wave; therefore, a medium is required for a sound wave to propagate. Since the sound wave is influenced by changes in the propagation medium, variations in the wave's travel time through the medium are evaluated and used to analyze the properties of the medium, such as temperature, density, and pressure [3]. A change in density results in a change in the speed of sound [38]. In applications, the TOF is measured rather than the speed itself; any change in speed changes the measured TOF [39].

In this study, a single piezoelectric ceramic disk transducer is employed for both generating and detecting sound waves. This arrangement is commonly known as the pulse-echo principle. The ultrasonic wave generated by the piezoceramic is reflected from a particular surface, and the reflected wave is then detected by the same piezoelectric transducer. When measuring TOF, a start pulse is generated when the ultrasonic wave is emitted, and a stop pulse is generated when an echo above a certain threshold level is detected. The time between the start pulse and the stop pulse is the TOF. With this system, measurements can be conducted with precision at the nanosecond (ns) level [3].
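The threshold rule described above can be illustrated on a sampled waveform. The following is only a minimal sketch, not the hardware comparator used in the study: the synthetic echo, the 20 MHz sample rate, the threshold of 0.5, and the blanking interval are all hypothetical values chosen for illustration.

```python
import math


def synthetic_echo(sample_rate_hz, n_samples, echo_time_s,
                   freq_hz=1.0e6, cycles=5, amplitude=1.0):
    """Build a toy received signal: silence followed by a short sine burst."""
    samples = [0.0] * n_samples
    first = int(round(echo_time_s * sample_rate_hz))
    burst_len = int(round(cycles / freq_hz * sample_rate_hz))
    for k in range(burst_len):
        if first + k < n_samples:
            samples[first + k] = amplitude * math.sin(
                2.0 * math.pi * freq_hz * k / sample_rate_hz)
    return samples


def detect_tof(samples, sample_rate_hz, threshold, blanking_s=0.0):
    """Return the time (s) of the first sample at or above the threshold.

    Samples inside the blanking interval are skipped so that the transmit
    burst itself is not mistaken for the echo. Returns None if no echo
    crosses the threshold.
    """
    start = int(round(blanking_s * sample_rate_hz))
    for i in range(start, len(samples)):
        if abs(samples[i]) >= threshold:
            return i / sample_rate_hz
    return None


# Hypothetical example: a 1 MHz echo arriving 67 microseconds after the start pulse.
SAMPLE_RATE_HZ = 20.0e6
signal = synthetic_echo(SAMPLE_RATE_HZ, n_samples=2000, echo_time_s=67e-6)
tof_s = detect_tof(signal, SAMPLE_RATE_HZ, threshold=0.5, blanking_s=10e-6)
print(f"Detected TOF: {tof_s * 1e6:.2f} us")
```

Because the echo has to rise above the threshold before it is detected, the reported TOF is slightly later than the true arrival time; with a fixed threshold this offset is roughly constant and does not prevent comparison between liquids.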

The TOF technique allows for continuous measurements without the need to interfere with the process liquid component or make any physical contact. Ultrasonic sensors can be installed externally without causing any alterations to the pipe or pipeline. It is crucial to note that during the measurement, significant attenuation in the ultrasonic wave can occur if there are large gas bubbles present in the liquid. The ratio of liquid mixtures is determined based on the acoustic properties of the liquid components, and this ratio varies depending on differences in density and compressibility. Even when two components have the same density, the ratio can be measured due to disparities in compressibility [39].

3.1 Experimental Setup

First, an experimental setup was devised to identify gasoline, diesel, ethanol, and water. Measurements were conducted in a special test chamber in which the temperature could be adjusted between −40 °C and +85 °C to allow testing of samples at the desired temperatures. The diagram explaining the operating principle of the system is shown in Fig. 1.

Fig. 1 The general structure of the system

A new card was designed based on the TDC1000-C2000EVM development card from Texas Instruments. This card can generate ultrasonic waves through the connected piezo sensor and detect their reflections. The selected sensor is a closed-type ultrasonic sensor from Audiowell, model number T/R975-US0014L353-01, with a working frequency of 1 MHz. Frequencies ranging from 1 to 1000 MHz can be utilized for ultrasonic applications; however, a frequency of 1 MHz is sufficient for TOF measurements in liquids [40]. Unlike the TDC1000-TDC7200EVM development card, the designed sensor card has RS-485 and RF (radio frequency) wireless communication capabilities. Because the RS-485 protocol is widely used in industry, this makes the sensor easier to implement in industrial applications. To enable the sensor to send data remotely and broadcast results over radio frequency, the E32-868T30S RF module from E-Byte is used.

The TMS320F28035 is used as the MCU on the card; it is the same MCU used on the development card. The TDC1000 integrated circuit handles communication with the ultrasonic sensor and the temperature sensor, as well as the measurement itself. It communicates with the MCU via SPI, through which the measurement results are obtained. A transceiver is connected to the TDC1000 integrated circuit. A measurement starts with a signal on the TRIGG (trigger) pin: when the TRIGGER signal is generated, the transceiver begins transmitting the ultrasonic wave. As soon as the TDC1000 receives this trigger pulse, it generates a short pulse on the START pin, and the counter starts counting. When the generated signal is reflected and detected by the receiver transducer, a second pulse, known as the STOP pulse, is generated. This STOP signal stops the counter that was started earlier. Thus, the count accumulated between the START and STOP signals gives the TOF [36]. After the sensor and the card determine the fuel type, a mechanical system designed for this purpose approves or stops the transfer between the tank and the tanker, preventing possible incorrect refills.
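The START/STOP counting scheme amounts to multiplying the number of counted clock ticks by the clock period. The sketch below shows only this arithmetic; the 8 MHz counter clock and the tick count are hypothetical values, not settings reported for the designed card.

```python
def tof_from_counts(counts, clock_hz):
    """Convert the ticks counted between START and STOP into seconds."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    if counts < 0:
        raise ValueError("counts must be non-negative")
    return counts / clock_hz


# Hypothetical values: an 8 MHz counter clock and 608 ticks between START and STOP.
COUNTER_CLOCK_HZ = 8_000_000
tof_seconds = tof_from_counts(608, COUNTER_CLOCK_HZ)
print(f"TOF = {tof_seconds * 1e6:.1f} us")
```

The resolution of such a counter is one clock period, so a faster clock (or a dedicated time-to-digital converter) is needed when finer timing is required.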

3.2 Shield Case Design

A stainless steel shield case was specially designed for the measurements, as shown in Fig. 2. An O-ring is used between the sensor and the metal housing to ensure liquid tightness. This isolates the liquid in contact with the piezoelectric sensor from the surrounding mechanism. The part where the piezoelectric sensor is mounted is made of brass. Thanks to the mechanical structure of the sensor, the distance to the reflective surface, where the sound wave sent by the piezoelectric sensor is reflected, can be adjusted. This enables the reflection distance to be determined precisely and changed when needed. In this design, the reflection distance of the sound wave was set to 5 cm.

Fig. 2 Shield case view

3.3 Measurements

To calculate the TOF within different liquids, an ultrasonic sensor has been placed inside the designed sensor slot and connected to the sensor board. Subsequently, all the liquids were consecutively placed into the measurement vessel, and the necessary measurements were taken. In this study, gasoline, diesel, ethanol, and water were used as samples. Figure 3 shows the measurement setup created to determine the type of fuel.

Fig. 3 Experimental setup

To communicate with the sensor, it is necessary to follow a specific message flow and send messages according to the sensor protocol. In these measurements, the module is connected to a temperature sensor, and the temperature is measured first. After obtaining the temperature data, the software calculates their average and then starts collecting data from the ultrasonic sensor. Ultrasonic data are collected in real time, and after a certain period their average is calculated, yielding the temperature and time information for the sample.
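The acquisition sequence described above (average the temperature first, then average the ultrasonic readings) can be sketched as follows. The reader callables and the sample counts are hypothetical stand-ins; the actual sensor protocol and message flow are not reproduced here.

```python
from statistics import mean


def measure_sample(read_temperature, read_tof, n_temperature=10, n_tof=100):
    """Return (mean temperature, mean TOF) for one liquid sample.

    read_temperature and read_tof are callables that each return one reading;
    in a real system they would wrap the sensor communication.
    """
    # Step 1: collect and average the temperature readings.
    temperatures = [read_temperature() for _ in range(n_temperature)]
    mean_temperature = mean(temperatures)
    # Step 2: only then collect and average the ultrasonic TOF readings.
    tofs = [read_tof() for _ in range(n_tof)]
    return mean_temperature, mean(tofs)


# Usage with stand-in readers that replay made-up values.
fake_temperatures = iter([20.0, 20.5, 19.5, 20.0])
fake_tofs = iter([66.8, 67.2, 67.0, 67.0, 66.9, 67.1])
temperature_c, tof_us = measure_sample(
    lambda: next(fake_temperatures),
    lambda: next(fake_tofs),
    n_temperature=4,
    n_tof=6,
)
print(f"{temperature_c:.1f} degC, {tof_us:.1f} us")
```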

All the samples used for measurements were tested in a 500 ml measuring vessel as shown in Fig. 3. Initially, the measurements were conducted at room temperature, and then, to enable measurements at different temperatures, the experimental setup was placed in a special test chamber, where the temperature could be adjusted. This allowed for ultrasonic TOF measurements of gasoline, diesel, ethanol, and water samples to be performed between −20 °C and + 60 °C. Each sample could only be used once. When the same sample was reused, it could lead to chemical degradation at high or low temperatures, resulting in different outcomes during the second measurement. Therefore, tests were completed with only one measurement conducted for each sample.

3.4 Experimental Results

In the experimental setup, measurements were conducted using liquids such as gasoline, diesel, ethanol, and water. In these tests, the reflection time of the ultrasonic wave was measured. Since the distance of the vessel in which the ultrasonic signal propagated remained constant, the TOF value could be calculated even if the liquid inside changed. Different TOF values were obtained in studies conducted with gasoline, diesel, ethanol, and water. Subsequently, measurements for all liquids were taken individually at different temperatures. The liquids were first kept in a cold environment, and then measurements were taken after being kept in a warm environment for a specific period. The TOF values obtained from the measurements are shown in microseconds in Table 1.

Table 1 TOF values of samples [µs]

The temperature–time changes for the tested samples can be seen in Fig. 4. In the experiments, the liquids were first cooled from room temperature to −20 °C, and the changes in TOF were observed during this period. Subsequently, the same liquids were heated up to +60 °C in the same environment, and the TOF value was recorded at this temperature. As shown in Fig. 4, as the temperature increased, the molecules moved farther apart, which increased the TOF. Conversely, as the temperature decreased, a decrease in TOF was observed. The TOF of water increased more slowly, whereas the TOF of gasoline increased more rapidly. Additionally, the TOF of gasoline increased more than that of diesel.

Fig. 4 Measured TOF values of all samples

Following the measurements at high and low temperatures, the TOF signals, specifically the start and stop signals, of the samples were monitored with an oscilloscope at room temperature. In the measurements conducted for gasoline and diesel samples, the start and stop triggering signals were observed as shown in Fig. 5. The TOF value for gasoline was 76 microseconds, while for diesel, it was measured as 65 microseconds.

Fig. 5 Trigger signals for gasoline and diesel

In measurements conducted at room temperature for ethanol and water samples, the start and stop trigger signals were observed on the oscilloscope as shown in Fig. 6. The TOF value for ethanol was 87 microseconds, while the TOF value for water was 67 microseconds.

Fig. 6 Trigger signals for ethanol and water

Tests on samples were conducted at both high and low temperatures. Upon examining the results of all the samples, a noticeable difference was identified. The start and stop trigger signals of the samples were monitored at room temperature using an oscilloscope. As shown in Table 1, it was observed that the results obtained with the oscilloscope were consistent with the values obtained from the system.
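As a rough plausibility check, the room-temperature TOF values read from the oscilloscope can be converted into a speed of sound. The sketch below assumes that the TOF corresponds to a round trip over the 5 cm reflection distance (pulse-echo operation), so the path length is twice that distance; this interpretation and the resulting speeds are illustrative and are not reported in the study.

```python
REFLECTION_DISTANCE_M = 0.05  # reflection distance set in the shield case design

# Room-temperature TOF values (microseconds) read from the oscilloscope.
ROOM_TEMPERATURE_TOF_US = {
    "gasoline": 76.0,
    "diesel": 65.0,
    "ethanol": 87.0,
    "water": 67.0,
}


def speed_of_sound(tof_us, distance_m=REFLECTION_DISTANCE_M):
    """Speed of sound (m/s), assuming the wave travels to the reflector and back."""
    return 2.0 * distance_m / (tof_us * 1e-6)


speeds = {name: speed_of_sound(tof) for name, tof in ROOM_TEMPERATURE_TOF_US.items()}
for name, speed in speeds.items():
    print(f"{name:>8}: {speed:7.1f} m/s")
```

Under this assumption, the value obtained for water is close to the commonly quoted speed of sound in water near room temperature (about 1480 m/s), which suggests that the round-trip interpretation is reasonable.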

4 Determination of Fuel Types by Dielectric Constant

Dielectric materials can be defined as electrically insulating and nonmetallic materials. All materials in nature have dielectric constants (ε) and magnetic permeabilities (μ). These values are fundamental in influencing the propagation of electromagnetic waves [41]. In this study, experiments have been conducted utilizing the dielectric constant to distinguish between diesel, gasoline, ethanol, and water.

The measurements were conducted using a dielectric sensor and its corresponding card. The sensor communicates via the RS485 protocol at 9600 baud and responds to various commands. A PT100 temperature sensor inside the sensor provides temperature information, and the sensor's design allows it to supply the dielectric constant of the substances in its channels. The dielectric sensor is the JCWV110 model from SENSORJC. The samples used in the measurements are gasoline, diesel, ethanol, and water. The same types of samples were used to allow comparison with the ultrasonic measurements. All measurements and results were obtained at both high and low temperatures. The dielectric sensor and the control card are shown in Fig. 7.

Fig. 7 Dielectric sensor and control card

The dielectric sensor was placed in the test vessel where the samples were to be measured, and the dielectric constants of gasoline, diesel, ethanol, and water were then measured sequentially. The measurements were conducted at room temperature, and additional measurements were taken by varying the ambient temperature between −20 °C and +60 °C. During the cold room test, the diesel fuel began to lose its liquid form after a period. When both diesel and gasoline were cooled in the same room, a phase change was observed in diesel, while gasoline remained in a liquid state. Figure 8 illustrates the phase change observed in diesel fuel during the cold room tests.

Fig. 8 Diesel fuel with phase changes in cold room tests

This test was carried out using automotive fuels containing a specific additive. Such additives, for example ethanol, are intended to improve fuel efficiency. For the tests conducted with high additive concentrations, the changes in the dielectric constants of diesel and gasoline with temperature are shown in Fig. 9a. As seen in the graph, the dielectric constants of both fuel types decrease as the temperature drops. Another test was performed with fuels containing low additive concentrations. Measurements were again taken while the ambient temperature was varied between −20 °C and +60 °C, but no significant difference in the dielectric constants was observed. This suggests that it may not be possible to distinguish gasoline from diesel based on the dielectric constant when the additive content is low, as shown in Fig. 9b.

Fig. 9 Change of dielectric coefficient of gasoline and diesel samples

It was observed that adding ethanol as an additive to the fuel changed the dielectric constant. Fuels whose dielectric constants differ can easily be distinguished as two different fuel types by the sensor. However, fuels without additives could not be distinguished reliably. It is more difficult to detect fuels with similar dielectric constants, such as gasoline and diesel, whereas ethanol and water are easier to identify because their dielectric constants differ significantly. Therefore, it was concluded that determining the fuel type by measuring the dielectric constant may not always yield accurate results.

5 Machine Learning-Based Ultrasonic Measurement System

This section discusses the use of machine learning techniques to identify fuel types and offers insights and recommendations. The aim is to encourage the wider use of ultrasonic-based measurement systems in fuel-type determination applications. The samples must be distinguished and classified as gasoline, diesel, ethanol, or water. Classification refers to a predictive modeling problem in which the input data are assigned to one of a set of predefined labeled classes. Machine learning techniques are among the most widely used techniques for classification [42].

In this study, the obtained data are classified using several methods included in the MATLAB Classification Learner app. All classification algorithms are applied with their default settings. Cross-validation is chosen to protect the model against overfitting. The data are divided into folds, and the accuracy on each fold is evaluated. The app's default setting of five-fold cross-validation is used. In the cross-validation setup, the app partitions the data set into a number of folds (or divisions). For each fold, the model is trained on the observations outside that fold, and its performance is then evaluated on the held-out validation data. The procedure is repeated for every fold, and the average validation error is calculated. This method gives a reasonably good estimate of the performance of the trained model and is especially recommended for small data sets, as in the present case. The proposed training scheme is shown in Fig. 10.

Fig. 10 Overall block diagram of classification
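The five-fold procedure can be sketched in a few lines of Python. This is only an illustration of the idea and not the MATLAB Classification Learner implementation: the nearest-class-mean classifier, the function names, and the toy TOF values are all hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and split them into k disjoint folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=5, seed=0):
    """Return the validation error averaged over the k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        # Train only on observations that are NOT in the held-out fold.
        train_x = [x for i, x in enumerate(features) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        model = fit(train_x, train_y)
        # Evaluate on the held-out fold.
        wrong = sum(predict(model, features[i]) != labels[i] for i in held_out)
        fold_errors.append(wrong / len(held_out))
    return sum(fold_errors) / len(fold_errors)


def fit_nearest_mean(xs, ys):
    """Toy classifier: remember the mean feature value of each class."""
    sums = {}
    for x, y in zip(xs, ys):
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict_nearest_mean(model, x):
    """Assign the class whose mean is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


# Made-up TOF values (microseconds) for two well-separated classes.
toy_tof = [75.5, 76.0, 76.5, 75.8, 76.2, 86.5, 87.0, 87.5, 86.8, 87.2]
toy_labels = ["gasoline"] * 5 + ["ethanol"] * 5
average_error = cross_validate(toy_tof, toy_labels, fit_nearest_mean,
                               predict_nearest_mean, k=5)
print(f"Average validation error: {average_error:.2f}")
```

Every observation is used for validation exactly once, which is why the averaged error is a useful estimate even when the data set is small.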

MATLAB 2023a incorporates a comprehensive set of 33 classification algorithms. As shown in Table 2, these methods encompass various versions of decision trees, discriminant analysis, support vector machines, logistic regression, KNN, naive Bayes, diverse ensemble methods, ANN, kernel machines, and additional techniques such as linear SVM and logistic regression. The following provides a concise overview of the classification models employed in this article.

Support vector machines (SVM), one of the popular supervised learning methods, are typically utilized in classification problems. They identify a decision boundary between classes that lies as far as possible from the nearest points of each class. SVM operates within vector spaces, drawing a line (in general, a hyperplane) that separates the points of the classes while maximizing its distance to the closest points of both classes. It is particularly useful for complex yet small and medium-sized datasets. In a standard SVM, the hyperplane can be expressed as follows [43].

$$Hp\left(x\right)={w}^{T}x+b=0,$$
(1)

where w is the weight vector, x is the input, and b is the bias value. In nonlinear classification, the hyperplane becomes:

$$Hp\left(x\right)={w}^{T}\phi (x)+b=0,$$
(2)

where \(\phi (x)\) is a nonlinear transformation of the input vector x.

The optimal weight vector follows:

$$w=\sum_{i=1}^{N}{\alpha }_{i}{y}_{i}\phi ({x}_{i}),$$
(3)

where \({\alpha }_{i}\) are the Lagrange multipliers, and \({x}_{i}\) and \({y}_{i}\) are the training inputs and their class labels. The optimal decision, y, is:

$$y=\sum_{i=1}^{N}{\alpha }_{i}{y}_{i}{\phi ({x}_{i})}^{T}\phi \left(x\right)+b$$
(4)

If sign(y) is −1, the label of the input is class −1; otherwise, it is class 1.
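The decision rule can be sketched for the simplest case, in which the nonlinear transformation is the identity so that the kernel reduces to a dot product. The support vectors, multipliers, and bias below are hypothetical values chosen by hand; no training is performed.

```python
def dot(a, b):
    """Dot product of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def primal_decision(weights, bias, x):
    """Hyperplane form of the decision value: w^T x + b."""
    return dot(weights, x) + bias


def dual_decision(alphas, labels, support_vectors, bias, x, kernel=dot):
    """Kernel form of the decision value: sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * kernel(sv, x)
               for a, y, sv in zip(alphas, labels, support_vectors)) + bias


def predict_label(decision_value):
    """Class -1 if the decision value is negative, otherwise class 1."""
    return -1 if decision_value < 0 else 1


# Hypothetical two-class example with hand-picked values.
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
labels = [1, -1]
alphas = [1.0, 1.0]
bias = 0.0

# Weight vector built from the support vectors: w = sum_i alpha_i * y_i * x_i.
weights = [sum(a * y * sv[d] for a, y, sv in zip(alphas, labels, support_vectors))
           for d in range(2)]

query = [2.0, 0.5]
print(predict_label(primal_decision(weights, bias, query)),
      predict_label(dual_decision(alphas, labels, support_vectors, bias, query)))
```

Both forms give the same decision value here; the kernel form is the one that generalizes to nonlinear kernels, where the transformed vectors never need to be computed explicitly.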

The popularity of SVMs is likely due to their amenability to theoretical analysis. Their flexibility allows the model to be applied to a wide variety of tasks, including structured prediction problems [44]. In this study, six types of SVM models are used. The linear SVM model uses an automatic linear kernel function, a box constraint level of 1, the one-vs-one multiclass method, and standardized data. The method utilizes the same configuration as logistic regression, with automatic settings for the solver, regularization, and regularization strength (lambda). Additionally, the beta tolerance is set to 0.0001, and the multiclass coding follows a one-vs-one approach. The quadratic SVM model has the same properties as the linear one, except that it uses an automatic quadratic kernel function. The cubic SVM model has the same properties as the previous two models, except that it uses an automatic cubic kernel function. The fine Gaussian SVM has a box constraint level of 1, the one-vs-one multiclass method, and standardized data, as in the previous SVM models, but it uses a Gaussian kernel function with a kernel scale of 0.35. The medium Gaussian and coarse Gaussian models have the same properties as the fine Gaussian model, except that their kernel scales are 1.4 and 5.7, respectively.

Artificial neural networks (ANN) are intended to mimic the behavior of biological neural systems. ANNs are inspired by the central nervous systems of animals and are capable of machine learning as well as pattern recognition. A directed network contains nodes representing neurons, interconnected like their biological equivalents, with the connections representing dendrites and synapses. The weight of the interconnections somewhat represents the activation function [45]. Biological neurons were first expressed mathematically in the form of a logic threshold unit by Warren McCulloch and Walter Pitts in 1943. These models, called artificial neurons, receive one or more inputs and generate outputs by passing the sum of the inputs through an activation function. A neural network model consisting of a single artificial neuron was described as a perceptron by Frank Rosenblatt in 1958. Figure 11 shows a perceptron model [46].

Fig. 11 A perceptron model

The following equations describe the functioning of the perceptron model of a neuron.

$$\text{activation: } u=\sum_{i=1}^{n} w_{i} x_{i}-\theta$$
(5)
$$\text{output: } s=f(u)$$
(6)
$$\text{error: } \delta =b-s$$
(7)
$$\text{weight update: } \frac{d w_{i}}{dt}=\eta\,\delta\,x_{i},$$
(8)

where x_i are the inputs, w_i are the corresponding weights, θ is the bias term, the output signal s is a nonlinear function f of the activation value u, b is the target output, and η is the learning parameter.
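The perceptron equations can be traced in a short sketch. It uses a discrete-time form of Eq. 8 (w_i ← w_i + η δ x_i), a step function as the nonlinear function f, and an update of θ that treats the bias as a weight on a constant input of −1; these three choices are assumptions made for illustration, and the logical AND data set is a toy example unrelated to the measurements.

```python
def step(u):
    """Threshold activation f(u): the neuron fires only for positive u."""
    return 1 if u > 0 else 0

def train_perceptron(samples, eta=0.5, epochs=20):
    """Train a single neuron with a discrete-time form of Eqs. 5-8."""
    n_inputs = len(samples[0][0])
    w = [0.0] * n_inputs
    theta = 0.0
    for _ in range(epochs):
        for x, b in samples:
            u = sum(wi * xi for wi, xi in zip(w, x)) - theta      # Eq. 5
            s = step(u)                                            # Eq. 6
            delta = b - s                                          # Eq. 7
            w = [wi + eta * delta * xi for wi, xi in zip(w, x)]    # Eq. 8
            # Assumption: theta behaves like a weight on a constant input -1.
            theta -= eta * delta
    return w, theta

def predict(w, theta, x):
    """Apply Eqs. 5 and 6 with fixed weights."""
    return step(sum(wi * xi for wi, xi in zip(w, x)) - theta)

# Toy, linearly separable problem: logical AND of two binary inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, theta = train_perceptron(and_gate)
print([predict(w, theta, x) for x, _ in and_gate])
```

Because AND is linearly separable, a single neuron is sufficient here; data that are not linearly separable require the multi-layer networks described next.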

Artificial neural network models are obtained by combining multiple artificial neurons into layers with one-way (feed-forward) connections between the layers. These models were created because nonlinear data cannot be classified by a single-layer perceptron. The number of iterations of the learning algorithm decreases as the number of neurons in the hidden layer increases; however, more neurons require more processing [47, 48]. In this study, five different ANN models are used. The Narrow neural network model has one fully connected layer of size 10, the ReLU (Rectified Linear Unit) activation function, an iteration limit of 1000, a regularization strength (lambda) of 0, and standardized data. The Medium neural network model has a first layer size of 25, and the Wide neural network model has a first layer size of 100. The Bi-layered neural network has two layers of size 10, and the Tri-layered neural network has three layers of size 10 [49].
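The remark that more neurons require more processing can be made concrete by counting the trainable parameters of the five architectures. The sketch below assumes two inputs (TOF and temperature), four output classes (ethanol, gasoline, water, diesel), and one bias per neuron including the output layer; the exact output-layer structure of the toolbox is not stated in the text, so the counts are indicative only.

```python
def count_parameters(n_inputs, hidden_sizes, n_classes):
    """Count weights and biases of a fully connected feed-forward network."""
    sizes = [n_inputs, *hidden_sizes, n_classes]
    # Each layer has fan_in * fan_out weights plus one bias per output neuron.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))

# Hidden-layer sizes as listed in the text for the five ANN models.
architectures = {
    "Narrow": [10],
    "Medium": [25],
    "Wide": [100],
    "Bi-layered": [10, 10],
    "Tri-layered": [10, 10, 10],
}

for name, hidden in architectures.items():
    print(f"{name:<12} {count_parameters(2, hidden, 4)} parameters")
```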

One of the well-known classification algorithms is K-nearest neighbours (KNN), a non-parametric supervised learning algorithm used in both regression and classification. It uses the proximity of data points to decide the class of a new data point. Its primary objective is to predict the correct class for test data by measuring the distance between the test point and all training points. The algorithm identifies the K training points closest to the test point and then computes, among these K points, the probability of the test point belonging to each class. The class with the highest probability is assigned to the test point. In regression, the predicted value is the mean of the K selected training points [50]. In this study, six types of KNN algorithms are used for classification. The first, Fine KNN, uses 1 neighbour, the Euclidean distance metric, and equal distance weighting with standardized data. Medium KNN is the same except that it uses 10 neighbours, and Coarse KNN is the same except that it uses 100 neighbours. Cosine KNN differs from the previous models in that it uses 10 neighbours with the cosine distance metric. Cubic KNN differs from the previous models in that it uses the Minkowski (cubic) distance metric. Lastly, Weighted KNN uses the Euclidean distance metric with squared inverse distance weighting.
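The KNN variants differ only in the number of neighbours, the distance metric, and the vote weighting, which the following minimal sketch makes explicit. The function names and the two-cluster toy data set are hypothetical, and the features are assumed to be already standardized.

```python
import math
from collections import defaultdict

def minkowski(a, b, p=2):
    """Minkowski distance; p=2 is Euclidean, p=3 is the 'cubic' metric."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def cubic(a, b):
    return minkowski(a, b, p=3)

def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def knn_predict(train, query, k, distance=minkowski, weighted=False):
    """train is a list of (features, label); returns the winning label."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        d = distance(features, query)
        # Squared inverse distance weighting, or one vote per neighbour.
        votes[label] += 1.0 / (d * d + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)

# Hypothetical toy data with two well-separated classes.
train = [((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((0.1, 0.3), "A"),
         ((2.0, 2.1), "B"), ((2.2, 1.9), "B"), ((1.9, 2.2), "B")]

print(knn_predict(train, (0.3, 0.2), k=1))                 # Fine KNN style
print(knn_predict(train, (0.3, 0.2), k=5, distance=cubic)) # Cubic KNN style
print(knn_predict(train, (1.8, 2.0), k=6, weighted=True))  # Weighted KNN style
```

With equal weighting and a large K, the majority class among the neighbours dominates; squared inverse distance weighting lets the closest points outweigh a larger number of distant ones.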

Ensemble methods, which find applications in statistics and machine learning, combine multiple learning algorithms to obtain better performance than any of the constituent learning algorithms alone. This paper incorporates five different ensemble methods for classification. The Boosted Trees model employs adaptive boosting (AdaBoost), which can be used in a variety of regression and classification tasks. The Bagged Trees method uses bagging, also known as bootstrap aggregation, to decrease the variance of a statistical learning method. The RUSBoosted Trees method applies the RUSBoost algorithm, which is specifically designed to tackle class imbalance in datasets with discrete class labels. This approach combines random under-sampling (RUS) with the standard AdaBoost boosting procedure to improve the modeling of the minority class by eliminating majority class samples.
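The random under-sampling step mentioned above can be sketched as follows. This shows only the sampling idea, in which majority classes are randomly reduced to the size of the smallest class; RUSBoost applies such sampling within the boosting rounds, which is not reproduced here. The function name and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict

def random_undersample(samples, seed=0):
    """samples is a list of (features, label); returns a class-balanced subset."""
    by_class = defaultdict(list)
    for item in samples:
        by_class[item[1]].append(item)
    # Every class is reduced to the size of the smallest (minority) class.
    n_min = min(len(items) for items in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for items in by_class.values():
        balanced.extend(rng.sample(items, n_min))
    return balanced

# Hypothetical imbalanced data: 10 majority samples and 3 minority samples.
samples = [((i,), "majority") for i in range(10)] + \
          [((i,), "minority") for i in range(3)]

balanced = random_undersample(samples)
print(Counter(label for _, label in balanced))
```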

Kernel machines, which include SVMs, are a class of algorithms used in pattern recognition. In these methods, linear classifiers are employed to solve nonlinear problems. The aim of pattern analysis is to find and study general types of relations in data, such as principal components, classifications, clusters, correlations, and rankings. In contrast to other methods, which require an explicit feature map, kernel methods need only a user-specified kernel, that is, a similarity function over pairs of data points. Although the feature map of a kernel machine may be infinite-dimensional, only a finite-dimensional matrix is required as user input. In this study, the SVM Kernel method uses automatic settings for the number of expansion dimensions, the regularization strength (lambda), and the kernel scale. Additionally, it standardizes the data, has an iteration limit of 1000, and uses one-vs-one multiclass coding.

Decision tree learning is a supervised learning approach used in machine learning, statistics, and data mining. A classification or regression tree is used as a predictive model to draw conclusions about a set of observations. Tree models in which the target variable takes a discrete set of values are called classification trees. In these tree structures, leaves represent class labels, and branches represent combinations of features that lead to these class labels. Decision trees are among the most popular machine learning algorithms because of their clarity and simplicity [51].

Linear discriminant analysis (LDA), a generalization of Fisher's linear discriminant, is used in statistics and related fields to find a linear combination of features that separates classes. LDA is closely related to principal component analysis (PCA) and factor analysis, as all of them seek linear combinations of variables that best explain the data. LDA explicitly attempts to model the difference between data classes. In contrast, PCA does not take any class distinctions into account, and factor analysis constructs feature combinations based on differences rather than similarities [52].

As a generative model, quadratic discriminant analysis (QDA) is closely related to LDA and likewise presumes a Gaussian (normal) distribution for each class, but it does not require the classes to share a common covariance matrix. As with the LDA model, the QDA model used here has a full covariance structure, and the hyperparameter options and PCA are disabled.

A statistical analysis method, logistic regression aims to predict binary outcomes based on previously observed data. By examining the relationship between one or more existing independent variables, the logistic regression model predicts a dependent variable. Binary outcomes allow a simple decision to be made between two choices. The model can consider more than one input criterion. Using historical data on previous outcomes with the same input criteria, it scores new cases according to their likelihood of falling into specific categories. The logistic regression model uses automatic settings for the solver, the regularization, and the regularization strength (lambda). In addition, the regression beta tolerance is 0.0001 and the multiclass coding is one-vs-one.
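Since logistic regression is a binary method, the one-vs-one coding mentioned above is what allows it to separate four liquids. A minimal sketch of the idea follows: one binary learner is used for every pair of classes, and the class that wins the most pairwise decisions is selected. The simple majority vote, the placeholder binary learner, and the one-dimensional class centres are all assumptions for illustration; the toolbox used in the study may decode the pairwise results differently.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["ethanol", "gasoline", "water", "diesel"]

# Placeholder 1-D class centres with no physical meaning (hypothetical).
CENTRES = {"ethanol": 0.0, "gasoline": 1.0, "water": 2.0, "diesel": 3.0}

def binary_decision(class_a, class_b, x):
    """Hypothetical binary learner: choose whichever centre is closer to x."""
    if abs(x - CENTRES[class_a]) <= abs(x - CENTRES[class_b]):
        return class_a
    return class_b

def one_vs_one_predict(classes, x):
    """Return the majority-vote class and the number of binary learners used."""
    pairs = list(combinations(classes, 2))
    votes = Counter(binary_decision(a, b, x) for a, b in pairs)
    return votes.most_common(1)[0][0], len(pairs)

label, n_learners = one_vs_one_predict(CLASSES, 2.2)
print(label, n_learners)
```

For K classes this scheme needs K(K − 1)/2 binary learners, which is six for the four liquids considered here.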

A well-known probabilistic approach, the Naive Bayes classifier is used in pattern recognition problems under the assumption that each descriptive attribute or parameter used in the model is statistically independent of the others. This assumption seems quite restrictive at first glance. Although it limits the range of application of the Naive Bayes classifier, the classifier still gives results comparable to those of more complex methods such as artificial neural networks when the statistical independence condition is relaxed.

5.1 Classification Results

The measured and collected data are then classified. No preprocessing is applied to the data other than rearrangement. The collected data do not include TOF values for water at negative temperatures, as water turns to ice below 0 °C. For compatibility, the TOF value of water at temperatures of 0 °C and below is taken as 65.2158 ms. Consequently, a new matrix table is constructed for the classification, with two inputs, namely TOF and temperature, and one output, the class of the liquid to be determined. Table 1 shows the measured data. The constructed matrix table is passed to the MATLAB Classification Learner app, and all classification algorithms are applied using default settings. In total, 33 classification algorithms are available in this study.
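The construction of the input matrix can be sketched as follows. The constant is the value stated above for water at 0 °C and below, in the unit reported in the text; the function name, the row layout, and the other TOF numbers are hypothetical placeholders, not measurements from Table 1.

```python
# Value stated in the text for water at 0 °C and below (unit as reported there).
WATER_FROZEN_TOF = 65.2158

def build_rows(measurements):
    """measurements: list of (liquid, temperature_c, tof or None).

    Returns rows of (tof, temperature_c, liquid): two inputs and one class label.
    """
    rows = []
    for liquid, temperature_c, tof in measurements:
        if liquid == "water" and temperature_c <= 0:
            # Water is frozen here, so the fixed substitute value is used.
            tof = WATER_FROZEN_TOF
        if tof is None:
            raise ValueError(f"missing TOF for {liquid} at {temperature_c} °C")
        rows.append((tof, temperature_c, liquid))
    return rows

# Placeholder measurements for illustration only (not data from the study).
example = [("water", -5, None), ("water", 20, 70.0), ("diesel", -5, 80.0)]
rows = build_rows(example)
print(rows)
```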

The hyperparameters used for training are shown in Appendix 1. A summary of the results is shown in Table 2. In the table, accuracy represents the percentage of correctly classified observations; the higher the value, the better the model. Total cost represents the overall cost of misclassifications. For a given high accuracy, a smaller total cost indicates a better model. Prediction speed is the estimated speed for new data, based on the prediction times for the validation data sets. Training time is the time spent training the model.
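The accuracy and total cost metrics can be illustrated with a short sketch. It assumes a unit cost for every misclassification unless a cost table is supplied, in which case the total cost is the weighted sum of the errors; whether the study used unit costs is not stated in the text. The label lists are hypothetical and are not results from Table 2.

```python
def accuracy_and_total_cost(true_labels, predicted_labels, cost=None):
    """Return (accuracy in percent, total misclassification cost).

    cost optionally maps (true, predicted) pairs to a penalty; any pair
    that is not listed, or every pair when cost is None, is charged 1.
    """
    correct = 0
    total_cost = 0.0
    for t, p in zip(true_labels, predicted_labels):
        if t == p:
            correct += 1
        else:
            total_cost += 1.0 if cost is None else cost.get((t, p), 1.0)
    accuracy = 100.0 * correct / len(true_labels)
    return accuracy, total_cost

# Hypothetical labels for eight observations with one water/diesel confusion.
true_labels = ["water", "diesel", "gasoline", "ethanol",
               "water", "diesel", "gasoline", "ethanol"]
predicted = ["water", "diesel", "gasoline", "ethanol",
             "diesel", "diesel", "gasoline", "ethanol"]

print(accuracy_and_total_cost(true_labels, predicted))
```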

Table 2 Summary of the Classification results (SVM Support Vector Machines, KNN K-Nearest Neighbors)

As can be seen in Table 2, in terms of training time, Ensemble Discriminant has the highest value with 17.28 s, while Discriminant Linear has the lowest value with 1.58 s. In terms of prediction speed, Narrow NN has the lowest value with 413.92 obs/sec, while Coarse Tree has the highest value with 8705.67 obs/sec. The most successful algorithms have an accuracy of around 94%. An accuracy above 85% is chosen, arbitrarily, as the threshold for a successful algorithm. The algorithms that meet this threshold are: Discriminant Quadratic, Quadratic SVM, Cubic SVM, Fine Gaussian SVM, Narrow NN, Medium NN, Wide NN, Bi-layered NN, and Tri-layered NN. Neural network classifiers appear to be the clear winners regardless of the subtype of the algorithm. The three most successful algorithms, each with an accuracy above 90%, are shown in detail in Table 3.

Table 3 Detailed presentation of most successful algorithms

According to Tables 2 and 3, the most problematic classifications occur between water and diesel. This is because the TOF curves of diesel and water cross over at around 35 to 40 °C, where the TOF values are almost identical. Figure 12 shows a scatter plot of the data classified by the Medium Neural Network, including the crossover region; misclassified points are marked with an 'x'. In the plot, the vertical axis is TOF in units of 10⁻⁵ s and the horizontal axis is temperature in °C. It can be concluded that ANN classifiers give very successful results in classifying the data.

Fig. 12 Scatter plot of the data classified using the Medium Neural Network algorithm. Marker symbols distinguish ethanol, gasoline, water, and diesel, and indicate correct classifications and misclassifications

6 Discussion

Tests were conducted on samples at both high and low temperatures. When the results of all samples were examined, noticeable differences were observed. In measurements with the ultrasonic sensor, as the temperature increased, molecules moved away from each other, causing an increase in the TOF value. Conversely, as the temperature decreased, a decrease in TOF was observed. The TOF value of water increased more slowly with temperature, whereas the TOF value of gasoline increased more rapidly. Additionally, the TOF of gasoline increased more than that of diesel. However, it is important to note that the temperature must be measured accurately. Otherwise, incorrect results will be obtained, since the TOF values of gasoline and diesel samples are equal at −10 °C and +50 °C. Furthermore, the start and stop trigger signals of the samples were monitored at room temperature using an oscilloscope. The results obtained with the oscilloscope were consistent with the values obtained from the experimental setup.

Additives such as ethanol are substances intended to enhance fuel efficiency. Therefore, tests were conducted with varying concentrations of ethanol additive to examine the temperature-dependent changes in the dielectric constants of diesel and gasoline fuels. In both types of fuel, the dielectric constant decreased as the temperature decreased. Additionally, adding ethanol to gasoline and diesel altered the dielectric constant. Fuels with a high concentration of additives can easily be distinguished as two different fuel types by dielectric sensors. However, fuels without additives could not be reliably distinguished by the sensors. Detecting fuels without additives is more challenging, especially for fuels with similar dielectric constants such as gasoline and diesel. Therefore, determining fuel type from dielectric constant measurements does not always yield accurate results.

Additionally, classification was performed using 33 machine learning algorithms to identify and categorize fuel types. The liquids, namely gasoline, diesel, ethanol, and water, were successfully classified using the Quadratic Discriminant, Quadratic SVM, and Medium Neural Network techniques. The ANN classifier with a first layer size of 25 and the Quadratic Discriminant give very good results, with a 94% success rate in classifying the data. The ANN classifier with a first layer size of 25 (ANN Medium) strikes a balance between the characteristics of the dataset and the complexity of the problem. A larger layer size may increase the risk of overfitting, while a smaller size may restrict the model's ability to learn. The model with a first layer size of 25 was found to provide the best performance with strong generalization ability.

7 Conclusions

An innovative approach based on the TOF technique and machine learning has been introduced to distinguish various fuel types. The study set out to distinguish commonly used gasoline and diesel fuels, as well as ethanol and water samples. All samples were analyzed using ultrasonic methods and dielectric constants. Additionally, the fuel types were successfully classified with an accuracy of 94% using ANN and Quadratic Discriminant classifiers.

The liquids were successfully distinguished using ultrasonic sensors. However, in measurements with dielectric sensors, the influence of additives was observed, and it was determined that this method may not always yield accurate results. Compared to the dielectric method, the ultrasonic method produced more successful results. The ultrasonic method therefore offers an advantage in preventing incorrect fuel filling during fuel transfers. Furthermore, machine learning classification can be expected to contribute to the increased adoption of ultrasonic-based measurement systems in fuel type identification applications.

Further study is proposed to optimize the hyperparameters of the algorithms identified as the best performers. Future work can focus in particular on the algorithms that showed promising results.