This section describes the Enhance-First-Last Pattern Matching (EFLPM) method and the Enhance-Processor-Aware Pattern Matching (EPAPM) algorithm. EFLPM is an enhancement of FLPM that merges the pre-processing and matching stages of FLPM into a single phase to reduce running time.
The proposed EFLPM algorithm
FLPM is based on comparisons and works in two stages. Its pre-processing stage scans the text to mark the text windows that will be used later in the matching stage. A window is a section of the text with the same length as the pattern; it is extracted during pre-processing when its first and last characters match the first and last characters of the pattern. If both characters match, the window is added to the matrix of windows; if they do not, the pattern is shifted by one character and the check is repeated. This procedure continues until the end of the text. The matching stage then compares each extracted window with the whole pattern. The flowchart of the proposed EFLPM algorithm is shown in Fig. 2.
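For reference, the two-stage FLPM procedure described above might be sketched as follows. This is a minimal illustration, not the original FLPM code: the function name `flpm_search` and the use of a plain Python list for the matrix of windows are assumptions made for this example.

```python
from typing import List


def flpm_search(t: str, p: str) -> List[int]:
    """Illustrative two-stage search in the style of FLPM (hypothetical helper)."""
    n, m = len(t), len(p)
    if m == 0 or m > n:
        return []
    # Stage 1 (pre-processing): keep the start index of every window whose
    # first and last characters agree with those of the pattern.
    windows = [i for i in range(n - m + 1)
               if t[i] == p[0] and t[i + m - 1] == p[m - 1]]
    # Stage 2 (matching): compare each stored window with the whole pattern.
    return [i for i in windows if t[i:i + m] == p]
```

The text is traversed once to build `windows` and the stored windows are visited again in the second stage; this second pass is what EFLPM removes.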
The steps of the proposed algorithm EFLPM can be summarized as follows:
Step 1: Read the DNA sequence dataset as a FASTA file.
Step 2: Initialize the while-loop counter count to 0. The loop continues until count reaches n − m, where n and m are the lengths of the text and the pattern.
Step 3: Check the first and last characters of the current section of the text, i.e., t[count] and t[count + m − 1], and compare them with p[0] and p[m − 1]. The matching process begins only if both comparisons succeed.
Step 4: If either comparison fails, the pattern does not exist in this section of the text, since the first or last character did not match. If both match, the pattern may occur in this section of the text, which is called the window. Therefore, conduct the matching process on this window immediately.
Step 5: If the pattern and the window are identical, a complete occurrence of the pattern has been found in the text. In that case, increase the number of matches by one and return to the loop to process the rest of the text.
Step 6: When the loop ends, return the start indexes of all occurrences of pattern p in text t.
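The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the names `parse_fasta` and `eflpm_search` are hypothetical, and the FASTA handling simply concatenates all non-header lines into one sequence.

```python
def parse_fasta(lines):
    """Step 1 (simplified, hypothetical): join all non-header FASTA lines."""
    return "".join(line.strip() for line in lines
                   if line.strip() and not line.startswith(">"))


def eflpm_search(t, p):
    """Single-pass first/last-character filter followed by immediate matching."""
    n, m = len(t), len(p)
    match_index = []
    if m == 0 or m > n:
        return match_index
    count = 0                                   # Step 2
    while count <= n - m:
        # Step 3: compare the first and last characters of the window.
        if t[count] == p[0] and t[count + m - 1] == p[m - 1]:
            # Steps 4-5: match the whole window right away.
            if t[count:count + m] == p:
                match_index.append(count)
        count += 1
    return match_index                          # Step 6
```

For instance, `eflpm_search(parse_fasta([">seq1", "GGAAG", "CGTAGG"]), "AAGCGTA")` examines each window once and never stores candidate windows for a later pass.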
Figure 3 illustrates the pseudocode of the proposed algorithm EFLPM. Figures 4, 5, 6 and 7 provide a simple example of the pattern matching steps of our EFLPM algorithm.
As an example, Figs. 4, 5, 6 and 7 show a text t stored in list t[0..53] and a pattern p, AAGCGTA, stored in list p[0..6]. The algorithm searches the text for the first and last items of the pattern, i.e., p[0] and p[6]. At the beginning of Algorithm 1, p[0] and p[6] are aligned with t[0] and t[6], respectively. As a result, the window index array contains the start index of the first window, i.e., 0. Following this example, the algorithm identifies eleven other windows. For each window, if the first-and-last check is true, the next step compares the pattern with the window; if they match, the match counter is increased and the first index of the window is stored in match_index; otherwise, the window is skipped. Consequently, the window t[25..31] and the pattern are the same (Tables 1 and 2).
Proposed EPAPM algorithm
This section describes the Enhance-Processor-Aware Pattern Matching (EPAPM) algorithm, which is based on PAPM. PAPM differs from FLPM in how the characters of pattern p are compared with the characters of text t: FLPM compares individual characters, whereas PAPM compares words made up of several characters. PAPM compares two words at the same time by exploiting the word size of the CPU. A b-bit processor has registers that are b bits long, and it can compare the data in two registers during each execution cycle. Since each byte (or character) contains eight bits, the number of bytes the processor can handle at once (the word length) is word_len = b/8. This means that the processor can compare one word with another in a single register comparison. A 64-bit CPU, for example, compares words of eight characters, and a 32-bit CPU compares words of four characters.
We apply the same strategy as for FLPM: to reduce the time cost, the pre-processing and matching stages are merged into a single process that does the same job in less time. The steps of the EPAPM algorithm can be summarized as follows:
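The idea of word-level comparison can be imitated in Python, which has no direct access to CPU registers, by packing word_len bytes into one integer and comparing the integers. The sketch below is only an emulation for illustration; the helper names `word_len_for`, `pack_word` and `words_equal` are hypothetical, and it assumes 8-bit characters and slices that lie fully inside the data.

```python
def word_len_for(b):
    """Characters per word on a b-bit processor: word_len = b / 8."""
    return b // 8


def pack_word(data, start, word_len):
    """Pack word_len bytes of data, beginning at start, into a single integer."""
    return int.from_bytes(data[start:start + word_len], "big")


def words_equal(t, i, p, j, word_len):
    """Compare one word of t with one word of p using one integer comparison."""
    return pack_word(t, i, word_len) == pack_word(p, j, word_len)
```

With `word_len_for(32)` equal to 4, a call such as `words_equal(b"TTAAGCGTA", 2, b"AAGCGTA", 0, 4)` compares the four characters AAGC on both sides in one integer comparison instead of four character comparisons.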
Step 1: Read the DNA sequence dataset as a FASTA file.
Step 2: Initialize the while-loop counter count to 0, set word_len to b/8 (described in Section 4.2), and set k to m mod word_len.
Step 3: Determine the start index for the word comparison at the start of this phase. Setting this start index ensures that the method runs correctly even if the lengths of the pattern and windows are not integer multiples of the word length.
Step 4: The while loop begins with count = 0 and continues until count reaches n − m.
Step 5: Compare the two words, i.e., the first word_len characters of the pattern and of the text at the current position (a word can contain 4 or 8 characters).
Step 6: If the two words match, compare the remaining words of the pattern with the corresponding words of the window.
Step 7: If all words of the pattern match the text, increase the number of matches by one and return to the loop to continue with the rest of the text.
Step 8: When the loop finishes, return the start indexes of all occurrences of pattern p in text t.
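The steps above might be sketched in Python as follows. This is an illustrative sketch, not the authors' pseudocode: the function name `epapm_search` is hypothetical, string slices stand in for register-sized words, and clamping word_len to the pattern length for very short patterns is an assumption added here so that the example is well defined.

```python
def epapm_search(t, p, b=32):
    """Word-wise single-pass search in the style of EPAPM (hypothetical sketch)."""
    n, m = len(t), len(p)
    match_index = []
    if m == 0 or m > n:
        return match_index
    # Step 2: word length and remainder. Assumption: clamp word_len to m so
    # that patterns shorter than one word are still handled.
    word_len = min(b // 8, m)
    k = m % word_len
    first_word = p[:word_len]
    count = 0                                       # Step 4
    while count <= n - m:
        # Step 5: compare the first word of the pattern with the text.
        if t[count:count + word_len] == first_word:
            # Steps 3 and 6: the remaining words start at offset k, so the
            # last word always ends exactly at the end of the pattern.
            if all(t[count + j:count + j + word_len] == p[j:j + word_len]
                   for j in range(k, m, word_len)):
                match_index.append(count)           # Step 7
        count += 1
    return match_index                              # Step 8
```

When m is an exact multiple of word_len, k is 0 and the first word is simply compared again; this keeps the loop uniform at the cost of one redundant comparison per candidate window.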
The flowchart and pseudocode of the EPAPM algorithm are illustrated in Figs. 6 and 7.
Figure 8 gives an example of the EPAPM algorithm running on a 32-bit processor. In this algorithm, the first word of pattern p (its first 4 characters) is searched for in text t. The window_index array then holds the three start indexes of the windows found, i.e., 25, 40 and 47. By comparison, the EFLPM method identifies 12 start indexes as potential windows in this example, so EPAPM decreases the number of recognised windows. Because the remainder of the pattern length divided by the word length is 3 (7 mod 4), the start index for matching is also 3. As a result, the second word of the pattern, which is compared with the second word of each window, is CGTA. After this phase is completed, only one of the windows (i.e., t[25..31]) matches the pattern (Figs. 9 and 10).
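The arithmetic of this example can be checked with a short Python sketch. The helper name `pattern_words` is hypothetical, and the sketch assumes that the pattern is at least one word long.

```python
def pattern_words(p, b):
    """Return k = m mod word_len and the words of p compared by the algorithm."""
    m = len(p)
    word_len = b // 8
    k = m % word_len
    # The first word starts at 0; later words start at k (or at word_len
    # when k is 0) and advance by word_len.
    starts = [0] + list(range(k if k else word_len, m, word_len))
    return k, [p[s:s + word_len] for s in starts]


k, words = pattern_words("AAGCGTA", 32)
print(k, words)
```

For the pattern AAGCGTA on a 32-bit processor this gives k = 3 and the two overlapping words AAGC and CGTA, in agreement with the example above.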