A Chaotic Antlion Optimization Algorithm for Text Feature Selection

Text classification is one of the key technologies in text data mining. Feature selection, a key step in text classification tasks, is used to process high-dimensional feature sets and directly affects the final classification performance. At present, the most widely used text feature selection methods calculate the importance of each feature for classification through an evaluation function, and then select, in order of importance, the features that meet a quantitative requirement to form the feature subset. However, because this approach ignores the correlation between features and the effect of their combination, it may not deliver the best classification performance. Therefore, this paper proposes a chaotic antlion feature selection algorithm (CAFSA) to solve this problem. The main contributions include: (1) a chaotic antlion algorithm (CAA) based on a quasi-opposition learning mechanism and a chaos strategy, compared with four other algorithms on 11 benchmark functions, where it achieves a higher convergence speed and the highest optimization accuracy; (2) a study of the performance of CAFSA, which uses CAA for feature selection, with different learning models, including decision tree, naive Bayes, and SVM classifiers; (3) a comparison of CAFSA with eight other feature selection methods on three Chinese datasets. The experimental results show that CAFSA reduces the number of features while improving the classification accuracy of the classifier, achieving a better classification effect than the other feature selection methods.


Introduction
Due to the continuous advancement of Internet technology and the increasing scale of scientific computing, the amount of electronic text data has increased sharply, and much of this text data is neither completely unstructured nor completely structured. The dimensionality of a text vector can be as high as tens of thousands of dimensions, so in data mining and data retrieval, feasibility is often limited by excessive computation or high cost. As a classic task in text data mining, text classification has attracted the attention of numerous researchers. It is frequently applied in real-life scenarios such as e-commerce reviews, spam filtering, sentiment analysis, intent recognition, and personalized recommendation [1]. In text classification, a document usually contains thousands of unique words, and each word is regarded as a feature. But many features may carry little information, be noisy, or be unrelated to the category labels. This can reduce the performance and accuracy of the downstream classification models and increase the computational complexity [2]. Therefore, one of the main problems faced by text classification is dealing with high-dimensional feature sets.
Feature selection is a method of extracting important or superior features from a high-dimensional feature set. After data preprocessing, feature selection is used to control the dimensionality of the feature space and eliminate redundant features, thereby improving the efficiency and accuracy of the classifier while reducing computational complexity [3]. Feature selection is a combinatorial problem, and bionic algorithms are a classic means of solving optimization and combinatorial problems. Moreover, before performing feature-related processing, it is generally necessary to binarize features, in fields such as text classification [4], image recognition [5], and cancer prediction [6]. In 2015, the Australian professor Mirjalili proposed the ant lion optimizer (ALO) [7]. Because the algorithm has few adjustable parameters and is simple to implement, some scholars have successfully applied it to feature selection; however, it is rarely used for text feature selection. Therefore, in this study, we propose a chaotic antlion optimization algorithm for text feature selection.
The main contributions of this paper are: (1) first, a chaotic antlion algorithm (CAA) based on a quasi-opposite learning mechanism and a chaos strategy, whose optimization performance is compared with four algorithms on eleven benchmark functions; (2) a study of the performance of CAFSA, which uses CAA for feature selection, with different learning models, including decision tree, naive Bayes, and SVM classifiers; (3) a comparison of CAFSA with eight other feature selection methods on three Chinese datasets. Experimental results show that CAFSA is better than the other eight algorithms.
The remainder of this article is arranged as follows: Sect. 2 summarizes related work. Section 3 briefly introduces the native antlion optimization algorithm, and Sect. 4 introduces the improvement strategies used in CAA, as well as the algorithm's procedure and performance test experiments. Section 5 introduces the process of applying CAA within CAFSA to feature selection for Chinese text classification. Section 6 compares the CAFSA algorithm with other feature selection algorithms on three Chinese data sets using different classifiers. Section 7 summarizes the work of this article and gives future prospects.

Related Work
Feature selection techniques are usually divided into three categories: embedded methods, filter methods, and wrapper methods [8]. Among them, the embedded method is a combination of the latter two. Although filter methods based on variable correlation have good generalization ability and low computational complexity, they may not be able to extract the best subset [9]; examples include MICIMR [10], MIM [11], MIFS [12], MRMR [13], and TCIC_FS [14]. Compared with filter methods, wrapper algorithms are more accurate because they consider the relationships between features [15]. Wrapper feature selection methods using a biomimetic algorithm show great promise for efficiently handling high-dimensional feature vectors [16].
In addition to the classic particle swarm optimization [17][18][19] and genetic algorithm [20], some other novel bionic algorithms have been successfully applied to text feature selection, for example, the cat swarm optimization algorithm [21], the artificial fish swarm algorithm [22], the Jaya optimization algorithm [23], the firefly algorithm [24], the grey wolf optimization algorithm [25], and the ant colony algorithm [26]. To solve the problem of Arabic text classification, Chantar et al. proposed an enhanced binary grey wolf optimizer (GWO) as a feature selection method [27]. The method is combined with the SVM classifier, and experiments are performed on three Arabic data sets; the results show that the scheme has superior performance on the Arabic document classification problem. Kyaw et al. use nature-inspired heuristic methods (the whale and grey wolf optimization algorithms) and the chi-square algorithm for feature selection, which improves the classification accuracy of the naive Bayes classifier on the 20 Newsgroups data set [28]. To sum up, there has been a great deal of research on text feature selection algorithms for English and Arabic, but research is relatively lacking for Chinese, especially on bionic feature selection algorithms.
At present, some scholars have successfully applied the antlion algorithm to the field of feature selection. Mingwei Wang et al. used a wavelet support vector machine (WSVM) to enhance the stability of the classification results, and used Lévy flight to help the swarm intelligence algorithm jump out of local optima [29]. Li Mengmeng et al. proposed a stable ant-antlion optimiser (SALO), which combines ant colony optimisation and ALO, and compared it with several relatively stable filter methods [30]. Zawbaa et al. proposed a hybrid bio-inspired heuristic approach, combining ALO with grey wolf optimization (GWO), and compared it with GA and PSO [31]. In conclusion, these improved antlion algorithms either combine ALO with other bionic algorithms or refine the native ALO, and then compare against genetic algorithms, particle swarm optimization, and classical filter feature selection algorithms on different datasets, which shows the superiority of ALO in feature selection. However, the original antlion optimization algorithm has disadvantages such as slow convergence, low accuracy on multimodal functions, and a tendency to get stuck in local optima. Therefore, scholars have made various refinements in response to these shortcomings. Yao et al. [32] used the Lévy flight mode and combined the 1/5 principle from closed-loop control theory to dynamically adjust the shrinkage rate of the trap, making the search range of the algorithm more comprehensive and the overall optimization more efficient. Saha et al. [33] used the quasi-opposite learning mechanism to initialize the population and the CLS strategy to improve the ant position update formula. Kilic et al. [34] replaced the roulette selection method with a tournament strategy and reduced the scale of the ant population's random walk, which improves the accuracy and speed of the algorithm. Guo et al. proposed an improved antlion optimizer (OB-C-ALO) [35] and used it to solve the data clustering problem. The optimizer uses a random walk based on the Cauchy distribution rather than a uniform distribution to escape from local optima, and combines an opposition-based learning model with an acceleration coefficient, overcoming the slow convergence of the original ALO. Zawbaa et al. [36] used different chaotic maps to change the reduction ratio of the antlion trap boundary, which improves the algorithm's optimization ability.
Although these improvements have achieved good results in their respective fields, there is still much room for improvement in the performance of ALO when used for text feature selection. The traditional ALO generates the initial population randomly, which cannot guarantee the quality of the initial population and may lead some individuals to learn from poor ones, reducing the convergence speed of the algorithm. Besides, the model and scale of the ant colony's random walk are two main reasons why the algorithm easily becomes stuck in local optima and runs slowly. To this end, this paper proposes an improved ALO feature selection algorithm based on a quasi-opposition learning mechanism and chaotic mapping, and studies the effect of this method compared with eight other feature selection methods when combined with different classifiers.

Ant Lion Optimization Algorithm
The ant lion optimizer finds the optimal solution by emulating the way antlions build traps and prey on ants in nature. Antlions rotate in the sand and drill downwards to make a funnel-shaped trap. They then hide under the sand at the bottom of the funnel and use their large jaws to flick sand out, keeping the funnel walls smooth and steep. When ants or other insects crawl into the trap, they slide down as the sand loosens. At the same time, to prevent the prey from escaping, the antlion keeps flicking sand outwards, so that the prey is pushed towards the centre by the quicksand. The antlion algorithm maintains two populations, one of ants and one of antlions. The hunting process consists of five fundamental modules: ants walk randomly, ants fall into traps, ants slide towards antlions, antlions catch ants, and antlions rebuild the traps. Figure 1 depicts the natural hunting activity of antlions.

Random Walk of Ants
The ants' random walk is modelled as a cumulative sum of random steps:

$$X(t) = \left[0,\; \mathrm{cumsum}(2r(t_1)-1),\; \mathrm{cumsum}(2r(t_2)-1),\; \ldots,\; \mathrm{cumsum}(2r(t_n)-1)\right]$$

$$r(t) = \begin{cases} 1, & \text{if } rand > 0.5 \\ 0, & \text{if } rand \le 0.5 \end{cases}$$

where cumsum calculates the cumulative sum, t represents the current number of iterations, and n represents the maximum number of iterations. Among them, rand is a random number uniformly distributed between 0 and 1.
To guarantee that ants move randomly in the search space and do not violate any boundaries, the position of the ants is regulated by min-max normalization:

$$X_i^t = \frac{(X_i^t - a_i)\times(d_i^t - c_i^t)}{b_i - a_i} + c_i^t$$

where $a_i$ and $b_i$ are the minimum and maximum of the random walk of the i-th variable, and $c_i^t$ and $d_i^t$ are the minimum and maximum of the i-th variable at the t-th iteration.
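A minimal Python sketch of the ants' random walk and the boundary normalization described above; the walk length and the bounds used at the end are illustrative choices, not values from the paper:

```python
import random

def random_walk(n_steps):
    """Cumulative sum of +/-1 steps: a step is +1 when rand > 0.5,
    otherwise -1 (the r(t) indicator of the random walk)."""
    walk, pos = [0.0], 0.0
    for _ in range(n_steps):
        pos += 1.0 if random.random() > 0.5 else -1.0
        walk.append(pos)
    return walk

def normalize_walk(walk, c, d):
    """Min-max scale the raw walk into the current bounds [c, d] so the
    ant never leaves the search space."""
    a, b = min(walk), max(walk)
    span = (b - a) or 1.0  # guard against a degenerate constant walk
    return [(x - a) * (d - c) / span + c for x in walk]

walk = random_walk(200)
scaled = normalize_walk(walk, -5.0, 5.0)
```

The scaled walk always spans exactly the current trap bounds, which is what lets the shrinking bounds (next subsections) steer the ant towards an antlion.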

Fall into the Trap
By simulating the process of ants walking aimlessly around an antlion and becoming trapped, the walk boundaries are shifted to the position of the selected antlion:

$$c_i^t = Antlion_j^t + c^t, \qquad d_i^t = Antlion_j^t + d^t$$

where $c^t$ and $d^t$ are the minimum and maximum of all variables at the t-th iteration, and $Antlion_j^t$ is the position of the selected j-th antlion.

Ants Sliding Towards Antlions
By adaptively lowering the ants' random walk range, the process of the ants sliding towards the antlions is simulated as:

$$c^t = \frac{c^t}{I}, \qquad d^t = \frac{d^t}{I}, \qquad I = 1 + 10^{w}\,\frac{t}{T}$$

where t is the current number of iterations, T is the maximum number of iterations, and the exponent w depends on the current number of iterations, so that I is adjusted dynamically as the iterations increase.
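The shrinking-trap mechanism can be sketched as follows; the stepwise schedule for w follows the original ALO paper, and the bounds in the usage are illustrative:

```python
def shrink_ratio(t, T):
    """Boundary shrink ratio I = 1 + 10^w * t/T; w grows stepwise with
    the iteration fraction (schedule as in the original ALO paper)."""
    frac = t / T
    if frac > 0.95:   w = 6
    elif frac > 0.9:  w = 5
    elif frac > 0.75: w = 4
    elif frac > 0.5:  w = 3
    elif frac > 0.1:  w = 2
    else:             w = 1
    return 1 + (10 ** w) * frac

def trap_bounds(antlion, c, d, t, T):
    """Shrink the global bounds [c, d] by I and centre them on the
    selected antlion, modelling the ant sliding down the trap."""
    I = shrink_ratio(t, T)
    return antlion + c / I, antlion + d / I
```

As t approaches T, the interval around the antlion collapses, forcing ever finer local search near the best solutions.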

Ants are Caught
The ants' positions are updated by averaging two random walks:

$$Ant_i^t = \frac{R_A^t + R_E^t}{2}$$

where $Ant_i^t$ is the position of the i-th ant at the t-th iteration (and, similarly, $Antlion_j^t$ is the position of the j-th antlion at the t-th iteration), $R_A^t$ is the position obtained by the ant walking randomly around the antlion selected by the roulette wheel selection mechanism, and $R_E^t$ is the position obtained by the ant walking randomly around the elite antlion at the t-th iteration.
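A sketch of this position update; `walk_around` here is a deliberately simplified, hypothetical stand-in for a full normalized random walk inside the trap bounds:

```python
import random

def walk_around(antlion, c, d):
    """Hypothetical stand-in for one normalized random walk around an
    antlion: a uniform point inside its shifted trap bounds."""
    return random.uniform(antlion + c, antlion + d)

def update_ant(roulette_antlion, elite_antlion, c, d):
    """Ant position update: the average of the walk around the
    roulette-selected antlion (R_A) and around the elite (R_E)."""
    r_a = walk_around(roulette_antlion, c, d)
    r_e = walk_around(elite_antlion, c, d)
    return (r_a + r_e) / 2.0
```

Averaging the two walks keeps every ant biased towards the elite while still exploring around a fitness-proportionally chosen antlion.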

Antlion Rebuild Trap
Calculate the updated objective function value of each ant, compare it with the elite, and choose the better value as the global optimal solution. If the fitness value of an ant is less than that of an antlion (i.e., the ant is fitter for a minimization problem), the antlion is considered to have successfully captured the ant, and the antlion's position is updated:

$$Antlion_j^t = Ant_i^t \quad \text{if } f(Ant_i^t) < f(Antlion_j^t)$$

Figure 2 is a flowchart of the chaotic antlion algorithm (CAA). CAA first uses the PWLCM chaotic map to initialize the positions of the ants and antlions, and then uses a quasi-opposite learning strategy to optimize both, increasing the diversity of the population. Next, the random walk behaviour of the ants uses the chaotic strategy to enlarge the search range, so that the algorithm can jump out of local optima. At the same time, the walk scale of the ants is reduced according to the 1/5 principle, which speeds up the chaotic walk. Finally, after the ants are successfully captured, the quasi-opposite learning strategy is used again to optimize the positions of the antlions, which improves the optimization accuracy. The pseudocode for CAA is described in detail in Algorithm 1, and the following subsections detail the different components of the algorithm. The 1/5 principle was put forward by Rechenberg in the literature [37]. Its core idea is that the parameters of an algorithm should be changed dynamically to keep the update rate of the population at about 20%. In the original antlion algorithm, the scale of the ant population's random walk is set to the maximum number of iterations, which increases the time and space costs of the algorithm. Therefore, this paper combines the 1/5 principle with the random walk behaviour of the ant population; in other words, the walk scale is set to 1/5 of the maximum number of iterations.
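The change the 1/5 principle makes to the walk scale is small enough to state in code (the function name is illustrative):

```python
def walk_scale(max_iter, rate=0.2):
    """Random-walk scale under the 1/5 principle: each ant's walk uses
    one fifth of the maximum iteration count as its number of steps,
    instead of max_iter as in the original ALO."""
    return max(1, int(max_iter * rate))
```

This cuts the time and memory of every per-ant random walk by roughly a factor of five per iteration.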

Figure 3 displays the convergence curves for different walk scales on the test function. It can be seen that, over 1000 iterations, when the scale of the random walk is 1/5 of the maximum number of iterations, the optimization accuracy of the algorithm is substantially higher than with the other ratios.

Piecewise Linear Chaos Strategy
The chaotic search algorithm is a local optimization technique with considerable potential. It can effectively prevent an optimization algorithm from settling into a local optimum, and it converges to the optimal solution faster than standard random search [38]. The chaotic maps commonly seen in the literature, in their standard forms, are:

1. Logistic map [36]: $x_{n+1} = \mu x_n (1 - x_n)$
2. Tent map [39]: $x_{n+1} = x_n / a$ if $x_n < a$, else $(1 - x_n)/(1 - a)$
3. Sine map [40]: $x_{n+1} = \frac{a}{4} \sin(\pi x_n)$
4. Cubic map [40]: $x_{n+1} = \rho x_n (1 - x_n^2)$
5. Circle map [39]: $x_{n+1} = \left(x_n + b - \frac{a}{2\pi} \sin(2\pi x_n)\right) \bmod 1$
6. Piecewise linear map [40]: $x_{n+1} = x_n / p$ if $0 \le x_n < p$, $(x_n - p)/(0.5 - p)$ if $p \le x_n < 0.5$, and symmetric about $x_n = 0.5$ otherwise

Figure 4 illustrates the bifurcation diagrams of the six classic mappings. Literature [39] pointed out that, compared with the cosine, logistic, and square mappings, the tent, sine, and circle mappings are most likely to generate more random and uniformly distributed pseudo-random sequences.
Zhenxing et al. [41] proposed an antlion optimization algorithm with adaptive tent chaotic search, which applies a tent map with parameter 0.5 to population initialization and to the ants' random walk, improving the convergence speed, optimization accuracy, and search efficiency of the ALO algorithm, as well as its ability to escape local optima. However, the tent map is not perfect: after a certain number of iterations it tends towards its fixed point, and the chaotic sequence contains small periods and unstable periodic points. Once the data falls into a periodic fixed point, the chaotic sequence converges to a fixed value and the strategy becomes ineffective, so the sequence generation method must be redesigned in applications, which increases the coding burden. It can be seen from Fig. 5 that the tent map reaches the completely chaotic state only when its parameter is 2, while the piecewise linear map reaches the chaotic state across its whole parameter range.
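Since the piecewise linear chaotic map (PWLCM) is central to CAA, a minimal Python sketch may help. The parameter p = 0.35 matches the experimental setting later in the paper; the seed and the population bounds are illustrative assumptions:

```python
def pwlcm(x, p=0.35):
    """Piecewise linear chaotic map (PWLCM) on (0, 1)."""
    if x > 0.5:
        x = 1.0 - x  # the map is symmetric about x = 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)

def chaotic_init(n, dim, lb, ub, seed=0.7):
    """Illustrative PWLCM-based population initializer: iterate the map
    and scale each chaotic value from [0, 1] into [lb, ub]."""
    pop, x = [], seed
    for _ in range(n):
        row = []
        for _ in range(dim):
            x = pwlcm(x)
            row.append(lb + x * (ub - lb))
        pop.append(row)
    return pop
```

Unlike uniform random initialization, the chaotic sequence is deterministic given the seed yet covers the interval ergodically, which is what gives the initial population its better spread.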
It can also be seen that the piecewise linear mapping produces a fairly uniform distribution. Therefore, this article first uses the chaotic sequence generated by the piecewise linear chaotic map to initialize the population; the specific steps are described in Algorithm 2. In the continuous iteration of the ALO algorithm, the boundary shrinkage ratio I grows with the iteration count, which shortens the step length of the ants' random walk and decreases the diversity of the antlion population, easily trapping the algorithm in a local optimum it cannot escape. Therefore, this paper combines the piecewise linear chaotic map with the random walk operation of the ant population to improve the exploration and exploitation capabilities of the ants and the algorithm's ability to escape local optima. This procedure is given as Algorithm 3, with the following steps:

Quasi-Opposite Learning Strategy
The notion of Opposition-Based Learning (OBL) was first presented by Tizhoosh [42]. OBL is utilized in the field of computational intelligence to speed up the convergence of various optimization approaches. For a point $X$ in the interval $[a, b]$, its opposite point is $OBLX = a + b - X$, and its quasi-opposite point $QOBLX$ is a random point between the interval midpoint $M = (a + b)/2$ and $OBLX$; when $M = X$, $QOBLX = OBLX$. At the initial stage of the algorithm, ants and antlions with poor fitness may be generated, resulting in elite individuals with poor fitness, which in turn leads the whole ant population to learn from antlions of poor fitness, delaying the optimization process. Therefore, this paper uses the quasi-opposite strategy to optimize the initial population after initialization, so that the randomly generated initial positions and their quasi-opposite positions are considered simultaneously. By exploring a larger region of the search space, the quality of the initial population is improved and the algorithm is accelerated. Besides, in the iterative process, the quasi-opposite learning strategy is used to optimize the ant population, which accelerates the search in the solution space. This procedure is given as Algorithm 4. The multi-dimensional unimodal functions in the test suite are used to evaluate the optimization accuracy and convergence speed of the algorithm, and the multi-dimensional multimodal functions are used to evaluate the algorithm's ability to leap out of local optima and its global convergence performance.
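The quasi-opposite step can be sketched as follows; the keep-the-better refinement loop is an assumption about how the strategy is typically applied, for a minimization problem:

```python
import random

def quasi_opposite(x, a, b):
    """Quasi-opposite point: a random point between the interval
    midpoint M = (a + b)/2 and the opposite point OBLX = a + b - x."""
    m = (a + b) / 2.0
    obl = a + b - x
    lo, hi = (m, obl) if m <= obl else (obl, m)
    return random.uniform(lo, hi)

def qobl_refine(population, fitness, a, b):
    """For each individual, keep the better of itself and its
    quasi-opposite point (lower fitness = better)."""
    return [min(x, quasi_opposite(x, a, b), key=fitness)
            for x in population]
```

Because each individual is replaced only when its quasi-opposite point is fitter, the refinement can never degrade the population.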

Experimental Results and Analysis
To guarantee that the experiment is completely fair, the five algorithms above use unified parameter settings: the population size is 30, the variable dimension is 30, the maximum number of iterations is 1000, and the parameter of the PWLCM is p = 0.35. Taking into account the randomness of the algorithms, each test function is executed 30 times, and the optimal value, mean, and standard deviation of the runs are recorded.

Test Function and Parameter Settings
To verify the effect of CAA, 11 benchmark functions with different characteristics are tested, and simulation experiments are carried out in multiple dimensions. All the experiments are carried out with PyCharm Community Edition 2020.3.1 and MATLAB R2018b on a Windows platform with an AMD Ryzen 7 4800U with Radeon Graphics (1.8 GHz). Besides, CAA is compared with the original ALO algorithm [7], the Dragonfly Algorithm (DA) [43], the Sine Cosine Algorithm (SCA) [44], and the IALOT algorithm [45] on the function optimization results. Table 1 shows the basic parameters and characteristics of the multi-dimensional unimodal test functions, and Table 2 shows those of the multi-dimensional multimodal test functions. The experimental results on the multi-dimensional unimodal functions are shown in Table 3, and the findings on the multi-dimensional multimodal functions are shown in Table 4. From the results in Table 3, it can be seen that for the multi-dimensional unimodal functions F1, F2, F3, and F4, CAA can obtain the optimal values of the functions, and the optimization success rate is as high as 100%. Although the optimal and average values obtained on F5 and F7 do not reach the theoretical optima, the results are more accurate and stable than those of the other algorithms. This indicates that CAA has good optimization capability for multi-dimensional unimodal functions: not only is the optimization accuracy high, but the results are also stable. The results in Table 4 demonstrate that for the multi-dimensional multimodal function F8, the average value obtained by CAA is not as good as that of IALOT, but the optimal value produced is better; for function F9, the average value produced by CAA reaches 4.44E−16, which is markedly better than the original ALO algorithm, DA, SCA, and IALOT. For functions F10 and F11, the mean value is 0, indicating that CAA can find the theoretical optimum, and the variance of 0 indicates that the optimal solution was found in each of the 30 independent runs, significantly outperforming the ALO, DA, and SCA algorithms.
To compare the optimization performance and convergence behaviour of CAA with the original ALO, DA, SCA, and IALOT algorithms more intuitively, Figs. 8 and 9 show the convergence curves of 9 test functions in one of the 30 independent experiments. Figure 8 illustrates that, as the number of iterations increases, CAA has a more prominent advantage than the other algorithms in terms of optimization speed. Besides, the optimization accuracy of CAA is considerably greater than that of the other algorithms by about the 20th generation, as shown in Fig. 8c, d. The above reflects the efficient optimization ability of CAA on multi-dimensional unimodal functions. Although the solution accuracy of CAA in Figs. 8f and 9a is not optimal, its convergence speed is faster than that of the other algorithms. It can be seen from Fig. 9b-d that, as the iterative process continues, in both optimization accuracy and convergence speed, CAA is obviously more efficient than the other algorithms. The analysis of the experimental results indicates that the improved antlion optimization algorithm has a fast convergence speed and high optimization accuracy, and is effective and reliable in function optimization.

The Chaotic Antlion Feature Selection Algorithm (CAFSA) Using CAA
Text classification assigns a text to one or more pre-defined topic categories according to its content. Dividing text documents into corresponding categories helps users retrieve and recognize required information on web pages or in huge corpora. The traditional Chinese text classification process mainly includes text corpus acquisition, text preprocessing, text word vector representation, construction of a classification model, and evaluation on test text. In the process of Chinese text classification, a document usually contains thousands of unique words, and each word is regarded as a feature. But these features may contain little information or have nothing to do with the category label. At present, the text feature selection methods used in academic circles are mainly the chi-square test (CHI), the F test, information gain (IG), and similar methods. These traditional feature selection methods calculate the importance of each feature for classification through an evaluation function, and then select those features that meet the quantitative requirements and have high importance as the feature subset. Since the interaction between features is complex, the selected feature subset may not guarantee the best classification effect. Therefore, this paper proposes a new feature selection mechanism for Chinese text classification. On the basis of preselecting the original features through the evaluation function, CAFSA is used to further select more representative features, improving the classification accuracy of the model while reducing the number of features. Figure 10 shows the specific process of feature selection using CAFSA, and Fig. 11 shows the framework for feature selection using intelligent optimization algorithms in Chinese text classification tasks.
1. Corpus acquisition: the text can be obtained through web crawlers or by downloading public data sets.
2. Data preprocessing, consisting of four steps. First, perform data cleaning: delete duplicate or missing data and remove unnecessary characters from the text through regular expressions. Then use down-sampling to process the unbalanced data set so that the samples of each category are roughly balanced. Next, use the jieba word segmentation tool to segment the text. Finally, load the stop word list and delete invalid stop words from the segmented text.
3. Word frequency statistics and text word vector representation: this paper uses TF-IDF to weight word frequencies.
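As an illustration of the word-weighting step, a minimal TF-IDF computation over already-segmented documents (in the real pipeline jieba produces the tokens; the two toy documents here are invented, and the smoothed IDF formula is one common variant):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Minimal TF-IDF over pre-segmented documents (lists of tokens).
    tf = term count / doc length; idf = log(N / df) + 1."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    vocab = sorted(df)
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        vectors.append([tf[t] / total * idf[t] if t in tf else 0.0
                        for t in vocab])
    return vocab, vectors

docs = [["酒店", "服务", "好"], ["酒店", "房间", "差"]]
vocab, vecs = tfidf_vectors(docs)
```

Terms shared by every document (here "酒店") receive the lowest weight, which is exactly the discriminative-term effect TF-IDF is used for.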

Experimental Procedure and Analysis of Results
This article uses PyCharm Community Edition 2020.3.1 for simulation under a Windows 10 64-bit operating system with 16 GB of RAM and an AMD Ryzen 7 4800U with Radeon Graphics (1.8 GHz). The experiments cover three different data sets. In addition, the effectiveness of the CAFSA feature selection algorithm is verified by comparison with the original antlion optimization algorithm (ALO) [7], piecewise linear chaotic antlion optimization (CALOP) [36], the Seagull Optimization Algorithm (SOA) [46], the Sparrow Search Algorithm (SSA) [47], and the Harris Hawks Optimizer (HHO) [48] as feature selection techniques.

Preparation of Data Set and Setting of Parameters
In the experiment, three different data sets were used in this research work. For all data sets, we use the same preprocessing techniques and experimental procedures shown in Fig. 11.
1. SogouCS: Sohu news data collected from 18 channels from June to July 2012, including domestic, sports, international, and entertainment, with website and text information also provided. The nine categories with the most samples were selected for the experiment, with 300 samples per category, for a total of 2700 documents. The data set was split into a training set and a test set in a 2:1 ratio.
2. ChnSenticorp_Hotel and ChnSenticorp_NoteBook: two Chinese sentiment mining corpora collected and collated by Tan Songbo, each containing 2000 positive and negative review texts.
The parameters of the wrapper feature selection methods used in the experiment are shown in Table 5. In addition, three classic filter feature selection methods are chosen:

1. f_classif: ANOVA F-value between label/feature for classification tasks.
2. mutual_info_classif: mutual information for a discrete target.
3. chi2: chi-squared statistics of non-negative features for classification tasks.
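Wrapper methods such as CAFSA search over binary feature masks; one common way to turn a continuous search-agent position into such a mask (the 0.5 threshold and the keep-at-least-one rule are illustrative assumptions, not taken from the paper) is thresholding:

```python
def to_mask(position, threshold=0.5):
    """Binarize a continuous search-agent position into a 0/1 feature
    mask: dimensions above the threshold select the feature."""
    mask = [1 if v > threshold else 0 for v in position]
    if not any(mask):  # degenerate case: keep at least one feature
        mask[max(range(len(position)), key=position.__getitem__)] = 1
    return mask
```

Each evaluation then trains the classifier only on the columns where the mask is 1, which is what the fitness function in the next subsection scores.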

Evaluation Criteria
In the experiment, support vector machine (SVM), naive Bayes (NB), and decision tree (DT) classifiers were selected to evaluate the classification performance of each algorithm, and the radial basis function was selected as the SVM kernel. The goal of the experiment in this section is to use intelligent optimization algorithms to improve the accuracy of model classification while reducing the number of features. The fitness function is given in Eq. (19):

$$Fitness = \alpha \left(1 - \frac{Num_c}{Num_t}\right) + (1 - \alpha)\,\frac{Count_f}{Count_z} \qquad (19)$$

Classification quality is also reported with the F-measure, $F_\beta = \frac{(1+\beta^2)\cdot Precision \cdot Recall}{(\beta^2 \cdot Precision) + Recall}$, where $\beta$ is the harmonic parameter, set to 1 here; Precision is the precision rate and Recall is the recall rate. Table 6 shows the experimental results of using a traditional feature selection method (the chi-square test) to preselect 2000-dimensional features on the Sogou news data set, using the CAFSA algorithm for further optimization, and then classifying with the support vector machine, naive Bayes, and decision tree classifiers. The visualization of Table 6 is shown in Fig. 12. It can be seen that CAFSA improves the classification result after feature selection. Specifically, when using the SVM classifier, the fitness function optimized with CAFSA is reduced by 16.337092% compared with the chi-square test alone; when the naive Bayes classifier and the decision tree classifier are used, the drops are 11.7452823% and 14.7940677%, respectively. Table 7 gives the results of feature selection using six different feature selection methods after the traditional chi-square preselection. To show the comparison more intuitively, Fig. 13 shows the fitness comparison under the three classifiers and six feature selection methods.
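A direct coding of the fitness function and the F-measure as described in the text (α = 0.9 and β = 1, per the parameter values stated here and below):

```python
def fitness(num_correct, num_total, count_f, count_z, alpha=0.9):
    """Weighted sum of the classification error rate and the fraction
    of features kept; alpha = 0.9 favours accuracy over sparsity."""
    error = 1.0 - num_correct / num_total
    return alpha * error + (1.0 - alpha) * count_f / count_z

def f1_score(precision, recall):
    """F-measure with beta = 1: the harmonic mean of precision and
    recall."""
    return 2 * precision * recall / (precision + recall)
```

Lower fitness is better: a perfect classifier that also discards every feature would reach 0, and the α weighting prevents the optimizer from sacrificing accuracy just to drop features.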

Results on SoGouCS Dataset
As can be seen from Fig. 13, CAFSA performs well on the SoGouCS dataset when using the support vector machine, naive Bayes, and decision tree classifiers. In particular, CAFSA achieves a better fitness of 0.276498 than the other algorithms when using the naive Bayes classifier, and a fitness of 0.245546 with another of the classifiers. In Eq. (19), $Num_c$ is the number of samples that were properly categorized and $Num_t$ is the total number of samples in the data set, so that $Num_c/Num_t$ is the classification accuracy; $Count_f$ is the number of features selected by the optimization algorithm and $Count_z$ is the total number of features in the data set; $\alpha$ represents the weight, which is 0.9 here.
Several evaluation metrics were applied to assess the capabilities of the algorithms in different aspects. Figure 13d plots the average of the fitness obtained by the six algorithms under the three classifiers, which shows that CAFSA outperforms the other five intelligent algorithms on average across the three different classifiers. Table 8 shows the experimental results of using a traditional feature selection method (the chi-square test) to preselect 1000-dimensional features on the ChnSenticorp_Hotel data set, using the CAFSA algorithm for further optimization, and then classifying with the support vector machine, naive Bayes, and decision tree classifiers. Figure 14 is a visual display of Table 8. It can be seen that after feature selection by CAFSA, the classification effect is improved. Specifically, when the SVM classifier is used, the fitness function optimized by CAFSA is reduced by 21.85983827% compared with the chi-square test, and when the naive Bayes and decision tree classifiers are used, the reductions are 15.73901465% and 20.01941748%, respectively. Table 9 gives the results of feature selection using six different feature selection methods after the traditional chi-square preselection. To display the comparison more intuitively, Fig. 15 shows the fitness comparison under the three classifiers and six feature selection methods.

Results on ChnSenticorp_Hotel Dataset
It can be seen from Fig. 15 that CAFSA has excellent performance when using support vector machines, naive Bayes and decision tree classifiers on the ChnSenticorp_ Hotel data set. Especially when using the support vector machine classifier, CAFSA has a better fitness of 0.144950 than other algorithms. When using the Naive Bayes classifier, the fitness of CAFSA is 0.205950, which is second only to 0.200950 of the HHO algorithm. When using a decision tree classifier, CAFSA is second only to the HHO algorithm. Figure 15d presents the average of the fitness obtained by the six algorithms using the three classifiers, from which it can be seen that CAFSA outperforms the SSA, SOA, ALO, and CALOP algorithms on average for the three different classifiers. Table 10 shows the experimental results of using traditional feature selection methods such as chi-square test to directly preselect 1000-dimensional features on ChnSenticorp_Note-Book data set, and using CAFSA algorithm for further optimization, and then using support vector machines, naive Bayes and decision-making Tree classifier for classification. Figure 16 is a visual display of Table 10. It can be seen that after the feature selection by CAFSA, the classification effect has been improved. Specifically, when the decision tree classifier is used, the fitness function optimized by CAFSA is reduced by 21.57740993% compared with the chi-square test, while using the SVM classifier and the naive Bayes classifier, it decreased by 21.3091922% and 14.55295736%, respectively. Table 11 is the result of feature selection on the Chn-Senticorp_NoteBook dataset using CAFSA and the other five different feature selection methods after using the traditional chi-square test for feature preselection. To display the comparison effect more intuitively, Fig. 17 shows the fitness comparison under the three classifiers and six feature optimization methods. It can be seen from Fig. 
17 that CAFSA has excellent performance when using support vector machines, naive Bayes and decision tree classifiers on the ChnSenticorp_NoteBook data set. Especially when using the decision tree classifier, CAFSA has a better fitness of 0.20135 than other algorithms. When using a support vector machine classifier, the fitness of CAFSA is second only to the SSA algorithm. When using the Naive Bayes classifier, CAFSA is second only to the ALO and SSA algorithms.    Figure 17d was the mean of the fitness obtained by the six algorithms using three classifiers, from which it can be seen that CAFSA outperforms the other five algorithms on average on three different classifiers.
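As a concrete illustration, the fitness metric used in all of the comparisons above can be sketched as follows (the function and variable names are ours, not taken from the paper's code):

```python
def fitness(num_correct, num_total, count_selected, count_total, alpha=0.9):
    """Weighted fitness used to score a feature subset:
    classification error plus a penalty on subset size (lower is better)."""
    error = 1.0 - num_correct / num_total   # 1 - Accuracy
    ratio = count_selected / count_total    # Count_f / Count_z
    return alpha * error + (1.0 - alpha) * ratio

# e.g. 90% accuracy while keeping 100 of 1000 features:
print(fitness(90, 100, 100, 1000))
```

With $\alpha = 0.9$ the error term dominates, so minimizing this fitness weights classification accuracy nine times more heavily than the size of the selected subset.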

Conclusion
This paper proposes a chaotic antlion feature selection algorithm (CAFSA) based on a quasi-opposition learning mechanism and chaotic mapping, and applies it as a feature selection method for Chinese text classification. First, the Piecewise Linear Chaotic Map (PWLCM) is used to initialize the ant and antlion populations, and the initial populations are then refined with the quasi-opposition learning technique, which enhances population diversity and improves the overall fitness of the population. Next, the chaotic map is used to perturb the ants' random walk, which improves the ant population's ability to explore and exploit the search space and helps the algorithm escape local optima. Finally, after an antlion successfully catches an ant, the quasi-opposition learning strategy is used to optimize the positions of the antlion population, which improves the algorithm's convergence speed and optimization accuracy. In the Chinese text classification process, a traditional feature selection method is first used to preselect the original features, and CAFSA is then used to further select more representative features, thereby lowering the number of features and improving the classification performance of the model. Finally, experiments were conducted on three Chinese datasets, comparing CAFSA with eight other feature selection methods, each combined with three different classifiers. The results demonstrate that CAFSA performs better overall: it further selects more representative features and is a more effective feature optimization scheme. The shortcomings of this work are: (1) the experiments use only three Chinese text datasets, and Chinese datasets from other sources and topics should be evaluated in the future to validate and extend the proposed algorithm; (2) how to make the improved algorithm show still better optimization performance remains future work; (3) the algorithm should be further applied to wider fields such as deep learning.
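The initialization steps summarized above (PWLCM chaotic initialization followed by quasi-opposition learning) can be sketched roughly as follows; the bounds, population size, control parameter p = 0.4, and all function names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def pwlcm(x, p=0.4):
    """One iteration of the Piecewise Linear Chaotic Map on (0, 1)."""
    if x < p:
        return x / p
    if x < 0.5:
        return (x - p) / (0.5 - p)
    if x < 1.0 - p:
        return (1.0 - p - x) / (0.5 - p)
    return (1.0 - x) / p

def chaotic_population(n, dim, lb, ub, seed=0.7):
    """Fill an (n, dim) population with a PWLCM sequence scaled to [lb, ub]."""
    pop = np.empty((n, dim))
    x = seed
    for i in range(n):
        for j in range(dim):
            x = pwlcm(x)
            pop[i, j] = lb + x * (ub - lb)
    return pop

def quasi_opposite(pop, lb, ub, rng):
    """Quasi-opposite points: sampled uniformly between the centre of the
    search interval and the opposite point lb + ub - x."""
    centre = (lb + ub) / 2.0
    opposite = lb + ub - pop
    return centre + rng.uniform(size=pop.shape) * (opposite - centre)

rng = np.random.default_rng(42)
pop = chaotic_population(10, 5, lb=-1.0, ub=1.0)
qo = quasi_opposite(pop, -1.0, 1.0, rng)
# In the full algorithm, both sets would be merged and the 10 fittest
# individuals kept under the problem's fitness function (omitted here).
```

Keeping the fitter of each original/quasi-opposite pair is what raises the initial population's average fitness before the antlion search loop begins.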