Artificial Intelligence-Enabled Cyberbullying-Free Online Social Networks in Smart Cities

In recent years, smart city services have moved many everyday activities from the physical world to the virtual world (cyberspace), e.g., online banking, e-commerce, and telemedicine. Along with the benefits of smart cities, the problems of the physical world have also moved to the cyber world, such as cyberbullying in online social networks (OSNs). Automated cyberbullying detection techniques are needed to prevent potential tragedies in OSNs. Recent artificial intelligence (AI) models, such as machine learning (ML) and deep learning (DL) models, can be employed to detect cyberbullying in OSNs. With this motivation, this paper develops an AI-enabled cyberbullying-free OSN (AICBF-ONS) technique for smart cities. The proposed AICBF-ONS technique uses a chaotic salp swarm optimization (CSSO)-based feature selection technique to derive a useful set of features from the OSN data. In addition, a stacked autoencoder (SAE) model is used as the classifier to assign appropriate class labels to the OSN data. To improve the detection performance of the SAE model, its parameters are tuned using the mayfly optimization (MFO) algorithm. An extensive experimental analysis showed that the proposed AICBF-ONS technique outperforms the compared methods.


Introduction
A smart city is defined as a city that uses all of the interconnected data available today to better understand and control its operations and to optimize the use of constrained resources (as defined by International Business Machines). Smart city techniques can help towns enhance government and citizen engagement [1], operate more efficiently through data-driven decision-making [2], improve transportation, and make communication safer. They also enable intelligent, flexible, and decentralized systems that can learn [3]. Smart city facilities have moved many activities from the physical world to the virtual world (cyberspace), for example, online shopping, online banking, medical services via telemedicine, and online ticket booking. Online content is an essential resource for sustainable management [4] and also a major problem of present-day society [5]. Along with the services for mankind, the troubles of the physical world have also moved to the cyber world. For example, bullying that used to happen in the physical world has now moved to cyberspace via online social network (OSN) media such as Twitter, YouTube, Facebook, Reddit, and Instagram. An OSN is a platform that gives users a place to engage in social interactions and provides communication opportunities. The challenges of cyberbullying recognition are as follows. First, from the perspective of manual detection, the decision of whether a certain behavior is cyberbullying differs from one person to another. Next, social network platforms, the place where cyberbullying happens most often [9], generally allow anonymous and public expression. Therefore, public posts are easily misleading or ambiguous, context-free, and highly independent. Then, one main problem in cyberbullying research is the absence of standardized data [10].
Although the data used in several studies are obtained from the same social media (such as Twitter), they are generated individually by scraping or by using the public APIs of the sites. Thus, the individual datasets cannot be related to one another and do not help to verify the generality of the methods. Moreover, text messages on social media are generally short and noisy, and such messages are unstructured; that is, they may contain emoticons, emojis, and misspellings, which hinder a model from capturing knowledge from the text. Figure 1 shows the types of cyberbullying. Earlier research on cyberbullying mostly relies on surveys and statistics, with an emphasis on definitions, statistical methods, and the impact of cyberbullying; these studies established the facts of cyberbullying and drew scientists' attention to its seriousness [11]. In computational research, ML and DL methods help scientists better understand human behavior. Cyberbullying recognition has been treated as a natural language processing (NLP) task. Conventional ML methods are widely used to recognize negative forms of human behavior; the classifier most commonly adopted by scientists is the support vector machine (SVM), and the most common feature extraction method is the bag-of-words (BoW) approach [12]. Recently, DL and neural networks (NNs) have been widely employed.
This paper develops an AI-enabled cyberbullying-free OSN (AICBF-ONS) technique for smart cities to detect the presence of cyberbullying in OSNs. The proposed model first performs pre-processing and feature extraction. It then applies a chaotic salp swarm optimization (CSSO)-based feature selection technique to derive a useful set of features from the OSN data. Moreover, a stacked autoencoder (SAE) model is used as the classifier to assign appropriate class labels to the OSN data. Using the CSSO algorithm to choose optimal features and the mayfly optimization (MFO) algorithm to tune the parameters considerably boosts the overall cyberbullying detection performance. A wide range of simulations was carried out on different datasets, and the results are examined under several measures.

Prior Cyberbullying Detection Approaches in OSNs
Kumari et al. [13] propose a unified representation of text and image together, eliminating the need for separate learning models for text and image. A single-layer CNN is used with the unified representation. The key finding of this study is that text represented as an image is a better model for encoding the data. They also establish that a single-layer CNN provides better outcomes with 2D representations. Fang et al. [14] proposed a method integrating the Bi-GRU and the self-attention mechanism. Specifically, they exploit the advantages of the GRU cell and the Bi-GRU to learn the underlying relations among sentences from both directions. They also present the self-attention mechanism and the benefit of combining the two to achieve better efficiency on the cyberbullying classification task. The presented method can tackle the exploding and vanishing gradient problems. Abaido [15] explored the prevalence of cyberbullying among students in an Arab community, its venues and nature, and their attitudes toward reporting cyberbullying as opposed to remaining silent. Data were gathered from two hundred students in the UAE. 91% of the sample affirmed the presence of acts of cyberbullying on social networks, with Instagram (55.5%) and Facebook (38%) in the lead. Calls for smartphone applications, stricter legal actions, and proactive measures are discussed. Van Hee et al. [16] presented automated cyberbullying detection in social network text by modeling posts written by bullies, victims, and bystanders of online bullying. They described the collection and fine-grained annotation of a cyberbullying corpus for Dutch and English and carried out a series of binary classification experiments to determine the feasibility of automated cyberbullying recognition. They use a linear SVM exploiting a rich feature set and examine which information sources contribute most to the task. Yao et al. [17] introduced CONcISE, a new method for accurate and timely Cyberbullying detectiON on Instagram media Sessions. They proposed a sequential hypothesis testing formulation that dramatically decreases the number of features used in classifying each comment while preserving high classification accuracy.
Özel et al. [18] prepared a dataset from Twitter and Instagram messages posted in Turkish and then employed ML methods such as SVM, C4.5, multinomial naive Bayes (NB), and KNN classifiers to detect cyberbullying. They also employ chi-square and information gain feature selection (FS) approaches to improve classification accuracy. Van Bruwaene et al. [19] developed a multi-platform dataset that contains text from posts collected from 7 social networks. They presented a multi-technique, multi-phase annotation scheme that uses crowdsourcing for hashtag and post annotation and then uses ML approaches to find further posts for annotation. This procedure has the benefit of choosing posts for annotation that have a considerably higher than chance probability of being clear cases of cyberbullying, without constraining the range of instances to those having predefined features (hashtags are used only for selecting the posts for annotation).
Raisi and Huang [20] proposed an ML method for simultaneously inferring the users involved in harassment-based bullying and new vocabulary indicators of bullying. The learning approach considers the social structure and infers which users tend to bully and which tend to be victimized. To address the elusive nature of cyberbullying, the learning approach needs only weak supervision. Muneer and Fati [21] examine this problem by collecting a global dataset of 37,373 unique tweets from Twitter. Seven ML classifiers were used, namely RF, LR, SGD, LGBM, ADB, SVM, and NB. All the methods were evaluated using accuracy, precision, recall, and F1-score as the performance measures to determine the detection rate of the classifiers on the global dataset.

The Proposed Model
This study develops an effective AICBF-ONS technique for the detection of cyberbullying in OSNs. First, the input data from the OSNs in smart cities are collected and used for the cyberbullying detection process. The proposed AICBF-ONS technique comprises several stages: pre-processing, feature extraction, CSSO-based feature selection, stacked autoencoder (SAE)-based classification, and MFO-based parameter optimization. The working principle is shown in Fig. 2.

Data Pre-processing
Data pre-processing is a necessary step in cyberbullying detection. It comprises stop word removal, punctuation mark removal, and elimination of spam content. In this study, unwanted noise in the data, such as stop words, special characters, and repeated words, is removed. The remaining words are then stemmed to their root forms, and the pre-processed dataset is fed into the proposed AICBF-ONS technique for further processing.
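The cleaning steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, the suffix-stripping `naive_stem` helper, and the rule of dropping back-to-back repeated words are hypothetical stand-ins, since the paper does not specify which stop-word list or stemmer is used, and spam filtering is omitted.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "you", "so"}

def naive_stem(word):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, drop punctuation and special characters, stop words,
    and back-to-back repeated words, then stem what remains."""
    tokens = re.findall(r"[a-z]+", text.lower())
    cleaned = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        stem = naive_stem(tok)
        if cleaned and cleaned[-1] == stem:
            continue  # skip a word repeated immediately after itself
        cleaned.append(stem)
    return cleaned
```

The resulting token lists are the kind of input a TF-IDF feature extractor would consume.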

Feature Extraction
The next stage of cyberbullying detection is feature extraction, for which the proposed model uses the term frequency-inverse document frequency (TF-IDF) technique. It combines the TF and IDF approaches. With TF-IDF, words are weighted by their relative frequency instead of simply being counted, which would overemphasize frequent words. The TF-IDF technique indicates how significant the occurrence of a word in a sentence is. As with BoW, the TF-IDF vocabulary is built at model training time and then reused during testing. Equation (1) gives the TF-IDF weight of a term in a document.

Fig. 2 Overall process of AICBF-ONS model
Here, N denotes the number of documents, and df(t) is the number of documents in the corpus that contain the word t. In Eq. (1), the first term improves the recall, whereas the second term improves the accuracy of the word embedding.
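As an illustration of the weighting just described, the sketch below assumes the commonly used form W(d, t) = TF(d, t) · log(N / df(t)), with TF taken as the relative frequency of t in d. The exact form of Eq. (1) is not reproduced in this text, so this formula, the natural logarithm, and the function names `fit_idf` and `tfidf_vector` are assumptions.

```python
import math
from collections import Counter

def fit_idf(train_docs):
    """Build the vocabulary and idf(t) = log(N / df(t)) from tokenized training documents."""
    n_docs = len(train_docs)
    df = Counter()
    for doc in train_docs:
        df.update(set(doc))  # count each term once per document
    return {term: math.log(n_docs / count) for term, count in df.items()}

def tfidf_vector(doc, idf):
    """Weight each in-vocabulary term by its relative frequency times idf."""
    counts = Counter(doc)
    total = len(doc)
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}
```

The vocabulary is fixed by `fit_idf` on the training documents and reused unchanged by `tfidf_vector` at test time, so terms unseen during training are simply ignored.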

Design of CSSO-Based Feature Selection Technique
Once feature extraction is complete, the CSSO algorithm is applied to derive an optimal subset of features.
In CSSO, a chaotic map is used to replace random variables with chaotic variables. The original salp swarm algorithm (SSA) has three main parameters that affect its performance: r1, r2, and r3. Here, r1 decreases as the iterations proceed, whereas r3 determines whether the next position moves toward positive or negative infinity. Since r1 and r2 are the two most important parameters affecting the position update of the salps, they considerably influence the balance between exploration and exploitation. Exploration focuses on finding new promising solutions by examining the search space on a large scale, whereas exploitation focuses on exploiting the information in a local region. A technique must properly balance these two components to approximate the global optimum. In this case, the chaotic map is used to adjust the r2 parameter of SSA. Equation (2) describes the update of the r2 parameter, and hence the updated position of the salp, based on the chaotic map, where o^t denotes the value of the chaotic map at the t-th iteration.
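A minimal sketch of this idea follows. It assumes the logistic map as the chaotic map and the leader update of the standard SSA formulation, x_j = F_j ± r1((ub − lb) r2 + lb) with r1 = 2·exp(−(4t/T)²), where F is the best position found so far; neither is spelled out in this text, so both choices, the 0.5 threshold on r3, and the function names are assumptions.

```python
import math
import random

def logistic_map(o):
    """One step of the logistic map o <- 4 * o * (1 - o), which stays in [0, 1]."""
    return 4.0 * o * (1.0 - o)

def update_leader(food, lb, ub, t, max_iter, o_t, rng):
    """Leader update of standard SSA with r2 replaced by the chaotic value o_t."""
    r1 = 2.0 * math.exp(-((4.0 * t / max_iter) ** 2))  # shrinks as t grows
    new_pos = []
    for f_j in food:
        r3 = rng.random()  # decides the direction of the step
        step = r1 * ((ub - lb) * o_t + lb)
        x_j = f_j + step if r3 >= 0.5 else f_j - step
        new_pos.append(min(max(x_j, lb), ub))  # clamp to the search bounds
    return new_pos
```

In a full loop, `o_t` would be advanced with `logistic_map` once per iteration, so the sequence fed into the update is deterministic but non-repeating rather than drawn from a random number generator.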
Embedding the chaotic map in the position update of the salps enhances the performance and convergence rate of SSA. The mathematical expression of every chaotic map used has been defined previously. In this analysis, the presented CSSO was applied not only to 14 unimodal and multimodal benchmark optimization problems but also to feature selection (FS) problems. In the CSSO FS technique, the solution pool is restricted to a discrete binary form, where the positions of the salps are limited to {0, 1} [22]. Assume a salp position (a solution in the search space) is stated as y = (x1, x2, …, xdim), where xi is the i-th dimension variable and dim is the maximal number of dimensions. If the value of a variable equals 1, the corresponding feature is selected, and if the value equals 0, the corresponding feature is not selected. The detailed description of the presented chaotic version of the SSO approach for feature selection is as follows. Parameter initialization: Initially, the CSSO algorithm begins with randomly set salp positions. Afterward, it sets the initial parameters. For the benchmark functions, the lower and upper boundaries are set according to the details of each function, whereas for the benchmark datasets the lower boundary is set to zero and the upper boundary to one. The maximal number of iterations is fixed to 500 for global optimization problems and to 30 for FS problems. Lastly, the population size is set to 50 for global optimization problems and 20 for FS problems. The population size and number of iterations are set lower for the FS problem because of the difficulty of the search space.
A fitness function (FF) is used to evaluate all solutions (salp positions). All the global benchmark problems used are minimization problems. Therefore, the solution with the minimal fitness value min(f(X)) is chosen as the best solution obtained so far. For the FS problem, however, the best solution is the one that maximizes the classifier accuracy while minimizing the number of selected features. Equation (6) gives the FF used for the FS problem, where these two objectives are combined into one by setting weight factors. In this formula, Acc represents the classifier accuracy obtained from the SAE classifier, where k equals 3 with mean absolute distance. SAE is one of the supervised ML techniques. It classifies a new sample based on the distance from the new instance to the training samples. In this work, the SAE classifier is used to indicate the goodness of the selected feature subset. w_f represents the weight factor used to control the relative importance of the number of selected features and the classifier accuracy. Here, the aim is to maximize the classifier accuracy first and then to minimize the number of selected features, so w_f is set to 0.8. L_f is the length of the selected feature subset, and L_t is the total number of features in a given dataset.
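To make the trade-off concrete, the sketch below uses one common weighted-sum form, fitness = w_f · Acc + (1 − w_f) · (1 − L_f / L_t). Equation (6) itself is not reproduced in this text, so this exact form, the 0.5 binarization threshold, and the function names are assumptions; only w_f = 0.8 and the meaning of Acc, L_f, and L_t come from the description above.

```python
def binarize(position, threshold=0.5):
    """Map a continuous salp position to a {0, 1} feature mask (threshold is an assumption)."""
    return [1 if x >= threshold else 0 for x in position]

def fs_fitness(accuracy, mask, w_f=0.8):
    """Assumed weighted-sum fitness: reward accuracy, penalize large feature subsets."""
    l_t = len(mask)  # total number of features, L_t
    l_f = sum(mask)  # number of selected features, L_f
    return w_f * accuracy + (1.0 - w_f) * (1.0 - l_f / l_t)
```

With equal accuracy, a mask that selects fewer features scores higher, which reflects the stated priority of accuracy first and subset size second.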
After the FF of all salps is evaluated, the best salp position is selected and then updated. The procedure of evaluating all salps and updating the best salp position is repeated until the maximal number of iterations is reached or an optimal solution is found. In our study, the optimization procedure ends when the maximal number of iterations is met: 50 for global optimization problems and 30 for FS problems.

Design of MFO-SAE Technique for Classification
At the final stage, the MFO-SAE technique is employed for cyberbullying detection and classification. Commonly, the SAE is a kind of unsupervised DL approach built by stacking several autoencoders (AEs). An AE consists of two parts, an encoder and a decoder. First, the encoder layer converts the input x into a hidden representation h, described as h = f(wx + b), where f, w, and b denote the activation function, weight matrix, and bias of the encoder layer, respectively. The decoder layer then reconstructs x from h as x′ = g(w′h + b′), where x′, g, w′, and b′ denote the reconstructed output, activation function, weight matrix, and bias of the decoder layer, respectively. Figure 3 shows the architecture of the SAE. Furthermore, the complete training procedure of the AE consists of a pretraining phase and a fine-tuning phase [23]. First, the AE is trained to minimize the reconstruction cost function of Eq. (7), in which x_i is the AE input for the i-th instance, x′_i is the AE output, i.e., the reconstruction of the i-th instance, and m is the number of instances. The SAE is built by stacking hidden layers, so that the hidden encoded representation of the current AE is used as the input of the next AE. Once the pretraining of the SAE layers is finished, the decoder of each AE is dropped. Next, the encoder weights of the SAE are connected and fine-tuned with the help of softmax regression.
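The encoder and decoder mappings can be illustrated with a dependency-free sketch. It assumes sigmoid activations for both f and g and a mean squared reconstruction error with a 1/2 factor; the paper's exact cost function is not reproduced in this text, so those choices and the helper names are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, vec):
    """Compute W @ vec + b for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def encode(x, w, b):
    """h = f(w x + b), with f assumed to be the sigmoid."""
    return [sigmoid(z) for z in affine(w, b, x)]

def decode(h, w_dec, b_dec):
    """x' = g(w' h + b'), with g assumed to be the sigmoid."""
    return [sigmoid(z) for z in affine(w_dec, b_dec, h)]

def reconstruction_cost(inputs, w, b, w_dec, b_dec):
    """Mean over m instances of 0.5 * ||x_i - x'_i||^2 (the 0.5 factor is an assumption)."""
    m = len(inputs)
    total = 0.0
    for x in inputs:
        x_rec = decode(encode(x, w, b), w_dec, b_dec)
        total += 0.5 * sum((xi - ri) ** 2 for xi, ri in zip(x, x_rec))
    return total / m
```

Stacking then amounts to feeding the output of one trained `encode` step into the next AE as its input and discarding each `decode` step once pretraining is done.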
Moreover, a regularization (constraint) term is included in the loss function of the AE. In the expanded loss function of Eq. (8), w denotes the weight matrix, and a balance factor controls the weight of the regularization term. If the number of hidden neurons is larger than the number of input neurons in the input layer, a KL divergence term is generally added to the loss function of the AE. In the resulting loss function of Eq. (9), H is the number of hidden neurons, KL denotes the KL divergence, ρ is the sparsity parameter, and ρ̂_j is the average activation of the j-th hidden neuron over the input samples. As Sun notes, minimizing the loss functions (7), (8), or (9) does not ensure that the AE learns "useful" features for a particular task. In image classification, for instance, "meaningful" features are those that improve the classification rate. Therefore, an explicit guiding term relevant to the particular task, such as the cross-entropy loss function, is recommended to be included in the loss function of the AE and given substantial weight. In the loss function adapted in this way, C is the number of classes, label(i, j) denotes the true probability of the j-th class for the i-th instance, and log pred(i, j) denotes the logarithm of the predicted probability of the j-th class for the i-th instance. A variant of the SAE is then employed in the developed method. In this instance, the weight and bias values of the SAE are selected using the presented method.
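The two additional loss terms can be sketched as follows. The KL term uses the standard Bernoulli form from sparse autoencoders, and the cross-entropy is averaged over the instances; how the paper weights and combines these terms with the reconstruction cost is not reproduced in this text, so the functions are shown separately and their names are assumptions.

```python
import math

def kl_sparsity(rho, rho_hat):
    """Sum over hidden neurons of KL(rho || rho_hat_j) for Bernoulli activations."""
    return sum(
        rho * math.log(rho / r) + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
        for r in rho_hat
    )

def cross_entropy(labels, preds):
    """Average over instances of -sum_j label(i, j) * log pred(i, j)."""
    m = len(labels)
    total = 0.0
    for y_row, p_row in zip(labels, preds):
        total -= sum(y * math.log(p) for y, p in zip(y_row, p_row) if y > 0)
    return total / m
```

The sparsity penalty is zero when every average activation equals the target sparsity and grows as the activations drift away from it, while the cross-entropy term ties the learned representation to the class labels.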
To further improve the performance of the SAE model, the MFO algorithm is used to choose its parameters optimally so that the cyberbullying detection results improve. The selection of parameters in the SAE method is vital for efficient classification. Most ML models include multiple parameters that must be optimized. As the trial-and-error approach is not feasible, the metaheuristic MFO technique is employed for parameter selection. In general, the prediction error function acts as the objective function of the MFO approach [24][25][26]. The mayflies (MFs) in the swarm of the MFO method are separated into male and female MFs. The male MFs are always stronger and consequently perform better in optimization. As with an individual in a PSO swarm, each individual in the MFO method updates its position according to its current location p_i(t) and velocity v_i(t) at the current iteration. All male and female MFs update their positions according to Eq. (13). However, their velocities are updated in different ways.
Movements of male MFs: The male MFs carry out exploration or exploitation in each iteration. The velocity is updated on the basis of the present fitness value f(x_i) and the best historical fitness value along its trajectory f(x_hi). If f(x_i) > f(x_hi), the male MF updates its velocity based on its present velocity combined with its distances to its historical best position and to the global best position, where g is a parameter that decreases linearly from a maximal value to a smaller one, and a1 and a2 denote constants used to balance the values. r_p and r_g are the Cartesian distances between the individual and its historical best position and the global best position in the swarm, respectively. The Cartesian distance is the second norm of the distance array. On the other hand, if f(x_i) < f(x_hi), the male MF updates its velocity from the current one with a random dance coefficient d, where r1 denotes a random number drawn from the uniform distribution on the interval [−1, 1]. Figure 4 illustrates the MFO technique.
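A hedged sketch of the male update follows. The velocity formulas are not reproduced in this text, so the code borrows the form used in the standard mayfly algorithm, v ← g·v + a1·exp(−β r_p²)(p_best − x) + a2·exp(−β r_g²)(g_best − x), and the additive position update p_i(t+1) = p_i(t) + v_i(t+1). The coefficient `beta`, the function names, and the argument layout are assumptions.

```python
import math
import random

def male_velocity(v, x, p_best, g_best, f_x, f_best, g, a1, a2, beta, d, rng):
    """Return the updated velocity of one male mayfly (lists hold one value per dimension)."""
    if f_x > f_best:
        # Cartesian (Euclidean) distances to the personal and global best positions
        r_p = math.sqrt(sum((xj - pj) ** 2 for xj, pj in zip(x, p_best)))
        r_g = math.sqrt(sum((xj - gj) ** 2 for xj, gj in zip(x, g_best)))
        return [
            g * vj
            + a1 * math.exp(-beta * r_p ** 2) * (pj - xj)
            + a2 * math.exp(-beta * r_g ** 2) * (gj - xj)
            for vj, xj, pj, gj in zip(v, x, p_best, g_best)
        ]
    # Otherwise keep the (damped) current velocity and add a random dance term
    return [g * vj + d * rng.uniform(-1.0, 1.0) for vj in v]

def move(x, v_new):
    """Assumed position update: p_i(t+1) = p_i(t) + v_i(t+1)."""
    return [xj + vj for xj, vj in zip(x, v_new)]
```

The exponential factors make the pull toward a best position weaker when that position is far away, and the dance term keeps the velocity from collapsing to zero when no attraction is applied.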
Movements of female MFs: The female MFs update their velocity in a different manner. Winged female MFs generally live for only 1 to 7 days, so they rush to find male MFs to mate and reproduce. Thus, they update their velocity according to the male MFs. In the MFO method, the best male and the best female MFs are treated as the first mates, the second-best male and female MFs as the second mates, and so on. Thus, for the i-th female MF, if f(y_i) < f(x_i), the velocity is updated according to its mate, where a3 is an additional constant used to balance the velocity and r_m is the Cartesian distance between them. Otherwise, the female MF updates its velocity from the current one with another random dance coefficient fl, where r2 represents a random number drawn from a uniform distribution on the interval [−1, 1].
Mating of MFs: The top half of the female and male MFs are mated, and each pair produces a pair of children. The offspring are generated randomly from their parents, where L denotes a random number drawn from a Gaussian distribution.
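The mating step can be sketched as a blend crossover. The offspring equations are not reproduced in this text, so the code assumes the form used in the standard mayfly algorithm, offspring1 = L·male + (1 − L)·female and offspring2 = L·female + (1 − L)·male; the zero mean and unit standard deviation of L and the function name `mate` are also assumptions.

```python
import random

def mate(male, female, rng):
    """Blend two parents into two offspring with a Gaussian coefficient L."""
    L = rng.gauss(0.0, 1.0)  # mean 0 and std 1 are assumptions
    child1 = [L * m + (1.0 - L) * f for m, f in zip(male, female)]
    child2 = [L * f + (1.0 - L) * m for m, f in zip(male, female)]
    return child1, child2
```

Because the two children use complementary weights, their per-dimension sum always equals the per-dimension sum of the parents, so the pair stays centered on the parents while still exploring along the line between them.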

Performance Validation
This section examines the performance of the proposed model on two distinct datasets. Dataset-1 has three classes, with 1954 instances in the racism class, 3122 instances in the sexism class, and 11,014 instances in the neither class. Dataset-2 also has three classes, with 1430 instances in the racism class, 19,190 instances in the sexism class, and 4163 instances in the neither class. The racism and sexism instances are merged into a Cyberbullying class, and the neither instances form the Non-Cyberbullying class. Therefore, dataset-1 has 5076 instances in the Cyberbullying class and 11,014 instances in the Non-Cyberbullying class, whereas dataset-2 has 20,620 instances in the Cyberbullying class and 4163 instances in the Non-Cyberbullying class. Table 1 and Fig. 5 give a brief cyberbullying detection performance analysis of the AICBF-ONS technique under different numbers of layers. The results on dataset-1 show that the AICBF-ONS technique is effective under all layer settings. For instance, with 1 layer, the AICBF-ONS technique classified the non-cyberbullying instances with a precision of 89.52%, recall of 88.29%, and F1-score of 87.26%, whereas the cyberbullying instances were classified with a precision of 75.40%, recall of 79.07%, and F1-score of 78.07%. With 3 layers, the AICBF-ONS technique classified the non-cyberbullying instances with a precision of 90.29%, recall of 90.02%, and F1-score of 90.56%, whereas the cyberbullying instances were classified with a precision of 76.44%, recall of 77.52%, and F1-score of 75.45%. With 5 layers, the AICBF-ONS technique classified the non-cyberbullying instances with a precision of 89.24%, recall of 88.81%, and F1-score of 89.19%, whereas the cyberbullying instances were classified with a precision of 77.29%, recall of 80.38%, and F1-score of 75.68%. Table 2 and Fig. 6 compare the classification results of the AICBF-ONS technique with those of other methods. On dataset-1, the K-CNN and Bi-LSTM techniques are the poorest performers among the compared methods. The TFIDF-SVM and MF-LR techniques obtained somewhat improved performance. Meanwhile, the CNG-LR technique achieved reasonable cyberbullying detection outcomes, exceeded only by the Bi-GRU and AICBF-ONS techniques. Although the Bi-GRU technique exhibited considerable outcomes, the proposed AICBF-ONS technique offered the best outcomes, with the maximum precision, recall, and F1-score values.
On dataset-2, the K-CNN and Bi-LSTM techniques are again the poorest performers among the compared methods. The TFIDF-SVM and MF-LR techniques attained somewhat improved performance. Meanwhile, the CNG-LR technique achieved reasonable cyberbullying detection outcomes, exceeded only by the Bi-GRU and AICBF-ONS techniques. Although the Bi-GRU technique exhibited considerable outcomes, the proposed AICBF-ONS technique offered the best results, with the maximum precision, recall, and F1-score values.
To further confirm the better performance of the AICBF-ONS technique, an average results analysis of the AICBF-ONS technique against existing techniques is given in Table 3. Figure 7 examines the performance of the AICBF-ONS technique on dataset-1. On dataset-1, the MF-LR technique is found to be the least effective cyberbullying detection model, with a precision of 77.45%, recall of 78.35%, and F1-score ... Figure 8 examines the performance of the AICBF-ONS technique on dataset-2. On dataset-2, the TFIDF-SVM technique is found to be the least effective cyberbullying detection method, with a precision of 88.50%, recall of 93.95%, and F1-score of 90.90%. The CNG-LR, MF-LR, and Bi-LSTM techniques exhibited somewhat higher and mutually close performance. The K-CNN technique achieved moderate results, with a precision of 90.50%, recall of 90.10%, and F1-score of 90.30%. The Bi-GRU technique accomplished near-optimal performance, with a precision of 92.15%, recall of 93.80%, and F1-score of 92.95%. Finally, the proposed AICBF-ONS technique surpassed the other techniques, with a precision of 93.12%, recall of 95.38%, and F1-score of 94.02%.
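The class merging and the metrics used throughout this section can be checked with a short sketch. The `merge_classes` and `precision_recall_f1` helpers are hypothetical names introduced here; the formulas are the standard per-class definitions of precision, recall, and F1-score, and the instance counts are those reported for the two datasets.

```python
def merge_classes(racism, sexism, neither):
    """Collapse three labels into (cyberbullying, non-cyberbullying) counts."""
    return racism + sexism, neither

def precision_recall_f1(tp, fp, fn):
    """Standard per-class metrics from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(merge_classes(1954, 3122, 11014))  # dataset-1 -> (5076, 11014)
print(merge_classes(1430, 19190, 4163))  # dataset-2 -> (20620, 4163)
```

The merged counts show that the two datasets are imbalanced in opposite directions, which is one reason per-class precision, recall, and F1-score are more informative here than overall accuracy alone.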

Conclusion
This paper has presented a novel AICBF-ONS technique to determine the presence of cyberbullying in OSNs. The proposed AICBF-ONS technique comprises several stages: pre-processing, feature extraction, CSSO-based feature selection, SAE-based classification, and MFO-based parameter optimization. Using the CSSO algorithm to choose optimal features and the MFO algorithm to tune the parameters considerably boosts the overall cyberbullying detection performance. A wide range of simulations was carried out on different datasets, and the results were examined under several measures. The experimental outcomes highlighted the enhanced cyberbullying detection performance of the proposed AICBF-ONS technique compared with other recent state-of-the-art techniques. In future work, the AICBF-ONS technique can be extended with outlier detection and data clustering approaches in a big data environment.