1 Introduction

Social media platforms are used by Internet users all over the world; according to various sources, around 71% of Internet users are active on them, which is a very large number [19]. Facebook, a renowned platform, accounts for 60% of social network users worldwide [34]. It is therefore not surprising that students and young people widely use social media platforms to promote social drives, products and services [1, 25, 33]. According to Kaplan and Haenlein [2], Internet users employ social media platforms to create and exchange content, and these platforms also help them stay connected to other users. For the promotion of a social drive for the benefit of the community in particular, social media campaigns bring several advantages, including, but not limited to, reaching people in a very short time [42, 44]. Moreover, promoting a product or service through a social network has a real impact on users and helps them make decisions [22].

As far as communities are concerned, users need a reliable platform on which to discuss and promote the many social drives that matter to them.

Although this is very challenging, advances in technology and the increasing engagement of young people with social drives on social media, which promotes more democratic forms of engagement between citizens and the state, have made it achievable. Many studies have explored different social initiatives on social media and reported remarkable outcomes [1, 25, 33]. These works show the importance of social media for promoting a social drive. However, no study has measured the impact of a social drive that users promoted on social media. This study aims to measure the impact of the influencing factors of a selected social drive on young people. Specifically, it evaluates the effects of the leading determinants of coronavirus disease 2019 (COVID-19) when users promoted it in social media forums.

In this research, we use association rule mining with novel similarity measures and a formal concept representation to compute the impact of COVID-19 on Youth when social media forums promote it as a general social drive. The experimental results and discussion show that the proposed technique performs well in terms of accuracy. In the future, researchers can extend this work to particular models of real-time applications in order to explore more social drives. Another possible improvement is to use the extracted influencing factors for further computation and analysis.

The rest of the paper is organized as follows: Section 2 reviews the literature related to the proposed research work. Section 3 explains the proposed conceptual model. Section 4 presents the experimental results and discussion. Section 5 provides the conclusion and future work.

1.1 Research contribution

The contributions of this work are as follows:

  • This study measures the impact of various factors of a particular drive, such as COVID-19, on Youth when social media forums promote it.

  • The proposed framework uses two essential collections of influencing factors, LF and RF, together with an advanced similarity measure and programming-based interaction, to explore a new dimension of research compared to existing state-of-the-art techniques.

  • Experimental results and evaluation show that the proposed technique provides better analysis than the existing approaches.

  • Finally, this work provides a roadmap for future research on social media and social networks.

2 Literature review

There are many applications on which people discuss their issues and community problems. However, Facebook, Twitter and blogs are the most widely used for communication [5]. People also use Snapchat and Instagram as communication apps, sharing their ideas from day to day and using them for multiple purposes such as sending messages and sharing pictures and videos with friends. Social media also gives teens many options for browsing and searching for content of their own choice, such as video streaming and news sharing.

At present, social media is regarded as a genuine factor in social and non-social campaigns and in how individuals view and handle the issues that are discussed on these platforms in detail, free of existing constraints [21]. Previously, most work on social networking platforms, particularly social networking sites, concentrated on individual uses, while only a few studies addressed distinct social causes. Yuen [18] studied the relationship between school and civic engagement, examining the use of microblogs to advance civic participation among Chinese undergraduate and graduate students. Likewise, the output of all preprocessed words after string vectorization, splitting and stemming clearly illustrates the preprocessing stage.

As with many other social drives, during the COVID-19 pandemic many young people have been working remotely, schools have moved to online platforms, and social distancing remains in effect. In this context, young people use social media to raise awareness about COVID-19 control, to help coordinate humanitarian programs for isolated people, and to raise resources for emergency workers. It also assists doctors and nurses in saving lives by supporting early detection of infections through clinical imaging and by strengthening contact tracing to limit the spread of infection.

The earliest research on social media during a pandemic dates back to the 2009 H1N1 pandemic. It tracked the prevalence of misinformation (estimated at 4.5%), terminology use (“H1N1” versus “swine flu”), public sentiment and fear, and the relationship between case incidence and public concern [41]. Earlier studies used the web to gather disease-related data, for example, the search frequency of handwashing, hand sanitizing and hygiene topics [31]. The WHO declared that it is fighting not only a global epidemic but also a social media infodemic, with some media claiming that the coronavirus is the first true social media infodemic because it has accelerated the spread of information and misinformation worldwide and is fuelling panic and fear among people [17]. This is a debatable yet testable hypothesis, since social media users use the platforms to express their emotions, feelings and thoughts, which can be a significant source of data for investigating emotional well-being [41].

ABC News reported that panic about the coronavirus spread more quickly than the virus itself because of social media [31]. Social media is also very helpful in spreading awareness about such a viral disease [17]. Brewer, on BBC News [3], states that the flood of news about COVID-19 has affected the general public, creating panic and leaving people on edge. Similarly, Rothschild and Fischer [35] asserted that social media is spreading fear and panic among its users. Likewise, Cellan-Jones [4] reported that people rely on statistics found on social media or the web for information about COVID-19; these figures have been observed to be not very reliable, yet certain nations have used them as a channel for obtaining facts.

After COVID-19 emerged in China and spread to other nations, millions of people used online media to check information about the disease. Molla [30] indicated that there were 19 million mentions of the disease across online media in only 24 h. To take responsibility for providing information, citizens have used comprehensive infrastructures to check the reliability of the data [15]. Frenkel [13] reported that after the WHO said that social media companies were spreading COVID-19 negativity, which created a worrying situation, some companies tried to remove false information from their platforms.

Victor [40] claims that, in the present digital age, Chinese citizens did not receive facts about COVID-19 and consequently relied on online media. Furthermore, in India the public authorities asked social media platforms to stop circulating content about the disease because it creates panic among people. Emmott [12] noted that, according to European Union reports, Russian media ran large-scale disinformation campaigns on COVID-19 that disturbed lives in the West. In a contemporary discussion of the impact of media, one analyst [20] explained that social media affects people's minds: when they want to buy something from online stores, they already know about shortages and buy grocery items in bulk, which creates shortages for other people. Social media created this situation. According to The Star [39], social media is responsible for the global circumstances that are destroying the foundations of economies after COVID-19. In addition, Devlin [8] reported that online shopping data pointed to food shortages that caused panic.

Kent [23] also noted that social media allows everyone to see and share information on the web, which is the main reason people find and share everything they hear and see about COVID-19 without verifying whether the information is correct [16]. In the twenty-first century, innovations in online communication have changed the public arena. New media have become a significant source of health information and a platform for discussing individual experiences, feelings and concerns about health, illness and treatment [16].

Additionally, Dillon [9] found that, during this pandemic, people spent a great deal of energy and time on social media spreading panic, which increased purchasing across different nations and created further alarm. Furthermore, El-Terk [11] reported that, during this pandemic, everyone tries to be an expert, because everyone has the ability, the right and the potential to communicate a message about COVID-19 by voice or in digital form. Correspondingly, Garrett [14] made it clear that we enable people to create alarming situations on the web, as we are the ones who provide them with the latest data and information about the pandemic.

According to Merchant and Lurie [28], people today know many means of imparting and disseminating material, owing to the advent of social media. These means are fast and effective, and they spread material as quickly as false information spreads. La et al. [24] also noted that many nations limit public access to data about the COVID-19 outbreak, so that the general public accesses only information that is helpful rather than harmful. The Vietnamese case is a productive example of managing online data correctly: the social media channels of the nation's Ministry of Health made the information available to the general public.

According to Mian and Khan [29], there has been a significant increase in the distribution of fake news and incorrect information on COVID-19; for example, the hypothesis that the infection originated in a laboratory began on the web. Correspondingly, Petric and others [32, 38] hold that “media inclusion has featured COVID-19 as a one of a risk”. Depoux and others [7, 36] established that social media played three fundamental roles in numerous countries. Users circulated the facts about the outbreaks via online media.

From the preceding discussion, we learned that the effect of influencing factors on Teens needs to be measured and analyzed in detail [27, 37]. Therefore, based on the reviewed literature, this work aims to convert the influencing factors into a proper model and to establish which influencing factors of COVID-19 have a large impact on a particular group of Teens. This study tries to measure the effect of the influencing factors of COVID-19 on Youth when social media forums promoted it.

3 Proposed methodology

In this research work, we propose a phase-wise analytics framework to explore and analyze the impact of the influencing factors of COVID-19 on Youth when it is promoted on social media. The following subsections describe each phase in detail.

3.1 Parameter assumption

Given the growth of the Internet, its acceptance at the public level and its subsequent use by the general public, specifically the young generation, students and communities through social media, this research study assumes three different categories of Youth, i.e. Y1, Y2 and Y3, based on age, particular media usage and gender. Fig. 1 shows the parameter-wise description of these categories.

Fig. 1 a Gender-wise ratio of Y1, Y2 and Y3, b Age-wise ratio of Y1, Y2 and Y3, c Usage-wise ratio of Y1, Y2 and Y3

3.2 Data collection

In this section, we describe the influencing factors (data collection) in detail. We gathered two different types of factors (as described by Eq. 1) to evaluate the proposed model.

$$\begin{array}{ll}X=\left\{x_1,x_2,x_3,\dots ,x_n\right\} & \forall\ x_i\in \mathrm{LF}\\ Y=\left\{y_1,y_2,y_3,\dots ,y_n\right\} & \forall\ y_i\in \mathrm{RF}\end{array}$$
(1)

Here X is the set of factors obtained from past literature (LF), and Y is the set of factors obtained from the previously proposed machine learning model (RF).

Initially, we designed an inclusive search strategy to extract the most relevant factors for LF from Scopus, Web of Science (all databases) and Google Scholar, covering full text and abstracts. We searched for relevant articles published between December 2019 and November 2020 using various search terms, such as “COVID 19”, “Influencing factors of COVID 19” and “Risk factors of COVID 19”, combined with OR. Hits from all of the databases were exported to an EndNote library, and duplicates were removed. Figure 2 depicts the search strategy for LF data collection. We collected only original research articles, like conference abstracts, letters, and reviews, and favoured studies in the English language that cover COVID-19 factors influencing the community. Table 1 shows some of the resultant factors obtained from LF.

Fig. 2 Flow diagram of search result

Table 1 List of obtained factors along with extracted references

Another list of influencing factors may be found in “Sentiment Analytics: Extraction of Challenging Influencing Elements from COVID-19 Pandemics,” a machine learning-based model that uses sentiment analysis to extract various influencing factors from reviews and comments on various web portals. Table 2 shows some of the factors obtained by using the machine learning model.

Table 2 Influencing factors list using machine learning model

3.3 Computing impact of influencing factors

To investigate which factors influence the Youth most, we chose a two-step approach. First, we performed a systematic literature review and selected articles describing influencing factors. Second, we performed a qualitative synthesis of these factors. The following subsections describe each part of the proposed two-step approach, as depicted in Figs. 3 and 4:

Fig. 3 Proposed conceptual Model Phase-1

Fig. 4 FCA based Factor Lattice

3.3.1 Data input

This research aims to determine the impact of the influencing factors of COVID-19 on a particular category of Teens when users promoted it as a social drive on social media forums. The first input dataset contains almost 1310 influencing factors extracted from past literature using the comprehensive literature review process described in Section 3.2 (“Data collection”). The second input dataset (almost 900 influencing factors) comes from a previously proposed sentiment analytics-based model that extracts influencing factors from a corpus of tweets and Facebook reviews collected through the corresponding APIs (Application Programming Interfaces).

3.3.2 Formal representation

As far as the inputs are concerned, most of the obtained factors, whether obtained from past literature or from the sentiment analytics-based model, may be redundant or irrelevant, which may lead to computational complexity. Therefore, they need to be represented formally. To resolve this, we use Formal Concept Analysis (FCA). FCA is a powerful tool for automatically structuring and classifying a set of resources, such as semantics, terms and metadata, that are represented through a set of attributes. FCA is a mathematical theory of concept formation [43], grounded in the theory of ordered sets and lattices, that provides a theoretical model for discovering relationships and organizing knowledge. Two definitions are introduced based on Wang’s paper [10].

  1. A formal context based on factor association is defined as a triple FC = (S, T, I), where S is the set of factors obtained from past literature, T is the set of factors collected using the sentiment analytics model, and I is a binary relation between S and T. If s ∈ S, t ∈ T and t occurs in the document s, we say that s possesses t, expressed as s I t or (s, t) ∈ I.

Given a formal context FC = (S, T, I), if X ⊂ S and Y ⊂ T, two functions ↑ and ↓ are defined as follows.

$$\begin{array}{l}\uparrow :F(S)\to F(T),\quad X^{\uparrow }=\left\{t\in T;\ \forall s\in X,\ (s,t)\in I\right\}\\ \downarrow :F(T)\to F(S),\quad Y_{\downarrow }=\left\{s\in S;\ \forall t\in Y,\ (s,t)\in I\right\}\end{array}$$
(2)
  2. Given a formal context FC = (S, T, I), if an ordered pair (X, Y) ∈ F(S) × F(T) satisfies X↑ = Y and Y↓ = X, then (X, Y) is a formal concept of FC. X is the extent of this concept and Y is the intent.

Consequently, a concept is retained only when the number of factors in its intent is two or three and the number of factors in its extent exceeds the threshold; otherwise, it is excluded from the cluster. We consider the selected factors as potential factors for further steps. Based on the proposed formal context, concept lattices provide a descriptive representation of all influencing factors. Based on the theory of FCA, Fig. 4 shows some of the derived lattices.
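The two definitions and the retention rule above can be illustrated with a minimal sketch. The factor names, the relation I and the extent threshold of 1 are hypothetical values chosen only for illustration; they are not taken from the study's datasets, and the brute-force enumeration is only practical for small contexts.

```python
from itertools import combinations

# Hypothetical toy context (illustrative only): S holds literature factors,
# T holds sentiment-model factors, I is the binary relation between them.
S = ["lockdown", "online classes", "job loss"]
T = ["anxiety", "isolation", "income"]
I = {
    ("lockdown", "anxiety"), ("lockdown", "isolation"),
    ("online classes", "anxiety"), ("online classes", "isolation"),
    ("job loss", "anxiety"), ("job loss", "income"),
}

def up(X):
    """Derivation X-up: every t in T related to all s in X."""
    return frozenset(t for t in T if all((s, t) in I for s in X))

def down(Y):
    """Derivation Y-down: every s in S related to all t in Y."""
    return frozenset(s for s in S if all((s, t) in I for t in Y))

def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every subset of S."""
    concepts = set()
    for size in range(len(S) + 1):
        for X in combinations(S, size):
            intent = up(X)
            concepts.add((down(intent), intent))
    return concepts

def retain(concepts, extent_threshold=1):
    """Keep concepts whose intent has 2 or 3 factors and whose extent
    is larger than the (assumed) threshold."""
    return [(e, i) for e, i in concepts
            if len(i) in (2, 3) and len(e) > extent_threshold]

if __name__ == "__main__":
    for extent, intent in retain(formal_concepts()):
        print(sorted(extent), "<->", sorted(intent))
```

For realistic input sizes (hundreds of factors), a dedicated lattice-construction algorithm would be needed instead of enumerating all subsets.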

3.3.3 Computing influencing factor list

After the concept lattice is formed, the obtained factors need to be refined. In this paper, we utilize four novel association-based measures to choose the most suitable influencing factors. This section provides details of the proposed factor selection methods.

Normalized Google distance (NGD)

Compared with the semantics of other words and phrases, the context of a word or phrase varies according to its use in everyday life. NGD automatically identifies the pages relating to a specific term association using Google page counts, as shown in Algorithm 2. [6] proposed measuring the page-count association among terms using the Normalized Google Distance (NGD). Finding the relationship between two distinct words using NGD requires neither background knowledge nor a specific analysis of the problem domain. Instead, it automatically analyses all features through Google searches over the World Wide Web. To obtain a particular event represented in a review, we apply the social page count (SPC) technique based on NGD. NGD calculates the weights of all the aspects extracted by the LDA process. Table 4 shows the characteristics identified after using LDA. NGD works by checking the number and nature of links to a page to obtain a rough estimate of how important the aspects are for the event. Equations 3 and 4 define the Normalized Google Distance (NGD).

$$\mathrm{NGD}\left(\alpha ,\beta \right)=\frac{f\left(\alpha ,\beta \right)-\min \left(f\left(\alpha \right),f\left(\beta \right)\right)}{\max \left(f\left(\alpha \right),f\left(\beta \right)\right)}$$
(3)

and

$$\mathrm{NGD}\left(\alpha ,\beta \right)=\frac{\max \left\{\log f\left(\alpha \right),\log f\left(\beta \right)\right\}-\log f\left(\alpha ,\beta \right)}{\log N-\min \left\{\log f\left(\alpha \right),\log f\left(\beta \right)\right\}}$$
(4)

Here α and β are factors drawn from LF and RF, f(α) is the number of pages reported by Google that contain α, f(α, β) is the number of pages on which both α and β occur, and N is the normalizing total number of pages. For fixed counts, the NGD increases as N decreases. The NGD has some properties that are applied in this experiment, as listed below:

  1. The value of the NGD lies approximately between 0 and ∞; it can be slightly negative when the Google counts are unreliable and report:

$$f\left(\alpha ,\beta \right)>\max \left\{f\left(\alpha \right),f\left(\beta \right)\right\}$$
(5)

    (a) If the frequencies satisfy f(α) = f(β) = f(α, β) > 0, then NGD(α, β) = 0, for all α and β.

    (b) If the frequency f(α) = 0, then for every search term β we have NGD(α, β) = ∞/∞.

  2. Apart from the anomaly above, the NGD is non-negative, and NGD(α, α) = 0 for every α. We have NGD(α, β) = NGD(β, α) for every pair of α and β. However, the NGD can be 0 for distinct terms: for example, if α ≠ β but both terms occur on exactly the same pages, then f(α) = f(β) = f(α, β) and NGD(α, β) = 0. Nor does the NGD satisfy the triangle inequality:

$$\mathrm{NGD}\left(\alpha ,\beta \right)\le \mathrm{NGD}\left(\alpha ,\eta \right)+\mathrm{NGD}\left(\beta ,\eta \right)\ \mathrm{for}\ \mathrm{all}\ \alpha ,\beta ,\eta$$
(6)
where α, β and η represent factors.
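As a rough illustration of Eq. 4 and the properties listed above, the following sketch computes the NGD from page counts. The counts and the value of N are made-up numbers, not real Google results, and the handling of zero counts (returning NaN for the indeterminate ∞/∞ case and infinity when the terms never co-occur) is an assumption of this sketch.

```python
import math

def ngd(f_a, f_b, f_ab, n):
    """Normalized Google Distance from page counts, following Eq. 4.

    f_a, f_b : pages containing each term
    f_ab     : pages containing both terms
    n        : normalizing page total (assumed larger than f_a and f_b)
    """
    if f_a == 0 or f_b == 0:
        return math.nan   # indeterminate form, property 1(b)
    if f_ab == 0:
        return math.inf   # terms never co-occur
    log_a, log_b, log_ab = math.log(f_a), math.log(f_b), math.log(f_ab)
    return (max(log_a, log_b) - log_ab) / (math.log(n) - min(log_a, log_b))

if __name__ == "__main__":
    # Hypothetical counts: 1000 and 100 pages per term, 10 shared pages.
    print(ngd(1000, 100, 10, 10**6))   # approximately 0.5
    # Identical counts give distance 0, property 1(a).
    print(ngd(100, 100, 100, 10**6))   # 0.0
```

With the same counts, a smaller N gives a larger distance, which matches the remark about N above.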

Chi-Square test

A chi-square test for independence compares two influencing factors in a contingency table to see whether they are related. In a more general sense, it tests whether the distributions of categorical variables differ from each other. To evaluate tests of independence, we apply the chi-square statistic to a cross-tabulation (also known as a bivariate table). A cross-tabulation presents the distributions of two categorical variables simultaneously, with the intersections of the categories of the variables appearing in the cells of the table. The test of independence assesses whether an association exists between the two variables by comparing the observed pattern of responses in the cells with the pattern that would be expected if the variables were truly independent of each other. Calculating the chi-square statistic and comparing it with a critical value allows researchers to assess whether the observed cell counts differ significantly from the expected cell counts. The chi-square statistic itself is straightforward and intuitive:

$${\chi}^2=\sum \frac{{\left({f}_o-{f}_e\right)}^2}{{f}_e}$$
(7)

Here fo denotes the observed frequency and fe the expected frequency if no relationship exists between the variables. As Eq. 7 shows, the chi-square statistic is based on the difference between the observed and the expected frequencies.
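A minimal sketch of the computation in Eq. 7 is given below. The contingency table is hypothetical (rows: a factor present or absent; columns: a second factor present or absent), and the expected frequencies are derived from the row and column totals under the independence assumption described above.

```python
def chi_square(observed):
    """Chi-square statistic (Eq. 7) and degrees of freedom for a
    contingency table given as a list of rows of observed counts.
    Assumes every row and column total is non-zero."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            # Expected count if the two variables were independent.
            f_e = row_totals[i] * col_totals[j] / total
            statistic += (f_o - f_e) ** 2 / f_e
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

if __name__ == "__main__":
    # Hypothetical counts of reviews mentioning two factors.
    table = [[30, 10],
             [20, 40]]
    stat, dof = chi_square(table)
    print(round(stat, 2), dof)   # 16.67 with 1 degree of freedom
```

For one degree of freedom the critical value at the 5% level is 3.841, so a statistic of this size would indicate that the two hypothetical factors are associated.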

WordNet hierarchy

WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser, along with its numerical values. WordNet superficially resembles a thesaurus, in that it groups words based on their meanings and numerical values. However, there are some important distinctions.

  • WordNet interlinks not just word forms—strings of letters—but specific senses of words. As a result, words that are found close to one another in the network are semantically disambiguated.

  • WordNet labels the semantic relations among words, whereas the groupings of words in a thesaurus do not follow any definite pattern other than meaning similarity.

Based on the above description, Table 3 depicts the numerical values along with the influencing factors.

Table 3 Computing score based on described measures

Principal Component Analysis (PCA)

Principal components analysis (PCA, for short) is a variable-reduction technique that shares many similarities with exploratory factor analysis. It aims to reduce a larger set of variables into a smaller set of ‘artificial’ variables, called ‘principal components’, which account for most of the variance in the original variables. In this section, for PCA-based influencing factor reduction, we construct a d × k-dimensional transformation matrix W that allows us to map a sample vector x onto a new k-dimensional feature subspace that has fewer dimensions than the original d-dimensional feature space:

$$x=\left[{x}_1,{x}_2,\dots ,{x}_d\right],\quad x\in {\mathbb{R}}^d\quad \downarrow xW,\quad W\in {\mathbb{R}}^{d\times k}$$
(8)
$$z=\left[{z}_1,{z}_2,\dots ,{z}_k\right],\quad z\in {\mathbb{R}}^k$$
(9)

As a result of transforming the original d-dimensional data onto this new k-dimensional subspace (typically k ≪ d), the first principal component has the largest possible variance, and each subsequent principal component has the largest possible variance under the constraint that it is uncorrelated with (orthogonal to) the other principal components. Even if the input features are correlated, the resulting principal components are mutually orthogonal (uncorrelated). Based on the above description, Table 3 depicts the numerical values along with the influencing factors.
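The mapping in Eqs. 8 and 9 can be sketched as follows. To keep the example dependency-free, the eigenvectors of the covariance matrix are found by power iteration with deflation, which is a substitution made for this sketch; a numerical library's eigendecomposition or SVD would normally be used. The score table is hypothetical, and the fixed start vector is assumed not to be orthogonal to the leading eigenvector.

```python
import math

def covariance(rows):
    """Return the sample covariance matrix and the mean-centred rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    return cov, centred

def mat_vec(mat, v):
    return [sum(m * x for m, x in zip(row, v)) for row in mat]

def dominant_eigenpair(mat, iterations=500):
    """Power iteration from a fixed, non-uniform start vector."""
    d = len(mat)
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(mat, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(mat, v)))
    return eigenvalue, v

def pca(rows, k):
    """Project rows (n x d) onto the first k principal components.

    Returns (projected rows z, components forming the columns of W,
    variance captured by each component)."""
    cov, centred = covariance(rows)
    d = len(cov)
    components, variances = [], []
    for _ in range(k):
        lam, v = dominant_eigenpair(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    projected = [[sum(x * w for x, w in zip(c, comp)) for comp in components]
                 for c in centred]
    return projected, components, variances

if __name__ == "__main__":
    # Hypothetical scores of four factors under two correlated measures.
    scores = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
    z, W, var = pca(scores, k=1)
    print(z, var)
```

In this toy table the two measures are perfectly correlated, so a single component captures all of the variance and k = 1 loses no information.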

3.4 Impact of influencing factor –Phase-II

This section provides deeper insight into our final impact calculation. We construct an association rule model based on information generated from reviews. Finding the correct impact of an influencing factor is just one concern of our work. In addition, we are interested in identifying the relationship between the influencing factors and their impact on Youth. Thus, we propose a factor hierarchy-based method to show the impact of influencing factors on Youth. The whole process consists of two main steps: impact hierarchy construction using Min-Max association rules, and hierarchy expansion based on influencing factors. The input of our system is a collection of LF and RF factor lists. The output is an impact hierarchy that contains not only LF and RF but also the impact on Youth. Figure 5 shows the proposed model.

Fig. 5 Proposed conceptual Model Phase-II

To obtain better results, each influencing factor is stored in a bag of words, where it is represented by a unique factor id. A particular factor may have a high impact on one category of Youth while not affecting other categories. Understanding these co-occurrence patterns can help to increase the accuracy of the work in several ways. If a pair of items, X and Y, frequently occur together, association rule analysis helps to uncover how the items are associated with each other. There are two common ways to measure association.

Measure 1: support

Support indicates how popular a set of influencing factors is, measured as the proportion of reviews in which the factor set appears. Mathematically, support is the fraction of the total number of reviews in which the factor set occurs.

$$\mathrm{Support}=\frac{\mathrm{Reviews}\ \mathrm{containing}\ \mathrm{both}\ \mathrm{LF}\ \mathrm{and}\ \mathrm{RF}}{\mathrm{Total}\ \mathrm{Number}\ \mathrm{of}\ \mathrm{Reviews}}$$
(10)

If a factor set whose support exceeds a certain proportion tends to have a significant impact on Youth, that proportion can be used as the support threshold. Factor sets with support values above this threshold are then identified as significant factor sets.

Measure 2: confidence

Confidence indicates how likely a factor such as RF is to be considered when another factor such as LF is selected, expressed as {LF → RF}. It is measured by the proportion of reviews containing LF in which RF also appears. One drawback of the confidence measure is that it might misrepresent the importance of an association, because it only accounts for how popular LF is, but not RF. If RF is also very popular in general, there is a higher chance that a review containing LF will also contain RF, thus inflating the confidence measure. Mathematically:

$$\mathrm{Confidence}\left(\left\{\mathrm{LF}\to \mathrm{RF}\right\}\right)=\frac{\mathrm{Reviews}\ \mathrm{containing}\ \mathrm{both}\ \mathrm{LF}\ \mathrm{and}\ \mathrm{RF}}{\mathrm{Reviews}\ \mathrm{containing}\ \mathrm{LF}}$$
(11)
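The two measures can be sketched over a small set of reviews, each represented as the set of factors it mentions. The reviews and factor names are hypothetical, and confidence is computed according to the verbal definition above, i.e. as the share of reviews containing the first factor that also contain the second.

```python
def support(reviews, factors):
    """Fraction of reviews that contain every factor in `factors`."""
    factors = set(factors)
    return sum(1 for review in reviews if factors <= review) / len(reviews)

def confidence(reviews, antecedent, consequent):
    """Share of reviews containing `antecedent` that also contain
    `consequent`; returns 0.0 if the antecedent never appears."""
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    n_antecedent = sum(1 for review in reviews if antecedent <= review)
    if n_antecedent == 0:
        return 0.0
    n_both = sum(1 for review in reviews if both <= review)
    return n_both / n_antecedent

if __name__ == "__main__":
    # Hypothetical reviews: "lockdown" stands in for an LF factor and
    # "anxiety" for an RF factor.
    reviews = [
        {"lockdown", "anxiety"},
        {"lockdown", "anxiety", "income"},
        {"lockdown"},
        {"anxiety"},
        {"income"},
        {"anxiety", "income"},
    ]
    print(support(reviews, {"lockdown", "anxiety"}))
    print(confidence(reviews, {"lockdown"}, {"anxiety"}))
    print(confidence(reviews, {"anxiety"}, {"lockdown"}))
```

The last two calls differ because confidence is not symmetric: it depends on which factor is treated as the starting point of the rule.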

The associations between the selected factors were visualized using the arulesViz R library and are depicted in Fig. 6.

Fig. 6 Visualization of COVID-19 factor impact on Youth

4 Experimental results and discussion

In this section, we evaluate the proposed conceptual model and perform various experiments in terms of precision, recall and F-measure to check the efficiency of the proposed model. We used two different datasets: influencing factors extracted from past literature (Dataset 1) and influencing factors extracted from tweets and Facebook posts (Dataset 2). We performed separate experiments on both datasets. Table 1 shows the results.

Table 4 demonstrates the performance of the proposed method in terms of precision, recall and F-measure. The experiment on Dataset 1 (influencing factors extracted from past literature) produced a precision of 87.56%, a recall of 90.23% and an F-measure of 85.89%. When the experiment was run on Dataset 2 (social media dataset), the values for precision, recall and F-measure were 88.72%, 91.48% and 86.23%, respectively. These results indicate the effectiveness of the proposed approach. Figure 7 shows a graphical representation.
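For reference, the following sketch shows how the three evaluation measures are commonly computed for a set of extracted factors against a reference set. It assumes the standard definitions, with the F-measure taken as the balanced harmonic mean of precision and recall; the text does not state which F-measure variant was used, and the factor names are hypothetical.

```python
def precision_recall_f(predicted, relevant):
    """Precision, recall and balanced F-measure for two factor sets.

    predicted : factors returned by a method
    relevant  : factors in the reference (ground-truth) list
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

if __name__ == "__main__":
    # Hypothetical factor lists, for illustration only.
    found = {"anxiety", "isolation", "income", "sleep"}
    reference = {"anxiety", "isolation", "schooling"}
    print(precision_recall_f(found, reference))
```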

Table 4 Computing score based on described measures

Fig. 7 Proposed method results

4.1 Evaluation of proposed technique with COVID youth mental health impact [26]

In the first experiment, we compared the proposed methodology with previous research on teenage mental health [26]. That research conducted a cross-sectional study in which 584 Teens were enrolled and completed questions on the General Health Questionnaire (GHQ-12), the cognitive status of COVID-19, the Negative coping styles scale and the PTSD Checklist-Civilian Version (PCL-C).

From these details, we observed that the performance of the proposed approach compares well with the actual published results. Precision, recall and F-measure of almost 97%, 91% and 90%, respectively, indicate the effectiveness of the proposed approach. Figure 8 shows the graphical representation of both results.

Fig. 8 Comparison of the proposed technique with Youth mental health

4.2 Evaluation of proposed technique with expert and survey

Because no such study was found in the past literature, we also had our proposed system evaluated by a group of professionals working at an NGO ‘X’ that provides healthcare solutions. Their expert group comprises two medical specialists, one analyst and two quality assurance specialists. We demonstrated our proposed framework, the created knowledge base and its user interface to the expert group; the demonstration covered the procedures of analysis, organization and administration. The experts applied ratings ranging from 0 to 10. We labelled ratings of 0–3 as Not Agreed, 4–6 as Partially Agreed and 7–10 as Agreed. We summarized the outcomes as Agreed, Partially Agreed and Not Agreed by taking the mean of all the resulting values for every parameter. The criteria for expert opinions were thus based on three categories: Agreed, Not Agreed and Partially Agreed. In light of their feedback, the experts generally agreed that the proposed methodology is suitable for finding the impact of influencing factors. The proposed method for finding the impact is easy to use and helps in summarizing comments and reviews (Fig. 9).
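The rating scheme described above can be sketched as follows. The parameter names and ratings are hypothetical. Because a mean rating need not be an integer, this sketch assumes that means below 4 count as Not Agreed, means from 4 up to but not including 7 as Partially Agreed, and means of 7 or more as Agreed; for integer ratings this coincides with the 0–3, 4–6 and 7–10 bands.

```python
from statistics import mean

def label(score):
    """Map a rating or mean rating in [0, 10] to an agreement label.
    The cut-offs for non-integer means are an assumption of this sketch."""
    if not 0 <= score <= 10:
        raise ValueError("rating must lie between 0 and 10")
    if score < 4:
        return "Not Agreed"
    if score < 7:
        return "Partially Agreed"
    return "Agreed"

def summarize(ratings_by_parameter):
    """Return {parameter: (mean rating, label)} for each parameter."""
    summary = {}
    for parameter, ratings in ratings_by_parameter.items():
        average = mean(ratings)
        summary[parameter] = (average, label(average))
    return summary

if __name__ == "__main__":
    # Hypothetical ratings from a five-member expert group.
    ratings = {
        "ease of use": [8, 7, 9, 6, 8],
        "summarization": [5, 6, 4, 7, 5],
    }
    for parameter, (average, verdict) in summarize(ratings).items():
        print(parameter, average, verdict)
```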

Fig. 9 Comparison of proposed techniques with Expert Opinion and Surveys

After computing the results of our proposed technique, we compared the findings with interviews of patients casually visiting hospitals. This procedure, which is weak in nature, was applied to persons randomly visiting the hospitals. The interviews covered almost 30 hospitals and contained both general and specific questions about COVID-19. From these interviews, we picked only the information relevant to the computed factors. Responses were graded on a scale from “poor” to “excellent”. We compared the patient rankings with the proposed influence-based model. The experimental results show that the achieved results are promising when compared with the recommendations of expert opinions. The performance of the proposed approach is encouraging in terms of accuracy. Figure 8 shows a graphical representation.

5 Conclusion and future work

Microblogging sites and online forums are receiving more attention in today’s world, as they allow users to share opinions and views with other users. Social media sites have an impact on people’s daily lives and shape their behaviour with respect to any opinion. Our paper presented a novel association rule-based method that uses social media sites and forums to measure the impact of COVID-19 influencing factors on Teens.

Compared with some earlier state-of-the-art approaches and high-quality techniques, the proposed method produces strong outcomes. It also predicts well when compared with survey and official results. The obtained results are quite promising. In future, researchers can further enhance the proposed work in terms of novel similarity measures.