A Multi-layered Psychological-Based Reference Model for Citizen Need Assessment Using AI-Powered Models

We propose an automatic, low-cost, large-scale, nonintrusive human need recognition framework that utilizes a multi-layered psychological-based reference model and comprises data collection, preprocessing, feature extraction, and contextualization modules. The reference model combines several classification and regression models to identify human psychological needs, measure their satisfaction levels, and evaluate the surrounding environment across different life aspects during any subjective event or emerging topic, at any time and in any location, using publicly available social media content. We evaluate the predictive power of various textual, psychological, semantic, lexicon-based, and Twitter-specific features. To provide benchmark results, we compare the performance of diverse machine learning algorithms. Our results confirm the effectiveness of the developed reference model. The framework is applied to recognize citizen needs in response to the New Zealand terror attacks of March 15th, 2019.


Introduction
Urban innovation and solutions driven by Information and Communication Technologies (ICT) have been progressively applied to enhance urban life in terms of economy, mobility, environment, people, living and governance. The realization of a true smart city vision is now closer than ever [1]. The number of applications and services adopting these technologies with the intention of improving the performance of urban services, and thereby enhancing the quality of life of citizens, is steadily growing [2]. For these applications to be effective, a variety of sensors is needed to continuously collect near real-time data. Currently, urban planners in a smart city rely mostly on data obtained from measurement equipment or physical "hard" sensors such as cameras, environmental sensors, implanted medical devices, or telematics systems in vehicles [3]. The data retrieved from the deployed sensors are fed into a large computing platform and then aggregated to provide a unified view of the city. Authorities then reference these data to make informed decisions on the management of the city and its events. The data retrieved from hard sensors, however, do not directly reflect the fluid responses of people to changes in their immediate surroundings at any given time. Simply deploying up-to-date technologies into the city's systems does not make it smart. Incorporating innovative technologies with mechanisms to capture citizens' needs and feedback is the ultimate endeavor for a city to be considered smart. According to [2], engaging citizens in the city planning and development process is paramount in the evolution of smart cities and should therefore be one of the main objectives when considering how to proceed. Moreover, in [4], Coe et al. mention that encouraging citizens' participation in the decision-making process could lead to more democratic communities.
In conclusion, more attention to the interaction between humans and urban space is needed to drive a more effective and relevant decision-making process. Consequently, the principle of utilizing citizens themselves as "soft sensors" should be seriously considered to enable a successful and efficient implementation of a smart city.
The city should be able to understand, interpret and adapt to the affective states of its citizens, referencing their emotions, moods, and personality traits. The interpretation of affective data can be utilized in urban planning as a complement to traditional hard sensors in evaluating ongoing planning processes, supporting decision-making, and providing additional insights about the city's inhabitants to create a big picture of the city's live state, as envisioned by the affect-aware city concept [5].
As we review previous works, we realize that recognizing a population's diverse affective states is practical and achievable when taking advantage of the wealth of social media content that is constantly changing in a dynamic fashion, based on local and global happenings. User-generated content (UGC) from multiple social media platforms comprises a vast amount of personal data, including users' daily thoughts, insights, evaluations, feelings, and emotions, expressed through their geotagged and time-stamped textual status updates. UGC is considered a rich source of information that can be used to reveal individuals' affective states, and, as a result, researchers have begun mining this massive resource of affect data for this purpose. For example, a large part of the existing research focuses on general sentiment analysis and opinion mining under three polarity categories: positive, negative and neutral [6,7], using lexicon-based methods and machine learning algorithms. Some studies go beyond general sentiment analysis, recognizing distinct and dimensional emotion categories [8,9] with the help of psychological lexicons. Certain recent studies have also explored more distinguishable long-term affective states such as personality [10] and mood [11], using psychometrics.
Despite the recent focus on analyzing "how" people feel in a city by determining their distinctive affective states (emotions, personality and mood), the exploration around the "why" behind these feelings, actions and behaviors has, to date, received very little attention.
Being aware of citizens' needs within a smart city can reveal the root motivations underlying their upsets, confusions and complaints, based on many human need theories (HNT) [12][13][14]. This awareness can be utilized by city planners and decision-makers to adapt the city's regulations, services, and plans in ways that mitigate, and eventually prevent, struggles, conflicts, and violent reactions.
Accordingly, our objective is to illuminate the vast benefits of automatically recognizing citizen needs. Therefore, in this work, we propose a theory-based, multi-layered reference model to assess citizen needs during any event, at any time, and in any location, using publicly available social media content. We design and develop need classification and regression models, each corresponding to one of the concepts that form the layered reference model, namely the need content recognition (NCR) model, need type identification (NTI) model, need satisfaction level measurement (NSM) model, social context evaluation (SCE) model, and life aspect identification (LAI) model. For a more comprehensive and deeper analysis, we develop the frustrated need intensity estimator (FNIE) model and the satisfied need intensity estimator (SNIE) model to determine the intensity score of the satisfaction level.
The rest of this article is structured as follows. In the next section, we provide a brief overview of the literature concerning the identification of human needs, after which we introduce the proposed multi-layered psychological-based reference model. We also present a need recognition framework that utilizes the reference model together with purpose-designed modules to analyze citizen needs. In the subsequent section, we describe the experiments conducted to evaluate the proposed reference model and present the obtained results. Before the concluding section, we use our developed framework to analyze citizen needs in response to the New Zealand terror attacks that occurred on March 15th, 2019. Our conclusion is drawn in the final section, with a summary of the main findings and some suggestions for future directions.

Related Work
Individual needs are typically assessed using approaches from psychological science [15]. Due to many limitations, these traditional approaches are considered inadequate for large-scale need recognition and analysis. They are time-consuming and impractical for analyzing individual needs frequently and interactively within a large group (i.e. a community). Moreover, conventional methods are limited to a small group of respondents, who reflect only a small percentage of the entire population of a city or community. Most conventional assessment surveys are designed with respect to one specific life aspect (e.g. relationships, work, etc.) and cannot be used to reflect multiple life aspects [16, 17]. Furthermore, they cannot efficiently capture the dynamics and context that embody need experiences. All of these limitations are barriers to analyzing millions of individual needs.
On the other hand, few attempts have been made to infer people's needs using social media content. They differ in their objectives, theoretical backgrounds, datasets and methodologies. Despite the recognized potential of social media platforms like Twitter as a data source for analyzing human needs, the few existing works have some drawbacks and limitations. We explain each work in detail below, clarifying its limitations and shortcomings in both the theoretical background and the recognition method. An IBM Research group [18] identified individual needs based on consumer behavior using Ford's model, with the aim of enhancing the quality of direct marketing and influencing purchasing behavior. The model includes 12 need categories that correlate with and explain consumer behavior, namely structure, practicality, challenge, self-expression, excitement, curiosity, liberty, ideal, harmony, love, closeness and stability. In their survey, they asked participants to list names of products they would like to buy and to write the need from the model that each product matches. The products mentioned in the responses were used to collect six million tweets to construct the dataset and build their need model. Relying on satisfiers (i.e. products) to identify individual needs is not efficient. According to human need theory [19], satisfiers, the ways people satisfy their needs, vary with gender, age and culture. Therefore, we cannot rely on satisfiers to predict need states. Moreover, the underlying need theory used to build Ford's model, Maslow's hierarchy of fundamental needs [20], is constrained by its cultural and hierarchical limitations [21].
Consider another mental state modeling work proposed by Rashkin et al. [22]. This work endeavors to understand the mental states of the actors in a simple story based on an event they experience. The authors proposed a dataset of 15,000 commonsense stories, manually labelled with emotion and motivation categories by crowdsourced workers from Amazon Mechanical Turk. They applied motivation categories inspired by Maslow's and Reiss' motivation theories, and for emotions they referenced Plutchik's basic emotions model. During the annotation process, they found that the annotators were not familiar with motivational theories and therefore found it challenging to assign motivation categories to the dataset. They trained a logistic regression model on TF-IDF features, as well as Neural Process Network (NPN), Recurrent Entity Network (REN), Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) encoders. The best model achieved a 35.23 micro-F score on their proposed story dataset and a score of 64.8 when trained on the open-text explanations provided by annotators using the TF-IDF schema. Their classifier was most effective at predicting only Maslow's physiological needs and Reiss's food motives, because both have clear indications in the text. Their approach is designed for long, formal stories and considers the multiple interactions between story characters.
Ding et al. [23] attempt to classify sentences in personal stories from web blogs based on seven human needs. Their work shows numerous shortcomings in both the theoretical background and the classification method. Not only did they fail to provide a concrete theoretical guideline for defining the need categories, they also did not consult an expert before proposing their taxonomy of needs and performing their annotation process. They claim their work was inspired by two human need theories, Maslow's Hierarchy of Needs and the Fundamental Human Needs theory; however, their proposed need categories do not show any connection to either theory. Another flaw lies in the fact that they conflate the concept of basic human needs with desires, wishes and goals, which are treated as separate concepts and unique entities in most psychological need theories. Moreover, within their methodology, the number of instances in their dataset (approximately 559) is not sufficient for a classification task involving more than nine classes. This may explain the poor performance (54.8 average F-score) reflected in their results. The limitations and drawbacks of the existing works motivate us to develop our psychological-based reference model to identify and assess citizen needs, to promote well-being and prevent conflict and violence.

Methodology
We propose a human need recognition framework to help authorities automatically monitor and analyze a population's needs at any time, for any event and in any location, in the hope of preventing conflict and violence and of assuring quality of life. In addition, authorities could interpret how people perceive their surrounding environment, recognize and understand the grounds of a population's discontent, and determine the appropriate actions. Figure 1 illustrates the proposed psychological need recognition framework. The framework employs a multi-layered psychological-based reference model and is designed with several modules, including data collection, preprocessing, feature extraction and contextualization, so that it can be utilized in a wide range of applications with diverse contexts. The multi-layered psychological-based reference model, the core element of the framework, and all the designed modules are explained in detail in the following sections.

Psychological-Based Multi-layered Reference Model
The psychological-based multi-layered reference model presented in Fig. 2 is developed based on several psychological theories [24,25]. To provide a comprehensive need analysis, multiple dimensions (i.e. concepts) of an individual's basic needs are adopted to construct the layers of the reference model. Layer 1 is formed to recognize need content by verifying the experienced emotion states. Based on human need theories, needs are inner states that cannot be recognized directly. However, self-determination theory (SDT) [26] states that needs are strongly correlated with emotions and feelings. Therefore, layer 1 aims to recognize emotions experienced by individuals to indicate their implicit needs. Layer 1.1 is constructed to identify the type of psychological needs present underneath the emotional states. In this layer, we consider the basic psychological needs proposed by Basic Needs Theory (BNT), namely Relatedness, Competence and Autonomy [27]. These needs are fundamental for all individuals, regardless of their gender, age, culture, ethnicity and religion [28]. This layer can be extended by integrating other types of human needs. Layer 1.2 is formed to measure the satisfaction level of the identified need. Relying on the polarity of the experienced emotions, we can indicate the need satisfaction level. Moreover, within layer 1.2, we determine the need satisfaction and frustration intensity scores to best understand the individual's experience. The intensity refers to the degree or strength of the emotions felt. Each emotional word used by individuals is associated with a different intensity. For instance, both depression and unhappiness belong to an emotion class that would be categorized as sadness; however, they differ in their intensities, where depression conveys a greater degree of sadness than unhappiness.
Because we understand that human needs are directly related to an individual's underlying emotions, we scrutinize the magnitude of the emotions to identify the intensity of the satisfaction or frustration level. In analyzing a tweet with a specific need satisfaction level, the goal is to determine the degree of need satisfaction or need frustration felt by the individual. The intensity level can be determined using qualitative categories (i.e. low, moderate and high) or using real-valued scores ranging from 0 to 1. A score of 1 means the highest satisfaction level, whereas a score of 0 means the lowest. The ability to automatically determine and measure the intensity of satisfaction levels can be beneficial in many need recognition applications. For example, an application that monitors citizens' psychological well-being could focus on detecting a significant need frustration in someone's life and subsequently recommend an appropriate coping and healing strategy to prevent critical mental health issues such as depression or suicidal thoughts [29]. As another instance, the early detection of a high need frustration level in public reactions during critical events or crises could aid in determining appropriate and immediate actions, which could prevent possible conflict and violence arising amid the chaos. Layer 1.3 is constructed to assess the quality of an individual's surrounding environment or social context. The layer aims to evaluate how different social contexts, ranging from interpersonal relationships (e.g. family) to more general settings and distal contexts (e.g. economic and political systems), affect the need experience, whether they help satisfy or frustrate the fulfilment of citizen needs. To promote psychological well-being [30] and to avoid violence, citizens require full support from their social contexts.
Since basic needs must be satisfied across all life domains [12], layer 1.4 of the reference model is formed to identify the aspect of life that involves the need experience. We specified the most important life domains in citizens' lives, including family, social relations, work, education, government, leisure, health and religion aspects, as well as a general evaluation [24,27,31].

Data Collection Module
The data collection module is responsible for retrieving the data of interest from different social media networks; in this work we use the Twitter platform. The textual and visual content is retrieved along with its metadata (i.e., posting time and geographical location) using the Twitter Search API. The social interaction information (i.e., retweets, replies and favourites) is also retrieved for each post. The textual data then pass through the Data Preprocessing Module and the Feature Extraction Module. After citizen needs are automatically recognized using the psychological-based reference model, the social interaction data and the metadata are used by the contextualization module to represent the results.

Data Preprocessing Module
Several preprocessing tasks are required to clean and prepare the text for the feature extraction steps and to further improve performance. Based on the goal of our reference model, we performed the following combination of preprocessing tasks: (1) Twitter non-textual elements such as URLs, videos, images and mentions are filtered out. (2) Emojis, emoticons, hashtags, symbols and punctuation are kept, since they play important expressive communication roles.
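As a minimal sketch, these two preprocessing rules can be implemented with regular expressions; the function name and patterns below are our own illustration, not the paper's exact implementation:

```python
import re

def preprocess_tweet(text: str) -> str:
    """Filter out URLs and @-mentions (rule 1) while keeping emojis,
    emoticons, hashtags, symbols and punctuation (rule 2)."""
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # drop URLs
    text = re.sub(r"@\w+", "", text)                   # drop mentions
    return re.sub(r"\s+", " ", text).strip()           # normalize whitespace
```

For example, `preprocess_tweet("I need help @user https://t.co/abc #lonely :(")` keeps the hashtag and emoticon but removes the mention and URL.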

Feature Extraction Module
Feature extraction is the process of representing each tweet as a feature vector. Each entry position in the vector corresponds to a feature type extracted from a tweet and represents the weight of that feature. We explore the use of textual features, psychological features, semantic features, lexicon-based features and Twitter-specific features in recognizing citizen needs. Table 1 shows the features used to develop the need models in each layer of the psychological-based multi-layered reference model.

Textual and Linguistic Features
Bag of Words Model (BoW) We used the well-known BoW model as textual features. The model considers all the distinct tokens (i.e. words, emojis and punctuation) in the dataset regardless of their order or the semantic dependencies between them. We used the Term Frequency-Inverse Document Frequency (TF-IDF) weighting schema to construct the feature vector for each tweet.

N-Gram Language Model (LM)
We explore continuous sequences of tokens with different n-sizes using N-gram models. LMs are useful for our identification tasks because they capture more information; for example, they can capture phrases and patterns (e.g. I like, I'm lonely) that the bag-of-words approach ignores. We weighted each extracted n-gram token by calculating its TF-IDF weight.
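Both the BoW and n-gram TF-IDF features can be produced in one step, for example with scikit-learn; the vectorizer settings below are illustrative assumptions, not the paper's exact configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Unigrams through trigrams, TF-IDF weighted. The permissive token_pattern
# keeps hashtags, punctuation tokens and one-character words.
vectorizer = TfidfVectorizer(ngram_range=(1, 3), token_pattern=r"\S+")

corpus = ["i like this city", "i'm lonely tonight"]  # toy tweets
X = vectorizer.fit_transform(corpus)  # shape: (n_tweets, n_ngram_features)
```

With `ngram_range=(1, 3)`, patterns such as "i like" become features alongside the individual tokens, which is exactly the extra signal the LM features add over plain BoW.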

Psycho-linguistic Features
We explore the use of different psychological features and evaluate how effective they are in providing insight into the way people express their needs on Twitter.

Linguistic Inquiry and Word Count (LIWC)
We explore the LIWC collection of lexicons, developed based on psychological and cognitive theories [32]. The LIWC lexicon consists of 4 dimensions (i.e. linguistic, psychological processes, personal concerns and spoken categories) and 92 lexicons, which were developed by psychologists. It has been used to recognize different psychological concerns and mental health disorders. Using LIWC, we extracted 92 features by calculating the percentage of total words in a tweet that match each of the predefined lexicons and formed the vector.
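Because LIWC itself is proprietary, the sketch below uses two toy word categories purely to illustrate how the percentage-based features are computed; the category names and word lists are our own, not LIWC's:

```python
# Toy stand-ins for two of the 92 LIWC category lexicons.
CATEGORIES = {
    "negemo": {"sad", "hate", "lonely", "hurt"},
    "social": {"friend", "family", "talk", "we"},
}

def liwc_features(tweet: str) -> dict:
    """Percentage of tweet tokens matching each category lexicon."""
    tokens = tweet.lower().split()
    n = len(tokens) or 1  # avoid division by zero on empty tweets
    return {cat: 100.0 * sum(t in words for t in tokens) / n
            for cat, words in CATEGORIES.items()}
```

The real feature vector would simply have 92 such percentage entries, one per LIWC lexicon.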

Linguistic Category Model (LCM)
To analyze the language used in interpersonal events, we used the LCM model, a psycho-linguistic classification approach that classifies the verbs people use during social events [33,34]. Using this approach, we can capture what is happening to a person during the event, his/her psychological state, as well as the characteristics of the others involved in the event. The model consists of three linguistic categories: Descriptive Action Verbs (DAVs), which provide a specific and concrete description of an action of short duration (e.g. hold, jog); State Verbs (SVs), which describe thoughts (e.g. think, understand) and affective states (e.g. hate, respect); and Interpretative Action Verbs (IAVs), which describe enduring behaviors and events without describing the features of the action (e.g. avoid, help, attend). For the LCM features, we calculate the frequency of DAV, IAV and SV verbs in a given tweet.
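A minimal sketch of the LCM counting features, using toy verb lists drawn from the examples above; the real model relies on curated psycholinguistic verb inventories:

```python
# Toy verb lists for the three LCM categories (illustrative only).
DAV = {"hold", "jog", "hit", "run"}              # descriptive action verbs
IAV = {"avoid", "help", "attend"}                # interpretative action verbs
SV = {"think", "understand", "hate", "respect"}  # state verbs

def lcm_features(tweet: str) -> tuple:
    """Frequency of DAV, IAV and SV verbs in a tweet."""
    tokens = tweet.lower().split()
    return (sum(t in DAV for t in tokens),
            sum(t in IAV for t in tokens),
            sum(t in SV for t in tokens))
```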

Twitter-Specific Features
Due to the length limit imposed on some social media platforms such as Twitter, users tend to use emojis, emoticons, hashtags (#), mentions (@) and expressive symbols [35].
Thus, we adopt Twitter-specific features including emojis and hashtags. We explore emojis in four ways: (1) the emoji frequency in a tweet, (2) the categories of emojis (e.g. foods, drinks, animals and nature) using the Twitter emojis listed in Emojipedia, (3) the sentiment of the emojis using the sentiment lexicon created by Novak [36], and (4) the color of the emojis (e.g. black, cream white and dark brown). For hashtag features, we calculate the frequency of "#words" included in the text.
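A rough sketch of the emoji-frequency and hashtag-frequency features; the emoji character ranges below are a simplification, and a production system would match against a full emoji inventory such as the Emojipedia listing:

```python
import re

# Approximate emoji ranges (symbols, pictographs); not exhaustive.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
HASHTAG_RE = re.compile(r"#\w+")

def twitter_features(tweet: str) -> dict:
    """Count emojis and hashtags in a tweet."""
    return {"emoji_count": len(EMOJI_RE.findall(tweet)),
            "hashtag_count": len(HASHTAG_RE.findall(tweet))}
```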

Lexicon-Based Features
Relying on the fact that the expressed emotions indicate the fulfillment of needs, we adopt features driven by the following existing sentiment and emotion lexicons to help measure the satisfaction level in layer 1.2 [26,37].

Bing Liu's Opinion Lexicon
The Opinion Lexicon is a manually constructed lexicon derived from customer reviews of product features [38]. It consists of 2006 positive words and 4781 negative words. For each tweet, we calculate the frequency of each word in the lexicons to form the feature vector.
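The word-frequency features used for this and the other polarity lexicons can be sketched as follows; the toy word sets stand in for the lexicon's roughly 2006 positive and 4781 negative entries:

```python
# Toy stand-ins for Bing Liu's positive/negative opinion word lists.
POSITIVE = {"good", "great", "love", "safe"}
NEGATIVE = {"bad", "terrible", "hate", "unsafe"}

def polarity_counts(tweet: str) -> tuple:
    """(positive-word count, negative-word count) for a tweet."""
    tokens = tweet.lower().split()
    return (sum(t in POSITIVE for t in tokens),
            sum(t in NEGATIVE for t in tokens))
```

The same counting pattern applies to the other frequency-based lexicon features described below (NRC Hashtag Sentiment, MPQA, EmoLex), with the word sets swapped out.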
Negation Lexicon Negation words can easily change the sentiment orientation from positive to negative and, in turn, completely reverse the intended meaning of a sentence [39]. Therefore, we used the total count of occurrences of negation terms (cues or negation signals) such as don't, aren't, neither and never as features.

NRC Hashtag Sentiment Lexicon
The NRC Hashtag Sentiment Lexicon is a large, automatically generated lexicon built from 775,000 tweets that were retrieved using 78 unambiguous, strongly positive and negative seed hashtags such as #amazing, #good, #excellent, #bad, and #terrible [40]. The lexicon contains 54,129 unigrams (single words), 316,531 bigrams and 308,808 pairs. Each of the words and phrases is assigned a real-valued score between −∞ (most negative) and +∞ (most positive).
We modified the NRC Hashtag Sentiment Lexicon by excluding a number of elements (i.e. numbers, mentions, punctuation and other non-functional words) that we deemed useless in our scenario. For each tweet, we calculate the frequency of positive and negative words to form the vector.
Multi-perspective Question Answering (MPQA) Lexicon MPQA is a subjectivity lexicon that identifies subjectivity clues and aspects, including sources of opinion, events, and sentiment expressions [41]. It contains a list of over 8000 subjective expressions collected from several resources. The expressions are manually compiled with prior polarities and tagged with their strength. For each tweet, we extracted two polarity features: positive words and negative words.

NRC Word-Emotion Association Lexicon
The NRC Word-Emotion Association Lexicon, also known as EmoLex, was constructed manually through a tagging process on the Amazon Mechanical Turk crowdsourcing platform [42]. The lexicon contains 14,182 unigram words tagged with eight emotion categories derived from Plutchik's wheel of emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust). The words are also tagged with polarity (negative and positive).
For each tweet, we extracted 8 features corresponding to the eight emotion category lists. We used EmoLex version 0.92. We also used the Expanded NRC Word-Emotion Association Lexicon, which expands the NRC word emotion association lexicon for the language used in Twitter.

Score-Based Sentiment and Emotion Lexicons
We used features derived from several sentiment and emotion-based lexicons. Specifically, we used score-based sentiment and emotion lexicons as features to determine the intensity scores for satisfied and frustrated needs. The intensity scores of each employed resource were mapped to the same range, from 0 to 1. The intensity score of each tweet is calculated by aggregating the intensity scores of the sentiment and emotion tokens (words and emojis) within the tweet, as provided by each of the lexicons. For any given tweet, each token is checked for a match in the lexicon. If a match is found, the associated intensity score is retained. All individual intensity scores are compiled to calculate the overall score for each lexicon, and these overall scores are used as features. We use a variety of popular and comprehensive lexicons that differ in their data type (i.e. general, Twitter-specific, emotion-based and sentiment-based lexicons) and in their annotation method (i.e. manually and automatically annotated lexicons). All the score-based sentiment and emotion lexicons are explained in detail below.
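This token-matching and aggregation step can be sketched as follows; the toy lexicon entries and the use of averaging as the aggregation function are our own illustrative assumptions (a real pipeline would first rescale each resource's scores into [0, 1]):

```python
# Toy score-based lexicon with intensities already mapped into [0, 1].
INTENSITY = {"unhappy": 0.45, "depressed": 0.85, "miserable": 0.70}

def tweet_intensity(tweet: str) -> float:
    """Aggregate per-token intensity scores into one tweet-level score."""
    tokens = tweet.lower().split()
    scores = [INTENSITY[t] for t in tokens if t in INTENSITY]
    return sum(scores) / len(scores) if scores else 0.0
```

One such aggregated score per lexicon becomes one entry in the feature vector.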
SentiWordNet The SentiWordNet lexicon is a publicly available synset-based sentiment lexicon introduced by Baccianella et al. [43]. It was constructed based on the WordNet lexical database, whose synsets are sets of synonyms meant to provide broad coverage. Each of the 115,000+ WordNet synsets is tagged with sentiment information automatically, using a semi-supervised machine learning method. Each synset is assigned sentiment score values in the interval [0.0, 1.0] corresponding to its degree of positivity, negativity or neutrality. We used SentiWordNet version 3.0.
SentiStrength (SS) SentiStrength is a lexicon-based method proposed in [43] for determining sentiment strength. The lexicon was developed for the short, informal language found on social media platforms. It consists of a combination of 2546 booster words, emoticons, negations and intensifiers from the LIWC lexicon. The words in the lexicon were annotated manually based on a corpus of 2600 MySpace comments. Each lexicon entry is assigned a score indicating the polarity and strength of the sentiment. Two scales are used to evaluate the sentiments: the first ranges from 1 (not positive) to 5 (extremely positive) to indicate the strength of positive sentiment, and the second from −1 (not negative) to −5 (extremely negative) to indicate the strength of negative sentiment. We used SentiStrength version 2.0. For each tweet, we extract two SentiStrength features: the sum of the positive scores and the sum of the negative scores.
AFINN AFINN is a manually constructed lexicon based on Bradley and Lang's Affective Norms for English Words (ANEW) lexicon [44]. ANEW provides emotional ratings for a large number of English words according to people's psychological reactions; however, it does not cover the informal words and slang commonly used on social media platforms. Nielsen [45] filled that gap by creating the AFINN lexicon, an updated version of ANEW that focuses on microblogging language. Over 3300 English words and phrases were rated for sentiment strength with a numerical score ranging from −5 (very strong negative sentiment) to +5 (very strong positive sentiment). For each tweet, we calculated the sum of the positive scores and the sum of the negative scores of the tweet words matching the lexicon.
Sentiment140 Sentiment140 is an automatically generated lexicon built from a corpus of 1.6 million tweets retrieved using noisy labels: positive and negative emoticons [46]. The lexicon contains words and phrases, including 62,468 unigrams, 677,698 bigrams and 480,010 pairs. Each listed word and phrase is associated with a real-valued sentiment score between −∞ (most negative) and +∞ (most positive), with 0 for neutral. To ascertain the intensity of a tweet, we extracted two features: for each tweet, we separately sum the positive scores and the negative scores of its words.

NRC Affect Intensity
The NRC Affect Intensity Lexicon consists of 6000 words annotated with four basic emotion labels: anger, fear, joy, and sadness [47]. All the words in each emotion category are rated with real-valued numerical scores indicating the intensity of that emotion. For instance, "sohappy" is labelled with the "joy" emotion with an intensity score of 0.86. From the NRC Affect Intensity Lexicon we extracted four features: the individual scores of the tweet words matching each of the four emotion classes are summed.
SenticNet SenticNet is a semantic and affective lexical resource that provides concept-level sentiment analysis for more than 100,000 natural language concepts [48]. It captures latent information in terms of semantics and sentics, where sentics are affect information expressed along four affective dimensions (pleasantness, attention, sensitivity, and aptitude). SenticNet also provides polarity scores between −1 (extremely negative) and +1 (extremely positive). The intensity score is defined based on the sixteen basic emotions of the well-known Hourglass of Emotions. SenticNet does not rely only on expressions that explicitly convey emotions through keyword counts and word co-occurrence frequencies; it can also leverage implicit sentiment by analyzing expressions with semantically related concepts.

NRC Hashtag Emotion Lexicon
Also known as NRC-Hash-Emo, this automatically created lexicon was amassed from tweets with emotion-word hashtags, utilizing the Hashtag Emotion Corpus (aka Twitter Emotion Corpus, or TEC) [41]. The corpus was constructed automatically using emotion hashtags such as #happy and #anger. The lexicon contains 16,862 word–emotion associations covering eight emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust). Each word in the lexicon is rated with a real-valued score between 0 (not associated) and ∞ (maximally associated).

Part of Speech (POS)
POS tagging is the process of labelling a word with its part of speech, such as verb, noun, adjective, or adverb, based on the detected context. This process has proven to be effective in affect classification. To use POS tags as features, we used the CMU ARK Twitter Part-of-Speech Tagger. This POS tagger, developed specifically for social media content, employs 25 tags that account for informal language and Twitter-specific properties [49]. Each token in a tweet is tagged with its POS, and we calculate the frequency of each POS tag in the tweet to construct the feature vector, yielding 25 POS-tag features.
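The tag-frequency feature vector can be sketched as below. We assume tokens have already been tagged; the small tag list is an illustrative subset standing in for the ARK tagger's 25-tag set:

```python
from collections import Counter

# Illustrative subset of POS tags (the ARK tagger defines 25 such tags).
POS_TAGS = ["N", "V", "A", "R", "!", "#", "@", "U", "E"]

def pos_features(tagged_tokens):
    """Given (word, tag) pairs for one tweet, return a fixed-order
    vector of per-tag frequencies."""
    counts = Counter(tag for _, tag in tagged_tokens)
    return [counts.get(tag, 0) for tag in POS_TAGS]
```

A fixed tag order keeps the feature vector aligned across tweets, which the downstream classifiers require.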
Word Embeddings
A word embedding is a learned vector representation of a particular word. It captures generic aspects of language structure, namely semantic and syntactic similarity between words: words that frequently occur in similar contexts tend to be semantically similar and occupy similar regions of the vector space. Using word embeddings as features has recently proven effective in many text analysis tasks [50]. Pre-trained word embeddings are utilized when a text analysis task suffers from low resources and a large dataset cannot be obtained [51], so it is beneficial to incorporate this type of feature in designing our need models. In addition to capturing semantic and syntactic context similarity, it is imperative in our need detection framework to preserve the emotional meaning of words as well. Therefore, we used the pre-trained Emotion Word Embeddings (EWE) proposed by Agrawal et al. [52]. EWE is an emotion-enriched word embedding that enhances regular embeddings with respect to emotion analysis. It captures the affect information in tweets, while existing generic word embeddings such as Word2vec and GloVe [53] capture only syntactic and semantic information. As mentioned in [52], according to the cosine similarity scores between word vectors from the most popular pre-trained embeddings (GloVe and Word2vec), the word pair (happy, sad) is more similar than (happy, joy). This example highlights the limitation of using generic word embeddings in emotion-based analysis tasks that require more attention to affect expressions. Agrawal et al. [52] designed their emotion word embedding using an emotion model firmly grounded in psychology; by incorporating EWE as features, we can capture words with similar emotional meaning, such as (happy, joy). The tweet-level feature vector is calculated by aggregating the embedding values of the words within the tweet using an average word embedding scheme, yielding a 400-dimensional vector for each tweet.
Zero values are added for words with no corresponding embedding.
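The averaging scheme described above can be sketched as follows. A toy 4-dimensional embedding table with made-up values stands in for the 400-dimensional EWE vectors:

```python
import numpy as np

EMBED_DIM = 4  # illustrative; the paper's EWE vectors are 400-dimensional
EMBEDDINGS = {
    "happy": np.array([0.2, 0.9, 0.1, 0.0]),
    "joy":   np.array([0.3, 0.8, 0.2, 0.1]),
}

def tweet_embedding(tokens):
    """Average the word vectors of a tweet; out-of-vocabulary words
    contribute a zero vector, as described in the text."""
    vectors = [EMBEDDINGS.get(t, np.zeros(EMBED_DIM)) for t in tokens]
    return np.mean(vectors, axis=0) if vectors else np.zeros(EMBED_DIM)
```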
Life Aspect Lexicons
Classifying a short text into one of nine life aspect categories can be very challenging, especially with an unbalanced dataset. External knowledge resources are necessary to help overcome this complication; relying on lexical resources as external knowledge has proven beneficial for short text categorization and word sense disambiguation [54]. We constructed nine life aspect lexicons to help classify the input textual data into the most dominant life aspect. The construction of the nine life aspect lexicons involved the following two steps:

1. Selecting the initial seed words.
To establish an initial list of words related to each of the nine life aspect concepts, we referenced online dictionaries and thesauruses. One such resource, the Oxford mini-dictionaries, provides lists of words and phrases categorized by subject and topic. We collected the terms listed in each topic category that were relevant to an individual life aspect category. For example, the Oxford mini-dictionaries related to the health domain consist of sub-categories including "diet", "fitness", "illness", "medicine" and "mental health". Each of these sub-categories (e.g. illness) has its own sub-sub-categories (i.e. "ailments and diseases", "being ill", "injuries" and "recovering from illness"), and each sub-sub-category consists of lists of related words and terms such as "recover", "healing" and "well". This collection step was performed for all nine life aspect categories: education, family, work, health, government/political, social relation, leisure, religion, and general evaluation. For the general evaluation aspect, we constructed the lexicon from words and phrases that express constancy and frequency of occurrence, such as "all the time" and "every time".

2. Expanding the word lists.
Once the initial list of words was amassed, we moved to the second stage, where we extended the list of words in each lexicon by identifying and collecting synonyms and alternative expressions. WordNet, a large and well-constructed lexical database developed at the Cognitive Science Laboratory at Princeton University, provided a good starting point. WordNet organizes knowledge in a way that reflects current psycholinguistic theories of how humans archive their lexical memories. Its information is catalogued into logical groupings based on sets of cognitive synonyms called synsets. Each synset consists of a list of synonymous words inter-linked by conceptual semantic similarity relations; synsets are built with an intrinsic correlation between a concept and its corresponding words, which are interchangeable across many contexts. The semantic relations are of different types, including the (Is-a/Has-a), (Part-of/Has-part), (Member-of/Has-member) and (Substance-of/Has-substance) relations. Figure 3 shows a graph from the WordNet visual dictionary to explain the (Is-a) relation. Table 2 shows examples of words included in each of our life aspect lexicons.
The life aspect features are extracted as follows: first, we count all the text elements in a tweet, including punctuation, numbers, emoji and words. Then we compare each word in the tweet against all words in the predefined life aspect lexicons. The percentage for each life aspect lexicon is calculated by dividing its word-frequency count by the total number of structural elements in the tweet, then multiplying by 100. Phrases such as "all the time" and "every time" from the general life evaluation lexicon are counted as a single word rather than as separate words. Figure 4 explains how the life aspect lexical features are extracted. In the example, the tweet has 21 elements: four words from the work category lexicon compose 19.04% of the tweet, and one word from the leisure category lexicon represents 4.761% of the tweet.
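The percentage computation can be sketched as below. The two tiny lexicons are hypothetical; for simplicity the sketch matches single words only, whereas the full method also counts multi-word phrases as one element:

```python
def life_aspect_percentages(elements, lexicons):
    """elements: every token of the tweet (words, punctuation, emoji);
    lexicons: {aspect_name: set of lexicon words}.
    Returns the percentage of elements matching each aspect lexicon."""
    total = len(elements)
    return {aspect: 100.0 * sum(1 for e in elements if e in words) / total
            for aspect, words in lexicons.items()}

# Mirrors the worked example: 21 elements, 4 work matches, 1 leisure match.
LEXICONS = {"work": {"job", "office", "meeting", "deadline"},  # hypothetical entries
            "leisure": {"movie"}}
ELEMENTS = ["job", "office", "meeting", "deadline", "movie"] + ["w"] * 16
```

With these inputs the work lexicon covers 4/21 of the tweet (about 19.04%) and the leisure lexicon 1/21 (about 4.76%), matching the example above.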

Designing and Developing Psychological Need Models
To design and develop the classification and regression psychological need models within the reference model's layers, we adopt the learning method explained in [55]. Based on the first three layers of the psychological-based reference model, the authors designed and developed the need content recognition (NCR) model, the need type identification (NTI) model and the need satisfaction level measurement (NSM) model. In this paper, we extend that work by exploring different feature sets as well as different machine learning algorithms. We also design and develop the social context evaluation (SCE) model and the life aspect identification (LAI) model, corresponding to layer 1.3 and layer 1.4, respectively. In the process of developing the SCE and LAI psychological need models, we used the psychological need dataset proposed in [56]. The dataset exhibited a class-imbalance problem for layer 1.3 and layer 1.4; therefore, we used different sampling techniques, including resampling and SMOTE, to balance the class distribution, as Table 3 shows [57]. In layer 1.3, we used an under-sampling technique to randomly remove 32.73% of the instances of the "Not clear" class. For the minority class "Support social context", we increased the instances by 120%. In layer 1.4, we decreased the number of instances of the majority class "Social relation" by 42.3% and increased the minority classes "Religion" and "Government" by 400% and 300%, respectively, using SMOTE.
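A minimal sketch of the rebalancing step is shown below, using random under-sampling and duplication-based over-sampling. Note this is a simplification: SMOTE synthesizes new minority-class samples by interpolation rather than duplicating existing ones, and in practice a library such as imbalanced-learn would be used for that part:

```python
import random

def undersample(instances, fraction_to_remove, seed=0):
    """Randomly drop the given fraction of a majority class
    (e.g. 0.3273 for the "Not clear" class in layer 1.3)."""
    rng = random.Random(seed)
    keep = round(len(instances) * (1 - fraction_to_remove))
    return rng.sample(instances, keep)

def oversample(instances, increase_pct, seed=0):
    """Naively grow a minority class by the given percentage
    (e.g. 120 for the "Support social context" class)."""
    rng = random.Random(seed)
    extra = round(len(instances) * increase_pct / 100)
    return instances + [rng.choice(instances) for _ in range(extra)]
```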
Moreover, we extend and improve layer 1.2 by developing the frustrated need intensity estimator (FNIE) model and the satisfied need intensity estimator (SNIE) model to determine the intensity score of the satisfaction level. To ascertain this score, we used an existing emotion intensity dataset called EmoInt to train and evaluate our proposed intensity estimator models. EmoInt is a benchmark tweet dataset with affect-related intensity scores proposed for the WASSA-2017 shared task on emotion intensity [58]. In its construction, 24 ranking judges manually annotated 7097 tweets with four emotion categories (anger, fear, joy, and sadness) along with an associated intensity score ranging from 0 to 1. Since need satisfaction elicits positive emotions and need frustration leads to negative emotions, we divided the EmoInt dataset into positive and negative emotion datasets. The negative intensity dataset included 5465 tweets reflecting the emotions anger, fear, and sadness, while the positive intensity dataset encompassed 1608 tweets expressing joy. For each of the two datasets, we trained a regression model to determine the intensity score. In layer 1.2, if a tweet is classified as expressing a satisfied need, we apply the satisfied need intensity estimator (SNIE) model within the same layer to derive the intensity score of the satisfied need; if a tweet is classified as frustrated, we use the frustrated need intensity estimator (FNIE) model to determine the frustration intensity score. After extracting the features listed in Table 1 for each layer of the reference model, all the features are normalized using Min-Max scaling. Before training the machine learning algorithms, we removed irrelevant and noisy features using the GainRatio feature selection technique [59].

Contextualization Module
The post-interaction data (i.e. retweets, favourites and mentions) and metadata (i.e. time and location) are used by the Contextualization Module for result analysis and representation.

Experiment and Evaluation
We experimentally evaluate the performance of our proposed reference model on the psychological need dataset and the EmoInt dataset. The experiments are designed to measure the effectiveness of the five classification models of our framework (the NCR, NTI, NSM, SCE and LAI models) and the two intensity estimator regression models (FNIE and SNIE). Since we are reporting baseline results, a set of experiments was conducted to validate the importance of different textual, semantic, psychological and Twitter-specific features along with different machine learning algorithms.

Experimental Setting
The psychological human need dataset was divided into 70% for training and 30% for testing, while preserving the class distribution. For the classification tasks, we explored well-known machine learning algorithms, singled out for their varied learning methods and techniques, to provide benchmark results. Based on best practices for classical machine learning engineering, we selected the Multinomial Naive Bayes algorithm (MNB) as a probabilistic classifier, the Support Vector Machine algorithm (SVM) as a linear-decision-boundary algorithm, Logistic Regression (LR) as a regression classifier, the Random Forest algorithm (RF) as an ensemble classifier, the K-Nearest Neighbors algorithm (K-NN) as an instance-based algorithm, and the Decision Tree algorithm (DT) as a rule-based classifier [60]. It is worth noting that in 2012, Hinton and Krizhevsky [61, 62] showed that deep learning models are typically superior to other machine learning methods for large datasets where complex processing is required. Since our classification process is linear and our dataset is comparatively small, deep learning analysis and prediction are kept outside the main scope of this paper.
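The experiments themselves are run in Weka; as a hedged scikit-learn analog (synthetic data stands in for the psychological need dataset), the stratified 70/30 split and the six-classifier benchmark look roughly like this:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic placeholder data; the paper uses its psychological need dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X = abs(X)  # MultinomialNB requires non-negative feature values

# 70/30 split preserving the class distribution (stratify=y).
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

classifiers = {
    "MNB": MultinomialNB(),
    "SVM": LinearSVC(),
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(random_state=42),
    "K-NN": KNeighborsClassifier(),
    "DT": DecisionTreeClassifier(random_state=42),
}
scores = {name: accuracy_score(y_te, clf.fit(X_tr, y_tr).predict(X_te))
          for name, clf in classifiers.items()}
```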
For regression tasks, we used the Support Vector Regression (SVR) algorithm. We split each of the positive and the negative intensity datasets into 70% for training and 30% for testing.
We used the machine learning algorithm libraries implemented in Weka (Waikato Environment for Knowledge Analysis), a machine learning software suite developed at the University of Waikato [63]. While conducting the experiments, we followed an incremental feature selection approach: we comparatively evaluated the predictive power of each distinct feature and reported the obtained results before and after each feature addition, both as a set and individually.

Evaluation Metrics
The classification models' performance is assessed using the metrics of recall, precision, F score and accuracy. The regression models are evaluated using Pearson's correlation coefficient (r) between the model predictions and the EmoInt gold ratings, and the root mean squared error (RMSE). RMSE indicates the average prediction error between actual and predicted values; squaring the errors gives statistical weight to irregular values that might otherwise be masked. Compared to traditional error measures such as the error rate (ER) and the mean absolute deviation (MAD), RMSE performs better in terms of error prediction accuracy as well as monotonicity, since it has the advantage of penalizing large prediction errors more heavily.
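The two regression metrics can be written out directly; this stdlib sketch computes them from paired prediction/gold lists:

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(pred, gold):
    """Root mean squared error; squaring penalizes large errors."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred))
```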

The Effectiveness of the Need Content Recognition (NCR) Model
We first study the impact of the different textual features, the BoW and n-gram models (unigrams, bigrams and trigrams), on the model's performance, considering them both individually and in combination. The graphical comparison in Fig. 5 illustrates these interactions. As the graph shows, using unigrams, and the combination of unigrams, bigrams and trigrams, yields the best performance compared to the bigram and trigram models alone. The combination of unigrams, bigrams and trigrams provides slightly better results for the following classifiers: SVM with an accuracy of 64.88%, K-NN with 62.1%, RF with 67.18% and DT with 66.5%. Next, we investigate the impact of the LIWC psychological lexicon, emoji and number-of-hashtag features on the models' performance. Table 4 shows the accuracy and F score for all the selected machine learning algorithms, using all the features both as a set and individually. From Table 4 we notice that using the LIWC psychological lexicon alone has a positive influence on the need content recognition model for all classifiers. In addition, considering the frequency of emoji use seems to provide a slight advantage to all the models. There are no significant changes in model accuracy after combining the LIWC lexicon with the frequency-of-emoji and number-of-hashtag features. Combining the LIWC lexicon, the frequency of emoji, the number of hashtags and the n-gram model improved the accuracy significantly for the MNB classifier (by 11.78%) and slightly for LR (by 4.87%), K-NN (by 3.56%) and DT (by 1.64%), with no change in RF accuracy. The SVM classifier achieved an accuracy of 64.81%, lower than with individual features, due to the large number of features used.
The problem of data sparsity and high dimensionality is addressed using GainRatio. Out of the 24,013 features, we examine the most predictive features using different threshold values (0.01, 0.05 and 0.09). A threshold of 0.01 leaves 2526 features and improved the accuracy for SVM by 13.9%, LR by 0.64%, and DT by 5.51%. A threshold of 0.05 resulted in 2106 features and boosted the accuracy by 6.32% for MNB, 13.29% for K-NN and 10.72% for RF, as shown in Fig. 6. The MNB classifier achieved the best accuracy in recognizing need content at 79.13%, with a 0.77 F score, as Table 10 shows.

The Effectiveness of the Need Type Identification (NTI) Model
The experimental results with textual features show that the combination of unigrams, bigrams and trigrams achieved better results for the SVM, MNB and RF classifiers, as depicted in Fig. 7. Using only the unigram feature provides the best accuracy for K-NN and DT, while the combination of unigrams and bigrams is best for the RF classifier. Table 5 lists the outcomes of using the other features. All the classifiers achieved acceptable accuracies after applying the psychological features derived from the LIWC and LCM lexicons (DAVs, IAV, SV); this is because the LIWC lexicon has different psychological dimensions that can be directly linked to each need type. Adding the number-of-hashtag feature and the frequency of emoji slightly increased the accuracy for SVM and RF. Using all the emoji features, including the frequency of emoji, the categorized emoji and the colored emoji, did not improve the accuracies, except for a slight increase for the SVM classifier. The combination of all the features results in the maximum accuracy for all the classifiers, especially MNB, with an absolute accuracy gain of 13.31%. After using GainRatio to select the best of the 9500 features for each classifier with different threshold values, the accuracy increases marginally, as shown in Table 10 and Fig. 8. The best classifier in identifying need type is SVM, with an accuracy of 81.97% and a 0.81 F score.

The Effectiveness of the Need Satisfaction Level Measurement (NSM) Model
Based on the comparison graph for the BoW and n-gram models in Fig. 9, the highest accuracy was achieved by the SVM, LR and DT classifiers when combining unigrams, bigrams and trigrams, whereas MNB and RF achieved their best accuracies when combining unigrams and bigrams. The K-NN classifier responded best when using the unigram model alone.
As illustrated in Table 6, the RF and DT models performed very well and exhibited high accuracy when applying the LIWC psychological lexicon. Since the LIWC lexicon has a dimension for affective processes with many sentiment (positive and negative) and emotion categories (anxiety, anger and sadness), this could explain the increase in performance. Adding the emoji features, including the frequency of emoji and the sentiment, categorized and colored emojis, increases the accuracy slightly, with the exception of the K-NN classifier. While adding the NRC and opinion sentiment lexicons did not significantly affect the accuracy of most models, there was a slight rise (2.17%) in accuracy for the DT classifier. A markedly high performance is produced for all classifiers except K-NN when combining all the features, especially the SVM classifier, which achieved the highest accuracy of 93.10%. The K-NN classifier achieved its best result using the LIWC psychological features alone, with an accuracy of 75.31%. With feature selection, applying the most predictive features at a threshold of 0.01 gives the highest accuracy of 93.56% for SVM and 92.25% for MNB, as shown in Table 10 and Fig. 10. A threshold of 0.05 also gives the best result for the SVM classifier (93.56% accuracy) and for RF (92.28%). For the K-NN classifier, eliminating features with a 0.09 threshold increased the accuracy to its best value (89.84%). Among all the evaluated machine learning algorithms, SVM is the best at determining the need satisfaction level (Table 10).

The Effectiveness of the Intensity Estimator Models
The results show that the combination of the unigram and bigram models, as well as the combination of all the n-gram models, provides good outcomes (r = 0.62 for SNIE and r = 0.61 for FNIE) and tends to be more predictive than the other models, as Fig. 11 shows. As illustrated in Tables 7 and 8, using the LIWC lexicon leads to an average r of 0.58. A noticeable improvement is also obtained after applying GainRatio, which increases the r values from 0.71 to 0.73 for the SNIE model and from 0.68 to 0.72 for the FNIE model, as Table 9 and Fig. 12 show.

The Effectiveness of the Social Context Evaluation (SCE) Model
As Fig. 13 shows, after comparatively evaluating the predictive power of the BoW model and the n-gram models, the combination of unigrams, bigrams and trigrams is selected because it provides the best results for most of the classifiers. As demonstrated in Table 11, using the LIWC and LCM (DAVs, IAV, SV) psychological features also yields decent outcomes for all the classifiers: RF achieved the highest accuracy at 71.45%, and DT the lowest at 60.41%. Using the pre-trained emotion embeddings produced acceptable results for some classifiers, including SVM (65.45%), RF (65.82%), LR (64.87%) and K-NN (59.83%), while lesser performance was noted for the MNB (48.63%) and DT (49.36%) classifiers. Using sentiment-based lexicon features showed satisfactory results for most classifiers; however, all classifiers except DT achieved better accuracy when using emotion-based lexicon features. Specifying emotion categories such as "disappointment", "anticipation" and "trust" helped determine the surrounding social context type. For all classifiers, combining the psychological LIWC feature with emotion-based lexicon features enhanced the accuracy more than when these features were utilized individually, as did combining them with the 1–3 gram model (Fig. 14). As Table 10 shows, RF achieved the best result in evaluating the social context (surrounding environment) when using the LIWC and LCM (DAVs, IAV, SV) psychological features in combination with textual features (Table 11).

The Effectiveness of the Life Aspect Identification (LAI) Model
The experiments with textual features show that the unigram model gives the best performance in identifying the life aspect when compared to the bigram and trigram models and their combinations. This indicates that, when identifying the life aspect, it is not necessary to detect phrases and sequences of words, as in the previous need layers; individual terms are more effective. Figure 15 provides a graphical comparison between classifiers using textual features. As Table 12 shows, acceptable results were achieved for some classifiers using the LIWC lexicon: 51.69% for SVM, 51.65% for LR, and 51.80% for RF. Meager results are noted for MNB (39.06%), K-NN (36.97%) and DT (39.79%). Adding the eight life aspect lexicons improved the accuracy for all the classifiers. The experiments show that combining the unigram model with the LIWC lexicon optimized the accuracy for LR at 56.82%, while combining the unigram model with the eight life aspect lexicons provided the best accuracy for MNB (57.72%), K-NN (45.26%) and RF (55.46%). For SVM and DT, combining all the features improved their accuracy to 55.52% and 49.43%, respectively. As Fig. 16 shows, eliminating the least useful features improved the accuracy for all the classifiers: SVM by 4.9%, MNB by 2.03%, LR by 4.25%, K-NN by 6.37%, RF by 1.75%, and DT by 1.02%. The best classifier in identifying the life aspect is LR, with 60.48% accuracy, as Table 10 illustrates.

New Zealand Terrorist Attacks
In this case study scenario, we analyze psychological needs during the New Zealand terrorist attack event to recognize public reactions based on the event's evolution and the authorities' responses. The Christchurch terrorist attack was a violent religion-based attack which occurred on March 15th, 2019, when an Australian gunman orchestrated two consecutive mass shootings on mosques during prayer in the city of Christchurch, New Zealand. Fifty-one worshipers were killed and more than 40 injured. According to the news, the shooter published a racist manifesto detailing his motivations for the attacks on social media platforms. On Friday morning, he posted a tweet detailing his intentions to attack the two mosques, saying "I will carry out an attack against the invaders and will even live stream the attack via Facebook". Later that day, he did, indeed, live stream the first attack using Facebook's live service.

Psychological Human Need Analysis
We analyzed people's psychological needs during this event to illustrate how the dynamic evolution of the event (investigations, updates, authorities' responses and support) can affect and change a population's satisfaction levels. We collected tweets during the day of the event, as well as the following 6 days, utilizing all the possible event-related hashtags, such as #ChristchurchMosqueAttack, #NewZealandTerroristAttack and #NewZealand, to retrieve tweets. This step allowed us to gather approximately 85,000 tweets along with their attached images and metadata, including each post's geo-tagged location and time stamp. These tweets then went through the Data Preprocessing Module and the Feature Extraction Module in each of the framework layers. The processed tweets were fed to the NCR model to filter out useless tweets that did not express human needs; 13,675 of the 85,000 tweets were consequently discarded. The remaining 71,325 useful tweets expressing people's psychological needs were further analyzed to (1) identify the need type using the NTI model, (2) measure the satisfaction level using the NSM model, (3) evaluate the social context surrounding people using the SCE model and (4) identify the life aspect of the expressed need using the LAI model. For a more in-depth analysis, we applied the SNIE and FNIE models to determine the intensity score of the satisfaction level.
This analysis shows that, throughout this event, people expressed the relatedness need most, at 63.61%. The competence need followed at 19.57%, and the autonomy need was expressed least. As Fig. 17 shows, among the 7 days of the event, tweets representing the relatedness need appeared more on March 15th, 16th and 20th. Competence-need-related tweets appeared more on March 18th and 19th, while the autonomy need emerged more on March 17th. As Fig. 18 shows, tweets posted during this event reveal that people felt frustrated 56.73% of the time, whereas only 29.06% of tweets expressed satisfied needs; the latter number rises on specific days as a reaction to certain happenings, as demonstrated in detail below. Roughly 32.89% of tweets were classified as not having a clear satisfaction level and were, therefore, not considered in further analysis. Deeper analysis reflected a high frustration level for the relatedness need, at 30.87%, compared with the competence need (7.43%) and the autonomy need (5.28%), as shown in Fig. 19. Event progression, including police investigation updates, authorities' responses and surrounding support, all affect human satisfaction levels; therefore, we used the same approach to create a timeline-based textual and visual representation of the event. We also relied on the news to understand the reasoning behind some ambiguous hashtags and keywords which appeared frequently in the tweets. Figures 20 and 21 illustrate the changes in need satisfaction levels for the identified need types of autonomy, relatedness and competence throughout the event.
On March 15th, the day of the attacks, most posted tweets expressed a high frustration level for all three needs: relatedness 67.14%, autonomy 43.7% and competence 55%. Less than half of the tweets expressed satisfied needs. On this day, people immediately started categorizing the event as a religious/racist attack, as reflected in word cloud (a) in Fig. 22. They used hashtags such as #muslims, #terrorist, #islamophobia, #breakingnews, #muslimsarenotterrorist and #brentonTarrant (the shooter's name), as can be seen in Table 13. They expressed their prayers and condolences to the victims and their families using the #prayforNewZealand and #rip hashtags. People also used #Facebook to talk about the shared livestreamed video of the two attacks. Through the news, we came to understand the usage of #hellobrother, which was among the top ten hashtags: "Hello Brother" were the last words of the shooter's first victim before he was shot and killed. An image with the "Hello Brother" title was the most shared image on that day.
Frustration levels for all three needs escalated to their highest on the second day, March 16th, when the relatedness need reached 71%, competence 58.05% and autonomy 54.8%. The satisfaction level deteriorated on this day, especially for the autonomy need (from 40.18% to 39.57%). Hashtags such as #whitesupremacist, #whiteprivilege, #whiteterrorism and #whitesupremacy were among the most frequent hashtags describing the shooter's motivation behind the attack. Moreover, #49lives emerged that day to count and update the loss of lives. As Fig. 23 shows, the shooter's image was prominent in the image collections on the second day, as well as a magazine picture depicting similar attack events in different ways. On March 17th, there was a noticeable change in the need satisfaction levels: tweets expressing dissatisfied needs dropped significantly for all three needs, by 11.47% for relatedness, 15.38% for autonomy and 15.15% for competence, as Fig. 5 shows. Moreover, in Fig. 24, we can see that dissatisfaction expressions with high and moderate intensity levels appeared in tweets less than on the previous days. There was also an increase in the number of tweets expressing satisfied needs for all three need categories, reflecting less intense emotional expressions, as Fig. 25 shows: satisfied relatedness needs increased by 5.18%, the competence need by 13.8% and the autonomy need by 14.39%.
As Table 13 and word cloud (c) in Fig. 22 show, new topics of conversation arose centering on the Prime Minister's responses during her statement, using #jacindaardern, #theyareus and #50lives. Moreover, #eggboy and #fraseranning trended on Twitter that day and garnered worldwide attention: people shared their feelings and opinions about the actions the youth took in response to an Australian politician's comment regarding the New Zealand attack during a live interview on March 16th, in which the Australian Senator blamed the New Zealand shooting on immigration. Images illustrating support, solidarity and emotional reactions were among the most frequently shared images on that day, even more so than on the first and second days.
On March 18th and 19th, there was no significant change except a drop in the satisfied autonomy need, from 53.96% to 45.89%, and an increase in the frustrated autonomy need by 7.53 percentage points. New hashtags in the tweet collections gathered on March 18th and 19th, including #haka, #trump, #far-right, #solidarity, #facebook and #notevenhisname, led the public conversation and discussion. We scrutinized the news and the complete tweet texts to decode the appearance of some ambiguous or unclear hashtags among the most frequent ones. For example, #haka references a ceremonial dance which people performed to honor the victims as a mark of sympathy and solidarity. #notevenhisname is a phrase the New Zealand Prime Minister used in her statement on March 17th, where she refused to mention the shooter's name. In addition, people used #Trump most frequently to report their opinions concerning his responses/tweets regarding the attacks. In regard to the hashtag #facebook, Facebook released a statement explaining their actions towards the shared video and how they planned to set rules to reduce online hate and violent content in future. This came in the aftermath of the global backlash Facebook faced over its failure to prevent the attack from being livestreamed. Many big-name companies stopped advertising on social media platforms, including Twitter and Facebook, due to this failure to identify and detect hate content. This supports our hypothesis that analyzing social media content could help prevent the facilitation of violent behaviour leading to tragic events. Word cloud (c) in Fig. 22 shows the most frequently used words and phrases on these days, and images showing support and solidarity still prevail in the image collection, as Fig. 23 shows. The following tweets were posted on March 18th and March 19th.
On March 20th, the autonomy needs changed noticeably, in opposition to the other needs. The satisfied autonomy need heightened by 5.71%, and the frustrated autonomy need dropped from 46.3% to 42.09%. In contrast, the satisfied relatedness need depreciated slightly. Fig. 23 shows how images reflecting support and solidarity continued to lead the visual representation. On Thursday, March 21st, within all three need categories, people's satisfaction levels increased and their frustration levels decreased. They also expressed fewer moderate- and high-intensity frustration words and more low-intensity language compared to the beginning of the event, as Fig. 24 shows. New sub-topics mentioned that day were #respect, #peace, #headscarfforharmony, #leadership and #banassaultweapons (Fig. 26). On this day, women announced a "scarves in solidarity" event to be held on Friday using the hashtag #headscarfforharmony; the #respect, #peace and #leadership hashtags were mentioned frequently in tweets discussing the NZ Prime Minister's speech and how people in New Zealand supported one another and dealt with the tragic event.

Satisfaction Level Intensity Estimation
Gauging the intensity of the language used in posts to express satisfaction and frustration needs gave us a deeper understanding of the population's feelings and reactions in regard to their levels of satisfaction versus frustration. As we can see from Fig. 24, people used high and moderate emotional expressions to convey their frustration level, which revealed how angry they were throughout the event.
On the other hand, they used expressions between low- and moderate-intensity levels to communicate their satisfaction level. Very few high-intensity satisfaction expressions were reflected during this event, as Fig. 25 shows.
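The bucketing of posts into low, moderate and high intensity bands could be sketched as follows. The FNIE and SNIE models in the paper are learned estimators; the tiny `INTENSITY_LEXICON` and the thresholds below are invented stand-ins for illustration only:

```python
# Hypothetical intensity lexicon: each expression maps to a score in [0, 1].
# The paper's FNIE/SNIE models are trained regressors; this is a toy stand-in.
INTENSITY_LEXICON = {
    "furious": 0.9, "outraged": 0.85, "angry": 0.6,
    "upset": 0.4, "sad": 0.3, "grateful": 0.5, "thankful": 0.45,
}

def intensity_bucket(text):
    """Map a post to a low/moderate/high band via its strongest matched term."""
    words = text.lower().split()
    scores = [INTENSITY_LEXICON[w] for w in words if w in INTENSITY_LEXICON]
    if not scores:
        return "none"        # no intensity-bearing term found
    peak = max(scores)
    if peak >= 0.7:
        return "high"
    if peak >= 0.4:
        return "moderate"
    return "low"

print(intensity_bucket("I am furious about this"))   # high
print(intensity_bucket("so sad today"))              # low
```

Aggregating these bands per day produces distributions of the kind plotted in Figs. 24 and 25.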

Social Context Analysis
For social context analysis, Fig. 27 shows how people in New Zealand observed their surroundings and evaluated their social context, using the SCE model to identify social context types during the New Zealand terrorist attacks. We notice that on the day of the event and the next day, people reported that they did not perceive a supportive environment, at rates of 30.2% on March 15th and 31.7% on March 16th, the highest compared to the other days. This confirms that their frustration was at its highest level in the first 2 days, as shown in Fig. 18. Tweets reflecting that people identified their environment as non-supportive appeared less often on March 17th and March 20th. The highest volume of tweets expressing a supportive social context came on March 20th. On that day, and for the first time, people posted more tweets indicating that they felt they had a supportive environment than a non-supportive one. This affirms our statement that certain sub-events during these days, such as the Prime Minister's official speech on March 17th and the announcement of a plan for gun law reform on March 20th, could be the reasons for the elevation in tweets indicating a supportive social context. Table 14 shows the most frequent sub-topics (hashtags) posted on March 15th, 17th and 20th. The word cloud in Fig. 28 includes the most frequent words used on these days: (a) and (b) for March 15th, (c) and (d) for March 17th, and (e) and (f) for March 21st. Table 14: The top ten most frequently used hashtags classified under supportive social context and non-supportive social context, posted on March 15th, March 17th and March 21st during the New Zealand terrorist attacks event.
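The per-day supportive versus non-supportive percentages discussed above can be computed from SCE-labelled tweets with a straightforward aggregation. The function and sample data below are a hypothetical sketch of that bookkeeping step, not the paper's implementation:

```python
from collections import defaultdict

def daily_context_shares(labelled_tweets):
    """labelled_tweets: iterable of (date, label) pairs, where label is
    "supportive" or "non-supportive" (the SCE model's output).
    Returns {date: {label: percentage of that day's tweets}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for date, label in labelled_tweets:
        counts[date][label] += 1
    shares = {}
    for date, by_label in counts.items():
        total = sum(by_label.values())
        shares[date] = {lab: round(100 * n / total, 1) for lab, n in by_label.items()}
    return shares

# Toy labelled sample standing in for SCE model output:
sample = [
    ("03-15", "non-supportive"), ("03-15", "non-supportive"), ("03-15", "supportive"),
    ("03-20", "supportive"), ("03-20", "supportive"), ("03-20", "non-supportive"),
]
print(daily_context_shares(sample))
```

Plotting these daily shares over the event timeline gives curves of the shape shown in Fig. 27.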

Location-Based Need Analysis
As a situation unfolds, people analyze event details based on their points of view, which are influenced by different factors such as age, gender, race, culture, religion and interests, among many others. These factors have a significant impact on the way people perceive any particular situation and how it affects them. Accordingly, the posts gathered during the New Zealand incident bring to light the diverse angles from which each individual's perception is generated. This diversity is illustrated through the tweet categorization in Fig. 29. An individual's geographical location and cultural observations may have an impact on needs. Therefore, we analyzed need satisfaction levels based on location (where people were when they posted the tweets sharing their feelings and needs). Figure 30a, b shows the location-based analysis of need satisfaction levels. We selected Canada and the United States for this analysis. Table 15 shows the most frequent sub-topics discussed when population needs were satisfied versus frustrated in the two locations during the first 3 days of the event.
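The location-based breakdown above amounts to grouping need-labelled posts by location and computing the satisfied share per need type. The record format and helper below are a hypothetical sketch of that grouping, under the assumption that each tweet has already been geolocated and classified:

```python
from collections import defaultdict

def satisfaction_by_location(records):
    """records: iterable of (location, need_type, is_satisfied) tuples,
    one per classified tweet. Returns {location: {need_type: % satisfied}}."""
    tally = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [satisfied, total]
    for loc, need, satisfied in records:
        cell = tally[loc][need]
        cell[0] += int(satisfied)
        cell[1] += 1
    return {
        loc: {need: round(100 * s / t, 1) for need, (s, t) in needs.items()}
        for loc, needs in tally.items()
    }

# Toy classified sample for two locations:
sample = [
    ("Canada", "relatedness", True), ("Canada", "relatedness", True),
    ("Canada", "relatedness", False), ("US", "autonomy", False),
]
print(satisfaction_by_location(sample))
```

The same grouping, keyed additionally by day, would yield the per-location time series shown in Fig. 30a, b.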

Conclusion and Future Work
The interpretation of data garnered through the proposed psychological need recognition and analysis can assist authorities by providing heightened situational awareness and improving the management of pre- and post-event conflicts and social reactions. Such methods have been shown to enrich the effectiveness of smart city research and development. In this work, we design, implement and evaluate a theory-based, multi-layered reference model capable of identifying human psychological needs, ascertaining their satisfaction levels, and evaluating people's surrounding environments with regard to various life aspects. The design and development of the reference model layers are informed by motivational psychology research. Various linguistic, psychological and Twitter-based features are explored in conjunction with distinctive machine learning algorithms to develop psychological need models based on the conceptual layered reference model. These include (1) the need content recognition (NCR) model, which recognizes need content; (2) the need type identification (NTI) model, which identifies the need type; (3) the need satisfaction level measurement (NSM) model, which measures an individual's need satisfaction level, i.e., whether the detected need is satisfied or frustrated; (4) the social context evaluation (SCE) model, which evaluates the individual's surrounding environment and whether it is supportive or non-supportive; and (5) the life aspect identification (LAI) model, which identifies the life domain. Furthermore, the frustrated need intensity estimator (FNIE) and satisfied need intensity estimator (SNIE) models are implemented within the third layer to obtain the intensity score of the satisfaction level. The results of our experiments affirm the effectiveness of the proposed framework and the developed psychological need models.
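The layered composition of the five models enumerated above can be sketched as a simple chain of classifiers, where a post only flows to the deeper layers if the first layer detects need content. The callables, keyword stand-ins and output keys below are hypothetical interfaces for illustration; the paper's models are trained on labelled tweets:

```python
def analyse_post(text, ncr, nti, nsm, sce, lai):
    """Chain the layered reference model. Each argument is a callable standing
    in for the corresponding trained model (NCR, NTI, NSM, SCE, LAI)."""
    if not ncr(text):                 # layer 1: does the post express a need?
        return None
    return {
        "need_type": nti(text),       # layer 2: autonomy / competence / relatedness
        "satisfaction": nsm(text),    # layer 3: satisfied vs frustrated
        "social_context": sce(text),  # supportive vs non-supportive environment
        "life_aspect": lai(text),     # life domain of the post
    }

# Toy stand-ins for the trained models:
out = analyse_post(
    "We stand together as one community",
    ncr=lambda t: True,
    nti=lambda t: "relatedness",
    nsm=lambda t: "satisfied",
    sce=lambda t: "supportive",
    lai=lambda t: "community",
)
print(out)
```

The early return in layer 1 reflects the filtering role of the NCR model: posts without need content never reach the type, satisfaction, context or life-aspect layers.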
We implemented a prototype of the proposed framework in a real-life case study setting, analyzing population needs during the New Zealand terror attacks, which occurred on March 15th, 2019. The analysis of these critical events gave a clear indication of the feasibility of applying the proposed framework and its effectiveness in detecting changes in public reaction, in terms of psychological needs, based on event evolution and authority response. The proposed framework could potentially be employed in a variety of other applications, such as marketing and need-based recommendation scenarios. For future work, we aim to enrich the framework with other types of psychological needs. Extending the capability of the framework by integrating multilingual aspects for need analysis in other commonly spoken languages, to cover a wider range of geographical locations, is another possible future expansion.

Compliance with Ethical Standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.