A novel technique for identification and classification of HIV/AIDS related social media data using LD-KMEANS and DBN-LSTM

Mageshwari, V.; Aroquiaraj, I. Laurence

doi:10.1007/s11042-024-19283-9

Published: 10 May 2024

Multimedia Tools and Applications

V. Mageshwari¹ &
I. Laurence Aroquiaraj²

Abstract

Online social networks offer an effective platform for understanding mass behaviour, which aids the development of surveillance techniques for Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome (HIV/AIDS). With the rapid growth of social sites such as Facebook, Twitter, and blogs, social networking has become one of the most promising avenues for HIV/AIDS investigation. Recent works have implemented various frameworks to classify HIV/AIDS-related information using Social Media (SM) data. However, traditional techniques do not generalize well enough to handle the complex structure of SM data, and existing models are less effective because they lack annotation processes and pre-processing strategies. This paper proposes a novel strategy to identify and classify HIV/AIDS-related SM data using the Levenshtein Distance-KMeans (LD-KMeans) algorithm and a Deep Belief Network-Long Short-Term Memory (DBN-LSTM) model. The work focuses on discussions of HIV- and AIDS-related issues taking place on Twitter. The proposed tweet classification proceeds as follows. First, tweets are extracted using the Twitter API and pre-processed. Annotations are then extracted, and the tweets are separated into organization tweets and person tweets based on the annotation; the proposed work focuses mainly on organization tweets. Next, text normalization is performed, which yields cleaned, structured tweets. Hashtags related to HIV and AIDS are then identified and grouped using the LD-KMeans algorithm. Word embedding is performed with M-Word2Vec, after which the most important features are selected by the LS-DFO algorithm. Finally, based on the selected features, classification is performed, assigning HIV/AIDS-related tweets to categories such as symptoms, awareness, medicine, and reason. The outcomes of the proposed methodology are compared with those of prevailing algorithms. The analysis shows that the proposed methodology achieves accuracy, sensitivity, and specificity of 94.65%, 94.56%, and 94.25%, respectively, with a tweet identification time of 72154 ms. The experimental outcomes demonstrate that the proposed model outperforms the prevailing systems in sensitivity, specificity, and accuracy when classifying HIV/AIDS-related tweets.
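The abstract names LD-KMeans for hashtag grouping but does not spell out its update rule. The sketch below is one plausible reading and not the authors' implementation: it alternates K-Means-style assignment and update steps, measures dissimilarity with the Levenshtein (edit) distance, and, because strings have no arithmetic mean, assumes each cluster centre is the medoid (the member with the smallest total distance to the others). The names `levenshtein`, `group_hashtags`, `seeds`, and `max_iter` are hypothetical, and the hashtags are assumed to be normalized already.

```python
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def group_hashtags(tags: List[str], seeds: List[str],
                   max_iter: int = 10) -> Dict[str, List[str]]:
    """Group hashtags around centres using Levenshtein distance.

    Assumption (not from the paper): centres are medoids, i.e. real
    hashtags, and `seeds` supplies the distinct initial centres.
    """
    centres = list(seeds)
    clusters: Dict[str, List[str]] = {}
    for _ in range(max_iter):
        # Assignment step: each tag joins its nearest centre.
        clusters = {c: [] for c in centres}
        for tag in tags:
            nearest = min(centres, key=lambda c: levenshtein(tag, c))
            clusters[nearest].append(tag)
        # Update step: the new centre is the medoid of each cluster.
        new_centres = []
        for centre, members in clusters.items():
            if not members:
                new_centres.append(centre)  # keep an empty cluster's centre
                continue
            new_centres.append(min(
                members,
                key=lambda m: sum(levenshtein(m, o) for o in members)))
        if new_centres == centres:
            break  # converged: centres did not move
        centres = new_centres
    return clusters


if __name__ == "__main__":
    demo_tags = ["#hiv", "#hivaids", "#hivtest", "#aids", "#aidsday", "#prep"]
    print(group_hashtags(demo_tags, seeds=["#hiv", "#aids", "#prep"]))
```

Using medoids keeps every centre a valid hashtag, at the cost of a quadratic number of distance computations per cluster in the update step; for large tweet collections a cached or sampled distance would likely be needed.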

Data availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Abbreviations

HIV: Human Immunodeficiency Virus
AIDS: Acquired Immunodeficiency Syndrome
ML: Machine Learning
DL: Deep Learning
LS: Linear Scaling
KMeans: K-Means
DBN: Deep Belief Network
LSTM: Long Short-Term Memory
DFO: Dispersive Flies Optimization
DBN-LSTM: Deep Belief Network-Long Short-Term Memory
LD-KMeans: Levenshtein Distance-KMeans
LS-DFO: Linear Scaling based Dispersive Flies Optimization algorithm
M-Word2Vec: Modified-Word2Vec
Twitter API: Twitter Application Programming Interface
SA: Sentiment Analysis
LR: Logistic Regression
CNN: Convolutional Neural Network
SVM: Support Vector Machine
TF-IDF: Term Frequency-Inverse Document Frequency
FS: Feature Selection
CM: Confusion Matrix
DP: Data Points

Acknowledgements

We thank the anonymous referees for their useful suggestions.

Funding

This work received no funding.

Author information

Authors and Affiliations

Department of Mathematics, Amrita Vishwa Vidyapeetham, Coimbatore, India
V. Mageshwari
Department of Computer Science, Periyar University, Salem, India
I. Laurence Aroquiaraj

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by V. Mageshwari and I. Laurence Aroquiaraj. The first draft of the manuscript was written by V. Mageshwari, and all authors commented on previous versions of the manuscript.

All authors read and approved the final manuscript.

Corresponding author

Correspondence to V. Mageshwari.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent of publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mageshwari, V., Aroquiaraj, I.L. A novel technique for identification and classification of HIV/AIDS related social media data using LD-KMEANS and DBN-LSTM. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19283-9

Received: 11 September 2023
Revised: 23 March 2024
Accepted: 22 April 2024
Published: 10 May 2024
DOI: https://doi.org/10.1007/s11042-024-19283-9
