Analysis of Text Mining Tools in Disease Prediction

Kumari, Shabnam; Vani, V.; Malik, Shaveta; Tyagi, Amit Kumar; Reddy, Sravanti

doi:10.1007/978-3-030-73050-5_55

Amit Kumar Tyagi ORCID: orcid.org/0000-0003-2657-8700

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1375)

Included in the following conference series:

International Conference on Hybrid Intelligent Systems

Abstract

The rapid creation of digital data by Internet of Things and other smart devices calls for new mining strategies and techniques that can handle and analyse data at this scale. More than 90% of today's data is generated in unstructured or semi-structured form, and most of it was produced in the past decade alone. Discovering appropriate patterns and trends in the text documents within such large volumes of data is a major challenge. Text mining is the process of extracting interesting, non-trivial patterns from large collections of text documents. Various techniques and tools exist to mine text (and other data formats) and to discover valuable information for prediction and decision making. Two terms are central to extracting relevant information from a data set: predictive modelling and text mining. Predictive models are often used to detect crimes and identify suspects after a crime has taken place, or to estimate how likely an email is to be spam. Text mining, similarly, is used in applications such as digital libraries, academic research, life sciences, social media and business intelligence. Several text mining techniques are available for analysing text patterns, including: document classification (text classification, document standardization), information retrieval (keyword search/querying and indexing), document clustering (phrase clustering), natural language processing (spelling correction, lemmatization, grammatical parsing, and word sense disambiguation), information extraction (relationship extraction/link analysis), and web mining (web link analysis).
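To make the information retrieval item (keyword search/querying and indexing) concrete, the following is a minimal sketch of an inverted index with conjunctive keyword search. It is not taken from the chapter: the function names, the tokenization rule and the three clinical-style notes are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical toy corpus (assumed for illustration, not the chapter's data).
documents = {
    "d1": "Patient reports fever and persistent cough.",
    "d2": "Fever with headache, no cough observed.",
    "d3": "Chronic fatigue and headache.",
}
index = build_inverted_index(documents)
hits = search(index, "fever cough")
```

In this toy corpus the query "fever cough" matches both d1 and d2, even though d2 negates the cough. Plain keyword matching cannot see that, which is one reason the natural language processing steps listed above are usually layered on top of indexing.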

This article discusses and analyses text mining techniques and their applications in diverse fields. It covers several use cases and efficient algorithms, such as the Apriori algorithm and association rule mining, which are used for frequent itemset extraction (information retrieval and information extraction) and rule generation. As an example, it also discusses several rules generated from a collected data set to predict a disease. Finally, it describes classification, clustering, regression, association rule mining and outlier detection in detail as a workflow for analysing data to reach a decision or make a prediction, and closes with useful research gaps, challenges and issues.
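As a rough illustration of frequent itemset extraction and rule generation with Apriori, the sketch below mines a handful of symptom/diagnosis records. It is a textbook-style implementation written for this summary, not the authors' code: the records, the function names and the thresholds (minimum support 0.4, minimum confidence 0.8) are all assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates of size k that are frequent.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                # Subsets of a frequent itemset are frequent, so the lookup is safe.
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, supp, confidence))
    return rules


# Hypothetical records (assumed for illustration, not the chapter's data set).
records = [
    {"fever", "cough", "flu"},
    {"fever", "cough", "flu"},
    {"fever", "headache"},
    {"cough", "flu"},
    {"headache"},
]
frequent = apriori(records, min_support=0.4)
rules = generate_rules(frequent, min_confidence=0.8)
```

With these assumed thresholds, one of the surviving rules is {fever, cough} -> {flu}: both records that contain fever and cough also contain flu, so its confidence is 1.0. On real clinical text the items would first have to be extracted from documents, and the thresholds would need tuning.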


Acknowledgements

This research is funded by the Anumit Academy’s Research and Innovation Network (AARIN), India. The authors would like to thank AARIN India, an education foundation body and a research network for supporting the project through its financial assistance.

Author information

Authors and Affiliations

SRMIST University, Chennai, 603203, Tamilnadu, India
Shabnam Kumari
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, 600127, Tamilnadu, India
V. Vani & Amit Kumar Tyagi
Terna Engineering College, Mumbai, Maharashtra, India
Shaveta Malik
Center for Advanced Data Science, Vellore Institute of Technology, Chennai, 600127, Tamilnadu, India
Amit Kumar Tyagi
VJIET, Hyderabad, Telangana, India
Sravanti Reddy

Contributions

All authors contributed equally to this work. Amit Kumar Tyagi analysed and approved the manuscript.

Editor information

Editors and Affiliations

Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs (MIR Labs), Auburn, WA, USA
Ajith Abraham
Institute for Information Systems, University of Applied Sciences and Arts Northwestern Switzerland, Olten, Solothurn, Switzerland
Thomas Hanne
Division of Graduate Studies, Tijuana Institute of Technology, Tijuana, Mexico
Oscar Castillo
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs (MIR Labs), Auburn, WA, USA
Niketa Gandhi
Universidade Federal da Bahia, Salvador, Bahia, Brazil
Tatiane Nogueira Rios
Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong

Ethics declarations

The authors declare that they do not have any conflict of interest with respect to publication of this research work.

About this paper

Cite this paper

Kumari, S., Vani, V., Malik, S., Tyagi, A.K., Reddy, S. (2021). Analysis of Text Mining Tools in Disease Prediction. In: Abraham, A., Hanne, T., Castillo, O., Gandhi, N., Nogueira Rios, T., Hong, TP. (eds) Hybrid Intelligent Systems. HIS 2020. Advances in Intelligent Systems and Computing, vol 1375. Springer, Cham. https://doi.org/10.1007/978-3-030-73050-5_55

DOI: https://doi.org/10.1007/978-3-030-73050-5_55
Published: 17 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73049-9
Online ISBN: 978-3-030-73050-5
eBook Packages: Intelligent Technologies and Robotics (R0)
