Abstract
An obvious feature of big data is overload. When we are stuck in a lake of big data, it is necessary to filter out the most meaningful information [Khan et al. (ICCWAMTIP 2018:232–236, 2018), (Int J Inf Technol 12(2):409–417, 2020)]. The key step is segmentation, which breaks a dataset down into smaller ones. MIMIC-III (Medical Information Mart for Intensive Care III), an open critical care database, contains patient information covering the whole hospital stay, i.e., from hospital admission to discharge. Because MIMIC-III stores a large volume of shared data, selecting useful data with traditional data mining approaches to suit a given academic study would be time-consuming and resource-demanding. Herein, we introduce PowerShell, a robust built-in Windows tool, and use it to segment the big data into a practical dataset. Since the PowerShell script is openly available on the platform to demonstrate its use for further public research, we present the step-by-step operation here to give readers a general idea of its mechanism.
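As a rough illustration of the kind of segmentation described above, the following is a minimal sketch and not the authors' published script. It assumes the source data is a CSV file and that a subset is wanted by matching one column against a list of values. The function name Split-CsvByValue, the file names, and the sample rows are hypothetical; the sample is synthetic and is not real MIMIC-III data.

```powershell
# Hypothetical helper (not from the article): copy only the rows whose
# $Column value appears in $Values into a smaller CSV file.
function Split-CsvByValue {
    param(
        [Parameter(Mandatory)] [string]   $Path,        # large source CSV
        [Parameter(Mandatory)] [string]   $Column,      # column to filter on
        [Parameter(Mandatory)] [string[]] $Values,      # values to keep
        [Parameter(Mandatory)] [string]   $Destination  # smaller output CSV
    )
    # Import-Csv emits one row object at a time, so the pipeline streams
    # the file rather than holding every row in memory at once.
    Import-Csv -Path $Path |
        Where-Object { $Values -contains $_.$Column } |
        Export-Csv -Path $Destination -NoTypeInformation
}

# Tiny synthetic stand-in for a large table (assumed column names).
$source = Join-Path ([System.IO.Path]::GetTempPath()) 'events_demo.csv'
$subset = Join-Path ([System.IO.Path]::GetTempPath()) 'events_subset.csv'
@(
    'SUBJECT_ID,ITEMID,VALUE'
    '1,211,80'
    '1,618,18'
    '2,211,95'
    '3,646,97'
) | Set-Content -Path $source

# Keep only the rows for one item of interest.
Split-CsvByValue -Path $source -Column 'ITEMID' -Values '211' -Destination $subset
$rows = @(Import-Csv -Path $subset)
```

Streaming through the pipeline keeps memory use low, which matters when the source table is far larger than this sample; the trade-off is that very large files may still take a long time to scan row by row.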
References
Khan A, Li J-P, Khan J, Jasim KM, Alam R, Ahamed VMN (2018) Complex environment fuzzy vision computing. ICCWAMTIP
Khan A, Li J-P, Khan MY, Alam R (2020) Complex environment perception and positioning based visual information retrieval. Int J Inf Technol 12(2):409–417. https://doi.org/10.1007/s41870-020-00434-8
Azmin M, Jafari A, Rezaei N, Bhalla K, Bose D, Shahraz S, Dehghani M, Niloofar P, Fatholahi S, Hedayati J, Jamshidi H, Farzadfar F (2018) An approach towards reducing road traffic injuries and improving public health through Big Data telematics: a randomised controlled trial protocol. Arch Iran Med 21(11):495–501
Wong ZSY, Zhou J, Zhang Q (2018) Artificial intelligence for infectious disease Big Data analytics. Infect Dis Health. https://doi.org/10.1016/j.idh.2018.10.002
McCue ME, McCoy AM (2017) The scope of Big Data in one medicine: unprecedented opportunities and challenges. Front Vet Sci 4:194. https://doi.org/10.3389/fvets.2017.00194
Tomat L (2018) Research areas in Big Data analytics studies. In: Economy, Finance and Business in Southeastern and Central Europe. Springer International Publishing, Cham, pp 785–795
Khan A, Li JP, Ahmad N, Shuchi Sethi AUH, Patel SH, Rahim S (2020) Predicting emerging trends on social media by modeling it as temporal bipartite networks. IEEE. https://doi.org/10.1109/ACCESS.2020.2976134
Dey D, Slomka PJ, Paul Leeson M, Comaniciu D, Shrestha S, Sengupta PP, Marwick TH (2019) Artificial intelligence in cardiovascular imaging: JACC state-of-the-art review. J Am Coll Cardiol 73(11):1317–1335. https://doi.org/10.1016/j.jacc.2018.12.054
Nafea I (2016) Utilizing Big Data analysis for diseases prevention and control during Hajj. 2nd International Conference on Open and Big Data (OBD), Vienna 2016:52–56. https://doi.org/10.1109/OBD.2016.15
Y M, H D, W H (2007) Segmentation of multivariate mixed data via lossy data coding and compression. IEEE Trans Pattern Anal Mach Intell 2007:1546–1562. https://doi.org/10.1109/TPAMI.2007.1085
E T, JM K (2017) Sequence segmentation with changeptGUI. Methods Mol Biol. https://doi.org/10.1007/978-1-4939-6622-6_12
Khan A, Li JP, Haq Au, Nazir S, Ahmad N, Varish N, Malik A, Patel SH (2020) Partial observer decision process model for crane-robot action. Sci Program 2020:1–14. https://doi.org/10.1155/2020/6349342
Johnson AE, Pollard TJ, Shen L, Lehman LW, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, Mark RG (2016) MIMIC-III, a freely accessible critical care database. Sci Data 3:160035. https://doi.org/10.1038/sdata.2016.35
Lilly CM, Swami S, Liu X, Riker RR, Badawi O (2017) Five-year trends of critical care practice and outcomes. Chest 152(4):723–735. https://doi.org/10.1016/j.chest.2017.06.050
Scurlock C, Becker C (2016) Telemedicine for trauma and emergency: the eICU. Curr Trauma Rep 2(3):132–137. https://doi.org/10.1007/s40719-016-0054-y
García-Gil D, Luengo J, García S, Herrera F (2019) Enabling smart data: noise filtering in Big Data classification. Inf Sci 479:135–152. https://doi.org/10.1016/j.ins.2018.12.002
Lossio-Ventura JA, Alatrista-Salas H (2017) Information management and Big Data. SIMBig: Annual International Symposium on Information Management and Big Data
Zhang Z, Odaibo D, Skidmore FM, Tanik MM (2017) A Big Data analytics approach in medical imaging segmentation using deep convolutional neural networks. In: Big Data and Visual Analytics. Springer International Publishing, Cham, pp 181–189
Martí L, Sanchez-Pi N, Molina JM, Bicharra Garcia AC (2014) YASA: Yet another time series segmentation algorithm for anomaly detection in Big Data problems. In: Hybrid Artificial Intelligence Systems. Springer International Publishing, Cham, pp 697–708
Payette B (2007) Windows PowerShell in Action. Manning Publications
Talaat S (2015) Getting Started with Azure PowerShell. In: Pro PowerShell for Microsoft Azure. Apress, Berkeley, CA, pp 9–17. https://doi.org/10.1007/978-1-4842-0665-2_2
Y M, M P (2020) Using the object-oriented PowerShell for simple proteomics data analysis. Methods Mol Biol. https://doi.org/10.1007/978-1-4939-9744-2_17
Y M, M P (2013) Simple proteomics data analysis in the object-oriented PowerShell. Methods Mol Biol 2013:379–391. https://doi.org/10.1007/978-1-62703-392-3_17
Garrett R (2013) Working with PowerShell. In: Pro SharePoint 2013 Administration. Apress, Berkeley, CA, pp 59–74. https://doi.org/10.1007/978-1-4302-4942-9_3
Deshev H (2008) Extending the type system. In: Pro Windows PowerShell. Apress, Berkeley, CA, pp 237–252. https://doi.org/10.1007/978-1-4302-0546-3_12
Mullin R (2004) Dealing with data overload. Chem Eng News Arch 82(12):19–26. https://doi.org/10.1021/cen-v082n012.p019
Saxena D, Lamest M (2018) Information overload and coping strategies in the big data context: evidence from the hospitality sector. J Inf Sci 44(3):287–297. https://doi.org/10.1177/0165551517693712
Philip Chen CL, Zhang C-Y (2014) Data-intensive applications, challenges, techniques and technologies: a survey on Big Data. Inf Sci 275:314–347. https://doi.org/10.1016/j.ins.2014.01.015
Das S, Datta S, Chaudhuri BB (2018) Handling data irregularities in classification: Foundations, trends, and future challenges. Pattern Recogn 81:674–693. https://doi.org/10.1016/j.patcog.2018.03.008
Simon P (2015) The elements of persuasion: Big Data techniques. In: Too Big to Ignore. https://doi.org/10.1002/9781119204039.ch3
Aggarwal CC (2012) A segment-based framework for modeling and mining data streams. Knowl Inf Syst 30(1):1–29. https://doi.org/10.1007/s10115-010-0366-0
Bab-Hadiashar A, Suter D (2000) Data segmentation and model selection for computer vision. Springer, New York, NY. https://doi.org/10.1007/978-0-387-21528-0
Acknowledgements
We thank the Chengdu Medical College for providing a robust and valuable big data platform to perform the big data analysis.
Funding
This work was supported by the Chengdu Medical College Foundation (CYZ-18-33, CYZ19-33), the Chengdu Science and Technology Bureau key research and development support plan (2019-YF09-00097-SN), the popular scientific research project of the Sichuan Health Commission (20PJ171), the Sichuan undergraduate innovation and startup program (S201913705080, S201913705130, S201913705059, S202013705070, S202013705075, S202013705108), and the Yun Nan Education Program (SYSX202036).
Author information
Contributions
FX and WR drafted the general outline; GM proofread the manuscript and ensured its general language quality; DW and HFZ drafted and tested the PowerShell scripts; FFL drew the figures.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Ren, W., Wan, D., Zhu, H. et al. PowerShell-based novel framework for Big health data analysis. Int. j. inf. tecnol. 13, 287–290 (2021). https://doi.org/10.1007/s41870-020-00559-w
DOI: https://doi.org/10.1007/s41870-020-00559-w