PowerShell-based novel framework for Big health data analysis

  • Original Research
International Journal of Information Technology

Abstract

A defining feature of big data is overload. When we are stuck in a lake of big data, it is necessary to filter out the most meaningful information [Khan et al. (ICCWAMTIP 2018:232–236, 2018), (Int J Inf Technol 12(2):409–417, 2020)]. The key element is segmentation, which breaks a dataset down into smaller ones. MIMIC-III (Medical Information Mart for Intensive Care III), an open critical care database, comprises a data flow that covers patient information over the whole hospital stay, i.e., from hospital admission to discharge. Because MIMIC-III stores a large volume of shared data, selecting the data useful for a given academic study with traditional data mining approaches would be time consuming and resource demanding. Herein, we introduce PowerShell, a robust built-in Windows tool, which we use to segment the big data into a practical dataset. The PowerShell script is openly available on the platform to demonstrate its use for further public research, so we present its step-by-step operation here to give readers a general idea of its mechanism.
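
The segmentation idea described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than PowerShell and is not the authors' script, which is not reproduced on this page. It assumes the source table is a CSV file with a header row; the function name `segment_rows`, the column names, and the sample values are hypothetical and chosen only for illustration. Rows are streamed one at a time, so memory use does not grow with the size of the input.

```python
import csv
import io


def segment_rows(source, sink, column, keep):
    """Copy rows from `source` to `sink` when keep(row[column]) is true.

    `source` and `sink` are text file objects. Rows are read one at a
    time, so a very large input never has to fit in memory.
    Returns the number of data rows written.
    """
    reader = csv.DictReader(source)
    writer = csv.DictWriter(sink, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if keep(row[column]):
            writer.writerow(row)
            kept += 1
    return kept


# Toy stand-in for a large table. Column names and values are made up
# for this example and are not taken from the article.
SAMPLE = (
    "subject_id,hadm_id,diagnosis\n"
    "1,100,SEPSIS\n"
    "2,101,PNEUMONIA\n"
    "3,102,SEPSIS\n"
)


def demo():
    """Segment the toy table down to the rows with one diagnosis."""
    out = io.StringIO()
    kept = segment_rows(io.StringIO(SAMPLE), out, "diagnosis",
                        lambda value: value == "SEPSIS")
    return kept, out.getvalue()
```

With real files, the same function can be given handles opened with `open(path, newline="")` for both the large source table and the smaller output file.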

References

  1. Khan A, Li J-P, Khan J, Jasim KM, Alam R, Ahamed VMN (2018) Complex environment fuzzy vision computing. ICCWAMTIP 2018:232–236

  2. Khan A, Li J-P, Khan MY, Alam R (2020) Complex environment perception and positioning based visual information retrieval. Int J Inf Technol 12(2):409–417. https://doi.org/10.1007/s41870-020-00434-8

  3. Azmin M, Jafari A, Rezaei N, Bhalla K, Bose D, Shahraz S, Dehghani M, Niloofar P, Fatholahi S, Hedayati J, Jamshidi H, Farzadfar F (2018) An approach towards reducing road traffic injuries and improving public health through Big Data telematics: a randomised controlled trial protocol. Arch Iran Med 21(11):495–501

  4. Wong ZSY, Zhou J, Zhang Q (2018) Artificial intelligence for infectious disease Big Data analytics. Infect Dis Health. https://doi.org/10.1016/j.idh.2018.10.002

  5. McCue ME, McCoy AM (2017) The scope of Big Data in one medicine: unprecedented opportunities and challenges. Front Vet Sci 4:194. https://doi.org/10.3389/fvets.2017.00194

  6. Tomat L (2018) Research areas in Big Data analytics studies. In: Economy, Finance and Business in Southeastern and Central Europe. Springer International Publishing, Cham, pp 785–795

  7. Khan A, Li JP, Ahmad N, Shuchi Sethi AUH, Patel SH, Rahim S (2020) Predicting emerging trends on social media by modeling it as temporal bipartite networks. IEEE. https://doi.org/10.1109/ACCESS.2020.2976134

  8. Dey D, Slomka PJ, Paul Leeson M, Comaniciu D, Shrestha S, Sengupta PP, Marwick TH (2019) Artificial intelligence in cardiovascular imaging: JACC state-of-the-art review. J Am Coll Cardiol 73(11):1317–1335. https://doi.org/10.1016/j.jacc.2018.12.054

  9. Nafea I (2016) Utilizing Big Data analysis for diseases prevention and control during Hajj. 2nd International Conference on Open and Big Data (OBD), Vienna 2016:52–56. https://doi.org/10.1109/OBD.2016.15

  10. Y M, H D, W H (2007) Segmentation of multivariate mixed data via lossy data coding and compression. IEEE Trans Pattern Anal Mach Intell 2007:1546–1562. https://doi.org/10.1109/TPAMI.2007.1085

  11. E T, JM K (2017) Sequence segmentation with changeptGUI. Methods Mol Biol. https://doi.org/10.1007/978-1-4939-6622-6_12

  12. Khan A, Li JP, Haq Au, Nazir S, Ahmad N, Varish N, Malik A, Patel SH (2020) Partial observer decision process model for crane-robot action. Sci Program 2020:1–14. https://doi.org/10.1155/2020/6349342

  13. Johnson AE, Pollard TJ, Shen L, Lehman LW, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, Mark RG (2016) MIMIC-III, a freely accessible critical care database. Sci Data 3:160035. https://doi.org/10.1038/sdata.2016.35

  14. Lilly CM, Swami S, Liu X, Riker RR, Badawi O (2017) Five-year trends of critical care practice and outcomes. Chest 152(4):723–735. https://doi.org/10.1016/j.chest.2017.06.050

  15. Scurlock C, Becker C (2016) Telemedicine for trauma and emergency: the eICU. Curr Trauma Rep 2(3):132–137. https://doi.org/10.1007/s40719-016-0054-y

  16. García-Gil D, Luengo J, García S, Herrera F (2019) Enabling smart data: noise filtering in Big Data classification. Inf Sci 479:135–152. https://doi.org/10.1016/j.ins.2018.12.002

  17. Lossio-Ventura JA, Alatrista-Salas H (2017) Information management and Big Data. SIMBig: Annual International Symposium on Information Management and Big Data

  18. Zhang Z, Odaibo D, Skidmore FM, Tanik MM (2017) A Big Data analytics approach in medical imaging segmentation using deep convolutional neural networks. In: Big Data and Visual Analytics. Springer International Publishing, Cham, pp 181–189

  19. Martí L, Sanchez-Pi N, Molina JM, Bicharra Garcia AC (2014) YASA: Yet another time series segmentation algorithm for anomaly detection in Big Data problems. In: Hybrid Artificial Intelligence Systems. Springer International Publishing, Cham, pp 697–708

  20. Payette B (2007) Windows PowerShell in Action. Manning Publications

  21. Talaat S (2015) Getting Started with Azure PowerShell. In: Pro PowerShell for Microsoft Azure. Apress, Berkeley, CA, pp 9–17. https://doi.org/10.1007/978-1-4842-0665-2_2

  22. Y M, M P (2020) Using the object-oriented PowerShell for simple proteomics data analysis. Methods Mol Biol. https://doi.org/10.1007/978-1-4939-9744-2_17

  23. Y M, M P (2013) Simple proteomics data analysis in the object-oriented PowerShell. Methods Mol Biol 2013:379–391. https://doi.org/10.1007/978-1-62703-392-3_17

  24. Garrett R (2013) Working with PowerShell. In: Pro SharePoint 2013 Administration. Apress, Berkeley, CA, pp 59–74. https://doi.org/10.1007/978-1-4302-4942-9_3

  25. Deshev H (2008) Extending the type system. In: Pro Windows PowerShell. Apress, Berkeley, CA, pp 237–252. https://doi.org/10.1007/978-1-4302-0546-3_12

  26. Mullin R (2004) Dealing with data overload. Chem Eng News Arch 82(12):19–26. https://doi.org/10.1021/cen-v082n012.p019

  27. Saxena D, Lamest M (2018) Information overload and coping strategies in the big data context: evidence from the hospitality sector. J Inf Sci 44(3):287–297. https://doi.org/10.1177/0165551517693712

  28. Philip Chen CL, Zhang C-Y (2014) Data-intensive applications, challenges, techniques and technologies: a survey on Big Data. Inf Sci 275:314–347. https://doi.org/10.1016/j.ins.2014.01.015

  29. Das S, Datta S, Chaudhuri BB (2018) Handling data irregularities in classification: Foundations, trends, and future challenges. Pattern Recogn 81:674–693. https://doi.org/10.1016/j.patcog.2018.03.008

  30. Simon P (2015) The elements of persuasion: Big Data techniques. In: Too Big to Ignore. https://doi.org/10.1002/9781119204039.ch3

  31. Aggarwal CC (2012) A segment-based framework for modeling and mining data streams. Knowl Inf Syst 30(1):1–29. https://doi.org/10.1007/s10115-010-0366-0

  32. Bab-Hadiashar A, Suter D (2000) Data segmentation and model selection for computer vision. Springer, New York, NY. https://doi.org/10.1007/978-0-387-21528-0

Acknowledgements

We thank Chengdu Medical College for providing a robust and valuable big data platform on which to perform the big data analysis.

Funding

This work was supported by the Chengdu Medical College Foundation (CYZ-18-33, CYZ19-33), the key research and development support plan of the Chengdu Science and Technology Bureau (2019-YF09-00097-SN), the popular scientific research project of the Sichuan Health Commission (20PJ171), the Sichuan undergraduate innovation and startup program (S201913705080, S201913705130, S201913705059, S202013705070, S202013705075, S202013705108), and the Yun Nan Education Program (SYSX202036).

Author information

Contributions

FX and WR drafted the general outline; GM proofread the manuscript and ensured the general language quality; DW and HFZ drafted and tested the PowerShell scripts; FFL drew the figures.

Corresponding authors

Correspondence to Greg Mirt or Fan Xu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Cite this article

Ren, W., Wan, D., Zhu, H. et al. PowerShell-based novel framework for Big health data analysis. Int. j. inf. tecnol. 13, 287–290 (2021). https://doi.org/10.1007/s41870-020-00559-w

  • DOI: https://doi.org/10.1007/s41870-020-00559-w
