Intelligent Visualization System for Big Multi-source Medical Data Based on Data Lake

  • Conference paper
Web Information Systems and Applications (WISA 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12999)

Abstract

With the rapid development of information technology, large amounts of multi-source data are constantly being generated in the medical field. Automatic visualization systems built on such data have gained considerable attention, since intuitive data presentation can help even non-professional users extract the information hidden in separate data obtained from different scenarios and make better decisions. In this paper, based on the Data Lake architecture, we improve the performance of an existing data visualization recommendation system and resolve three challenges in processing multi-source, heterogeneous data. First, we build a framework based on Data Lake to store multi-source, heterogeneous data. Second, we optimize the data manipulation module of the visualization system using the distributed processing power of Data Lake, so that potentially interesting visualization candidates are obtained in a short time. Third, we use the computing capability of Data Lake to run exploratory queries efficiently on large datasets and meet users' actual needs. According to the experimental results, our system demonstrates a remarkable acceleration effect on the task of automatic visualization of big multi-source medical data.
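The preview does not include implementation details, so the following is only a minimal sketch of the general idea behind the second step: normalize records from differently formatted sources into one table, then evaluate many candidate aggregations in parallel. It is not the authors' implementation. A local thread pool stands in for the Data Lake's distributed processing, and the sample data, column names (`dept`, `age`, `cost`), and function names are all hypothetical.

```python
import csv
import io
import json
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical sample data from two differently formatted sources.
CSV_SOURCE = "dept,age,cost\ncardiology,61,120.0\noncology,54,300.0\n"
JSON_SOURCE = '[{"dept": "cardiology", "age": 47, "cost": 80.0}]'


def load_sources(csv_text, json_text):
    """Normalize both sources into one list of dicts with typed values."""
    rows = list(csv.DictReader(io.StringIO(csv_text))) + json.loads(json_text)
    return [
        {"dept": r["dept"], "age": int(r["age"]), "cost": float(r["cost"])}
        for r in rows
    ]


AGGREGATES = {
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
    "count": len,
}


def evaluate(rows, group_col, measure_col, agg_name):
    """Compute one candidate: aggregate measure_col per group_col value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_col], []).append(row[measure_col])
    agg = AGGREGATES[agg_name]
    return {key: agg(values) for key, values in groups.items()}


def enumerate_candidates(rows, group_cols, measure_cols):
    """Evaluate every (group, measure, aggregate) combination in parallel.

    The thread pool is an assumption made for illustration; it only
    mimics the fan-out that a distributed engine would provide.
    """
    specs = list(product(group_cols, measure_cols, AGGREGATES))
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda spec: evaluate(rows, *spec), specs)
        return dict(zip(specs, results))


rows = load_sources(CSV_SOURCE, JSON_SOURCE)
candidates = enumerate_candidates(rows, ["dept"], ["age", "cost"])
```

Each entry in `candidates` maps a `(group column, measure column, aggregate)` triple to the grouped values that a bar or pie chart could plot. A recommendation system would then rank these candidates, a step this sketch leaves out.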

Acknowledgements

This work was supported by the National Key R&D Program of China (2019YFC0119600).

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Ren, P. et al. (2021). Intelligent Visualization System for Big Multi-source Medical Data Based on Data Lake. In: Xing, C., Fu, X., Zhang, Y., Zhang, G., Borjigin, C. (eds) Web Information Systems and Applications. WISA 2021. Lecture Notes in Computer Science, vol 12999. Springer, Cham. https://doi.org/10.1007/978-3-030-87571-8_61

  • DOI: https://doi.org/10.1007/978-3-030-87571-8_61

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87570-1

  • Online ISBN: 978-3-030-87571-8

  • eBook Packages: Computer Science, Computer Science (R0)
