Foundation models meet visualizations: Challenges and opportunities

Yang, Weikai; Liu, Mengchen; Wang, Zheng; Liu, Shixia

doi:10.1007/s41095-023-0393-x

Review Article
Open access
Published: 02 May 2024

Volume 10, pages 399–424 (2024)

Computational Visual Media

Weikai Yang¹, Mengchen Liu², Zheng Wang¹ & Shixia Liu¹

Abstract

Recent studies have indicated that foundation models, such as BERT and GPT, excel at adapting to various downstream tasks. This adaptability has made them a dominant force in building artificial intelligence (AI) systems. Moreover, a new research paradigm has emerged as visualization techniques are incorporated into these models. This study divides these intersections into two research areas: visualization for foundation model (VIS4FM) and foundation model for visualization (FM4VIS). In terms of VIS4FM, we explore the primary role of visualizations in understanding, refining, and evaluating these intricate foundation models. VIS4FM addresses the pressing need for transparency, explainability, fairness, and robustness. Conversely, in terms of FM4VIS, we highlight how foundation models can be used to advance the visualization field itself. The intersection of foundation models with visualizations is promising but also introduces a set of challenges. By highlighting these challenges and promising opportunities, this study aims to provide a starting point for the continued exploration of this research avenue.

Acknowledgements

The authors thank Dr. Xiting Wang, Dr. Changjian Chen, Jun Yuan, Yukai Guo, Jiangning Zhu, and Duan Li for their valuable comments.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. U21A20469 and 61936002), the National Key R&D Program of China (Grant No. 2020YFB2104100), and grants from the Institute Guo Qiang, THUIBCS, and BLBCI.

Author information

Authors and Affiliations

School of Software, Tsinghua University, Beijing, 100084, China
Weikai Yang, Zheng Wang & Shixia Liu
Microsoft, Redmond, 98052, USA
Mengchen Liu

Authors

Weikai Yang
Mengchen Liu
Zheng Wang
Shixia Liu

Contributions

Weikai Yang: Conceptualization, Writing - Original Draft, Writing - Review & Editing. Mengchen Liu: Conceptualization, Writing - Original Draft, Writing - Review & Editing. Zheng Wang: Writing - Original Draft, Writing - Review & Editing. Shixia Liu: Conceptualization, Supervision, Writing - Original Draft, Writing - Review & Editing, Funding acquisition.

Corresponding author

Correspondence to Shixia Liu.

Ethics declarations

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Weikai Yang is a Ph.D. candidate at Tsinghua University. His research interests include visual text analytics and interactive machine learning. He received his B.S. degree from Tsinghua University.

Mengchen Liu is a senior researcher at Microsoft. His research interests include explainable AI and computer vision. He received his B.S. degree in electronics engineering and his Ph.D. degree in computer science from Tsinghua University. He has served as a PC member and reviewer for various conferences and journals.

Zheng Wang is currently working toward a graduate degree at Tsinghua University.

Shixia Liu is a professor at Tsinghua University. Her research interests include visual text analytics, visual social analytics, interactive machine learning, and text mining. She worked as a research staff member at IBM China Research Lab and as a lead researcher at Microsoft Research Asia. She received her B.S. and M.S. degrees from Harbin Institute of Technology and her Ph.D. degree from Tsinghua University. She is a fellow of IEEE and an associate editor-in-chief of IEEE Trans. Vis. Comput. Graph.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.

About this article

Cite this article

Yang, W., Liu, M., Wang, Z. et al. Foundation models meet visualizations: Challenges and opportunities. Comp. Visual Media 10, 399–424 (2024). https://doi.org/10.1007/s41095-023-0393-x

Received: 04 October 2023
Accepted: 15 November 2023
Published: 02 May 2024
Issue Date: June 2024
DOI: https://doi.org/10.1007/s41095-023-0393-x
