
On the Challenges and Opportunities in Visualization for Machine Learning and Knowledge Extraction: A Research Agenda

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10410)

Abstract

We describe a selection of challenges at the intersection of machine learning and data visualization and outline a subjective research agenda based on professional and personal experience. The unprecedented increase in the amount, variety and value of data has significantly transformed the way scientific research is carried out and businesses operate. Data science has emerged as a practice to enable this data-intensive innovation by bringing together and advancing knowledge from fields such as statistics, machine learning, knowledge extraction, data management, and visualization. Within data science, visualization plays a unique and perhaps the ultimate role as an approach to facilitate cooperation between human and computer, and in particular to enable the analysis of diverse and heterogeneous data using complex computational methods whose algorithmic results are challenging to interpret and operationalize. Whilst algorithm development is surely at the center of the whole pipeline in disciplines such as Machine Learning and Knowledge Discovery, it is visualization which ultimately makes the results accessible to the end user. Visualization can thus be seen as a mapping from arbitrarily high-dimensional abstract spaces to lower dimensions, and it plays a central and critical role in interacting with machine learning algorithms, particularly in interactive machine learning (iML), which includes the human-in-the-loop. The central goal of the CD-MAKE VIS workshop is to spark discussions at this intersection of visualization, machine learning and knowledge discovery and to bring together experts from these disciplines. This paper discusses a perspective on the challenges and opportunities in the integration of these disciplines and presents a number of directions and strategies for further research.
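
The view of visualization as a mapping from a high-dimensional space to lower dimensions can be illustrated with a minimal sketch. The example below is not a method described in the paper; it assumes principal component analysis (PCA), chosen here only as one familiar linear projection, and approximates the two leading axes by power iteration using the Python standard library. All function names (`center`, `covariance`, `top_eigenvector`, `project_to_2d`) are hypothetical and introduced purely for illustration.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean so the data is centred at the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centred):
    """Sample covariance matrix (d x d) of already-centred rows."""
    n, d = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(cov, iters=500, seed=0):
    """Approximate the dominant eigenvector and eigenvalue by power iteration."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate matrix: keep the current direction
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue


def project_to_2d(rows):
    """Map d-dimensional rows to 2-D coordinates along two principal axes."""
    centred = center(rows)
    cov = covariance(centred)
    d = len(cov)
    axes = []
    for k in range(2):
        v, lam = top_eigenvector(cov, seed=k)
        axes.append(v)
        # Deflation: remove the found component so the next pass finds
        # the next-largest direction of variance.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(r[j] * a[j] for j in range(d)) for a in axes]
            for r in centred]
```

Calling `project_to_2d` on a list of equal-length numeric rows returns one `[x, y]` pair per row, which could then be drawn as a scatterplot. A linear projection of this kind is only one of many possible mappings; nonlinear techniques would typically require a dedicated library.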

Keywords

Visualization, Machine learning, Knowledge extraction

Copyright information

© IFIP International Federation for Information Processing 2017

Authors and Affiliations

  1. GiCentre, Department of Computer Science, City, University of London, London, UK
  2. Department of Computer Science, University of Swansea, Swansea, UK
  3. Holzinger Group, HCI-KDD, Institute for Medical Informatics/Statistics, Medical University Graz, Graz, Austria
