1 Introduction

Recently, Machine Learning and one of its subfields, Neural Networks (NNs), have become increasingly popular due to their widespread use and success in various applications. NNs can assist humans in improving many industrial and professional processes, as well as enhancing daily life (Sarker, 2021; Xiang et al., 2018; Tulbure et al., 2022; Kumar et al., 2022; Xin et al., 2021). Thus, in scientific research, including medical and biology-based use (Sarvamangala & Kulkarni, 2022; Tang et al., 2019), food science (Ma et al., 2022; Gorbachev et al., 2022), and research related to the spread of viruses (Wieczorek et al., 2020; Gao & Caines, 2022), the need to use neural networks is growing at an ever-increasing rate. This success of NN applications has generated a huge amount of interest from practitioners and students, inspiring many to learn about this technology.

At first sight, constructing an NN for solving a problem is easy: we need to specify a structure and a loss function, and then optimize the weights using gradient descent (Legaard et al., 2021; Behler, 2015). The network feeds forward using just matrix multiplication and pointwise activations, and then backpropagates using the multivariate chain rule to update the weights accordingly. When applying the theory to difficult problems, however, it turns out that understanding how deep learning works is very different from actually applying it successfully. There are plenty of pitfalls to watch out for before getting a neural network that does exactly what we want. For example, it is essential to optimize its structure to prevent over- or underfitting, to get the network to converge (to a high-quality local minimum), to make sure we have the right loss function, to do data augmentation correctly, etc. Moreover, in a complex NN, there are several network layers, each with a different structure and underlying mathematical operations (López et al., 2022; Taye, 2023). Thus, students need to develop a mental model of not only how each layer operates, but also how to choose different layers that can work together to transform data. Therefore, learning about neural networks presents a significant challenge due to the complex relationship between basic mathematical operations and their overall integration within the network. In addition, the majority of neural networks run in virtual space, i.e. they are not visible to the human eye. For successful implementation, it would often be necessary to make the "part of the virtual world where the neural networks run" available to the students in real time and space; our tool provides such a user interface, thus making the networks literally "touchable".

Computer scientists who construct neural networks to solve difficult problems, and who need to control the working of their constructed NNs, have an evident need to visualize them (Shahroudnejad, 2021; Li et al., 2020). At the same time, there is nowadays a strong interest in integrating VR in everyday working life (Babić & Meštrović, 2019; Bock & Schreiber, 2019).

1.1 Objectives

One of our objectives in this project was to create a framework for creating and training neural networks for solving different real-life problems and for research and education. In this paper, we focus on the investigation of the usability of our framework in education.

We sought answers to the following questions: (a) What are the major difficulties in learning NNs? (b) What are the key requirements in a Visual Learning Tool, including the most desired features of a visualization tool for explaining NNs? (c) How usable is the created system?

Our system can work as a distributed system since it can use the available computers as resources. This is made possible by the distributed operation of the Erlang programming language, which allows us to run on several machines, or even on several threads within one machine or process, thus enabling us to create nodes that communicate without using shared memory. (Naturally, communication among the elements of the neural network should be regarded here as network communication, and a node is whatever runs the Erlang process.)

The framework also gives the individual nodes we start the opportunity to perform a task specified with the help of a Lambda function and to share their results with other members of the network. The neurons are capable of writing data to a distributed database and reading data from it.

Accordingly, we have developed a model and, based on the model, a prototype device that is suitable for the above tasks, i.e. it can show the structure of neural networks and make them controllable even for users who do not have sufficient knowledge of computer science and the disciplines that use it.

1.2 Research contribution

In this work, we contribute:

  • Our system (RKNet), which contains an interactive visualization tool designed for both experts and non-experts to learn and analyse the structure of NNs and to monitor their working.

  • A novel NN generator implemented in VR and AR environments.

  • RKNet, which offers high error tolerance for the development of distributed systems, messaging without shared memory between the nodes of a distributed system, and many options that make the creation of neural networks simple.

Although the framework was not created for educational purposes, it is a behaviour-based system in which we can, in practice, create empty networks and then add functionality to them (with Lambda expressions that can be implemented as higher-order, lazily evaluated functions).

Although the creation of the network is a programmer's task, its use, visual control and monitoring require only user skills.

2 Method

2.1 Literature review

An increasing amount of research utilizes interactive visualizations to elucidate the operational mechanisms of neural networks. Harley’s visualisation tool focuses on demonstrating the high-level model structure and connections between layers of Convolutional Neural Networks (CNNs), while Karpathy’s extended ConvNetJS tool allows us to formulate and solve neural networks in JavaScript (Karpathy 2016). TensorFlow Playground (Smilkov et al., 2017) and GAN Lab (Kahng et al., 2019) can help students to learn about dense neural networks and generative adversarial networks (GANs), respectively. Wang et al. (2021) presented the CNN Explainer, an interactive visualization tool designed for non-experts to learn and examine CNNs, a foundational deep learning model architecture. Apart from the aforementioned tools, there are several others designed for non-experts (Olah 2014; Smilkov et al., 2017; Norton & Qi 2017), but most tools are developed to help NN experts analyse their models and predictions (Bilal et al. 2018; Garcia et al. 2018; Harley, 2015; Hohman et al., 2019; Kahng et al., 2018; Liu et al., 2017a, b). Mohamed et al. (2022) presented a review of visualisation-as-explanation techniques for convolutional neural networks and their evaluation. Zhang et al. (2021) evaluated the visualization performance of CNN models using a driver model. 3D visualizations of deep learning algorithms were developed for both experts and non-experts, with an interactive user interface that allows exploration at different levels of detail (Bock & Schreiber, 2019; Jin et al., 2020; Schreiber & Bock, 2019).

Meissler et al. (2019) examined how CNNs can be visualized in Virtual Reality and developed a software prototype based on the Unity platform and SteamVR. Their deep learning networks are defined using Keras (Chollet et al., 2015), which is used as a high-level layer on top of TensorFlow (Abadi et al., 2015).

Queck et al. (2022) designed and implemented an ANN visualization in virtual reality (VR) specifically targeted at machine learning users. Their approach was especially well-suited for teaching and for individuals who are new to machine learning or lack expertise but wish to gain an understanding of the overall workings of neural networks. The common feature of the aforementioned tools is that they are visual systems, created for educational purposes and do not have their own framework behind them.

Although different frameworks have been developed to enhance experts’ work constructing NNs to solve difficult problems in real life, such as TensorFlow, Keras, PyTorch, Theano, Deeplearning4j or Microsoft CNTK, they do not contain visual tools that can help users in analysing the constructed NNs by checking the state of a neuron or a layer. Moreover, none of them uses virtual reality (VR) or augmented reality (AR). AR permits the superimposition of digital content on top of the real-world environment through the use of a smartphone, tablet or VR headset (Carmigniani et al. 2011). AR presents an opportunity to deliver a ‘virtual’ object-based learning activity. The dominant feature of VR is the ability to promote a higher level of immersion than other media. Immersive VR puts the user in an environment that resembles the real world and feels to some extent like it, with the person having a sense of self-localization (Psotka 1995).

VR with head-mounted displays has proven itself as a learning medium in the engineering field (Abulrub et al. 2011). Indeed, there is evidence that a virtual learning environment can achieve better learning outcomes than traditional teaching (Alhalabi, 2016).

Our framework can be widely used to construct NNs to solve problems in real life. It can also be parameterized easily, and the created NNs can be visually presented in an AR and VR environment. Moreover, the interaction can also take place in this environment. Using AR and VR tools, the created NNs can be shown in a virtual environment, and the control, programming and management of the networks become available there. By using the framework for educational purposes, students with little or no programming knowledge can construct different NNs to solve problems and can also examine the working of NNs.

2.2 The structure of the system

Since one of the goals of our framework was to simplify the process of creating, programming and testing NNs, as well as to make it easy to create NNs that are suitable for solving real-life problems, we first created the subsystem in the Erlang language. In this subsystem, the configuration settings can be used to define the number of neurons in the NN and their connections, before starting them using a command issued on a graphical interface. The neurons in the network can be parameterized with Lambda expressions. (Neural Network Framework.)

Figure 1 shows the structure of our framework with a very simple NN (Neural Network 1). The central item of an NN created by our framework is the node, called SNeuron (S1, S2,…, S5 in Fig. 1). One SNeuron contains two neurons (a neuron pair). One of them is responsible for communication and the other for performing the currently imposed task. SNeurons can perform tasks specified by Lambda functions, and share their results with other neuron pairs in the network. A single SNeuron can communicate with other SNeurons if they share the same password stored in a common cookie file on the host computer whilst being connected to a network. This mode of operation is implemented in the Erlang language and each SNeuron is an Erlang process, allowing our system to function as a distributed system.

Fig. 1

Block diagram of the framework. Source: created by the authors

SNeurons are able to cooperate and directly exchange data with other SNeurons defined in their configuration. Furthermore, they are also able to place data in a distributed database or read data from it. Based on a pre-set frequency or in an impromptu manner, the SNeurons in the network can learn the data collected by the other SNeurons by reading them from the database. That is, they use each other's data and learn based on the experiences of the others.

The system monitors the neural network and presents it graphically as an edge-tagged directed graph through database snapshots in pseudo real time. The graphic display module and its extensions make our framework suitable for use in the virtual environment. The part of the module responsible for communication was implemented on a web server using a Laravel-based REST (Representational State Transfer) API (Application Programming Interface).

This SQL-based (Structured Query Language) database makes it possible to visualize the neural network in real time in the virtual environment. The functions of the SNeuron are:

  • It can communicate with the rest of the network – with other neurons;

  • It can read data from the distributed database and can write data there;

  • It can register itself in the database of the system in order to provide data to the connected interfaces.

The database thus provides an accurate and up-to-date status of the network, and can return the connections and the current tasks of the neurons in real time. Every time there is a change, the database is updated, which triggers an event in the viewer, making the network quasi-animated on the visual interfaces. In this way, we can actually query a snapshot of the network, which is suitable for its visual display and control.
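The register-and-snapshot cycle described above can be sketched in a few lines. The following Python example is only an illustration, not the RKNet implementation (which uses Erlang processes and a Laravel-based REST API): the table layout, the function names `open_registry`, `register` and `snapshot`, and the edge labels are all assumptions made for this sketch, and an in-memory SQLite database stands in for the system database.

```python
import sqlite3

def open_registry():
    """Create an in-memory registry; this schema is assumed for illustration."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE neuron (name TEXT PRIMARY KEY, task TEXT)")
    db.execute("CREATE TABLE edge (src TEXT, dst TEXT, label TEXT)")
    return db

def register(db, name, task, neighbours):
    """A neuron records itself, its current task and its labelled outgoing edges."""
    db.execute("INSERT OR REPLACE INTO neuron VALUES (?, ?)", (name, task))
    db.execute("DELETE FROM edge WHERE src = ?", (name,))
    db.executemany(
        "INSERT INTO edge VALUES (?, ?, ?)",
        [(name, dst, label) for dst, label in neighbours],
    )
    db.commit()

def snapshot(db):
    """Return the network as (nodes, edges): an edge-tagged directed graph."""
    nodes = dict(db.execute("SELECT name, task FROM neuron ORDER BY name"))
    edges = db.execute(
        "SELECT src, dst, label FROM edge ORDER BY src, dst"
    ).fetchall()
    return nodes, edges

db = open_registry()
register(db, "iln1", "input", [("hln1", "w=0.4")])
register(db, "hln1", "hidden", [("out1", "sum")])
register(db, "out1", "output", [])
nodes, edges = snapshot(db)
```

A viewer would call something like `snapshot` after each change event and redraw the graph from the returned nodes and labelled edges.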

The neural network constitutes a significant part of the system, but not its full functionality. A very important element of the entire framework is the UI (User Interface) that provides the control, as well as the API that implements the communication among the subsystems.

2.3 Rendering

When we create a network with its neurons (SNodes) in the system, we enter the number of neurons in the neural network. They are initially equal and have no task. We need to define which neurons can communicate with each other. As an extra function, we can group neurons by way of specifying their group ID, which can be used to refer to them later.

The VR system runs under the Unity Engine, which can display neural networks in the virtual environment. This subsystem also provides the opportunity to generate learning networks, which are only useful for educational purposes and cannot solve complex problems. It colours the networks automatically and also allows individual templates to be introduced. For the efficient use of the framework in multiple contexts and externally, we have implemented a template system with relative positioning, which places the displayed networks in the middle of the visible area based on the dimensions of the data received from the system. In virtual reality, individual nodes can be grasped, touched and rotated, and their functions can be managed. We can examine the functions of running neurons by holding them in our hands. In VR, there are two ways of interacting with the environment: one is our hands, and the other is a laser pointer device that allows us to simply grab distant nodes and then pull them closer to us. This helps to manage and display larger networks to the user.

To illustrate the model, we created a mobile and Unity-based interface that can be used through VR glasses for managing and illustrating the neural network. Figure 2 shows a randomly generated NN without functionality to represent how we can choose a node and check its state.

Fig. 2

Neural Network Visualization VR in RKNet. Now, neurons have functionality. Source: screenshot from the video. https://www.youtube.com/watch?v=zoD0oWan7u8

2.4 Timing and visualization

The neural network in the system works very fast, especially if we use Lambda functions defined with simple functions. Although this speed is very useful for complex calculations, it is impractical for visual display, since the operation cannot be followed by eye and only the final result is visible in the VR or AR GUI. To solve this problem, we have enabled the neural network to slow down its operation both during the creation of the network and while it is working. It is possible to specify the speed of network creation, as well as the time elapsing between sending messages for each network element group type. In addition to specifying the speed, it is also possible to select certain processes in the network that should not be executed automatically, but upon external intervention, for example, with a click or a VR movement in virtual reality. If a command is issued in VR, such as grabbing or touching one of the neurons, the VR software of the system sends a message to the REST API through a corresponding endpoint. The REST API then calls a function in the Erlang module through a purpose-written function, instructing the neural network to proceed or to send a message. This function also enables an interactive illustration for the user.
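The two pacing mechanisms described above (a delay per group, and processes that wait for a user action) can be illustrated with a small sketch. This Python example is hypothetical: the class name `Pacer`, its methods and the delay values are assumptions made for illustration and are not part of RKNet, whose pacing is done in the Erlang module and triggered through the REST API.

```python
import time
from collections import deque

class Pacer:
    """Delays automatic messages per group and holds back gated senders."""

    def __init__(self, delays, gated=(), sleep=time.sleep):
        self.delays = dict(delays)   # seconds between messages, per group
        self.gated = set(gated)      # senders that wait for a user action
        self.pending = deque()       # held-back (sender, message) pairs
        self.delivered = []          # messages actually passed on
        self.sleep = sleep           # injectable, so tests need not wait

    def send(self, sender, group, message):
        if sender in self.gated:
            self.pending.append((sender, message))  # wait for release()
            return
        self.sleep(self.delays.get(group, 0.0))     # slow down for the viewer
        self.delivered.append((sender, message))

    def release(self, sender):
        """Stands in for the user grabbing or touching `sender` in VR."""
        kept = deque()
        for held_sender, message in self.pending:
            if held_sender == sender:
                self.delivered.append((held_sender, message))
            else:
                kept.append((held_sender, message))
        self.pending = kept

# The sleep function is replaced by a no-op so the example runs instantly.
pacer = Pacer({"input": 0.5, "hidden": 1.0}, gated={"hln1"}, sleep=lambda s: None)
pacer.send("iln1", "input", 0.16)    # delivered after the group delay
pacer.send("hln1", "hidden", 0.38)   # held until the user intervenes
held_before_release = list(pacer.pending)
pacer.release("hln1")
```

Separating the delay from the gate keeps the two settings independent: a group can be slowed down without requiring interaction, and a single process can be made interactive without slowing the rest.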

2.5 Defining NNs for the visualization interface of RKNet

When we create a NN with its neurons in the system, we specify the neurons in the neural network and the communication types they must use. We also specify with whom they will communicate in the NN. This can be one or more neurons or the console from which we start the NN. It may also be another node or the rest of the system. We also have to define the neighbours of each node, i.e. neuron, to whom a message must be sent in connection with an event taking place in a network. We can also specify, for example, what starting value an input neuron should start with, i.e. what it should send to the neurons of the hidden layer. In the example below, we defined a neural network representing a perceptron.

In the example (see Fig. 10 in the Appendix), the neural network is given as a list. Each list element is an ordered n-tuple, whose elements describe the properties of the neuron.

$$\left\{\textrm{iln}1,\kern0.5em \textrm{input},\kern0.5em \left[\left\{\textrm{hln}1,\kern0.5em \left\{0.1,\kern0.5em 0.4\right\}\right\}\right]\right\}$$

iln1 is the name of the neuron, i.e. the internal identifier of the system. The input atom tells us that this element is an input neuron. The third element in the tuple is a list that specifies the neighbours of the neuron. The neuron will send a message to them if the Lambda function describing its behaviour instructs it to do so. This function can be defined by using the console of the framework. In the list describing the neighbourhood, since this is an input neuron, we can specify the ID of the neighbours, as well as the initial values that must be sent to each neighbour at the start and the value of the weight for each. An initial input value does not need to be specified for the neurons of the hidden layer, as they send a message based on what is described in their Lambda function. (It is also possible to specify a weight here, but in this example, we defined the weights assigned to the edges for the input.) The situation is the same in the case of the output layer, but here we can specify that the result is written to the console for verification and testing purposes. This can be useful if, for example, we do not want to calculate its error with a cost function, but want to check the result produced by the output neuron, as in the case of the following output neuron:

$$\left\{\textrm{out}1,\kern0.5em \textrm{output},\kern0.5em \left[\textrm{console}\right]\right\}$$
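To make the shape of such a definition concrete, the two entries shown in the text can be mirrored as plain data and checked for dangling neighbours. This is a Python sketch rather than the Erlang list from the Appendix; the `hln1` entry and the helper names `neighbour_names` and `undefined_targets` are hypothetical and added only so that the example is complete.

```python
# Python mirror of the definition: (name, role, neighbours).
# The hln1 entry is hypothetical; the text only shows iln1 and out1.
network = [
    ("iln1", "input",  [("hln1", (0.1, 0.4))]),  # (neighbour, (initial value, weight))
    ("hln1", "hidden", ["out1"]),
    ("out1", "output", ["console"]),
]

def neighbour_names(entry):
    """Return bare neighbour names, whether or not values are attached."""
    _, _, neighbours = entry
    return [n[0] if isinstance(n, tuple) else n for n in neighbours]

def undefined_targets(net):
    """Names that receive messages but are never defined ('console' is allowed)."""
    defined = {name for name, _, _ in net} | {"console"}
    targets = {t for entry in net for t in neighbour_names(entry)}
    return sorted(targets - defined)

missing = undefined_targets(network)
```

A check of this kind catches a misspelled neighbour name before the network is started, when `missing` would otherwise be non-empty.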

Figure 3 shows a simple perceptron model including the input values of the NN, and Fig. 4 shows its visualization with the output value.

Fig. 3

A simple perceptron model. Source: made by the authors

Fig. 4

The visualization in AR of the previous perceptron model. Screenshot from the video. Source: https://youtu.be/ghI-OezDAFE

Without the Lambda functions that describe the functioning of the neurons, the NN does nothing. Therefore, our Erlang list (see Fig. 10) requires adding the Lambda functions. Figure 11 in the Appendix shows the completed Erlang list with the names of the Lambda functions applied in the definition of the NN.

Knowing the implementations of the Lambda functions or watching the video, it can be seen how the NN works. The three input neurons send the values (0.4, 0.5, 0.2) to the hidden neuron, which multiplies them by the weights (0.4, 0.2, 0.6), sums the resulting three values and sends the sum to the output neuron. This neuron displays true at the output if the given value is more than 0.5; otherwise it displays false. The Lambda functions can be defined either in the Erlang list or outside the list. Figure 12 in the Appendix shows the Lambda function that defines the behaviour of the output neuron.
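The arithmetic of this example can be followed step by step. The sketch below is in Python rather than the Erlang of the Appendix, and the function names are chosen for illustration only; the input values, weights and threshold are the ones given in the text.

```python
inputs = [0.4, 0.5, 0.2]    # values sent by the three input neurons
weights = [0.4, 0.2, 0.6]   # weights on the edges to the hidden neuron

def hidden_neuron(values, ws):
    """Weighted sum computed by the hidden neuron."""
    return sum(v * w for v, w in zip(values, ws))

def output_neuron(total, threshold=0.5):
    """True when the received value exceeds the threshold."""
    return total > threshold

total = hidden_neuron(inputs, weights)   # 0.16 + 0.10 + 0.12
result = output_neuron(total)
print(round(total, 2), result)
```

With these values the weighted sum is 0.38, which does not exceed 0.5, so the output neuron reports false.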

In this case, the function of the output neuron waits for a message. If the content of the message is an {input, Value} pair, the function adds the value received in the message to its current value (State). If the resulting value is greater than 0.5, the result will be true, otherwise false. Then, at each step, it saves its own state to the system database (from which, for example, the VR interface can query it), and then calls itself with the new state. If it receives a stop message, it stops, that is, it does not call itself again.

We can specify the Lambda functions of the neurons in the hidden and input layers in the same way, and we can assign them to the appropriate group in the definition of the NN. Functions are defined at the start of the neural network and must be given to all neurons of that type by specifying the name label, as can be seen in Fig. 11.
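The behaviour of this output neuron can be approximated outside Erlang as a simple message loop. The Python sketch below is an analogy, not the function in Fig. 12: a `queue.Queue` stands in for the Erlang mailbox, a list stands in for the system database, a `while` loop replaces the recursive self-call, and the message values are made up for the example.

```python
import queue

def output_neuron(inbox, store, state=0.0, threshold=0.5):
    """Process messages until 'stop'; `store` stands in for the system database."""
    while True:
        message = inbox.get()
        if message == "stop":
            return state                 # stop: do not continue the loop
        tag, value = message
        if tag == "input":
            state += value               # add the received value to the state
            store.append((state, state > threshold))  # save state and result

inbox = queue.Queue()
for msg in [("input", 0.3), ("input", 0.3), "stop"]:
    inbox.put(msg)
log = []
final = output_neuron(inbox, log)
```

Each entry of `log` pairs the state after a message with the true/false result, which is the kind of record a visual interface could query after every step.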

At the start of the network, the system will start the input_neuron function in the case of the input neurons in the input layer, and naturally, the hidden_neuron function in the case of the hidden layer and the output_neuron function in the case of the output. After that, we just have to start the network either from the console or from the REST API endpoints and its operation can be monitored on one of the visual interfaces.

By defining more complex Lambda functions, neurons can handle more difficult tasks; thus the NNs created in the system can work as a complex, highly error-tolerant distributed system, whose functions are defined in the Lambda functions of each individual neuron.

In the next example, we define a simple NN to study its working including backpropagation. Figure 13 in the Appendix shows the Erlang list with the names of the Lambda functions.

After the implementation of the functions of the layers and starting the NN, we can study the working of the NN through the VR output screen. Figure 5 represents a screenshot of the video.

Fig. 5

Investigation of the operation of a Neural Network in. (In the calculation of the MSE value, we used 0.5 instead of 1 to ease the calculation of the derivative.) Screenshot of the video. Source: https://youtu.be/Ak-j9_RD-Rk
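The remark in the caption about using 0.5 in the error can be made explicit. As a sketch, assuming a single output y and target t (symbols introduced here for illustration, not taken from the original), the halved squared error and its derivative with respect to the output are:

$$E=\frac{1}{2}{\left(t-y\right)}^2,\kern1em \frac{\partial E}{\partial y}=-\left(t-y\right)$$

With a factor of 1 instead of 0.5, the derivative would carry an extra factor of 2; the 0.5 cancels it, which simplifies the backpropagation step.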

3 Formative research to identify the learning challenges faced by the students

Before designing the educational part of our visualization tool, we recruited 5 instructors (5 female) who have taught NNs at two universities. We interviewed them one-on-one in a conference room (2/5) and via video-conferencing software (3/5); each interview lasted about 25 minutes. Through these interviews, we learned that instructors currently use simple illustrations with simple examples to explain NN concepts and give URLs of demonstration videos about the working of NNs.

We asked our BSc and MSc students who have previously studied neural networks to fill out an online survey. We received 31 responses (4 female, 27 male). Among them, 26 were BSc students and 5 were MSc students. We asked participants what were “the major difficulties in learning NNs” and the “key requirements in a Visual Learning Tool including the most desired features of a visualization tool for explaining NNs” they would have used during the course. Participants were allowed to choose from a range of options (see Figs. 14 and 15). The aggregated results of this survey are shown in Figs. 6 and 7. (We used Microsoft Excel for data analysis.)

Fig. 6

Survey results from 31 students who have already learned about NNs. Major difficulties encountered during learning

Fig. 7

Survey results from 31 students who have already learned about NNs. Key requirements in a Visual Learning Tool for NNs

During the development of the AR/VR subsystem, we tried to follow the results of this survey to fulfil the requirements. After completing the current version of the system, we constructed the NNs presented earlier with varying activation functions and learning rates, as well as an additional simple NN with two hidden layers and two output neurons, each using different activation functions and learning rates.

4 Results

We conducted an observational study to investigate how our students would use this system to learn about NNs, and also to test the usability of the system. We recruited 21 student participants from our university (3 female, 18 male). Four students were MSc students and the others were BSc students. All participants were interested in learning NNs, and none of them was familiar with RKNet beforehand. Participants self-reported their level of knowledge of non-neural network machine learning techniques (see Figs. 16 and 17). We also provided a feature checklist, which outlined the main features of our tool, and asked them to try the first NN with different activation functions and learning rates, then the second one. During the study, participants were asked to think aloud and share their computer screen with us; they were encouraged to ask questions when necessary. The exit questionnaire included a series of 7-point Likert-scale questions about the utility and usefulness of different views in RKNet (Figs. 8 and 9). All average Likert ratings were above 6 except for the rating of “interactive calculation views”. From the high ratings and our observations, participants found our tool easy to use and understand, retained a high engagement level during their session, and eventually gained a better understanding of NN concepts.

Fig. 8

Average ratings from 21 participants regarding the usability and usefulness of RKNet. Participants thought RKNet was enjoyable, easy to use and helped them learn about NNs

Fig. 9

Average ratings from 21 participants regarding the usability and usefulness of RKNet. Usefulness, especially of the animations, was rated favourably

Another useful feature of RKNet that participants mentioned was the interactions, which received the highest rating in the exit questionnaire (Fig. 8). We found that interactions helped to increase participants’ engagement level (e.g., spending more time and effort) and made RKNet more enjoyable to use.

The main student objection was that hyperparameters cannot be varied in AR.

5 Conclusions and future work

RKNet is a framework with which we can create Neural Networks in the Erlang programming language to solve various tasks. Network operation is monitored in a virtual environment. We also made the VR tool suitable for non-experts, as it allows an easy and accessible introduction to NNs. We plan to provide the option of changing hyperparameters interactively in the VR view. The system is a prototype, so we are constantly developing it, and we add new functions to the system based on needs.

In addition to VR, we also targeted the AR environment. Here, the aesthetic placement of the network in real space caused problems, but we will solve this soon with the help of plane detection. So AR will also become an available option (although currently, it is only available in the test phase, as we focused on VR).

We plan to create a higher level of interaction, where all elements and functions of the network can be created even from the VR environment. We also aim to supplement the framework with a web-based education system. Later, we plan to release RKNet as open-source software.