In this paper, we present an overview of the MOVING platform, a user-driven approach that enables young researchers, decision makers, and public administrators to use machine learning and data mining tools to search, organize, and manage large-scale information sources on the web such as scientific publications, videos of research talks, and social media. In order to provide a concise overview of the platform, we focus on its front end, which is the MOVING web application. By presenting the main components of the web application, we illustrate what functionalities and capabilities the platform offers its end-users, rather than delving into the data analysis and machine learning technologies that make these functionalities possible.
- MOVING platform
- MOVING web application
- Recommender system
- Adaptive training support
Scholars and professionals in various sectors of the economy, including public administrators, corporate compliance officers, and auditors, deal with an ever-increasing flow of information (new scientific publications, business documents and multimedia files, laws, etc.). They need sophisticated tools to evaluate all this information quickly and accurately and to visualize the analysis results. Specifically, this means that, on the one hand, they need tools that enable state-of-the-art search and semantic analysis of large volumes of digital content by providing (i) access to an extensive source inventory, (ii) advanced search and visualization methods, and (iii) functionalities for generating new knowledge from these digital assets. On the other hand, these tools need to be reasonably easy for their users to understand and must support them through (i) a detailed and scientifically proven help system (tutorials, guidance), (ii) individually configurable training programmes (learning modules, videos), and (iii) a lively community of people who have similar interests or problems to solve. To face these challenges, the interdisciplinary trans-European project MOVING (“TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation”) (Vagliano et al. 2018) has built an innovative training platform that enables users from various societal sectors to fundamentally improve their information literacy by training in how to choose, use, and evaluate data mining methods in their daily research and business tasks, and to become data-savvy information professionals.
2 Digitized Science
Initiatives by the European Union (which has long been pursuing a digital agenda) to support research in the field of digitized science illustrate the need to investigate the related change processes (European Commission 2016). Developing the practice of science requires both empirical and theoretical justification. The approach discussed here was developed in the MOVING project, which offers a training platform that supports scientists and other users from all areas of society in fundamentally improving their information literacy in research-oriented contexts (Footnote 1). The project trains users to select, apply, and evaluate technologies and data mining methods, so that research staff can develop into ‘data-savvy’ information professionals in their daily research routines (Scherp et al. 2016; Köhler et al. 2016a, b).
In terms of content, the methodological changes in scientific practice cannot easily be explained as domain-specific activities. Explaining them requires analyses of both current technological developments and the changes in how scientists use these technologies (or methods). The eScience Saxony research network provides findings on both perspectives (see, e.g., Pscheida et al. 2013, 2014). The network has observed the following:
- there is great potential for the use of new digital tools in research;
- preferred topics for development are scientist collaboration and the visualization of (often large or new) databases;
- transitions between the subject areas of research and teaching can also be observed in technology development;
- almost all scientists do most of their work using computer-based technologies and have access to appropriate infrastructures;
- scientists sometimes find it difficult to adopt new media technologies in research and teaching (e.g. social media), although there are also subject-specific differences;
- there is still uncertainty regarding the requirements, possibilities, and assumed risks of open-access publishing;
- research methodology has not been fully systematically discussed and is often inadequately implemented;
- there are no clear standards for high-quality research technology and no recognizable institutionalization to support open-access trends in science, so these still need to be worked out together;
- digital change in science is comparatively rapid from an individual (scientist) perspective, and its outcome is not known, especially regarding location-determining infrastructures.
Indeed, this list largely matches the demands addressed by the MOVING project. Nevertheless, MOVING focused on two additional characteristics. First, there was a serious interest in addressing research activity not only in academia but also in public administration and industry. Second, when developing the approach, the project consortium decided to focus directly on the related skill development as well, i.e. to include a serious effort on innovation in the educational dimension (the Online Literacy Training and Learning) that needs to accompany any new technology in every sector.
3 Overview of the MOVING Platform
An overview of the MOVING platform architecture is illustrated in Fig. 1, which shows the most important components and their relationships. The main component blocks are (i) data acquisition, (ii) data processing, (iii) back-end data storage, user tracking, search and recommendation, and (iv) the MOVING web application that includes the front-end search. In this section, we briefly describe the overall platform.
The MOVING web application is the core of the platform and the interface to the user. The main entry points to the web application are the community section, the learning environment, and the search interface. The search interface offers different visual representations of search results. These visualizations allow the user to explore the search results in various ways. For this purpose, four visualizations have been added to the MOVING platform, namely: (i) the Concept Graph, which displays the search results as an interactive network, (ii) uRank, a dynamic document ranking view, (iii) Top Properties, a bar chart visualization that aggregates the results based on their properties, and (iv) a Tag Cloud, showing the most frequently occurring keywords. Moreover, the Adaptive Training Support (ATS) widget supports users in learning how to search and provides material suited to their needs (Fessl et al. 2018), and the Recommender System (RS) widget (bridging the front and back ends of the platform) points users to potentially relevant documents by evaluating their last search queries. Thanks to the responsive design of the web application, all views adapt to different screen sizes, automatically changing the layout according to the capabilities of the device.
Private user data and public documents are stored in three separate databases: The web application database holds the data for the communities, the learning environment, and the ATS. The index holds the public documents and generated metadata information such as topics, authors, and extracted entities. The user-interaction tracking captures user interactions with the web application and stores them securely in a third database. User tracking provides additional data for both the ATS and the RS, which form the basis for user support by these two widgets.
The index used by the search interface is populated by various data acquisition components (e.g. web crawlers and a Bibliographic Metadata Injection service), which increase the amount of data accessible through the MOVING platform. To date, it hosts over 22 million documents and metadata records. These records include books, scientific articles, laws and regulations, documents about funding opportunities, videos (e.g. of lectures and tutorials), and social media posts. Data processing components are applied to these records to improve the quality of the data and make it easier to search. Additional features (the Data Integration Service, Author Name Disambiguation, Deduplication, Named Entity Recognition and Linking, and Video Analysis) all refine and enrich the documents stored in the index.
Author name disambiguation addresses the problem that the same author name can belong to several different real-world authors. To deal with this problem, a novel method (Backes 2018a, b) has been developed which applies, for a given author name, agglomerative clustering on features extracted from the documents containing the author mention in question, such as affiliation, co-authors, referenced authors, email addresses, keywords, and publication years. The disambiguation procedure calculates the probability that author mentions with the same name belong to the same person. Name mentions with a high probability of belonging to the same author are assigned a unique internal author ID. In this way, mentions of the same name are distinguished if they refer to different real-world persons. As a result, users who click on the name of an author of a document in the result list of a search will only see documents from authors who have the same author ID as the selected author (instead of all documents authored by any person with that name). A modified version of this method has been applied for document deduplication.
In the following, we present the front end of the MOVING platform in detail, in order to provide a concise summary of what a user can do with it. For details on how individual data processing, data acquisition, and other back-end components work, the interested reader is referred to the relevant publications, such as (Nishioka and Scherp 2016; Galanopoulos and Mezaris 2019; Tzelepis et al. 2018), as well as the documentation available on the MOVING project web site (Footnote 2).
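The clustering idea described above can be illustrated with a minimal sketch. It is not the method of Backes (2018a, b): it swaps the probabilistic similarity for a plain Jaccard overlap of feature sets, uses average linkage, and stops at a hypothetical threshold. The function names, the feature encoding, and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two feature sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_similarity(c1, c2, features):
    """Average-linkage similarity between two clusters of mention indices."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(jaccard(features[i], features[j]) for i, j in pairs) / len(pairs)


def disambiguate(features, threshold=0.2):
    """Agglomerative clustering of mentions that share one author name.

    `features` holds one feature set per mention. Returns one illustrative
    author ID per mention; mentions with equal IDs are treated as one person.
    The threshold is an assumed value, not one taken from the source.
    """
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_similarity(clusters[p[0]], clusters[p[1]], features),
        )
        if cluster_similarity(clusters[a], clusters[b], features) < threshold:
            break  # nothing left that is similar enough to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    author_ids = [None] * len(features)
    for cluster_id, members in enumerate(clusters):
        for mention in members:
            author_ids[mention] = cluster_id
    return author_ids


# Three mentions of the same name; the first two share affiliation and co-author.
mentions = [
    {"aff:TU Dresden", "co:A. Meyer", "kw:search"},
    {"aff:TU Dresden", "co:A. Meyer", "kw:ranking"},
    {"aff:MIT", "co:B. Lee", "kw:proteins"},
]
ids = disambiguate(mentions)
```

In this toy input the first two mentions end up with the same ID and the third with a different one, which mirrors the behaviour described for the author links in the result list.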
4 The MOVING Web Application
Search is a key functionality in the MOVING web application. At the back end, the MOVING search engine is based on Elasticsearch (Footnote 3), configured with appropriate parameters and fine-tuned to efficiently index tens of millions of documents. At the front end, the user sees a search page (Fig. 2), with various search options and filters on the left, visualizations of the results in the centre of the window, and training functionalities such as the ATS on the right. The search history of the current user can also be viewed, to support future searches.
To enable platform users to view and replicate their previous searches, the search history view is connected with WevQuery (Apaolaza and Vigo 2017). WevQuery serves as an interface to the data generated by UCIVIT (Apaolaza et al. 2013), the tracking tool that logs user-interaction data. From WevQuery, we obtain the user's previous search queries, the time at which each query was performed, and the number of documents retrieved. This information is then used to build the search history view, an example of which is shown in Fig. 3.
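To make the combination of free-text search and filters more concrete, the following sketch builds a request body in the Elasticsearch Query DSL (a `bool` query with a `multi_match` clause and `term`/`range` filters). The field names `title`, `abstract`, `docType`, and `year`, as well as the boost on the title, are assumptions for illustration; the source does not describe the actual index mapping or queries used by MOVING.

```python
def build_search_body(text, doc_type=None, year_from=None, year_to=None, size=10):
    """Build an Elasticsearch Query DSL body for a filtered full-text search.

    All field names are hypothetical placeholders, not the MOVING schema.
    """
    filters = []
    if doc_type is not None:
        # Exact match on a keyword field, e.g. "video" or "book".
        filters.append({"term": {"docType": doc_type}})
    year_range = {}
    if year_from is not None:
        year_range["gte"] = year_from
    if year_to is not None:
        year_range["lte"] = year_to
    if year_range:
        filters.append({"range": {"year": year_range}})
    return {
        "size": size,
        "query": {
            "bool": {
                # Scored full-text part; "title^2" boosts matches in the title.
                "must": [{"multi_match": {"query": text, "fields": ["title^2", "abstract"]}}],
                # Filters restrict the result set without affecting the score.
                "filter": filters,
            }
        },
    }


body = build_search_body("open science", doc_type="video", year_from=2015)
```

Keeping the filters in the `filter` clause rather than in `must` means they narrow the results without changing the relevance ranking, which fits a search page where filters sit next to a ranked result list.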
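As a rough illustration of how such a history view could be assembled, the sketch below filters logged interaction events for one user and keeps the three pieces of information named above. The event format (keys `user`, `type`, `query`, `timestamp`, `hits`) is hypothetical and does not reflect the actual WevQuery or UCIVIT data model.

```python
from datetime import datetime


def search_history(events, user_id):
    """Return a user's past searches, most recent first.

    `events` is a list of dicts in an assumed log format; only events of
    type "search" that belong to `user_id` are kept.
    """
    entries = [
        {
            "query": event["query"],
            "time": datetime.fromisoformat(event["timestamp"]),
            "hits": event["hits"],
        }
        for event in events
        if event["user"] == user_id and event["type"] == "search"
    ]
    return sorted(entries, key=lambda entry: entry["time"], reverse=True)


log = [
    {"user": "u1", "type": "search", "query": "open science",
     "timestamp": "2019-03-01T10:15:00", "hits": 1200},
    {"user": "u1", "type": "click", "query": None,
     "timestamp": "2019-03-01T10:16:00", "hits": None},
    {"user": "u2", "type": "search", "query": "auditing",
     "timestamp": "2019-03-02T09:00:00", "hits": 87},
    {"user": "u1", "type": "search", "query": "data mining",
     "timestamp": "2019-03-03T08:30:00", "hits": 5400},
]
history = search_history(log, "u1")
```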
To present the results of a user query effectively, several visualizations have been implemented. Four characteristic ones are:
- Concept Graph. For the discovery and exploration of relationships between documents and their properties.
- uRank. A tool for the interest-driven exploration of search results.
- Top Properties. A bar chart displaying aggregated information about the properties of the retrieved documents.
- Tag Cloud. A visualization for the analysis of keyword frequency in the retrieved documents.
Concept Graph: an interactive network visualization. The Concept Graph (Fig. 4) visualizes direct and indirect connections between retrieved search results. For example, a single, disambiguated author of two different publications is visualized as a node in the graph connecting the corresponding publications. Further extracted and disambiguated entities are visualized in the same way, so that users can quickly grasp structures such as research networks. The initial graph visualization starts with a few collapsed nodes. These nodes can be expanded to reveal initially hidden nodes and to incrementally add more information to the graph. Thus, users are not overwhelmed with too much information when they start their search.
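The underlying structure can be sketched as a simple graph in which documents and disambiguated authors are nodes and authorship is an edge. The sketch below is only an assumed data model for illustration (the record keys `id` and `author_ids` are hypothetical); it also shows how expanding a node could reveal neighbours that are not yet visible.

```python
from collections import defaultdict


def build_concept_graph(documents):
    """Build an undirected document-author graph as an adjacency mapping.

    Nodes are (kind, identifier) tuples, so a document and an author can
    never be confused even if their identifiers coincide.
    """
    graph = defaultdict(set)
    for doc in documents:
        doc_node = ("document", doc["id"])
        for author_id in doc["author_ids"]:
            author_node = ("author", author_id)
            graph[doc_node].add(author_node)
            graph[author_node].add(doc_node)
    return graph


def expand(graph, node, visible):
    """Return the neighbours of `node` that are not yet shown to the user."""
    return graph[node] - visible


docs = [
    {"id": "d1", "author_ids": ["a1"]},
    {"id": "d2", "author_ids": ["a1", "a2"]},
]
graph = build_concept_graph(docs)
# Start with one visible document and expand it step by step.
visible = {("document", "d1")}
visible |= expand(graph, ("document", "d1"), visible)
newly_shown = expand(graph, ("author", "a1"), visible)
```

Here the shared author `a1` becomes the node that connects the two publications, and the second document only appears once that author node is expanded.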
uRank: interest-based result set exploration. Based on the search query, the top 100 retrieved results are displayed as a ranked list. The keywords extracted from the results are presented in the Tag Cloud in the right sidebar of uRank (Fig. 5, point A). When the user selects keywords of interest, the results in the list (Fig. 5, point C) are re-ranked in such a way that the results containing the selected keywords move to the top. The ranking view (Fig. 5, point D) provides visual feedback on the relevance of each result. It is possible to select multiple keywords and even fine-tune their importance by using the slider under the selected words (Fig. 5, point B). Clicking on a result opens a dialogue box, which presents additional information about the retrieved document. The user can export the current view of uRank, with the current search configuration, by clicking on the export button, which initiates the download of a zip file containing an image and a report text file.
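The re-ranking behaviour can be approximated with a short sketch. The scoring rule used here (sum of slider weight times keyword count) is an assumption chosen for simplicity; the source does not specify how uRank computes its scores, and the record layout is hypothetical.

```python
def rerank(results, keyword_weights):
    """Re-rank results by the selected keywords and their slider weights.

    Each result carries a mapping from keyword to its count in the document.
    Python's sort is stable, so results with equal scores keep the order
    originally provided by the search engine.
    """
    def score(result):
        return sum(
            weight * result["keywords"].get(keyword, 0)
            for keyword, weight in keyword_weights.items()
        )

    return sorted(results, key=score, reverse=True)


results = [
    {"id": "r1", "keywords": {"privacy": 1}},
    {"id": "r2", "keywords": {"search": 3}},
    {"id": "r3", "keywords": {"search": 1, "privacy": 2}},
]
by_search = [r["id"] for r in rerank(results, {"search": 1.0})]
by_both = [r["id"] for r in rerank(results, {"search": 0.5, "privacy": 1.0})]
```

With only "search" selected, the order becomes r2, r3, r1; lowering its weight and adding "privacy" moves r3 to the top.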
Top Properties: the Top Properties visualization uses the 100 most relevant results for the current search query. It shows a bar chart presenting one of the following properties of the available results: Authors, Keywords, Concepts, Sources, and Year of Publication. The bars are ordered by the frequency of the values of the selected property, as can be seen in Fig. 6. When the publication year is selected, the sorting order changes so that the years are displayed in chronological order, which makes it easier to identify year-on-year changes. Clicking on one of the bars shows the results associated with this property value in a small dialogue box. The results in this dialogue are sorted in the order provided originally by the search engine. The Top Properties visualization also supports an export functionality, which exports the current view of the visualization with its search configuration.
Tag Cloud: the Tag Cloud visualization (Fig. 7) takes the 100 most relevant results for the search query and shows the most frequent keywords that occur in their titles and abstracts. The displayed keywords are initially sorted by their frequency and can be filtered by occurrence, year, or text. Clicking on one of the keywords shows the results associated with this keyword. The results are sorted in the order provided originally by the search engine.
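The aggregation behind such a bar chart can be sketched in a few lines. The record keys (`authors`, `year`) and the `limit` parameter are assumptions for illustration, not the platform's actual data model.

```python
from collections import Counter


def top_properties(results, prop, limit=10):
    """Count the values of one property across the results.

    Returns (value, count) pairs: most frequent first for ordinary
    properties, chronological for "year" (where `limit` is not applied).
    """
    counts = Counter()
    for result in results:
        values = result.get(prop, [])
        if not isinstance(values, (list, tuple, set)):
            values = [values]  # single-valued property such as the year
        counts.update(values)
    if prop == "year":
        return sorted(counts.items())
    return counts.most_common(limit)


results = [
    {"authors": ["A", "B"], "year": 2018},
    {"authors": ["A"], "year": 2016},
    {"authors": ["C", "A"], "year": 2018},
]
top_author = top_properties(results, "authors", limit=1)
by_year = top_properties(results, "year")
```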
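A keyword count of this kind can be sketched as follows. The tokenization, the tiny stop-word list, and the parameter names are simplifying assumptions; the year filter mentioned above is omitted for brevity.

```python
import re
from collections import Counter

# A deliberately small, illustrative stop-word list.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tag_cloud(results, min_count=1, contains=""):
    """Count keywords in titles and abstracts and apply simple filters.

    Returns (keyword, count) pairs sorted by descending count, then
    alphabetically. `min_count` filters by occurrence, `contains` by text.
    """
    counts = Counter()
    for result in results:
        text = f'{result.get("title", "")} {result.get("abstract", "")}'.lower()
        counts.update(
            word for word in re.findall(r"[a-z]+", text) if word not in STOPWORDS
        )
    tags = [
        (word, count)
        for word, count in counts.items()
        if count >= min_count and contains in word
    ]
    return sorted(tags, key=lambda tag: (-tag[1], tag[0]))


results = [
    {"title": "Open science and open data", "abstract": "Data mining for science"},
    {"title": "Mining the web", "abstract": "Open data"},
]
frequent = tag_cloud(results, min_count=3)
matching = tag_cloud(results, contains="min")
```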
4.2 Recommender System
The RS widget, depicted in Fig. 8, is part of the search page. It gives users additional suggestions for resources of which they may not be aware. The RS interacts with the search engine, user-interaction tracking, and dashboard (WevQuery), hence bridging the back and front ends of the MOVING platform. To build user profiles, it obtains the search history from the user data previously logged through UCIVIT and then retrieves the documents to suggest from the index, depending on the user’s profile. The MOVING RS is based on HCF-IDF (Nishioka and Scherp 2016), a novel semantic profiling approach that can exploit a thesaurus or ontology to provide better recommendations. Further information on the MOVING RS is available elsewhere (Vagliano and Nazir 2019).
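The general idea of profile-based recommendation can be sketched as follows. This is explicitly not HCF-IDF: the hierarchical use of a thesaurus or ontology is left out, and a plain concept-frequency times IDF weighting with cosine similarity is swapped in as a simplified stand-in. All names and the toy data are assumptions for illustration.

```python
import math
from collections import Counter


def idf_weights(docs_concepts):
    """Inverse document frequency per concept (smoothed by adding 1)."""
    n_docs = len(docs_concepts)
    doc_freq = Counter(c for concepts in docs_concepts.values() for c in set(concepts))
    return {c: math.log(n_docs / df) + 1.0 for c, df in doc_freq.items()}


def weighted_vector(concepts, weights):
    """Concept frequency times IDF; unknown concepts get weight 0."""
    return {c: freq * weights.get(c, 0.0) for c, freq in Counter(concepts).items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(profile_concepts, docs_concepts, k=3):
    """Rank documents by similarity to a profile built from past searches."""
    weights = idf_weights(docs_concepts)
    profile = weighted_vector(profile_concepts, weights)
    scored = [
        (cosine(profile, weighted_vector(concepts, weights)), doc_id)
        for doc_id, concepts in docs_concepts.items()
    ]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [doc_id for score, doc_id in ranked if score > 0][:k]


index = {
    "d1": ["recommender", "ontology"],
    "d2": ["video", "lecture"],
    "d3": ["recommender", "twitter", "recommender"],
}
# Concepts assumed to have been extracted from the user's search history.
suggestions = recommend(["recommender", "ontology"], index)
```

Documents that share no concept with the profile (here `d2`) are not suggested at all, while `d1` ranks ahead of `d3` because it matches the whole profile.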
Open collaboration and communication are the foundations of open innovation and open science. MOVING communities offer users a powerful tool to organize group collaboration and communities of practice on the MOVING platform (see Fig. 9). MOVING communities are part of the working environment of the platform and offer a range of social technologies for knowledge and information management, including wikis, forums, blog functions, and group news. MOVING communities build on the project management tools and technologies of the eScience platform, on which the MOVING platform is based. The existing eScience modules, which enabled cooperation in closed teams of researchers, were adapted to the goals of the MOVING platform to provide an open innovation environment and foster open collaboration, communication, and knowledge exchange between its users.
Registered users who want to create a new community are offered different options. First, users can create public communities that are visible to everyone in the MOVING platform and can be accessed and edited by anyone interested in the topic. Second, users who want to organize specific project teams or research groups can create private communities that users have to join before they can access and edit content. Private communities are not visible to other users but can be shared with collaborators via email.
The MOVING CK Editor (Footnote 4) enables the creation of formatted text and the integration of multimedia content in HTML pages that are created by users in the MOVING communities. Videos, pictures, GIFs or documents, and social media content from Twitter (Footnote 5) and YouTube (Footnote 6) can all be easily integrated. Features like the accordion and the option to include expandable items make it easy to structure content in the page. It is a WYSIWYG editor (What You See Is What You Get), so even users who are not familiar with HTML can easily use it to create and edit web-based content within MOVING communities.
The wiki module is useful for creating and collaboratively managing large knowledge repositories with a community. The forum module provides space for open communication and information exchange—a precondition for open innovation processes. The forum module contains a user rating functionality that allows the community to publicly rate the content of individual forum entries. Users can vote posts and replies up and down, based on the quality of the contribution. The highest-rated input is highlighted to help users find the best response in a thread, and the summarized score for all received votes is shown on each user profile. The ranking functionality helps communities self-organize and peer assess user-generated content. Community administrators can also choose to assign badges to reward users or motivate them to get actively engaged. Badges can be assigned automatically or manually.
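The rating mechanism can be sketched as follows. The text only states that votes are summarized into a score; the rule used here (up-votes minus down-votes) and the record keys are assumptions for illustration.

```python
from collections import defaultdict


def post_score(post):
    """Assumed scoring rule: up-votes minus down-votes."""
    return post["up"] - post["down"]


def best_reply(posts):
    """Return the id of the highest-rated post in a thread, or None if empty."""
    return max(posts, key=post_score)["id"] if posts else None


def user_scores(posts):
    """Sum the scores of all posts per author, as shown on a user profile."""
    totals = defaultdict(int)
    for post in posts:
        totals[post["author"]] += post_score(post)
    return dict(totals)


thread = [
    {"id": "p1", "author": "ana", "up": 4, "down": 1},
    {"id": "p2", "author": "ben", "up": 7, "down": 0},
    {"id": "p3", "author": "ana", "up": 1, "down": 2},
]
highlighted = best_reply(thread)
totals = user_scores(thread)
```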
The ease of creating and integrating user-generated content, combined with the social features of MOVING communities, opens up a wide range of possible applications. Users can organize group work in small project teams, or create open communities around scientific or technical topics to discuss research or ask questions to an expert community. MOVING communities can be organized as an open innovation tool but also as a learning management system, as the following example shows.
One practical application of MOVING communities is the four-week MOVING MOOC (massive open online course) “Science 2.0 and open research methods”, which was organized on the MOVING platform (see Fig. 10; Footnote 7). The MOOC was organized on the platform as a private team community, so that participants had to register to gain access to the learning materials and the forums. For each week of the MOOC, we created a sub-community containing learning materials in different media formats as well as weekly assignments. The forums were used to organize group communication and allow users to share their assignment results. A wiki was created that contained additional information about the course, learning goals, and technical details about using the editor or the MOOC badges that users could earn on the course (Fig. 11). Badges are displayed on the user’s profile, My page, along with their personal and contact details (profile picture, science field, skills, hometown, institution, email, and ORCID; Footnote 8).
4.4 Learning Environment
MOVING offers a unique combination of working and training features in one platform. The heart of the training programme is the MOVING learning environment. Here, all the learning content is organized and directly accessible to the users. The landing page (Fig. 12) gives an overview of the learning materials, including the platform demo videos and video tutorials, the Learning Tracks for Information Literacy 2.0, and the MOVING MOOC “Science 2.0 and open research methods” that was discussed in the previous subsection. The platform demos are videos hosted on videolectures.net and embedded in the learning environment so that users can learn about the different platform features and technologies developed within the MOVING project. Users can improve their data and information literacy as well as their digital competences through the Learning Tracks for Information Literacy 2.0 (Fig. 13).
4.5 Adaptive Training Support
The ATS (Fessl et al. 2018) comprises two widgets: one for learning how to search and one for curriculum reflection.
The learning-how-to-search widget (Fig. 14) visualizes information about the use of the features provided by the MOVING platform. The widget shows users, in a bar chart, how they have used the features of the platform, to motivate them to explore new features and reflect on their usage behaviour. More information about the widget and its evaluation can be found in (Fessl et al. 2019).
The curriculum reflection widget (Fessl et al. 2019) consists of two parts: the curriculum learning and reflection and the overall progress. The first part consists of two main areas. The upper area either contains a learning prompt (suggesting that the user learn more about the next topic in the current sub-module) and a button which opens the respective learning unit in a new tab (Fig. 15 left), or it presents a reflective question that motivates the user to think about the current topic of their learning (Fig. 15 right). The user’s progress in the current sub-module is displayed at the bottom of the widget.
The overall progress part of the widget shows the user’s learning progress through the curriculum using a sunburst visualization. Figure 16 shows that the curriculum is divided into three modules. Each module is represented as a section in the inner circle of the visualization and divided into three sub-modules in the outer circle. Every time a user completes a new learning unit, the percentage in the respective section in the sunburst diagram is updated. Progress in each sub-module is encoded by colour. If the user has not completed any learning units in a sub-module (0%), the respective section will be red. Making progress in a sub-module will turn the section yellow (50%) and completing it will turn the section green (100%).
This is also explained by the legend below the visualization. Moreover, the sections in the sunburst diagram are ordered to mirror the structure of the curriculum. Starting from the top, the sub-modules are completed clockwise, gradually turning the visualization green.
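The colour coding described above can be captured in a small sketch. It assumes that any partial progress is shown in yellow (the text gives 50% as the example value) and that progress is measured in completed learning units; both the function names and the unit counts are hypothetical.

```python
def section_colour(completed, total):
    """Map progress in one sub-module to a sunburst section colour.

    Assumption: any progress strictly between 0% and 100% is yellow.
    """
    if total <= 0:
        raise ValueError("a sub-module needs at least one learning unit")
    if completed == 0:
        return "red"
    if completed >= total:
        return "green"
    return "yellow"


def module_progress(sub_modules):
    """Percentage of completed units across the sub-modules of one module.

    `sub_modules` is a list of (completed, total) pairs.
    """
    done = sum(completed for completed, _ in sub_modules)
    total = sum(total for _, total in sub_modules)
    return round(100 * done / total)


# One module with three sub-modules of four learning units each.
module = [(4, 4), (2, 4), (0, 4)]
colours = [section_colour(completed, total) for completed, total in module]
percent = module_progress(module)
```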
In this chapter, we presented the MOVING platform, focusing on the MOVING web application with its search interface and novel results visualizations, its community features and learning environment, and components such as the Adaptive Training Support. These functionalities help users not only to search within and visualize a large multimedia collection using various advanced tools, but also to explore the platform more easily, e.g. by showing statistics about their platform use or providing learning guidance. Productive use of the prototype platform in real educational environments, such as the MOVING MOOC, showed how its integrated training and working environment contributes to making information professionals data-savvy and improving users’ information literacy skills.
platform.moving-project.eu, last accessed 7 May 2020.
www.moving-project.eu, last accessed 7 May 2020.
www.elastic.co, last accessed 7 May 2020.
www.ckeditor.com, last accessed 7 May 2020.
www.twitter.com, last accessed 7 May 2020.
www.youtube.com, last accessed 7 May 2020.
moving.mz.tu-dresden.de/mooc, last accessed 7 May 2020.
www.orcid.org, last accessed 7 May 2020.
Apaolaza, A., Harper, S., Jay, C.: Understanding users in the wild. In: Proceedings 10th International Cross-Disciplinary Conference on Web Accessibility (W4A’13), Rio de Janeiro, Brazil (2013)
Apaolaza, A., Vigo, M.: WevQuery: testing hypotheses about web interaction patterns. In: Proceedings ACM Human Computer Interaction 1(EICS), 4:1–4:17. (2017)
Backes, T.: Effective unsupervised author disambiguation with relative frequencies. In JCDL ’18 The 18th ACM/IEEE Joint Conference on Digital Libraries Fort Worth, TX, USA—June 3–7, 2018, edited by Jiangping Chen, Marcos André Gonçalves, and Jeff M. Allen, pp. 203–212. New York: ACM. http://dx.doi.org/10.1145/3197026.3197036 (2018)
Backes, T.: The impact of name-matching and blocking on author disambiguation. In: CIKM ’18: The 27th ACM International Conference on Information and Knowledge Management Torino, Italy—October 22–26, 2018, 803–812. New York: ACM. http://dx.doi.org/10.1145/3269206.3271699 (2018)
European Commission: Digital Economy. Online via: https://ec.europa.eu/digital-single-market/en/economy, last accessed 13 September 2016 (2016)
Fessl, A., Apaolaza, A., Gledson, A., Pammer-Schindler, V., Vigo, M.: Mirror, mirror on my search…: Data-Driven Reflection and Experimentation with Search Behaviour. In: European Conference on Technology Enhanced Learning. pp. 83–97. Springer, Cham (2019)
Fessl, A., Šimić, I., Barthold, S., Pammer-Schindler, V.: Concept and development of an information literacy curriculum widget. In: Conference on Learning Information Literacy around the Globe (2019)
Fessl, A., Wertner, A., Pammer-Schindler, V.: Digging for gold: motivating users to explore alternative search interfaces. In: Lifelong Technology-Enhanced Learning. pp. 636–639. Springer International Publishing, Cham (2018)
Galanopoulos, D., Mezaris, V.: Temporal lecture video fragmentation using word embeddings. In: Proceedings MMM2019. Springer LNCS vol. 11296, pp. 254–265 (2019)
Köhler, T., Pscheida, D., Scherp, A., Koschtial, C., Felden, C., Neumann, J.: Moving research methodology toward eScience. Paper Presentation Track A: Online Research Methodology, General Online Research 2016; Dresden 02.-04.03. (2016)
Köhler, T., Scherp, A., Herbst, S., Wiese, M., Mezaris, V.: Data driven online research. Potential specifications in relation to user needs; International Science 2.0 Conference; Köln 02.-03.05. (2016)
Nishioka, C., Scherp, A.: Profiling vs. time vs. content: what does matter for top-k publication recommendation based on twitter profiles? In: JCDL ’16, pp. 171–180. ACM (2016)
Pscheida, D., Köhler, T., Mohamed, B.: What’s your favorite online research tool? Use of and attitude towards Web 2.0 applications among scientists in different academic disciplines. In: Marsden, C. Tassiulas, L.: Proceedings of the 1st International Conference on Internet Science; Brussels, Sigma Orionis (2013)
Pscheida, D., Minet, C., Herbst, S., Albrecht, S., Köhler, T.: Use of social media and online-based tools in academia. Results of the Science 2.0-Survey 2014; Dresden, TUD Press. SLUB: http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-191110 (2015)
Scherp, A., Pscheida, D., Köhler, T., Wiese, M., Nishioka, C., Mezaris, V., Collyda, C.: MOVING: training towards a society of data-savvy information professionals to enable open leadership innovation. In: 13th European Semantic Web Conference (ESWC 2016); Anissaras 29.05.-02.06. (2016)
Tzelepis, C., Mezaris, V., Patras, I.: Linear maximum margin classifier for learning from uncertain data. IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2948–2962, December (2018)
Vagliano, I., Nazir, S.: Recommending Multimedia Educational Resources on the MOVING Platform. In: Proceedings of the 8th Workshop on Bibliometric-enhanced Information Retrieval, CEUR-WS.org. pp. 148–158 (2019)
Vagliano, I., Günther, F., Heinz, M., Apaolaza, A., Bienia, I., Breitfuss, G., Blume, T., Collyda, C., Fessl, A., Gottfried, S., Hasitschka, P., Kellermann, J., Köhler, T., Maas, A., Mezaris, V., Saleh, A., Skulimowski, A.M.J., Thalmann, S., Vigo, M., Scherp, A.: Open innovation in the big data era with the MOVING platform. IEEE Multimedia 25(3), 8–21 (2018)
This work was supported by the EU’s Horizon 2020 programme under grant agreement H2020-693092 MOVING. The mentioned eScience Saxony research network has been supported by the Saxon State Ministry for Science and Art. The Know-Center is funded within the Austrian COMET Programme, Competence Centers for Excellent Technologies, under the auspices of the Austrian Federal Ministry of Transport, Innovation and Technology, the Austrian Federal Ministry of Economy, Family and Youth and by the State of Styria. COMET is managed by the Austrian Research Promotion Agency FFG.
© 2021 The Author(s)
Cite this chapter
Apaolaza, A. et al. (2021). MOVING: A User-Centric Platform for Online Literacy Training and Learning. In: Koschtial, C., Köhler, T., Felden, C. (eds) e-Science. Progress in IS. Springer, Cham. https://doi.org/10.1007/978-3-030-66262-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66261-5
Online ISBN: 978-3-030-66262-2