Introduction

Digital data are perceived to be valuable in contemporary economies and societies. Since the 2011 World Economic Forum described personal data as a ‘new asset class’ that underpins the development of new products and services (World Economic Forum 2011), policymakers, economic and social actors, and scholars have sought to understand how data create both commercial and social value. For example, digital markets and data have become so important for our economies that in 2022–2023, the European Union introduced the Digital Markets Act to bring order to the digital economy, the Digital Services Act to harmonise rules for online intermediary services and create a safe online environment, and the European Data Act to facilitate the use and exchange of digital data for economic and social benefit.

However, digital data are not inherently valuable, nor do they exist ‘out there’ waiting to be collected and exploited. Instead, data and data products are constructs of political-economic and socio-technical arrangements, which also create conditions for data monetisation (Birch 2023). We are particularly interested in user data, i.e. digital data that are logged and collected as an outcome of an individual engaging with a digital platform. User data include, but are not limited to, personal data. Scholars have analysed how user data are imagined to be made valuable in various sectors, such as in healthcare via behavioural nudging (Prainsack 2020), in insurance via personalisation (McFall et al. 2020), or in the application of big data to food and agriculture (Bronson and Knezevic 2016). The literature also highlights the risks and adverse effects of datafication, including surveillance (Zuboff 2019) and various forms of population control and exploitation (Sadowski 2020). In each case, for digital user data to be made useful and valuable, data must be collected, analysed, and processed to produce various digital products and outputs, such as algorithms, analytics (e.g. scores, metrics), automated decisions, or dashboards (Mayer-Schönberger and Cukier 2013).

As the datafication of our economies and societies has expanded in general, so too has it impacted higher education (HE). Datafication refers to the ‘quantification of human life through digital information, very often for economic value’, with important social consequences (Mejias and Couldry 2019: 1). In education, datafication consists of data collection from all processes in educational institutions at all scales and levels, impacting stakeholder practices (Jarke and Breiter 2019). In HE, policymakers attempt to improve university quality, efficiency, and impact via datafication at the sectoral and institutional levels. For example, the UK Higher Education Statistics Agency (HESA) established a Data Futures programme as an infrastructure for datafying the sector and collecting and collating data from universities (Williamson 2018), with an alpha phase launched in 2021–2022. Moreover, Jisc, a HE sectoral agency providing network and IT services, supports universities with various initiatives, such as the Data Maturity Framework launched in 2024, which universities can use as a template to improve data capabilities and datafy their institutions. Digital data, then, is one of the foundational elements of postdigital education because digital technologies that staff and students use every day are increasingly data-based (Jandrić et al. 2024; Jandrić and Knox 2022).

User data in HE are valuable not only for universities and policymakers but also for the EdTech industry. Scholars aligning themselves with the field of critical data and platform studies in education (Decuypere et al. 2021) have already conducted excellent research into various aspects of data practices related to economic value, such as EdTech’s commercial interest not always sitting well with user privacy (Hillman 2022) and the work needed to produce and manage school data (Selwyn 2020). Specifically in HE, emerging work has found that EdTech companies turn user data into assets they control (Hansen and Komljenovic 2023). EdTech incumbents such as Pearson have evolved into data organisations with intensive mobilisation of data analytics for impacting HE processes and governance (Williamson 2016, 2020). Research has also identified tensions and unintended consequences in relation to data work at universities (Selwyn et al. 2018), pedagogic, cultural, and social effects (Williamson et al. 2020), and the need for universities to pay greater attention to privacy issues and data standards in procurement processes (Ali et al. 2024). Thus, research in this field highlights (1) the relations between EdTech companies and universities as pivotal and (2) the dynamics of the EdTech industry as being highly relevant for the sector.

Data in HE are understood to be valuable in terms of their use, which is mostly the ambition of universities, and in economic terms, which is mostly the concern of the EdTech industry (Komljenovic et al. 2024a, b). In this article, we contribute to the literature by examining strategies employed by EdTech startups to make digital and personal data valuable in HE and the struggles that these startups confront. In other words, we examine the economic dimension of postdigital HE, which is co-constitutive of the socio-material assemblages of digital products and services (Knox 2019; Lupton 2018). Understanding how digital data can be made economically valuable is important because the monetisation of user data is consequential for university practices and the nature of postdigital HE, and because governments and organisations see digital data as the premise of contemporary economies in which HE is embedded. Moreover, we specifically focus on EdTech startups because of the promised transformation and disruption that they seek to achieve in HE (Decuypere et al. 2024; Ramiel 2020). As a result, we can reasonably expect these companies to be leaders of datafication processes.

In what follows, we first elaborate on our conceptual and empirical approach. We then move to discuss the economic construction of data value by EdTech startups and the challenges they confront, before concluding with some reflections on the impact that data monetisation has in HE.

EdTech Startups and Data Value

The number and diversity of EdTech startups have grown since the early 2010s, supported by the increase in EdTech investment between 2012 and 2021, including by venture capital (Komljenovic et al. 2021). EdTech startups are imagined to disrupt the education sector (Ramiel 2020) and deliver their products globally (Nivanaho et al. 2023). They need to scale and move fast and reframe the meaning of education and what it means to be a student and teacher (Decuypere et al. 2024), and they are influenced by their investors (Komljenovic, Williamson et al. 2023; Williamson and Komljenovic 2023). The discursive construction of the EdTech startup industry puts digital data at the centre insofar as disruptive technologies are predicted to rely on user data and deliver data-powered services (Williamson 2021). Therefore, digital user data collected by EdTech platforms are understood to be useful and valuable for the education sector when incorporated into data-driven products (e.g. personalised learning) and for the EdTech industry when digital data they collect can be monetised, i.e. making these data economically valuable. However, this is not an easy task.

Digital data are made valuable in different ways, depending on the distinct needs of various industries or the aims of different products, including by profiling and targeting people, optimising systems, managing and controlling things, modelling probabilities, building new products, and growing asset value (Sadowski 2019). However, despite the omnipresent belief in the value of data in our economies, measuring and realising this value is still proving challenging (Birch et al. 2021).

One way to understand these attempts to monetise digital data is through the lens of valuation studies. From this perspective, value is understood relationally as a process rather than a fixed quality of a given entity. Value is thus a ‘social practice where the value or values of something are established, assessed, negotiated, provoked, maintained, constructed, and/or contested’ (Doganova et al. 2014: 87). This view allows us to illuminate and scrutinise the social practices of valuation and their effects. How things are negotiated to be valuable changes the very nature of social practices. In our study, the way digital data are made valuable impacts not only digital platforms and their business models, but also how data outputs are seen to be valuable. For example, if a specific metric, such as analytics on a student’s reading patterns, is considered valuable, then it has economic effects as we may be willing to pay for such analytics, as well as more substantive effects on how we think about reading, its role in the study process, and so on. Thus, the social practices of making things valuable, how they become valued, and the impact of such valuation are crucial for teaching, learning, and the future of HE. This is especially relevant in moments of controversy or novelty (Berthoin et al. 2015), such as in the case of data value (Birch 2023). Our approach is to scrutinise the practices and strategies of startups that are trying to make digital data valuable and to discuss the effects of these practices and strategies.

Methodology

This article is based on a three-year research project examining EdTech in the UK HE sector. Our overall research aim was to investigate new forms of value in digitalised HE, especially value created through datafication. For this article, we draw on a part of our data corpus that includes 24 interviews and 540 documents collected from 16 EdTech startup companies; 14 in-depth interviews conducted at four universities and 243 documents collected from eight universities; and six focus groups including 19 participants from 17 universities. EdTech startup interviewees included company founders, lead engineers, marketing directors, university relations managers, and similar roles. University interviewees and focus group participants included directors and managers of IT and learning technology units, chief information officers, academic leaders of digital strategy, and similar roles. Due to commitments to maintain anonymity that were given in our ethical protocol, we can only present an overall aggregate analysis of themes and use direct quotes only from a limited number of interviews with EdTech startups and focus group transcripts with universities.

Our analysis was undertaken in three stages. First, we developed a thick description (Ponterotto 2006) of each participating organisation (i.e. each university and startup company). Second, we applied a cross-case thematic analysis (Braun and Clarke 2006) that included a number of aggregate themes covering digitalisation and platformisation dynamics, datafication and data value, assetization processes and future imaginaries, EdTech business models, and attitudes towards EdTech and HE (Komljenovic et al. 2024a, b). Finally, we applied our theoretical framework to the thick descriptions from the first step and the themes from the second step to specifically focus on EdTech startup strategies and struggles confronted when making digital user data economically valuable. We elaborated five startup strategies for monetising user data and six areas of struggle for these startups.

Monetising Data: EdTech Startup Strategies of Making Data Valuable

EdTech startup companies that are active in HE find it challenging to make digital data valuable. Most of the companies we analysed did relatively little with the user data they collected, yet there remained an omnipresent belief in the value of data. Our participants stated that user data will drive their companies’ future value and that user data are the most valuable thing about their respective companies. For example, one participant stated that ‘there is gold in data. I just don’t know how to mine it yet. So that’s something I need to work on’ (C06P01). Most of the analysed companies were in the process of finding ways to monetise data. We structure our analysis of their strategies into three types: (1) offering various digital products and finding value in collected data; (2) controlling data and deriving value from this control; and (3) offering data products as a service.

Digital Products

This strategy was evident in the work of the majority of the EdTech startup companies we examined. They offer various products, such as learning or reading platforms, where the data service is not a primary offering. We found that these EdTech startup companies focus on developing their primary offering and are only in the very early stages of thinking about data value. They are in the process of experimenting with making data valuable, as suggested by this participant:

[W]e’re just scratching the surface about what the value of those things might be over time. ... I think there’s an awful long way to go before that’s properly revealed, and indeed, what other things you might choose to do with the data. (C11P02)

Most companies in this group thus believe there is value in the user data they collect, but they still need to find ways to monetise it. We identified two such strategies for creating economic value in HE as these companies embark on a data monetisation journey: (1) datafying products and (2) data externalities.

Datafying Products

In the first strategy, datafying products, startups integrate data outputs such as usage analytics or group comparisons into their digital products. The basic premise is that data insights are valuable in their own right because they can be useful for purposes such as enhancing institutional efficiencies or supporting student experience, and consequently, they drive up the value of digital products. A participant from an EdTech startup described how companies process and display data outputs:

We have it all on our data lake. Then, we have a process of transforming the data, so connecting it together, applying logic to the data so that we can embellish the data and make it more useful. … We model our data into a new structure that makes it easy for people to use for analysis. (C03P02)

A common trend we identified is that companies construct ever new metrics, indicators, and analytics from the user data they collect. After new insights are calculated and presented, they can be launched in the existing products or added as optional or add-on services for extra fees. Similarly, new practices might be imagined based on calculations of potential value, and various actors may eventually be convinced of their usefulness. For example, based on insights into how many students access an assigned reading via a virtual learning environment, we were told inferences could be made about how well specific tutors chose course readings, which can be promoted to university leaders as a measure of staff performance. Furthermore, companies seek different ways to report on metrics or construct new indicators and test how users react.

Companies promote possible data products deriving from their digital platforms to universities, as explained by this focus group participant talking about an EdTech incumbent:

I had a meeting with our account manager … recently and she’s really pushing how we don’t harness the data enough that we generate within the system. Asking how we currently use the data and how we analyse the data, whether it feeds into any analytics, dashboards, etcetera. And I said, well it does, but it’s only sort of top level engagement data. She was quite surprised, wanted to set up meetings to really push that, you know, to drive actionable intelligence. (G2P1)

This quote refers to the example coming from an EdTech incumbent; however, it is representative of EdTech companies, including startups, motivating and educating universities about available data outputs and how to use them.

In this strategy, analytics are also used for product development. Companies collect information relating to user activity on their platform and these data are used in product and feature development, as described by this participant:

So, if you’ve built five features and one of them is maybe used. Is it worth spending any more time looking at it? Probably not. And if one of them is used all the time, then that’s obviously a massive thing to focus on… (C09P01)

Universities may support product development by EdTech companies. Some university participants reported that although universities are data controllers, as per data privacy legislation, they allow data processing not only for receiving the service but also for product development. This is especially the case if a university feels the product will deliver future value for them. We were given examples of companies developing particular AI algorithms that the university will use in future. A common view among university participants was that if a university wants to benefit from an analysis, it needs to accept data aggregation and data processing by external companies. Moreover, university participants spoke about conducting pilot projects with companies to test particular analytics and actionable intelligence, although this was more common with well-established and leading technology companies than with startups.

Data Externalities

In the second strategy, data externalities, EdTech startups search for new customers or audiences for their current or future data products. Business intelligence based on user data collected from university students and staff might be of interest to other organisations beyond universities. For example, a digital library platform collects data on reading trends and student click-through of digital material. Analytics on reading behaviour might be useful beyond universities, such as for publishers to manage their publishing and subscription processes. A publisher might be interested in the popularity of particular texts, comparisons of their texts to other publications, reading analytics based on student and staff traits, how reading behaviour is impacted by specific recommendations, and similar. Moreover, HE regulators might be interested in reading analytics for monitoring student engagement, while policymakers might use learning platform analytics for understanding skills needs, as described by this startup employee:

[T]he historical data trends in that attainments piece is really valuable. So, can you do some predictive analysis to say, well this sector is particularly underserved by a set of skills as a result of the graduate attributes that our UK sector is deploying? (C11P01)

In this way, the same user data collected by EdTech digital platforms can be processed for various outputs multiple times and in different combinations for different audiences. This might be useful and valuable to specific stakeholders. For example, publishers might use reading trends in commissioning authors or in marketing strategies, or HE policymakers might use such analytics to design national skills strategies. However, students and staff who produced the data used in these analytics have no control over what analytics and data products are developed, either at their universities or beyond, or over what effects data outputs will have on their individual lives and society more broadly. This is a common practice in the contemporary digital economy, where users have no say over data use and repurposing of data, but it is increasingly scrutinised. Calls are being made for democratic data governance based on understanding data as a common good and on the principle that everyone should have agency in relation to how their digital data is used, both more generally (Sadowski et al. 2021) and in HE specifically (Komljenovic et al. 2023a, b).

Controlling Data

EdTech startups aim to monetise digital data through consolidation and control; for example, by controlling digital platforms and their functions. They may collect user data (e.g. student clicks within a function in a platform) and data on users (e.g. data on courses taken by a particular student provided by a university) from multiple sources to provide a service, such as matching. Alternatively, they may aim to produce value via data aggregation and generating intelligence from big data (Pistor 2020). We identified two strategies used by startups to create economic value in HE from controlling data: (1) controlling data for interactions and (2) consolidating data via acquisitions.

Controlling Data for Interactions

The strategy of building interactions and controlling data is pursued by EdTech startups whose service is to match and connect various actors. Examples include career platforms connecting students, universities, and employers; and platforms connecting content providers, universities, and students. In the case of these three-sided platforms, some parties pay subscriptions while others use the service for free or pay for add-on services in a freemium model (Anderson 2009). For example, students might use the services of career platforms for free, while universities and/or employers pay a subscription.

Companies using this data monetisation strategy are especially dependent on network effects (Srnicek 2017). For example, a career platform is useful to employers only if they have access to a high number of students and graduates from different universities and locations, with different skills and experience, diverse traits, and so on. Similarly, such a career platform is useful to students only if a high number of different employers offer work opportunities via the platform. Thus, startups in this category need to establish a high number of both individual and organisational users. Data collected on users are made valuable by tightly controlling and processing them to organise interaction and communication via the platform. It is precisely the power to connect that makes data valuable.

Platforms of this kind collect user activity data generated through the platform itself and data on and from users from other sources. For example, career platforms might collect data on student learning paths from their universities and from students directly, such as their self-reported skills, interests, preferences, work experience, and so on. Data integration from different sources is needed to provide a more accurate and useful matching service. Our startup participants reported that they need to strike a balance between collecting enough data for a quality matching service and not collecting so much that it causes adverse effects, such as user fatigue in providing data in the case of individuals, or university scepticism in the case of institutions.

In order to accelerate user growth and support matching services, these companies keep adding features to the platforms that facilitate social interaction, such as students communicating with employers, or new employees supporting job seekers. In this way, a company is able to extend the lifetime of a digital user on the platform, as this participant explained:

As we head into the future, we would like to see people engaging on the platform more and on a more regular basis. We need to build reasons for them to actually do that. (C15P01)

Consolidating Data via Acquisitions

Consolidating data via acquisitions refers to situations in which particular companies are acquired for their user data. In other words, while companies might be acquired for varied reasons, in this case, the purpose is to gain access to user data that the company has accrued.

Our university participants outlined that EdTech companies are beginning to realise the power of data analytics, especially when integrating different services. This goes beyond individual companies processing data and offering analytical insights back to universities for fees, as described in the strategy for datafying products. Rather, companies are moving to consolidate analytics potential, as this university participant explained:

EdTech companies are beginning to really realise the power that data analytics will give them moving forward. I think the recent acquisition of [Company 1] by [Company 2] is a real insight into where that company’s going. And one of the first conversations that I had back with [Company 1], who we are the customer of, was, there’s no way I want our data going anywhere near [Company 2] because they didn’t buy it for the technical platform. They bought it to be able to get a better data insight and then spin that back to sell their other slightly failing business model. (G1P2)

Indeed, the terms and conditions of EdTech platforms typically state that user data are shared or transferred in mergers and acquisitions or during the acquisition talks. There are several key issues to highlight here. First, EdTech still seems to be in the experimental stage of monetising user data, including finding ways of making aggregated data valuable (Komljenovic et al. 2024a, b). Hence, strategies and practices of data consolidation and building ‘big data’ companies in EdTech are still in the making, and the impact of this experimentation remains to be seen (Williamson 2022). However, in other sectors, data consolidation often brings adverse social effects (Birch 2023; Mazzucato et al. 2023; O’Reilly et al. 2024), and it is therefore important to monitor this dynamic in HE and intervene when necessary. Second, the future of user data and data on users that is collected is always unknown. Although data is collected based on specific terms and conditions, and for specific purposes, acquisitions might bring entirely different conditions and rules (Viljoen 2021), including a change in purposes and values. Finally, and as already mentioned, individual users cannot impact the way their user data is used and transferred, which is governed by terms of use and other legal devices (Pistor 2019). Hence, acquisitions pose specific challenges in relation to temporalities, purposes, and power relations.

Data Products

In this final case, data products, EdTech companies offer algorithmic computation as a service, producing various scores and metrics as data outputs for a fee. We identified one such strategy: data products as a service.

Data Products as a Service

EdTech startups act as external providers to process and produce data outputs for universities. This would usually be done for more complex calculations, often using AI. An output could be a student engagement metric or a risk score, for example. One of our university participants spoke about feeding the user data into a company’s ‘black box’ with ‘algorithmic secret sauce’, which produces an end metric. In other words, the university sends all student user data to a data processing company that calculates a number for each student, such as a student engagement number, which is then used by the administration for making decisions on students and student support. However, how the number is calculated is not exactly clear to that participant, who was also not interested in the nuts and bolts of the calculation process. Similarly, another participant told us:

So the VLE [virtual learning environment] data that we’ve got is extracted out and then put into this tracking system, which has been bought in. But it’s very very expensive and has taken an awful lot of time for our web developers to get hold of the data, then input the data into what’s called a warehouse, and then that then fuels this tracking system. (G2P3)

In this way, various scores that processing companies calculate become data products for which universities pay fees. It is not clear if and to what extent university actors understand how such products came to be.

Startups outsource data processing, too, such as for analysing student traits based on user data. For example, one of the startup participants explained that to datafy their product, they outsource student data processing and pay fees for end data products that are then integrated into their platform. This contributes to the data outputs that the company is able to feed back to their customer universities. This complex dynamic illustrates the nested nature of various platforms connected in an ecosystem, allowing data flow between them (Komljenovic 2021; Van Dijck 2020). Student and staff user data can thus be processed by multiple companies for delivering data products. These data flows and data services are governed by contracts (Pistor 2020), which are not available to students, staff, and the public for scrutiny.

The strategies we have identified above are not exhaustive. However, they indicate that companies generally drive the development of data products in search of the economic value of data, and it is not necessarily universities that initiate the datafication that underpins EdTech products. The locus of data innovation is in the EdTech industry, driven by the pursuit of data monetisation, rather than by the use value for universities as the primary concern. Data monetisation and use value for universities are not necessarily mutually exclusive, but it seems more work needs to be done to ensure they are more systematically compatible. We do not discuss the impact of data monetisation or whether data products are educationally productive here. Instead, we highlight an economic logic of ongoing interplay between rolling out new data outputs and an imagined increased value of digital products. Moreover, the strategies above illuminate the belief in the value of data and the ongoing experimentation by EdTech companies to make data valuable in HE, which is a highly costly and laborious activity. It is also not without its struggles, to which we turn next.

Struggles in Monetising Digital Data

EdTech startups face diverse struggles in monetising the user data they collect from university staff and students. Some are specific to HE or the EdTech industry, while others are more generic and relate to the startup context. Notably, participants felt that monetising user data in EdTech is challenging. They stated that in education, companies cannot sell user data or apply targeted advertising as this is not deemed ethical or accepted by the sector; indeed, we have found no evidence of this activity. Moreover, education users are perceived to be more aware of digitalisation processes and sensitive to their development than in other sectors. Other risks are hard to predict, yet they might have big effects, such as academics publicly criticising a product or a company and causing reputational damage. Thus, some entrepreneurs felt it was harder to navigate education compared to other sectors. We identified five struggles for EdTech startups when making user data valuable.

Generic Struggles

In what we call generic struggles, we describe dynamics that are faced by all startups regardless of the sector. These struggles relate to labour and cost, high-level data processing, and convincing customers to pay.

Labour and Cost

For any data-based operations, data processing demands time and effort for data cleaning, organising, sorting, fixing data problems, etc., which is incredibly costly and resource intensive. Moreover, and as already mentioned, EdTech startups are experimenting with data products, especially with developing meaningful insights, as explained by this participant:

I would say that data is critical but data alone is not actually what you’re looking for. What you’re looking for is the analysis of that data and the ‘so what’ from it. (C13P01)

Analysing data and getting more substantive insights is again resource demanding. In addition, to become meaningful and useful, most data outputs and products that we examined need to merge user data produced by the platform with other data, which are either given by a university (e.g. year of study, study programme, or even more detailed information from student records) or are self-reported by a student (e.g. their skills, interests, and plans), as we described above. However, the strategy of data integration is also challenging and demands resources and investment. One participant from a career platform company stated:

It’s not just that [user data] but in combination with other things, it just provides a very significant data set. So, we’ll have many tens of millions of module results, we’ll have many millions of academic overall results, awards. We’ll have within that all sorts of measures about students’ engagement with co- and extra-curricular activities. We will supplement that with all sorts of other things, so in combination, you end up with a bunch of [metrics] that can be used to try and bring some efficiency to that marketplace of [university graduates]. (C11P02)

Developing machine learning and algorithmic processing is very costly. Many startup participants told us they do not have enough resources to support such development of sophisticated data processing until there are clear business use cases, which are accepted and endorsed by their investors. Establishing such business use cases involves working with universities, as they do not automatically value data products.

Building Large Datasets and Sophisticated Data Operations

At the time of research, most data operations conducted by EdTech startup companies remained at a rather simple level of descriptive analytical insight. If they provided recommendations for users, they were mostly based on the ‘like-like’ logic (Kucirkova 2022) or were rules-based (Campolo and Schwerzmann 2023). A few participants compared their user recommendation systems to Amazon’s:

So, understanding who’s getting value from the books and what books would be valuable to a given student specifically for them, will allow us to do a kind of Amazon style recommendations to read books or to read passages in books even that other students have found useful. (C03P02)

If EdTech startups want to provide more sophisticated data insights, they need to be active with a substantial number of customers for a long enough time to collect enough data for aggregation and data operations. Our participants stated it takes at least five years before they can start doing ‘anything meaningful’ with user data beyond simple feedback loops or user recommendations. For example, a digital library platform initially provides basic descriptive analytics to individuals (e.g. you spent x amount of time reading x number of pages) and perhaps to universities (e.g. x number of students opened x text for x amount of time). Only after they collect and organise enough data on reading trends from many universities can they analyse more substantial trends in student reading patterns for the sector and publishers more broadly. Such was the case with one of the companies that we studied, which was established more than a decade ago, and participants stated they could provide the sector and the policymakers with helpful data analyses:

I mean we live and breathe data, but we do a lot of stuff with the sector and supporting institutions specifically around understanding their data given the scale of the stuff that we’ve got. So, you know, that’s both in the [anonymised] space, how are students using them, are they using them, how are they engaging to their programme, how do we adapt this stuff to help you better engage students? Who would the recipients that these stuff are being shared to, and how do we use that data in meaningful ways? So we do do a lot of that stuff and correlating that data with the academic stuff at institutions specifically. (C11P01)

Investing in data products and developing more sophisticated data processing is not enough to guarantee success. As in other sectors, EdTech startups need to convince customers that their products are worth the fees or subscriptions.

Convincing Customers to Pay

Participants from startup companies reported that university customers do not necessarily recognise the value of data outputs offered by EdTech. Universities are perceived as sceptical and prefer to see proof of concept or the impact of particular data products before committing to procurement. Moreover, universities do not have significant resources to spare for technology, including data products. EdTech startups thus must accommodate the temporality and complexity of providing evidence to universities.

Our university participants spoke about EdTech companies taking university data produced by students and staff as platform users, processing and displaying it back to universities, but charging for this service. They felt this was an unfair model since universities helped build these digital products through their data and platform use (Komljenovic et al. 2024a, b). Participants observed that a common approach employed by startups is to create a digital product, use university data to develop it, and charge for access later. Indeed, deferring fees to the future is a common practice in the technology industry. Thus, how universities perceive the fairness of the EdTech business model is important for the legitimacy of the particular EdTech company and its products. Moreover, convincing customers to pay is connected to proving the use value of data processing and data outputs. The use value and impact of specific products are part of sector-specific struggles that EdTech startups need to address and to which we turn next.

Sector-specific Struggles

The sector-specific struggles that EdTech startups face relate to specific uses, logics, and sector imaginaries and practices.

Convincing Stakeholders of Use Value

Our university participants felt that EdTech companies often overpromise and exaggerate what their platforms and data operations can deliver. They felt that sometimes data outputs might not represent what was promised to be represented, or data outputs might be flawed. For example, one university participant expressed concerns that learning analytics may display metrics based on incorrect data. This might be as simple as an EdTech platform assuming that all courses have a set start and end time, which can lead to flawed course progress percentages for courses that run on a different timeframe. This example indicates a tension between the use of digital technologies to generate numerical calculations and the realities of a messy social life with many exceptions and dynamics that cannot be accommodated within the parameters and classifications employed by datafied technologies. Consequently, universities are cautious about EdTech’s promises of data value. Therefore, EdTech startups must understand and accommodate the complexity of the university environment, with many exceptions, differentiated needs, and required adjustments.
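The course progress example can be illustrated with a minimal sketch. It assumes, hypothetically, that a platform derives progress purely from elapsed calendar time and applies one default term window to every course; the function name, the dates, and the time-based formula are illustrative assumptions and do not describe any platform we studied.

```python
from datetime import date


def time_based_progress(today, start, end):
    """Share of a course's scheduled window that has elapsed, clamped to [0, 1]."""
    total_days = (end - start).days
    elapsed_days = (today - start).days
    return max(0.0, min(1.0, elapsed_days / total_days))


# Hypothetical platform default: every course is assumed to run over one term.
DEFAULT_START = date(2024, 9, 23)
DEFAULT_END = date(2024, 12, 13)

# Hypothetical course that in fact runs across the whole academic year.
actual_start = date(2024, 9, 23)
actual_end = date(2025, 6, 13)

today = date(2024, 12, 13)

# Progress as the platform would display it versus progress on the real timeline.
assumed = time_based_progress(today, DEFAULT_START, DEFAULT_END)
actual = time_based_progress(today, actual_start, actual_end)

print(f"Progress under the platform's assumed dates: {assumed:.0%}")
print(f"Progress under the course's real dates: {actual:.0%}")
```

Under these assumed dates, the dashboard would report the year-long course as complete at the end of the first term, although less than a third of its real timeline has elapsed.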

Moreover, university and some EdTech participants felt that basic descriptive analytics provided by most EdTech platforms could be useful, but do not allow for actual research on learning, which would be more substantively transformational for the sector. EdTech participants stated that a different kind of user monitoring is needed to conduct more profound research on learning. The learning environment would need to be more carefully controlled, and each element indexed and measured. Our participants felt this is not happening yet in HE:

If you could actually figure out how the pedagogy, and I hate using that word, works for a lot of different subjects, then I think that does become a valuable asset. If you can actually say, well people learn a lot better by doing X, Y, Z, rather than A, B C, as we’ve currently been doing, then that is definitely a valuable thing to know. (C09P01)

Different Underlying Logics

One of the challenges of data processing in EdTech is the different underlying logics of digital products. For example, we analysed a few digital library platforms, which found that students do not read 100% of assigned texts. One of the reading platforms assumed that this was a problem and that the aim was to get students to read all the assigned texts in their entirety. However, another reading platform assumed that it is normal practice for students to read only parts of textbooks and that they need many different sources to search various texts. Depending on their underlying logic, these two platforms had different intervention approaches built into them. The first platform designed recommendations to motivate students to read entire texts; the second platform designed recommendation interventions to encourage students to move across texts.

This example points to a challenge in relation to how a university or an individual student reconciles potential contradictions in the basic premises of the platforms they use. This is only one example, but it indicates how different products may intervene in teaching and learning differently. Having different premises in different products might be valuable for the sector, but they should be explicitly presented. Therefore, the underlying logic of EdTech platforms needs to be transparent, allowing for discussion not only by university vendor managers and learning technologists, but also by individual end users.

Investment Challenges

The final challenge we identify is EdTech startups convincing investors that HE is a worthy investment sector. While investment in the EdTech industry grew between 2012 and 2021, our startup participants reported that EdTech in HE confronts substantial challenges. First, they reported limited growth potential for startups due to a lack of revenue and available funds at universities. Building an ‘exceptional’ EdTech company, as one participant put it, demands substantial initial capital ranging from $50 to $100 million, primarily allocated for engineering, data sorting, product management, and commercial teams. In this case, achieving a return on investment (ROI) necessitates a challenging 70 to 90% gross profit margin, which is a threshold that is difficult to attain, particularly in HE, where only select universities can afford high-priced products. We were told this is unlike other industries where customers have more resources and a high ROI is easier to achieve.

Second, when venture capital invests in a startup company, rapid growth and scaling are expected, typically requiring a fivefold expansion in three years. Such growth is challenging in HE due to the sector's inherent complexities, such as slow decision-making and limited market feedback during product development. Furthermore, university procurement often acts as a lock-out for startups that do not have the resources to comply with procurement procedures and requirements (Komljenovic et al. 2024a, b).

Third, investors are reluctant to focus on products targeting universities due to perceived inefficiencies in university software procurement. Universities are not perceived to be sophisticated software consumers. Consequently, few EdTech companies that develop products exclusively for universities have scaled successfully. Our startup participants reported that many smaller companies offer niche products instead, which are more easily provided through existing procurement procedures. Some participants also noted that, in general, the quality of digital products is lower in EdTech than in other sectors.
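To give a sense of scale, the fivefold expansion in three years reported above can be converted into an annual rate. The calculation below is a rough illustration that assumes steady compound growth, which participants did not specify; $g$ denotes the implied annual growth rate.

```latex
(1 + g)^{3} = 5
\quad\Longrightarrow\quad
g = 5^{1/3} - 1 \approx 0.71
```

Under this assumption, a startup would need to grow by roughly 71% every year for three consecutive years.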

The struggles that we have identified in this article are not exhaustive. Taken together, they indicate the variety and scope of challenges EdTech startup companies need to overcome in monetising data. For user data to be made valuable in economic terms, they also need to be useful for university customers. This might be best achieved when startups and universities work together for mutual benefit, such as enabling user data flows back to universities, evaluating the impact of datafied products, understanding and supporting sector-specific needs and expectations, and working with university temporalities (see Komljenovic et al. 2024a, b).

Conclusions

In this article, we have elaborated on the strategies and struggles of EdTech startup companies in making digital user data economically valuable. Startup participants we interviewed shared a belief in the value of data; however, at the same time, they reported on challenges to delivering this value. The strategies startups used to monetise digital data included integrating various analytics and data outputs into diverse digital products, creating and promoting new analytics and insights for universities and other stakeholders and actors that might find such intelligence relevant, controlling user data and organising matching and connections, consolidating data and growing big data sets via company acquisitions, and developing data products as a service. Except for the last strategy, data processes used by startups that we examined were at a rather basic level, offering descriptive analytics or simple group comparison rather than more sophisticated analyses. Participants reported on the omnipresent logic of data value, but the results have not yet been delivered. Thus, there is a paradox between the belief in the value of data and the lack of realisation of this value, at least to the extent that participants would wish. Yet this belief drives the EdTech startup strategies and the industry dynamic. We highlight three broader implications of these findings for postdigital HE.

First, there is a mismatch between experimentation in creating data value and the resources needed for developing valuable data products. What we mean by this is that, on the one hand, startups experiment with datafying products, developing analytics, finding various audiences for available analytics and insights, data products, and so on. But on the other hand, they report they lack resources for developing high-quality and sophisticated data analyses and data intelligence. As discussed above, EdTech startups face significant challenges in raising enough investment to develop excellent and safe products in comparison to other sectors. Moreover, they face sectoral challenges such as the extended temporality of working with the sector as universities are tied to academic years and long cycles of decision-making, as well as processes related to designing data outputs, such as cycles of data collection, data cleaning and processing, then testing various datafied services, and implementation. This also puts startups in a position where it is hard to compete with EdTech incumbents and Big Tech companies that are already structurally present in HE with their digital products and infrastructure. This mismatch appears to present a significant obstacle for the EdTech industry and the HE sector in being able to work together to develop data processing that could have a more substantive and constructive impact on the sector and its practices. This is highly relevant precisely because HE practices are postdigital and the kind of data products that are developed (e.g. how relevant and sophisticated) and their monetisation strategies (e.g. if they are considered legitimate or exploitative) are consequential for the everyday lives of students and staff and their futures.

Second, the ever-new analytics and data outputs that are being developed in search of monetisation seem to be based on data insights that are possible rather than on insights that are relevant for the sector, desired, or needed (Komljenovic et al. 2024a, b). This does not mean that none of the data outputs that are created are useful and helpful, but that a high volume of possible data outputs might be overwhelming for students and staff as individuals and the sector overall. Moreover, the impact of such ever-new data outputs is not monitored, and the effects they have on teaching, learning, and managing institutions are not clear. Such a dynamic is understandable and perhaps expected in light of the political economy of data valuation (Birch 2023). But it is worrying from the perspective of critical data studies in education (Jarke and Breiter 2019; Williamson et al. 2020) and a postdigital lens (Jandrić and Knox 2022). We suggest the HE sector needs to organise a more open and transparent discussion on the kind of datafication it wants and needs to support ethical and socially just futures.

Finally, in the pursuit of data monetisation and university reactions to it, the lack of individual agency comes to the fore. While universities and EdTech companies comply with the UK Data Protection Act and student and staff personal data are protected accordingly, our research suggests that individual users are unaware of the data they leave behind when engaging with EdTech digital platforms. Moreover, they cannot opt out of using particular technology for study and work and, consequently, of the data operations using their data. Data products produce recommendations and other outputs that have an important impact on the study and work of university staff and students, yet they are not involved in discussing this impact. More transparency is needed, and students and staff should be more substantially included in data governance (Komljenovic et al. 2023a, b; Sadowski et al. 2021), as this affects how postdigital HE will develop in future.

The strategies, struggles, and discrepancies that we identified in this article are specific to our research context, which is based in the UK and focuses on HE and EdTech startups. However, our findings are more broadly relevant. More research is needed on datafication practices in education, especially in providing empirical evidence on the business models, imaginaries, and practices of EdTech companies in terms of how they monetise user data and what the social effects of such monetisation are. This is crucial in ensuring productive and socially just data products in the HE sector.