Ecological Teaching Evaluation vs the Datafication of Quality: Understanding Education with, and Around, Data

Current evaluation of higher education programmes is driven primarily by economic concerns, with a resulting imbalance towards the summative assessment of teaching and away from faculty development. These agendas are advanced through datafication, in which the transformation of social and material activity into digital data is producing a narrow, instrumental view of education. Taking a postdigital perspective on contemporary practices of evaluation outlined in higher education literature, we argue for an ecological view, in which evaluation must take account of those aspects of teaching, learning, and educational context, missing from digital data. We position quality as distributed across teacher, student, institution and context, arguing for the cross-fertilization of diverse kinds of data and non-datafied understandings, along with greater involvement of teachers and students in ways that enhance their agency, and develop their evaluative judgement of the quality of educational practices. We conclude that datafied practices can complement expert judgement when situated within a trusting, formative environment, and informed by an understanding of both pedagogy and technology, and clarity of educational purpose.


Introduction
Evaluation is crucial to understanding, not just the quality of teaching, but how different elements come together to produce education, and how this integration could be improved. The first criterion of meaningful evaluation of education is a clearly identified set of purposes, against which quality can be meaningfully judged (Biesta 2012). Although the purposes of higher education are manifold, economic pressures mean that government and institutional policies around evaluation are increasingly oriented towards workforce development and producing employable graduates (Greatbatch and Holland 2016; Walker and Tran 2017; Gourlay and Stevenson 2017). Yet, students may also have goals that are about long-term personal development, rather than employability. Postgraduate programmes, in particular, may be designed to help students to broaden their interests and perspectives, and to develop their critical thinking. Universities also have a responsibility to contribute to the production of new knowledge, and to equip students to take the lead in dealing with the considerable societal challenges before us (Naidoo and Williams 2015). Considering educational value, not only in terms of economic, but also cultural and social capital (Bourdieu 1986) allows a broader set of viable goals that align with long-term, and less easily measured, social needs (see, for example, Aitken et al. 2019).
Given these different, potentially conflicting aims, the quality of education is likely to be multifaceted, and will look different depending on which purposes are focused on. Moreover, quality, even within each individual facet, is contextual, rather than absolute. Employability depends on the economic climate, and current industry needs and cultures. Meaningful social reform occurs in the context of contemporary social structures, and may even be in opposition to employers' aspirations of the economic distribution of wealth. Personal development is subjective, idiosyncratic, and dependent on the student's capacity to develop. In addition, it is worth considering what, or who, is being evaluated. Is it the teacher, programme, or institution? Do the students contribute to the quality of their education (especially, but not exclusively, in 'self-directed' or 'student-centred' approaches)? How do we meaningfully understand, not only the contribution of each group or element, but also the quality of those contributions? Simplistic allocations of responsibility, as either belonging to students or teachers, for example, are likely to result in instrumental approaches to evaluation (Braskamp 2000), based on actions or characteristics in isolation from context. Instead, we argue for seeing teachers, not as providers of experience or knowledge, but as designers and practitioners who help students to engage with materials, peers, learning communities, and environments (Goodyear and Dimitriadis 2013; Goodyear and Carvalho 2019). From this view, agency is best seen as relational (Edwards et al. 2009), and control and responsibility cannot simply be held by the teacher or given to the student.
It follows that multifaceted, relational and distributed conceptions of the quality of educational programmes will require contextualized evaluation processes, clearly defined parameters and purposes, and a clear understanding of the limits of each set of results. Yet, increasing pressures towards commercialisation, transparency and efficiency have led to a popular view of higher education as a transaction (Gourlay and Stevenson 2017), where teachers and institutions are providers of educational products or services, and learners are potential and actual customers with predefinable wants and needs (Biesta 2005). Students and employers have been positioned as key stakeholders in evaluation, and perceived employability has become a dominant criterion in the competitive status of programmes and institutions (Tomlinson et al. 2018). As such, oversimplified measures of student satisfaction and outcomes have become the twin pillars of mainstream efforts to standardize evaluation, while a pervasive rhetoric of value for money across institutional, government and private sector discourse positions future earnings as the primary concern for students, and for Higher Education more generally (see, for example, the UK government's Department for Business Innovation and Skills 2016 Teaching Excellence Framework White Paper, 'Success as a Knowledge Economy: Teaching Excellence, Social Mobility, and Student Choice'). As we discuss in the following section, this backdrop has fuelled the datafication of 'teaching quality', in which digital traces of educational activity are increasingly generated, harvested, analysed, and mobilized in ways that foreground instrumental understandings, not only of evaluation, but of education itself.
In this paper, we take a postdigital perspective (Fawns 2019) to critique contemporary practices of evaluation outlined in higher education literature, then argue for an ecological view, in which evaluation must take account of those aspects of teaching, learning, and educational context, missing from digital data. As Fawns argues, the postdigital perspective helps us to see past the features and functions of new technologies, and the instrumental and hyperbolic discourse of technology in education. By framing both 'transparent' digital technologies, and opaque datafied processes, as entangled in the social, economic, cultural and political landscape, we can better resist deterministic language and rhetoric. Our conception of ecological evaluation draws on this critical perspective to position educational quality as distributed across teacher, student, institution, materials, environments, and other contextual factors, arguing for the cross-fertilization of diverse kinds of data and non-datafied understandings. This, we argue, requires greater involvement of teachers and students in ways that enhance their agency, and develop their evaluative judgement of the quality of educational practices. We conclude that datafied practices can complement expert judgement when situated within a trusting, formative environment, and informed by an understanding of both pedagogy and technology, and clarity of educational purpose.

The Datafication of Teaching Quality
Datafication involves the progressive transformation of social and material elements and activity into digital data, followed by the treatment of that data as equivalent to its original source. It is a pervasive phenomenon across society (Beer 2016), shaping our understandings of healthcare, the military, commerce, tourism, etc. As Beer explains in his book Metric Power (2016), the construction of a measurement is the first step in the transformation of something (elements or activity) into data, and while many currently used forms of the measurement of teaching quality were created before the widespread adoption of digital technology in education, it is clear that there is a rising tide of measurement and quantitative analysis of educational elements (Williamson 2016). Activities, outcomes and subjective judgements relating to teaching and learning are being digitized and harvested in the form of marks, student satisfaction ratings, workload surveys, attendance monitoring, sickness absence reporting, administration of student support systems, marking turnaround times, learning analytics, and more.
Of course, much non-datafied evaluation still takes place through informal processes, dialogue, observation, etc. However, the consumerist landscape makes education susceptible to data-driven practices (Knox et al. 2019), and the goals of neoliberal education further concentrate 'the "datafication" of academic productivity, engagement and outputs' (Ross and Macleod 2018: 235). The datafied understandings produced through economically-driven policies position education as, largely, a results-orientated, 'input-output system' (Olssen and Peters 2005: 324), where trust in educators, institutions, and even in the students themselves, is replaced by accountability (O'Neill 2002), and where that which does not show up within the data is at risk of being marginalized. As we will see, this does not, in any straightforward sense, produce a rational system of accountability and efficiency, but rather a culture of performativity, where educational activity is skewed towards generating favourable data.
The consequences of this go beyond pedagogy, and into issues of political economics and social justice, as can be seen through an examination of two dominant forms of measurement. While outcome measures (i.e. grades, retention, employment, salary) and student satisfaction surveys provide useful information as part of a holistic picture of the quality of education, they are dangerous when taken as isolated proxies for quality. Firstly, outcome measures focus on effectiveness, without questioning whether efforts are focused in a desirable direction (Biesta 2009). They neglect the value of exploration and critical thought, as well as the process of education, and how this fits with a student's or, indeed, society's, current needs (Naidoo and Williams 2015). They tend to be orientated towards short-term goals rather than long-term stability and the potential to adapt to shifting and unpredictable contexts. For example, programmes that focus on social justice or critical appraisal skills may not be well-suited to evaluation through employment outcomes, since those who have been empowered to critique the systems and settings of future employment may not see a high salary as a priority. This is not to say that outcome measures should be dismissed altogether. They are an important element of the broad picture of the quality of education and teaching (Berliner 2005; Fenstermacher and Richardson 2005). However, in order to understand the quality of education, we need to understand, not only what is learned, but how it is taught (Fenstermacher and Richardson 2005). In part, this is because the ways in which teachers teach are relevant to the ethics, aspirations, and theoretical elements of educational practice. Further, to support the development of teachers and teaching, evaluation should also have a developmental focus (Ory 2000).
Such nuances are inconvenient for the straightforward ranking of institutions, programmes and teachers, used to facilitate administrative and economic decisions (Altbach and Goodall 2006), and the considerable limitations and complexity behind competitive standings are not conveyed in the clean presentation of league tables (Tomlinson et al. 2018).
Student satisfaction metrics are also limited in their capacity to capture educational quality. The increasing influence of student ratings and satisfaction surveys leads advocates of the market model to expect the quality of educational 'service' to increase, because 'customers have control over expectations and evaluate services by their capacity to fulfil their demands' (Bunce et al. 2017: 1959). However, over-emphasis on single metrics such as satisfaction is dangerous, both because it can lead to risk-averse approaches to teaching, and because it can cover up areas that would benefit from attention. For example, the MSc Clinical Education, on which we teach, scored 100% for overall student satisfaction in the 2019 Postgraduate Taught Experience Survey (PTES). It is tempting to accept this as an indicator of overall success, rather than as just one of a wide range of relevant indicators. In our field of postgraduate education in healthcare, an important aim is to broaden the horizons and develop the agency of graduates within the workplace. For many of our students, this requires curiosity and courage in exposing themselves to 'different ways of thinking, alternative epistemologies, different cultures and the varied practices of student peers' (Aitken et al. 2019: 565). The PTES score does little to help us understand the extent to which satisfaction is derived from overcoming such challenges, or from meeting less demanding expectations.
We agree with Biesta (2005) that education involves risk, and satisfaction cannot be guaranteed or engineered. As Biesta (2005: 60) cautions, marketing-led depictions of learning as 'easy, attractive, exciting' are problematic because frustration, resistance, and challenge are key elements of education. Meaningful satisfaction, for us, comes with persistence and resolution, and students may not understand many of the benefits of their studying until well after the programme has ended (Aitken et al. 2019; Aitken 2020, this issue). Transactional framings obscure the possibility that the purpose of education is not to deliver any pre-conceived knowledge but to explore possibilities and perspectives, and that students coming to understand their own wants and needs might be an important part of the educational process.
Nonetheless, students have come to be defined as customers, and programmes are increasingly seen as consumer products, valued in terms of a 'student experience' that can be packaged and delivered (Hayes and Jandrić 2018). Efforts to improve satisfaction via a focus on the 'student experience' (Bates et al. 2019) have led to further measurement, mostly under the banner of 'student engagement', of various forms of interaction and participation, including attendance, conduct, completion of assigned tasks, communication, interest, enjoyment, sense of belonging, citizenship, institutional culture, and collaborative learning (Dismore et al. 2018). Although paying attention to these issues is very welcome, a focus on the measurement and optimisation of student experience can be seen as a method of control: in order to quality assure the educational product (and to increase the efficiency of the educational system), learning and experience are homogenized through standardized measures, and by regulating and systematizing each component and process (see Hayes 2017 on the McDonaldization of education). Examples of this process can be seen, not just in the formulaic learning outcomes and assessment criteria that are requirements for most higher education programmes, but in the timetabling, workload planning, teaching accreditation (see, for example, Warnes 2020, this issue, on HEA fellowships), feedback policies, virtual learning environment templates, and so on. The measurement of processes and outputs becomes part of the regulation, and the consumer orientation both drives and is driven by evaluative metrics and the regulatory processes they feed into.
Our concern is not that standardization is inherently bad, but that the rhetoric of transparency that accompanies widespread measurement and reporting on teaching activities removes trust and agency from educators. As data-driven technologies are taken up by institutions, the work of teachers and academics becomes 'increasingly subject to normative processes of algorithmic exposure and measurement' (Ross and Macleod 2018: 235). Datafied processes continue expanding to an ever-wider range of digital and physical activity (Knox et al. 2019), representing those activities in specific and narrow ways. That which is not captured or represented is marginalized, and alternative perspectives and possibilities for being creative are shut down, as these metrics come to represent things and spaces previously claimed to be intangible and immeasurable (Lanier 2011).
This expanding regime of measurement, described above, produces an ongoing, potentially summative and high-stakes, assessment of teaching by non-experts (i.e. administrators, data analysts or algorithms) (Stevenson 2017). For example, the pressure produced by comparing teachers against each other via their performance data, even though they may be teaching very different subjects or student groups, results in a shift away from a developmental focus, to one in which all behaviours (including exploration, experimentation, practice, and the making of mistakes) feed into procedural and objectified judgements of efficiency and effectiveness (Olssen and Peters 2005). Processes of datafication thus enable the enforcement of transparency and overwrite the need for trust (O'Neill 2002), shaping the conception and the governance of quality. Under such conditions, teachers are more likely to become risk-averse, reducing possibilities for creativity and innovation (Oravec 2019). In this way, datafication not only structures how learning and teaching are measured, but shapes the ways in which those concepts are understood (Knox et al. 2019). This way of seeing then creates a backdrop against which new kinds of evaluation are made possible, while other ideas are closed down.
A parallel can be seen in the way that learning analytics, touted as a promising avenue for understanding more about the learning process, shape conceptions of learning and its quality. Learning analytics capture ever more elements of students' engagement with the technologies of education, in order to develop categorized profiles, guide behaviour towards greater effectiveness, and identify 'at risk' students based on patterns of activity (Gašević et al. 2017). However, rather than empowering students or teachers, or strengthening relationships, this can be seen as a form of surveillance and data governance that encourages excessive summative judgement and risk-aversion, while marginalizing the teacher's role (Knox et al. 2019). The nebulous concept of student 'engagement' becomes portrayed as observable and measurable, emphasizing participation and interaction at the expense of other activities, such as listening and thinking (Gourlay 2015). The nudging of students towards certain patterns of behaviour that have been determined by algorithms to be efficient or effective, is part of a shift away from aspirations to develop autonomous, self-directed learners. As Knox et al. (2019: 5) put it, 'not only is data positioned before the desires of the learner as the authoritative source for educational action, but the role of learner itself is also recast as the product of consumerist analytic technologies.' In a similar way, focusing too heavily on teaching metrics, without sufficient emphasis on pedagogy, may de-professionalize teachers, positioning them as technicians, and encouraging strategic approaches such as grade inflation (Braga et al. 2014), or the recruitment of already high-performing and privileged students (Naidoo and Williams 2015).
At a broader level, an emphasis on overly-simplistic or misaligned summative judgements of teaching and teachers may orient institutions towards being seen to be better, rather than actually improving quality (Brown 2013; Gibbs 2017). As evaluation is increasingly standardized, both definitions of quality and conceptions of what is being evaluated become implicit and are shaped by processes of data generation and analysis. These implicit understandings have important consequences for what is valued and how institutions organize themselves and their staff and students.

Ecological Evaluation
In the previous section, we discussed how the excessive and instrumental measurement of education gives rise to a 'culture of performativity' (Biesta 2009: 35), in which 'targets and indicators of quality become mistaken for quality itself'. As a consequence, evaluation is largely oriented towards technical concerns, and away from values and pedagogically-informed purposes. In this section, we propose that non-datafied understandings are needed alongside diverse kinds of evaluation data, in order to generate more nuanced conceptions of education, underpinned by a clear, explicit set of purposes. In doing so, we acknowledge the challenge raised by the mutual shaping of pedagogies and data-driven practices; while data cannot neutrally capture education, neither can technological practices be simply discarded as secondary to pedagogy. Our guiding principle is that data should enhance, rather than obscure, our understanding of the relational nature of agency and responsibility.
For instance, we caution that giving too much weight to outcome measures, such as grades, retention, employment, or salary, risks marginalizing valuable, yet less visible, forms of student and teacher practices. Overuse of outcome measures suggests that the benefits of education can be pre-determined, and creates an unrealistic distribution of responsibility, where teachers, programmes or institutions are held accountable for the relational contributions of the students. As we have argued, it can be difficult or impossible for students to predict in advance what they need to learn (see also Aitken et al. 2019). Further, by ignoring the educational process, outcome measures are confounded by the student's effort, approach to learning and performance, strategies for success, and numerous other variables, such as socioeconomic and demographic factors, health, family support, etc. (Uttl et al. 2017).
Similarly, an obvious limitation of both teacher-centric and student-centric analytics is that each does not sufficiently account for the other (Sergis and Sampson 2017). Evaluation should consider not just what teachers do but also the relationship between teachers and students (Braskamp 2000; Biesta 2012). While satisfaction surveys might pick up elements of the teacher-student relationship, they tend to place responsibility for the relationship on the shoulders of individual teachers. Not only do students contribute significantly to the educational process, but teaching is often a collective activity, dispersed across faculty, external specialists, and, indeed, each student's peers. Making individual teachers overly accountable for this collective activity may encourage risk-averse behaviour and a reduction in collaboration (Oravec 2019).
Evaluation is highly complex because it requires not only judgement of teaching against parameters of purpose, pedagogy, approach, and rationale, but also judgement of those parameters themselves. This cannot be done purely through measurement and analysis; it calls for discussion and dialogue because there is no absolute, value-free position against which evaluation can be calibrated (Vo et al. 2018). As we have noted, evaluation is not only pedagogical, but also political and economic. Our position is that, irrespective of the educational approach, responsibility is distributed across teachers, students, and systems, and reductionist metrics cannot capture these fluid and dynamic relationships. For these reasons, we call for analytics that are part of a wider, ecological view of education, where relationships and holistic conceptions of practice are valued above individual variables (Goodyear and Carvalho 2019), and where it is acknowledged that metrics do not capture everything that is important.
As Goodyear and Carvalho (2019) argue, our ways of analysing and interpreting data should support ecological conceptions of education and inform complex judgements. They describe educational ecologies as relational, in which all elements (e.g. teachers, learners, technologies, policies, environments) are intertwined. From an ecological view, evaluation should not be reduced to the merits of individual elements, but should be based on a holistic analysis of the wider system. In a postdigital ecology, data is only one element, and is entangled in non-digital (physical, social, economic, political) activity. In the previous section, we have argued that, in much contemporary evaluation, digital data generates an oversimplified representation and, therefore, conception of education which, in turn, amplifies its position within educational ecologies. Our aim is to redress this balance, so that datafied and non-datafied understandings of educational quality shape each other in complementary ways that support the development of relational agency across students, teachers, and other stakeholders. For this, we argue for data and non-datafied information about relationships, practices (and where and how they diverge from policy), environments, and pedagogy. We call for analyses that are interpretative, holistic, complementary, ongoing, and formative.
Aitken's (2020) postdigital exploration of online postgraduate learning, in this issue, provides one example of the kind of analysis that can contribute to a wider, ecological view of evaluation. Aitken interviewed both teachers and students of online, postgraduate healthcare programmes, using these conversations to generate ways of understanding the educational process and outcomes, and the factors that underpinned these. Her analysis does not give the full picture, and no single method of evaluation can do this. However, it provides a valuable piece of the puzzle, which can be used to nuance our understanding of other pieces. Aitken acknowledges the role she plays as both researcher and Programme Director in shaping the analysis. She neither brackets herself off as external to a neutral process of evaluation, nor simply determines her results. Her evaluation is a synthesis of her conversations with students and teachers, and her own judgement as an experienced and expert practitioner in the field. Further, this particular analysis is understood in conjunction with other ways of understanding online education, including metrics such as student satisfaction scores, grades, etc. We argue that this balance of data, evidence, dialogue, and expertise is necessary for the cross-fertilization of data and non-datafied understandings, and appropriate to generating evaluations in which educational quality is distributed across teacher, student, institution and context.
Within an ecological view, rather than considering the performance of individuals in isolation, evaluation can also look at the distribution of resources, policies, systems and environments, and how these support and constrain educational practice (Goodyear and Carvalho 2019). Many practices are not adequately captured in data that rely on teachers' and students' activities conforming to an anticipated model, yet they are worth paying attention to, because they can help evaluators understand how teachers and students negotiate the systems and settings of their education (Fawns and O'Shea 2019). While we refute assumptions that education 'can be described in digital terms' (Fawns 2019: 139), we believe that digital traces of activity can complement the tacit forms of data collection and analysis that teachers intuitively employ (by observing activity and engaging with students) to support valuable conversations about practice.
Dialogue between teachers and students can make sense of how the formal curriculum intertwines with informal, extra-curricular activity in which 'sites of learning are constantly emergent' (Gourlay 2015: 402). Engaging in dialogue around practices can encourage the development of ways of working and learning (Brown and Duguid 2002; Fawns and O'Shea 2019), which is particularly important where students learn across different settings (such as in professional, postgraduate programmes). There is an additional benefit of both teachers and students developing their own practices outside of formal structures and expectations. In contrast to the top-down, automated 'personalisation' of many learning analytics implementations (which is really about encouraging users to conform to standardized expectations in relation to broad and blunt categories to which they are assigned based on their usage data), these ways of working evolve through complex and idiosyncratic relations with situated resources and constraints. These practices really are customized and personalized, as students work out what works for them, and take control of their own direction of development. It is worth taking account of idiosyncratic practices within evaluation (e.g. via observation and dialogue), because they can shed light on student and teacher preferences, as well as the limitations of formal structures and systems. Exploring and discussing actual practices, including subversions and workarounds, can reveal aspects of performance that would otherwise be invisible to evaluation, as well as areas where policy is not, or cannot be, implemented (Brown and Duguid 2002).
If education is not bounded in terms of technologies and structures, neither is it bounded in time. The diverse effects of educational programmes are complex and may not become clear until well after graduation, if ever (Aitken et al. 2019). Therefore, the timing of evaluation matters, yet it is very difficult to know when it is most appropriate to evaluate. Since ecologies have no clear beginning or end, it makes sense to see evaluation as 'not a single snapshot but rather a continuous view' (Ory 2000: 16), in which formative evaluation is emphasized over summative. Efforts can then be oriented to supporting the development of constituent elements (teachers, students, administrators, environments and systems), their practices, and the capacity of each to contribute to the holistic manifestation of educational programmes.
Methods of data generation and analysis have the potential to contribute to understandings of important elements of the educational process, and their relationships to each other. However, it is far from clear that ongoing, comprehensive measurement is good for students or teachers, or that it actually makes things more transparent. Care needs to be taken if datafied practices are to enhance teacher or student agency, without placing undue summative emphasis on behaviours that are part of an ongoing developmental process. For example, too much surveillance and summative judgement may reduce innovation and creativity in teaching design and practice (Gourlay and Stevenson 2017). This is a risk for newer teachers, in particular, who do not have an established reputation, may be less confident in their abilities, and may be on less secure contracts (Darwin 2017). Further, by interpreting performance according to historical patterns, teachers are judged against historical biases, neglecting the possibilities that might be seen through aspirational data (Biesta 2009). As Biesta argues, evaluation is not only about judging teaching that has already happened, but also about supporting the desirable development of teachers, and of teaching that will happen in the future. We believe that, wherever possible, the emphasis of evaluation should be on improving future quality, rather than providing accounts of past quality, and that the oversight function of institutions and administrators should be as much about support as it is about regulation and control. However, we recognize that judgements still need to be made about the quality of teaching. Prospective students need to be able to make informed decisions about institutions and programmes, and institutions and managers need to be able to identify when teachers require additional support or intervention. Our position is that these judgements should locate the teacher within a holistic, ecological view of education.
For us, the quality of a teacher's activity is determined by both how it fits with, and how it shapes, the educational ecology. Further, we argue that teachers should be a meaningful part of (but not in control of) the distributed system that passes judgement. Firstly, teachers are well-positioned to understand the context, purpose and parameters of the design and orchestration of their teaching. Secondly, while teachers are, hopefully, knowledgeable about education, the practice of contributing to evaluation provides opportunities to develop ideas about quality and its improvement, and to contribute to the pedagogical understanding of other stakeholders. Finally, by supporting teachers to contribute to the evaluation of their teaching, institutions can foster trusting relationships with their academic staff. This, then, might empower teachers to be innovative and creative, and to defend and explain their choices. As Ory (2000: 17) argued, evaluation is not 'a scientific endeavor, with absolute truth as its goal, but rather… a form of argument where the faculty use their data to make a case for their teaching.' There is, of course, a significant implicit burden on teachers here, and so we are arguing for both taking some of the responsibility for teaching quality away from teachers (and sharing it with students, institutions and context) and, simultaneously, giving teachers more responsibility for contributing to distributed understandings of educational quality. The kinds of expertise teachers need and the implications of this are discussed in the next section.

Supporting Ecological Evaluation
In the previous section, we argued for additional kinds of data and analysis to produce an ecological evaluation of education. In this section, we consider what is necessary to achieve this. Firstly, we suggest that making coherent, holistic sense of the inter-relation between elements (teachers, students, systems, environments, etc.) is important, before action is taken in response to metrics. Further, significant responsive actions should be ethically-informed collaborations between institution, teachers and students. This is likely to require: trusting relationships between stakeholders; the capacity for different stakeholders to meaningfully contribute; and appropriate and supportive conditions for producing honest and constructive formative and summative evaluations.
For us, trust is at the heart of both evaluation and teaching. Biesta (2005: 61) argues that, since the impact of learning can be unpredictable, 'education only begins when the learner is willing to take a risk.' Importantly, then, failure need not be an indicator of poor teaching (though it may be), and part of the teacher's responsibility is to help students persist and endure the frustrations of learning (Biesta 2012). Recognition of this aspect of the teaching role, in turn, requires trust between institution and teacher that some indicators of poor student performance may be part of a longer journey, or the result of productive experimentation or way-finding. The establishment and maintenance of trusting relationships, in which teachers and students are supported to meaningfully contribute to dialogue around evaluation, may be the greatest challenge to the ecological approach we are advocating. A consequence of increased datafication, in combination with neoliberalism, is the erosion of trust between students and teachers, and between teachers and institutions (Ross and Macleod 2018). Data-driven monitoring and summative judgement, without sufficient opportunities for dialogue, constrain teachers' agency to engage in these important aspects of teaching. As such, genuine collaboration within evaluation may require managers, administrators, and institutions to resist the rhetoric of accountability and transparency (O'Neill 2002), and reduce administrative and regulatory control. Only then can teachers and students be empowered to contribute to not only the process of evaluation, but the vision of what constitutes quality in education.
One possibility is for institutions to take a values-led approach, in which meaningful support for educational development is embedded throughout educational policy. The Near Future Teaching project at the University of Edinburgh (2019) provides a template for this. Engaging in dialogue (through workshops, interviews, and other events) with more than 400 stakeholders, including teaching and administrative staff, students, and the wider community, this project generated a values-led institutional vision for digital education. Alongside advocating staff development of ethical and critical understandings of the datafication of education, the project report states that: 'Learning should not be over-assessed and instrumentalised. Teaching should share a focus on employability and success with an understanding of the value of rich experience, creativity, curiosity and – sometimes – failure' (University of Edinburgh 2019: 14).
Whether the inclusion of these principles within the University's strategic vision results in more concrete policy and practice is yet to be seen, but the prominent, institution-backed endorsement does, at least, provide a basis for educators to resist inappropriate measurement and instrumentalisation. The report provides a clear set of values against which educational programmes can be evaluated, through a combination of data, dialogue and judgement.
However, high-level documents are insufficient for enabling meaningful resistance to any undesirable effects of datafication, or the safe and creative use of technology. Just as with any integration of technology into an educational ecology, top-down directives will need to be complemented by the practices of teachers and students (Cuban 2001; Enriquez 2009). Technology is becoming increasingly entangled within educational activity, and datafied monitoring cannot be avoided altogether (Williamson 2017). The development of agency within datafied educational structures is likely to require that teachers and students have some oversight, and understanding, of how they and their practices are monitored, and how this information is shared (Prinsloo and Slade 2016). Rather than just managing and monitoring their activity, systems of evaluation could be partially controlled by teachers and students, allowing them to determine what data is generated about their activity, what kinds of analysis are done, and how decisions are made on the basis of those analyses. Such analytics might better support exploration, self-assessment and self-determination (Slade and Prinsloo 2013) and, perhaps, narrow the power divides between students and teachers, students and institutions, and teachers and institutions. This would, once again, require open and honest conversations between stakeholders, since allowing students and teachers to opt out of, or limit, data sharing would significantly reduce the analytic power of data-driven evaluative systems. For us, this is a risk worth accepting, and perhaps the apparent reduction in analytic power simply makes clearer weaknesses that were always present: that such data are always incomplete, and that representations generated from that data are flawed approximations.
For us, teachers' judgement, along with their knowledge of the students, is an important part of the basis for the design of appropriate tasks and environments (Goodyear and Carvalho 2019). Further, through interacting with students, teachers should be well-positioned to make the value judgements that are critical to defining and understanding quality (Berliner 2005). However, quality is culturally and contextually sensitive, and this kind of nuanced judgement requires educational expertise and, presumably, a kind of advanced evaluative judgement or connoisseurship. Our concern is that, as systematized and standardized uses of data hand more power to administrators, accreditors and technologists, teachers' agency is increasingly constrained, as are opportunities for educators to practise and develop judgement. At the same time, opaque, third-party software and algorithms remove teachers' ability to see and question how data-driven decisions are made. Indeed, from a postdigital perspective, an important step in moving towards ecological evaluation is excavating hidden datafied practices for interrogation, alongside meaningful integration with non-datafied interpretations of quality. Datafied processes should form only part of how students and teachers understand educational practice. If data-driven processes are to be empowering, rather than oppressive, teachers need to be supported to understand their uses and, more importantly, their limitations, as well as how they can be incorporated into a wider, holistic understanding of educational practice.
To increase their trust in teacher judgements within evaluation, institutions can actively develop, recognize, and value teachers as educational and pedagogical experts. This is a multifaceted challenge, and will no doubt require mentoring, professional development initiatives, opportunities for practice and feedback, etc. Importantly, underpinning this is a need, at policy level, to recognize the labour, expertise and value of teachers. Not just data-driven processes, but also neoliberal policy discourse, often seems to remove the people (teachers, students, etc.) from how education is conceived (Hayes 2019). In her analysis of policy documents, Sarah Hayes found that the labour and, hence, the skill of teachers, is often obscured, with the work of education portrayed as done by structures, strategies, 'the institution', etc. While we have argued for an ecological view where the emphasis of evaluation is not primarily on teachers but situates them in relation to other aspects of the educational context, equally, we caution against seeing teachers simply as component parts of a mechanistic system. All elements, and the relationships between them, are important, and teachers play a crucial role in maintaining the health of the educational ecology. As such, we argue, they should be nurtured and valued. Thus, an important element of supporting the development of teachers' evaluative expertise will be engineering their reappearance in the language of education (Biesta 2012). By valuing teachers, rather than technologies, strategies and structures, the importance of developing faculty expertise becomes clearer.
At the same time, we do not advocate simply handing control of evaluation to teachers. The current imbalance in agency will be redressed, not by shifting control or responsibility from one party to another, but by distributing it appropriately across stakeholders, since agency is shared and negotiated across the social, material and digital activity that make up educational processes (Fenwick 2015). This is likely to require engaging a range of stakeholders (students, teachers, administrators and managers) in meaningful dialogue and decision-making. For students, this dialogue requires a voice that is more significant than that which is facilitated by the simple act of asking them to fill out student satisfaction surveys. We suggest that, through knowing more about students and their practices, it is possible to develop a more thoughtful approach to the use of student feedback on teaching. As Brown (2013: 422) puts it, 'students are novice consumers of what is recognised to be a particularly complex product.' While we caution against conflating students' perceptions with the quality of learning (Kirkwood and Price 2013), they still form part of a wider picture of how different elements make up the educational process, and this source of information can be enriched by helping students to make constructive and informed judgements and comments.
We argue that students can be empowered to more appropriately judge the quality of their education by developing their evaluative judgement, or the capacity for making decisions about the quality of their learning, and the work that they and others produce. This is thought to be important to a graduate's continuation of learning within the workplace. It is a challenging capacity to develop, and is, by no means, guaranteed. In line with Tai and colleagues, we contend that the quality of education can influence the student's capacity to judge quality. It follows that the capacity to make informed quality judgements may, itself, be an important indicator of quality. Thus, student satisfaction data are not equal or neutral, yet, in standardized processes, no differentiation is made in relation to the extent to which different students are equipped to judge their own education. We think this complex issue is most likely to be understood through dialogue with students, where a clear rationale for their comments and insights can be articulated and explored. Alongside understanding more about the judgements students make about their educational experience, such dialogue might produce richer evaluative data.
Managers and institutional leaders may be uneasy with the idea of relinquishing some oversight and control of teaching and learning activity. However, teachers develop autonomy, in part, by learning to judge when to follow guidelines and suggestions (by students, peers, administrators, or data-driven systems) and when to resist or argue against them. This judgement then enables teachers to help students to navigate educational systems, and to translate policy into meaningful, contextualized guidance (Biesta 2012). Similarly, teachers need support from peers, students, administrators, and managers, as they develop their ways of teaching. All of this requires the resistance of overtures towards efficiency, effectiveness, and seamless education, and this, in turn, requires institutional support, to allow teachers and students some agency to shape both educational practice, and the evaluation of that practice.

Conclusion
While the datafication of higher education presents new opportunities for monitoring, measuring and visualizing the practices and perceptions of teachers and students, all methods of data collection, analysis and interpretation should engage with sound, theoretically-informed principles and values, and a clear educational purpose. For us, teaching is not simply the actions of a teacher at a particular time and place, but an ongoing, adaptive and collaborative integration of design and practice (Fawns 2019). Evaluation, therefore, must take account of those aspects of teaching and learning that do not show up in digital data, as well as their relationship to those that do. We have positioned teaching quality, not as something that can be entirely controlled by the teacher, or distilled into simple metrics that apply across cohorts, courses, curricula or institutions. Taking an ecological view, in which quality is relational, distributed across teacher, student, institution and context, allows for a richer understanding of each educational element (e.g. design, teaching, assessment, student practices, and environment), and of how each contributes to the others, and to the whole.
The forms of standardized evaluation that are increasingly common within higher education do not recognize some crucial elements. In the drive for simplified measurement, many of our current evaluation models depend upon measures, not of teaching quality itself, but of proxies (e.g. student satisfaction, assessment outcomes, retention, employment, salary). The resulting, implicit definition of teaching is very narrow, and fits more easily with some (e.g. more teacher-centred) approaches than others (e.g. more student-centred approaches). In contrast to the packageable, quality-assured, homogenized learning that is implicit in commercialized and instrumentalised discourses, much important pedagogical literature has emphasized social, cultural, embodied and contextualized education (e.g. Vygotsky 1978; Lave and Wenger 1999; Fenwick et al. 2011; Wegerif 2018), that is not clearly bounded in space or time. In this paper, we have argued that we are in danger of allowing the evaluation of the former to overwhelm the benefits of the latter.
Alongside dominant student satisfaction and outcome measures, an increasing array of metrics transforms elements, attributes, and practices into data which are then operated on, often through opaque forms of analysis. While evaluation needs to support informed decision-making about institutions and programmes, it should also take into account relationships and contexts, and meaningfully support the development of teachers and pedagogically-informed educational practice. Our current measures should be considered alongside other forms of evidence that do not show up in digital data, and teachers should be encouraged to have some agency in developing and using expert judgement about how these different forms of evidence can be combined to produce holistic, ecological understandings of programmes. Our current, standardized questions cannot deal either with the diverse pedagogical possibilities of specific examples of teaching, or with the distributed nature of agency and responsibility.
We have argued that the commodification of education has led to an overemphasis on flawed evaluation metrics, and it is, arguably, the neoliberal agendas of transparency, accountability, efficiency and effectiveness that drive many of the flawed analyses of the resulting data. Yet, we propose that a crucial purpose of evaluation is to enhance the agency of teachers so that they can effectively and adaptively respond to dynamic, unpredictable situations and, thereby, help students to do the same. Evaluation should, therefore, not just monitor but also support faculty development, and teachers and quality teaching need to be meaningfully valued in the policies and practices of institutions. To inform policy and practice that aligns with the pedagogical values espoused in educational literature, we need additional forms of data and analysis, along with non-datafied understandings, while involving diverse kinds of perspectives and expertise.
As such, there is a need for empirical research into how evaluation can appropriately make use of the potential of datafied methods to support and complement nuanced judgements, not just about teaching, but about complex ecologies of education. For us, this is likely to require a postdigital perspective in which data is understood as entangled in physical, social and political activity. We also call for institutional leaders to take seriously the need for teachers to develop evaluative judgement about their educational practice, and to embed data literacy within that capacity so that they are able to bring about a complementary combination of data analysis and subjective, expert judgement that is underpinned by strong pedagogical theory and evidence. If, in combination with the development of this literacy, teachers and students are given meaningful agency within the processes of evaluation, this should lead both to the collection of more meaningful data, and to more pedagogically-sound interpretations of educational practice.