Not-for-Profit Performance Reporting: A Reflection on Methods, Results and Implications for Practice and Regulation

This paper presents a critical analysis of present approaches to studying not-for-profit performance reporting, and implications of research in this area. Focusing on three approaches: content analysis of publicly available performance reporting; quantitative analysis of financial data; and (rarer) mixed/other methods, we consider the impact of these on our knowledge of not-for-profit performance reporting, highlighting gaps and suggesting further research questions and methods. Our analysis demonstrates the important role of regulation in determining the research data available, and the impact of this on research methods. We inter-connect the methods, results and prevailing view of performance reporting in different jurisdictions and argue that this reporting has the potential to influence both charity practices and regulators’ actions. We call for further research in this interesting area. Contribution is made to the methodological literature on not-for-profits, and ongoing international conversations on regulating not-for-profit reporting.


Introduction
Internationally, regulators, funders and (to some extent) the public are demanding more information from not-for-profits, often including information on their performance.
Performance reporting in a not-for-profit context often goes beyond financial matters to consider performance against charitable mission or targets. This type of performance is critically important to not-for-profit stakeholders, including donors/funders, beneficiaries, not-for-profit staff/volunteers and even regulators.
Increasingly, performance is explained in terms of impact, outcomes or results: the impact of not-for-profit activities on individuals (such as improved wellbeing), or society as a whole (such as improved population health/employment). Sometimes progress towards such impact is explained using 'intervention logic', 'logical frameworks' (log frames) (Cordery & Sinclair, 2013) or 'theory of change' approaches (Weiss, 1995). These methods sketch progress from the inputs (including cash and other resources such as volunteer time), to outputs, the immediate goods and services delivered (such as hours of support provided), and then to impact. In such approaches, effectiveness refers to progress towards mission, and efficiency to the relationship between inputs and outputs/outcomes/impact. For many years, alternative performance measures have also included the calculation of cost ratios as proxies for efficiency, expressing costs (e.g. charitable activity, fundraising or administration costs) as a proportion of total costs or revenues (Tinkelman, 1998). Most recently, efforts have been made to capture and compare performance using measures of social value such as social return on investment (SROI) and contingent valuation (Hall, 2014). This paper focuses on research into publicly available performance reporting, setting aside for now other exciting bodies of the literature, particularly in this journal, on how performance reporting is used internally within not-for-profits (see, for example, Hall, 2014 and Lall, 2017) or in direct reporting to research providers (see, for example, Greiling & Stötzer, 2015). In reviewing recent research, we identify three main approaches in this area: content analysis of publicly available performance reporting; quantitative analysis of publicly available financial data as a proxy for performance; and mixed-/other method studies on the relationship between publicly available reporting and stakeholder actions.
Our objective here is to critically analyse present approaches to studying not-for-profit performance reporting, and implications of research in this area. It is not our intention to criticise the papers cited: working in this field ourselves, we appreciate its challenges, the limits of what can be answered in any single paper, and the important contributions to overall understanding of the field that the cited authors have made.
We ask whether the application of certain methods has affected our knowledge of not-for-profit performance reporting: what we know, and do not yet know. Have methods used moved not-for-profit practice or regulation in specific ways? To this end, we undertook a literature review of papers in both specific not-for-profit journals (notably Voluntas and Nonprofit and Voluntary Sector Quarterly) and in accounting journals which publish not-for-profit research (including Financial Accountability and Management and Accounting, Auditing and Accountability Journal) as these consider accountability and regulation. We utilised relatively recent papers (published from 2010 onwards) to ensure views were current, meeting the theme of this Special Issue. Given length constraints, we do not offer a systematic literature review here, but have drawn on a broader review of not-for-profit literature we have recently submitted to another journal, and provide example papers rather than a comprehensive listing of all relevant papers.
This paper analyses the three approaches using example papers, identifying common themes, what these studies have added to our knowledge, alongside implications of this work for not-for-profit practice and regulation. The paper concludes with a reflection on approaches to research in this challenging, but important area, and a call for further research. Regulation plays an important role in shaping the data that are available to researchers, and this in turn shapes the methods used to explore not-for-profit performance and the ensuing results. We argue that these contribute to the prevailing view of performance reporting in each jurisdiction, potentially influencing both charity practices and regulators' actions. Accordingly, we contribute to the methodological literature on not-for-profits, and broader regulatory literature.

Approaches to Researching Performance Reporting

Content Analysis of Publicly Available Performance Reporting
Considering first content analysis of publicly available performance reporting, it is important to understand that, in most jurisdictions, performance reporting is either voluntary, or if mandatory, there is flexibility on how to report (McConville & Cordery, 2018). This regulatory backdrop has two critical implications for research. Firstly, such research often involves fairly laborious data collection from a wide range of sources, in formats that lack easy comparison. Secondly, countries with unregulated performance reporting lack an objective yardstick of requirements for measuring 'compliance' or 'best practice'. Accordingly, content analysis in this area often involves researchers developing frameworks or checklists, collection of data at relatively small scale (see later) and manual reading and comparison of this data against their framework. Typically, such content analysis leads researchers to focus on what is reported, with a range of theoretical frames exploring why this is the case.
Some relatively recent examples (which build on a body of work by these authors) include Dhanani and Connolly's (2012) examination of reporting by 75 of the 104 largest UK charities in their Annual Reports and Annual Review. Analysing these against a developed framework and previous studies (including their own), they identified increases in the quantity of performance reporting. However, their theoretical analysis led them to conclude that charities' reporting was more supportive of positive than ethical stakeholder theory; that is, charities sought to achieve legitimacy through positive messages about their actions. Hyndman and McConville (2016, 2018a) explored the top 100 UK charities' performance reporting for the years 2010-2011, also using Annual Reports, Annual Reviews and adding websites. They developed a framework based on previous studies and sector reports, which included performance measures plus information that might help the user to understand and use such information (e.g. past year comparisons, explanations, links to other information). The study found that despite charities reporting more performance information than previously identified, the absence of explanations, comparatives and information meant that transparency levels remained low, indicating legitimacy-seeking rather than ethical voluntary reporting.
Some recent studies are cross-jurisdictional: McConville and Cordery (2018) applied a version of Hyndman and McConville's (2016, 2018a) framework in an exploratory analysis, using four case-study jurisdictions that represented different approaches to regulating performance reporting. They indicated more performance reporting, and more transparency in the UK, with its mandatory (but flexible) requirement for performance reporting. This was in contrast to the USA, Australia and NZ which at that point had no regulatory requirement to report. Connolly et al. (2018), also using a checklist but with a larger sample, explored the interesting case of UK versus Irish charities, both having the same performance reporting requirements (in the Charities SORP) but different regulators. Like McConville and Cordery, they found better practice in the UK, with its longer-established regulators and reporting framework.
Rocha Valencia et al.'s (2015) unusual paper in this context sought to explore the relationship between performance reporting and underlying performance. They investigated 62 Spanish non-governmental development organisations' 2010 annual reports and websites, using a checklist of indicators of transparency from previous studies and calculating a ratio of expenditure on projects against total organisational income. Identifying that some forms of performance reporting were linked to greater efficiency on the specified measure, they discussed but did not conclude as to whether transparency's positive effects outweighed resources consumed, or whether organisations that are more efficient are more likely to be transparent.
In reviewing relevant papers, we note the predominance of UK-based content analysis studies, and the engagement of a number of key researchers. The approaches taken contribute to our understanding of what is reported, often indicating increasing quantities of performance reporting over time, but also flagging 'poor' reporting practices, indicating legitimation rather than transparency. Recent studies comparing jurisdictions flag the importance of the regulatory environment on what is reported. While all of these studies either implicitly or explicitly connect performance reporting and underlying performance, only Rocha Valencia et al. (2015) explore and confirm this link.
Published studies using content analysis of publicly available performance reporting give a very broad view of performance (by comparison with other approaches) and, within the checklists created, have a focus on best practices in performance reporting. They are useful in providing a clear benchmark of what is reported (often in a single jurisdiction), with indications of why. However, issues arise. Individual studies acknowledge the subjectivity involved in developing frameworks and content analysis and take steps to mitigate this. But we ask whether, in creating these checklists/frameworks, we are holding charities to an impossible standard? Reporting on performance is widely acknowledged to be difficult (Cordery & Sinclair, 2013; Hall, 2014). Is it perhaps unfair to compare these organisations' reporting to idealised checklists (of which they have no sight) and to argue that they are not transparent when their reporting does not match?
Further, these checklists and frameworks are often developed from past studies, best-practice recommendations and theoretical frameworks rather than from studies of what stakeholders need or use (in contrast, for example, to early work by Hyndman (1990) who surveyed user needs and then analysed reporting against these). This suggests a potential gap between developed checklists and what stakeholders find useful. Pragmatically, a not-for-profit report containing all of the idealised measures included in checklists could become so long that stakeholders would not engage with it, nor see this as a good use of donated funds, given the cost and difficulty of such reporting (Greiling & Stötzer, 2015).
Finally, if this research contributes to an environment where performance reporting is perceived to be poor, regulators may act to increase regulation: and the UK-focussed studies discussed provide evidence of this. The studies above indicate that increased regulation has a positive impact on performance reporting quantity and quality. However, if regulation is not carefully written and enacted, it can lead to a range of dysfunctional outcomes, including boiler-plate reporting that lacks transparency, and significant compliance costs (with which, as indicated above, stakeholders might take issue).

Quantitative Analysis of Publicly Available Financial Data as a Proxy for Performance
Another approach to exploring not-for-profit performance is quantitative analysis of publicly available financial data, often from regulatory sources, to compute a range of ratios that serve as proxies for efficiency and performance. Typically, these include fundraising expenditure: total funds raised (fundraising ratio), programme or charitable expenditure: total expenditure (programme/charitable activity ratio) and/or overhead or administration expenditure: total expenditure (overhead/administration ratio). Using these ratios reflects the argument that donors are interested in their donations being spent 'efficiently' on the cause and that performance in this regard can be objectively compared (Cordery & Sinclair, 2013).
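The three ratios described above can be illustrated with a minimal sketch. The record type, field names and figures below are hypothetical and chosen only for illustration; they are not taken from Form 990 or from any study cited here, and individual studies may define numerators and denominators differently.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CharityFinancials:
    """Hypothetical record of figures drawn from a regulatory filing."""
    funds_raised: float
    fundraising_expenditure: float
    programme_expenditure: float
    overhead_expenditure: float
    total_expenditure: float


def cost_ratios(f: CharityFinancials) -> Dict[str, float]:
    """Compute the three cost ratios as defined in the paragraph above."""
    if f.funds_raised <= 0 or f.total_expenditure <= 0:
        raise ValueError("funds raised and total expenditure must be positive")
    return {
        # fundraising expenditure : total funds raised
        "fundraising_ratio": f.fundraising_expenditure / f.funds_raised,
        # programme (charitable) expenditure : total expenditure
        "programme_ratio": f.programme_expenditure / f.total_expenditure,
        # overhead (administration) expenditure : total expenditure
        "overhead_ratio": f.overhead_expenditure / f.total_expenditure,
    }


# Illustrative figures only (an assumption, not data from any cited study).
example = CharityFinancials(
    funds_raised=1_000_000,
    fundraising_expenditure=150_000,
    programme_expenditure=800_000,
    overhead_expenditure=50_000,
    total_expenditure=1_000_000,
)
ratios = cost_ratios(example)
for name, value in ratios.items():
    print(f"{name}: {value:.1%}")
```

Note that all three ratios are built from cost and income figures alone; nothing in the calculation reflects outputs or impact.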
Commonly, these studies employ large data sets sourced from regulators, with ratios calculated and regression analyses identifying variables that explain performance differences: are larger or smaller not-for-profits more efficient? For example, van der Heijden (2013) used data from 1196 registered fundraising charities from the Dutch Central Bureau for Fundraising for 2005-2009, finding that smaller charities had lower fundraising ratios than larger charities and could be considered more efficient. Conversely, Ecer et al.'s (2017) much larger US study of 97,040 Form 990 returns for 2003 found larger charities more efficient on the same measure. They noted a relationship between fundraising source and efficiency, as did Lu and Zhao (2019), conducting content analysis of 704 organisations' financial statements filed with the US Agency for International Development (US AID) (1995-2014). They identified a curvilinear relationship: voluntary organisations' 'operating efficiency' (as indicated by low administrative expenses) could occur with low or high levels of government funding.
Other studies have explored the relationship between performance measured by ratios and specific actions by not-for-profits. For example, McAllister and Allen's (2017) quantitative analysis of 188 US private foundations' Form 990s over five years (2001-2005) identified that founders' (and their related family members') board involvement improved performance, as measured by these ratios. Jaskyte (2020) explored the relationship between performance and innovation in 103 US human service charities. Using an administrative spend ratio alongside other Form 990 financial information, she observed the administration spend ratio having the strongest (negative) relationship with organisational innovation, i.e. the more innovation, the lower the administrative expenses as a proportion of total expenses.
Critically, ratio analysis often extends beyond academic research, to third-party rating agencies such as the US Better Business Bureau (BBB) and Charity Navigator. Eckerd (2015) investigated the impact of third parties using a comparative study of 290 voluntary organisations and found that the (53) not-for-profits assessed by BBB had slightly lower administrative costs and slightly higher programme expenditure (6-7%), with no fundraising ratio differences. He raised a concern about this focus on financial ratios as proxies for performance and called for a more multi-faceted approach to evaluation.
Like Eckerd (2015), Lecy and Searing (2015) argued that a narrow focus on costs has led to unrealistic donor/funder expectations, with not-for-profits responding by minimising fundraising or administrative costs and potentially reducing their long-term effectiveness, creating a 'nonprofit starvation cycle'. Supporting this, their quantitative analysis of US Form 990 returns showed that reported overhead reduced between 1985 and 2007 from 20.9% to 18.3%. Moreover, this is not solely a US-based issue, with Schubert and Boenigk (2019) providing evidence of declining overhead and fundraising expenses in German charities, especially among those fundraising from the general public. Their analysis used 2062 financial statements of fundraising charities (2006-2015), sourced from the independent German Central Institute for Social Issues (DZI).
Considering these studies, we note the predominance of US-based studies, often utilising Form 990 data. These approaches have increased our understanding of how various characteristics and actions affect (narrowly defined) performance, but have also flagged concerns about the impact of this focus on costs on not-for-profits. These studies using regulatory data have the advantage of very large scale and provide insights into jurisdictions where data on outputs, impact, efficiency and effectiveness are particularly sparse (such as the USA). Moreover, they answer some questions that the public (generally) and donors (specifically) have about not-for-profits' performance, namely whether funds are going to the cause, and/or whether funds are spent 'properly'. However, the findings of the studies highlighted here indicate that apparent comparability of these ratios between charities is a mirage: if ratios are impacted by size, funding source, governance arrangements and innovation (as in these studies), then comparing charities' performance solely on these ratios is deeply problematic.
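To give a sense of scale for the decline in reported overhead that Lecy and Searing (2015) describe, the two figures quoted above can be read as either an absolute or a relative change. The worked arithmetic below uses only those two percentages; the symbols are introduced here purely for illustration.

```latex
\[
\Delta_{\text{abs}} = 20.9\% - 18.3\% = 2.6 \text{ percentage points},
\qquad
\Delta_{\text{rel}} = \frac{20.9 - 18.3}{20.9} \approx 12.4\%
\]
```

In other words, a fall of 2.6 percentage points corresponds to reported overhead shrinking by roughly one eighth of its 1985 level.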
Other research has identified misallocation and misreporting of the costs that underlie these ratios. Connolly et al. (2013) analysed UK charities' financial statements to identify that charities used a change in reporting requirements to make the proportion of expenditure that was charitable appear much greater, and the proportion that was administrative much smaller, between one year and the next. A substantial body of research internationally has indicated that not-for-profits avoid reporting fundraising costs or allocate these inappropriately to charitable activities (Tinkelman, 1998). These practices could undermine trust in not-for-profits and lead us to question the reliability of ratios calculated and comparisons made.
Finally, these ratios are purely input-based measures, taking no account of what the charity achieves in terms of outputs or impact, matters which are of increasing importance to stakeholders. An argument can be made that these studies are perpetuating a focus on costs rather than broader performance, and indeed perceptions that there are 'good' and 'bad' costs, leading to real-world implications for not-for-profits, under pressure to report costs at levels they believe will be acceptable to donors, funders or influential monitors (Eckerd, 2015; Lecy & Searing, 2015). These include reducing spend on 'bad' costs and in so doing reducing the capacity of the organisation to fundraise or operate (see also Tinkelman & Donabedian, 2007).

Mixed-/other Method Studies on the Relationship between Publicly Available Reporting and Stakeholder Actions
This category includes substantially fewer studies, with disparate approaches. Much of this research explores the relationship between reported performance and funding decisions. Carlson et al. (2010) combined analysis of reporting with interviews and document analysis to consider the relationship between government funding and performance reporting of not-for-profits delivering 46 US children and families programs. They reviewed publicly available and private data including contracts, annual report data, correspondence and meetings. Despite a financial downturn, they found that organisations performing well and reporting common performance measures received new contracts; by contrast, organisations reporting unusual measures lost funding despite good performance, pointing to issues in the use of such data.
Experiments have also been used to explore the relationship between reporting and donations. Becker (2018) constructed a conceptual framework of different accountability forms including performance information, testing this using an online experiment. She identified that externally certified, voluntarily provided accountability data improved reputation and donor perception, but that this was not related to donation behaviour. A paucity of accountability data was associated with lower public trust, reputation, perceived quality and donation behaviour. McDowell et al. (2013) also conducted an Internet-based experiment on the intersection between donation decisions and financial vs. non-financial data. They found that potential donors were more likely to acquire non-financial information (including on goals, outcomes, programs and mission) than financial information, and that actual donations were significantly correlated with non-financial information. By contrast, acquisition of financial information did not influence the donation decision.
Away from funding decisions, Parsons et al. (2017) explored not-for-profit managers' behaviour by utilising a combination of content analysis (of Form 990 returns) and completed surveys from 115 voluntary organisations' executives. They confirm Eckerd's (2015) suggestion that not-for-profit managers perceive pressure to minimise fundraising and administrative-type costs, but suggest that ratio manipulation reflected managerial character rather than resource dependence. Interviews have also been used to explore not-for-profit managers' broader reporting and accountability decisions. Examples include Hyndman and McConville (2018b), who used interviews with UK charity managers to investigate accountability understandings, including performance accountability, and stakeholder accountability through public and private mechanisms. Yang and Northcott (2018) interviewed staff and managers in New Zealand charities to explore their practices and motivations specific to reporting on outcomes. They identified challenges and potential mission drift arising from difficulties in balancing upward and identity accountability.
As shown here, these approaches can provide important insights into the effect of reporting on stakeholders, and how information is used, addressing gaps in knowledge unfilled by the previous approaches. However, these studies are presently few in number and are generally at relatively small scale. Moreover, most focus on the narrowest measures of performance (cost ratios), and on donors/funders, i.e. on the implications of reporting on the funding decision. These studies highlight the potential for mixed-method/other approaches to add to our knowledge on performance reporting. Specifically, such approaches can provide greater nuance in understanding the relationship between reporting and the donation/funding decision, a critical question for non-profit managers making decisions about reporting, and for regulators designing reporting requirements. These studies also improve our understanding of non-profit managers' behaviours, motivations and challenges, with the potential to improve reporting and regulation.

Conclusions on Methodological Implications and Considerations
The aim of the paper was to analyse critically the approaches taken in present research on publicly available not-for-profit performance reporting. In particular, we sought to assess whether the application of certain methods has impacted our knowledge of not-for-profit performance reporting. We have described three different approaches to not-for-profit performance reporting with examples. In assessing these, we see a clear link between research method and what we know and do not know about performance reporting in various jurisdictions. We have highlighted a number of implications of these approaches for not-for-profit practice in various jurisdictions and potentially on regulation. Based on these findings, we suggest a number of areas and approaches for further study.
An overarching observation is that we see notable differences in the questions posed, methods used and results found between different jurisdictions. One possible explanation of this is that in these various jurisdictions, regulation mandates the available data, so academics interested in performance work with the data they have and apply methods that answer specific, sometimes narrow questions. Results reinforce the idea that not-for-profit performance can and should be measured in particular ways, leading the public, donors and influential monitors (possibly even regulators) to demand more of the same (in the case of financial ratios) or to call for improvements (in the case of qualitative performance reporting). Critically, the results of these studies have wider implications on not-for-profits, for example, in the starvation cycle and flagging of 'poor' reporting by not-for-profits discussed above, and these results may be influencing regulation in ways that perpetuate these issues.
There is potential for us to learn from each other through cross-jurisdictional studies. Differences in regulation and availability of data between jurisdictions can make comparisons difficult (see McConville & Cordery, 2018). However, these regulatory differences are also worth exploring-what impact do these differences have on the information that is publicly available, on what can be studied, and crucially on charity performance and stakeholders' actions? A variety of methods may be appropriate here, and indeed mixed methods that allow understanding of what is reported and why.
Moreover, with a few notable exceptions we lack evidence on the effect of performance reporting on various stakeholders. We have some evidence of impacts on donor/funder perceptions, but more, international evidence would be helpful for those reporting to these important stakeholders, and to regulators. There is a conspicuous absence of research into the effect of performance reporting on the public, despite assumptions by academics and indeed regulators that public trust and confidence can be improved by such reporting. The effect of reporting on other stakeholder groups, notably beneficiaries, is also conspicuously absent and the subject of some recent concern (Dhanani, 2019). There is scope for innovative methods, including online research, to approach harder to reach groups, at scale. Experiments and surveys may be useful in understanding public/donor responses at scale, while interviews, surveys and other methods can explore funder/beneficiary responses, even including case study and ethnographic approaches in long-established relationships. We acknowledge that reaching some stakeholders can be challenging, costly, and that interpretation and use of such data can be problematic-see, for example, recent controversy in the UK about an established survey of public trust and confidence (Purkis, 2021).
We argue that not-for-profit performance reporting research is still relatively under-explored, and many of the studies discussed have necessarily focussed on what is being reported, engaging much less with the question of why aspects are reported (or not). A few notable examples identify a future direction, such as mixed-method studies by Parsons et al. (2017). Exploring motivations through interviews, surveys or even ethnographic research or case studies might be fruitful, accepting the potential subjectivity that arises. Moreover, mixed-method approaches can be a powerful tool to harness the benefits of the approaches described while tempering their limitations.
We also lack evidence on whether reporting publicly on performance improves underlying not-for-profit performance, despite this being an implicit assumption in many studies (and a key motivation for many who research in this area). Rocha Valencia et al. (2015) is an exception, addressing this in respect of one aspect of performance. Answering that question more broadly will necessitate different methods, perhaps involving case studies (see Carlson et al., 2010) and utilising methods of observation, survey, interviews and analysis of internal as well as external reporting. While we acknowledge difficulties in access and on occasion of publishing such research, better evidence here would be particularly helpful in regulating not-for-profits.
As noted in our introduction, stakeholders are increasingly interested in not-for-profits' performance, and providing information publicly may be a useful tool in building trust and confidence. Internationally, regulators are considering how such performance reporting can be encouraged or mandated. In this context, we assert that further research as suggested above may be helpful in developing good quality regulation that is alert to the potential for dysfunctional consequences and promotes performance reporting that is useful to key stakeholder groups and drives improvements in underlying performance.
Funding: No funding was received for conducting this study. All authors certify that they have no affiliations with or involvement in any organisation or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.

Declarations
Conflict of interest: The authors have no relevant financial or non-financial interests to disclose. The authors have no conflicts of interest to declare that are relevant to the content of this article.
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.